Problem description
I am trying to generate a CSV file from JSON data. This is my JSON test data.
{
"realtime_start":"2020-09-25","realtime_end":"2020-09-25","units": "Percent","seriess": [
{
"name": "James","age": 29,"house": "CA"
},{
"name": "Jina","age": 39,"house": "MA","notes": "Million tonne punch"
}]}
The problem is that not every node in the JSON array "seriess" contains a "notes" node.
I wrote the Java code below to convert this JSON data into a CSV file with a header row.
JSONObject json = getJsonFileFromURL(...);
JSONArray docsArray = json.getJSONArray("seriess");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));
docsArray.put(json.get("units"));
JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());
CsvSchema.Builder csvSchemaBuilder = CsvSchema.builder();
for(JsonNode node : jsonTree) {
node.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} );
}
CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();
CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"),jsonTree);
realtime_start,realtime_end,units,names,age,house,realtime_start,notes,.....
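The generated header row does not contain distinct values; the same columns are added repeatedly. How can I generate a header with distinct columns, like the one below?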
realtime_start,notes
Any ideas?
Update
I am trying to pull data from FRED (Federal Reserve Bank of St. Louis). FRED offers a simple and convenient Python API, as shown below:
from fredapi import Fred
import pandas as pd
fred = Fred(api_key='abcdefghijklmnopqrstuvwxyz0123456789')
data_unemploy = fred.search('Unemployment Rate in California')
data_unemploy.to_csv("test_unemploy.csv")
However, the Java API is deprecated, so I have to develop a simple Java API that converts the JSON values to a CSV file. I found the following Java code by googling:
JSONObject json = getJsonFileFromURL("https://api.stlouisfed.org/fred/series/search?search_text=Unemployment+Rate+in+California&api_key=abcdefghijklmnopqrstuvwxyz0123456789&file_type=json");
JSONArray docsArray = json.getJSONArray("seriess");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));
JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());
JsonNode firstObject = jsonTree.elements().next(); // I am struggling with this line
firstObject.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} );
CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();
CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"),jsonTree);
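To extract the columns from the JSON data, JsonNode firstObject = jsonTree.elements().next(); returns the first JSON node. But this line does not yield the notes column, because the first row does not contain the notes key.
So I changed that line to the following lines: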
for(JsonNode node : jsonTree) {
node.fieldNames().forEachRemaining(fieldName -> {
csvSchemaBuilder.addColumn(fieldName);
} );
}
But these lines produced a result I did not expect: the columns are repeated, as shown below.
realtime_start,.....
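I am completely stuck on this part.
Solution
You can do this with the Apache Commons IO library (the CSV text itself is produced by the CDL class from org.json).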
pom.xml
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.6</version>
</dependency>
ConvertJsonToCSVTest.java
import java.io.File;
import org.apache.commons.io.FileUtils;
import org.json.*;
public class ConvertJsonToCSVTest {
public static void main(String[] args) throws JSONException {
String jsonArrayString = "{\"fileName\": [{\"first name\": \"Adam\",\"last name\": \"Smith\",\"location\": \"London\"}]}";
JSONObject output;
try {
output = new JSONObject(jsonArrayString);
JSONArray docs = output.getJSONArray("fileName");
File file = new File("EmpDetails.csv");
String csv = CDL.toString(docs);
FileUtils.writeStringToFile(file, csv, java.nio.charset.StandardCharsets.UTF_8); // the two-argument overload is deprecated
System.out.println("Data has been successfully written to " + file);
System.out.println(csv);
}
catch(Exception e) {
e.printStackTrace();
}
}
}
Output
Data has been successfully written to EmpDetails.csv
last name,first name,location
Smith,Adam,London
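One caveat with this approach: as far as I know, CDL.toString(JSONArray) takes the header from the keys of the first object only, so a key such as notes that first appears in a later object may be dropped. The same idea also explains the duplicated columns in the question, where addColumn is called once per field of every node. Below is a minimal, library-free sketch of the usual fix: collect the keys in a LinkedHashSet so that each name appears once, in first-seen order, and emit an empty cell where a row lacks a key. The class and method names (DistinctHeaderSketch, collectHeader, toCsv) are hypothetical, and rows are modelled as plain maps rather than JSONObject or JsonNode.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of org.json, Jackson or Commons IO.
class DistinctHeaderSketch {

    // Collect every key that appears in any row, once, in first-seen order.
    static Set<String> collectHeader(List<Map<String, String>> rows) {
        Set<String> header = new LinkedHashSet<>();
        for (Map<String, String> row : rows) {
            header.addAll(row.keySet()); // a Set silently ignores names it already holds
        }
        return header;
    }

    // Render the header plus one line per row; a missing key becomes an empty cell.
    // Assumption: values contain no commas, quotes or newlines (no escaping is done here).
    static String toCsv(List<Map<String, String>> rows) {
        Set<String> header = collectHeader(rows);
        StringBuilder sb = new StringBuilder(String.join(",", header)).append('\n');
        for (Map<String, String> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String column : header) {
                cells.add(row.getOrDefault(column, ""));
            }
            sb.append(String.join(",", cells)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> james = new LinkedHashMap<>();
        james.put("name", "James");
        james.put("age", "29");
        james.put("house", "CA");

        Map<String, String> jina = new LinkedHashMap<>();
        jina.put("name", "Jina");
        jina.put("age", "39");
        jina.put("house", "MA");
        jina.put("notes", "Million tonne punch");

        System.out.print(toCsv(List.of(james, jina)));
    }
}
```

With the Jackson code from the question, the equivalent change would be to add each fieldName to such a set inside the loop and call csvSchemaBuilder.addColumn once per set element afterwards.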
It is probably easiest to write a bean-style class like the one below:
public class CsvVo {
private String realtime_start;
private String realtime_end;
private String units;
private String name;
private String age;
private String house;
private String notes;
public void setRealtime_start(String realtime_start) {
this.realtime_start = realtime_start;
}
// Other getters and setters
}
Then you can write:
public class ConvertJsonToCSVTest {
public static void main(String[] args) throws JSONException {
String jsonArrayString = "{\n" +
"\t\"realtime_start\": \"2020-09-25\",\n" +
"\t\"realtime_end\": \"2020-09-25\",\n" +
"\t\"units\": \"Percent\",\n" +
"\t\"seriess\": [{\n" +
"\t\t\t\"name\": \"James\",\n" +
"\t\t\t\"age\": 29,\n" +
"\t\t\t\"house\": \"CA\"\n" +
"\t\t},\n" +
"\t\t{\n" +
"\t\t\t\"name\": \"Jina\",\n" +
"\t\t\t\"age\": 39,\n" +
"\t\t\t\"house\": \"MA\",\n" +
"\t\t\t\"notes\": \"Million tonne punch\"\n" +
"\t\t}\n" +
"\t]\n" +
"}";
JSONObject inJson;
List<CsvVo> list = new ArrayList<>();
inJson = new JSONObject(jsonArrayString);
JSONArray inJsonSeries = inJson.getJSONArray("seriess");
for (int i = 0,size = inJsonSeries.length(); i < size; i++){
CsvVo line = new CsvVo();
line.setRealtime_start(inJson.get("realtime_start").toString());
line.setRealtime_end(inJson.get("realtime_end").toString());
line.setUnits(inJson.get("units").toString());
JSONObject o = (JSONObject)inJsonSeries.get(i);
line.setName(o.get("name").toString());
line.setAge(o.get("age").toString());
line.setHouse(o.get("house").toString());
// "notes" is optional: optString returns the default instead of throwing when the key is absent
line.setNotes(o.optString("notes", ""));
list.add(line);
}
String[] cols = {"realtime_start","realtime_end","units","name","age","house","notes"};
CsvUtils.csvWriterUtil(CsvVo.class,list,"in/EmpDetails.csv",cols);
}
}
csvWriterUtil looks like this:
public static <T> void csvWriterUtil(Class<T> beanClass, List<T> data, String outputFile, String[] columnMapping) {
    // try-with-resources closes (and flushes) the writer even if writing fails
    try (Writer writer = new BufferedWriter(new FileWriter(outputFile))) {
        // Map bean properties to CSV columns by position, in the order given
        ColumnPositionMappingStrategy<T> strategy = new ColumnPositionMappingStrategy<>();
        strategy.setType(beanClass);
        strategy.setColumnMapping(columnMapping);
        StatefulBeanToCsv<T> statefulBeanToCsv = new StatefulBeanToCsvBuilder<T>(writer)
                .withMappingStrategy(strategy)
                .build();
        // The header row is written by hand before the data rows
        writer.write(String.join(",", columnMapping) + "\n");
        statefulBeanToCsv.write(data);
    } catch (IOException | CsvRequiredFieldEmptyException | CsvDataTypeMismatchException e) {
        e.printStackTrace();
    }
}
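Note that csvWriterUtil writes the header line with a plain String.join, which does no quoting. That is fine for the column names used here, but a name containing a comma or a double quote would break the header. A minimal sketch of a quoting helper in the RFC 4180 style follows; CsvQuoteSketch, quote and headerLine are hypothetical names and not part of OpenCSV.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper for writing a hand-made header line safely.
class CsvQuoteSketch {

    // Quote a field only when it contains a comma, double quote, CR or LF;
    // embedded double quotes are doubled.
    static String quote(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuoting = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the column names into one header line, quoting where needed.
    static String headerLine(String[] columns) {
        return Arrays.stream(columns)
                .map(CsvQuoteSketch::quote)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String[] cols = {"realtime_start", "realtime_end", "units", "name", "age", "house", "notes"};
        System.out.println(headerLine(cols));
    }
}
```

Under this assumption, the manual header write could become writer.write(CsvQuoteSketch.headerLine(columnMapping) + "\n").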
The complete example can be found in the Git repo.