Wednesday, 29 May 2013

Convert JSON file to XML in Talend Open Studio

There are couple issues one faces while processing JSON file through Talend Open Studio using tFileInputJSON component. Talend expects the JSON file to have a balanced schema. If the schema is not balanced then JSON file read may not be correct and often we receive warning "Can't find any data with JSONPath .. while running the Job. e.g. if a value is null for some attributes then it becomes unbalanced schema and Talend tFileInputJSON component sends an error message to the std-err.

Hence, I prefer to convert the JSON files to XML before using or processing it. It is very easy to convert JSON file to XML in Talend. You need to download and import the external jar json-20090211.jar into Talend Job. Lets see, How we can convert JSON file to XML using a +Talend job.

We are going to process following JSON file having First Name and Last Name fields.

Input JSON File -
{
"employees": [
{ "firstName":"John" , "lastName":"Doe" },
{ "firstName":"Anna" , "lastName":"Smith" },
{ "firstName":"Peter" , "lastName":"Jones" }
]
}

Output file after converting JSON file to XML:

<?xml version="1.0" encoding="ISO-8859-15"?>
<root>
<employees>
  <lastName>Doe</lastName>
  <firstName>John</firstName>
</employees>
<employees>
  <lastName>Smith</lastName>
  <firstName>Anna</firstName>
</employees>
<employees>
  <lastName>Jones</lastName>
  <firstName>Peter</firstName>
</employees>
</root>

Job Design.



As per above design, create a new Talend Job and follow the below mentioned steps:


Step 1. Import the external jar json-20090211.jar into Talend Job. You can download this jar from here. I am going to use tLibraryLoad component to import this jar into Talend. Click here for getting more details on exporting external Jar/libraries in the Talend Job.

Step 2. Read the JSON File. We are going to read the JSON file as text only so I have used tFileInputFullRow component for read the JSON file.

Step 3. Convert the JSON file read in Step 2 to XML file. Use the following code in tJavaRow component to convert the JSON file to corresponding XML file.


org.json.JSONObject jsonFileObject = new org.json.JSONObject(input_row.line);
output_row.line = "<?xml version=\"1.0\" encoding=\"ISO-8859-15\"?>\n<root>" + org.json.XML.toString(jsonFileObject ) + "</root>";
// First line of code will create jsonFileObject from the read JSON file.
// Second line will org.json.XML.toString(jsonFileObject ) will convert into XML file and concatenate standard XML version & encoding tags.
Step 4. Finally, store the converted XML into a file using tFileOutputDelimited component. In the component properties provide the path and name of the XML file.

These four steps are required to convert JSON files to XML. Now lets run our job:















Converted XML File:


















You can see that it is so easy to convert JSON files to XML. You can also convert all the JSON files at one run by using tFileList component and pass the current file global variable to tFileInputFullRow component. You can refer following design when converting multiple JSON files at once.















5 comments:

  1. Hi,

    My JSON has 2 records. But when I am trying to use this method and trying to convert, the Talend job is failing with the error message
    "[FATAL]: .jsontoxmlfilelist_0_1.jsonToXMLFileList - tJavaRow_1 A JSONObject text must begin with '{' at 1 [character 2 line 1]
    Exception in component tJavaRow_1
    org.json.JSONException: A JSONObject text must begin with '{' at 1 [character 2 line 1]
    at org.json.JSONTokener.syntaxError(JSONTokener.java:433)
    at org.json.JSONObject.(JSONObject.java:194)
    at org.json.JSONObject.(JSONObject.java:321)"

    ReplyDelete
    Replies
    1. Could you please share your json?

      Delete
    2. Thanks for your prompt response. The sample JSON that I am trying to convert to XML is as under:
      [{"entities":[{"first_name":"John","last_name":"Rambo","office_address":{"first_line": "123 Bloor Street","town" : "Mississauga","state": "Ontario","country": "Canada"}}]}]
      [{"entities":[{"first_name":"Arnold","last_name":"Swarzannegar","office_address":{"first_line": "234 Bloor Street","town" : "San Francisco","state": "California","country": "USA"}}]}]

      Delete
  2. Hi Vikram,

    One more issue that I am facing when using the tFileInputRow component. I need to process one file containing 3 JSON objects. But the job is generating 3 XML records instead of one.

    For Example:Say my JSON is as shown below:
    {"entities":[{"first_name":"John","last_name":"Rambo","office_address":{"first_line": "123 Bloor Street","town" : "Mississauga","state": "Ontario","country": "Canada"}}]}
    {"entities":[{"first_name":"Arnold","last_name":"Swarzannegar","office_address":{"first_line": "234 Bloor Street","town" : "San Francisco","state": "California","country": "USA"}}]}
    This is generating an XML as
    /<\?xml version="1.0" encoding="ISO-8859-15"?>
    /<\root> ....
    /<\?xml version="1.0" encoding="ISO-8859-15"?>
    /<\root> ....

    Could you please help me on how I can generate this as
    /<\?xml version="1.0" encoding="ISO-8859-15"?>
    /<\root> ....

    P.S: I was forced to use the "/" and "\" preceding the XML file.

    ReplyDelete
  3. Can you send me your jobs at my email id with sample data?

    ReplyDelete