XML stands for Extensible Markup Language. It is easy for both humans and machines to read, is saved with the .xml extension, and, like HTML, uses markup tags to describe its contents.
An XML file must be well formed, with properly nested and matching opening and closing tags, and it can be thought of as a kind of small database in itself. It usually starts with the declaration <?xml version="1.0" encoding="UTF-8"?>, which states the XML version and the character encoding. The declared encoding tells the parser how to decode the bytes of the file, so changing it changes how special characters are interpreted.
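As a rough illustration outside Talend, the sketch below uses Python's standard xml.etree.ElementTree module to check whether a document is well formed and to show how the declared encoding affects a special character. The sample documents and the is_well_formed helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; the element names are illustrative only.
WELL_FORMED = b'<?xml version="1.0" encoding="UTF-8"?>\n<root><value>caf\xc3\xa9</value></root>'
# Same content, but the <value> tag is never closed.
BROKEN = b'<?xml version="1.0" encoding="UTF-8"?>\n<root><value>caf\xc3\xa9</root>'
# The same text stored as ISO-8859-1: the accented letter is one byte, not two.
LATIN1 = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<root><value>caf\xe9</value></root>'


def is_well_formed(data: bytes) -> bool:
    """Return True if the bytes parse as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(WELL_FORMED))
print(is_well_formed(BROKEN))
# Both encodings decode to the same text because each declaration matches its bytes.
print(ET.fromstring(WELL_FORMED).findtext("value") == ET.fromstring(LATIN1).findtext("value"))
```

If the declaration and the actual bytes disagree, the parser may reject the file or decode special characters incorrectly, which is why the encoding should match how the file was saved.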
JSON stands for JavaScript Object Notation. It is a language-independent data format that is commonly used to exchange data between a browser and a server. It is a text-based representation of structured data built on key-value pairs. JSON text can be parsed into JavaScript objects, and JavaScript data can be serialized back into JSON.
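Because JSON is language independent, the same round trip works outside JavaScript as well. The following is a minimal sketch in Python using the standard json module; the payload is a made-up example, not data from this blog.

```python
import json

# Hypothetical payload, e.g. what a server might send to a browser.
text = '{"name": "Talend", "tags": ["xml", "json"], "active": true}'

# JSON text -> native data structure (a dict of key-value pairs).
record = json.loads(text)
print(record["tags"][1])

# Change a value, then serialize back: native data structure -> JSON text.
record["active"] = False
round_tripped = json.dumps(record)
print(round_tripped)
```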
Note: Before reading any file make sure it is not password protected.
I am reading the file below.
- tFileInputXML
The tFileInputXML component reads an XML structured file row by row, splits it into fields, and sends the fields, as defined in the schema, to the next component.
The tFileInputXML component has a few basic properties that need to be checked or unchecked so that the data is processed with the proper formatting.
In ‘Edit Schema’ we need to add one column of type ‘Document’. Then, in the ‘Loop XPath query’ option, we need to provide the path of the element to loop on within the XML file. For example, “/” (a single forward slash) selects the document root, so the whole file is read from beginning to end; we can also provide a more specific path such as “/root/value”. Under ‘Mapping’, in the ‘XPath query’ column, we can similarly provide “/” to fetch the values of all tags.
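To make the idea of a loop XPath concrete, here is a small sketch in plain Python (not Talend code) of what looping on “/root/value” amounts to: one output row per matching element, with each field read relative to that element. The sample XML and the id and name fields are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; tFileInputXML itself is configured in Talend Studio.
XML = """<root>
  <value><id>1</id><name>alpha</name></value>
  <value><id>2</id><name>beta</name></value>
</root>"""

root = ET.fromstring(XML)

# Loop XPath "/root/value": one output row per <value> element.
# ElementTree paths are relative to the root element, hence "./value".
rows = []
for node in root.findall("./value"):
    # Field queries "id" and "name" are evaluated relative to the loop node.
    rows.append({"id": node.findtext("id"), "name": node.findtext("name")})

print(rows)
```

Choosing a different loop element changes how many rows are produced, so the loop path is usually the element that repeats once per record.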
- tXMLMap
tXMLMap is similar to the tMap component. It is an advanced component fine-tuned for transforming and routing XML data flows (data of the Document type), especially when processing numerous XML data sources, with or without flat data to be joined.
In the tXMLMap component, if we already have an XML file, we can import it by right-clicking on doc and selecting ‘import from XML file’; the schema is then created automatically. We then have to set the loop element. In the above image the loop element is ‘value’, so iteration happens on the ‘value’ tag.
- tAdvancedFileOutputXML
tAdvancedFileOutputXML outputs data to an XML file and offers an interface to deal with loop and group-by elements if needed.
tAdvancedFileOutputXML can be used in place of tXMLMap. In the above image the ‘entidad’ column is set as the loop element, so iteration happens on this tag. ‘@id’ is an attribute of entidad: it holds a single value, so we cannot add sub-elements under it. ‘direction’, by contrast, is a sub-element of entidad, and we can add further sub-elements under it, as we can see in the above image.
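The difference between an attribute and a sub-element can be sketched in plain Python with xml.etree.ElementTree. This is only an illustration of the resulting XML shape, not what Talend generates; the root, street and city names and the row values are assumptions, while entidad, id and direction follow the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical input rows; street and city are made-up columns.
rows = [
    {"id": "1", "street": "Main St", "city": "Phoenix"},
    {"id": "2", "street": "High St", "city": "Sydney"},
]

root = ET.Element("root")
for row in rows:
    # Loop element: one <entidad> per row, with id stored as an attribute.
    entidad = ET.SubElement(root, "entidad", {"id": row["id"]})
    # Sub-element: unlike an attribute, it can carry its own children.
    direction = ET.SubElement(entidad, "direction")
    ET.SubElement(direction, "street").text = row["street"]
    ET.SubElement(direction, "city").text = row["city"]

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```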
- tFileInputJSON
tFileInputJSON extracts JSON data from a file and transfers the data to a file, a database table, etc.
JSON (JavaScript Object Notation) is a lightweight data-interchange format based on a subset of the JavaScript programming language.
‘Edit schema’ contains all the columns. ‘Read By’ has 3 options, of which we are using ‘JsonPath’. We can check ‘Use Url’ if the JSON file needs to be fetched from a website; otherwise we keep it unchecked. ‘Loop Json query’ appears because we selected ‘JsonPath’ in the ‘Read By’ property above; it holds the path of the node to loop on in the file (see the JSON file shown before this).
The ‘book’ node has 4 attributes that need to be extracted.
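As a rough sketch of what a JsonPath loop over ‘book’ does, the Python below produces one row per book with four extracted fields. The file content, the field names (category, author, title, price) and the loop query mentioned in the comment are all assumptions, since the original file is only shown as an image.

```python
import json

# Hypothetical file content; field names are assumed for this example.
JSON_TEXT = """{"store": {"book": [
  {"category": "reference", "author": "A. Writer", "title": "First", "price": 8.95},
  {"category": "fiction", "author": "B. Writer", "title": "Second", "price": 12.99}
]}}"""

data = json.loads(JSON_TEXT)

# Roughly what a loop query such as "$.store.book[*]" would iterate over:
# one row per book, with one column per extracted attribute.
columns = ("category", "author", "title", "price")
rows = [tuple(book[col] for col in columns) for book in data["store"]["book"]]

for row in rows:
    print(row)
```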
- tFileOutputJSON
tFileOutputJSON receives data and rewrites it in a JSON structured data block in an output file.
Below is the input file that we are going to convert into a JSON file.
‘Name of data block’ sets the name that appears at the top of the JSON output; see the image below.
‘Edit schema’ holds all the columns that need to be mapped.
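The effect of the data block name can be approximated in plain Python: the rows from a flat file end up as a list under a single top-level key. This is a sketch under stated assumptions, not Talend's own output; the semicolon-delimited input, the id and name columns, and the block name "data" are all hypothetical.

```python
import csv
import io
import json

# Hypothetical delimited input standing in for the source file.
FLAT = "id;name\n1;alpha\n2;beta\n"

# Hypothetical value for the 'Name of data block' setting.
data_block_name = "data"

# Each input line becomes one dict keyed by the schema columns.
rows = list(csv.DictReader(io.StringIO(FLAT), delimiter=";"))

# All rows are wrapped under the data block name at the top of the JSON.
output = json.dumps({data_block_name: rows}, indent=2)
print(output)
```

Note that csv.DictReader reads every value as a string; column types would have to be converted explicitly if numbers are needed in the JSON.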
Output JSON file:
While working with Talend, if we come across an issue that we cannot resolve on our end, we can raise it with the Talend community. Their team will help in solving the problem.
About Girikon:
Girikon is an IT service organization, headquartered in Phoenix, Arizona with presence across India and Australia. We provide cutting-edge Salesforce consulting services and solutions to help your business grow and achieve sustainable success.