Follow

Connect to XML with a CData data source

Before you attempt this you need to add the XML provider to your list of available sources.

Go through this guide before you start.

Add a CData data source

Contents

XML Information

XML files can be structured in different ways

Here below is how a XML file structured in element form looks:

<?xml version="1.0" encoding="utf-8"?>
<top100rocksongs_billboard_2013>
  <t>
    <ID>1</ID>
    <Sequence>1</Sequence>
    <Song>STAIRWAY TO HEAVEN</Song>
    <Performer>Led Zeppelin</Performer>
  </t>
  <t>
    <ID>2</ID>
    <Sequence>2</Sequence>
    <Song>BOHEMIAN RHAPSODY</Song>
    <Performer>QUEEN</Performer>
  </t>
</top100rocksongs_billboard_2013>

Here is how a XML file structured in attribute form looks:

<top100rocksongs_billboard_2013>
  <t ID="1" Sequence="1" Song="STAIRWAY TO HEAVEN" Performer="Led Zeppelin" />
  <t ID="2" Sequence="2" Song="BOHEMIAN RHAPSODY" Performer="QUEEN" />
  <t ID="3" Sequence="3" Song="FREE BIRD" Performer="LYNYRD SKYNYRD" />
  <t ID="4" Sequence="4" Song="SMOKE ON THE WATER" Performer="DEEP PURPLE" />
  <t ID="5" Sequence="5" Song="COMFORTABLY NUMB" Performer="Pink Floyd" />
  <t ID="6" Sequence="6" Song="KASHMIR" Performer="Led Zeppelin" />
  <t ID="7" Sequence="7" Song="STARGAZER" Performer="RAINBOW" />
  <t ID="8" Sequence="8" Song="ALL RIGHT NOW" Performer="FREE" />
  <t ID="9" Sequence="9" Song="WHOLE LOTTA LOVE" Performer="Led Zeppelin" />
</top100rocksongs_billboard_2013>

To make the Kings.XML file Download this link and save it as Kings.XML

http://mysafeinfo.com/api/data?list=englishmonarchs&mappingtype=element&rootname=englishmonarchs&elementname=Monarch

This is the URI link to the file I use for the element setup.

https://mysafeinfo.com/api/data?list=top100rocksongs_billboard_2013&format=xml&abbreviate=false&case=default&token=e3ODIxQFAgT9TD69GSlvZ6rfgMKnZdVt&mappingtype=element&formatting=indented

This is the URI link to the file that uses attributes.

https://mysafeinfo.com/api/data?list=top100rocksongs_billboard_2013&format=xml&abbreviate=false&case=default&token=e3ODIxQFAgT9TD69GSlvZ6rfgMKnZdVt&mappingtype=attribute&formatting=indented

CData Setup

Add a CData data source and choose the XML provider.

The first step in setting up a CData connector should always be to click the ? in the top right corner.

Sp_rgsm_lstegn.PNG

XML guide from CData

With Elements

If the XML source is a file you need to fill out the following fields.

URI: <the location of the kings.xml file>

Just the link to the file. You can also use the Data Source field, but then you will need t o fill out the XPath field as well. How to fill the XPath is explained below. Remember that all users that executes the project must have access to this folder.

If your XML is stored in an online location, you need to fill out the following fields.

URI: http://mysafeinfo.com/api/data?list=moviesbyearnings2006&mappingtype=element&rootname=moviesbyear&elementname=Movie

It stands for Uniform Resource Provider and it is the resource location. You need to paste in the link to the file.

With Attributes

If the XML file is using attributes as its element type, there is something you need to take into account. It cant automatically read the xpath, so you need to provide it.

Other than that it is pretty much the same.

URI: https://mysafeinfo.com/api/data?list=top100rocksongs_billboard_2013&format=xml&abbreviate=false&case=default&token=e3ODIxQFAgT9TD69GSlvZ6rfgMKnZdVt&mappingtype=attribute&formatting=indented

Again you need to add a link to the file.

XPath: top100rocksongs_billboard_2013\t

XPath is the path to the data. It is the rootname followed by the element name. It is a good idea to locate this beforehand.

Getting the data

Following the setup and successful connection test, you need to synchronize to the data source and preview the data.

After this is done you can add additional connections. You just need to add files with the same metadata.

Additional_Data_source.PNG

 

Besides this you can also set up Query Tables, Data Type overwrites, Incremental load and data selection rules.

Advanced features

Automatically getting the correct data types

As standard CData will scan the first 100 rows and determine what data types these fields are. Sometimes this gives an error because the 101 row contained a longer string than was present in the previous rows. The solution simply is to change the Row Scan Depth field to a higher number. Also in regards to JSON/XML files, this field doesn't only solve the issue of data type, but also what tables are available. If the file contains more than 100 rows in total these will not be added. So a good idea is to work out what the maximum amount of rows that are contained in the file and then add that.

This will usually correct the issue.

Parsing Hierarchical Data

If your file is containing nested objects, you now have the opportunity of seeing this. I copied the Raw data from here and saved it in a file on my computer. Raw nested data. It has this structure.

<?xml version="1.0" encoding="UTF-8" ?>

<root>
  <rootAttr1>rootValue1</rootAttr1>
  <people>
    <personal>
      <age>20</age>
      <gender>M</gender>
      <name>
        <first>John</first>
        <last>Doe</last>
      </name>
    </personal>
    <jobs>support</jobs>
    <jobs>coding</jobs>
    <vehicles>
      <type>car</type>
      <model>Honda Civic</model>
      <insurance>
        <company>ABC Insurance</company>
        <policy_num>12345</policy_num>
      </insurance>
      <features>sunroof</features>
      <features>rims</features>
      <maintenance>
        <date>07-17-2017</date>
        <desc>oil change</desc>
      </maintenance>
      <maintenance>
        <date>01-03-2018</date>
        <desc>new tires</desc>
      </maintenance>
    </vehicles>

When I pull it in like this and use no other features.

It gives me one table called people and when I preview it, each nested part is put into one field.

What you can do about this is to use the Data Model feature. As standard it is not used and it has two options besides that. Flatten Documents and Relational.

Relational creates a table for each nested part and creates ids to relate them to each other.

Like this.

People now looks like this.

And Vehicles

As you can see the jobs and features fields still have issues, but the flatten arrays feature below can solve this as well.

FlattenDocument makes all these previous tables one large one.

Like this.

Again it has the same issues with jobs and features, but this is still solvable with the flatten arrays feature.

Flatten Arrays

There is this XML data source

https://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml

If you use it as explained in the previous steps, it will give you this.

Though if you use the feature Flatten Arrays under the misc area it will split it out into the amount of columns you specify. In this case I worked out that there was 32 different currencies.

Using this feature and only selecting the important bits will give you this.

Following this you can merge it even further by using the query table feature or a view or similar.

Was this article helpful?
0 out of 0 found this helpful

2 Comments

  • 0
    Avatar
    Mark Chinsky

    I don't get how this works.

    I've got a cdata JSON connection to odata that works.  I have to add a separate connection for each odata table (each one relates to a query in an erp system) as there is no way to list all the queries and just pick tables and fields like SQL.

    I tried the nested 'add cdata data source' from the first data source, filled in the URI for a different query but nothing shows up in the the field panel and there is no 'synchronize function'.

    It seems if I have 50 queries representing 50 tables in my ERP system I'll need 50 data sources, and 50 cdata configurations.

    Not only is this clunky, but if I want to install my project at a different client thats a ton of manual url, username, password settings to have to make.

    Is there a slicker way to do this?

    Again, my vendors Odata/REST driver will not properly let you list out all queries and pick individual ones and fields like SQL does from one connection.

  • 0
    Avatar
    Thomas Lind

    Hi Mark

    Can you send this request to support@timextender.com and I will see what I can do.

Please sign in to leave a comment.