I have a REST interface loading via the ODX in V20 into an Azure Data Lake.

Recently, an infrastructure change to the server caused the REST endpoint to time out (it has an IP whitelist and the server's IP was removed).

This resulted in the synchronization task determining that there were no tables available to load, even though multiple custom RSD files have been generated and the CData REST driver has been configured to “Never” generate schema files.
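For context, the custom RSD files follow the usual CData APIScript shape, roughly like the minimal sketch below; the table name, endpoint URL, and columns are placeholders rather than the actual source definition.

```xml
<api:script xmlns:api="http://apiscript.com/ns?v1" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical table; the names, paths, and URL are illustrative only -->
  <api:info title="Orders" desc="Orders from the REST endpoint" xmlns:other="http://apiscript.com/ns?v1">
    <attr name="Id"     xs:type="string" key="true" other:xPath="/json/orders/id" />
    <attr name="Status" xs:type="string" other:xPath="/json/orders/status" />
  </api:info>
  <!-- Endpoint the driver calls when the table is queried -->
  <api:set attr="URI" value="https://example.com/api/orders" />
  <api:script method="GET">
    <api:call op="jsonproviderGet">
      <api:push />
    </api:call>
  </api:script>
</api:script>
```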


The IP has been restored to the whitelist and the REST interface is working again (after multiple days of the ODX loading nothing). Unfortunately, the metadata for every table has been flagged as invalid. When I run the ODX load, every table comes back with an error that the table has already been used by a table with a different GUID. And if I run the synchronize task on the ODX, it deletes the mappings of every table connected to this ODX source (40+ tables).

I cannot even run a preview of the data from the Gen2 storage through TimeXtender.

It seems that my only choice is to delete the entire source folder in the Gen2 storage, run a complete ODX load to recreate the parquet files with “valid” metadata, manually remap all of the tables to the ODX, and then promote up through each environment to get production running again!
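(If it helps anyone else, the folder deletion step can at least be scripted rather than clicked through the portal. A minimal sketch with the Azure SDK for Python; the account, container, and folder names are placeholders:)

```python
# Sketch: remove an ODX data-source folder from ADLS Gen2 so the next
# full ODX load can recreate the parquet files from scratch.
# The account, container, and folder names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
filesystem = service.get_file_system_client("<odx-container>")

# delete_directory() removes the folder and everything under it recursively.
filesystem.get_directory_client("<data-source-folder>").delete_directory()
```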

Can someone verify whether I have another option to correct this?

If I don’t have any other option, can someone please explain why and how this can happen? Why can’t we have a “Synchronize on Name” feature with the ODX like the BU has?

This is not the first time I have had to do something similar during a development cycle. It seems the metadata the ODX manages is very fragile. Why can’t it sync with existing tables in the Data Lake?

Paul.

Hi Paul

No, not that I know of. This is an unfortunate thing that can happen with CData file-based providers and frequent sync task executions.

The solution you suggested, deleting the folder in the container for this data source, is usually the only method.

I tried various things to see if I could avoid it, but this was the only way I found that could resolve it.

Whether you will lose the mappings in the data warehouse if you sync against the ODX, I am not sure, but I do see that it is a task you would rather not do.

The way to avoid this happening is to make sure you do not schedule the sync task, and only run it when actual changes have been made to the RSD files.
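One way to follow that rule is to gate the sync on the RSD files actually having changed, for example with a small fingerprint check like the sketch below (the RSD folder path and state file are my assumptions, not TimeXtender settings):

```python
# Sketch: only trigger the ODX sync task when the RSD files have changed.
# The RSD folder path and state-file location are assumptions.
import hashlib
from pathlib import Path

RSD_DIR = Path(r"C:\CData\RSD")       # folder holding the custom RSD files
STATE_FILE = Path("rsd_fingerprint")  # stores the last-seen fingerprint

def rsd_fingerprint() -> str:
    """Hash the names and contents of all RSD files, in a stable order."""
    digest = hashlib.sha256()
    for rsd in sorted(RSD_DIR.glob("*.rsd")):
        digest.update(rsd.name.encode())
        digest.update(rsd.read_bytes())
    return digest.hexdigest()

current = rsd_fingerprint()
previous = STATE_FILE.read_text() if STATE_FILE.exists() else ""

if current != previous:
    print("RSD files changed - run the ODX synchronize task, then save state.")
    STATE_FILE.write_text(current)
else:
    print("RSD files unchanged - skip the synchronize task.")
```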


Hmmm,

OK, that makes sense. I schedule the sync task to run before every ODX load, assuming that would be a best-practice approach, but I do see that if you are defining custom RSD files it is not necessary to continually run a sync process.

But a “Sync on Name” feature would be a nice addition to both the ODX data sources and the ODX tool within the TimeXtender application to help overcome this issue.

Thanks for the quick response!

Paul.


The synchronization of data sources works differently in the v21 (SaaS) release.

There you will be able to choose the table from a list, or it will be added automatically if it has the same name.


That’s good news! We are actually planning to migrate this client to V21 in February, so that will help limit exposure going forward!

