
Hi all,

I’m using the CData Connector for Parquet files and picking up the files from an SFTP server. The files always contain the data from the last day (with some overlap).

I made the following setup:

  • Primary key
  • Incremental Load with updates

The first load is always OK, but after that no new folder is created in the data lake. What am I doing wrong?

Thanks
Michael

Hi @Michael Suppan 

I cannot reproduce the issue you describe. 

I have set up the following ODX data source in 20.10.51:

With the following PK and incremental load settings:

Please see the attached Parquet files that I am using for my first and second loads. I upload the Parquet file from the “before” folder to my data lake, run the transfer task, and then execute the DW table mapped to the Parquet data source table. I then overwrite the Parquet file in my data lake with the one from the “after” folder in the attached zip file, run the transfer task again, and re-execute the DW table.

 

First load:

Second load:

The DATA_0001 file is the new data file that contains the new record with the greater modifieddate. Below is a screenshot of the Parquet file opened in a Parquet viewer.

This new record is also brought into my DW table when I execute it again after the second run of the transfer task.
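To make the expected behavior concrete, here is a rough Python sketch of an incremental load with updates. It is only an illustration of the idea, not how the ODX is actually implemented: the function name `incremental_load`, the `id` primary key column, and the sample rows are all made up for this example, and the dates are assumed to be ISO strings so that plain string comparison orders them correctly.

```python
def incremental_load(stored, incoming, key, watermark):
    """Apply one incremental batch to `stored` (a dict keyed by primary key).

    Only rows whose watermark value is greater than the highest value
    already stored are treated as the delta. Delta rows are upserted by
    primary key, so changed rows overwrite their earlier version.
    Returns the list of delta rows (what a new data file would hold).
    """
    last_max = max((row[watermark] for row in stored.values()), default=None)
    delta = [
        row for row in incoming
        if last_max is None or row[watermark] > last_max
    ]
    for row in delta:
        stored[row[key]] = row  # insert new keys, update existing ones
    return delta


# Hypothetical sample data: the second file overlaps with the first one.
before = [
    {"id": 1, "name": "a", "modifieddate": "2021-01-01"},
    {"id": 2, "name": "b", "modifieddate": "2021-01-02"},
]
after = before + [
    {"id": 3, "name": "c", "modifieddate": "2021-01-03"},   # new record
    {"id": 1, "name": "a2", "modifieddate": "2021-01-04"},  # updated record
]

table = {}
first_delta = incremental_load(table, before, "id", "modifieddate")
second_delta = incremental_load(table, after, "id", "modifieddate")
```

In this sketch the overlapping rows from the first file are skipped on the second run because their modifieddate is not greater than the stored maximum, so only the new and the updated record end up in the second delta.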

 

Do you notice any differences in our setups?


Hi Christian,

The difference I found is the TX version (20.10.38).
After the first load, no further DATA_000X file is created in the ODX. Are there other options for loading this data?

 

Thanks
Michael


Hi @Michael Suppan 

Please try upgrading to 20.10.51 and test whether the issue persists.

