Solved

Duplicates loading data from ODX to MDW

  • 31 March 2023
  • 6 replies
  • 150 views

Hi community,

We are facing the problem that duplicates are created when we bring data from an ODX API source into the MDW. We are working with an overlapping sliding window of two days in the schema file (because data can change and there is no last-modified date), and we only set the primary key on the ODX source.

In the MDW (dedicated SQL pool) we enabled the history and set the ID as natural key. All fields are marked as type 1 fields.

The execution gives us duplicated ID values; no updates are made.


Thanks for your help
Michael
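
To make the expected behavior concrete, here is a minimal sketch of why an overlapping window returns the same ID in consecutive loads, and what a type 1 load keyed on the ID should do with it. It is plain Python and not TimeXtender code; the rows, the `Amount` field, and both function names are made up for illustration.

```python
# Hypothetical data: "ID" is the natural key, "Amount" stands in for a type 1 field.
load_1 = [{"ID": 1, "Amount": 10}, {"ID": 2, "Amount": 20}]
# The next load overlaps the previous window, so ID 2 comes back
# (here with a changed value) together with a new ID 3.
load_2 = [{"ID": 2, "Amount": 25}, {"ID": 3, "Amount": 30}]


def append_only(loads):
    """Naive load: every incoming row is inserted, so overlaps create duplicates."""
    rows = []
    for batch in loads:
        rows.extend(batch)
    return rows


def type1_upsert(loads, key="ID"):
    """Type 1 style load: an incoming row overwrites the existing row with the same key."""
    current = {}
    for batch in loads:
        for row in batch:
            current[row[key]] = row
    return list(current.values())


appended = append_only([load_1, load_2])
upserted = type1_upsert([load_1, load_2])

print(len(appended), "rows when appending")
print(len(upserted), "rows when upserting on ID")
```

The symptom described above (duplicated IDs, no updates) looks like the first function, while the table settings were meant to give the second.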

Best answer by Thomas Lind 11 April 2023, 13:14

Hi Michael

How is the table in the Data Area set up? On a simple mode table, for example, it will not update or remove the duplicates.

Hi Thomas,

These are our table settings:

Thanks
Michael

Hi Michael

I noticed that it is a History table. What option are you using for the other fields, such as type II fields?

The ID needs to be unique. If it is not, the same rows will be added again and again.

Hi Thomas,

As mentioned in the first post, we are only using SCD type 1. The ID is unique in the source. I attach one example of the duplicates: the ID is unique, but we have 4 entries for the same ID. We ran the load several times so that the effect can be seen in detail:

Thanks
Michael

Hi Michael

If you look at the SCD values, do they indicate that these are seen as unique rows or as changes?

I mean, does only one of them have the Is current value set, or is there a date range between them?

If not, it would seem that the duplicated natural keys are not seen as being completely unique.

Hi @Michael Suppan, do you have an update on this? Are there multiple records with the same ID where the SCD Is current field is set to 1?
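
One way to answer that question is a grouped count of current rows per ID. The sketch below runs the idea against an in-memory SQLite table as a stand-in for the dedicated SQL pool. The table name `history_table`, the column names `Amount` and `IsCurrent`, and the sample rows are all assumptions; substitute the actual names from your history table.

```python
import sqlite3

# In-memory stand-in for the MDW table; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history_table (ID INTEGER, Amount INTEGER, IsCurrent INTEGER)"
)
conn.executemany(
    "INSERT INTO history_table VALUES (?, ?, ?)",
    [
        (1, 10, 1),  # one current row: fine
        (2, 20, 1),  # three current rows for the same ID: suspicious
        (2, 20, 1),
        (2, 25, 1),
        (3, 30, 0),  # one closed row plus one current row: normal history
        (3, 35, 1),
    ],
)

# List every ID that has more than one row flagged as current.
suspect = conn.execute(
    """
    SELECT ID, COUNT(*) AS current_rows
    FROM history_table
    WHERE IsCurrent = 1
    GROUP BY ID
    HAVING COUNT(*) > 1
    ORDER BY ID
    """
).fetchall()

for row_id, current_rows in suspect:
    print(f"ID {row_id} has {current_rows} current rows")
```

If the query returns any rows, the same natural key is being inserted as a new record on each load instead of being matched against the existing one.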
