Hi @kerry_bespoke
- One method is to set this table to use incremental load and history in the DWH Data Area. In the ODX data source, you then apply a date filter on the field you use for incremental load so that each execution only fetches a limited amount of data, e.g. one year's worth, and you widen that range on each execution. Once you are all the way through, you can either keep the filter to avoid accidentally running a full load, or remove the filter and let the incremental rule handle it.
- An alternative method is to still set the table to use incremental load and history, but map it to a managed query instead of the actual table. The managed query selects the same fields but only returns one year's worth of data. You then create new managed queries, or change the existing one, until all the rows have come through.
In the end the procedure is the same as above: either keep using the managed query with a custom date filter, e.g. only the last year or last month, or switch back to the real table. A minimal sketch of such a query is shown below.
The first option is best when you already have a field to use for incremental load, and the second is better when you don't.
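For illustration only, a managed query of this kind is essentially just a SELECT over the source table with a date filter on the incremental field. The table and column names below (dbo.SalesOrders, ModifiedDate) are hypothetical placeholders, not anything from your environment:

```sql
-- Sketch of a managed query that limits the load to one year of data.
-- dbo.SalesOrders and ModifiedDate are hypothetical; use your own
-- source table and the field your incremental rule is based on.
SELECT *
FROM dbo.SalesOrders
WHERE ModifiedDate >= '2023-01-01'
  AND ModifiedDate <  '2024-01-01'
-- On each execution, shift or widen this window until all rows are
-- through, then either keep a rolling filter (e.g. last year only)
-- or map the real table and let the incremental rule take over.
```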
There was a TimeXtender Tuesday or similar about this once, but I can’t seem to find it.
Thank you for your response, Thomas. I am using TX 20.10 and it is using a Business Unit, so Option 2 is out. I did watch the Large Data TX Tuesday episode, but with the version I am using that option will not work.
I have been trying your suggestion in Option 1 and have shrunk the query to just a particular year, but it still times out. I will bring this down to 6 months and see if I make more progress.
Just wanted to check I was going about this the right way.
Be sure to use a specific batch size in the DWH setup; do not leave it set to 0.
You should not make a table in a BU a history table. If there are any changes in the data source, such as changed names, that data will be removed when you synchronize, even if the table runs with history.
So the BU version of the table should not run with history, only with incremental load.
OK, I have the batch size set to 40,000. Would that be OK for a large data set? It was set higher, but with the connection timeouts I was getting, I reduced it.
I can confirm I also do not have the table set to enable history.
With these settings in place, I was able to load 6 months of the data, so I will continue moving the incremental filter on to the next 6 months and see how I get on.
Thank you for your help with this!