
One of my colleagues made a change to an incremental field on a very large table, not realizing that this would force a full load. We only realized what had happened halfway through our promotion, but since we needed to make changes to that table anyway, we proceeded. Reloading that table and its downstream tables took five days. Luckily, a long weekend greatly reduced the impact of that reload on our users. 

We have now begun our next promotion cycle.  Unfortunately, despite the fact that the table in question was reloaded in our development environment, differential deployment is again forcing a full reload of that table. We really can’t afford to spend a week reloading this table a second time. 

We’d love to know why the full-load flag remains set in the first place, how we can avoid this happening in the future, and how we can avoid this extremely costly reload, as it is blocking our ability to promote urgently needed changes into production.

Hi @ekw9 

What version are you on?

How is the table in question set up with regard to the incremental load?


Dear @ekw9,
As for why this works the way it does, I don’t have a straight answer; it could be a number of things.
It could be that the ODX table is still set to full load, or that the data warehouse table is set to full load instead of automatic (or incremental). Maybe the incremental load rule got deleted, or the ‘load incremental when available’ box is not checked.

Incrementally loading big tables in TimeXtender can be a hassle, in my opinion. When incrementally loading a table, it will check all PKs, and this can take a very long time in the data cleansing part of the load, especially when, besides updates, TX also checks for deletes.
What I would usually do is create two query tables: one that loads only the current (and previous) year’s data, and one that loads the historical data. The current table runs each day, and the historical table runs once a year (in the new year). I then keep both tables separated and load them separately in my front-end tool. This way TX only has to take care of the current table, with ‘only’ current-year or previous-plus-current-year data, which will be way faster (see the sketch below). I’ve also proposed a feature within TX that can do this automatically for you; if you like it, please upvote:


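To make that split concrete, here is a minimal sketch of the two filters. It is purely illustrative: the sales_fact table and transaction_date column are made-up names, and in TimeXtender the equivalent WHERE clauses would live in the two query tables.

    # Sketch of the split described above (hypothetical names throughout).
    # The "current" query table keeps the daily incremental load small;
    # the "historical" one is only executed once a year.
    CURRENT_QUERY = """
        SELECT * FROM sales_fact
        WHERE transaction_date >= DATEFROMPARTS(YEAR(GETDATE()) - 1, 1, 1)
    """

    HISTORICAL_QUERY = """
        SELECT * FROM sales_fact
        WHERE transaction_date < DATEFROMPARTS(YEAR(GETDATE()) - 1, 1, 1)
    """
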
Hope this gives you some ideas

= Daniel


Thank you for your suggestions, Daniel. We are looking for better ways to load large tables moving forward, but right now we really need a fix for our current issue. 

Thomas: 

We are using 20.10.29.64. The load is a normal incremental load from the Staging database into a data warehouse database (our DSA). Deletes are not handled. The incremental load uses a datetime field - the ODX table’s DW_Timestamp. 
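
For context, the sketch below shows the general shape of what such a datetime-based rule does. It is only an illustration of the watermark pattern, not TimeXtender’s internal code, and the DSN, schemas, and table names are hypothetical.

    # Illustration of the datetime-watermark pattern behind an incremental
    # load rule. Not TimeXtender internals; the DSN and the table names
    # dsa.Orders / stg.Orders are made up.
    import pyodbc

    conn = pyodbc.connect("DSN=dwh")  # assumed ODBC data source

    # The highest DW_TimeStamp already in the DSA table is the watermark.
    last_loaded = conn.execute(
        "SELECT MAX(DW_TimeStamp) FROM dsa.Orders"
    ).fetchval()

    if last_loaded is None:
        # No watermark yet, so everything is transferred: a full load.
        new_rows = conn.execute("SELECT * FROM stg.Orders").fetchall()
    else:
        # Otherwise only rows newer than the watermark are transferred.
        new_rows = conn.execute(
            "SELECT * FROM stg.Orders WHERE DW_TimeStamp > ?", last_loaded
        ).fetchall()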

The sequence of events is as follows:

  1. The name of the incremental field was changed. 
  2. The name was changed back. No further changes were made to the incremental rules in any environment after this point.
  3. The changes were promoted from Dev to Test. The differential deployment report indicated that the DSA table would be fully reloaded on deployment. 
  4. The table was reloaded in Dev and deployed in Test. 
  5. New fields were added to the table in Dev and Test. At this point, the structures of both tables were identical.
  6. We promoted from Dev to Test again. Again, the differential deployment report indicated that the DSA table would be fully reloaded on deployment. This is where we stopped the promotion process. No further changes have been made to the affected tables. 

I am happy to provide a copy of the project, and/or the repository database if that would help. 


Hi @ekw9 

If you in any way change the field that is used for the rule (I do not mean the name of the field, but which field is being used), it will force a full deploy.

Regarding your step-by-step process: what field was it you changed the name of?

In my setup, like yours, I use DW_TimeStamp as my rule, which becomes the IncrementalTimeStamp field if you use the auto-apply incremental load feature. I could rename it without forcing any changes that required a full load.

However, if I instead change which field the rule uses, it forces a full load.
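
Conceptually (this is only my mental model, not TimeXtender’s actual code), the differential deploy seems to compare which underlying field the rule points to rather than its display name, so a rename is metadata-only while re-pointing the rule invalidates the stored watermark:

    # Conceptual model only, not TimeXtender's implementation: the diff
    # cares about which field drives the rule, not what it is called.
    def requires_full_load(old_rule_field_id: int, new_rule_field_id: int) -> bool:
        # A rename keeps the same underlying field id, so no full load;
        # pointing the rule at a different field invalidates the watermark.
        return old_rule_field_id != new_rule_field_id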



Hi Thomas,

Yes, that’s entirely correct. 

The issue we’re having here is not that. A few weeks ago, we made a change that forced a full deployment. We deployed that change, and migrated the project. We had to fully reload all of the tables in our downstream environments as a result. That is all as expected. 

The problem is that we haven’t made any further changes that would result in a full load since then. But now that we’re trying to migrate again, the downstream environments are still being forced to do a full reload anyway. This is the behavior we need to understand and stop.

Does that make sense? 


Hi @ekw9 

You made a change that forced a full load in development, then you ran that full load. Then you transferred it to production, where the change also forced a full load execution.

Since then, you have transferred the project again without making any changes to the development project, and it wants to do a full load again?

How do you deploy it? If you log onto the prod environment and deploy it using the review tasks option, what is it specifically that it wants to deploy?


Hi Thomas, 

To clarify, the changes were in the following order: 

  1. Full-load-forcing change made in development. 
  2. Promotion to Test
  3. Development reloaded
  4. Test reloaded
  5. Promotion to Prod
  6. Prod reloaded
  7. New promotion from Dev to Test
  8. In the test environment, the differential deployment report indicates that the table will again be forced to reload. 

Step #8 is our problem. 

The most vital question we have is: how do we stop the full load from happening? This will have a major impact on our production environment. Please advise. 


Hi @ekw9 

The reason for my insistence on a thorough description of the steps is that there was a bug fix in a version later than 20.10.29 that seemed like it might fit your issue.

Fixed in 20.10.37

  • 16023: Multiple environment perspective deploy always includes all incremental loaded tables
    When deploying a perspective, all incremental tables in the project were added to the deployment. This has now been corrected, so it will only bring in the tables where the incremental load has a change that requires deployment.

I don’t know if that is entirely your case, but if the table was part of a perspective, it would explain why you see this.

Also, just because a table is to be deployed does not mean that it will be fully loaded.

If you choose the review tasks option, it should tell you specifically what it needs to deploy.



Hi Thomas,

Thank you very much for the heads up on that TimeXtender version! We’ll do our best to avoid it. 

Fortunately, we don’t deploy through the managed deployment window, nor are we getting this information via a deployment window at all. Our process is this: 

  1. Promote the project through the managed deployment dialog
  2. Open the destination environment
  3. Run the Export Deployment Steps report from the advanced options in the project menu. 


The file that this generates is where we are seeing that the table has to fully redeploy.

Our next promotion cycle begins next week, and we’re starting to get worried we may miss that cycle as well if we cannot get this resolved soon. Missing an entire month’s worth of deployments would be an issue for us, so anything we can do to get this resolved soon would be much appreciated. 


Hi @ekw9 

I see. As mentioned, I would attempt an upgrade to resolve this.

If you log in after transferring the project and attempt a deploy of one of the tables in question, review the steps it lists.

There is not much you can do to avoid deploying this part, but if the specific step that triggers the full load is not selected, it will not fully load the table on the deploy.

So if that is the current hold-up, I would attempt an upgrade and a new transfer of the projects. Then go into the prod environment and deploy the table with issues to confirm whether it wants to deploy the mentioned part.

If it still does after the upgrade, you will have to deploy it. Once this is done, be sure it is also deployed in the Dev and Test environments and does not need to be deployed again before transferring to Prod.


Hi Thomas,

I will attempt a software update as you have advised.

However, we do need a solid diagnosis for what happened here. This has had a major impact on our data warehouse, and my superiors want answers. I’m happy to provide a copy of the project or a backup of the repository database - but we have to know that this won’t reoccur. 

Best regards,

Emily Wynkoop


Hi @ekw9 did you get a chance to upgrade TimeXtender?


While we will be upgrading as soon as possible, our ability to promote code was blocked, and we could not afford to engage in a lengthy dialog with support any longer. Without a timely or workable fix, we’ve had to rush to rebuild the table to make it easier to reload. 


Hi @ekw9 

I apologize for that. The product team is currently investigating a rework of the full load functionality. Please let me know if an upgrade resolves the issue for the time being.


Thank you, Christian. I am grateful for that. 

The frustrating fact of the matter is that, for all I know, this issue was down to user error somewhere on our end. However, without being able to tell where the problem came from, or any option to remediate it, we were left well and truly stuck.

I’m pleased to say that we’ve recovered from this and improved some of our loading practices significantly. In a way, it was helpful that this proved to be a catalyst for some overdue and necessary changes in how we handle large data sets. However, we would love to be able to avoid the stress and long hours that accompanied this in the future. 

