Solved

ODX Transfer task running indefinitely

1 year ago
February 14, 2024
20 replies
272 views

Victor
Participant

Hello,

TimeXtender: 20.10.40.64
ODX: 20.10.31

We are using the ODX and TimeXtenders SQL database connector to get data from a synpase database.

Sometimes the synpase databas seems to have problems, where a query runs and doesn’t complete (a seperate issue we are working on).

However what happens in TimeXtender in the ODX is that the transfer task keeps running indefinitely and never times out, leading timextender execution package to run but just sits there waiting for the odx transfer task to finish. However nothing fails anywhere, so we do not get an alert.

We need some way to be able to capture this.

Is there anyway to set up one or both of these things in TimeXtender or if you have any other ideas?

1. Kill the transfer task if it takes more then X minutes
2. Get an alert on failure or long running task

Thank you!

Best answer by Christian Hauggaard

Hi @Victor

Could you please lower the command timeout to 300 to troubleshoot further? How many tables do you have for the transfer task for the TX SQL data source?

View original

Did this topic help you find an answer to your question?

rory.smith
TimeXtender Xpert
1 year ago
February 14, 2024

Hi @Victor ,

I assume your timeouts on the source, ODX storage, ADF if applicable are all short enough that they should have triggered?

Victor
Participant
1 year ago
February 14, 2024

Hello Rory,

The ODX sql connector har standard setting (15s command timeout)

The ODX date lake has a connection timeout of 100s and a command timeout of 7200s.

The task we had ran for alittle over 3 hours.

rory.smith
TimeXtender Xpert
1 year ago
February 14, 2024

Hi @Victor ,

It could be that the TX SQL connector runs the extraction in batches, if so it could be that the command timeout is per batch and did not reach two hours for that batch yet. If you know that the extract should be shorter I would set the timeout to be ~1.5 times the longest extract time you expect.

Victor
Participant
1 year ago
February 21, 2024

It’s possible and we will try some different settings.

But the main question I have is there no way to recieve a notification for ODX transfer tasks similar to the execution packages? Wether they fail, successful with errors or any way to monitor the length of loads?

rory.smith
TimeXtender Xpert
1 year ago
February 21, 2024

Hi @Victor ,

in the 20.10.x series you can only get mail notifications for critical errors regarding the ODX Server service itself as far as I am aware. In the new release you can use on-demand execution on an ODX source to trigger a table's reload when a DWH table gets executed. In that way you can use the DWH mail notification to send a mail on a failure in the ODX.

If you really really need to you could investigate the sqlite backlog database that is synched locally from the cloud repository but you would need to be careful to avoid locking situations.

Victor
Participant
1 year ago
February 27, 2024

Over this weeend we had a similar issue with another connector and with another source.

This time a task that is supposed to take about 20 minutes ran for about 80 hours without failing.

Surely there has to be some way to stop sources from getting stuck on transfer, which makes execution packages not run and brings down production. Or at the very least some way to flag when this is happening so we can catch it.

Does the ODX connection and command timeout setting just not work? This is a big problem.

Christian Hauggaard
Community Manager
1 year ago
February 27, 2024

Hi @Victor

You mention that the synapse database you are connecting to seems to have problems - “where a query runs and doesn’t complete”. Could you please provide some more insight as to why the query does not complete? Are you using managed queries for this TimeXtender SQL data source?

You also mention that you encountered the same issue for another data source - which data source was this?

Do you currently have retries setup on your transfer task?

Victor
Participant
1 year ago
February 27, 2024

Hello!

Both have restart on fail with 3 attempts. One has 2 minutes between and 10 on the other.

For the synpase database, we are not sure why it happened and that is being investigated separately. The the root issues lies within the synpase.

Same with the other source, they had a crash in the database during the weekend so the root cause lies with the source.

This has only happened about 5 times in 2 years. However, the main thing we need is finding a way to timeout transfers and not have it takes 40-50h+ before we manually notice it.

As an example, our qlik reports loading from these sources directly correctly timed out.

Currently I am building a script in the repository that sends an email when a execution packages takes longer then X hours. This will help us identify it in the future earlier, but getting timeout to work correctly is important to the customer.

Christian Hauggaard
Community Manager
1 year ago
February 27, 2024

Hi @Victor

In order to troubleshoot, can you please try disabling retries on the transfer task?

Furthermore, are you using managed queries for this TimeXtender SQL data source or simply selecting tables?

Can you also please confirm which data source you are using for the other data source which encountered a database crash during the weekend?

Christian Hauggaard
Community Manager
1 year ago
March 4, 2024

Hi @Victor could you please provide an update on my questions above when you get the chance so we can troubleshoot further? Thanks

Victor
Participant
1 year ago
March 4, 2024

Hello,

We are selecting tables from the source. There is a query table in one of the sources, but not the others.

It has happened on a total of 3 different sources, and happened again this friday on a new source (2 in list below).

1.One synapse database (with timextender sql database connector) - with retries on transfer task
2.One regular sql database (with azure datafactory connector) - without retries on transfer task
3.Another sql database (with a ODBC driver connector) - with retries on transfer task

We could remove retries and see if it comes back, but that will eventually bring it’s own issues.

All we want to achieve if for the tasks to fail and not get stuck on running forever.

EDIT: forgot to answer the last question.
The database crash that caused one was the ODBC connector

Victor
Participant
1 year ago
March 6, 2024

Update:

Removing retries did not resolve the issue for the 2 tasks.

We are getting an increased amount of transfers getting stuck and running for long times without timing out.

Christian Hauggaard
Community Manager
1 year ago
March 7, 2024

@Victor thanks for confirming. For the ADF data source could you please try to lower the activity timeout? what is it currently set to?

Regarding “We are getting an increased amount of transfers getting stuck and running for long times without timing out” - are these for the same data sources mentioned above or different data sources?

Christian Hauggaard
Community Manager
1 year ago
March 8, 2024

Hi @Victor

You mention that “The ODX sql connector har standard setting (15s command timeout)”

Do you mean connect timeout?

What is your command timeout set to? Please see where to find command timeout below. Can you please attach some screenshots

Victor
Participant
1 year ago
March 8, 2024

Hello,

It is for the same data sources as mentioned above.

This is sql connector setup for the one we are having the biggest problem with.

The sql connectors look like this:

Command setup:

This is on timextender version: 20.10.31.64
ODX version: 20.10.31

Christian Hauggaard
Community Manager
1 year ago
March 8, 2024

Hi @Victor

Could you please lower the command timeout to 300 to troubleshoot further? How many tables do you have for the transfer task for the TX SQL data source?

Victor
Participant
1 year ago
March 8, 2024

Hello,

We could try that , but the nightly full transfer for that task takes about 15 minutes. Will that not make it stop working?

The hourly transfer for that source is 25 tables with most of them being incremental.

Christian Hauggaard
Community Manager
1 year ago
March 8, 2024

@Victor I believe the command timeout is per command i.e. per table

Victor
Participant
1 year ago
March 8, 2024

I have made the change and will monitor it over the weekend and report back here.

Meanwhile, thank you very much for the support.

Victor
Participant
1 year ago
March 18, 2024

A combination of changing the command timeout and removing retries has solved the issues of it getting stuck “forever”.

We will try at a lter date to reintroduce retries to see if it can work with a shorter command timeout.

Thank you @Christian Hauggaard for the help!

Reply

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos + marketing

Reply

Related topics

Add Ingest Instances

TimeXtender 6618.1

ODX transfer task results in command timeouticon

Decreasing ODX transfer timeicon

Run transfer job only when ODX transfer task completed without errorsicon

Most helpful members this week

Sign up

Login with SSO

Login to the community

Login with SSO

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings