Solved

Sync not working - OracleDB - ADF


I have a setup with TimeXtender 20.10.40.64, ODX, ADL, ADF and a self-hosted IR, which works just fine.

I need to extract data from an IFS10 source (an Oracle database), and it works until synchronization, which seems to enter an endless loop and never finishes or gives an error.

In TX I have

  1. set up a new data source, Azure Data Factory - Oracle
  2. included all connection information
  3. in Select tables I click “Search” and get all the tables available in the Oracle DB
  4. include a couple of tables
  5. set up a transfer task
  6. syncing starts running, and I can see in ADF that it is running, but it never receives any data from the data source

I have tested a SQL data source with ADF and it works just fine, both sync and transfer.

In ADF I have tested previewing tables in the dataset created by TX, and I can preview tables from the Oracle DB just fine. I have even tried to create my own pipeline in ADF to copy data from the same data source to the same sink that TX created, and it works just fine. It creates a parquet file in the destination, ADL, and I can open the file and see that the data is correct.

So, does anyone have an idea why TX can’t sync?


Best answer by Spelling 29 March 2023, 14:03


20 replies

Userlevel 6
Badge +5

Hi @Spelling Are you getting the error on the synchronization or transfer task? Can you please send the error you are getting?

Hi @Christian Hauggaard,

 

No error, it just keeps running during synchronization.

Even with just 1 table selected, I have kept it running for hours before canceling, and I can see in the ADF monitor that during this time no data has been received from the data source, so nothing has been transferred.

If I test from a SQL source, the sync only takes a minute or two, so it shouldn’t take hours.

Is it possible to somehow see the parameters that TX sends to ADF, just to check whether they work if I run them manually in ADF?

Userlevel 6
Badge +5

Hi @Spelling In order to troubleshoot further, can you please share your data source settings, similar to below?

 

Hi @Christian Hauggaard,

OK, that is strange, because when setting up the data source and getting to the step of selecting tables to include, there is activity in the created pipeline in ADF. When the ADF activity is finished, the table selection appears. When syncing, there are also pipelines running. This is true for both Oracle and SQL sources.

What I can see is that when creating a new data source based on ADF (that is, ADF - Oracle or ADF - SQL), one pipeline is created to fetch all the tables that I can choose to include. Then TX creates another pipeline, which is used for sync and transfer.

 

These are the settings:

 

Userlevel 5
Badge +5

Hi Aron

I tried to replicate the issue.

The first synchronization I ran was actually for all tables of my Oracle data source.

The second one took double the amount of time, but I did not select all tables then, only a few specific tables.

If you have the time, I would like you to try to connect to all tables as well, to see if it is faster that way.

I do not have as many tables in my source as I imagine you have, so it may just take a really long time to synchronize.

I will create an internal question about how it is supposed to work and why it could be slower doing this.

Hi @Thomas Lind ,

Thanks for the feedback, yes I’ll try to include all tables and do a sync. 

It is true that I only selected a couple of tables before syncing; in real production I will use more, but still only a fraction of the total number of tables.

When I selected just a couple of tables I finally canceled the sync after more than an hour.

I’ll get back to you with the result as soon as I have tried to sync with all tables selected.

Hi @Thomas Lind ,

I selected all tables and started the sync, but after 1 hour and 40 min I got an error that looks like a timeout (6000 seconds = 100 minutes = 1 h 40 min):

Error:

Operation on target TIMEXTENDER_ODX_f826cc39c99544e5bc27711e3bf51afe failed: Activity failed because an inner activity failed; Inner activity name: Copy Table, Error: Failure happened on 'Source' side. ErrorCode=UserErrorSourceQueryTimeout,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Query to source database timeout after '6000' seconds.,Source=Microsoft.DataTransfer.DataContracts,''Type=System.TimeoutException,Message=,Source=Microsoft.DataTransfer.DataContracts,'

 

This is how it looks in TX

 

The two errors are the following:

1.

System.AggregateException: One or more errors occurred. ---> System.Exception: Error executing meta data pipeline: {
  "errorCode": "2200",
  "message": "Failure happened on 'Source' side. ErrorCode=UserErrorSourceQueryTimeout,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Query to source database timeout after '6000' seconds.,Source=Microsoft.DataTransfer.DataContracts,''Type=System.TimeoutException,Message=,Source=Microsoft.DataTransfer.DataContracts,'",
  "failureType": "UserError",
  "target": "Copy Table",
  "details": []
}
   at DataSourceEngine.DataFactory.DataFactoryEngine.SynchronizeSource(List`1 existingMetaStructures, DataStorageModel dataStorageModel)
   at ExecutionEngine.Action.SynchronizeAction.Execute()
   at ExecutionEngine.Action.ExecutionAction.<.ctor>b__9_0()
   at System.Threading.Tasks.Task.Execute()
   --- End of inner exception stack trace ---
---> (Inner Exception #0) System.Exception: Error executing meta data pipeline: {
  "errorCode": "2200",
  "message": "Failure happened on 'Source' side. ErrorCode=UserErrorSourceQueryTimeout,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Query to source database timeout after '6000' seconds.,Source=Microsoft.DataTransfer.DataContracts,''Type=System.TimeoutException,Message=,Source=Microsoft.DataTransfer.DataContracts,'",
  "failureType": "UserError",
  "target": "Copy Table",
  "details": []
}
   at DataSourceEngine.DataFactory.DataFactoryEngine.SynchronizeSource(List`1 existingMetaStructures, DataStorageModel dataStorageModel)
   at ExecutionEngine.Action.SynchronizeAction.Execute()
   at ExecutionEngine.Action.ExecutionAction.<.ctor>b__9_0()
   at System.Threading.Tasks.Task.Execute()<---
2.

The execution failed with error:
System.Exception: Error executing meta data pipeline: {
  "errorCode": "2200",
  "message": "Failure happened on 'Source' side. ErrorCode=UserErrorSourceQueryTimeout,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Query to source database timeout after '6000' seconds.,Source=Microsoft.DataTransfer.DataContracts,''Type=System.TimeoutException,Message=,Source=Microsoft.DataTransfer.DataContracts,'",
  "failureType": "UserError",
  "target": "Copy Table",
  "details": []
}
   at DataSourceEngine.DataFactory.DataFactoryEngine.SynchronizeSource(List`1 existingMetaStructures, DataStorageModel dataStorageModel)
   at ExecutionEngine.Action.SynchronizeAction.Execute()
   at ExecutionEngine.Action.ExecutionAction.<.ctor>b__9_0()

 

I found something where someone wrote:

We began to suspect that something was amiss with data types, because the problem disappeared if we cast all of our high-precision Oracle NUMBER columns to less precision, or to something like integer.

It got so bad that we opened a case with Microsoft about it, and our worst fears were confirmed.

The Azure Data Factory runtime decimal type has a maximum precision of 28. If a decimal/numeric value from the source has a higher precision, ADF will first cast it to a string. The performance of the string casting code is abysmal.

Check to see if your source has any high-precision numeric data, or if you have not explicitly defined schema, see if you're perhaps accidentally using string.

 

Not sure if this is relevant.
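If it is the cause, the workaround described in the quote would amount to something like this on the source query side (the table and column names here are hypothetical, just to illustrate the cast):

-- hypothetical example: NUMBER columns above precision 28 reportedly fall back
-- to string handling in ADF; casting to precision 28 or lower avoids that
SELECT
    CAST(AMOUNT AS NUMBER(28, 8)) AS AMOUNT,
    INVOICE_ID
FROM IFSAPP.INVOICE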

Please let me know if you have any feedback.

 

Userlevel 6
Badge +5

Thanks @Spelling for providing this info and performing the all-tables sync test. Since we are unable to reproduce this on our end, and in order to test the theory regarding numerical conversion that you refer to above, could you please try selecting only one table and one column (string, not numerical) on the data source level, and then try synchronizing again? If this succeeds quickly, then add a numeric column for the same table and try synchronizing again. Please let us know how this impacts the synchronization time.

 

Hi @Christian Hauggaard ,

I tried, and it is still just running. I don’t think it matters which tables and columns I select, as I found the input parameters that TX sends to ADF, which then builds a query for the Oracle data source to handle.

The query looks like this:

{
    "source": {
        "type": "OracleSource",
        "oracleReaderQuery": "SELECT \r\n c.OWNER,\r\n c.TABLE_NAME,\r\n c.COLUMN_NAME,\r\n COALESCE\r\n (\r\n (\r\n SELECT \r\n 1\r\n FROM \r\n all_constraints cons, \r\n all_cons_columns cols\r\n WHERE \r\n cons.constraint_type = 'P'\r\n AND cons.constraint_name = cols.constraint_name\r\n AND cols.column_name = c.COLUMN_NAME\r\n AND cols.TABLE_NAME = c.TABLE_NAME\r\n AND cols.OWNER = c.OWNER\r\n AND rownum = 1\r\n ), \r\n 0\r\n ) AS IsKeyColumn,\r\n c.NULLABLE,\r\n c.DATA_TYPE,\r\n c.COLUMN_ID,\r\n c.CHAR_LENGTH,\r\n c.DATA_PRECISION,\r\n c.DATA_SCALE\r\nFROM \r\n SYS.ALL_TAB_COLS c\r\nORDER BY \r\n OWNER, TABLE_NAME, COLUMN_ID",
        "partitionOption": "None",
        "queryTimeout": "01:40:00"
    },
    "sink": {
        "type": "ParquetSink",
        "storeSettings": {
            "type": "AzureBlobFSWriteSettings"
        },
        "formatSettings": {
            "type": "ParquetWriteSettings"
        }
    },
    "enableStaging": false
}
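Decoded, the oracleReaderQuery above reads:

SELECT
    c.OWNER,
    c.TABLE_NAME,
    c.COLUMN_NAME,
    COALESCE(
        (
            -- per-row lookup: 1 if the column is part of a primary key
            SELECT 1
            FROM all_constraints cons,
                 all_cons_columns cols
            WHERE cons.constraint_type = 'P'
              AND cons.constraint_name = cols.constraint_name
              AND cols.column_name = c.COLUMN_NAME
              AND cols.TABLE_NAME = c.TABLE_NAME
              AND cols.OWNER = c.OWNER
              AND rownum = 1
        ),
        0
    ) AS IsKeyColumn,
    c.NULLABLE,
    c.DATA_TYPE,
    c.COLUMN_ID,
    c.CHAR_LENGTH,
    c.DATA_PRECISION,
    c.DATA_SCALE
FROM SYS.ALL_TAB_COLS c
ORDER BY OWNER, TABLE_NAME, COLUMN_ID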

As the decoded query shows, there is no filtering on either the selected tables or the selected columns.

Just for comparison, here is the query that fetches the tables offered for selection when setting up the data source:

{
    "source": {
        "type": "OracleSource",
        "oracleReaderQuery": "SELECT DISTINCT\r\n    c.OWNER,\r\n    c.TABLE_NAME\r\nFROM \r\n    SYS.ALL_TAB_COLS c\r\nORDER BY \r\n    c.OWNER, c.TABLE_NAME",
        "queryTimeout": 100
    },
    "sink": {
        "type": "ParquetSink",
        "storeSettings": {
            "type": "AzureBlobFSWriteSettings"
        }
    },
    "translator": {
        "type": "TabularTranslator",
        "mappings": [
            {
                "source": {
                    "name": "OWNER"
                },
                "sink": {
                    "name": "OWNER"
                }
            },
            {
                "source": {
                    "name": "TABLE_NAME"
                },
                "sink": {
                    "name": "TABLE_NAME"
                }
            }
        ]
    },
    "enableStaging": false
}

It is the same main table, SYS.ALL_TAB_COLS, as in the sync query, and that one works fine.

Here is the input sent by TX to ADF in the sync:


[ { "ColumnMappings": [ { "source": { "name": "OWNER" }, "sink": { "name": "OWNER" } }, { "source": { "name": "TABLE_NAME" }, "sink": { "name": "TABLE_NAME" } }, { "source": { "name": "COLUMN_NAME" }, "sink": { "name": "COLUMN_NAME" } }, { "source": { "name": "IsKeyColumn" }, "sink": { "name": "IsKeyColumn" } }, { "source": { "name": "NULLABLE" }, "sink": { "name": "NULLABLE" } }, { "source": { "name": "DATA_TYPE" }, "sink": { "name": "DATA_TYPE" } }, { "source": { "name": "COLUMN_ID" }, "sink": { "name": "COLUMN_ID" } }, { "source": { "name": "CHAR_LENGTH" }, "sink": { "name": "CHAR_LENGTH" } }, { "source": { "name": "DATA_PRECISION" }, "sink": { "name": "DATA_PRECISION" } }, { "source": { "name": "DATA_SCALE" }, "sink": { "name": "DATA_SCALE" } } ], "Source": { "TableId": "00000000-0000-0000-0000-000000000000", "SchemaName": "SYS", "TableName": "ALL_TAB_COLS", "SelectionQuery": "SELECT \r\n c.OWNER,\r\n c.TABLE_NAME,\r\n c.COLUMN_NAME,\r\n COALESCE\r\n (\r\n (\r\n SELECT \r\n 1\r\n FROM \r\n all_constraints cons, \r\n all_cons_columns cols\r\n WHERE \r\n cons.constraint_type = 'P'\r\n AND cons.constraint_name = cols.constraint_name\r\n AND cols.column_name = c.COLUMN_NAME\r\n AND cols.TABLE_NAME = c.TABLE_NAME\r\n AND cols.OWNER = c.OWNER\r\n AND rownum = 1\r\n ), \r\n 0\r\n ) AS IsKeyColumn,\r\n c.NULLABLE,\r\n c.DATA_TYPE,\r\n c.COLUMN_ID,\r\n c.CHAR_LENGTH,\r\n c.DATA_PRECISION,\r\n c.DATA_SCALE\r\nFROM \r\n SYS.ALL_TAB_COLS c\r\nORDER BY \r\n OWNER, TABLE_NAME, COLUMN_ID" }, "Destination": { "FileName": "_2023_03_27__14_40_51_8055031.parquet", "FolderPath": "mycontainerame/IFS10CLOUD/_ODX_C409DB7FD0BE4E2D843E0E55997E204C_/_WORK/METADATA_2023_03_27__14_40_50_1755663" } } ]

I have tested some more, and here is what I find using the queries sent from TX:

  1. If I exclude the column IsKeyColumn, that is

COALESCE((SELECT 1 FROM all_constraints cons, all_cons_columns cols 
WHERE cons.constraint_type = 'P' AND 
    cons.constraint_name = cols.constraint_name AND 
    cols.column_name = c.COLUMN_NAME AND 
    cols.TABLE_NAME = c.TABLE_NAME AND 
    cols.OWNER = c.OWNER AND 
    rownum = 1 ), 0) AS IsKeyColumn

       then this query only takes a couple of seconds. 

  2. The SYS.ALL_TAB_COLS table includes almost 700 000 rows
  3. If I include a WHERE clause on SYS.ALL_TAB_COLS, for example c.TABLE_NAME = 'SITE', then the whole query, including IsKeyColumn, only takes a couple of seconds.

This means that even if I select only a few tables in TX, it still tries to fetch all tables and columns with their metadata. If TX filtered the query down to the relevant tables, that is the selected ones, then I guess this would work fine. The row count sketch below shows the scale involved.
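To see how much metadata the unfiltered query has to work through, a quick per-owner row count on the same view (my own helper query, not something TX runs) makes the problem visible:

-- rows in the metadata view per owner; the sync query scans all of them
SELECT OWNER, COUNT(*) AS COLUMN_ROWS
FROM SYS.ALL_TAB_COLS
GROUP BY OWNER
ORDER BY COLUMN_ROWS DESC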

Userlevel 6
Badge +5

Hi @Spelling thanks for providing the info and the results of the test. If you set a value for the Owner in the data source settings, the synchronization ADF pipeline should only include tables from that owner. From the screenshot it appears that you have set the owner property, is that correct?

 

Userlevel 6
Badge +5

The reason I ask is that if you have set the owner property, there should be a WHERE clause for the owner in the oracleReaderQuery, which seems to be missing from the query you sent above.

 

Hi @Christian Hauggaard,

I’ve tried this as well, but I want to fetch data from two owners (schemas); the numbers below are from SYS.ALL_TAB_COLS.

The main one includes around 600 000 rows (tables/columns), and it is this owner that takes forever due to the

COALESCE((SELECT 1 FROM all_constraints cons, all_cons_columns cols 
WHERE cons.constraint_type = 'P' AND 
    cons.constraint_name = cols.constraint_name AND 
    cols.column_name = c.COLUMN_NAME AND 
    cols.TABLE_NAME = c.TABLE_NAME AND 
    cols.OWNER = c.OWNER AND 
    rownum = 1 ), 0) AS IsKeyColumn

I have also investigated this subquery more closely, and I find that all the columns I want to fetch have 0 as IsKeyColumn according to the logic above.

 

The other owner works fine, as that schema only includes about 15 000 rows and takes about 30 seconds.
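For what it is worth, the per-row correlated subquery looks like the bottleneck at 600 000 rows. A set-based rewrite that resolves the primary key columns once with a join (just my sketch of an alternative, not what TX actually generates) should avoid the per-row lookups:

SELECT
    c.OWNER,
    c.TABLE_NAME,
    c.COLUMN_NAME,
    -- 1 when the column is part of a primary key, else 0
    CASE WHEN pk.COLUMN_NAME IS NULL THEN 0 ELSE 1 END AS IsKeyColumn,
    c.NULLABLE,
    c.DATA_TYPE,
    c.COLUMN_ID,
    c.CHAR_LENGTH,
    c.DATA_PRECISION,
    c.DATA_SCALE
FROM SYS.ALL_TAB_COLS c
LEFT JOIN (
    -- all primary key columns, resolved once instead of once per row
    SELECT DISTINCT cols.OWNER, cols.TABLE_NAME, cols.COLUMN_NAME
    FROM all_constraints cons
    JOIN all_cons_columns cols
      ON cons.constraint_name = cols.constraint_name
     AND cons.owner = cols.owner
    WHERE cons.constraint_type = 'P'
) pk
  ON pk.OWNER = c.OWNER
 AND pk.TABLE_NAME = c.TABLE_NAME
 AND pk.COLUMN_NAME = c.COLUMN_NAME
ORDER BY c.OWNER, c.TABLE_NAME, c.COLUMN_ID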

 

Userlevel 6
Badge +5

Hi @Spelling can you please try deleting the pipeline in ADF, and then increasing the command timeout on the data source within TimeXtender (by default set to 100 minutes, i.e. 1 hour and 40 minutes, which is the timeout you are seeing), perhaps to 200 or 300, and then execute the synchronize task again? The purpose of deleting the pipeline in ADF first is to ensure that the command timeout property is correctly updated in ADF.
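If the pipeline has been regenerated correctly, the copy activity source should then show the higher timeout, e.g. (assuming the same hh:mm:ss format as in the payload you posted, with 300 minutes):

{
    "source": {
        "type": "OracleSource",
        "oracleReaderQuery": "...",
        "partitionOption": "None",
        "queryTimeout": "05:00:00"
    }
}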

 

Hi @Christian Hauggaard,

OK, I have deleted the pipeline, updated the command timeout to 300 and started a sync again.

Let’s see if the query will execute within 5 hours :)

I’ll let you know the results.

Hi @Christian Hauggaard ,

The ADF monitor says it completed successfully after 2 hours, but TX shows an error on the table IDENTITY_INVOICE_INFO, which is the only table from the main owner that I selected. There were also a few tables from the owner with fewer rows in SYS.ALL_TAB_COLS, but no errors on those, or at least none showing in the TX log.

Userlevel 6
Badge +5

What is the data type of the column OBJID in the table IDENTITY_INVOICE_INFO if you check the Oracle Server? Are you able to successfully execute the transfer task on the selected tables you wish to transfer to ODX storage?

It is ROWID

 

I have several tables in use that have a column called OBJID with DATA_TYPE = ROWID (all in the owner IFSAPP), so it is not only related to IDENTITY_INVOICE_INFO.

I also made a selection on columns and deselected OBJID for each table, and it seems to work now, both sync and transfer, even if the sync takes 2 hours. Right now I have selected 11 tables, all quite small, and the transfer takes only 1-2 minutes. So it is the sync that, in my view, took longer than I could ever imagine.
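A query like this against the same SYS.ALL_TAB_COLS view (my own helper, not something TX generates) lists all the ROWID columns that need deselecting:

-- list every ROWID column in the IFSAPP schema
SELECT OWNER, TABLE_NAME, COLUMN_NAME
FROM SYS.ALL_TAB_COLS
WHERE DATA_TYPE = 'ROWID'
  AND OWNER = 'IFSAPP'
ORDER BY TABLE_NAME, COLUMN_NAME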

Hi again,

I’d like to revisit this topic, as the synchronization fails over and over again, which I believe is due to a timeout on the data source side after 2 hours.

The data source is provided by a third party, which means that the customer cannot change any settings, and 2 hours should be enough time for a sync. The customer has a couple of environments.

After many tests and discussions with the customer, it turns out a similar problem has occurred before, but in an environment not using the ODX server, where a sync took over 2 hours. This was fixed in TimeXtender release 20.10.3, issue “8889: Oracle slow synchronization - affects all bit specific data sources”. Since then, the customer says the sync only takes about 10-15 min.

Could this be a problem in the ODX server as well? I managed to get a successful sync with version 20.10.40 which took about 2 hours, but after an upgrade to 20.10.40 I have managed to get only 1 successful sync in 16 tries over 6 days, where I have tried to sync at different times during the day.

Is it possible to check whether issue 8889 could also affect the ODX server, and whether something can be fixed?

