This article describes the structure of an ODX, and outlines how to add data from a data source to an ODX instance and then transfer the data from the ODX to a Data Warehouse.
The ODX Server: Windows-based server software installed on a centrally located physical or virtual machine. This software orchestrates the ingestion of data from various sources.
Though data ingestion tasks are configured by the desktop software, the actual movement of data is performed by the ODX server, and because the ODX Server can be installed within your own environment it ensures that the data is always within your control.
The ODX server copies data from data sources to a data storage, which can either be on-premise or in the cloud. The data can then be transferred from the ODX to data warehouses using the TimeXtender application.
Raw Data Storage: You can configure either an Azure Data Lake or SQL Database for the initial storage of raw data from the source. The ODX automatically handles changes in the source schema by creating a new version of the target table to match, keeping the old version by default (just in case).
You may have multiple on-premises machines with TimeXtender Desktop, connecting to a remote ODX Server. To configure this scenario, review this guide: Connect to a Remote ODX Server from TimeXtender Desktop
Connect to ODX in TimeXtender Desktop
Once the the ODX Server and the TimeXtender Desktop Application have been installed, and you have made sure that the TimeXtender ODX Service is currently running by checking the Windows Services, run the TimeXtender Desktop Application and sign-in.
After a successful sign-in, it will connect and list ODX instances, which the user has access to, in the Solution Explorer in the left-hand side of the main window. For more information on how to grant access to instances in the TimeXtender Portal, see Grant Access to Instances. If any changes are made within the portal to instances or data sources, make sure to refresh in TimeXtender. Either by clicking the refresh button:
Or by selecting view and then Refresh.
Create the ODX Data Storage within TimeXtender Desktop
Open the instance, then right-click the instance, click Edit and click Create storage... to create the data storage that has been defined in the portal for the ODX instance.
Add data sources to the ODX instance within TimeXtender Desktop
Opening the ODX Instance in TimeXtender Desktop allows you to configure your data sources like selecting tables & fields for ingestion, identifying primary keys, configuring Incremental load, & scheduling recurring transfers etc.
Right-click on data sources and select Add Data Source... Provide a name and then select one of the data sources that has been added and mapped to the ODX instance in the portal.
Then add a transfer task by right-clicking on the data source and selecting Add Transfer Task...
You may select specific tables that you would like to transfer in the transfer task, for more information see Selecting Tables from Transfer Tasks.
A Synchronization task is added automatically when you add a new data source. Whenever one of the below changes is made to the data source, right-click on the Synchronization task and select Execute in order to apply the changes.
The ODX Server service is remotely controlled from the desktop client to facilitate the transfer of data from the source to the ODX Storage using one of many ADO.net providers or Azure Data Factory. Incremental load ensures only new and updated rows are loaded.
Once the transfer task is executed, data will be transferred from the data source to the ODX data storage.
To learn more about searching for and previewing data in an ODX storage, or transferring data from an ODX, see Transferring Data from an ODX to a Data Warehouse Instance.
- Add ODX Instances
- Tasks in an ODX Instance
- Incremental Load in an ODX Instance
- Table and Column Selection in an ODX Instance
Mismatch in Row Counts
Symptoms: When you preview a table in the ODX, and compare it to the current table rows in the data source (by querying the data source directly), sometimes the row count shows a difference.
Cause: The goal of the ODX is to ingest all raw data. While the ODX supports Incremental load and deletes, it does not remove deleted or updated records from the target database with the intention of giving users full control of previous versions of data. The ODX flags these records and communicates this to the Data Warehouse, enabling the Data Warehouse to only extract the latest valid records.
The ODX also creates new versions of target tables when a schema drift occurs. This may result in duplicate tables with some of the same data.
In other words, the ODX maintains both deleted and updated records of the current version, which is why you may see a higher row count.
The storage management task only deletes old "Versions", but it does not clear up deleted or subsequently updated rows.
You may want to view the Errors report for this table in the Data Warehouse, which may show some rows being left out due to data cleansing.