Add Azure Data Factory Data Sources to ODX Server

This guide covers how to configure an Azure Data Factory data source for the ODX in TimeXtender.

If you are creating a TimeXtender Environment from scratch, we highly recommend using one of the supported configuration options for deploying TimeXtender in Azure.

Prerequisites

To use the Azure Data Factory data source for the ODX, you need the following:

In addition, the original data source must already exist and be reachable from Azure Data Factory. For on-premises data sources, you will also need to create a Self-hosted Integration Runtime.

Complete the following steps to configure Azure Data Factory for the ODX in TimeXtender:

  1. Add Data Factory
  2. Create an App Registration
  3. Enable App Registration access to Data Factory
  4. Add Azure Data Factory Data Source in TimeXtender

1. Add Data Factory

Note: If you have already created a Data Factory, you can skip this step.

  1.  Azure Portal -> Create a new Resource -> Data Factory -> Create 
  2.  Select Version = V2
  3.  Assign Subscription name, Resource Group, and Location
    1. Git configuration is not required, so you can disable it.
  4. Once deployed, please note the following properties of the Data Factory which will be needed later:
    1. Azure Data Factory Name
    2. Subscription ID
    3. Resource Group Name
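The three properties noted above together identify the Data Factory in Azure Resource Manager. As a minimal sketch, the helper below combines them into the standard ARM resource ID format. The function name is hypothetical and is not part of TimeXtender or any Azure SDK; it only shows how the values relate.

```python
def data_factory_resource_id(subscription_id: str, resource_group: str, factory_name: str) -> str:
    """Build the Azure Resource Manager ID of a Data Factory.

    Hypothetical helper for illustration only; TimeXtender asks for the
    three values separately on the Connection Info page.
    """
    values = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "factory_name": factory_name,
    }
    for label, value in values.items():
        if not value or not value.strip():
            raise ValueError(f"{label} must not be empty")
    return (
        f"/subscriptions/{subscription_id.strip()}"
        f"/resourceGroups/{resource_group.strip()}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name.strip()}"
    )
```

The resulting string should match the Resource ID shown on the Data Factory's Properties page in the Azure Portal, which is a quick way to confirm the three values were noted correctly.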

2. Create an App Registration

In order to access the data factory resources from TimeXtender, you will need to configure an App Registration in the Azure portal. 

  1. In the Azure Portal menu, click Azure Active Directory, then click App Registrations in the menu bar on the left. Then click New Registration.
  2. Enter a name and select Accounts in this organizational directory only. The value of Redirect URI is the URL at which your application is hosted. Click Register when you are done.
  3. For the newly added App Registration, select Certificates & secrets and create a New Client Secret. The secret value appears after you click Add. It cannot be viewed again once you leave the page, so copy it and store it somewhere safe.
  4. Please note the following properties of the App Registration which will be needed later:
    1. Azure Tenant ID
    2. Application ID
    3. Client Secret (Application Key)
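To check the three App Registration values outside of TimeXtender, one option is to request a token with the OAuth 2.0 client credentials flow. The sketch below only builds the request against the Microsoft identity platform token endpoint for the Azure Resource Manager scope; it does not send it. The function name is hypothetical, and this is not how TimeXtender itself authenticates as far as this guide states.

```python
import urllib.parse
import urllib.request


def build_token_request(tenant_id: str, application_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request.

    Hypothetical helper: the three arguments are the values noted from
    the App Registration in the step above.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": application_id,
            "client_secret": client_secret,
            # Scope for Azure Resource Manager, which fronts Data Factory management.
            "scope": "https://management.azure.com/.default",
        }
    ).encode("ascii")
    return urllib.request.Request(url, data=form, method="POST")
```

Passing the returned request to urllib.request.urlopen should yield a JSON response containing an access_token when the values are valid. An HTTP error at this point usually means the Tenant ID, Application ID, or Client Secret was copied incorrectly.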

3. Enable App Registration access to Data Factory

After the App Registration is created, you need to configure access to Data Factory.

Note: The following access control steps describe the minimum permissions required in most cases. In a production deployment, you may fine-tune these permissions to align with your business rules and compliance requirements. Refer to the Microsoft Azure documentation for details.

  1. Go back to the resource group where your data factory resource is located and select the Data Factory resource.
  2. In the menu bar on the left, select Access Control (IAM) and add a role assignment.
  3. Add the <App Registration Name> you just created to the role of Data Factory Contributor of the resource.

Note: When you add or remove role assignments, wait at least 5 minutes before executing an ODX task. Changes can take up to 30 minutes to take effect. For more details, see the article Troubleshoot Azure RBAC.
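If you script checks around a role assignment change, the waiting guidance above can be expressed as a simple retry loop: wait 5 minutes, then retry until 30 minutes have passed. This is a sketch under assumptions. The check callable is whatever test you choose (for example, an attempt to list pipelines), and the one-minute retry interval is an arbitrary choice, not a value from Azure or TimeXtender.

```python
import time
from typing import Callable


def wait_for_access(
    check: Callable[[], bool],
    initial_wait_s: int = 300,   # 5 minutes, per the note above
    max_wait_s: int = 1800,      # 30 minutes, per the note above
    retry_every_s: int = 60,     # assumed retry interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once check() succeeds, or False after max_wait_s of waiting."""
    sleep(initial_wait_s)
    waited = initial_wait_s
    while True:
        if check():
            return True
        # Stop if another retry would exceed the overall budget.
        if waited + retry_every_s > max_wait_s:
            return False
        sleep(retry_every_s)
        waited += retry_every_s
```

The sleep parameter is injectable so the loop can be exercised in tests without actually waiting.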

 

4. Add Azure Data Factory Data Source in TimeXtender

After the above configuration is completed in Azure portal, you may add an Azure Data Factory data source in the ODX Server.

Add the Data Source

  1. Open your ODX server in a tab
  2. Right click Data Sources and click Add Data Source...
  3. On the first page, enter a Name and (optional) Description for your data source and click Next
  4. On the Provider page, select one of the Azure Data Factory data source providers and click Next
    • Note: You may select an on-premises or Azure-based data source provider. For an on-premises data source, you will need to create a Self-hosted Integration Runtime.
  5. On the Connection Info page, enter the below information and click Next
    • Azure Data Factory Info
      • Azure AD App Registration created in the above section
        • Application ID
        • Application Key (Client Secret)
      • (Optional)  Azure Data Factory folder name - pipelines and datasets will be placed in this folder in ADF.
      • Azure Data Factory Name
      • Resource Group
      • Subscription ID
      • Azure Tenant ID
    • Database Connection Info
      • Database
      • Username
      • Password
    • Execution Connection - Data source:  fully qualified server name of the data source. 
      • This is the connection property used by Azure Data Factory when extracting data
      • (Optional) Integration runtime name: required only when using an on-premises data source.*
    • Synchronization Connection - Data source:  fully qualified server name of the data source.
      • This is the connection property used by the ODX Server when synchronizing metadata (Schemas, Table Names, Field Names & Datatypes).

Note: Typically, the two data source addresses (names or URIs) are the same. However, they can differ when Azure Data Factory and the ODX Server reach the data source from different sides of a network boundary, for example one from outside and the other from inside the network. The data source must be accessible by BOTH the ODX Server and Azure Data Factory.

*Note: When using an on-premises data source with Azure Data Factory, create a Self-hosted Integration Runtime via the Azure Data Factory UI and specify its name in the connection property.
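Before typing the values into the wizard, it can help to collect them in one place and check that nothing is missing. The sketch below models the Connection Info fields listed above as a plain data class. All class, field, and function names are hypothetical and do not correspond to a TimeXtender API; the only rules encoded are the ones stated above (the folder name is optional, and the integration runtime name is needed for on-premises sources).

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AdfDataSourceSettings:
    # Azure Data Factory info
    application_id: str
    application_key: str
    data_factory_name: str
    resource_group: str
    subscription_id: str
    tenant_id: str
    # Database connection info
    database: str
    username: str
    password: str
    # Address used by Azure Data Factory when extracting data
    execution_data_source: str
    # Address used by the ODX Server when synchronizing metadata
    synchronization_data_source: str
    # Optional values
    adf_folder_name: Optional[str] = None
    integration_runtime_name: Optional[str] = None


OPTIONAL_FIELDS = {"adf_folder_name", "integration_runtime_name"}


def find_problems(settings: AdfDataSourceSettings, on_premises: bool = False) -> List[str]:
    """Return a list of human-readable problems; an empty list means complete."""
    problems = [
        f"{f.name} is required"
        for f in fields(settings)
        if f.name not in OPTIONAL_FIELDS and not (getattr(settings, f.name) or "").strip()
    ]
    if on_premises and not (settings.integration_runtime_name or "").strip():
        problems.append("integration_runtime_name is required for an on-premises data source")
    return problems
```

Keeping execution_data_source and synchronization_data_source as separate fields mirrors the wizard: they usually hold the same server name, but they are allowed to differ.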

 

Continue with the Data Source Setup:

  1. On the Data page, click Let me select the tables if you do not wish to copy all tables from the source. On the next page, you can then select the tables you want to use. Otherwise, just click Next.
  2. Use the Add Task wizard to add an execution task to the data source.
  3. Execute the task to verify that the data source is working.

To copy tables from Data Source to your Data Warehouse

1. Right-Click on ODX -> Select Data 

2. To add tables, drag a table and drop it onto the Tables node in your Data Warehouse.

Troubleshooting

Important: The data source must be accessible by both Azure Data Factory and the ODX Server. If you receive an error, first isolate which of the two components is unable to access the data source.
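One way to test the ODX Server side in isolation is a plain TCP connection attempt from the machine running the ODX Server to the synchronization address. This is a minimal sketch: the function name is hypothetical, and the host and port in the usage note are placeholders you would replace with your own values. It only shows whether the port is reachable from the machine it runs on. It says nothing about credentials, and nothing about reachability from Azure Data Factory, which has to be tested from within Data Factory itself.

```python
import socket


def can_reach(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_reach("sqlserver.example.internal", 1433) checks a hypothetical SQL Server host on its default port. If this returns True on the ODX Server machine but the execution task still fails, the problem is more likely on the Azure Data Factory side, such as the integration runtime or the execution connection address.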

If you change the Data Lake or Data Factory configuration (Access Control (IAM), role assignments, etc.), work through the following steps in order:

  • Wait until the changes take effect in Azure (see Troubleshoot Azure RBAC)
  • Execute the task and verify that it succeeds
  • Preview tables in Data Factory to verify that it is working
  • Synchronize objects in the ODX (if you made changes in the data source)
  • Preview tables in the ODX to verify that the Data Lake setup is working
  • Deploy and execute in TimeXtender