Use Azure Data Lake Storage for your ODX Instance

  • 27 January 2023

Overview

This guide will cover how to create and add Azure Data Lake storage for an ODX instance in TimeXtender.

If you are creating a TimeXtender Environment from scratch, we highly recommend using one of the Reference Architectures for deploying TimeXtender in Azure. 

If you have already deployed one of the Azure Marketplace templates with Azure Data Lake then you already have all of the necessary data lake resources and can skip step 1 and proceed with step 2 to register an application.

If you already have TimeXtender deployed in Azure but have not yet configured an Azure Data Lake storage account, you can use this guide to add a Data Lake storage option, starting at Step 1.

Complete the following steps to create Azure Data Lake Storage for the ODX in TimeXtender:

1. Create an Azure Storage Account

Note: If you already have a Data Lake Storage account, you can skip this step.

  1. In the Azure portal, create a Data Lake Storage account that will host the ODX data storage.
  2. Azure Portal -> Create a new Resource -> Storage account - blob, file, table, queue -> Create storage account
  3. Assign the Subscription name, Storage account name, Location and other properties.
  4. Select Account kind = StorageV2 (general-purpose v2)
  5. Advanced tab -> set Hierarchical namespace to Enabled
  6. It is not necessary to create a container at this time; you will create the container in the TimeXtender Desktop in a later step.

For more details, refer to the Microsoft Azure documentation: Creating Azure Data Lake Storage Gen 2.
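Before creating the account, it can help to check the name you plan to use. The sketch below is illustrative only and not part of TimeXtender; it assumes the standard Azure rule that storage account names are 3 to 24 characters of lowercase letters and digits, and it assumes the Azure public cloud endpoint suffix `dfs.core.windows.net`. The function names are hypothetical.

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` satisfies the storage account naming rule above."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


def dfs_endpoint(account_name: str) -> str:
    """Build the Data Lake (DFS) endpoint, assuming the Azure public cloud."""
    if not is_valid_storage_account_name(account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    return f"https://{account_name}.dfs.core.windows.net"
```

For example, `is_valid_storage_account_name("odxlake01")` is true, while a name such as `ODX-Lake` is rejected because of the uppercase letters and the hyphen.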

2. Create an App Registration

In order to access the data lake resources from TimeXtender, you will need to configure an App Registration in the Azure portal. 

Note: The following steps for access control describe the minimum permissions required in most cases. In your production deployment, you may fine-tune those permissions to align with your business rules and compliance requirements. Refer to the Microsoft Azure documentation for details.

  1. In the Azure Portal menu, click on Azure Active Directory, then click on App Registrations in the menu bar on the left. Then click New Registration.
  2. Enter a name and select Accounts in this organizational directory only. The value of Redirect URI is the URL at which your application is hosted. Click Register when you are done.
  3. For the newly added App Registration, select Certificates & secrets and create a New Client Secret. The secret value appears after you click Add. Record it somewhere safe right away, because it is encrypted after saving and cannot be retrieved later.
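If you want to confirm that the new client secret works before entering it in TimeXtender, one option is to request a token yourself. The sketch below is an assumption about how such a check could look, not a description of how TimeXtender authenticates internally. It builds (but does not send) a standard OAuth 2.0 client credentials request to the Microsoft identity platform, asking for the Azure Storage scope. The function name `build_token_request` is hypothetical.

```python
import urllib.parse
import urllib.request


def build_token_request(tenant_id: str, client_id: str,
                        client_secret: str) -> urllib.request.Request:
    """Build a client credentials token request for the Azure Storage scope."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,          # Application (client) ID of the App Registration
        "client_secret": client_secret,  # the secret value recorded in step 3
        "scope": "https://storage.azure.com/.default",
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    # Nothing is sent here; the caller decides when to contact the endpoint.
    return urllib.request.Request(url, data=body, method="POST")
```

Passing the returned request to `urllib.request.urlopen` should yield a JSON document containing an `access_token` when the tenant ID, application ID and secret are all correct; an HTTP error suggests that one of the three values is wrong or the secret has expired.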

3. Enable App Registration access to Data Lake

After the App Registration is created, you need to configure access to Data Lake.

  1. Go back to the resource group where your data lake resources are located and select the Data Lake storage account resource.
  2. In the menu bar on the left, select Access Control (IAM) and add a role assignment.
  3. Add the <App Registration Name> you just created to the role of Owner of the resource.

Note: After you add or remove role assignments, wait at least 5 minutes before executing an ODX task. Changes can take up to 30 minutes to take effect. For more details, see the article Troubleshoot Azure RBAC.
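If you script a check after changing role assignments, the waiting guidance above can be expressed as a simple polling loop. This is a minimal sketch under stated assumptions: `check` is a hypothetical callable that you supply and that returns True once access works (for example, a small test request against the storage account), and the one-minute polling interval is an arbitrary choice.

```python
import time
from typing import Callable


def wait_for_access(check: Callable[[], bool],
                    initial_wait_s: float = 5 * 60,
                    max_wait_s: float = 30 * 60,
                    interval_s: float = 60,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Wait initial_wait_s, then poll check() until it succeeds or max_wait_s is used up.

    Only time spent sleeping is counted; the duration of check() itself is ignored.
    """
    sleep(initial_wait_s)
    waited = initial_wait_s
    while True:
        if check():
            return True
        if waited >= max_wait_s:
            return False
        step = min(interval_s, max_wait_s - waited)
        sleep(step)
        waited += step
```

The `sleep` parameter is injectable so the loop can be exercised in tests without actually waiting.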

4. Configure ODX Instance Storage

In the portal, add a new ODX Instance or edit an existing one.

  1. Storage Type: Select Azure Data Lake Gen2 storage
  2. Storage Account: Use the name of your Azure Storage Account. Enter only the resource name, not the URL.
  3. Container name: Enter a name for the Azure Storage Container. You will create the container in the TimeXtender Desktop in a later step.
  4. Tenant ID: Also known as the Directory ID, found under Properties of Azure Active Directory.
  5. Application ID: This is the Application ID of the App Registration created above. It can be found in the Azure portal under Azure Active Directory -> App Registrations.
  6. Application Key: This is the Secret created above for your App Registration.
  7. Timeouts: These are the timeouts for communicating with the Azure Storage account. The defaults are ideal for most situations but may need to be extended for slow connections or exceptionally long data transfers. 
  8. Transfer to data warehouse - Limit memory use: This setting only applies if you are not using Azure Data Factory for Data Transfer. You can learn more about when to use this feature here: Memory limit restraint of parquet file extraction
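The field descriptions above can be turned into a quick pre-check of the values before you save them. The sketch below is illustrative and not part of TimeXtender: the `OdxStorageSettings` class and `validate` function are hypothetical names, the container rule assumes the standard Azure naming constraints (3 to 63 characters, lowercase letters, digits and single hyphens, starting and ending with a letter or digit), and the IDs are assumed to be GUIDs in the hyphenated form shown in the Azure portal.

```python
import re
from dataclasses import dataclass

# GUID in the hyphenated 8-4-4-4-12 form.
_GUID = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
# Lowercase letters, digits and single hyphens, starting and ending with a
# letter or digit. The 3-63 length limit is checked separately.
_CONTAINER = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


@dataclass
class OdxStorageSettings:  # hypothetical holder for the form fields
    storage_account: str
    container_name: str
    tenant_id: str
    application_id: str
    application_key: str


def validate(settings: OdxStorageSettings) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if "://" in settings.storage_account or "." in settings.storage_account:
        problems.append("Storage Account should be the resource name, not a URL")
    name = settings.container_name
    if not (3 <= len(name) <= 63 and _CONTAINER.fullmatch(name)):
        problems.append("Container name is not a valid Azure container name")
    if not _GUID.fullmatch(settings.tenant_id):
        problems.append("Tenant ID is not a GUID")
    if not _GUID.fullmatch(settings.application_id):
        problems.append("Application ID is not a GUID")
    if not settings.application_key:
        problems.append("Application Key is empty")
    return problems
```

An empty result only means the values are well formed; it does not prove that the account exists or that the App Registration has access.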

5. Create the ODX Storage (Container)

Before you can execute transfer tasks in an ODX instance, you must create the data storage.

  1. In the TimeXtender Desktop, right-click on the instance and select Edit Instance.
  2. Click Create Storage... to create the storage for the ODX instance.

Troubleshooting

Hostname Error

Error: Service request failed: Invalid URI: The hostname could not be parsed. (System.UriFormatException)

Ensure that the details entered in the portal for the ODX instance do not contain spaces. 
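A stray space is easy to miss, for example at the end of a value that was copied and pasted. As a minimal sketch, the hypothetical helper below reports which fields contain any whitespace; the field labels in the usage example are taken from the configuration step, and the values are made up.

```python
def fields_with_whitespace(fields: dict[str, str]) -> list[str]:
    """Return the names of fields whose values contain any whitespace character."""
    return [
        name
        for name, value in fields.items()
        if any(ch.isspace() for ch in value)
    ]


# Example: the trailing space in the storage account name is flagged.
suspect = fields_with_whitespace({
    "Storage Account": "odxlake01 ",
    "Container name": "odx-storage",
})
```

Here `suspect` contains only `"Storage Account"`, which points at the value to retype in the portal.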


1 reply

Is it possible to configure a data lake connection through a private endpoint?
