This guide will cover how to create and add Azure Data Lake storage for the ODX in TimeXtender.
If you are creating a TimeXtender Environment from scratch, we highly recommend using one of the Reference Architectures for deploying TimeXtender in Azure.
If you have already deployed one of the Azure Marketplace templates with Azure Data Lake, you already have all of the necessary data lake resources and can skip step 1 and proceed to step 2 to register an application.
If you already have an existing TimeXtender deployment in Azure but have not yet configured an Azure Data Lake storage account, you can use this guide to add a Data Lake storage option, starting at step 1.
Complete the following steps to create Azure Data Lake Storage for the ODX in TimeXtender:
- Create an Azure Storage Account
- Create an App Registration
- Enable App Registration access to Data Lake
- Add Azure Data Lake Storage in TimeXtender
1. Create an Azure Storage Account
Note: If you already have a Data Lake Storage account, you can skip this step.
- In the Azure portal, create a Data Lake Storage account that will be used to host the ODX database.
- Azure Portal -> Create a new Resource -> Storage account - blob, file, table, queue -> Create storage account
- Assign Subscription name, Storage account name, Location and other properties.
- Select Account kind = StorageV2 (general-purpose v2)
- Advanced tab -> set Hierarchical namespace to Enabled
For more details, refer to the Microsoft Azure documentation article Creating Azure Data Lake Storage Gen 2.
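The choices made in this step can be summarized in a small sketch. This is only an illustration: the function name and the dictionary keys are hypothetical and are not an Azure API payload. The name check reflects the Azure rule that storage account names are 3 to 24 characters long and contain only lowercase letters and digits.

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def storage_account_settings(name: str, location: str) -> dict:
    """Collect the step 1 choices in one place (hypothetical structure)."""
    if not _ACCOUNT_NAME.fullmatch(name):
        raise ValueError(
            f"invalid storage account name {name!r}: "
            "use 3-24 lowercase letters and digits"
        )
    return {
        "name": name,
        "location": location,
        # Account kind = StorageV2 (general-purpose v2)
        "kind": "StorageV2",
        # Advanced tab -> Hierarchical namespace = Enabled
        "hierarchical_namespace": True,
    }


if __name__ == "__main__":
    # "odxdatalake01" and "westeurope" are placeholder values.
    print(storage_account_settings("odxdatalake01", "westeurope"))
```

Checking the name before opening the portal avoids a rejected form later; the hierarchical namespace flag is the setting that makes the account a Data Lake Storage Gen 2 account.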
2. Create an App Registration
In order to access the data lake resources from TimeXtender, you will need to configure an App Registration in the Azure portal.
Note: The following access control steps describe the minimum permissions required in most cases. In your production deployment, you may fine-tune these permissions to align with your business rules and compliance requirements. Refer to the Microsoft Azure documentation for details.
- In the Azure Portal menu, click on Azure Active Directory, then click on App Registrations in the menu bar on the left. Then click New Registration.
- Enter a name and select Accounts in this organizational directory only. The value of Redirect URI is the URL at which your application is hosted. Click Register when you are done.
- For the newly added App Registration, select Certificates & secrets and create a New Client Secret. The secret value appears after you click Add. It cannot be read again once it has been saved, so record it somewhere safe right away.
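The App Registration produces values that are entered into TimeXtender later: the Application ID and the client secret, together with the Tenant ID of the directory. A minimal sketch of holding them safely in a script is shown below. The class name and field names are hypothetical; the only assumptions about Azure are that the Tenant ID and Application ID are GUIDs and that the secret should never be printed.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AppRegistration:
    """Hypothetical holder for the values produced in this step."""

    tenant_id: str
    application_id: str
    # repr=False keeps the secret out of logs and debug output.
    client_secret: str = field(repr=False)

    def __post_init__(self) -> None:
        for label, value in (
            ("tenant_id", self.tenant_id),
            ("application_id", self.application_id),
        ):
            try:
                uuid.UUID(value)  # raises ValueError if not a well-formed GUID
            except ValueError:
                raise ValueError(f"{label} is not a GUID: {value!r}") from None
        if not self.client_secret:
            raise ValueError("client_secret is empty")


if __name__ == "__main__":
    # Placeholder GUIDs and secret, for illustration only.
    reg = AppRegistration(
        tenant_id="00000000-0000-0000-0000-000000000001",
        application_id="00000000-0000-0000-0000-000000000002",
        client_secret="placeholder-secret",
    )
    print(reg)  # the secret is not part of the printed representation
```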
3. Enable App Registration access to Data Lake
After the App Registration is created, you need to configure its access to the Data Lake.
- Go back to the resource group where your data lake resources are located and select the Data Lake storage account resource.
- In the menu bar on the left, select Access Control (IAM) and add a role assignment.
- Add the <App Registration Name> you just created to the role of Owner of the resource.
Note: When you add or remove role assignments, wait at least 5 minutes before executing an ODX task. It can take up to 30 minutes for changes to take effect. For more details, see the article Troubleshoot Azure RBAC.
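The waiting guidance in the note can be expressed as a simple polling loop: wait 5 minutes, then retry at intervals until access works or 30 minutes have passed. The sketch below is generic Python and not part of TimeXtender or the Azure SDK. The probe callable is hypothetical and stands for any check you choose, such as attempting to list the storage account's containers with the App Registration's credentials.

```python
import time
from typing import Callable


def wait_for_access(
    probe: Callable[[], bool],
    initial_wait: float = 5 * 60,   # wait 5 minutes before the first attempt
    timeout: float = 30 * 60,       # changes can take up to 30 minutes
    interval: float = 60,           # assumed retry interval, adjust as needed
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once probe() succeeds, False if the timeout passes first."""
    sleep(initial_wait)
    waited = initial_wait
    while True:
        if probe():
            return True
        if waited + interval > timeout:
            return False
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Demo with a fake probe and a no-op sleep so it finishes instantly.
    answers = iter([False, True])
    print(wait_for_access(lambda: next(answers), sleep=lambda seconds: None))
```

The sleep parameter is injectable so the loop can be exercised without real waiting.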
4. Add Azure Data Lake Storage in TimeXtender
After the configuration is completed in the Azure portal, you can create an ODX Azure Data Lake Storage in TimeXtender.
In the ODX Server tab, right-click and select Add Azure Data Lake Gen 2 Data Storage.
- Name: Create a friendly name for the ODX Storage.
- Tenant ID: Also known as the Directory ID, found under Properties of Azure Active Directory.
- Application ID: This is the Application ID of the App Registration created above. It can be found in the Azure portal > Azure Active Directory > App Registrations.
- Application Key: This is the Secret created above for your App Registration.
- Account Name: Use the name of your Azure Storage Account. Enter only the name of the resource, not its URL.
- Container name: Enter a name for the Azure Storage Container and click Create. This will create a Container within the Azure Storage account.
- Timeouts: These are the timeouts for communicating with the Azure Storage account. The defaults are ideal for most situations but may need to be extended for slow connections or exceptionally long data transfers.
- Transfer to data warehouse - Limit memory use: This setting only applies if you are not using Azure Data Factory for Data Transfer. You can learn more about when to use this feature in the article Memory limit restraint of parquet file extraction.
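Two of the fields above are easy to get wrong: Account Name expects the bare resource name, and Container name must follow Azure's container naming rules (3 to 63 characters, lowercase letters, digits, and single hyphens, starting and ending with a letter or digit). The following sketch shows one way to check both before entering them. The helper names are hypothetical, and the endpoint format assumes the public Azure cloud.

```python
import re
from urllib.parse import urlparse

# Letters/digits separated by single hyphens; length is checked separately.
_CONTAINER = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def account_name_only(value: str) -> str:
    """Reduce a storage URL or host name to the bare account name."""
    value = value.strip()
    if "://" in value:
        host = urlparse(value).hostname or ""
    else:
        host = value.split("/")[0]
    return host.split(".")[0].lower()


def dfs_endpoint(account_name: str) -> str:
    """Data Lake Gen 2 endpoint for an account (public Azure cloud assumed)."""
    return f"https://{account_name}.dfs.core.windows.net"


def is_valid_container_name(name: str) -> bool:
    """Check a name against the Azure container naming rules."""
    return 3 <= len(name) <= 63 and _CONTAINER.fullmatch(name) is not None


if __name__ == "__main__":
    # "odxdatalake01" and "odx-storage" are placeholder names.
    name = account_name_only("https://odxdatalake01.dfs.core.windows.net/")
    print(name, dfs_endpoint(name), is_valid_container_name("odx-storage"))
```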
With the information provided above, you should be able to successfully create Azure Data Lake Storage in TimeXtender.