Configuring a CSV data source stored in an Azure Data Lake Gen2 storage container

  • 8 May 2023

We were able to get a CSV file data source stored in an Azure Data Lake Gen2 storage container to work, and thought it might be useful to share how. We found that the setup worked best when we entered only the settings described below and left every other setting untouched.

 

Source File: CSV.

Storage Location:  Azure Data Lake Gen2 storage container

Authentication Method: App Registration added to the storage container via a role assignment of "Storage Blob Data Contributor".

 

We started by adding a new data source and selecting CSV as the provider.

On the data source details page, we entered the information for the nine items outlined below, updating only those items and leaving every other box untouched.

 

Item #1 will already be set to CSV based on the initial selection above, so that does not need to be updated. The following numbers correspond to the red boxes in the screenshot below.
 
 

  1. Provider Name: This should be CSV based on the data source selection above.
  2. In the Authentication section, set the Auth Scheme to "AzureServicePrincipal".
  3. In the Azure Authentication section, enter the name of your Azure Storage Account, which will be different from the name in the screenshot below.
  4. In the Azure Authentication section, enter the Azure Tenant ID for your organization.
  5. In the Connection section, enter the URI in the format of "abfss://<containername>/<filename>". The format of this may be slightly different depending on where your CSV file is located inside the container.
  6. In the OAuth section, the "Initiate OAuth" setting should be set to "GETANDREFRESH".
  7. In the OAuth section, the "OAuth Grant Type" setting should be set to "Code".
  8. In the OAuth section, the "OAuth Client ID" box should have the client ID for the app registration that has been given a role assignment for your ADLS Gen2 storage account.
  9. In the OAuth section, the "OAuth Client Secret" box should have the secret for the app registration that has been given a role assignment for your ADLS Gen2 storage account.
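For anyone who wants to review the nine settings in one place before typing them into the form, here is a minimal Python sketch. It only collects the values and prints them as a semicolon-delimited string, and it does not connect to anything. The function names are hypothetical, and the key names (for example `AzureStorageAccount` and `AzureTenant`) are our assumption of how the labels on the details page map to property names, so check them against your provider's documentation.

```python
# Hypothetical helpers: gather the nine settings from the list above so they
# can be reviewed together. Key names are assumptions based on the UI labels.

def build_settings(storage_account, tenant_id, uri, client_id, client_secret):
    """Return the nine settings in the same order as the numbered list."""
    return {
        "Provider": "CSV",                      # item 1, preset by the selection
        "AuthScheme": "AzureServicePrincipal",  # item 2
        "AzureStorageAccount": storage_account, # item 3
        "AzureTenant": tenant_id,               # item 4
        "URI": uri,                             # item 5
        "InitiateOAuth": "GETANDREFRESH",       # item 6
        "OAuthGrantType": "Code",               # item 7, as used in this post
        "OAuthClientId": client_id,             # item 8
        "OAuthClientSecret": client_secret,     # item 9
    }


def to_connection_string(settings, mask_secret=True):
    """Join the settings as key=value pairs, refusing empty values."""
    parts = []
    for key, value in settings.items():
        if not value:
            raise ValueError(f"missing value for {key}")
        if mask_secret and key == "OAuthClientSecret":
            value = "****"  # avoid printing the real secret
        parts.append(f"{key}={value}")
    return ";".join(parts)


if __name__ == "__main__":
    example = build_settings(
        storage_account="mystorageaccount",           # placeholder
        tenant_id="<tenant-id>",                      # placeholder
        uri="abfss://<containername>/<filename>",     # format from item 5
        client_id="<client-id>",                      # placeholder
        client_secret="<client-secret>",              # placeholder
    )
    print(to_connection_string(example))
```

The secret is masked by default so the printed string is safe to paste into a ticket or a reply here.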

The URI section might be a little different, depending on your type of Azure Data Lake storage. The CData documentation clarifies this as follows:

 

Azure Data Lake Storage

Set the following to identify your CSV resources stored on Azure Data Lake Storage:

 

ConnectionType: Set this to Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, or Azure Data Lake Storage Gen2 SSL.

URI: Set this to the name of the file system and the name of the folder which contains your CSV files. For example:

Gen 1: adl://myfilesystem/folder1

Gen 2: abfs://myfilesystem/folder1

Gen 2 SSL: abfss://myfilesystem/folder1
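To make the three URI formats above easier to compare, here is a small Python sketch that builds the URI from the connection type. The scheme mapping comes straight from the documentation excerpt above; the function name `build_uri` and its parameters are hypothetical and only for illustration.

```python
# Scheme per ConnectionType, taken from the three examples quoted above.
SCHEMES = {
    "Azure Data Lake Storage Gen1": "adl",
    "Azure Data Lake Storage Gen2": "abfs",
    "Azure Data Lake Storage Gen2 SSL": "abfss",
}


def build_uri(connection_type, file_system, path=""):
    """Return <scheme>://<file_system>/<path> for the given connection type."""
    if connection_type not in SCHEMES:
        raise ValueError(f"unknown ConnectionType: {connection_type}")
    uri = f"{SCHEMES[connection_type]}://{file_system}"
    path = path.strip("/")  # tolerate leading or trailing slashes
    return f"{uri}/{path}" if path else uri


if __name__ == "__main__":
    for connection_type in SCHEMES:
        print(connection_type, "->", build_uri(connection_type, "myfilesystem", "folder1"))
```

In our case the connection used the `abfss` scheme, as in item 5 of the list.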

 

 

I would be interested to hear from anyone who has done a similar setup using a Service Principal to authenticate to data source files stored in a data lake. Did you find it easy enough to set up, or did you run into any challenges? What was your experience like?

