Follow

Use Discovery Hub with Azure Synapse Analytics (SQL Data Warehouse)

DiscoveryHubPlustAzureSynapse_Shadow.png

Azure Synapse Analytics is a limitless analytics service enabling insights across relational and non-relational big-data. At it's core is the SQL Pool (Previously Azure SQL Data Warehouse), a massively parallel processing database. You can learn more about this service here.

Discovery Hub began supporting Azure Synapse as a target database in version 19.11.2. This functionality enables the use of Azure Synapse Analytics as a target Data Warehouse or Staging Database. When also connected to Azure Data Lake via the ODX Server, users can simply drag and drop data from Data Lake to a Synapse SQL Pool.

In this article you will learn how to:

 

Create and Access an Azure Synapse SQL Pool in the Azure Portal

  1. You can view this Quickstart Guide to create the SQL Data Warehouse from the Azure Portal
    1. The resource can be created in an existing Azure SQL Server or you can create a new one. 
    2. Create a Server Level Firewall Rule
      1. In the Azure SQL Server, click Firewalls and virtual networks, click Add client IP, click save. 

Using the Server Admin, login to the server using SQL Server Management Studio to run the following queries. 

Create your own SQL Server Login (More Info here). Sample statement:

--Create SQL Server Login in the Master Database
USE MASTER
CREATE LOGIN USERNAME WITH PASSWORD = 'STRONGPASSWORD';

--Create User in the desired Data Warehouse
USE --***Insert Database Name Here***
CREATE USER USERNAME FROM LOGIN USERNAME;

--Assign User to the desired role in the Data Warehouse
EXEC sp_addrolemember 'db_owner', 'USERNAME'

Assign your user to the desire resource class (more about Resource Classes here). Sample Statement:

EXEC sp_addrolemember 'xlargerc', 'USERNAME'

Create a Master Key (more about Master Keys here). Sample Statement:

CREATE MASTER KEY ENCRYPTION BY PASSWORD ='STRONGPASSWORD';

Connect to Azure Synapse SQL Pool from Discovery Hub

  1. Add a Data Warehouse in Discovery Hub
    1. Right Click on Data Warehouses in Discovery Hub Solution Explorer > Add Data Warehouse. 
    2. Type the Azure SQL Server name from the Azure portal
    3. Type the User Name and Password created earlier.
    4. Type the Data Warehouse name from the Azure portal.
    5. Go to Advanced Settings > Deployment Target > SQL Data Warehouse
  2. Right Click Data Warehouse > Create required objects for SQL Data Warehouse
  3. If you have not already, run a query to create a Master key in the SQL Data Warehouse. This can be done using the Discovery Hub Query Tool or Management Studio. 
  4. Add Tables, ideally from ODX Server with Azure Data Lake storage. 
  5. Right click a Table > Table Settings > SQL Data Warehouse Tab > Select desired Distribution method.
  6. Deploy & Execute!

Currently Supported Features 

As of November 6, 2019

  • Full and Incremental load from ODX and Business Units
  • Configure Table Type - Right Click Table > Table Settings > SQL Data Warehouse
    • Columnstore table
    • Rowstore (Heap) table
  • Configure Distribution Method - Right Click Table > Table Settings > SQL Data Warehouse
    • Round-Robin
    • Replicate
    • Hashing and Hash column selection
  • Field transformations
  • Field validations
  • Checkpoints
  • Views
  • Supernatural Keys
  • Conditional Lookups
  • Indexes
  • Security
  • Date Table
  • Copy Command from data lake (Currently in public preview) instead of Polybase CTAS
    • To enable > Right-click on Data Warehouse > Enable SQL Data Warehouse Preview Features
    • mceclip0.png
    • This feature resolve the issue in Polybase when loading tables where the source table contains any column with values containing double quotes ["]. More info in Troubleshooting Common Errors below. 

 

Known Issues

As of September 18, 2019

This is a public preview, therefore there are a number of features that are still unsupported when using Azure Synapse. The below list is not inclusive.

  • Custom Table inserts
  • Aggregate Tables
  • Junk Dimension
  • History Tables

 

Troubleshooting Common Errors

Error: "An error occurred during executing this statement \r\n DELETE FROM [dbo].[TimeXtenderExtendedProperties]…"

Error: "Please create a master key in the database or open the master key in the session before performing this operation."

Error: "Unexpected error encountered filling record reader buffer: HadoopExecutionException: Could not find a delimiter after string delimiter."

  • This is a known issue when using Polybase to copy data from Data Lake and there are values in the source file that contain double quotes ["].  To circumvent this issue you can switch to the "COPY command" by enabling SQL Data Warehouse Preview features. See more under the Copy Command in Currently Supported Features above.  

Error: "Azure SQL DW Distribution column cannot be updated. Please use insert and delete"

  • This error is received when attempting to combine Incremental Load on a table using the Hash Distribution Method.
  • This is a known issue and we are working on resolving this in an upcoming release.

Error: "All columns of the table must be specified in the INSERT BULK statement"

  • This error is received when attempting to use one of Discovery Hubs custom tables such as an Aggregate Table or Custom Table Insert etc.

Error: "Common table expressions followed by INSERT, UPDATE, DELETE, or MERGE are not supported in this version."

  • This error is received when attempting to use one of Discovery Hubs custom tables such as an Aggregate Table or Custom Table Insert etc.

 

Was this article helpful?
0 out of 0 found this helpful

0 Comments

Please sign in to leave a comment.