For tables on a Prepare Instance deployed on Azure Synapse Analytics, additional settings are available on the Performance tab in Table Settings. The Distribution option controls how the data in the table is distributed among the 60 distributions, which are in turn spread across the compute nodes on the server.
The following settings are available:
- Round-robin (default): Rows of data are distributed evenly across all distributions.
- Replicate: A full copy of the table's data is stored on every compute node.
- Hash: Rows of data are split among the distributions using a hash function based on the values in a 'distribution column' that you set. The default is the system field 'DW_id'.
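To make the difference between Round-robin and Hash concrete, here is a minimal Python sketch that simulates how rows could land in the 60 distributions. It is an illustration only: the sample rows, the `country` column, and the SHA-256 based hash are assumptions made for this example, and the hash function that Azure Synapse Analytics uses internally is different. Replicate is not simulated, since it simply stores the whole table on every compute node.

```python
from collections import Counter
import hashlib

NUM_DISTRIBUTIONS = 60  # number of distributions stated in this article


def round_robin(rows):
    """Assign rows to distributions in turn, ignoring the row values."""
    return [i % NUM_DISTRIBUTIONS for i, _ in enumerate(rows)]


def hash_distribute(rows, column):
    """Assign each row by hashing the value in its distribution column.

    Illustrative hash only; not the function Synapse uses.
    """
    assignments = []
    for row in rows:
        digest = hashlib.sha256(str(row[column]).encode()).digest()
        assignments.append(int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS)
    return assignments


# Hypothetical sample data: a unique DW_id and a column with only two values.
rows = [{"DW_id": i, "country": "US" if i % 10 == 0 else "DK"} for i in range(6000)]

rr = Counter(round_robin(rows))
by_id = Counter(hash_distribute(rows, "DW_id"))
by_country = Counter(hash_distribute(rows, "country"))

print(min(rr.values()), max(rr.values()))  # 100 100: perfectly even
print(len(by_id))       # many distributions in use, sizes roughly even
print(len(by_country))  # at most 2: every row with the same value lands together
```

In this sketch, a distribution column with many distinct values (such as 'DW_id') spreads rows widely, while a column with few distinct values concentrates all rows in a handful of distributions. Rows with the same column value always end up in the same distribution.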
To set the distribution for a table:
- Right-click the table, click Table Settings, open the Performance tab, and then select the option you want under Distribution.
To set a distribution column for the table:
Note: This option is only visible for tables on Azure Synapse Analytics.
- Right-click a field in the table and click Distribution Column.
Note: Since tables with hash distribution enabled require a distribution column, you cannot clear the distribution column selection. However, you can set another column as the distribution column.
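For orientation, the three settings correspond to the DISTRIBUTION table option that Azure Synapse Analytics dedicated SQL pools accept in a CREATE TABLE statement. The Python sketch below builds that option from a setting name. The function `distribution_clause` is hypothetical and written for this example only; that the product generates exactly this text is an assumption, not something this article states.

```python
def distribution_clause(mode, column=None):
    """Return the DISTRIBUTION table option for a given setting.

    Hypothetical helper for illustration; not part of any product API.
    """
    mode = mode.lower()
    if mode == "round-robin":
        return "DISTRIBUTION = ROUND_ROBIN"
    if mode == "replicate":
        return "DISTRIBUTION = REPLICATE"
    if mode == "hash":
        # Hash distribution cannot work without a distribution column.
        if not column:
            raise ValueError("hash distribution requires a distribution column")
        return f"DISTRIBUTION = HASH([{column}])"
    raise ValueError(f"unknown distribution setting: {mode}")


print(distribution_clause("round-robin"))    # DISTRIBUTION = ROUND_ROBIN
print(distribution_clause("hash", "DW_id"))  # DISTRIBUTION = HASH([DW_id])
```

Calling `distribution_clause("hash")` without a column raises an error, which mirrors why a distribution column can be changed but never removed while hash distribution is enabled.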
For more information on distribution, refer to the Azure Synapse Analytics documentation.