This article provides insight around the differences in performance in each of the Azure SQL Database (PaaS offering) Performance Models and Service Tiers to help determine performance and cost needs. Included in this article are results and findings from extensive testing on different Azure SQL Database service tiers and performance levels along with suggestions on how to improve performance.
We have conducted various performance tests against the different types, purchase models, and service tiers of Azure SQL Database for both single-database and managed instance. The goal is to help provide a level of guidance around how performance varies between the different service tiers and performance levels, along with the associated costs around each, to help offer insight around what Azure SQL PaaS offering to leverage in your modern data estate.
Please note that these test results are not meant to be used as an exact sizing guide. Every environment is inherently different. Database structure, size, complexity, ETL process', etc. vary greatly depending on data, business requirements and other variables.
In this topic:
- Performance Testing
- Additional Information
Why do I need the correct performance level and service tier?
Choosing the most cost-effective service tier is very important in achieving the required performance level for your Azure SQL databases. When compute utilization becomes high, performance begins to suffer so it is crucial to choose the optimal amount of compute, storage, and IO resources.
- Compute: When compute utilization becomes high, query latency increases and data transfer times increase, transformation and cleansing procedures times increase, and tasks can time out.
- Storage: When the size of your database reaches the max size limit, the execution of data transfer tasks, database inserts, and updates can fail.
- Sessions and workers: When the max number of sessions and workers are reached, new requests are rejected. Often times the number of concurrent workers piles up when database resource limits are reached as queries and load times take much longer.
- Backup and Availability: It is essential to the business that your database is up and running 99.99% of the time and data is backed up without having to worry about the impact of maintenance operations and outages.
- Costs: Higher service tiers and compute resources increase the total cost of ownership exponentially. It is crucial to choose the most cost-effective service tier based on the number of resources and performance required.
Azure SQL Database options
- Single Database: A single database owns its set of resources managed via a SQL Database server. A single database is similar to a contained database in SQL Server. You can use either the DTU or vCore pricing model for a single database.
- Elastic Pool: SQL Database elastic pools are a simple, cost-effective solution for managing and scaling multiple databases that have varying resource demands. The databases in an elastic pool are on a single Azure SQL Database server and share compute resources, and the databases only use the necessary resources they need when needed. You can use either the DTU or vCore pricing model with elastic pools.
- Managed Instance (recommended): This is a fully managed SQL Server Database Engine hosted in your own VNET in Azure, essentially making this a cloud-based version of on-premises SQL Server allowing for greater flexibility and security. Managed instance supports database migration from on-premises with minimal to no database change with near 100% compatibility with on-premises SQL Server. And similar to on-premises SQL Server, the databases allocated to the managed instance are managed collectively and share compute resources. With a managed instance, on the vCore pricing model is available.
Azure SQL Database allows you to select between two different purchasing models based on your performance and cost needs. These models are available for both single databases and elastic pools.
Each purchase model also has several different levels of service tiers. The service tiers offer varying levels of compute, IO, availibility, backups, among other features.
- DTU Model: Preconfigured blend of compute, storage, and IO resources at a fixed price. This is the simpler option that is great for paying a fixed amount each month for a preconfigured bundles of resources. The DTU model also offers lower performance and pricing options when compared to the vCore model.
- vCore Model (recommended): Customizable blend of compute, storage, and IO resources that allows you to scale them independently. This is the more flexible option that allows you to pay for only what you need, and also allows you to save when you use Azure Hybrid Benefit for SQL Server.
- Provisioned compute: This tier allows you to choose the exact amount of resources that are always provisioned for your workloads. Billing for this compute tier is per hour.
- Serverless compute (preview): This tier allows for the autoscaling of resources across a range of compute where you define the minimum and maximum number compute resources. This tier also automatically pauses during inactivity and resumes your databases during increased workloads. Billing for this compute tier is per second.
- Read more about the vCore Model here
- Read more about the vCore service tiers here
One of the major benefits of using Azure is that you can quickly scale up/down to make it easier to test your workloads between different purchase models and service tiers. It's also important to note that the DTU model has more options for lower compute resources at a lower price point, whereas the vCore model offers greater flexibility in managing your costs when using higher compute resources.
- Read Microsoft's documentation for more in-depth knowledge and comparisons about the two different purchasing models and service tiers.
- Microsoft's Pricing Calculator can help you estimate costs for your Azure SQL Database based on the purchase model and service tier you plan to use.
- Full load of the entire sample database from ODX to MDW
- Tested across 5 different sized databases (1GB, 5GB, 10GB, 20GB, 50GB)
- Data is transferred from one Azure SQL Database (ODX) to another Azure SQL Database (MDW) to reduce any network latency.
- The Azure SQL Database used for the ODX remained the same across each test (Standard S0:10 DTUs)
- Discovery Hub is hosted on a Virtual Machine in Azure
- VM Size: DS2_v2 (2 VCPUS, 7 GB RAM)
- MDW has a number of complex transformations and cleansing procedures that occur during execution
- Max rows to copy = 300,000
- This is the number of rows to copy when using ADO.NET providers for data transfer
The structure of the sample database that was used for testing is the same across each test.
- 95 tables
- 6 different schemas
- 4 large tables (1 million+ records)
The structure of each of the different sized databases (1GB, 10GB, 20GB, 50GB) is the same. The only difference between them is the amount of data in each table.
To view the detailed test results, download the Excel file at the end of this article.
Below are visual representations of the test results that plots the data transfer times and data cleansing times versus the hourly cost for each service tier and performance level that was tested.
Increase performance using Azure
- Increasing ADO.NET transfer speeds requires more server compute power, which can be accomplished by increasing the number of DTU's or vCores on the target database.
- If your environment has a lot of complex transformations and data cleansing procedures, this requires more IO. To increase the time it takes to perform data cleansing procedures and other IO-intensive procedures, increase the service tier on the target database.
- For faster ADO.NET transfer speeds from source to Azure, increase the memory of the virtual machine that Discovery Hub is hosted on.
Increase performance using Discovery Hub
- To increase the speed of both data transfers and cleansing procedures, increase the number of max threads for executions packages in Discovery Hub. Discovery Hub automatically calculates object dependencies and will still honor the inherent dependencies when running across multiple threads. This helps with both speed and performance.
The optimal number of threads depends on server resources. Too few threads can result in slow execution times, but too many threads can cause the server to become slow or unresponsive.
- Use Incremental loads to facilitate faster load times by enabling you to load only new transactional data into the MDW and ODX. If you are using an Azure Data Lake as your ODX, you can leverage the power of Databricks and a powerful Spark engine for even better performance on incremental loads.
- Enable batch data cleansing to help increase performance for data cleansing procedures on large tables. You can choose to split the INSERT statement up in batches during data cleansing, i.e. when copying data from the transformation view to the valid table. This saves log space on the SQL Server which gives you better performance on large tables with 100,000s or millions of rows.
- Enable row and page-based compression on larger tables to reduce the size of your database. This can also improve performance on I/O intensive workloads. Compressing and decompressing data, however, requires more CPU resources.
- Use the Performance Recommendations tool to take advantage of the performance-improving features in Discovery Hub. The Performance Recommendations tool analyzes your project and recommends changes that can improve the project's performance.
You can use T-SQL and the ALTER DATABASE statement to scale database resources before/after your execution packages to automatically scale resources and save on costs.
SELECT Edition = DATABASEPROPERTYEX('databasename', 'EDITION'),
ServiceObjective = DATABASEPROPERTYEX('databasename', 'ServiceObjective'),
MaxSizeInBytes = DATABASEPROPERTYEX('databasename', 'MaxSizeInBytes');
ALTER DATABASE [databasename] MODIFY (EDITION = 'Premium', MAXSIZE = 1024 GB, SERVICE_OBJECTIVE = 'P15');
You can read more about this and the availible values here.
In your Discovery Hub project, you can add this as a Script Action to scale database resources up/down. when needed. After creating the Script Action, right-click on the database you want to run this on (e.g. MDW), click Advanced, and click Set Pre- and post scripts to select when you want to run this script.
Please note, it can take several minutes or an hour to scale database resources up/down, so space out the execution of this script before deploying/executing to your Azure SQL Database accordingly.