How to auto-scale your Azure SQL database in Azure Data Factory
25 November 2020
Why autoscaling is important?
One of the advantages of running databases on Microsoft Azure SQL Database is the ability to dynamically manage them to adapt to changing workload demands. Autoscaling databases is the most cost effective way to increase the performance of data operations.
It allows you to easily:
- adjust the maximum DTUs (CPU/memory/IOPS resources) used or the storage you need
- to scale into other editions at any time
This implies that resources can be scaled up at high workloads or can be scaled-down at low workloads. This typically results in an overall decrease in costs. In practice upscaling may happen at the start of the business day or at the beginning of an ETL-run, because high workloads are expected at those moments.
This guide explains how to automatically tune database parameters using a low code approach in Azure Data Factory.
How to autoscale your SQL database with Azure Data Factory?
Step1: Write query to rescale your database
With Azure SQL database the following query can be used for modifying the database tier:
ALTER DATABASE [DatabaseName] MODIFY (EDITION = ‘edition’, SERVICE_OBJECTIVE = ‘service objective’);
Afterwards check if the scaling is performed by inspecting the resource in the Azure Portal or by using following code:
SELECT DATABASEPROPERTYEX(db_name(),'edition') AS ServiceTier, DATABASEPROPERTYEX(db_name(),'serviceobjective') AS ComputeSize
After executing this query, you will be able to check the service tier and the compute size of your database. In this case, we have chosen to scale up the database to a Standard S4 instance.
Step 2: Create a stored procedure
It's not very user friendly to execute the T-SQL statement each time you expect a change in workload demand. The optimal solution is to create a stored procedure on the database that executes automatically at a fixed time or before a changing workload of data operations.
Use the following Azure SQL database query to create a stored procedure that scales up your database to a Standard S4 instance.
CREATE PROCEDURE [schema].[name procedure] AS ALTER DATABASE [DatabaseName] MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S4');
Step 3: Create a pipeline in Azure Data Factory
In this step it is explained how to create a pipeline that executes a stored procedure in Azure Data Factory. Do this by adding the stored procedure activity.
3.1. CREATE A NEW PIPELINE IN AZURE DATA FACTORY
3.2. CHOOSE THE STORED PROCEDURE ACTIVITY
This can be found under the tab named “general”:
3.3. CLICK ON THE STORED PROCEDURE ACTIVITY AND COMPLETE THE SETTINGS
- Specify the linked service you want to use: this is the name of the connection you made to the database
- Specify the stored procedure name : this is the name of the stored procedure you created in the previous step
3.4. RENAME THE PIPELINE AND THE STORED PROCEDURE ACTIVITY
In this example we have chosen for “UpscaleDB” to clarify the goal of the data pipeline:
Step 4: Schedule the pipeline in Azure Data Factory
In this step it is explained how the stored procedure activity can be executed at a fixed moment by adding a trigger to the datapipeline.
4.1. ADDING THE TRIGGER
First, click "Add Trigger" in the menu, then select "New/Edit" and "+ New":
To complete the trigger page:
- Confirm that ‘schedule’ is selected for type
- Specify the start datetime of the trigger ( UTC-time). It is set current datetime by default
- Specify the ‘Recurrence’ for the trigger (hourly, daily, weekly etc)
- If you don’t want to specify an end datetime for the trigger select ‘No end’ (make sure the activated box is ticked!)
4.2. PUBLISH YOUR CHANGES
Step 5: Scale your database down
In our example we've demonstrated how to scale up a database, but it's equally important to scale down the database to reduce overall costs. A datapipeline can be created in the same way to scale down your SQL database at a certain timestamp.
By using this guide, you’ll be able to dynamically scale databases with a very flexible low-code approach. Performance issues that were not solved by indexing, or rewriting queries can be mitigated with scaling up your database. Scaling down at non-peak times can reduce your cost by reducing the available resources when you don’t need as much.
In case you have any additional questions feel free to reach out at any moment.