You can use an Azure SQL database as a data source with the PDI client. This connection is required if you want to use the Bulk load into Azure SQL DB job entry to load data into your Azure SQL database from Azure Data Lake Storage. Pentaho supports the Always Encrypted option, dynamic masking, and multiple authentication methods for connecting to an Azure SQL database.
Because one physical server may host databases for multiple customers, keep in mind that SQL for Azure is different from MSSQL. For more information regarding the differences between Azure SQL and MSSQL, see https://docs.microsoft.com/en-us/azure/azure-sql/database/features-comparison
Before you begin
You must have an Azure account with an active subscription and an instance of an Azure SQL database. You also need to install the Azure SQL database drivers. For help installing your drivers, see your Microsoft documentation for details.
Additionally, you need to obtain the following information from your system administrator:
- Host name
- Database name
- Port number
- Authentication method
If you use the Always Encryption Enabled option, you also need to obtain the Client id and Client Secret Key.
Pentaho supports four authentication methods for connecting to the Azure SQL DB instance:
Connect using the Azure SQL Server username and password.
Azure Active Directory
Connect using Multi Factor Authentication (MFA). The MFA password must be entered on the displayed webpage.
Azure Active Directory with password
Connect using an Azure AD username and password.
Azure Active Directory with integrated authentication
Connect using the federated on-premises Active Directory Federation Services (ADFS) with Azure Active Directory in the cloud.
Connect to an Azure database
Start the PDI client and create a new transformation or job.NoteYou can also use the Pentaho User Console to make this connection. See Define data connections
In the View tab of the Explorer pane, double-click on the Database connections folder. The Database Connection dialog box appears, as shown below:
Enter your database connection information.
Field Description Host Name The name of the Azure SQL server instance. Database Name The name of the Azure SQL database to which you are connecting. Port Number The TCP/IP port number. The Azure SQL Database service is only available through TCP port 1433. You must set your firewall to allow outgoing TCP communication on port 1433. Authentication method The authentication method used to connect to the Azure SQL DB instance. The default is SQL Authentication. Username The username used to connect to the database. Password The password used to connect to the database. Always Encryption Enabled Select to use encryption. See Use the Always Encryption Enabled option for instructions on using this option. Client id The unique client identifier, used to identify and set up a durable connection path to the server. Client Secret Key The unique name of the key value in the Azure Key Vault.
Click Test to verify your connection.
Use the Always Encryption Enabled option
Generate a column master key in the Azure Key Vault.
Encrypt the column using the column master key.
Register the app under Azure Active Directory and obtain both the Client id and Client Secret Key.
Grant permissions to the Client id for accessing the Azure Key Vault.
Select Always Encryption Enabled and provide the Client id and Client Secret Key.