Skip to content

Easily Modify Dataverse Option Set Integers to Text • Synapse Serverless Edition

Please Don’t Make Me Manually Do 37 Left Joins

Modifying integers that represent text is a key part of consuming Dataverse data from a Data Lake. However, many of the published solutions tell you to do this process one integer at a time or don’t create an identical schema to the TDS endpoint. 🤯

We’ve come up with a better way.

Not only will this blog explain the solution – we’ve made it available as Open Source code on GitHub!

Customer Type Code is 10?

Requirements

Need the ability to change fields like “statecode” and “statuscode” from 0, 1, 2 to “Active”, “Inactive”, etc. All fields need to be resolved automagically.

Make the Data Lake output match the native Dataverse connector output.

Open Source Code

Dataflows and Power BI Desktop examples are available in Cooptimize’s GitHub.

Solution

Use dynamic SQL to do N joins to resolve all Option Set Choice fields, including boolean type fields.

Use the equivalent naming convention ({Option Set Field}name) as the Dataverse TDS/SQL connector for maximum compatibility.

Script parameters enable user to create Synapse serverless views for all tables, a subset of tables, or specific fields of a table.

Known Unknowns

  • The SQL code has been optimized, but we haven’t done large dataset performance testing.

Option Sets in Action

Prerequisites

  1. Synapse Serverless database
  2. Download the three SQL scripts from our GitHub:
    1. External data source
    2. External file format
    3. Expanded Dataverse Table from Data Lake

External data source

An external data source is used by Synapse external tables and views to provide access to live Data Lake data leveraging T-SQL.

If you’re working with Data Lakes we recommend downloading the Azure Storage Explorer app. It simplifies finding the URLs, exploring data in the lake, and managing data access.

To create the external data source a Shared Access Signature is required. In Storage Explorer right click on the Dataverse container and select Get Shared Access Signature.

Update the Expiry Time value from the default value of the current date:

Copy the Query string value:

Use the External Data Source SQL script from our GitHub and replace the following values:

  1. Create your own password for the Master Key Encryption
  2. Paste the Shared Access Signature you copied into SECRET. Delete the ? prefix from the secret, it will not work with the ? at the beginning
  3. Enter your storage account and container name into the LOCATION

More information on External Data Sources:
CREATE EXTERNAL DATA SOURCE (Transact-SQL) – SQL Server | Microsoft Docs

External file format

An external file format is used by Synapse external tables and views to define the structure of the files in the lake. For Dataverse data is stored in CSV, thus the file format defines the comma delimiter as a field terminator.

Execute the External file format SQL script on our GitHub.

More information on External File Formats:
CREATE EXTERNAL FILE FORMAT (Transact-SQL) – SQL Server | Microsoft Docs

Expanded Dataverse Table from Data Lake

The script creates the following three objects:

1. Single Option Set View

The query will create a unioned view to collate Option Set tables. Simplifying the Option Set tables makes doing the table joins much easier. Every join can be done on Table Name (Filter), Field Name (Filter) and Option Id. With Microsoft’s approach, you have to use four different sets of join logic.

Why not just use the Option Sets tables from the Lake Database that Microsoft automagically creates? At the time of writing there is more metadata available in the JSON files published to the Data Lake that help the query resolve all Option Sets. For example, boolean (True/False) fields are represented as strings (varchar) in the Lake Database while the metadata in the lake stores them as boolean.

2. External Tables

An external table gives SQL the location and structure of the data to be read.

For each table selected in the query parameters an external table is created in the Synapse Serverless database.

3. Views

In order to resolve the metadata labels views are used to provide the necessary joins to the Option Set view defined above.

For each table selected in the query parameters a Synapse Serverless view is created that resolves all Option Set columns selected in the query parameters. The views will include all fields

Running the Query (4 ways)

The query is designed so that it can run for all tables, a subset of tables, or a subset of columns for a specific table. It can run on an ad hoc basis or be created as a stored procedure. As a stored procedure it can be executed by orchestration tools like Power Automate or Azure Data Factory to automatically make new Option Sets available.

Parameters

ExternalDataSource (required) – external data source as defined above

ExternalFileFormat (required) – external file format as defined above

Schema (required) – database schema to be used for external tables and views. Defaulted to dbo.

ExternalTablePrefix (prefix or suffix required) – external table name prefix.

ExternalTableSuffix (prefix or suffix required) – external table suffix. Defaulted to table.

Table (optional) – comma delimited list (no spaces) if specific tables want to be created or modified. Leaving default value (”) will run for all tables.

Columns (optional) – comma delimited list (no spaces) of columns to be resolved. Leaving default value (”) will run for all columns. This should only be used if one table is supplied to the Table parameter.

Option 1 – Run For All Tables

By leaving the Table and Columns parameter as the default ” value the query will run for all tables published to the data lake.

Option 2 – Run for Specific Table(s)

The Table parameter accepts either a single value or a comma delimited list (no spaces) of tables.

An external table and view will be created for the account and contact table

Option 3 – Run for Specific Option Sets on a Single Table

For added flexibility the query can resolve specific Option Sets within a table. This can be helpful for migrating existing solutions to data lake that only resolved specific Option Sets. In this case, the table should be added to the Table parameter and a comma delimited list (no spaces) of columns added to the Columns parameter.

The account view will be created with just customer type, industry code, and payment terms code Option Sets resolved

Option 4 – Store Query as Stored Procedure

To store the query as a Stored Procedure simply comment out or delete the first set of parameters and uncomment the CREATE PROCEDURE section.

Creating query as a stored procedure makes future runs easier and opens the door to automation

Frequently Asked Questions

“Should I Use the Dataverse Connector or a Data Lake?”

Yes.

The appropriate nuance is out of the scope of this blog. However, if you are running into the Dataverse connector 80 MB size limit, this approach uses the same naming conventions. Therefore it should be possible (easy?) to migrate from Dataverse direct approach to Data Lake + this code in Synapse Serverless.

Is This Code Free?

Yes. The code is under the MIT License, a very simple and permissive open source license.

We Do Ask…

  1. Log any question, issue, shortcoming, or fatal flaw in GitHub.
  2. Share this blog if it helped you.
  3. If you need additional help, check out our Services and Pricing.

Related Microsoft Docs

Leave a Reply

Your email address will not be published.