Skip to content

Easily Modify Dataverse Option Set Integers to Text • Power Query Edition

Please Don’t Make Me Manually Do 37 Left Joins

Modifying integers that represent text is a key part of consuming Dataverse data from a Data Lake. However, many of the published solutions tell you to do this process one integer at a time. 🤯

We’ve come up with a better way.

Not only will this blog explain the solution – we’ve made it available as Open Source code on GitHub!

An assortment of gelato flavors
Which flavor would you like to try? “4”

Requirements

Need the ability to change fields like “statecode” and “statuscode” from 0, 1, 2 to “Active”, “Inactive”, etc. All fields need to be resolved automagically.

Make the Data Lake output match the native Dataverse connector output.

Solution

Use a recursive function to do N joins to resolve all Option Set Choice fields.

Use the equivalent naming convention ({Option Set Field}name) as the Dataverse TDS/SQL connector for maximum compatibility.

Open Source Code

Dataflows and Power BI Desktop examples are available in Cooptimize’s GitHub.

Known Unknowns

  • The Power Query code has been optimized, but we haven’t done large dataset performance testing.
  • There’s room for improvement by pre-filtering the list of fields to resolve. The code can do this; however, it made the open source code user experience too complicated.
  • Booleans are switched to 0/1 to support the joins and are not switched back.

Option Sets in Action – Account Example

The scenario below is for data synchronized with a Data Lake. None of the Data Lake tables have any “name” columns for Option Sets, only integer columns.

Using either the Dataflow or Power BI Desktop code we provide, you will be able to easily return the name columns from a Data Lake Table.

Prerequisites

Download from our GitHub either the Dataflow JSON and import as a new Dataflow, or the Power BI Desktop Template.

Installation

The fastest way to explore this code is via the Power BI Desktop file. Simply open the Template file.

To import the Dataflow:

  1. Login to a PowerBI.com Workspace.
  2. Click New > Dataflow.
  3. Click the Import model button.
  4. Select the JSON file downloaded from our GitHub.

Parameters

If you’re working with Data Lakes and Power Query, download the Azure Storage Explorer app. It simplifies finding the URLs you need for Power BI. You can connect to your Dataverse Data Lake account and find the Dataverse folder’s Properties.

Copy and paste the DFS URL into the Power Query parameter.

Account Example

Assuming you’ve synced the “account” table into your Data Lake, the Accounts table should “just work” in the Examples folder. When running the first time, you will need to authorize the connection to your Data Lake Storage account.

Choose columns showing statecode and statuscode with name fields.
Name fields!

Adding Additional Tables

The best way to add additional tables is to duplicate the “account” query. Then modify these parameters:

  • DataverseTableName – The name of the table. Must be a table synced to your Data Lake.
  • ListOfFieldsToExtend – Optional comma delimited list to return the “name” field. Limiting this list will improve performance.

About the Automagical Function

In the Dataflow and Power BI Desktop examples, there is a function doing most of the heavy lifting for adding the name fields. It has some interesting code that is worth exploring.

Resursive Power Query Function

One interesting component of the function is a recursive Power Query function. This function loops through the table N number of times to add N number of name fields. It escapes the function when the the count of Option Set fields is reached. The recursive function allows the code to work against any table with any number of Option Sets.

fnTableJoinRecursion = (CombinedTable, i, CountColumnsLocal) as table =>
   let
      UpdatedTable = if List.Contains( BooleanColumns, OptionColumns{i} )
         then
            Table.NestedJoin(...)
         else
            Table.NestedJoin(...),
            UpdatedTableExpanded = Table.ExpandTableColumn(UpdatedTable, OptionColumns{i} & "Table", {"Option Field Label"}, {OptionColumns{i} & "name"}),
          CheckComplete = 
            if i < (CountColumnsLocal - 1) 
            then @fnTableJoinRecursion(UpdatedTableExpanded, i + 1, CountColumnsLocal)
            else UpdatedTableExpanded
   in
      CheckComplete

Option Set Table Function

Microsoft recently started writing the metadata as files and as Lake Database External Tables.

However, we have written a more user friendly function that UNIONs OptionsetMetadata, GlobalOptionsetMetadata, StateMetadata and StatusMetadata from the Data Lake into a single table using the JSON metadata files. For now we’ve kept our approach.

Single Option Set Table

Simplifying the Option Set Table makes doing the table joins much easier. Every join can be done on Table Name (Filter), Field Name and Option Id. With Microsoft’s approach, you have to use four different sets of join logic.

Four types of metadata combined into one Query.
The Fantastic Four Metadatas?

Writing the UNIONed table back to the Data Lake might be necessary for performance reasons. We also have a Dataflow for that approach. Requires Power BI Premium or PPU to write back to a Data Lake.

Frequently Asked Questions (Probably)

“Should I Use the Dataverse Connector or a Data Lake?”

Yes.

The appropriate nuance is out of the scope of this blog. However, if you are running into the Dataverse connector 80 MB size limit, this approach uses the same naming conventions. Therefore it should be possible (easy?) to migrate from Dataverse direct approach to Data Lake + This Code in Power Query.

Is This Code Free?

Yes. The code is under the MIT License, a very simple and permissive open source license.

We Do Ask…

  1. Log any question, issue, shortcoming, or fatal flaw in GitHub.
  2. Share this blog if it helped you.
  3. If you need additional help, check out our Services and Pricing.

Related Microsoft Docs

Leave a Reply

Your email address will not be published.