When using the Dataverse connector in Power BI Desktop or Dataflows, you may receive this error: “Microsoft SQL: Return records size cannot exceed 83886080. Make sure to filter result set to tailor it to your report.”
If you run into this, don’t do a quick fix and move on – think carefully through your next steps. The key considerations are spelled out below.
Microsoft lists this limitation on the Dataverse SQL Docs article.
- Remove fields from query
- Use Dataflow with Incremental Loads
- Use Data Lake as a source
Do The Math
Approximate how much data is requested from Dataverse.
- GUID: 16 Bytes
- Date/Time: 8 Bytes
- Integer: 8 Bytes
- String: 1 Byte per character
Do you set arbitrary limits at work or at home? Consider how you set and present those limits to others.
Why The Error Happens
Simply put, there’s a hard limit on the size of the query returned by the Dataverse connector. The query size is a function of:
(Fields selected * Field size) * Number of Records
This issue likely won’t be a problem until your query size is in the ~100,000+ records range for limited fields or ~10,000+ records returning every field.
You Have a Lot of Data – Now What
Some type of design change must be done. Which path to choose depends on the nature of the data.
Option 1: Remove Fields or Rows From the Query
Remove extra fields not actually used in the report or query. The number of rows returned can be limited via aggregation or filtering.
⚠️ Will the data size grow enough to hit the error in the future? This option may only be a temporary fix.
Option 2: Refactor the Query
In Power Query, the order of operations can have an impact on the size and performance of the source query. Filter and Choose columns towards the beginning of the query prior to doing transformations.
This concept is significantly easier to research in PowerBI.com Dataflows than in Power BI Desktop. The Dataflow UI has a green/red indicator bar to indicate the steps the Dataverse/SQL Server is doing.
Option 3: Dataflows with Incremental Refresh
Dataflow Incremental refresh uses a Date/Time field to segment the data by Months or Years. This also segments the queries done by the Dataverse connector into smaller chunks.
⚠️ Understand the Power BI licensing implications. Incremental requires Premium or Premium Per User for everyone who will use the downstream Dataset.
Option 4: Sync to Data Lakes and Retrieve with Data Lake Connector
This begins the “Setup a Meeting With IT” options.
Dataverse has an “Azure Synapse Link” feature to write Dataverse tables to a Data Lake.
One option for pulling the data back to Power BI is to use the Data Lake Gen2 connector with the “CDM” option. If you go down this path, read my Cooptimize blog so you can get the Option Set “name” fields exactly like the Dataverse connector does.
⚠️ This option allows you to consume as much data as you want, but the data filter and join performance will be significantly worse than the Dataverse connector for an equivalent query.
Option 5: Sync to Data Lakes and use SQL On Demand
Just like Option 4, you write data to the lake.
The Azure Synapse Link feature will give you an option to see Dataverse “tables” using the Power BI SQL Connector. ⚠️ However, the auto-published tables also lack the Option Set “name” fields.
Cooptimize has you covered here as well. Eric wrote a blog and open source code to publish Dataverse SQL Views with the name fields, so your solution is more easily upgraded from the Dataverse connector. The field list and data types should be identical. These views are also consumed using the SQL connector.
✅ Bonus of this path is excellent performance, which should be similar to the Dataverse connector performance.
Unfortunately you probably won’t find this blog until you run into this error and are trying to figure out what to do. Sorry. Maybe we can help – click the chat button.
If you’re expecting large data volumes, then Option 5 is the way to go from the start.