cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
KSven
Advocate III
Advocate III

Dataflow Performance | Daily Update of 100-200K Records | Suitable Datawarehouse?

Hi lovely Community,

 

I'm using the Dataverse as a kind of Low Code Data Warehouse for my customer and I would like to update ca. 100-200 records per Day with Dataflows like classis ETL / ELTL process:

 

1. Dataflow: Extract data as raw data and save it into simpe table with text datatype

2. Dataflow: Load the Raw data table to clean and transform Data with the right Datatypes and set relations to other tables via integration key

3. Dataflow: Create some Calculation table with aggregated Data for Power BI

 

Unfortunately the Dataflows needs round about 2hours per Step and compared to Datasets in Power BI, the Power Query Steps would need only 5-10min.

 

A. So is it not the right use case for the Dataverse and Dataflows or is there a trick to handle the low code low Performance?

 

B. Is Microsoft working for a better performance? (Or maybe my performance expectations with experience of SSIS / SQL are to high?)

 

C. In addition there are some limits like "Number of requests exceeded the limit of 6000 over time window of 300 seconds", I guess this makes no sense for a Dataflow?

 

BR,

Sven

2 ACCEPTED SOLUTIONS

Accepted Solutions

Hi @KSven ,

 

Additional to the advice that @EricRegnier has given, the other thing you need to consider with Dataverse is the API limits that are enforced by Microsoft. They will be slowing down your ETL processes if you are processing 100 - 200k a day.  So even if you used SSIS or Power Automate you will still see slow performance. See this article and related articles for more details on the API limits https://docs.microsoft.com/en-us/powerapps/maker/common-data-service/api-limits-overview

View solution in original post

Hi @KSven,

Anything hitting the APIs are subject to the limits even Dataflows and other out-of-the-box functionalities, main reason is even though Dataflows are made for ETL, they are configured by developers (or citizen developers) and might NOT be configured in the most optimal way and thus as @HSheild said needs to protect the services. That said, depending on your implementation, if you have a lot of user licenses (500+) and a complex implementation you can contact Microsoft via your local customer success manager or FastTrack to have exemptions on those limits, upgrade data center scale groups and/or even possibly have dedicated servers. 

Cheers

View solution in original post

7 REPLIES 7
EricRegnier
Super User II
Super User II

Hi @KSven,

Although Dataflows is a great low code/no code ETL tool it is still limited and immature in some aspects. Microsoft has announced enhancements for the last few waves updates so we can expect new features and performance improvements. That said, in my opinion Dataflows is more fit for purpose for small datasets and simple target data models.

In your case with >200K records, SSIS (with optionally Kingwaysoft) would be better for control and performance outcome. You can also look at Power Automate (or Logic Apps) which will give you a more control and better performance than Dataflows but not as much as SSIS or custom route.

Hope this helps!

Hi @EricRegnier ,

 

thank you for your reply. 

I know Microsoft is working hard to deliver best features for their customers. Hoepfully they will improve the Performance of Dataflows and their scheduling / running order.

 

For me always SSIS would be the best solution for ETL/ELTL processes, but unfortunately not a suitable Solution for my customer. We need the Low Code / NoSQL solution. So I will try to split the Transformation records to have better performance till Microsoft will deliver better one 🙂

 

Would be nice to have a better overview of upcoming Wave updates and their dates. This seems to be very rough:

KSven_0-1607075996290.png

 

BR,

Sven

Hi @KSven ,

 

Additional to the advice that @EricRegnier has given, the other thing you need to consider with Dataverse is the API limits that are enforced by Microsoft. They will be slowing down your ETL processes if you are processing 100 - 200k a day.  So even if you used SSIS or Power Automate you will still see slow performance. See this article and related articles for more details on the API limits https://docs.microsoft.com/en-us/powerapps/maker/common-data-service/api-limits-overview

View solution in original post

Hi @HSheild ,

 

thank you for your reply! Its good to know, but I don't understand why Dataflows (as "BIG DATA" Data integration method for Dataverse) has the same API Limits like for User steps in Power Apps.

 

In my opinion there should be seperated limits or?

 

BR,

Sven

Hi Sven,

 

I can see your point. However, at the end of the day Dataflows are still hitting the Dataverse API and Microsoft need to be able to protect the Dataverse service.

 

 

Hi @KSven,

Anything hitting the APIs are subject to the limits even Dataflows and other out-of-the-box functionalities, main reason is even though Dataflows are made for ETL, they are configured by developers (or citizen developers) and might NOT be configured in the most optimal way and thus as @HSheild said needs to protect the services. That said, depending on your implementation, if you have a lot of user licenses (500+) and a complex implementation you can contact Microsoft via your local customer success manager or FastTrack to have exemptions on those limits, upgrade data center scale groups and/or even possibly have dedicated servers. 

Cheers

View solution in original post

Hi @EricRegnier ,

 

many thanks for your response. I also see the API more for real-time processes like in Power Automate, for which I use it very excessively 🙂
Thanks for your tips, I will collect all the restrictions for my customer and provide them for clarification with Micorosoft in order to possibly get an individual solution.

 

BR,

Sven

Helpful resources

Announcements
PA_User Group Leader_768x460.jpg

Manage your user group events

Check out the News & Announcements to learn more.

Power Query PA Forum 768x460.png

Check it out!

Did you know that you can visit the Power Query Forum in Power BI and now Power Apps

Carousel 2021 Release Wave 2 Plan 768x460.jpg

2021 Release Wave 2 Plan

Power Platform release plan for the 2021 release wave 2 describes all new features releasing from October 2021 through March 2022.

R2 (Green) 768 x 460px.png

Microsoft Dynamics 365 & Power Platform User Professionals

DynamicsCon is a FREE, 4 half-day virtual learning experience for 11,000+ Microsoft Business Application users and professionals.

Users online (1,890)