01-20-2023 13:09 PM
Update & Create Excel Records 50-100x Faster
I was able to develop one Office Script to update rows and another Office Script to create rows from Power Automate array data. So instead of a flow making a new action API call for each individual row update or creation, this flow can just send an array of new data and the Office Scripts will match up primary key values, update each row they find, then create the rows they don't find.
And these Scripts do not require manually entering or changing any column names in the Script code.
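To illustrate the match-update-create idea, here is a minimal sketch in plain TypeScript working on in-memory arrays. It is not the actual script code from the template; the names Row, UpsertResult and upsertRows are hypothetical, and it assumes primary key values are unique in the table.

```typescript
// Hypothetical row shape: column name -> cell value, all held as strings.
type Row = Record<string, string>;

interface UpsertResult {
  rows: Row[];     // table contents after the upsert
  updated: number; // incoming rows that matched an existing key
  created: number; // incoming rows appended as new rows
}

// Upsert `incoming` into a copy of `existing`, matching on `keyColumn`.
// Assumption: key values are unique among the existing rows.
function upsertRows(existing: Row[], incoming: Row[], keyColumn: string): UpsertResult {
  const rows: Row[] = existing.map((r) => ({ ...r }));

  // Index the existing rows by primary key so each lookup is a map hit
  // instead of a scan over the whole table.
  const indexByKey = new Map<string, number>();
  rows.forEach((r, i) => indexByKey.set(r[keyColumn] ?? "", i));

  let updated = 0;
  let created = 0;
  for (const item of incoming) {
    const key = item[keyColumn] ?? "";
    const idx = indexByKey.get(key);
    if (idx !== undefined) {
      // Key found: overwrite only the columns present in the incoming item.
      rows[idx] = { ...rows[idx], ...item };
      updated++;
    } else {
      // Key not found: append as a new row and index it.
      rows.push({ ...item });
      indexByKey.set(key, rows.length - 1);
      created++;
    }
  }
  return { rows, updated, created };
}
```

Note that the column names come from the keys of the incoming data, so nothing is hard-coded in the function itself.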
• In testing for batches of 1000 updates or creates, it's doing ~2500 row updates or creates per minute, 50x faster than the standard Excel create row or update row actions at max 50 concurrency. And it accomplished all the creates or updates with fewer than 25 actions, or only 2.5% of the 1000 action API calls the standard approach would need.
• The Run Script code for processing data has 2 modes: the Mode 2 batch method, which saves & updates a new in-memory instance of the table before posting batches of table ranges back to Excel, & the Mode 1 row by row method, which updates the Excel table directly one row at a time.
The Mode 2 script batch processing method will activate for creates & updates on tables less than 1 million cells. It does encounter more errors with larger tables because it is loading & working with the entire table in memory.
Shoutout to Sudhi Ramamurthy for this great batch processing addition to the template!
Code Write-Up: https://docs.microsoft.com/en-us/office/dev/scripts/resources/samples/write-large-dataset
The Mode 1 script row by row method will activate for Excel tables with more than 1 million cells. But it is still limited by batch file size so updates & creates on larger tables will need to run with smaller cloud flow batch sizes of less than 1000 in a Do until loop.
The Mode 1 row by row method is also used when the ForceMode1Processing field is set to Yes.
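The mode selection described above can be sketched roughly like this. It is a simplified illustration, not the code from the scripts; chooseMode and its parameter names are hypothetical, and since the post doesn't say what happens at exactly 1 million cells, sending that case to Mode 1 is an assumption.

```typescript
type ProcessingMode = "Mode1RowByRow" | "Mode2Batch";

// Threshold taken from the post: 1 million cells.
const CELL_LIMIT = 1000000;

// Pick a processing mode from the table size and the force flag.
// Assumption: a table with exactly CELL_LIMIT cells falls back to Mode 1.
function chooseMode(rowCount: number, columnCount: number, forceMode1: boolean): ProcessingMode {
  if (forceMode1) {
    // Corresponds to ForceMode1Processing being set to Yes.
    return "Mode1RowByRow";
  }
  const cells = rowCount * columnCount;
  return cells < CELL_LIMIT ? "Mode2Batch" : "Mode1RowByRow";
}
```

For example, a table of 1,000 rows by 10 columns (10,000 cells) would go through Mode 2, while 200,000 rows by 10 columns (2 million cells) would go through Mode 1.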
Office Script Code
(Also included in a Compose action at the top of the template flow)
Batch Update Script Code: https://drive.google.com/file/d/1kfzd2NX9nr9K8hBcxy60ipryAN4koStw/view?usp=sharing
Batch Create Script Code: https://drive.google.com/file/d/13OeFdl7em8IkXsti45ZK9hqDGE420wE9/view?usp=sharing
You can download the Version 5 of this template attached to this post, copy the Office Script codes into an online Excel instance, & try it out for yourself.
-Open an online Excel workbook, go to the Automate tab, select New Script, then copy & paste the Office Script code into the code editor. Do this for both the Batch Update and the Batch Create script code. You may want to name them BatchUpdateV6 & BatchCreateV5.
-Once you get the template flow into your environment, follow the notes in the flow to change the settings to your data sources, data, & Office Scripts.
If you need just a batch update, then you can remove the batch create scope.
If you need just a batch create, then you can replace the Run script Batch update rows action with the Run script Batch create rows action, delete the update script action, and remove the remaining batch create scope below it. Then any update data sent to the 1st Select GenerateUpdateData action will just be created; it won't check for rows to update.
(ExcelBatchUpsertV5 is the core piece, ExcelBatchUpsertV5b includes a Do until loop set-up if you plan on updating and/or creating more than 1000 rows on large tables.)
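For anyone wondering what the Do until loop is doing conceptually, it is splitting the data into batches and sending one batch per iteration. A rough TypeScript sketch of that splitting (the chunk function is hypothetical and only illustrates the idea, the flow itself does this with cloud flow actions):

```typescript
// Split `items` into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1 || Math.floor(size) !== size) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    // slice() stops at the end of the array, so the last batch may be shorter.
    batches.push(items.slice(start, start + size));
  }
  return batches;
}
```

With 2,500 rows and a batch size of 1,000 this gives three batches of 1,000, 1,000 and 500 rows, so the loop would run the script three times.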
Anyone facing issues with the standard zip file import package method can check this post for an alternative method of importing the flow: https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Excel-Batch-Create-Update-and-Upsert/m-p...
Also be aware that some characters in column names, like \ / - _ . : ; ( ) & $ may cause errors when processing the data. Also backslashes \ in the data, which are usually used to escape characters in strings, may cause errors when processing the JSON.
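If you want to check your data before sending it, something along these lines could flag the risky cases mentioned above. This is only a sketch in plain TypeScript; findRiskyColumnNames and findValuesWithBackslash are hypothetical helper names, and the character list is simply the one from this post, not an exhaustive list.

```typescript
// Characters listed in the post as potentially problematic in column names.
const RISKY_CHARS: string[] = ["\\", "/", "-", "_", ".", ":", ";", "(", ")", "&", "$"];

// Return the column names that contain at least one of the listed characters.
function findRiskyColumnNames(columnNames: string[]): string[] {
  return columnNames.filter((name) =>
    RISKY_CHARS.some((ch) => name.indexOf(ch) !== -1)
  );
}

// Return the column names in a row whose value contains a backslash.
function findValuesWithBackslash(row: Record<string, string>): string[] {
  return Object.keys(row).filter((col) => (row[col] ?? "").indexOf("\\") !== -1);
}
```

Running the first helper on the table headers and the second on a sample of rows should point to the columns worth renaming or cleaning up if the script errors out.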
Thanks for any feedback!
Back on this filtering problem I experienced earlier.
Some context: I am filtering data from an SQL table to isolate the data I should copy into Dataverse tables by batches. I explained earlier that it was taking ages to do a simple filter using something like: "string(<Dataverse dump>) contains item(<SQL table>)" as a filtering criterion.
Since SQL server supports 100K records per page, I went with it initially when performing my Get Rows. This works fast, however when hitting the Filter option, it slowed everything down when I had lots of records to compare with.
Reducing the number of items in the left array used for the comparison helped, but that gain is small compared with what I obtained by reducing the total number of records the Get Rows SQL action has to deal with:
When the Top Count variable is set to 100000 and the pages are fully loaded, this filter may take several hours, if not days, to execute properly.
Setting the Top Count variable to 50000 changed the whole game: each SQL page now takes about 8 minutes to get parsed and processed (read, filter, batch insert into Dataverse in batches of 1,000).
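To put rough numbers on this, here is a small sketch of the arithmetic. The planPages function and the 200,000 row total are hypothetical, and it assumes the ~8 minutes per page I observed stays about constant from page to page, which I have not verified.

```typescript
interface PagePlan {
  pages: number;              // number of SQL pages needed
  batchesPerFullPage: number; // Dataverse batch inserts for one full page
  estimatedMinutes: number;   // pages * minutesPerPage
}

// Estimate pages, batches per page, and total time for a given Top Count.
// Assumption: every page takes roughly `minutesPerPage` to read, filter and insert.
function planPages(totalRows: number, topCount: number, batchSize: number, minutesPerPage: number): PagePlan {
  const pages = Math.ceil(totalRows / topCount);
  const batchesPerFullPage = Math.ceil(topCount / batchSize);
  return { pages, batchesPerFullPage, estimatedMinutes: pages * minutesPerPage };
}
```

For a hypothetical source table of 200,000 rows with Top Count 50000 and batches of 1,000, that is 4 pages, 50 batch inserts per full page, and about 32 minutes in total.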
Last, as I was trying to troubleshoot the issue, I placed various traces in my flow which basically sent Teams messages to a DEBUG channel I created for this purpose. Since you cannot view what's happening inside a loop before it completes, that's the only way I could think of to get some visibility into which step (or action) the engine was currently processing. And it turns out that when you cancel long-running flows, execution does not stop immediately. In my case, I kept on receiving messages for actions that followed the Filter action that took hours.
Hope that helps!
Great video. I mapped out everything for the flow but I keep getting the below error from the Office Script. I am new to Office Scripts 😕
thanks for your help in advance.
What is the name of your table in Excel & the name of the table you input in the script action?
1st make sure those match
And is your table empty?
Try adding a blank row to the table 1st if it is completely empty.
@takolota Thanks for responding so quickly. I see what I was doing wrong. I did indeed have a different table name for my destination table; I left out one character. Now the flow works, but it's only bringing in one item from the range of source data. For example, if the source data has 210 rows, it's grabbing the last item and inserting that same item 210 times in the destination table.
Are you using the most recent V5.1?
If you are using the most recent version, then try changing the Mode1Processing input to Yes
Thank you, it's working now. You are the best. 🙂
I also created a script to delete everything from the destination table and then insert new data with the batch insert. It's working flawlessly.
Here is a fix for the negative number issue - I'm sure I posted a message about this but I can't find it. You have to replace your "SelectArrayofArraysUpdate" map function with this code below:
I’d be pretty concerned that most “fixes” that try to get that to also parse integers would introduce other edge cases & errors.
Again, Excel should accept string values for Number/Integers, Booleans, Dates, etc. and they will be the correct datatype if you have set the table column to that data type.
So I’d advise to simply make sure all the data going in are strings, rather than try to adjust the expression there.
Maybe there would be better options with additional functionality in Power Automate like Regex, but the ways I’ve seen to incorporate Regex require custom & premium connectors with extra set-up.
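To show what "make sure all the data going in are strings" means, here is a minimal sketch in TypeScript. In the flow itself this would be done with expressions in the Select action rather than code; the toStringRow name is hypothetical, and turning null or missing values into empty strings is an assumption.

```typescript
// Values as they might arrive from a JSON source.
type CellValue = string | number | boolean | null | undefined;

// Convert every value in a row to a string so that negative numbers,
// booleans, etc. are all sent to Excel in the same form.
// Assumption: null/undefined become empty strings.
function toStringRow(row: Record<string, CellValue>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(row)) {
    const value = row[key];
    out[key] = value === null || value === undefined ? "" : String(value);
  }
  return out;
}
```

So a row like Qty = -5, Active = true, Note = null goes in as "-5", "true" and "", and the column's data type in the Excel table determines how each value is interpreted.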