I've got a flow working that watches for a monthly email containing a ZIP file. The ZIP is extracted to a SharePoint location (about 40 CSV files). For each CSV file, I parse it and then parse the resulting JSON to get usable data. Works great!
I then have an element of the flow that, for each extracted file, adds the data to SQL via the "Add a row" action.
My question is about volume: combining all the CSV files, we're talking well over 500,000 rows of data imported each month.
Currently I'm managing to import about 1,000 rows an hour. Is there any way to speed this process up with a mass-import method, or am I stuck inserting one row at a time?
I would consider using the Select action to group T-SQL INSERT statements into batches of 1,000, then use the "Execute a SQL query (V2)" action to run those INSERTs instead of the "Add a row" action.
You might have to experiment a bit to see which batch size works best for your particular database.
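For illustration, here's a minimal sketch of what one composed batch might look like. The table and column names are hypothetical, and note that a single INSERT ... VALUES in SQL Server accepts at most 1,000 row constructors (the Select action would emit one tuple per parsed row, and a join() expression can stitch them into the statement):

```sql
-- Hypothetical target table; substitute your real schema.
-- One "Execute a SQL query (V2)" call inserts up to 1,000 rows
-- instead of one row per "Add a row" call.
INSERT INTO dbo.MonthlyImport (InvoiceNumber, CustomerCode, Amount, ImportedOn)
VALUES
    ('INV-000001', 'CUST-017', 104.50, '2024-01-31'),
    ('INV-000002', 'CUST-009',  88.00, '2024-01-31'),
    -- ...up to 1,000 row constructors per statement...
    ('INV-001000', 'CUST-042', 310.25, '2024-01-31');
```

At 1,000 rows per call, 500,000 monthly rows becomes roughly 500 query executions rather than 500,000 single-row inserts.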
If I answered your question, please accept it as a solution 😘
Thanks for the update. I'm sorry to ask, but are you able to assist me in constructing the T-SQL queries? I've tried to research what I need to do, but I'm struggling.
I have, though, made it more effective so far: using concurrency controls on the apply-to-each sections (folders > files), I've gone from 1,000 imports an hour to nearly 20,000. But I'm now hitting the API limits per day, so I need to cut the number of calls down; one API call should insert multiple records.
I'm running into a timeout/slow performance trying to loop through just 20 records in a CSV file, and looping/parsing about 30 columns in each record. The nested apply-to-each actions seem to take a long time and ultimately my flow times out.
Any help would be appreciated.
He solved his CSV parsing issue by following my guide:
I think this is all you need to get past your problem.
Thanks @Paulie78! The JSON parsing is helpful. I also have a need to insert a large set of data into Dataverse tables. I'm curious what alternatives there are besides the "Add a row" action.
To make this more interesting, the rows I'm inserting into the main table (invoices) contain lookups to two other tables, so I currently have to run a query for each of the lookup columns.
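For reference, in plain SQL this would be a set-based insert that resolves both lookups with joins instead of a query per row; a sketch with made-up table names:

```sql
-- Hypothetical staging table plus the two lookup tables.
-- The joins translate the natural keys in the staged data into
-- the foreign keys the main table needs, all in one statement
-- instead of one lookup query per row.
INSERT INTO dbo.Invoice (InvoiceNumber, CustomerId, ProductId, Amount)
SELECT s.InvoiceNumber, c.CustomerId, p.ProductId, s.Amount
FROM dbo.InvoiceStaging AS s
JOIN dbo.Customer AS c ON c.CustomerCode = s.CustomerCode
JOIN dbo.Product  AS p ON p.ProductCode  = s.ProductCode;
```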
If it's going into Dataverse you can use a dataflow. I did a video on it here:
That video is mostly relevant to you.
Thanks @Paulie78! Great video! Do you have any ideas for how to modify the dataflow during the same execution to populate one of the columns with a foreign key/lookup? Or do you suggest inserting the data first, then running a step after the refresh to loop through the rows and update the lookup column?