03-18-2022 21:41 PM - last edited 10-19-2022 15:30 PM
Template for converting large CSV files to JSON, then sending the data to a table or list.
This get data for CSV works even if there are commas in the CSV data. The only requirement is the the CSV file must follow the standard of placing double quotes “ , “ around any item eith in-data commas.
(If you can export as a semicolon, tab, or other delimited file, then you can probably just use a simpler method like Paul’s here: https://www.tachytelic.net/2021/02/power-automate-parse-csv/?amp)
For more information on the delimiter change piece, visit this previous post:
To make a flow to send any CSV data to a new Excel table with given CSV header names without any column mappings, check this template:
*Copying the template into an existing flow may create issues with expressions. You may need to copy the rest of your existing flow into the CSV template flow, then move the template scope where you need it.
(Older) Version 2 Uploaded 03/26/2022
(Adjusted the 1st Select input so it can now also deal with in-data commas in the 1st column. Added more lines to the 1st Select so it can now handle up to 50 columns with commas in them.)
Google Drive Link To Download: https://drive.google.com/file/d/1UVEewW9J1m9wvSppgFrf7rs3H3txSPUN/view?usp=sharing
Version 3 Uploaded 04/09/2022
(More minor fixes & additions.
I adjusted several expressions so it can now handle a few more scenarios with arrays in the CSV data. It should handle any array that doesn't include double quotes and any array that is all strings with double quotes, so ["String1", "String2", "String3"], but it will have issues if it is a mixed array with some double-quoted strings and some other values, for example ["String", 4, 03/05/2022, "String2"] won't work.
I also adjusted how the LineBreak setting is set-up so it now uses the /r/n for the LineBreak. I also provided this link in the flow so anyone can look up the right string for the decodeUriComponent expression(s) if they happen to have different LineBreak characters. This change also made it possible to differentiate between in-data line-breaks and CSV row line-breaks on the files I tested, so it should now replace the in-data line-breaks, like the multiple-choice fields some sites use, with semi-colons. That should make those records much easier to deal with & parse in later actions.
I also looked over a problem with in-data trailing commas. I added a line in the settings where anyone can toggle whether they want it to adjust for trailing OR leading commas in the data, it just can't handle both in one dataset. So if one column in one row has ",String1 String2" and another column in another row has "String 3 String4," then it will have errors.)
Google Drive Link To Download: https://drive.google.com/file/d/1X6BGhXWSnZtFgUK0v-5bHo7RuKSNjNtV/view?usp=sharing
Microsoft Power Platform & Paul Murana recently did a video demonstration of how to handle CSV & other files in dataflows: https://youtu.be/8IvHxRnwJ7Q
But it will only output to a Dataverse or Dataverse for teams table.
SharePoint & Outlook Uses
You should be able to change the OneDrive action out for one of the Get file content or Get attachments in SharePoint if your file is on SharePoint.
However, to automate loading CSV files from an email trigger or action you may need to make an Apply to each loop at the bottom of the actions, drag each action down into the loop from bottom to top, use "Attachments" in the main loop input, then use the formula
Also, I recently found this awesome method of building custom connectors & functions with C# code and John here created his own custom action for parsing a CSV.
The only downside is these custom actions are not readily share-able without copying the code and creating one yourself, or going through the Microsoft certification process for your action. Also there are many edge cases you would probably need to handle before Microsoft would be willing to certify an official Parse CSV action. But if I or anyone else wants to continue from where I or John left off, then we could further build things out for edge cases & try for the action certification.
Version 2 Now Available
I adjusted the 1st Select input so it can now also deal with in-data commas in the 1st column. I also added more lines to the 1st Select so it can now handle up to 50 columns with commas in them.
Google Drive Link: https://drive.google.com/file/d/1UVEewW9J1m9wvSppgFrf7rs3H3txSPUN/view?usp=sharing
This is pure gold and truly appreciate the time/effort for sharing all of this.
We have been working on a monster of a Flow with receiving CSV files and I first encountered your solution to address the commas in our values.
As I came back to download the V3 of that solution, I stumbled on this one -which is exactly the next step that was giving us performance issues.
Quickly deployed your steps here for the JSON and things are going well.
Def made my Monday, so thank you for sharing the knowledge.
I'm tyring to see how far I get, but wanted to share the scenario,
All is working really great with these solutions and now just facing a small challenge.
One of the CSV reports that comes in, will have the option of blank values in at least one column.
we get a CSV report from computer management, one column will have a yes/no on "reboot needed"
if reboot needed = yes then the next column will contain the date since that reboot has been needed,
if reboot needed = no then the next column will be blank.
our issue is that in the step "Select Reformat file data, replace in-data commas"
when a row is created that originates from one that has no reboot needed date then all the rest of the data shifts one column to the left as the "blank" column is eliminated.
80% of the benefit of this is avoiding the need for a bunch of conditionals in the last Select statement where you input the column numbers on the split expression. I’m not immediately thinking of a better way to deal with that, you may just need to use conditional expressions where you would normally put the column number. Like…
add(Insert column number under ‘no’ scenario,
if(equals(split(item(), outputs(‘Compose_CSV_data_+_Set-up)[‘NewDelimiter’])?[Insert reboot column number], ‘yes’), 1, 0)
If you have smaller sets of data, you aren’t worried about speed, & you aren’t worried about daily action limits, you could use a simpler PA condition action & a different way to parse the data line by line. Otherwise these modifications are a decent method.
Also I don’t know if you want to go down this rabbit hole, but if you’re really concerned about performance, the 2nd half of this flow is the more user-friendly/readable, but less efficient version.
There is a way to boost efficiency & greatly reduce daily action api calls by using batch actions if your data is going to SharePoint. Up to 1000x less daily actions transferring CSV data. It works quite well for this thing where your data is already set in a Select action.
Then Paulie also mentioned modifying this for Dataverse too. I might have to look into developing that myself if he doesn’t have the time soon.
@chico Oh I just remembered another way,
You can copy the From field expression from the last Select action that helps skip headers & footers, & paste it into the From field of a new Filter array action before the Select action. In the Filter array filter condition you can still use the
split(item(), outputs(‘Compose_CSV_data_+_Set-up)[‘NewDelimiter’])?[Insert reboot column number]
to reference the reboot column value. So you can set it to filter to where that column equals yes. Then copy that Filter array to a second action & set that second one to not equal to yes. That should separate out the ones with & without the extra column.
Then you can copy the last Select action & use the different Filter array outputs for each & insert those different Select outputs to different Parse JSON actions, etc.
appreciate the tips, thank you for helping to make the world a better place!
I'm going to explore the rabbit hole.
We are currently still doing a large volume of data store with Sharepoint, so the that batch option seems like the best next step.
In that case it will just put us in a better position for the eventual flip over to using dataverse.