Courtenay

Extracting Text from PDF Files

Parserr is a connector for Microsoft Flow that extracts data from PDFs (and emails) and sends it directly to the application that needs it.

 

In the short walkthrough below, we will show how to select the data you need from a PDF form and add it straight into Excel. You could, however, send the extracted PDF data to any of the third-party connectors supported by Microsoft Flow.

 

1. Sign up for a free Parserr account

 

Parserr1.PNG

 

2. Once you have filled out your details, you should receive an email that looks something like this:

 

Parserr2.PNG

 

3. Click on the confirmation link in the email and log in with the details you provided previously. Once logged in, you should be presented with the setup screen. Click the "Great. Lets get started" button.

Parserr3.PNG

 

4. The next screen will provide you with your incoming email address. This is the address to which you will forward every email you wish to extract data from (in this walkthrough, emails with a PDF attachment whose data will end up in Excel). Go ahead and copy the email address provided. Then forward a valid email that you wish to extract, and make a note of your unique Parserr inbox (e.g. BSB8GEBA@mg.parserr.com) as shown below.

 

Parserr4.PNG

 

5. Once you have forwarded the email to the assigned email address (ending in mg.parserr.com), Parserr will detect the email and then ask you a few onboarding questions. In our case we would like to extract information from a PDF which is an attachment to the email:

 

parserr1.PNG
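
If you would rather script the forwarding in steps 4 and 5 than do it from your mail client, the message can be assembled with Python's standard library. This is only a minimal sketch: the inbox address, sender, subject, and file name are placeholders, and the SMTP host in the trailing comment is hypothetical. It builds the message but does not send it.

```python
from email.message import EmailMessage


def build_forward(pdf_bytes: bytes, pdf_name: str, inbox: str, sender: str) -> EmailMessage:
    """Build an email that carries one PDF attachment to a Parserr inbox."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = inbox
    msg["Subject"] = "Invoice for parsing"  # placeholder subject
    msg.set_content("Forwarded invoice attached.")
    # Attach the PDF so that it arrives as an attachment Parserr can detect.
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=pdf_name
    )
    return msg


# Placeholder values: substitute the unique inbox shown on your Parserr screen.
PARSERR_INBOX = "YOUR-INBOX-ID@mg.parserr.com"
message = build_forward(b"%PDF-1.4 placeholder", "invoice.pdf", PARSERR_INBOX, "you@example.com")
attachment_names = [part.get_filename() for part in message.iter_attachments()]

# Sending is left to your own mail server, for example (hypothetical host):
#   import smtplib
#   with smtplib.SMTP("smtp.example.com", 587) as smtp:
#       smtp.starttls()
#       smtp.login("you@example.com", "app-password")
#       smtp.send_message(message)
```

Checking `attachment_names` before sending is a cheap way to confirm that the PDF is really attached, since Parserr needs the attachment to run PDF rules against.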

 

6. Choose "Microsoft Flow".

 

parserr2.PNG

7. Parserr should have detected your PDF attachment. For the purposes of this walk-through, we will choose "Invoice/Receipt".

 

 parserr3.PNG

8. Next, Parserr asks where we'd like to send the extracted PDF information. Choose Excel and click "Finish".

 

parserr4.PNG

9. Choose "Attachments".

 

parserr5.PNG

 

10. Next we need to show Parserr the exact piece of text required for extraction. Click the green "+" sign and choose the "Extract text from PDF" rule.

 

parserr6.PNG

 

11. Using the cropper tool, choose the area of the PDF you wish to extract.

 

parserr7.PNG 

 

12. Give your rule a name and click "Save". You should see the text extracted from the PDF in your rule. You can add more rules to manipulate the text further, or simply click "Save" again on your rule:

 

parserr8.PNG
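
To make the idea behind the cropper in steps 11 and 12 concrete, here is a rough sketch of region-based text extraction. It is not Parserr's implementation, which the walkthrough does not describe. It assumes each word on the page is already known together with its bounding box (in practice a PDF library would supply these), and the `Word` and `Box` types and the sample coordinates are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x0: float  # left
    y0: float  # top (y grows downward in this sketch)
    x1: float  # right
    y1: float  # bottom

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


@dataclass(frozen=True)
class Word:
    text: str
    box: Box


def text_in_region(words: list[Word], region: Box) -> str:
    """Return the words whose centre lies inside the cropped region."""
    inside = [
        w for w in words
        if region.contains((w.box.x0 + w.box.x1) / 2, (w.box.y0 + w.box.y1) / 2)
    ]
    # Read top to bottom, then left to right.
    inside.sort(key=lambda w: (round(w.box.y0), w.box.x0))
    return " ".join(w.text for w in inside)


# Hypothetical page content with made-up coordinates.
page = [
    Word("Invoice", Box(50, 40, 110, 52)),
    Word("No:", Box(115, 40, 140, 52)),
    Word("INV-1001", Box(145, 40, 210, 52)),
    Word("Total:", Box(50, 300, 90, 312)),
    Word("$250.00", Box(95, 300, 150, 312)),
]

invoice_number = text_in_region(page, Box(140, 35, 220, 55))
total_line = text_in_region(page, Box(40, 295, 160, 315))
```

A rule built this way depends on the field staying in the same place on the page, which is why it suits fixed-layout documents such as invoices from a single supplier.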

 

13. Now it's time to jump into Microsoft Flow and connect Parserr to Excel! Create a new Flow, then choose the Parserr connector and the trigger "Parserr - When an email is received".

 

Parserr11.PNG

Parserr12.PNG

 

14. Next, you will be prompted to add your username and password for Parserr. Use the same username and password that you use to log in to Parserr.

 

Parserr13.PNG

 

15. If you are connected properly to Parserr, you will see your Parserr email address appear in the trigger step. If you don't see a value, go back a step and try resetting your connection to Parserr:

 

Parserr14.PNG

 

16. Click "New step" and then click "Add an action" as shown below:

 

Parserr15.PNG

 

17. Choose Excel and "Insert row" as your action.

 

1.png

 

18. Choose the location of your Excel document. For this example, we will connect to an Excel document in OneDrive, although the document could live in any location that Microsoft Flow can connect to.

 

2.png

 

19. Once you've correctly authenticated to OneDrive (or to the location of your Excel doc), you should see a familiar layout:

 

3.png

 

20. Choose the Excel file you would like to insert your rows into. If you haven't created a file already, create one and insert a table into it at the location where you would like the parsed data to go. NB: The file needs to be closed when the Flow runs (not open in any browser tab or anywhere on your desktop), because Microsoft Flow needs sole access to the file when executing.

 

4.png

 

21. Below is an extract of our Excel file before we start the Flow. Make sure the file is closed once you have finished inserting the table.

 

5.png

 

22. Choose the name of your table within the Excel document. The right panel shows dynamic content that originates from the rules created within Parserr, that is, the rules you defined on the extracted PDF content. In the example below, we can use a number of different fields (e.g. "first name", "last name") from the PDF attached to your original email.

 

6.png

 

7.png
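
Conceptually, the mapping in this step pairs each table column with one Parserr field. The sketch below mimics that pairing outside of Flow, purely as an illustration. The column names reuse the "first name" and "last name" fields mentioned above, while "invoice number" and "notes" are hypothetical, as is the `build_row` helper.

```python
def build_row(columns: list[str], parsed: dict[str, str]) -> list[str]:
    """Order parsed fields to match the table header; missing fields become blank."""
    return [parsed.get(column, "") for column in columns]


def unmapped_fields(columns: list[str], parsed: dict[str, str]) -> list[str]:
    """List parsed fields that have no matching table column."""
    return sorted(set(parsed) - set(columns))


# Header row of the Excel table (hypothetical column names).
table_columns = ["first name", "last name", "invoice number"]

# Values produced by the Parserr rules for one incoming email (made-up data).
parsed_fields = {"first name": "Ada", "last name": "Lovelace", "notes": "rush order"}

row = build_row(table_columns, parsed_fields)
extra = unmapped_fields(table_columns, parsed_fields)
```

Listing the unmapped fields is a handy way to spot a rule that was created in Parserr but never given a column in the table.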

 

23. Save your Flow by clicking the "Save Flow" button.

 

24. Test your Flow by sending an email to Parserr. You should see a completed run on your Microsoft Flow dashboard, and the extracted PDF data should appear in your Excel document.

 

8.png

 

25. Click on the icon of your Flow to see the completed runs:

 

9.png

 

Parserr has a free account that allows you to test this integration up to 15 times per month. The steps above only need to be set up once, and the time saved on data entry alone is well worth the effort.
