Regular Visitor

Extract text from PDF without external (non-Microsoft) connectors

Hi,

I need to extract the full text (no layout needed) from PDF files without using third-party connectors (Plumsail, Parser, et al.), as this is a GDPR and security issue (besides being insanely expensive if you need to run the operation on a large number of files). I have temporarily solved it with my own PowerShell Azure Function, which pipes the incoming PDF through a command-line tool and returns the text after stripping unwanted junk characters with -replace. However, this is a little less maintainable than solving it all in Flow.
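For readers who want to see the shape of that workaround, here is a minimal sketch in Python rather than PowerShell. The post does not name the command-line tool, so Poppler's `pdftotext` is an assumption here, and the function names `clean_text` and `extract_text` are hypothetical. The regular expressions stand in for the `-replace` cleanup step and would need tuning for real documents.

```python
import re
import subprocess
import tempfile
from pathlib import Path

# Assumption: Poppler's `pdftotext` is on PATH. The original post does not
# say which command-line tool is used.
DEFAULT_TOOL = "pdftotext"

# Control characters other than tab, newline and carriage return.
_JUNK = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_text(text: str) -> str:
    """Strip junk control characters and collapse runs of blank lines."""
    text = _JUNK.sub("", text)
    text = re.sub(r"[ \t]+\n", "\n", text)   # drop trailing whitespace
    text = re.sub(r"\n{3,}", "\n\n", text)   # keep at most one blank line
    return text.strip()


def extract_text(pdf_bytes: bytes, tool: str = DEFAULT_TOOL) -> str:
    """Write the PDF to a temp file, run the tool, return cleaned text."""
    with tempfile.TemporaryDirectory() as tmp:
        pdf_path = Path(tmp) / "input.pdf"
        pdf_path.write_bytes(pdf_bytes)
        # `pdftotext <file> -` writes the extracted text to stdout.
        result = subprocess.run(
            [tool, str(pdf_path), "-"],
            capture_output=True,
            check=True,
        )
    return clean_text(result.stdout.decode("utf-8", errors="replace"))
```

Wrapped in an HTTP-triggered function, something like `extract_text(request_body)` would play the role of the Azure Function described above; the maintenance burden of hosting it outside Flow remains.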


Any creative suggestions on how to achieve this?

Cheers!

3 REPLIES
Solution Sage

Re: Extract text from PDF without external (non-Microsoft) connectors

Hi @MS2,

I'm afraid there is currently no way to do this in Microsoft Flow.

There is an existing idea similar to your request, which you can vote for here:

https://powerusers.microsoft.com/t5/Flow-Ideas/Convert-PDF-to-Text-Table-Image/idi-p/176806?advanced...

Best Regards,

Community Support Team _ Zhongys

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Regular Visitor

Re: Extract text from PDF without external (non-Microsoft) connectors

Since you can search for text inside PDFs in both OneDrive and SharePoint, some part of those backends must be able to read the text in PDFs. Is there no way to use that, through REST or otherwise, to "get the text out"?

Solution Sage

Re: Extract text from PDF without external (non-Microsoft) connectors

Hi @MS2,

I ran some tests with OneDrive and SharePoint.

I got the file content from a PDF and then created a .txt file with that content.

However, the resulting .txt file contains only garbled text.
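A likely reason for the garbled output is that a PDF's raw bytes are not its text: page content usually sits in streams compressed with the FlateDecode (zlib) filter. The sketch below illustrates this with a simplified, hand-built stream object. It is not a complete or valid PDF, and the object number and page text are made up for the example.

```python
import zlib

# A tiny page content stream: "show this string" in PDF operators.
page_text = b"BT /F1 12 Tf (Hello from a PDF page) Tj ET"
compressed = zlib.compress(page_text)

# Simplified stream object as it might appear inside a PDF file.
raw_pdf_fragment = (
    b"5 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(compressed)
    + compressed
    + b"\nendstream\nendobj\n"
)

# Treating the raw bytes as text (what writing them to a .txt file amounts to)
# does not expose the sentence.
as_text = raw_pdf_fragment.decode("latin-1")
text_visible = "Hello from a PDF page" in as_text

# Only after locating the stream and inflating it is the content readable.
start = raw_pdf_fragment.index(b"stream\n") + len(b"stream\n")
end = raw_pdf_fragment.rindex(b"\nendstream")
recovered = zlib.decompress(raw_pdf_fragment[start:end])

print(text_visible)
print(recovered.decode("ascii"))
```

Real PDFs add further layers (font encodings, object streams, possibly encryption), which is why a dedicated extraction tool or connector is normally needed.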

To achieve what you need, I'm afraid you will have to use the Plumsail connector.

Best Regards,

Community Support Team _ Zhongys

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

 
