vaibhavtandon87
Helper IV

How to extract text from PDF using PAD?

Dear community,

 

How can I extract data from PDFs and store it in Excel using PAD (Power Automate Desktop)?

 

My flow is failing at the 'Extract text with OCR' step with the error message: Failed to extract text with OCR.

 

Steps used:

1. Create Tesseract OCR engine

2. Extract text with OCR

3. Write text to file (just for testing; eventually it will be an Excel sheet)

 

Please let me know if I need to do any specific configuration.

1 ACCEPTED SOLUTION


@vaibhavtandon87 

The team has almost completed work on the said feature, so it will be available really soon.

Best regards, 
James


7 REPLIES
JamesP_MSFT
Microsoft

Hello @vaibhavtandon87,

Right now there is no way to extract text or images from a PDF file.
The appropriate group of actions will be available in Power Automate Desktop in the near future.

 

Best regards, 
James

Thanks, James. How near is the near future? A tentative timeline would help.

@vaibhavtandon87 

The team has almost completed work on the said feature, so it will be available really soon.

Best regards, 
James

This sounds very interesting and will surely be useful!

When you say it will be available really soon, could that be before the end of 2020 or at the beginning of 2021?

 

Keep up the good work!

vaibhavtandon87
Helper IV

@JamesP_MSFT,

 

I can now see actions such as "Extract text from PDF", which is great!

 

Could you please advise: if I extract a table on the first page, along with headers containing useful information, how can I pull that into Excel as separate pieces of information?

 

Currently it takes all the content from the PDF page and dumps it into a single cell. Can I decompose that further into useful information, and if so, how?
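One way to think about decomposing a page dump is sketched below in plain Python, independent of PAD. It is only an illustration under stated assumptions: the sample text, the `split_page` and `rows_to_csv` helpers, and the layout rules (header lines contain a colon, table columns are separated by two or more spaces) are all hypothetical and would need adjusting to the actual PDF.

```python
import csv
import io
import re

# Hypothetical text standing in for what a PDF page dump might look like.
PAGE_TEXT = """Invoice No: INV-001
Date: 2020-11-05

Item    Qty    Price
Widget    2    9.50
Gadget    10    1.25
"""


def split_page(text):
    """Split dumped page text into header fields and table rows."""
    headers = {}
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if ":" in line:
            # Assumption: "Label: value" lines are header information.
            key, value = line.split(":", 1)
            headers[key.strip()] = value.strip()
        else:
            # Assumption: table columns are separated by 2+ spaces.
            rows.append(re.split(r"\s{2,}", line))
    return headers, rows


def rows_to_csv(rows):
    """Serialize table rows as CSV text, which Excel can open."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


headers, rows = split_page(PAGE_TEXT)
print(headers)
print(rows_to_csv(rows))
```

The header fields end up in a dictionary and each table line becomes a list of cells, so the two kinds of information can be written to separate cells or sheets instead of one.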

 

 

GakinImara
Advocate I

Hi, @JamesP_MSFT ! Good evening! 🙂

Any news on this CV working with PDF files? I'm wondering if you know of a site where I can track future PAD updates. 🙂 For now I am able to work with the alternative, "Extract text from PDF", and use RegEx with some extra steps, but I would love to use this option as soon as it is released!

GK
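The RegEx approach mentioned above might look roughly like the following sketch, written in plain Python rather than as PAD actions. The field names, patterns, and sample text are hypothetical; they assume labels such as "Invoice No" and "Total" appear in the extracted text.

```python
import re

# Hypothetical patterns; adjust them to the labels in your own PDF text.
FIELD_PATTERNS = {
    "invoice_no": r"Invoice\s+No[.:]?\s*(\S+)",
    "total": r"Total:?\s*([\d.,]+)",
}


def extract_fields(text, patterns=FIELD_PATTERNS):
    """Return the first match for each named pattern, or None if absent."""
    result = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        result[name] = match.group(1) if match else None
    return result


# Hypothetical extracted text, for illustration only.
sample = "ACME Ltd\nInvoice No: INV-001\nTotal: 1,234.50\n"
fields = extract_fields(sample)
print(fields)
```

Returning `None` for a missing field makes it easy to spot pages where the layout differs from what the patterns expect.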

Anonymous
Not applicable

Hi,

 

Any news on this?
