vaibhavtandon87
Helper IV

How to extract text from PDF using PAD?

Dear community,

How can I extract data from PDFs and store it in Excel using PAD?

 

My flow fails at the step 'Extract text with OCR' with the error message "Failed to extract text with OCR."

 

Steps used:

1. Create Tesseract OCR engine

2. Extract text with OCR

3. Write text to file (just for testing; eventually the output will go to an Excel sheet)

 

Please let me know whether any specific configuration is required.

1 ACCEPTED SOLUTION


@vaibhavtandon87 

The team has almost completed work on the said feature, so it will be available really soon.

Best regards, 
James


7 REPLIES 7
JamesP_MSFT
Microsoft

Hello @vaibhavtandon87,

Right now it is not possible to extract text or images from a PDF file.
A group of actions for this will be available in Power Automate Desktop in the near future.

 

Best regards, 
James

Thanks, James. How near is "near future"? A tentative timeline would help.

@vaibhavtandon87 

The team has almost completed work on the said feature, so it will be available really soon.

Best regards, 
James

This sounds very interesting and will surely be useful!

When you say it will be available really soon, do you mean before the end of 2020 or at the beginning of 2021?

 

Keep up the good work!

vaibhavtandon87
Helper IV

@JamesP_MSFT ,

 

I can now see functions such as extract from PDF, which is great!

Could you please advise: if I extract a table from the first page, along with headers that contain useful information, how can I pull these into Excel as separate pieces of information?

At the moment it takes all the content from the PDF page and dumps it into a single cell. Can I break that content down further into useful fields, and if so, how?
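
One possible way to break such a page dump apart, sketched in Python rather than as PAD actions. It assumes the extracted text keeps one table row per line with columns separated by two or more spaces, which may not hold for every PDF; the function names and the sample text are hypothetical.

```python
import csv
import io
import re


def split_page_text(page_text):
    """Separate a page dump into header lines and table rows.

    Assumption: a line with a gap of two or more spaces is a table row;
    any other non-empty line is treated as header text.
    """
    header_lines, table_rows = [], []
    for line in page_text.splitlines():
        line = line.strip()
        if not line:
            continue
        cells = re.split(r"\s{2,}", line)
        if len(cells) > 1:
            table_rows.append(cells)
        else:
            header_lines.append(line)
    return header_lines, table_rows


def rows_to_csv(rows):
    """Render rows as CSV text, a format Excel can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for the text extracted from one PDF page.
sample = """Invoice 1001
Customer: Contoso

Item    Qty    Price
Widget    2    9.50
Gadget    1    20.00
"""

headers, rows = split_page_text(sample)
csv_text = rows_to_csv(rows)
```

Here `headers` holds the lines above the table and `csv_text` holds one comma-separated line per table row, so each value would land in its own cell instead of the whole page in one cell.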

 

 

GakinImara
Advocate I

Hi, @JamesP_MSFT ! Good evening! 🙂

Any news on this CV working with PDF files? I'm also wondering if you know of a site where I can track future PAD updates. 🙂 For now I am able to work with the alternative, "Extract text from PDF", and just use RegEx with some extra steps, but I would love to switch to this option as soon as it is released!
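
For readers wondering what the RegEx step might look like, here is a minimal sketch in Python (not PAD syntax). It assumes the extracted text contains `Label: value` lines; the pattern, the function name and the sample text are illustrative assumptions, not taken from any real flow.

```python
import re

# Assumed layout: one "Label: value" pair per line; labels contain only
# letters, spaces and dots. Lines without a colon are ignored.
FIELD_PATTERN = re.compile(
    r"^[ \t]*(?P<label>[A-Za-z][A-Za-z .]*?)[ \t]*:[ \t]*(?P<value>.+?)[ \t]*$",
    re.MULTILINE,
)


def extract_fields(text):
    """Return a dict mapping each label to its value."""
    return {m.group("label"): m.group("value") for m in FIELD_PATTERN.finditer(text)}


# Hypothetical text as it might come back from a PDF text extraction step.
sample = "Invoice No: 1001\nDate: 2020-11-05\nTotal due: 29.00\nThank you"

fields = extract_fields(sample)
```

Each entry in `fields` could then be written to its own Excel cell. Real documents usually need a pattern tuned to their layout.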

GK

Anonymous
Not applicable

Hi,

 

Any news on this?
