LNeaga_Next
Frequent Visitor

AI Builder form processing multiple pages from one pdf file

Hello,

We are considering using AI Builder form processing to process multiple invoices and insert the data into an Excel file.

 

We have created a model, but when we run the flow on a single PDF file containing multiple invoices (all in the same format), AI Builder reads and inserts the data for the first invoice only.

 

Is there a way to read the information from all pages and insert it into an Excel file at the same time?

Thanks!

5 REPLIES
CedrickB
Power Apps

Hi,

It depends on your documents.

If you have one invoice per page, you can split the PDF and perform the extraction on the split files.

Otherwise, you can use a Text Recognizer model to extract all the text, identify the page breaks based on patterns, and then split the PDF.

In the coming weeks, we are going to provide the ability to predict on a specific page, which would avoid having to split the PDF once you have identified the page breaks.

Feel free to send an email to aihelpen@microsoft.com so we can have a call and discuss the best option for your scenario.

wintechchen
Regular Visitor

Hi LNeaga,

I have a business case where a single PDF file contains multiple invoices. I use Encodian to split the PDF into multiple files, each containing only one invoice. Then I use the List Files action to get those split files, and an Apply to each action to get each file's content and pass it to the AI model for form processing. I hope this gives you some ideas.

@CedrickB , @LNeaga_Next ,
I have a very similar scenario. We want to put multiple signed paper documents (same 'collection', exact same layout) on a scanner and let the computer separate out every document (the length of the documents is variable: 1, 2, 3, 4... pages). Once that one long PDF is split up, we want to use the AI model to extract the data from every document.
I'm stuck on correctly separating the individual documents.

Detailed help on how to fix this would be appreciated. 

Hi Sebastien,

First of all, using AI to perform this intelligent splitting is on our radar, but I can't give an ETA so far.

In the meantime, the alternative is to use an "old school" pattern search approach.

If your invoices contain text such as "Page 1, Page 2..." or other noticeable page breaks, you can use Text Recognition to extract the text and then search for those patterns to locate the first page of each invoice.
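The pattern search can be sketched outside of a flow as well. This is a minimal Python illustration, assuming the text of each page has already been extracted into a list of strings and that every document's first page contains a "Page 1" marker. The function name `find_first_pages`, the regular expression, and the sample texts are all hypothetical; adjust the pattern to whatever your own layout shows.

```python
import re

# Assumed marker for the first page of a document; adapt to your own layout.
# The trailing \b keeps "Page 10" or "Page 12" from matching.
FIRST_PAGE = re.compile(r"\bPage\s+1\b", re.IGNORECASE)


def find_first_pages(page_texts):
    """Return the 1-based page numbers whose text contains the marker."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if FIRST_PAGE.search(text)
    ]


# Made-up sample: one string of recognized text per scanned page.
pages = [
    "Invoice 1001 Page 1 of 2",
    "Invoice 1001 Page 2 of 2",
    "Invoice 1002 Page 1 of 1",
    "Invoice 1003 Page 1 of 12",
    "Invoice 1003 Page 10 of 12",
]
print(find_first_pages(pages))  # [1, 3, 4]
```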

 

Here is the process in a nutshell:

1. Call AI Builder Text Recognition

2. "Apply to each" on results

3. Use the Filter array action to match the text (edit in advanced mode to build a custom search expression)

4. Gather the page breaks in an Array variable that you have declared upfront

5. "Apply to each" on your array variable

6. Call Invoice Processing with "page range", using the page splits calculated above
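To make steps 4 to 6 concrete, here is a small Python sketch of how the collected first-page numbers could be turned into page ranges. It is only an illustration of the arithmetic, not the flow itself. The function name `page_ranges` is hypothetical, and the "start-end" string format is an assumption about what the "page range" input expects, so check it against your own action.

```python
def page_ranges(first_pages, total_pages):
    """Turn sorted 1-based first-page numbers into one range per document.

    Each document runs from its first page up to the page just before the
    next document's first page; the last one runs to the end of the PDF.
    """
    ranges = []
    for index, start in enumerate(first_pages):
        if index + 1 < len(first_pages):
            end = first_pages[index + 1] - 1
        else:
            end = total_pages
        # A single-page document is written as "3" rather than "3-3".
        ranges.append(str(start) if start == end else f"{start}-{end}")
    return ranges


# Documents start on pages 1, 3 and 4 of a 6-page scan.
print(page_ranges([1, 3, 4], 6))  # ['1-2', '3', '4-6']
```

Each resulting string would then be passed to one extraction call inside the second "Apply to each".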

JoeF-MSFT
Power Apps

Hi!

 

For future reference, you can download a sample flow that returns the page ranges that delimit the different documents within a PDF here: Know where to split a PDF with multiple documents ... (Power Platform Community)
