Brentleigh2021
New Member

Splitting Documents with Multiple Instances of the Same Form into Individual Documents

I am looking for advice on a complex Flow including AI Builder.

I have scanned PDF documents that contain many of the same type of form. 

These are check-off sheets that need to be extracted and saved as individual PDFs.

The name of each PDF needs to be the unique "Supplier Name - Order Number" that appears in the header of each check-off sheet.

Sample.png

These check-off sheets can be single-page or multi-page. I am trying to build a flow that takes the large initial document and:

a.) Identifies each different check-off sheet and extracts it to its own document

b.) Saves each different check-off sheet in a standard folder, with the name of each PDF Document being the Supplier Name and Order Number of each check-off sheet.
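The naming rule in (b) can be sketched outside the flow as follows. This is only an illustration in Python, not part of any Power Automate action: the function name `sheet_filename` and the set of characters it replaces are assumptions, and it presumes the AI model has already returned the two header fields as plain strings.

```python
import re


def sheet_filename(supplier_name: str, order_number: str) -> str:
    """Build 'Supplier Name - Order Number.pdf' from two extracted header fields.

    Hypothetical helper: the replaced character set below is an assumption
    about what the target storage rejects in file names.
    """
    raw = f"{supplier_name.strip()} - {order_number.strip()}"
    # Replace characters that many file systems and document libraries reject.
    safe = re.sub(r'[\\/:*?"<>|]', "_", raw)
    # Collapse runs of whitespace left over from OCR or extraction.
    safe = re.sub(r"\s+", " ", safe)
    return f"{safe}.pdf"


print(sheet_filename("Acme Ltd", "PO/12345"))  # Acme Ltd - PO_12345.pdf
```

Sanitizing the name matters here because order numbers often contain slashes, which would otherwise be read as folder separators when the file is saved.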

I have built and trained the AI Model for recognizing the Supplier Name and Order Number but am stuck on how to tackle the extracting.

Grateful if anyone can offer any advice on how to create a solution for this.

4 REPLIES
JoeF-MSFT
Power Apps

Hi @Brentleigh2021,

 

Interesting scenario!

 

Is there a word or phrase that marks the beginning of each check-off sheet (for example, "Our Order No" or "Page 1")? If so, you can download a sample flow that will help you achieve this from the following link: https://aibuilderdemo.blob.core.windows.net/storage/DetectPageBreaks-CloudFlow.zip. Once you have downloaded it, import it into your environment by going to My flows > Import.

 

Once you have imported the flow, set the delimiter word or sentence in the following action:

 

JoeFMSFT_0-1638913001926.png

 

The flow will extract all text on the document using AI Builder Text Recognition, look for the delimiter word or sentence and return a list of page numbers where that delimiter is present. You can then use a connector like the Adobe PDF Services connector to split your document by these pages. 
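The delimiter search described above can be illustrated with a short Python sketch. It is not the sample flow itself: `pages_with_delimiter` is a hypothetical name, and it assumes the recognized text is already available as one string per page, in page order.

```python
def pages_with_delimiter(page_texts: list[str], delimiter: str) -> list[int]:
    """Return the 1-based page numbers whose text contains the delimiter.

    Assumes page_texts[i] holds the recognized text of page i + 1.
    The comparison is case-insensitive to tolerate OCR casing differences.
    """
    needle = delimiter.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


# Hypothetical three-page scan: two sheets, the first one spanning two pages.
pages = [
    "Our Order No 1001 Supplier: Acme Ltd",
    "continued from previous page",
    "Our Order No 1002 Supplier: Bolt Co",
]
print(pages_with_delimiter(pages, "Our Order No"))  # [1, 3]
```

A plain substring match is the simplest choice, but it means a delimiter such as "Page 1" would also match "Page 10", so a phrase that appears only on the first page of each sheet is the safer pick.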

 

Hope this helps!

Brentleigh2021
New Member

Thank you @JoeF-MSFT !

That is definitely the right direction. However, the Split PDF connector keeps failing with the following error:

"Invalid page range syntax"

Brentleigh2021_0-1638968776145.png

 

I tested it using the "Document Delimiter".

Brentleigh2021_2-1638969067306.png

 

Am I using the correct output?

 

The documentation for the Split PDF connector hints at the syntax that is needed but I am not sure how to see what syntax is being created by the flow.

 

Brentleigh2021_4-1638969329094.png

Grateful for any further assistance.

Hello again,

 

Further to the above, I just realized that I could see the syntax the Flow is creating in the "Compose 2" step.

 

Brentleigh2021_0-1638994987064.png

Any suggestions for how to get it to give the first and last page numbers, separated by a dash?

And possibly in a single line if necessary?

 

The desired syntax, I believe, is: 1,2,3-4,5,6-7,8-11,12
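One way to get from a list of first pages to that syntax is sketched below in Python, purely to show the logic. The name `page_ranges` is hypothetical, and the sketch assumes that every sheet runs up to the page before the next sheet starts, with the last sheet ending on the final page of the PDF. Whether the Split PDF connector accepts exactly this string is taken from the post above, not verified.

```python
def page_ranges(start_pages: list[int], total_pages: int) -> str:
    """Turn sheet start pages into a 'first-last' range string on one line.

    Assumes each sheet ends on the page before the next start page and the
    last sheet ends on total_pages. Single-page sheets are written without
    a dash.
    """
    starts = sorted(set(start_pages))
    parts = []
    for index, first in enumerate(starts):
        # The next start page (if any) bounds the current sheet.
        if index + 1 < len(starts):
            last = starts[index + 1] - 1
        else:
            last = total_pages
        parts.append(str(first) if first == last else f"{first}-{last}")
    return ",".join(parts)


print(page_ranges([1, 2, 3, 5, 6, 8, 12], 12))  # 1,2,3-4,5,6-7,8-11,12
```

Note that any pages before the first detected start page are left out of the result, so it is worth checking that the delimiter is found on page 1.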

JoeF-MSFT
Power Apps

Hi!

 

For future reference, here you can download a sample flow that returns the page ranges, separated by a dash, that delimit the different documents within a PDF: Know where to split a PDF with multiple documents ... - Power Platform Community (microsoft.com)
