I am looking for advice on a complex flow that uses AI Builder.
I have scanned PDF documents that contain many of the same type of form.
These are check-off sheets that need to be extracted and saved as individual PDFs.
The name of each PDF needs to be the unique "Supplier Name - Order Number" that is in the header of each check-off sheet.
These Check-Off sheets can be single or multi-page. I am trying to build a flow that takes the large initial document and:
a.) Identifies each different check-off sheet and extracts it to its own document
b.) Saves each different check-off sheet in a standard folder, with the name of each PDF Document being the Supplier Name and Order Number of each check-off sheet.
I have built and trained the AI model for recognizing the Supplier Name and Order Number, but I am stuck on how to tackle the extraction.
Grateful if anyone can offer any advice on how to create a solution for this.
Is there a word or phrase that marks the beginning of a check-off sheet (for example, "Our Order No" or "Page 1")? If so, you can download a sample flow that will help you achieve this from the following link: https://aibuilderdemo.blob.core.windows.net/storage/DetectPageBreaks-CloudFlow.zip
Once you have downloaded it, import it into your environment by going to My flows > Import.
Once you have imported the flow, set the delimiter word or sentence in the following action:
The flow will extract all text on the document using AI Builder Text Recognition, look for the delimiter word or sentence and return a list of page numbers where that delimiter is present. You can then use a connector like the Adobe PDF Services connector to split your document by these pages.
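To make the logic concrete, here is a minimal Python sketch of the delimiter search described above. It is not the sample flow itself, only an illustration of the idea. It assumes the text recognition step has already produced one string per page; the function name `find_delimiter_pages`, the sample page texts, and the case-insensitive matching are all assumptions made for this example.

```python
def find_delimiter_pages(page_texts, delimiter):
    """Return the 1-based page numbers whose text contains the delimiter.

    page_texts is assumed to hold one OCR string per page, in page order.
    Matching is case-insensitive here; that is a choice made for this
    sketch, not something confirmed about the sample flow.
    """
    needle = delimiter.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


# Hypothetical OCR output for a four-page scan.
pages = [
    "Our Order No 1001 Supplier A",
    "Our Order No 1002 Supplier B",
    "Our Order No 1003 Supplier C",
    "continued from previous page",
]
print(find_delimiter_pages(pages, "Our Order No"))  # [1, 2, 3]
```

In this made-up example the third check-off sheet runs over pages 3 and 4, so page 4 is not reported as the start of a new sheet.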
Hope this helps!
Thank you @JoeF-MSFT !
That is definitely the right direction; however, the Split PDF connector keeps failing with the following error:
"Invalid page range syntax"
I tested it using the "Document Delimiter".
Am I using the correct output?
The documentation for the Split PDF connector hints at the syntax that is needed but I am not sure how to see what syntax is being created by the flow.
Grateful for any further assistance.
Further to the above, I just realized that I could see the syntax the Flow is creating in the "Compose 2" step.
Any suggestions for how to get it to give the first and last page numbers, separated by a dash?
And possibly in a single line if necessary?
The desired syntax, I believe, is: 1,2,3-4,5,6-7,8-11,12
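As a rough illustration of the conversion being asked for, the Python sketch below turns a list of sheet start pages into a single-line range string of that shape. It is not a Power Automate expression, and whether the Split PDF connector accepts exactly this syntax is not confirmed here. The function name `build_page_ranges`, the start pages, and the total page count are assumptions chosen so that the result matches the string above.

```python
def build_page_ranges(start_pages, total_pages):
    """Build a 'first-last' range string from the pages where sheets start.

    Each sheet runs from its start page to the page before the next start;
    the last sheet runs to total_pages. Pages before the first start page
    are ignored in this sketch.
    """
    starts = sorted(set(start_pages))
    if not starts:
        return ""
    if starts[0] < 1 or starts[-1] > total_pages:
        raise ValueError("start pages must lie between 1 and total_pages")

    parts = []
    for index, first in enumerate(starts):
        is_last_sheet = index + 1 == len(starts)
        last = total_pages if is_last_sheet else starts[index + 1] - 1
        # Single-page sheets are written as one number, not as "n-n".
        parts.append(str(first) if first == last else f"{first}-{last}")
    return ",".join(parts)


# Hypothetical start pages for a 12-page scan.
print(build_page_ranges([1, 2, 3, 5, 6, 8, 12], 12))  # 1,2,3-4,5,6-7,8-11,12
```

The same idea should carry over to a flow: loop over the start pages, take the next start page minus one as the end of the current sheet, and join the pieces with commas.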
For future reference, you can download a sample flow that returns the page ranges, separated by a dash, that delimit the different documents within a PDF here: Know where to split a PDF with multiple documents ... - Power Platform Community (microsoft.com)