oldhamjr
New Member

AI Builder Training Error

I am encountering the error below when attempting to train a Form model using multipage PDF files as the source. These files have already been optimized to reduce file size. Each of the 5 PDFs used in the attempted training contains no more than 50 pages.

 

"Fields could not be loaded for this document. It looks like this PDF document has many pages to process. We recommend that you split the PDF with only the pages you need to extract data from and upload the reduced document instead."

 

Does manually extracting the pages with the desired data for training negatively impact real-world use of the model, given that future documents won't have their pages manually extracted?

3 REPLIES
JoeF-MSFT
Power Apps

Hi @oldhamjr.

 

Thank you for posting this question, and apologies that you are experiencing this.

 

Unfortunately, a PDF with 50 pages might currently be too many pages to train a model. Do you need the model to extract data from all those pages? If not, you can generate 5 sample PDFs for training that contain only the pages with the data you want to extract.

 

Once the model is trained, you can use the model in a cloud flow in Power Automate and you will be able to specify which pages to process as described here. This will also help reduce the cost, as the cost of using AI Builder is per page. 

 

We're actively working so that, by the beginning of next year, training a form processing model on PDFs with 50 pages or more won't be an issue.

oldhamjr
New Member

Thank you for the quick response! Your comments regarding the training model make sense. Unfortunately, I won't know which pages contain the data, since these files are provided by many different third-party sources. Single-page training with single-page evaluation by Flow works well. Single-page training with a multi-page flow doesn't produce usable results in my table. Performing the upfront manual work to extract single pages for the Flow to scan seems like the only option with this tool. This doesn't seem to provide an easy, low-effort solution for my end users. Is there an option for the AI model to evaluate a .txt file, if the data from the PDF is converted to text?

 

JoeF-MSFT
Power Apps

Thank you @oldhamjr for the feedback. 

For documents with as many as 50 pages, the recommendation today is to use the page range option in the flow. We are working to enable processing of up to 500 pages at once by the beginning of next year.
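
When the pages holding the data are not known in advance, one possible way to apply the page range option is to run the model over consecutive ranges that together cover the whole document. The following is only a minimal sketch in Python, not an AI Builder API: `page_ranges` and the chunk size are hypothetical, and it assumes the page range input accepts a single page such as "5" or a range such as "1-20". Since the cost is per page, covering every page this way does not reduce the number of pages billed.

```python
def page_ranges(total_pages: int, chunk_size: int) -> list[str]:
    """Split pages 1..total_pages into consecutive range strings.

    Assumption: the flow's page range input accepts "N" for a single
    page and "start-end" for a range. Adjust if your input differs.
    """
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be positive")

    ranges = []
    for start in range(1, total_pages + 1, chunk_size):
        # Clamp the last chunk so it never runs past the final page.
        end = min(start + chunk_size - 1, total_pages)
        ranges.append(str(start) if start == end else f"{start}-{end}")
    return ranges


# Example: a 50-page PDF processed 20 pages at a time.
print(page_ranges(50, 20))  # ['1-20', '21-40', '41-50']
```

Each string could then be supplied as the page range for one run of the model inside a loop in the flow, keeping only the runs that return usable values.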
