Doffe83
Frequent Visitor

Which AI Builder model to use for sorting documents containing trigger words?

Hi, I have a case where we need to go through the PDF documents in a SharePoint library and sort them so that files containing certain defined trigger words are moved to another library. This needs to be done automatically.
It would be nice to extract how many times each trigger word is mentioned in each file, but that is not crucial.

Which AI Builder model is most suitable for this? 🙂


1 ACCEPTED SOLUTION

JoeF-MSFT
Power Apps

Hi @Doffe83, as nicely shared by @MarvinBangert, the document processing model is one option. However, you will need to provide samples of the different document layouts you have in order to train a custom model.

 

If you only need to search for specific words in the document, a more lightweight solution is the text recognition model, which returns all the text contained in a document. A cloud flow can then check whether a keyword is present in that text and move the PDF documents into different folders accordingly. Here's a cookbook for a similar case that can serve as inspiration: Extract Text from an image - Power Platform Community (microsoft.com)
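
To illustrate the keyword check itself, here is a minimal Python sketch of the logic the flow would apply to the recognized text. It is not AI Builder or Power Automate code: the function names, the library names "Flagged" and "Documents", and the sample text are all assumptions made up for this example. It also covers the optional per-word count asked about in the question.

```python
import re

def count_trigger_words(text, trigger_words):
    """Count case-insensitive, whole-word occurrences of each trigger word."""
    counts = {}
    for word in trigger_words:
        # \b keeps "invoice" from matching inside "invoiced"; this assumes
        # each trigger word starts and ends with a letter or digit.
        pattern = r"\b" + re.escape(word) + r"\b"
        counts[word] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

def choose_library(counts, flagged="Flagged", default="Documents"):
    """Pick the (hypothetical) target library: flagged if any word was found."""
    return flagged if any(n > 0 for n in counts.values()) else default

# Stand-in for the text returned by the text recognition step.
sample_text = "This contract is confidential. Confidential terms apply to the invoice."
counts = count_trigger_words(sample_text, ["confidential", "invoice", "urgent"])
target = choose_library(counts)
print(counts, target)
```

In a cloud flow, the same decision would be made with a condition on the recognized text, followed by a SharePoint move-file action for the documents that match.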

 

Hope this helps and you successfully build this solution!


3 REPLIES
MarvinBangert
Super User

Hi @Doffe83 

You could check out the document processing model. Since MS Build 2022, the "unstructured documents" model has been available. Depending on your documents and the information in them, the "structured / semi-structured documents" model might also be useful (structured if the document always looks the same, like an invoice; unstructured if you want to analyze text within a document, like a letter). Because unstructured documents are still in preview, make sure you use an environment where the feature is available (see Feature availability by region or US Government environment - AI Builder | Microsoft Docs) and keep an eye on changes to it.

 

You can select documents directly from your SharePoint library for training (split them first, so that the model doesn't train on all documents) and then select the areas you want to extract. Using the extracted information, you can extend your flow to move each document to another library based on the value.
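
As a rough illustration of the split mentioned above, here is a small Python sketch that sets aside part of a file list for checking the model afterwards. Training itself happens in the AI Builder interface, so this only shows the idea of holding documents back; the function name, the 80/20 ratio, the seed, and the file names are assumptions for the example.

```python
import random

def split_documents(file_names, train_fraction=0.8, seed=42):
    """Shuffle file names reproducibly and split them into training and held-out sets."""
    shuffled = sorted(file_names)           # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)   # local generator, leaves global random state untouched
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the SharePoint library contents.
files = [f"letter_{i:02d}.pdf" for i in range(10)]
train, holdout = split_documents(files)
print(len(train), "for training,", len(holdout), "held out for testing")
```

The held-out documents can then be run through the trained model to see how well it handles files it has not seen.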

 

Does this help you? Otherwise please give me some more information.

Best regards
Marvin

If you like this post, give a Thumbs up. If it solved your request, Mark it as a Solution to enable other users to find it.

Blog: Cloudkumpel

Doffe83
Frequent Visitor

Thank you very much, guys! I think the last and simplest option is the best fit here, because I only need it to trigger on specific words.
