How can we know if none of the collections inside a model was able to extract/understand a page?
We are working on a system with the following main requirements:
1) We have 3 types of documents (driving license, vehicle registration and personal ID).
2) Users will upload the 3 types of documents to the same SharePoint document library as part of larger PDF documents. For example, a PDF can contain 10 pages where one page contains the driving license and the other pages are supporting documents that we do not need to extract any data from.
3) To achieve this, we are planning to create a Form Processing model that contains 3 collections (one collection for each document type).
4) Then, inside Power Automate, we will read the PDF file >> loop through its pages >> and run the model on each page.
So we will have these cases:
1) The page contains a driving license, a vehicle registration or a personal ID >> so the related collection should be able to extract the data.
2) The page contains an anonymous image, i.e. one that matches none of the 3 document types.
So my question is: can we determine, inside our Power Automate flow, whether any of the collections has worked on the page or not? If none of the collections was able to process the page, then this means that the page is an anonymous document and we should not extract any data from it. Can anyone advise on this, please?
1. You just send every page to your multi-collection Form Processing model. If all extracted fields come back empty, it is likely that you have processed an anonymous document. The problem with this approach is that you will consume credits to process anonymous documents.
2. If you have a way to find a special keyword in your document that indicates the document type, you can use Text Recognition to perform a basic OCR first (this model is really cheap). If the keyword check shows that the document is not anonymous, you then send it to your Form Processing model for extraction.
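For the first option, the "all fields empty" check could look roughly like the sketch below. It is written in Python only to show the logic; in the flow itself this would be a Condition on the model output. The shape of `extracted` (a dict of field name to `value` and `confidence`) and the field names are assumptions for illustration, not the actual AI Builder output schema.

```python
def is_anonymous_page(extracted, min_confidence=0.0):
    """Return True when no field holds a usable value.

    `extracted` is a hypothetical dict such as
    {"LicenseNumber": {"value": "AB123", "confidence": 0.91}, ...}.
    """
    for field in extracted.values():
        # Treat missing, None and whitespace-only values as empty.
        value = (field.get("value") or "").strip()
        confidence = field.get("confidence", 0.0)
        if value and confidence >= min_confidence:
            return False  # at least one field was extracted
    return True


# Hypothetical results for two pages of the same PDF.
license_page = {"LicenseNumber": {"value": "AB123", "confidence": 0.91}}
other_page = {"LicenseNumber": {"value": "", "confidence": 0.0}}
```

Setting `min_confidence` above zero would also treat low-confidence guesses as "nothing extracted", which may help when an unrelated page happens to produce a weak match.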
For the keyword finding, you can take a look at this template for inspiration.
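As a rough illustration of the keyword idea (not the template itself), the routing step could be sketched like this in Python. The keyword lists and the `classify_page` name are made up for the example; you would replace them with phrases that actually appear on your documents, and the OCR text would come from the Text Recognition step.

```python
import re

# Hypothetical keyword lists; adjust to the wording printed on your documents.
KEYWORDS = {
    "driving_license": ["driving licence", "driving license", "driver license"],
    "vehicle_registration": ["vehicle registration", "registration certificate"],
    "personal_id": ["identity card", "personal id", "national id"],
}


def classify_page(ocr_text):
    """Return the matching document type, or None for an anonymous page."""
    # Collapse line breaks and repeated spaces, then compare in lower case.
    normalized = re.sub(r"\s+", " ", ocr_text).lower()
    for doc_type, phrases in KEYWORDS.items():
        if any(phrase in normalized for phrase in phrases):
            return doc_type
    return None


# Only pages with a match would be sent on to the Form Processing model.
page_type = classify_page("DRIVING\n  LICENCE   No. 12345")
```

Pages where `classify_page` returns `None` are skipped, so no Form Processing credits are spent on them.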