I wanted to confirm AI Builder's capability for extracting data from non-standard PDF invoices. The semi-structured invoices can come in 200 different formats.
I am using AI Builder to extract data from around 80 layouts; 200 should not be a problem as long as the layouts differ enough from each other.
In my case, some layouts were almost identical (they came from the same vendor system but had, for example, the buyer and seller data switched around).
I ended up splitting the data sets for those layouts into 2 separate models and then orchestrating them via a flow:
1. Try model 1 and check whether the extracted tax number is allocated to model 1 (in an Excel table). If not:
2. Extract with model 2 and check whether the tax number is allocated to model 2 (in an Excel table). If yes, put the extracted data into the Excel table/list. If the tax number is not allocated to any model, cancel the flow (this means a totally new format, so the data was not recognized correctly).
The issue with this approach is that you consume AI credits twice for the layouts that belong to model 2: first when you pass them through model 1, and again when you pass them through model 2.
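For anyone who wants to see the logic outside of a flow, here is a rough Python sketch of the same fallback. It does not call AI Builder; the extractors are plain callables standing in for the two models, the `allocation` dictionary stands in for the Excel table, and the names `extract_with_fallback`, `tax_number`, `model1` and `model2` are all made up for illustration. The call counter is only there to show where the double credit consumption comes from.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for a model call: takes a document, returns extracted fields.
Extractor = Callable[[bytes], Dict[str, str]]


def extract_with_fallback(
    document: bytes,
    models: List[Tuple[str, Extractor]],
    allocation: Dict[str, str],
) -> Tuple[Optional[Dict[str, str]], int]:
    """Try each model in order and accept the first result whose tax number
    is allocated to that model in the lookup table.

    Returns (fields, calls). fields is None when no model matched, which
    corresponds to cancelling the flow for a totally new format.
    calls is the number of model runs, i.e. how many times credits were spent.
    """
    calls = 0
    for model_name, extract in models:
        fields = extract(document)
        calls += 1
        tax_number = fields.get("tax_number")  # assumed field name
        # Accept only if the lookup table assigns this tax number to this model.
        if tax_number is not None and allocation.get(tax_number) == model_name:
            return fields, calls
    return None, calls
```

A document that belongs to the second model always costs two runs here, which is the credit issue described above. Putting the model that handles the most documents first in the `models` list would reduce how often that happens.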
Thanks a lot. Will check and come back.
You can explore the prebuilt invoice processing model in AI Builder. It returns almost all of the standard fields. Training custom models is practically not feasible for 200-odd templates.