tareenmj
Helper I

Custom AI Model vs putting OCR conditions

I have a flow in which a user submits a screenshot via Microsoft Flows. These screenshots contain statistics. The issue is that the screenshots can vary significantly, not only in the position of the relevant stats but also in the number of stats. Examples of the screenshots are as follows: (i) Samsung 1:

[Image: Samsung 1]

 

(ii) Samsung 2:

[Image: Samsung 2]

 

(iii) iPhone:

[Image: Dev7 - Service mode.jpg.png]

 

You can see that these screenshots differ in both content and layout. These are just some of the images; there are other devices as well (LG, etc.). Currently, I run OCR, store the results in an array, and then loop through the array to find the relevant information. This is difficult because the OCR array can vary dramatically from one image to the next.

 

I don't have a lot of experience with AI Builder. Is this task better suited to AI Builder, or should I stick with my current process of running OCR and extracting the information myself?

 

Thanks for the help. 

2 ACCEPTED SOLUTIONS

JoeF-MSFT
Power Apps

Hi @tareenmj - it's great to see you pushing forward to automate this process. 🙂 And thanks @jinivthakkar for sharing your experience! 

 

The good thing is that AI Builder Form Processing is built on top of Azure Form Recognizer, so you get the same AI technology in both cases. In addition, AI Builder gives you an intuitive user experience and seamless integration with Power Automate and the rest of the Power Platform.

 

We can try the following for this use case:

  1. Create a new AI Builder Form Processing model: Create a form processing custom model - AI Builder | Microsoft Docs
  2. Define the different fields you want to extract from the screenshots.
  3. Upload at least 5 sample screenshots of each type and group them by collection (1 collection for Samsung, 1 collection for iPhone...): https://docs.microsoft.com/en-us/ai-builder/create-form-processing-model#group-documents-by-collecti...
  4. Tag the documents and train the model.

 

Once the model has been trained, you can test the model with new screenshots and see if it's able to extract the data. If you try this, I'd be super interested in hearing back how it works.


JoeF-MSFT
Power Apps

Hi @tareenmj! For randomly extracted data, you can check the confidence score of the value. If the confidence score is low, you can discard the result. 
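
To illustrate the confidence check, here is a minimal Python sketch. It assumes the model output has already been turned into a mapping of field name to (value, confidence) pairs; that shape, the field names, and the 0.8 threshold are all hypothetical and would need to be adapted to what your flow actually returns.

```python
from typing import Dict, Optional, Tuple

# Hypothetical shape of one model result: field name -> (value, confidence in [0, 1]).
Extraction = Dict[str, Tuple[str, float]]


def keep_confident(fields: Extraction, threshold: float = 0.8) -> Dict[str, Optional[str]]:
    """Return each field's value, or None when its confidence is below the threshold."""
    return {
        name: (value if confidence >= threshold else None)
        for name, (value, confidence) in fields.items()
    }


# Made-up example: "variable_a" was not really on the screenshot, so the
# model's guess comes back with a low score and is blanked out.
sample = {
    "variable_a": ("-97", 0.34),
    "variable_b": ("12", 0.96),
}
print(keep_confident(sample))
```

The right threshold depends on your data; one way to pick it is to run a handful of screenshots that are known not to contain a field and see what scores the model reports for it.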

 

On detecting values from text, one possible approach could be to use Entity Extraction: Entity extraction custom AI model overview - AI Builder | Microsoft Docs
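
For comparison, a plain (non-AI) baseline for detecting values in OCR text is to search each line for a known label followed by a number, regardless of where the line appears. The sketch below is not AI Builder entity extraction; it is a simple regular-expression lookup, and the labels and sample lines in it are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Optional


def find_labeled_values(ocr_lines: Iterable[str], labels: List[str]) -> Dict[str, Optional[str]]:
    """Find the first number that follows each label; None if the label never appears."""
    results: Dict[str, Optional[str]] = {label: None for label in labels}
    for line in ocr_lines:
        for label in labels:
            if results[label] is not None:
                continue  # keep the first match only
            # Label, optional ':' or '=', then an optionally negative number.
            pattern = rf"\b{re.escape(label)}\b\s*[:=]?\s*(-?\d+(?:\.\d+)?)"
            match = re.search(pattern, line, re.IGNORECASE)
            if match:
                results[label] = match.group(1)
    return results


# Hypothetical OCR output; real lines will differ per device.
lines = ["Band: 4", "RSRP -97 dBm SNR=12.5"]
print(find_labeled_values(lines, ["RSRP", "SNR", "RSRQ"]))
```

A label that is not present stays None instead of picking up an unrelated value, which makes this a useful cross-check against model output when the set of labels is known in advance.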


5 REPLIES
jinivthakkar
Dual Super User

@tareenmj I have worked with AI in the past, and I can say for sure that using OCR is going to be super painful. I had a similar use case where the information was random every time, and I used Azure Custom Form Recognizer, which is super powerful. That was 1.5 years ago; back then AI Builder was not that good, but it has improved a lot since.

Now even Azure Custom Form recognizer has become more powerful. We used Azure because it was much cheaper and more powerful.

 

You need to analyze your use case and then make a decision. I am still inclined towards Azure due to its ease of use and integration.

 

Also, there is a dedicated forum for AI, where you will get precise information because the AI team itself responds there:

https://powerusers.microsoft.com/t5/AI-Builder/bd-p/AIBuilder

 

--------------------------------------------------------------------------------

If this post helps answer your question, please click on “Accept as Solution” to help other members find it more quickly. If you thought this post was helpful, please give it a Thumbs Up.


Thank you both, @jinivthakkar and @JoeF-MSFT, for your help. It is greatly appreciated! This is so helpful, and I will try training an AI Builder model with 3 different collections and checking whether it correctly detects the information. The problem I faced when using AI Builder in the past was that it seemed to extract information from the screenshot even when it wasn't there. For example, if the model was trained to detect 'variable A' and I uploaded a screenshot that didn't mention 'variable A' at all, it would grab something else from the screenshot, whereas the field should have been blank.

 

I was wondering whether there is an AI Builder model in Power Automate that can detect values from text. That is, could I run OCR on the image, store the result in a string, and then run AI on that string? Do you believe this would be a better approach to avoid edge cases or unfamiliar cases?


Very helpful. In your opinion, should I stick with image extraction using three different collections, or should I go with entity extraction? The only issue I have with entity extraction is that the text data can look significantly different after OCR (i.e. screenshots from different devices produce very different text).
