Frequent Visitor

Machine Learning - Text Classification applied to my group mailbox

Hi experts,


This is Ken. I plan to use a machine learning method to sort the emails in my group mailbox into groups based on their subject. I have prepared a dataset of email subjects and their related groups; each email is assigned to exactly one group in the training data.


I have some questions and would like the team's advice.


Q1: I have built a model whose performance score is up to 93%. However, while testing the model, I entered some email subjects that contain no clues pointing to any of my target groups, and the model returned 99% for one specific group. How should I interpret the 99%?




Q2: In AI Builder, we need to load the training data into a Dataverse table, and we can test the model by entering subjects one by one. Is there an easy way to train and test on the data in batch?


Q3: I learned from the documentation that each tag/category should have at least 100 samples for testing, and that the test cases should be evenly distributed across the categories. Is there a maximum number of categories or tags I can use in AI Builder?

Power Apps

Here are some answers


- The performance score provided is a self-assessment of the model. A subset S1 of the dataset is used for training and a subset S2 is used to test the created model. Consider a record R1 belonging to S2 and associated with tags T1. We test R1 with the model generated using S1. If the result returned is T1, the test succeeds; otherwise it fails. We do this for all records of S2 and calculate the accuracy from the success rate.

- If you only use the subject, it may not contain enough semantic information for the model to make good predictions. We recommend avoiding short text (<100 characters). One option is to concatenate the body of the email to the subject, truncating at 1000 characters for instance, to give the model more information to learn from.
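The two points above can be sketched in a few lines of Python. This is only an illustration, not how AI Builder is implemented: the function names are hypothetical, the 20% test fraction is an assumption (the actual ratio is not stated in this thread), and `predict` stands in for whatever trained model you are evaluating.

```python
import random

def build_text(subject, body, max_body_chars=1000):
    """Concatenate the subject with the body, truncating the body
    (1000 characters is the example figure from the reply above)."""
    return f"{subject}\n{body[:max_body_chars]}".strip()

def split_records(records, test_fraction=0.2, seed=0):
    """Shuffle the records and split them into a training subset S1
    and a test subset S2. The 0.2 fraction is an assumption."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (S1, S2)

def accuracy(predict, test_records):
    """Fraction of S2 records whose predicted tag matches the expected tag."""
    hits = sum(1 for text, tag in test_records if predict(text) == tag)
    return hits / len(test_records)
```

A model would be trained on S1 only, and `accuracy(model_predict, s2)` then plays the role of the performance score: the share of held-out records for which the returned tag equals the expected one.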


You can use Power Automate to run batch predictions using the AI Builder action, see 


You can have up to 200 categories, see 

The more categories you have, the more difficult it is for the model to differentiate between them, so the better prepared your training set should be (even distribution of samples, significant semantic information in the samples provided, and a sufficient number of samples, >20 for each category).
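Before loading the data, the conditions listed above can be checked with a short script. This is a minimal sketch in plain Python: the function name and the `max_ratio` threshold used to judge "even distribution" are assumptions made for illustration, while the 200-category limit and the more-than-20-samples figure are taken from the reply above.

```python
from collections import Counter

MAX_CATEGORIES = 200   # limit quoted in the reply above
MIN_SAMPLES = 20       # the reply asks for more than 20 samples per category

def check_training_set(tags, max_ratio=2.0):
    """Return a list of human-readable problems found in the tag column.
    max_ratio (largest category / smallest category) is a hypothetical
    threshold for flagging an uneven distribution."""
    counts = Counter(tags)
    problems = []
    if len(counts) > MAX_CATEGORIES:
        problems.append(f"{len(counts)} categories exceeds {MAX_CATEGORIES}")
    for tag, n in sorted(counts.items()):
        if n <= MIN_SAMPLES:
            problems.append(f"category {tag!r} has only {n} samples")
    if counts and max(counts.values()) > max_ratio * min(counts.values()):
        problems.append("samples are unevenly distributed across categories")
    return problems
```

For example, `check_training_set(["Billing"] * 100 + ["Support"] * 5)` reports both the under-sampled category and the uneven distribution, while a balanced set with 30 samples per category returns an empty list.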


Let me know if this helps.

Frequent Visitor

Thank you so much for your response.


Follow-up on Q1: Thank you for your comment. As I understand it, AI Builder automatically splits my dataset into training and test data to compute the score. Do you know what ratio it uses?


Follow-up on Q3: Suppose my dataset has 3 categories. How should I train the model to identify "Other" (emails that do not fall under any of the 3 categories)?
