stpuceli

Summarize Text from large documents

Ever have a large document that you don’t have time to read, and wish you had a summary to help you decide whether it’s worth the time? The Summarize Text API in Cognitive Services is designed to extract the key sentences that best represent a document’s main topic, so you can quickly understand what the document is about.

Overview

1-Overview.png

  • The document is first uploaded to a SharePoint library.
  • Power Automate runs OCR on the document and extracts its text.
  • The text is then sent to the summarization API via an Azure Function.
  • The resulting summary is written to a multi-line text field in the SharePoint document library, where it is indexed and searchable.

Setup

In your Azure subscription, you’ll need to create a Cognitive Services Text Analytics resource. This sets up the components you need, specifically the endpoint used to invoke the service and the key used to authenticate. I created a new resource group in Azure to contain all the components we’ll need.
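Since the endpoint and key are the two values everything else depends on, here is a minimal Python sketch of reading them from configuration instead of hard-coding them. The variable names `TEXT_ANALYTICS_ENDPOINT` and `TEXT_ANALYTICS_KEY` and the `load_config` helper are hypothetical, chosen only for this illustration; they are not part of the original solution.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LanguageServiceConfig:
    endpoint: str  # base URL of the Cognitive Services resource
    key: str       # one of the two keys shown in the Azure portal


def load_config(env: Mapping[str, str] = os.environ) -> LanguageServiceConfig:
    """Read the endpoint and key from environment-style settings.

    The setting names below are assumptions made for this sketch;
    use whatever names your own deployment defines.
    """
    endpoint = env.get("TEXT_ANALYTICS_ENDPOINT", "").strip().rstrip("/")
    key = env.get("TEXT_ANALYTICS_KEY", "").strip()
    if not endpoint.startswith("https://"):
        raise ValueError("TEXT_ANALYTICS_ENDPOINT must be an https:// URL")
    if not key:
        raise ValueError("TEXT_ANALYTICS_KEY is not set")
    return LanguageServiceConfig(endpoint=endpoint, key=key)
```

Keeping the key in application settings rather than in source code means it can be rotated in the Azure portal without rebuilding anything.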

 

2-Setup.png

Endpoint API needed to call the service

 

3-Setup.png

4-PowerAutomate.png

Power Automate core activities

Two main steps need to be configured in Power Automate to accomplish this:

  1. Recognize Text In Image – use the AI Builder model in Power Automate that will “Extract all the text in photos and PDF Documents (OCR)”. This extracts text from PDF documents (both image-based and text-based), which are the documents we see most often.

5-PowerAutomate.png

  2. HTTP call to an Azure function – The summarization API does not have a user-friendly interface, so I created an Azure function that does all the necessary processing and returns the summary of the document.
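To give a feel for the kind of processing such a function has to do, here is a rough Python sketch. It is not the code from the GitHub solution. The request shape reflects my understanding of the extractive summarization REST API in the Azure Language service, and the service version used in this post may expect something different. The helper names, the `sentence_count` default, and the assumption that each returned sentence carries `text` and `offset` fields are all illustrative.

```python
from typing import Any, Dict, List


def build_summary_request(text: str, sentence_count: int = 3,
                          language: str = "en") -> Dict[str, Any]:
    """Build a JSON body asking for an extractive summary of one document.

    Assumed shape: a single document plus one "ExtractiveSummarization" task.
    """
    return {
        "displayName": "Document summary",
        "analysisInput": {
            "documents": [{"id": "1", "language": language, "text": text}]
        },
        "tasks": [
            {
                "kind": "ExtractiveSummarization",
                "taskName": "summary",
                "parameters": {"sentenceCount": sentence_count},
            }
        ],
    }


def join_summary_sentences(sentences: List[Dict[str, Any]]) -> str:
    """Turn the extracted sentences into one summary string.

    Sentences are ordered by their position in the source document
    (the assumed "offset" field) so the summary reads naturally.
    """
    ordered = sorted(sentences, key=lambda s: s.get("offset", 0))
    return " ".join(
        s["text"].strip() for s in ordered if s.get("text", "").strip()
    )
```

Wrapping both steps in a function means the flow only has to send plain text and read back a single string.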

 

 

Putting it all together

Once AI Builder has returned its results, we can send the extracted text to the Azure Function for processing.

Send the contents of the file to AI Builder

 

6-PowerAutomate.png

Extract the OCRed text

The result of the AI Builder activity is a JSON string that needs to be parsed. Ultimately, what we’re looking for is the “text” attribute in the JSON output. Each value is appended to our variable as we loop through the entire output.

 

7-JSON.png

8-PowerAutomate.png

Loop through the JSON output, extracting the “text” attribute
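For readers who think better in code than in flow actions, the same loop can be sketched in Python. This is only an illustration of the idea, not something the flow runs. Because the exact layout of the AI Builder output is not reproduced here, the sketch makes no assumption about nesting and simply collects every string stored under a “text” key; the helper names are hypothetical.

```python
import json
from typing import Any, Iterator


def iter_text_values(node: Any) -> Iterator[str]:
    """Yield every string stored under a "text" key, in document order."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "text" and isinstance(value, str):
                yield value
            else:
                # Keep descending: the text may sit several levels deep.
                yield from iter_text_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_text_values(item)


def extract_ocr_text(ocr_json: str) -> str:
    """Parse the OCR output and join all "text" values, one per line."""
    return "\n".join(iter_text_values(json.loads(ocr_json)))
```

If the output repeats the same text at more than one level (for example per page and per line), this would collect it twice, so in that case restrict the walk to a single level.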

 

Deploy the Azure function

As part of this solution, I created an Azure function that needs to be deployed to the resource group created earlier. Download the Visual Studio solution from GitHub and update the AzureKeyCredential with a valid key from the Cognitive Services Text Analytics resource you created earlier.

 

9-TextAnalytics.png

Build and deploy the Azure function to the resource group created above. This will provide you with the endpoint URL needed to make the HTTP call in Power Automate.

10-HTTP.png

Call the Azure Function

Now that the Azure function is deployed, everything is in place to send the extracted text to the summarization API for processing.
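If you want to test the function outside of Power Automate first, the equivalent HTTP call can be sketched in Python. The URL, the `{"text": ...}` body, and the helper name are assumptions made for illustration, so match them to whatever your deployed function actually expects. The `x-functions-key` header is the standard way to pass an Azure Functions key, but your function may instead take the key as a `code` query parameter in the URL.

```python
import json
import urllib.request


def build_function_request(function_url: str, function_key: str,
                           text: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST that carries the OCRed text."""
    # Assumed body shape: a single JSON property holding the document text.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        function_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "x-functions-key": function_key,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the response body should then contain the summary that the flow writes back to the library.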

11-HTTP.png

Once you have the results back, update the library.

12-PowerAutomate.png

The resulting text is then displayed in the library for users to view and search.

13-Library.png
