
Convert PDF to Text / Table / Image

Being able to handle PDF files is one of the most basic requirements.

 

Ideally it would be able to extract table data from the PDF (see the Power BI connector), or at least extract the text from the PDF. If that is too difficult, being able to save the PDF as an image would be an option, after which we could OCR it.

Status: New
Comments
Level: Power Up

Surely this can be connected to either SelectPDF or SpirePDF, both of which have free API/NuGet PDF raster functions that convert a PDF to ".jpg" or ".png" images, the input format required by "Azure > Cognitive Services > Computer Vision".

 

Then "Azure > Cognitive Services > Computer Vision" can handle the text recognition. The FREE instance allows 5,000 documents per month (max 20 per minute). That's a decent workload for free, in my opinion.
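
To put those free-tier numbers in perspective, here is a rough sketch of how the OCR step could be paced so it stays under 20 calls per minute and 5,000 per month. It is not a real connector: `ocr_pages`, `recognize` and `minutes_needed` are hypothetical names, `recognize` stands in for whatever actually calls Computer Vision, and I am assuming each rasterized page image counts as one call against the limits.

```python
import time
from typing import Callable, Iterable, List

# Limits quoted in the comment above for the free instance.
PER_MINUTE_LIMIT = 20
MONTHLY_LIMIT = 5000
MIN_INTERVAL = 60.0 / PER_MINUTE_LIMIT  # 3 seconds between calls


def ocr_pages(
    page_images: Iterable[bytes],
    recognize: Callable[[bytes], str],  # hypothetical: sends one image to the OCR service
    used_this_month: int = 0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run `recognize` on each page image, spacing calls to respect the limits."""
    texts: List[str] = []
    last_call = None
    for image in page_images:
        # Assumption: one page image = one call counted against the monthly quota.
        if used_this_month + len(texts) >= MONTHLY_LIMIT:
            raise RuntimeError("monthly free quota would be exceeded")
        if last_call is not None:
            wait = MIN_INTERVAL - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        texts.append(recognize(image))
    return texts


def minutes_needed(pages: int) -> float:
    """Lower bound on minutes needed to OCR `pages` images at the per-minute cap."""
    return pages / PER_MINUTE_LIMIT
```

By this estimate, using up the whole monthly allowance takes at least `minutes_needed(5000)` = 250 minutes of calls, a little over four hours, so the per-minute cap is unlikely to be the bottleneck for a small workload.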