
November 26th, 2012, 01:57 AM
PHP5 - PDF File Conversion to Images & Text Extraction from Images in the Cloud
The Saaspose development team is happy to announce that you can now convert PDF files to images and recognize the text in those images using Saaspose APIs. A useful aspect of the Saaspose APIs is that you can combine several file format APIs to get the result you need. For example, you might render a PDF file as images with Saaspose.Pdf and then extract the text from those images with Saaspose.OCR.
Saaspose.Pdf is a REST API for creating and editing PDF files and converting them to other file formats. Saaspose.OCR is a REST API for optical character recognition and document scanning. Let's look at how you can use these two REST APIs together to work with PDF files and text recognition. The first step is to convert the PDF file to images using the Saaspose.Pdf API.
Saaspose.Pdf converts PDF files to images in the cloud: you can convert the whole PDF file or only the pages you need. Supported image formats include JPEG, PNG, GIF, BMP, and TIFF. Once the PDF file has been converted to images, you can use the Saaspose.OCR REST API to recognize the text in those images and save it to a database. Saaspose.OCR can also recognize font attributes of the extracted text, such as font type, font style, and font size.
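To make the page-by-page conversion step concrete, here is a minimal PHP 5 sketch that only builds the request URLs for turning selected pages into images. The base URL, the `/pdf/{file}/pages/{n}` path layout, and the `format` query parameter are assumptions made for illustration, not the documented Saaspose.Pdf endpoints, and request signing/authentication is left out entirely. Check the official Saaspose.Pdf documentation for the real resource paths before sending anything.

```php
<?php
// Hypothetical base URL: a placeholder, NOT the real Saaspose endpoint.
define('EXAMPLE_PDF_API_BASE', 'https://api.example.com/v1.0');

/**
 * Build the URL for rendering one PDF page as an image.
 * The path layout and the "format" parameter are assumptions for illustration.
 */
function buildPageImageUrl($fileName, $pageNumber, $format)
{
    // Image formats named in the announcement.
    $allowed = array('jpeg', 'png', 'gif', 'bmp', 'tiff');
    $format = strtolower($format);
    if (!in_array($format, $allowed, true)) {
        throw new InvalidArgumentException('Unsupported image format: ' . $format);
    }
    if (!is_int($pageNumber) || $pageNumber < 1) {
        throw new InvalidArgumentException('Page number must be a positive integer');
    }

    return EXAMPLE_PDF_API_BASE
        . '/pdf/' . rawurlencode($fileName)
        . '/pages/' . $pageNumber
        . '?' . http_build_query(array('format' => $format));
}

/**
 * Build one URL per requested page, e.g. to convert only pages 1 and 3.
 */
function buildPageImageUrls($fileName, array $pageNumbers, $format)
{
    $urls = array();
    foreach ($pageNumbers as $pageNumber) {
        $urls[] = buildPageImageUrl($fileName, $pageNumber, $format);
    }
    return $urls;
}
```

Calling `buildPageImageUrls('report.pdf', array(1, 3), 'png')` returns two URLs, one per page. Converting the whole file would mean passing every page number, assuming the service has no single whole-document call.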
Saaspose.Pdf can also convert a PDF page to an image at the default size or at a size you specify. You can then process the resulting images with other Saaspose APIs; for instance, Saaspose.OCR recognizes characters in images in different languages, such as English, French, and Spanish. By combining these two REST APIs, you can go from a PDF file to page images and recognized text.
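The size and language options could be passed along these lines. This is only a sketch: the `width`, `height`, and `language` parameter names, the `/ocr/{image}/recognize` path, and the base URL are hypothetical stand-ins rather than the documented Saaspose API, and authentication is again omitted.

```php
<?php
// Hypothetical base URL: a placeholder, NOT the real Saaspose endpoint.
define('EXAMPLE_OCR_API_BASE', 'https://api.example.com/v1.0');

/**
 * Build the query string for a page-to-image request.
 * Leaving out width and height means "use the default size" in this sketch.
 * The parameter names are assumptions for illustration.
 */
function buildImageQuery($format, $width = null, $height = null)
{
    $query = array('format' => strtolower($format));
    if ($width !== null && $height !== null) {
        $query['width'] = (int) $width;
        $query['height'] = (int) $height;
    }
    return http_build_query($query);
}

/**
 * Build the URL for recognizing text in an already converted image.
 * Only the languages named in the announcement are accepted here.
 */
function buildRecognizeUrl($imageName, $language = 'english')
{
    $languages = array('english', 'french', 'spanish');
    $language = strtolower($language);
    if (!in_array($language, $languages, true)) {
        throw new InvalidArgumentException('Unsupported OCR language: ' . $language);
    }

    return EXAMPLE_OCR_API_BASE
        . '/ocr/' . rawurlencode($imageName)
        . '/recognize?' . http_build_query(array('language' => $language));
}
```

For example, `buildImageQuery('png', 800, 600)` yields a query with an explicit size, while `buildImageQuery('png')` leaves the size to the service default. The image produced by the first request would then be passed by name to `buildRecognizeUrl()`.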