AWS Amazon Textract - Extract Text and Data with Machine Learning


Amazon Textract is a service that automatically extracts text and data from scanned documents. It goes beyond simple optical character recognition (OCR): it also identifies the contents of form fields and the information stored in tables.

Today, many companies extract data from documents and forms through slow and expensive manual data entry, or through simple OCR software that requires manual customization or configuration. The rules and workflow for each document and form usually have to be hard-coded, then updated every time a form changes or a new form is added. If a form deviates from the rules, the output is often garbled and unusable.

Amazon Textract overcomes these challenges by using machine learning to instantly "read" almost any type of document and accurately extract text and data without manual effort or custom code. With Textract, you can quickly automate your document workflows and process millions of document pages in a matter of hours. Once the information is captured, you can act on it in your business applications to start the next step of, for example, a loan application or a medical claim.

Advantages of Amazon Textract:

  1. Optical Character Recognition (OCR)
  2. Machine learning backend
  3. No machine learning expertise required
  4. Extract data quickly and accurately
  5. No code or templates to maintain
  6. Reduce document processing costs
  7. Automatically identify key-value pairs
  8. Automatic recognition of table values
  9. Image scanner
  10. PDF scanner
  11. Detect Latin alphabet characters (English letters) and ASCII symbols
  12. Support PDF, JPG, PNG file formats
  13. JPG and PNG files up to 10MB
  14. PDF documents up to 300MB
  15. PDF documents up to 3000 pages
  16. Pay-as-you-go payment model
  17. Easy to customize
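
The format and size limits in the list above can be checked before a file is uploaded. The following is a minimal sketch in Python, used here for illustration only (the application itself is built on the AWS PHP SDK). The function name `check_upload` is hypothetical, and it assumes 1MB means 1024 * 1024 bytes.

```python
from pathlib import Path

# Limits taken from the list above. Assumption: 1MB = 1024 * 1024 bytes.
MB = 1024 * 1024
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MAX_IMAGE_BYTES = 10 * MB
MAX_PDF_BYTES = 300 * MB
MAX_PDF_PAGES = 3000


def check_upload(filename, size_bytes, page_count=None):
    """Return a list of problems; an empty list means the file fits the limits.

    Hypothetical helper, not part of the application or of any AWS SDK.
    """
    problems = []
    ext = Path(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        if size_bytes > MAX_IMAGE_BYTES:
            problems.append("image larger than 10MB")
    elif ext == ".pdf":
        if size_bytes > MAX_PDF_BYTES:
            problems.append("PDF larger than 300MB")
        # The page count is only checked when the caller knows it.
        if page_count is not None and page_count > MAX_PDF_PAGES:
            problems.append("PDF longer than 3000 pages")
    else:
        problems.append("unsupported file format: " + (ext or "none"))
    return problems
```

Returning a list rather than raising on the first failure lets a caller show every problem at once, for example an oversized PDF that also has too many pages.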






  • Create an IAM user with Amazon Textract and Amazon S3 policies attached.
  • For the PDF and Image Textract options, just add your AWS IAM user access key, secret access key, and AWS S3 bucket name to the configuration, and everything is ready.
  • The Maximum Textract option requires setting up the AWS Lambda, Amazon SNS, SQS, and SES services. Instructions are provided.
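
The three settings named above (access key, secret access key, bucket name) can be collected and checked in one place. This is a rough Python sketch for illustration, not the application's actual PHP configuration file. `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard AWS environment variable names; `TEXTRACT_S3_BUCKET` and `load_settings` are hypothetical names chosen for this example.

```python
import os

# The first two names are the standard AWS credential variables.
# TEXTRACT_S3_BUCKET is a hypothetical name used only in this sketch.
REQUIRED_SETTINGS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "TEXTRACT_S3_BUCKET",
)


def load_settings(env=None):
    """Return the required settings, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SETTINGS}
```

Failing early with the full list of missing names tends to be easier to debug than a credentials error returned later by the service.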



Cost of running Amazon Textract:



  • Hosting for the application itself (you can use any hosting platform you prefer)
  • An AWS account (free to open; you are on the Free Tier for the first year)
  • Amazon S3 storage cost (for data storage and data transfer out)

With Amazon Textract, you only pay for what you use. There is no minimum fee and no upfront commitment. Amazon Textract charges per page processed, and the rate depends on whether you extract only text or also tables and/or form data. A single page may contain 0 to 3,000 words.
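
The per-page model described above reduces to simple arithmetic. Here is a small Python sketch, for illustration only. The function name `estimate_cost` is hypothetical, and the per-page rates are passed in as parameters because this document does not state them; the values in the usage example are placeholders, not actual AWS prices.

```python
from decimal import Decimal


def estimate_cost(text_pages, analysis_pages, text_rate, analysis_rate):
    """Estimate a bill from page counts and per-page rates.

    text_pages: pages processed for text only.
    analysis_pages: pages processed for tables and/or form data.
    Rates are per page and must be supplied by the caller.
    """
    if text_pages < 0 or analysis_pages < 0:
        raise ValueError("page counts must not be negative")
    return text_pages * Decimal(text_rate) + analysis_pages * Decimal(analysis_rate)


# Placeholder rates, NOT actual AWS prices.
example = estimate_cost(1000, 200, "0.001", "0.05")
```

`Decimal` is used instead of floats so that small per-page rates add up without rounding drift.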

Detect Document Text API: The Detect Document Text API uses optical character recognition (OCR) technology to extract text from the documents you provide.
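
As a rough illustration of what the Detect Document Text API returns, the Python sketch below pulls the recognized lines out of a response. The response is a dictionary with a `Blocks` list, where blocks of type `LINE` carry a `Text` field. Python and the `boto3` library are used for illustration only (the application itself uses the AWS PHP SDK), and the helper names `extract_lines` and `detect_lines_in_s3_image` are hypothetical.

```python
def extract_lines(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_lines_in_s3_image(bucket, key):
    """Run OCR on an image stored in S3 and return its lines of text.

    Needs the third-party boto3 package and valid AWS credentials, so it is
    imported lazily and not exercised here.
    """
    import boto3

    client = boto3.client("textract")
    response = client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return extract_lines(response)


# A hand-written sample in the shape of a real response, for illustration.
sample = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "p1"},
        {"BlockType": "LINE", "Id": "l1", "Text": "Loan application"},
        {"BlockType": "WORD", "Id": "w1", "Text": "Loan"},
        {"BlockType": "WORD", "Id": "w2", "Text": "application"},
    ]
}
```

Keeping the parsing separate from the network call means the parsing can be tried on a saved response without touching AWS.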

Analyze Document API: The Analyze Document API extracts data from tables and key-value pairs from forms, for example the form label "First Name" and its associated value. When you use the Analyze Document API, the OCR performed by the Detect Document Text API is included at no extra charge.
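
To make the key-value idea concrete, here is a minimal Python sketch that rebuilds pairs such as "First Name" and its value from an Analyze Document response. It relies on the documented response layout: `KEY_VALUE_SET` blocks are marked `KEY` or `VALUE` in `EntityTypes`, a key points to its value through a `VALUE` relationship, and both point to their `WORD` blocks through `CHILD` relationships. Python is used for illustration only, and the function names are hypothetical.

```python
def _text_of(block, by_id):
    """Join the text of the WORD blocks that are children of this block."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id.get(child_id, {})
            # Non-word children (such as checkboxes) are skipped in this sketch.
            if child.get("BlockType") == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(response):
    """Return {key text: value text} for the form fields in a response dict."""
    by_id = {block["Id"]: block for block in response.get("Blocks", [])}
    pairs = {}
    for block in by_id.values():
        is_key = (
            block.get("BlockType") == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        values = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                values.extend(_text_of(by_id[vid], by_id) for vid in rel["Ids"])
        pairs[_text_of(block, by_id)] = " ".join(values)
    return pairs


# A hand-written sample in the shape of a real response, for illustration.
sample_form = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "First"},
        {"Id": "w2", "BlockType": "WORD", "Text": "Name"},
        {"Id": "w3", "BlockType": "WORD", "Text": "Jane"},
    ]
}
```

With the sample above, `extract_key_values(sample_form)` yields a dictionary that maps "First Name" to "Jane". Fields whose value is empty come back with an empty string.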

You can get started for free with the AWS Free Tier. For their first three months, new AWS customers can analyze up to 1,000 pages per month with the Detect Document Text API and up to 100 pages per month with the Analyze Document API.
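
The monthly free allowances above can be turned into a quick check of how many pages would actually be billed. This is a small Python sketch for illustration; `billable_pages` is a hypothetical helper, and it assumes the allowance is simply subtracted from the monthly page count for each API.

```python
# Monthly free allowances as stated above (first three months, new customers).
FREE_DETECT_TEXT_PAGES = 1000
FREE_ANALYZE_PAGES = 100


def billable_pages(pages_this_month, free_allowance):
    """Return how many pages exceed the free allowance (never negative)."""
    if pages_this_month < 0:
        raise ValueError("page count must not be negative")
    return max(0, pages_this_month - free_allowance)
```

For example, 1,500 Detect Document Text pages in a month would leave 500 billable pages, while 40 Analyze Document pages would leave none.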

  • For the latest prices, see the Amazon Textract pricing page.



Installation Notes:



Setup requirements:

  • AWS PHP SDK v3 (already included with the application)
  • AWS IAM user with Amazon Textract and Amazon S3 access policies attached
  • Amazon S3 bucket with public access
  • All of the above are also listed and explained in the documentation
  • To set up the Maximum Textract option, please refer to the documentation




  • FileSaver by Eli Grey
  • JSZip by Stuart Knightley
  • PDFObject by Philip Hutchison


Release notes - Change log:


20.06.2020 - 1.0.0
     - Update: Documentation 
     - Fix: Lambda function minor fix

08.05.2020 - 1.0.0
     - Initial Release



  • High Resolution: Yes
  • Compatible Browsers: Firefox, Opera, Chrome, Edge
  • Files Included: JavaScript JS, HTML, CSS, PHP
  • Software Version: PHP 7.x