AI-102 Designing and Implementing a Microsoft Azure AI Solution

Extract text from images or PDFs by using the Computer Vision service

Concepts

Azure Computer Vision lets you extract printed and handwritten text from images and PDF files. Typical uses include processing scanned documents, pulling information from receipts, and digitizing handwritten notes. This article shows how to use the Computer Vision service to extract text from images and PDFs, as covered by the AI-102 (Designing and Implementing a Microsoft Azure AI Solution) exam.

Prerequisites

To get started, you will need an Azure subscription and a Computer Vision resource created in the Azure portal. If you haven’t set up the resource yet, follow the Microsoft documentation to create one. The code examples also assume Node.js with the `@azure/cognitiveservices-computervision` and `@azure/ms-rest-js` packages installed from npm.

Once you have the required resources set up, you can use the Computer Vision API to perform Optical Character Recognition (OCR) on images or PDF files. OCR is the process of converting images of text into machine-readable text.

Steps to Extract Text from Images

To use the Computer Vision service, you need an API key and an endpoint URL. You can find these details in the Azure portal after provisioning the Computer Vision resource. Make sure to keep your API key secure.
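One way to keep the key out of source code is to read it from environment variables. The following is a minimal sketch of that idea, not part of the SDK: the variable names `VISION_KEY` and `VISION_ENDPOINT` and the helper `loadVisionConfig` are assumptions chosen for illustration.

```javascript
// Hypothetical variable names; use whatever your deployment defines.
const KEY_VAR = "VISION_KEY";
const ENDPOINT_VAR = "VISION_ENDPOINT";

// Read the key and endpoint from an environment-like object
// (defaults to process.env) and fail early if either is missing.
function loadVisionConfig(env = process.env) {
  const apiKey = env[KEY_VAR];
  const endpoint = env[ENDPOINT_VAR];
  if (!apiKey || !endpoint) {
    throw new Error(`Set ${KEY_VAR} and ${ENDPOINT_VAR} before running`);
  }
  return { apiKey, endpoint };
}

// Demo with a stand-in object instead of the real environment
const demo = loadVisionConfig({
  VISION_KEY: "0123456789abcdef",
  VISION_ENDPOINT: "https://example.cognitiveservices.azure.com/",
});
console.log(demo.endpoint);
```

In a real script you would call `loadVisionConfig()` with no argument so that the values come from `process.env`.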

Let’s begin by extracting text from an image. Assume that you have a PNG image named “sample_image.png” stored locally on your machine. Here’s how you can extract the text from the image using the Computer Vision service:

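const apiKey = "YOUR_API_KEY";
const endpoint = "YOUR_ENDPOINT_URL";

// Azure SDK packages: credentials helper and Computer Vision client
const { ApiKeyCredentials } = require("@azure/ms-rest-js");
const { ComputerVisionClient } = require("@azure/cognitiveservices-computervision");

// Instantiate the Computer Vision client; the key is sent in the subscription-key header
const credentials = new ApiKeyCredentials({ inHeader: { "Ocp-Apim-Subscription-Key": apiKey } });
const client = new ComputerVisionClient(credentials, endpoint);

// Read the image file into a buffer
const fs = require("fs");
const image = fs.readFileSync("sample_image.png");

// Perform OCR on the image with the asynchronous Read operation
async function extractTextFromImage() {
  // Submit the image; the response carries the URL of the pending operation
  let result = await client.readInStream(image);
  const operationId = result.operationLocation.split("/").pop();

  // Poll once per second until the operation finishes
  do {
    await new Promise(resolve => setTimeout(resolve, 1000));
    result = await client.getReadResult(operationId);
  } while (result.status === "notStarted" || result.status === "running");
  if (result.status !== "succeeded") throw new Error("Read operation failed");

  // Extract and display the recognized text, one output line per recognized line
  const extractedText = result.analyzeResult.readResults
    .map(page => page.lines.map(line => line.text).join("\n"))
    .join("\n");
  console.log(extractedText);
}

extractTextFromImage().catch(console.error);

In the above code, we first load the required SDK packages and create an instance of the Computer Vision client using your API key and endpoint. Then, we use the `readFileSync` function from the Node.js `fs` module to read the image file. Finally, we submit the image with the `readInStream` method, poll `getReadResult` until the operation reaches a final status, and join the recognized lines into a single string. The SDK also offers the older synchronous `recognizePrintedTextInStream` method, but it returns its results under `regions` rather than `analyzeResult.readResults` and does not accept PDF input.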
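The result processing above walks a nested structure: pages, then lines, then words. To make that shape concrete, here is a small sketch that runs without an Azure resource. The `sampleResult` object is hand-written to resemble the `analyzeResult.readResults` structure the code reads, its values are made up, and `textByPage` is a hypothetical helper name.

```javascript
// Hand-written sample shaped like a text-extraction result:
// pages -> lines -> words. The values are made up for illustration.
const sampleResult = {
  status: "succeeded",
  analyzeResult: {
    readResults: [
      {
        page: 1,
        lines: [
          { text: "Invoice 42", words: [{ text: "Invoice" }, { text: "42" }] },
          { text: "Total due", words: [{ text: "Total" }, { text: "due" }] },
        ],
      },
      { page: 2, lines: [] },
    ],
  },
};

// Return one string per page, with recognized lines separated by newlines.
// Missing fields are treated as empty so a failed result yields no pages.
function textByPage(result) {
  const pages = (result.analyzeResult && result.analyzeResult.readResults) || [];
  return pages.map(page => (page.lines || []).map(line => line.text).join("\n"));
}

const pages = textByPage(sampleResult);
pages.forEach((text, i) => console.log(`Page ${i + 1}: ${JSON.stringify(text)}`));
```

Keeping the text grouped by page, instead of joining everything into one string, makes it easier to trace a recognized line back to its position in the source document.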

Steps to Extract Text from PDFs

Now, let’s explore how to extract text from a PDF. Assume that you have a PDF file named “sample_pdf.pdf” stored locally on your machine. Here’s an example of how you can extract the text from the PDF using the Computer Vision service:

Extract Text from a PDF

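// Reuses `client` and `fs` from the image example above
const extractTextFromPDF = async () => {
  const pdfPath = "sample_pdf.pdf";

  // Submit the PDF to the asynchronous Read operation
  let result = await client.readInStream(fs.readFileSync(pdfPath));
  const operationId = result.operationLocation.split("/").pop();

  // Poll once per second until the operation finishes
  do {
    await new Promise(resolve => setTimeout(resolve, 1000));
    result = await client.getReadResult(operationId);
  } while (result.status === "notStarted" || result.status === "running");
  if (result.status !== "succeeded") throw new Error("Read operation failed");

  // Extract and display the recognized text; readResults holds one entry per page
  const extractedText = result.analyzeResult.readResults
    .map(page => page.lines.map(line => line.text).join("\n"))
    .join("\n");
  console.log(extractedText);
};

extractTextFromPDF().catch(console.error);

In this code snippet, we reuse the Computer Vision client from the image example and read the PDF file using `readFileSync`. Then, we call the `readInStream` method, passing in the PDF contents, and poll `getReadResult` until the operation finishes. The result holds one entry in `readResults` per page, so the text of a multi-page PDF is extracted and displayed page by page.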
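The service's Read operation, which is the one that accepts PDF input, runs asynchronously: the SDK's `readInStream` call returns an operation location, and `getReadResult` reports a status that moves from `notStarted` or `running` to `succeeded` or `failed`. Below is a minimal sketch of a polling step with an attempt limit, so a stuck operation cannot loop forever. The names `pollUntilDone` and `makeFakeOperation` are hypothetical, and the fake operation stands in for `client.getReadResult(operationId)` so the sketch runs without an Azure resource.

```javascript
// Wait for the given number of milliseconds.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Poll `getStatus` (any async function returning an object with a `status`
// field) until it reports a final state or the attempts run out.
async function pollUntilDone(getStatus, { intervalMs = 1000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await getStatus();
    if (result.status === "succeeded") return result;
    if (result.status === "failed") throw new Error("Read operation failed");
    await sleep(intervalMs);
  }
  throw new Error(`No result after ${maxAttempts} attempts`);
}

// Stand-in for the real status call: reports "running" twice, then "succeeded".
function makeFakeOperation() {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls < 3
      ? { status: "running" }
      : { status: "succeeded", analyzeResult: { readResults: [] } };
  };
}

pollUntilDone(makeFakeOperation(), { intervalMs: 10 })
  .then(result => console.log(result.status));
```

With a real client, the first argument would be a function such as `() => client.getReadResult(operationId)`. The interval and attempt limit are arbitrary defaults and should be tuned to the size of the documents you submit.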

Using the above examples, you can extract text from images or PDFs with the Computer Vision service, which is useful for automating data extraction and digitizing paper-based processes. Refer to the Microsoft documentation for more details and for the additional features the Computer Vision service offers for text extraction in your Azure AI solution.

Answer the Questions in Comment Section

Which service can be used to extract text from images or PDFs in Azure?

a) Azure Cognitive Services
b) Azure Machine Learning
c) Azure Computer Vision
d) Azure Text Analytics

Answer: c) Azure Computer Vision

True/False: The Computer Vision service in Azure supports extracting text from handwritten documents.

Answer: True

What is the maximum file size supported for text extraction using the Computer Vision service in Azure?

a) 2 MB
b) 4 MB
c) 6 MB
d) 8 MB

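Answer: b) 4 MB (this limit applies to the free (F0) tier and to the older image-based OCR operation; on the paid (S1) tier the Read operation accepts files up to 500 MB)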

Which programming languages are supported for integrating the Computer Vision service in Azure?

a) Python
b) C#
c) Java
d) All of the above

Answer: d) All of the above

True/False: The Computer Vision API in Azure can extract text from images in multiple languages, including English, Spanish, French, German, and Chinese.

Answer: True

What format does the extracted text from images or PDFs come in when using the Computer Vision service?

a) CSV (Comma-Separated Values)
b) JSON (JavaScript Object Notation)
c) TXT (Plain Text)
d) XLSX (Microsoft Excel)

Answer: b) JSON (JavaScript Object Notation)

Which OCR (Optical Character Recognition) engine does the Computer Vision service in Azure use for text extraction?

a) Tesseract
b) OCRopus
c) Microsoft OCR
d) Google Cloud Vision OCR

Answer: c) Microsoft OCR

True/False: The Computer Vision service can automatically detect and extract text from tables within images or PDFs.

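Answer: True (the text inside a table is returned as ordinary lines; the row and column structure is not preserved, which is the job of Azure AI Document Intelligence)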

Which cognitive skill of the Computer Vision service enables automatic extraction of printed and handwritten text?

a) Optical Character Recognition (OCR)
b) Image Classification
c) Object Detection
d) Facial Recognition

Answer: a) Optical Character Recognition (OCR)

What type of subscription is required to use the Computer Vision service in Azure?

a) Free Trial
b) Pay-as-you-go
c) Enterprise Agreement
d) None, it is included in Azure services

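Answer: b) Pay-as-you-go (the standard S1 pricing tier is billed per transaction; a free F0 tier with a limited monthly quota also exists)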

26 Comments

Pierre Moulin

1 year ago

This blog post on using the Computer Vision service to extract text from images is fantastic. Really helpful for the AI-102 exam!

Jennie Diaz

1 year ago

I found the section on handling different image formats particularly useful. Thanks!

Viktor Groven

1 year ago

How accurate is the OCR feature in Azure Computer Vision? Does anyone have real-world examples?

Teerth Acharya

1 year ago

Can this service also be used for PDF files with multiple pages?

Anja Đokić

1 year ago

Appreciate the detailed breakdown of the OCR setup process!

Selma Christensen

1 year ago

Great blog! The examples made it super easy to follow.

Virginia Quiñones

1 year ago

This is exactly what I needed for my project. Thanks!

Marie Ross

1 year ago

Well-written post. It really helped clarify some points for my AI-102 exam prep.
