PYTHON FOR CHARACTER RECOGNITION – TESSERACT


Tesseract is an open-source optical character recognition (OCR) engine, and pytesseract is the Python wrapper around it that this post uses. It extracts the characters embedded in an image. Combined with a library such as OpenCV, it can both localize the text in an image (text detection) and read what that text says (text recognition).

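INSTALLATION PYTHON (3.X):
Open a terminal or command prompt and type:
pip install pytesseract
pip install opencv-python
pytesseract is only a wrapper, so the Tesseract OCR engine itself must also be installed and reachable on the system PATH.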
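
If pytesseract later complains that Tesseract cannot be found, the engine is probably missing from PATH. The following is a minimal sketch for checking that, using only the standard library. The helper names find_tesseract and configure_pytesseract are made up for this example; pytesseract is imported lazily so the check runs even before the package is installed.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


def configure_pytesseract(binary_path):
    """Point pytesseract at an explicit executable and return the engine version."""
    import pytesseract  # imported here so the check above works without it

    pytesseract.pytesseract.tesseract_cmd = binary_path
    return pytesseract.get_tesseract_version()


path = find_tesseract()
if path is None:
    print("Tesseract was not found on PATH; install the engine before using pytesseract.")
else:
    print("Tesseract found at", path)
```

When the engine is installed in a location that is not on PATH, calling configure_pytesseract() with the full path to the executable should be enough.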

OPENING A SIMPLE IMAGE:

  1. Import cv2.

  2. Import pytesseract.

  3. Save the test image in the same directory.

  4. Read the image with cv2.imread(), passing the file name as the argument, and store the result in a variable.

  5. To resize the image, call cv2.resize() and pass the target size as (width, height).

  6. Display the image with cv2.imshow('window_name', image).

  7. Add cv2.waitKey(0) to keep the window open until a key is pressed.

     import pytesseract
     import cv2
     img = cv2.imread('test.jpg')
     img = cv2.resize(img, (720, 480))
     cv2.imshow('Result', img)
     cv2.waitKey(0)
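
Resizing to a fixed (720, 480) stretches any image that does not already have that aspect ratio, which can distort the characters. A minimal sketch of an aspect-preserving alternative is shown below; fit_within and resize_keep_aspect are hypothetical helper names, and cv2 is imported only inside the function that needs it.

```python
def fit_within(width, height, max_width, max_height):
    """Return (new_width, new_height) that fits the box while keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_keep_aspect(img, max_width=720, max_height=480):
    """Resize an OpenCV image so it fits inside max_width x max_height."""
    import cv2  # imported here so fit_within() can be used without OpenCV

    height, width = img.shape[:2]  # img.shape is (height, width, channels)
    # cv2.resize() expects the target size as (width, height)
    return cv2.resize(img, fit_within(width, height, max_width, max_height))


print(fit_within(1920, 1080, 720, 480))  # (720, 405)
```

With this helper, the resize line of the example becomes img = resize_keep_aspect(img).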
    


CONVERTING IMAGE TO STRING

  1. Import cv2 and pytesseract.
  2. Save the test image in the same directory.
  3. Read the image with cv2.imread(), passing the file name as the argument, and store the result in a variable.
  4. Display the image with cv2.imshow('window_name', image).
  5. To convert it to a string, pass the image variable itself (not its name in quotes) to pytesseract.image_to_string() and store the result in a variable.
  6. Print the string.
  7. Add cv2.waitKey(0) to keep the window open until a key is pressed.

    import pytesseract
    import cv2
    img = cv2.imread('test.jpg')
    img = cv2.resize(img, (600, 360))
    print(pytesseract.image_to_string(img))
    cv2.imshow('Result', img)
    cv2.waitKey(0)
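
The string returned by image_to_string() is raw OCR output, so it may contain blank lines, runs of spaces and a trailing form-feed character. The sketch below shows one way to tidy it before printing or passing it on; clean_ocr_text is a hypothetical helper, and the sample string is made up rather than real OCR output.

```python
def clean_ocr_text(raw):
    """Remove form feeds, blank lines and repeated whitespace from OCR output."""
    lines = []
    for line in raw.replace("\x0c", "").splitlines():
        line = " ".join(line.split())  # collapse runs of whitespace
        if line:  # skip lines that are now empty
            lines.append(line)
    return "\n".join(lines)


# Made-up sample standing in for pytesseract.image_to_string(img)
sample = "  Hello   World \n\n\nTesseract  OCR\n\x0c"
print(clean_ocr_text(sample))
```

In the example above, this would be used as print(clean_ocr_text(pytesseract.image_to_string(img))).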


CONVERTING IMAGE-TEXT TO AUDIO
To convert an image to audio, we first convert the image to text and then convert the text to speech.

  1. Import pytesseract and cv2.

  2. Import os.

  3. Open a command prompt and type pip install gtts.

  4. Add from gtts import gTTS.

  5. Follow the steps above to convert the image to a string.

  6. Store the extracted string in a variable.

  7. Create the audio with gTTS(), passing the text and the language as parameters.

  8. Save the audio with the save() method.

  9. Play the audio with os.system('file_name').
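
Step 9 depends on the operating system: a bare file name passed to os.system() is opened by the default player only on Windows. A more portable launcher could look like the sketch below. It assumes the usual desktop openers (start on Windows, open on macOS, xdg-open on Linux) are available, and open_command and play are hypothetical names.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the command that opens `path` with the default application."""
    if platform.startswith("win"):
        # 'start' is a cmd.exe built-in; the empty string is the window title
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    return ["xdg-open", path]  # assumption: a Linux desktop with xdg-utils


def play(path):
    """Open the audio file in the system's default player."""
    subprocess.run(open_command(path), check=False)


print(open_command("saved_audio.mp3", "darwin"))  # ['open', 'saved_audio.mp3']
```

The full script for the steps above follows.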

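     import os

     import cv2
     import pytesseract
     from gtts import gTTS

     img = cv2.imread('test.jpg')
     img = cv2.resize(img, (600, 360))
     hImg, wImg, _ = img.shape

     # image_to_boxes() returns one line per character: "char x1 y1 x2 y2 page",
     # measured from the bottom-left corner of the image.
     boxes = pytesseract.image_to_boxes(img)
     xy = pytesseract.image_to_string(img)
     for b in boxes.splitlines():
         b = b.split(' ')
         x, y, w, h = int(b[1]), int(b[2]), int(b[3]), int(b[4])
         # OpenCV measures y from the top, so flip the y-coordinates.
         cv2.rectangle(img, (x, hImg - y), (w, hImg - h), (50, 50, 255), 1)
         cv2.putText(img, b[0], (x, hImg - y + 13), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (50, 205, 50), 1)

     cv2.imshow('Detected text', img)

     # gTTS produces MP3 data, so save it with an .mp3 extension.
     audio = gTTS(text=xy, lang='en', slow=False)
     audio.save("saved_audio.mp3")
     # On Windows this opens the file in the default player.
     os.system("saved_audio.mp3")
     # Keep the image window open until a key is pressed.
     cv2.waitKey(0)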
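
The box-drawing part of the script relies on the output format of image_to_boxes(): each line is "char left bottom right top page", with the origin at the bottom-left of the image, while OpenCV puts the origin at the top-left. The sketch below isolates that conversion so it can be checked without an image; parse_boxes is a hypothetical helper and the sample string is made up.

```python
def parse_boxes(boxes_text, img_height):
    """Convert image_to_boxes() output to (char, top_left, bottom_right) in OpenCV coordinates."""
    result = []
    for line in boxes_text.splitlines():
        parts = line.split(" ")
        if len(parts) < 5:
            continue  # skip malformed or empty lines
        char = parts[0]
        left, bottom, right, top = (int(v) for v in parts[1:5])
        # Flip y: Tesseract counts from the bottom edge, OpenCV from the top edge
        result.append((char, (left, img_height - top), (right, img_height - bottom)))
    return result


# Made-up sample in the same format that image_to_boxes() returns
sample = "H 10 20 30 40 0\ni 35 20 40 42 0"
for item in parse_boxes(sample, img_height=100):
    print(item)
```

Each tuple can be passed straight to cv2.rectangle(img, top_left, bottom_right, color, thickness).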
    
