Node.js: Extract text from image using Tesseract.
In this article, we will see how to extract text from images using Tesseract . So let's start with this use-case, Suppose you have 300 screenshot images in your mobile which has an email attribute that you need for some reason like growing your network or for email marketing. To get an email from all these images manually into CSV or excel will take a lot of time. So now we will check how to automate this thing. First, you need to install Tesseract OCR( An optical character recognition engine ) pre-built binary package for a particular OS. I have tested it for Windows 10. For Windows 10, you can install it from here. For other OS you make check this link. So once you install Tesseract from windows setup, you also need to set path variable probably, 'C:\Program Files\Tesseract-OCR' to access it from any location. Then you need to install textract library from npm. To read the path of these 300 images we can select all images and can rename it to som