If you exported your email using Google Takeout or other services, you might have ended up with an MBOX file. To access those emails, you can import that file into Thunderbird. Here’s how:
First, rename the file to something that Thunderbird will understand. Usually the file will have a long name and an .mbox extension, for example “All mail Including Spam and Trash.mbox”. Rename it to a single word without any extension, such as MyMail. This name will become the folder name in Thunderbird, so pick whatever you want to see there.
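If you prefer the terminal, the rename can be sketched like this. The helper name rename_mbox and the Downloads path in the usage line are made up for illustration; only the example file name comes from the text above.

```shell
# Rename an exported MBOX file to an extension-less name in the same folder.
# Usage: rename_mbox "<exported file>" <FolderName>
# (rename_mbox is a hypothetical helper, not part of Thunderbird or Takeout.)
rename_mbox() {
    src=$1
    name=$2
    dest="$(dirname "$src")/$name"
    # Refuse to overwrite an existing file that already has the new name.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    mv -- "$src" "$dest"
}

# Example (the Downloads location is an assumption; the quotes are needed
# because the file name contains spaces):
# rename_mbox "$HOME/Downloads/All mail Including Spam and Trash.mbox" MyMail
```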
Open Thunderbird and go to your Profile Directory. You can access it by going to Help->More Troubleshooting Information. A new tab will open and you will be able to click on Open Directory under Profile Directory. With that window open, close Thunderbird and copy the MBOX file you just renamed into the Mail/Local Folders subfolder of that directory, which is where Thunderbird keeps the mailboxes shown under Local Folders.
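The copy step can also be done from a terminal. This is only a sketch: it assumes the default Thunderbird layout, where the Local Folders mailboxes live in the Mail/Local Folders subfolder of the profile, and both install_mbox and the profile path in the usage line are placeholders.

```shell
# Copy a renamed MBOX file into the Local Folders storage of a profile.
# Usage: install_mbox <mbox file> <profile directory>
# (install_mbox is a hypothetical helper; run it while Thunderbird is closed.)
install_mbox() {
    mbox=$1
    profile=$2
    # Assumed default layout: <profile>/Mail/Local Folders
    target="$profile/Mail/Local Folders"
    if [ ! -d "$target" ]; then
        echo "not found: $target" >&2
        return 1
    fi
    cp -- "$mbox" "$target/"
}

# Example: use the path that "Open Directory" showed you. The one below is
# only a placeholder for a typical Linux profile location.
# install_mbox ~/Desktop/MyMail "$HOME/.thunderbird/xxxxxxxx.default-release"
```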
Now re-open Thunderbird. You should see all your emails under Local Folders->MyMail (or whatever name you chose). That’s it: you can now browse and search them like any other folder.
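If you want a rough sanity check that everything was imported, you can compare the number of messages Thunderbird shows with the number of messages in the file. In the MBOX format each message starts with a line beginning with "From " (followed by a space), so counting those lines gives an estimate. The helper name count_messages is made up, and the count can be slightly off if a message body contains an unescaped line that starts with "From ".

```shell
# Estimate the number of messages in an MBOX file by counting the
# "From " separator lines that start each message.
# (count_messages is a hypothetical helper name.)
count_messages() {
    grep -c '^From ' "$1"
}

# Example (hypothetical path):
# count_messages ~/Desktop/MyMail
```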
If you want to back up these messages to another email account online, add that account in Thunderbird and configure it as IMAP. Once it’s set up, drag and drop the MyMail folder from Local Folders onto the Inbox of the newly added account. A new folder with the same name will be created and the emails will start uploading. Let it finish, and your emails will be backed up on that server as well.
If you install tesseract from the Ubuntu 22.04 LTS repositories, like this:
sudo apt-get install tesseract-ocr
You’ll end up with tesseract v4.1.1. Since tesseract v5.3.0 is out already, we’re going to install that version instead. So, if you already installed it from the repositories, make sure to first uninstall it:
sudo apt-get remove tesseract-ocr
Now we’re going to install it. First, let’s make sure you have libraries for reading different types of image files:
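A minimal sketch of this step. The package list is an assumption (the usual Ubuntu development packages for common image formats, plus the compiler and cmake needed for the builds that follow), so adjust it to the formats you actually need. The snippet only prints the install command so you can review it before running it.

```shell
# Assumed package names: development headers for common image formats.
IMAGE_LIBS="libpng-dev libjpeg-dev libtiff-dev libgif-dev libwebp-dev libopenjp2-7-dev zlib1g-dev"
# Assumed build tools for the cmake/make steps.
BUILD_TOOLS="build-essential cmake"

# Print the command for review; copy and run it once the list looks right.
echo "sudo apt-get install $BUILD_TOOLS $IMAGE_LIBS"
```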
Now, let’s get the latest version of leptonica (v1.83.1), an image processing library used by tesseract:
cd ~/Desktop
wget https://github.com/DanBloomberg/leptonica/releases/download/1.83.1/leptonica-1.83.1.tar.gz
tar -xzvf leptonica-1.83.1.tar.gz
cd leptonica-1.83.1
mkdir build
cd build
cmake ..
make -j`nproc`
sudo make install
Now we’re going to grab the tesseract source code and compile it:
cd ~/Desktop
wget https://github.com/tesseract-ocr/tesseract/archive/refs/tags/5.3.0.tar.gz
tar -xzvf 5.3.0.tar.gz
cd tesseract-5.3.0/
mkdir build
cd build
cmake ..
make -j`nproc`
sudo make install
Now we need to tell the system where the tessdata folder is. Open your ~/.bashrc file like this:
nano ~/.bashrc
And simply write the following at the end of the file:
export TESSDATA_PREFIX=/usr/local/share/tessdata
Now save the file (Ctrl-O) and exit (Ctrl-X). Then run this to activate the setting in the current terminal:
source ~/.bashrc
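If you’d rather not edit the file by hand, the same line can be appended from the terminal. Here’s a small sketch that only adds the line when it isn’t already there, so running it twice doesn’t duplicate the setting. The function name add_tessdata_prefix is made up for this example.

```shell
# Append the TESSDATA_PREFIX export to a shell startup file, unless the
# exact line is already present.
# (add_tessdata_prefix is a hypothetical helper name.)
add_tessdata_prefix() {
    rc_file=$1
    line='export TESSDATA_PREFIX=/usr/local/share/tessdata'
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Example:
# add_tessdata_prefix ~/.bashrc && source ~/.bashrc
```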
We now need to grab some language models and other data files and put them in that folder. Note that we’re going to get the English models based on the relatively new (since v4) LSTM neural network engine, in their most accurate version. You can read more about these files here. Let’s get them:
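A minimal sketch of this step, assuming the "most accurate" models are the ones published in the tesseract-ocr/tessdata_best repository on GitHub and that its files can be fetched from the main branch with the URL pattern below. The helper name tessdata_best_url is made up, and the actual download line is left commented out so you can check the URL first.

```shell
# Folder the models go into (same path as the TESSDATA_PREFIX setting).
TESSDATA_DIR="${TESSDATA_PREFIX:-/usr/local/share/tessdata}"

# Build the download URL for one language model.
# Assumption: files are served from the main branch of tessdata_best.
# (tessdata_best_url is a hypothetical helper name.)
tessdata_best_url() {
    printf 'https://github.com/tesseract-ocr/tessdata_best/raw/main/%s.traineddata\n' "$1"
}

# Show the URL for the English model.
tessdata_best_url eng

# Download it into the tessdata folder (needs sudo for /usr/local):
# sudo wget -P "$TESSDATA_DIR" "$(tessdata_best_url eng)"
```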
And now we should be able to use tesseract from anywhere. Open a new console and test that it’s all working properly:
tesseract --version
It should say: tesseract 5.3.0, leptonica-1.83.1.
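If you want to check this in a script, here’s a rough sketch that looks for both version strings in the output. The function name check_versions is made up, and the patterns assume the output contains the strings quoted above.

```shell
# Read `tesseract --version` output on stdin and succeed only if it reports
# tesseract 5.3.0 and leptonica 1.83.1.
# (check_versions is a hypothetical helper name.)
check_versions() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q '^tesseract 5\.3\.0' || return 1
    printf '%s\n' "$out" | grep -q 'leptonica-1\.83\.1' || return 1
}

# Example (2>&1 also catches the text if it is printed to stderr):
# tesseract --version 2>&1 | check_versions && echo "versions OK"
```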
Now, let’s actually use it. In general you’ll need to preprocess your images beforehand, for example by aligning them with Python or C++. Once the text is aligned correctly, you should have an image in which the lines of text run straight and horizontal, such as a cleanly scanned book page.
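Now you can simply call tesseract like this. The trailing - tells tesseract to print the recognized text to the console instead of writing it to a file, so the command is followed by its output: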
tesseract ~/Desktop/image.png -
458 ADDITIONAL EXAMPLES:
of the beam was brought over the prop, it required the weight of
2 man, which was 200 /0. at the less end to keep it in equilibrios
Hence the weight is required ?
Ans. 3000 1b.
100. The weight of a ladder 20 feet long is 70 ¢b. and its cen=
tre of gravity 11 feet from the less end; now what weight will a
man sustain in raising this ladder when he pushes directly against
it at the distance of 7 fect from the greater end, and his hands are
5 feet above the ground?
Ans. 63 1b. nearly.
101. If the quantity of matter in the moon, be to that of the
earth, as 1 to 39, and the distance of their centres 240000 miles ;
where is their common centre of gravity ?
Ans. 6000 miles from the earth’s centre.
102. Supposing the data as in the last question, to find the
distance from the moon in the line joining the centres, where a
body would be equally attracted by the carth and moon; the
force of attraction in bodies being directly as the quantities of
matter, and inversely as the squares of the distances from the
centres.
240000 .
Ans. ———— = 331264 miles, nearly.
9 y
103. If two fires, one giving 2 times the heat of the other, are
6 yards asunder; where must I stand directly between them to
be heated on both sides alike; the heat being inversely as the
square of the distance?
Ans. 2 yards from the less fire, or 4 from the greater.
104. To what height above the carth’s surface should a body
be carricd to lose 5 of its weight; the ecarth’s radius being
3970 miles, and the force of gravity inversely as the square of
the distance from its centre?
Ans. 214} miles.
If you want to save the output text to a file, simply specify a filename and it will create a .txt file. In this example it will create a file in your working directory, named image_ocr.txt:
tesseract ~/Desktop/image.png image_ocr
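If you have a whole folder of images, the same call can be wrapped in a loop. This is just a sketch: ocr_dir is a made-up function name, it only looks at .png files, and it writes each result next to its image with the _ocr suffix used above.

```shell
# Run tesseract on every .png file in a folder, writing <name>_ocr.txt
# next to each image.
# (ocr_dir is a hypothetical helper name.)
ocr_dir() {
    dir=$1
    for img in "$dir"/*.png; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -e "$img" ] || continue
        base=${img%.png}
        tesseract "$img" "${base}_ocr"
    done
}

# Example:
# ocr_dir ~/Desktop
```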
As you can see, it works fairly well for most of the text. As long as you give a reasonably clear input image, tesseract will be able to generate the correct text from it. You can read more about how to improve the quality of the output here.
I’m a Computer Engineer with a Master of Science degree in computer vision. I’ve worked in different areas such as cognitive computing, embedded systems, and augmented reality. Some of the articles I’ve written on this blog contain links to products, such as my image processing book. As an Amazon Associate I earn from qualifying purchases.