

How to import an MBOX file into Thunderbird

If you exported your email using Google Takeout or other services, you might have ended up with an MBOX file. To access those emails, you can import that file into Thunderbird. Here’s how:

First you need to rename the file to something that Thunderbird will understand. Usually the file will have a long name and an mbox extension, for example: “All mail Including Spam and Trash.mbox”. Rename the file to a single word without any extension, for example MyMail. This name will become the folder name in your email, so pick whatever you want to see there.

Open Thunderbird and go to your Profile Directory. You can access it by going to Help->More Troubleshooting Information. A new tab will open where you can click Open Directory under Profile Directory. With that window open, close Thunderbird and copy the MBOX file you just renamed into this folder.

Now re-open Thunderbird. You should see all your emails under Local Folders->MyMail (or whatever name you chose), and you’ll be able to browse and search them as usual.

If you want to back up these messages to another email account online, simply add an email account in Thunderbird and configure it as IMAP. Once it’s set up, drag and drop the MyMail folder from Local Folders to the newly added account’s Inbox. A new folder will be created with the same name and all the emails will start to upload. Let it finish, and your emails will be backed up on that server as well.

Posted in Ubuntu.



How to install the latest version of the open source OCR tesseract in Ubuntu 22.04 LTS

If you install tesseract from the Ubuntu 22.04 LTS repositories, like this:

sudo apt-get install tesseract-ocr

You’ll end up with tesseract v4.1.1. Since tesseract v5.3.0 is out already, we’re going to install that version instead. So, if you already installed it from the repositories, make sure to first uninstall it:

sudo apt-get remove tesseract-ocr

Now we’re going to install it. First, let’s make sure you have libraries for reading different types of image files:

sudo apt-get install libpng-dev libjpeg-dev libtiff-dev libgif-dev libwebp-dev libopenjp2-7-dev zlib1g-dev

Now, let’s get the latest version of leptonica(v1.83.1), an image processing library used by tesseract:

cd ~/Desktop
wget https://github.com/DanBloomberg/leptonica/releases/download/1.83.1/leptonica-1.83.1.tar.gz
tar -xzvf leptonica-1.83.1.tar.gz
cd leptonica-1.83.1
mkdir build
cd build
cmake ..
make -j`nproc`
sudo make install

Now we’re going to grab the source code from tesseract and compile it:

cd ~/Desktop
wget https://github.com/tesseract-ocr/tesseract/archive/refs/tags/5.3.0.tar.gz
tar -xzvf 5.3.0.tar.gz 
cd tesseract-5.3.0/
mkdir build
cd build
cmake ..
make -j `nproc`
sudo make install

Now we need to tell the system where the tessdata folder is. Open your ~/.bashrc file like this:

nano ~/.bashrc

And simply write the following at the end of the file:

export TESSDATA_PREFIX=/usr/local/share/tessdata

Save the file (Ctrl-O) and exit (Ctrl-X). Then run this to apply the setting:

source ~/.bashrc

We now need to grab some language models and other data files and put them in that folder. Note that we’re going to get the English models based on the relatively new (since v4) LSTM neural network engine, in their most accurate version. You can read more about these files here. Let’s get them:

wget https://raw.githubusercontent.com/tesseract-ocr/tessdata_best/main/eng.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/3.04.00/osd.traineddata
wget https://raw.githubusercontent.com/tesseract-ocr/tessdata/3.04.00/equ.traineddata
sudo mv *.traineddata /usr/local/share/tessdata

And now we should be able to use tesseract from anywhere. Open a new console and test that it’s all working properly:

tesseract --version

It should say: tesseract 5.3.0, leptonica-1.83.1.

Now, let’s actually use it. In general you’ll need to preprocess your images beforehand. For example, here’s how you can align the images with Python or C++. Once you have aligned the text correctly, you should have a clean, upright image of the page.

Now you can simply call tesseract like this:

tesseract ~/Desktop/image.png -
458 ADDITIONAL EXAMPLES:

of the beam was brought over the prop, it required the weight of
2 man, which was 200 /0. at the less end to keep it in equilibrios
Hence the weight is required ?

Ans. 3000 1b.

100. The weight of a ladder 20 feet long is 70 ¢b. and its cen=
tre of gravity 11 feet from the less end; now what weight will a
man sustain in raising this ladder when he pushes directly against
it at the distance of 7 fect from the greater end, and his hands are
5 feet above the ground?

Ans. 63 1b. nearly.

101. If the quantity of matter in the moon, be to that of the
earth, as 1 to 39, and the distance of their centres 240000 miles ;
where is their common centre of gravity ?

Ans. 6000 miles from the earth’s centre.

102. Supposing the data as in the last question, to find the
distance from the moon in the line joining the centres, where a
body would be equally attracted by the carth and moon; the
force of attraction in bodies being directly as the quantities of
matter, and inversely as the squares of the distances from the
centres.

240000 .
Ans. ———— = 331264 miles, nearly.
9 y

103. If two fires, one giving 2 times the heat of the other, are
6 yards asunder; where must I stand directly between them to
be heated on both sides alike; the heat being inversely as the
square of the distance?

Ans. 2 yards from the less fire, or 4 from the greater.
104. To what height above the carth’s surface should a body
be carricd to lose 5 of its weight; the ecarth’s radius being

3970 miles, and the force of gravity inversely as the square of
the distance from its centre?

Ans. 214} miles.

If you want to save the output text to a file, simply specify an output name (without extension) and tesseract will create a .txt file. In this example it will create a file in your working directory named image_ocr.txt:

tesseract ~/Desktop/image.png image_ocr

As you can see, it works fairly well for most of the text. As long as you give a reasonably clear input image, tesseract will be able to generate the correct text from it. You can read more about how to improve the quality of the output here.
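If you’d rather drive tesseract from Python and do the preprocessing in the same script, here’s a minimal sketch using the pytesseract wrapper together with OpenCV. Note that neither package is installed by the steps above (pip install pytesseract opencv-python, both assumptions on my part); it simply binarises the page with Otsu thresholding before handing it to the LSTM engine:

import cv2
import pytesseract  # thin Python wrapper around the tesseract binary installed above

# Load the scanned page and convert it to grayscale
img = cv2.imread("image.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Simple cleanup: Otsu binarisation to separate ink from paper
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# OEM 1 selects the LSTM engine, PSM 3 is fully automatic page segmentation
text = pytesseract.image_to_string(binary, lang="eng", config="--oem 1 --psm 3")
print(text)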


Posted in Computer Vision, Open Source, Ubuntu.


The best device for tracking your heart rate using open source

TL;DR: If you want to be able to track your heart rate using open source, the best device you can get today is the Polar H10. Note that I use affiliate links in this post, so if you end up buying something, I might get a commission.

There are countless devices out there that promise to track your heart rate, such as the Fitbit Sense 2, Garmin HRM-Pro Plus, Kummel Fitness Tracker, and many more. Many of them only work with their proprietary apps. Some of these devices also require a subscription to keep using certain features. Most will not give you access to the data from the device itself. At best, you might be able to get the data from their website, after you upload it to them.

In the search for a device that would be able to accurately measure your heart rate using open source, I had a few requirements in mind. It should allow you to use it with 3rd party apps, not only their official one. You shouldn’t need to subscribe to any service in order to use it. And most importantly, you should have access to the raw data directly from the device. I quickly found out that there aren’t many alternatives that pass all these requirements, and that the absolute best of them all, by far, is the Polar H10.

Polar H10

Polar, the brand that produces the H10, has been around for a long time. They were founded in Finland in 1977 and in fact, they made the first ever wireless heart rate monitor. They continue doing research to this day at their Polar Research Center and they even invite people to collaborate with them. So, the company behind it is great. But what about the device itself?

This device offers one of the most accurate, if not the most accurate, heart rate measurements on the market. A recent academic study from the Czech Republic found that “ECG data captured by the Polar H10 heart rate sensor is usable in real practice for the evaluation of baseline rhythm, atrial fibrillation and premature contractions”.

In terms of connectivity, the Polar H10 comes with three options: Bluetooth Low Energy (BLE), which is available in pretty much all modern phones and laptops (note that it can connect to two simultaneous Bluetooth devices), ANT+, which is available in some devices, and 5 kHz (Gymlink), which is used to connect your heart rate monitor to machines in the gym.

Getting the heart rate data while exercising using open source

Because the Polar H10 implements the Bluetooth Heart Rate Profile, it can work with many apps that use this standard. In particular, you can connect it directly to one of the best, if not the best, open source tracking apps: RunnerUp (Play | F-Droid | GitHub). This app, combined with the Polar H10, will help you stay at your optimum level while running, because it speaks to you whenever you cross the boundaries of your defined heart rate zone. This is extremely handy because you don’t even have to look at any screens: you will know exactly when to go harder and when to slow down to train at your optimum level. Combined with the interval training options of the app, this is by far the best deal you can get anywhere, and it’s all available in the app for free, forever. You will also have access to the sensor data directly from the app, without having to upload it anywhere if you don’t want to. You can then export the data (GPS tracks from the phone and heart rate data from the H10) to your computer for further analysis with more dedicated open source tools like GoldenCheetah.

Full access to the raw data of the device using Polar SDK

The previous application should be enough for most people who simply want full access to their heart rate data while exercising. But if you want to dig deeper and get a better understanding of the underlying raw data from the device, you’ll be happy to know that Polar publishes their SDK with examples on GitHub. With the Polar SDK you’ll be able to read and process live data from the H10 heart rate sensor. This means that you can get electrocardiography (ECG) data at 130 Hz, acceleration data at up to 200 Hz, heart rate as beats per minute, and more, all directly from the device. The Polar SDK supports both Android and iOS devices.

Advanced access through BLE Generic Attribute Profile (GATT)

If you want full access to the H10 directly from a computer, or any other device not covered in the previous sections, you can have a look at their published technical details. These include all you need to know to interface with this BLE sensor through GATT. This means that you could connect to it from any device that can talk to a BLE sensor and get the raw data. You can have a look at projects like this one if you’re planning to use a computer to interface with the H10.
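As a rough sketch of what that looks like from a computer, here’s how you could read live heart rate from the H10 in Python using the bleak BLE library (an assumption on my part, not something Polar provides: pip install bleak). It subscribes to the standard Heart Rate Measurement characteristic from the Bluetooth Heart Rate Profile; the device address is a placeholder you’d replace with your own H10’s address:

import asyncio
from bleak import BleakClient

# Placeholder address: replace with your own H10's (find it with any BLE scanner)
ADDRESS = "00:11:22:33:44:55"
# Standard Heart Rate Measurement characteristic (0x2A37) from the Heart Rate Profile
HR_MEASUREMENT_UUID = "00002a37-0000-1000-8000-00805f9b34fb"

def handle_hr(_, data: bytearray):
    # Byte 0 is a flags field; bit 0 says whether the heart rate value is 8 or 16 bits
    if data[0] & 0x01:
        bpm = int.from_bytes(data[1:3], "little")
    else:
        bpm = data[1]
    print(f"Heart rate: {bpm} bpm")

async def main():
    async with BleakClient(ADDRESS) as client:
        await client.start_notify(HR_MEASUREMENT_UUID, handle_hr)
        await asyncio.sleep(30)  # stream notifications for 30 seconds
        await client.stop_notify(HR_MEASUREMENT_UUID)

asyncio.run(main())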

Conclusion

As you can see, the Polar H10 is one of the most accurate heart rate sensors on the market at the moment, and you can get the data from it using open source in many different ways. You can simply use any tracking app on your phone that supports standard Bluetooth heart rate monitors, you can get the full raw data with the Polar SDK, or you can even connect to the H10 with a computer or any other kind of device that has BLE. It’s one of the most open devices you can get today.


Posted in IoT, Open Source.



22 great apps for GrapheneOS, or how to set up a useful Android smartphone with open source apps only

GrapheneOS is the open source version of Android, the Android Open Source Project, enhanced with many privacy and security features on top. It’s basically the most secure and private mobile OS you can run on an Android device these days.

To install it you’ll need a Pixel phone. This is because those devices offer the best security in their hardware. Here’s the current list of supported phones. At the time of writing, there’s official support for devices from the Pixel 4a to the Pixel 7 Pro. If you’re planning to buy a new phone, it’s recommended to get at least a Pixel 6 (that means a Pixel 6, Pixel 6 Pro, Pixel 6a, Pixel 7, or Pixel 7 Pro), given the better and more secure hardware (the Titan M2 was introduced here) and 5 years of security updates. If you’re on a budget, you should consider the Pixel 6a as it offers incredible value for money, and still gets the same 5 years of security updates and similar hardware. Note that if you do buy them from those links, I might get a small commission.

If you already have a supported phone in your hands, the next step is to install GrapheneOS on it. This is much simpler than you might imagine. There are two official ways to install it: through a web installer, or through a command line installer. It doesn’t matter which one you use; you’ll end up with the same result if you follow the official instructions correctly. If you ever want to ask something about GrapheneOS, head over to the GrapheneOS community, where you can ask anything you want and will receive excellent support.

After the install, you’ll have a very basic, yet functional smartphone. If you only need to use the phone for calls, SMS, and basic stuff like that, then you’re done. If you want to add more functionality through apps, you have a couple of options:

F-Droid

F-Droid is an app store dedicated to hosting open source applications. You can get many apps from this store, but don’t expect to see the big brands, as most of them have closed-source applications. The good news is that there are plenty of open source alternatives you can use instead, which gives you more freedom. To install F-Droid, simply open your web browser, which is called Vanadium and is available at the bottom of your screen, and go to https://f-droid.org. There, click on Download F-Droid. You’ll be downloading an APK file (an Android app package). Once it’s downloaded, open it and install it. You should now be able to install apps from this store.

GitHub et al

Many apps have their source code, and sometimes also their releases, hosted on GitHub or similar sites like GitLab. You can simply go to the project’s GitHub page, open the releases, and download the APK file from there. Note that you can install some apps from F-Droid and others from GitHub without any issues. The benefit of installing apps from GitHub directly is that sometimes the apps are not available in F-Droid, or they have a different set of features. The downside is that you’ll have to check manually whether there’s a new update. Some people use Obtainium to address this, but I haven’t used it.

Curated list of open source apps

Aegis(F-Droid | GitHub). “A free, secure and open source app for Android to manage your 2-step verification tokens”. You can also easily import and export your tokens from and to other apps.

AntennaPod(F-Droid | GitHub). “Easy-to-use, flexible and open-source podcast manager and player”.

BookReader(F-Droid | GitLab). “Simple book reader”. This one is great for reading PDFs as it has text re-flow.

Catima(F-Droid | GitHub). “a Loyalty Card & Ticket Manager for Android”. Basically anything with a bar-code in it, like a library card, etc, can be stored here.

DAVx⁵(F-Droid | GitHub). “open-source CalDAV/CardDAV suite and sync app for Android. You can also access your online files (WebDAV) with it.”

FairEmail(F-Droid | GitHub). “Fully featured, open source, privacy friendly email app for Android”

Loop Habit Tracker(F-Droid | GitHub). “a mobile app for creating and maintaining long-term positive habits”

Infinity(F-Droid | GitHub). “A Reddit client for Android”

KDE Connect(F-Droid | GitHub). “Multi-platform app that allows your devices to communicate”. The best way to move data between your computer and your phone.

Organic Maps(F-Droid | GitHub). “Open-source, community-driven maps for travelers, tourists, cyclists & hikers”. Works offline too.

OpenKeyChain(F-Droid | GitHub). “OpenPGP implementation for Android”. Works with FairEmail.

OpenScale(F-Droid | GitHub). “Open-source weight and body metrics tracker, with support for Bluetooth scales”

RHVoice(F-Droid | GitHub). “free and open source speech synthesizer”

RunnerUp(F-Droid | GitHub). “Track your sport activities with RunnerUp using the GPS in your Android phone:”. Works with RHVoice.

Simple Calendar Pro(F-Droid | GitHub). “highly customizable, offline monthly calendar app for Android”.

Simple Contacts Pro(F-Droid | GitHub). “Easy and quick contact management with no ads, handles groups and favorites too.”

Simple Gallery Pro(F-Droid | GitHub). “Your favorite photo album. Professional file organizer to edit photos with ease”

Simple Notes Pro(F-Droid | GitHub). “To do list widget with a notebook organizer, checklist, simple shopping list”. Probably the best notes/checklist app for Android.

Vinyl(F-Droid | GitHub). “Music player easy to use and customizable”

VLC(F-Droid | GitHub). “Video and audio player that supports a wide range of formats, for both local and remote playback.”

Voice(F-Droid | GitHub). “Minimalistic audiobook player”

Geometric Weather(F-Droid | GitHub). “A Material Design Weather Application”


Posted in Open Source.



Fixing out of memory error when loading initial ramdisk and Kernel Panic – not syncing: VFS: Unable to mount root fs on unknown-block(0,0) while booting Ubuntu 22.04

If while booting Ubuntu, it stops and shows you this error:

out of memory error when loading initial ramdisk

Or this one:

Kernel Panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

It means that there’s an issue with the Linux initial RAM disk (initrd) for that particular kernel you’re trying to boot into. This might happen because of a recent upgrade, or for other reasons. Let’s fix it.

While restarting, in the GRUB menu, select “Advanced options for Ubuntu”. After you select that, you will see all the kernels installed on your system. Try a different one until you’re able to boot.

Now that you’re in a working system, we’re going to change some configurations for initrd:

sudo nano /etc/initramfs-tools/initramfs.conf

Change the following settings:

MODULES=dep
COMPRESS=xz

Save the file (CTRL+O) and close the editor (CTRL+X).

Now you need to know which version of the kernel is the one having issues. You can list all the installed kernels with this command:

dpkg --list | grep linux-image

The name should be something like this: “5.15.0-58-generic”. Now update the initrd for that kernel (remember to change it to whatever version your system has, the one that doesn’t boot):

sudo update-initramfs -u -k 5.15.0-58-generic

You can also try different compression algorithms (by changing COMPRESS in initramfs.conf and re-running the command above) and check the current size of the images with this:

ls -ltrh /boot/initrd.img* 

Now simply update grub:

sudo update-grub

Restart and you should be able to boot into the system without issues.


Posted in Ubuntu.


Installing OpenCV 4.7.0 in Ubuntu 22.04 LTS

OpenCV is the most popular computer vision library in the world, widely used in research and industry for more than twenty years. The latest version available at the time of writing is 4.7.0, and it comes with many new features and bug fixes across all modules.

We’re also going to include the extra modules so that we have access to the latest research in other areas such as the RGBD module for depth cameras. If you’re planning to buy a new depth camera, make sure to check this post to help you decide which one to get.

We’re going to be using Ubuntu 22.04 LTS for this guide, but it should be similar for other versions of Ubuntu. At the end of this tutorial, you will be able to make computer vision applications with OpenCV using either C++ or Python, and the extra modules will be available to you.

Let’s begin. The first step is to make sure you have everything up to date:

sudo apt-get update
sudo apt-get upgrade

Now let’s grab some dependencies:

sudo apt-get install build-essential cmake python3-numpy python3-dev python3-tk libavcodec-dev libavformat-dev libavutil-dev libswscale-dev libdc1394-dev libeigen3-dev libgtk-3-dev libvtk7-qt-dev

Time to grab the source code of OpenCV and the extra modules, configure it and compile it:

mkdir ~/opencv
cd ~/opencv
wget https://github.com/opencv/opencv/archive/refs/tags/4.7.0.tar.gz
tar -xvzf 4.7.0.tar.gz
rm 4.7.0.tar.gz
wget https://github.com/opencv/opencv_contrib/archive/refs/tags/4.7.0.tar.gz
tar -xvzf 4.7.0.tar.gz
rm 4.7.0.tar.gz
cd opencv-4.7.0
mkdir build
cd build
cmake -D WITH_TBB=ON -D BUILD_opencv_apps=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D BUILD_EXAMPLES=ON -D WITH_QT=ON -D WITH_OPENGL=ON -D WITH_VTK=ON .. -DCMAKE_BUILD_TYPE=RELEASE -DOPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.7.0/modules
make -j`nproc`
sudo make install
echo '/usr/local/lib' | sudo tee --append /etc/ld.so.conf.d/opencv.conf
sudo ldconfig
echo 'PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig' | sudo tee --append ~/.bashrc
echo 'export PKG_CONFIG_PATH' | sudo tee --append ~/.bashrc
source ~/.bashrc

Now you should have OpenCV correctly installed, and you should be able to start building your computer vision applications in either C++ or Python. Let’s get you started with some examples.
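Before jumping into the examples, here’s a quick sanity check from Python (assuming the bindings ended up in a directory on your Python path, which is the default for this build):

import cv2

# Should print 4.7.0 if the build and install succeeded
print(cv2.__version__)

# The extra modules were compiled in, so contrib functionality such as the
# saliency module used in the C++ example below should be exposed too
print(hasattr(cv2, "saliency"))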

We’re going to create a C++ application based on one of the examples included in the extra modules of OpenCV. It’s an application that generates Fine-Grained saliency based on this paper. And we’re going to use CMake to configure it so that you can use the same project structure later for your own applications. The first step is to grab the sample files:

cd ~
mkdir saliency
cd saliency
cp ../opencv/opencv_contrib-4.7.0/modules/saliency/samples/computeSaliency.cpp .
cp ../opencv/opencv-4.7.0/samples/data/Megamind.avi .

We’re now going to create a new text file in that same directory called CMakeLists.txt so that cmake can understand how to build it.

nano CMakeLists.txt

Paste this content into the file, save it (Ctrl+O) and then exit (Ctrl+X).

cmake_minimum_required (VERSION 3.0.2)
project (saliency)

find_package(OpenCV REQUIRED)
include_directories( ${OpenCV_INCLUDE_DIRS} )
include_directories( . )

add_executable (${PROJECT_NAME} computeSaliency.cpp )
target_link_libraries (${PROJECT_NAME} ${OpenCV_LIBS})

And now we’re ready to build the application:

mkdir build
cd build
cmake ..
make

We can now run the application. It expects as arguments the type of saliency (FINE_GRAINED), a video file (../Megamind.avi), and a frame number to use (23):

./saliency FINE_GRAINED ../Megamind.avi 23
Original frame on the left, Fine-Grained Saliency calculated on the right.

You are now ready to start making your own computer vision application in C++.

Now let’s see how we can run a Python example as well:

mkdir ~/python_app
cd ~/python_app
cp ../opencv/opencv-4.7.0/samples/python/gaussian_mix.py .
python3 gaussian_mix.py
An example of the Expectation–Maximisation algorithm with a Gaussian Mixture Model in Python

Now you should be able to build your own computer vision applications, with either C++ or Python.


Posted in Computer Vision, Open Source, OpenCV.



Overview of current Luxonis Oak cameras, or which one you should get for your next computer vision application

Microsoft stopped manufacturing the Kinect in 2017. Google shut down Project Tango in 2018 in favour of ARCore. Intel started winding down their RealSense cameras in 2021 to focus on their main business. Thankfully, we have Luxonis stepping up and producing the next generation of perceptual cameras for robotics and computer vision applications.

Luxonis offers perceptual cameras for basically three types of applications: visible spectrum inference (VSI), VSI plus passive infrared stereo, and VSI plus active infrared stereo. Let’s see each one in detail. Note: I’m using referral links in this post, so if you buy something I might receive a commission.

Visible spectrum inference (VSI) cameras

These cameras have a single RGB sensor, just like a normal camera, but they can perform optimised neural network inference inside the camera with a specialised chip such as the Myriad X VPU. For example, if your application involves detecting or classifying objects, visual tracking, or anything else related to doing inference in the visible spectrum, then this type of camera should be enough for you. They have the lowest power consumption and the smallest size, and they’re also the cheapest.

An example of VSI, the object was classified as a Potted Plant with 99.51% certainty.

For this section the standard camera would be the Oak-1. If you don’t have any special requirements, this should be the one you get. There are some variants that you might want to consider in case you need something special: if you want a wider field of view, you can get the Oak-1 W. If you’re looking for higher resolution, then you should get the Oak-1 Max. If you’re on a budget, you can get the OAK-1-Lite Auto-Focus, which is basically the same but with a cheaper sensor, and if you’re putting this in a vibrating environment, such as on a drone, grab the OAK-1-Lite Fixed-Focus.

VSI plus passive infrared stereo cameras

These cameras can do everything that the VSI cameras can do, but also come with two additional fast infrared sensors with global shutter that are used to estimate depth through disparity matching inside the camera. The nice thing about this is that your host device can simply use the depth information; all the heavy computation is done inside the camera itself. And of course you can also use those additional infrared sensors however you like, not only for depth estimation, but for example for applications where doing inference in the near-infrared spectrum is more effective than VSI.

An example of depth estimation using passive infrared stereo. Whiter pixels are closer to the camera.

For this section the standard camera would be the Oak-D S2 Auto-Focus. If you don’t have any special requirements, this should be the one you get. There are some variants that you might want to consider in case you need something special. If you’re planning to use this on a small device like a Raspberry Pi, then you should either get the Oak-D (this was the original camera from the Kickstarter project), as it comes with a built-in barrel jack connector for external power (some devices like the Raspberry Pi cannot deliver the required power over USB alone), or buy a separate Y-Adapter for your camera. If you’re putting the camera in a vibrating environment, such as a car, then you might want to get the Oak-D S2 Fixed-Focus, although note that this only affects the RGB camera. If you’re on a budget, or want a lighter camera by sacrificing some infrared resolution, you can get either the Oak-D Lite Auto-Focus or the Oak-D Lite Fixed-Focus. Again, the auto or fixed focus only applies to the RGB camera.

VSI plus active infrared stereo cameras

These cameras can do everything that the VSI plus passive infrared stereo cameras can do, but also come with an infrared laser dot projector for active stereo and an infrared illumination LED for night vision. This means that you will get much better depth estimation for uniform areas such as walls, as passive stereo relies only on matching the features of the scene, and blank walls have very few distinctive features. By projecting a known pattern in infrared (invisible to the human eye and the RGB camera), the extra features in the infrared frames help the camera get a better depth estimation in those cases.

Active stereo can produce better depth maps in certain cases. Image by luxonis

Another thing you can do with this camera is use the infrared illumination LED to “see in the dark”: the RGB camera (or your eyes) would not be able to see anything, but both infrared cameras would see a fully illuminated scene. This is useful for applications that need to do inference in the near-infrared spectrum with no natural light.

When there’s no natural light, the infrared illumination LED makes IR images clear. Image by luxonis

For this section the standard camera would be the Oak-D Pro Auto-Focus. If you don’t have any special requirements, this should be the one you get. There are some variants that you might want to consider in case you need something special. If you’re planning to use the camera in a vibrating environment (such as a car for example), then you might want to grab the Oak-D Pro Fixed-Focus (note that the fixed focus only applies to the RGB camera). If you’re interested in a wider field of view, then you should get the Oak-D-Pro Wide.

Industrial usage

Finally, if you’re going to use these cameras in industrial applications, Luxonis offers them in a more ruggedised version and with a Power over Ethernet (PoE) connection. The features of the cameras are otherwise the same as already discussed. You can check the industrial version of these cameras here.

Conclusion

And that’s it, you should now be able to decide which Oak camera is best for your particular application. If you want some help starting out with the installation of the software, check out this post.


Posted in Computer Vision, IoT, Open Source, OpenCV.


How to use the Oak-D spatial camera C++ interface with OpenCV in Ubuntu 22.04 LTS

The Luxonis Oak-D (affiliate link) is a great infrared stereo camera. It also comes with an RGB camera, and the Intel Movidius Myriad X chip to perform neural inference inside the device. They’ve named it a “spatial camera” because of all these features. It’s quite a bargain for its price, and it’s also quite compact, so it’s perfect for robotics and computer vision applications in general. If you don’t have one of these cameras and would like to get one, check out this blog post.

The Oak-D camera. It has two IR cameras and one RGB camera. It has a USB-C connector, and an extra power connector in case the host cannot provide enough power.

In this post I’ll show you how to start using it in Ubuntu to make your own computer vision applications with OpenCV. First, let’s make sure you have an updated system:

sudo apt-get update
sudo apt-get upgrade

Now let’s get some dependencies that are needed to build the Oak-D library and the examples:

sudo apt-get install cmake build-essential git libopencv-dev

Now we’re ready to grab the source code, compile it, and install it:

cd ~
mkdir oakd
cd oakd
wget https://github.com/luxonis/depthai-core/releases/download/v2.20.2/depthai-core-v2.20.2.tar.gz
tar -xzvf depthai-core-v2.20.2.tar.gz
cd depthai-core-v2.20.2/
mkdir build
cd build
cmake -D BUILD_SHARED_LIBS=OFF -D DEPTHAI_BUILD_EXAMPLES=ON -D DEPTHAI_OPENCV_SUPPORT=ON -DCMAKE_INSTALL_PREFIX=/usr/local ..
make -j10
sudo make install

Now the camera library is installed, but it is only accessible using sudo, so let’s set up the udev rules to allow normal users to access it (make sure to disconnect the camera now):

echo 'SUBSYSTEM=="usb", ATTRS{idVendor}=="03e7", MODE="0666"' | sudo tee /etc/udev/rules.d/80-movidius.rules
sudo udevadm control --reload-rules && sudo udevadm trigger

Now connect the camera again; the new rules should apply, and a normal user should be able to access it. Try running some of the demos (you can press q to exit):

cd ~/oakd/depthai-core-v2.20.2/build/examples
./depth_preview
./rgb_preview
Depth frame obtained from the camera. Whiter pixels mean they are closer to the camera.

In the depth preview you’ll see grayscale pixels representing depth. The closer a pixel is to the camera, the higher (whiter) its value will be. The RGB preview simply shows the RGB camera. Make sure to explore the rest of the examples to see the capabilities of the camera.

The simplest way to build your own application is to copy one of the examples and use it as a starting point, reusing the same build pipeline. For example, let’s write an app that gets the depth data like the depth_preview example:

cd ~/oakd/depthai-core-v2.20.2/examples
mkdir myapp
cd myapp
cp ../StereoDepth/depth_preview.cpp main.cpp

We now have a main.cpp file with the code for our application. Make a tiny change to the code so that you can tell it’s a different build: open main.cpp and make this change:

//cv::imshow("disparity", frame);
cv::imshow("mydisparity", frame);

Now let’s add our application to the build pipeline, alongside the other examples. Edit this file:

nano ~/oakd/depthai-core-v2.20.2/examples/CMakeLists.txt

At the end of the file, simply write this one line and save the file:

dai_add_example(my_oak_app myapp/main.cpp ON)

my_oak_app is the name of your executable, and myapp/main.cpp is where the source code is. Let’s build it and run it:

cd ~/oakd/depthai-core-v2.20.2/build/
make
cd examples
./my_oak_app

You should now see the same depth_preview application but the window name should be “mydisparity” instead.

Same application as depth_preview but now we have full control of the source code.

That’s it. Remember to explore the other examples, and also check out the Luxonis C++ API Reference to see all that you can do with these incredible cameras.


Posted in Computer Vision, IoT, Open Source, OpenCV, Photography, Programming.


Simplest and cheapest way to run Stable Diffusion with your own hardware

TL;DR: Grab a Mac Mini M2 or a Mac Mini M2 Pro if you want to generate images faster, then download MochiDiffusion and start generating your images at home. Note: I’m using referral links in this post, so if you buy something I might receive a commission.

Stable Diffusion is a deep learning model that allows you to generate images by simply writing a text prompt, like this:

“A photograph of an astronaut riding a horse”

Before the release of Stable Diffusion, you would have needed to access cloud services to generate images like this. Some of the most popular ones are DALL-E and Midjourney.

Now, you’re able to do this with your own hardware. Usually this would require you to set up a rather expensive PC with a powerful GPU, but with the latest research from Apple, you’re now able to run Stable Diffusion with Core ML on their ARM devices (M1, M2, etc.).

The base configuration of the newly released Mac Mini M2 is the cheapest device you can get that will run Stable Diffusion. But if you want faster image generation, then the Mac Mini M2 Pro is an incredible device that runs faster and comes with more RAM and more disk space, which is handy since each of these models is quite large. Also, keep in mind that you might be able to find the older Mac Mini M1 at good prices. Any of these devices can run Stable Diffusion; the main difference is the speed at which they do it.

Now, to be able to run Stable Diffusion on the Mac Mini natively, you’ll need to download an application called MochiDiffusion. After you install it, you’ll also need to download a Stable Diffusion model, prepared specially for Core ML. You can find some ready to use models here. In particular, you want to get the split_einsum version of the model, which is the one compatible with the Neural Engine of the Mac Mini. For example, if you want the latest Stable Diffusion model, which is 2.1, you can find it here. Simply unzip the file and place it under /Users/YOUR_USERNAME/Documents/MochiDiffusion/models.

Mochi Diffusion v2.2 with Stable Diffusion v2.1 model

You can now simply write your prompt where it says “Include in Image:”, and you can list things you don’t want to see in the “Exclude from image:” box. When you’re asked about the Compute Unit option, make sure to select “Use Neural Engine”, as we’re using the split_einsum models.

Select Use Neural Engine whenever you see this as we’re using split_einsum models

Now you’re ready to generate your first image, simply click Generate and wait. When you click Generate for the first time after you load a model, it will take some extra time to compile it and optimise it for your device. After this is done, it will generate the next images much faster. Once the image appears on the screen, you can simply select it and click on the icon with the arrow pointing down at the top to save it.

The generated images are 512×512, but you have the option to check the HD box to enlarge your images at generation time to 2048×2048 using Real-ESRGAN. You can also enlarge the images later by selecting them and clicking on the magic wand icon at the top. And that’s really all you need to start generating images with your Mac Mini. Enjoy!

Posted in Computer Vision, Open Source, Photography.



The best USB thermal camera for computer vision applications

TL;DR: For real-time processing get the Seek Thermal CompactPRO (the one with USB-C) and also grab a USB-C extender cable.
If you don’t need real-time processing, you can get any FLIR One, as you’ll be able to get absolute temperature readings from the saved images. Note: I’m using referral links in this post, so if you buy something I might receive a commission.

If you’ve read anything about thermal cameras, you’ve probably stumbled upon FLIR cameras. They are one of the most popular brands, and a few years ago they created the first affordable USB thermal camera, the FLIR One. Because it is a well-known brand and the price is relatively low for a thermal camera, I decided to get one and see how useful it would be for computer vision applications.

FLIR One thermal camera with a USB-C connector. Note that it comes with both a thermal and an RGB camera, side by side (left).

These days, FLIR offers a 3rd generation of this affordable USB thermal camera. They actually have three different models: the FLIR One, the FLIR One Pro, and the FLIR One Pro LT, each one offering different features and resolutions at different prices. They also make two versions of each, one for iOS and another with an Android (USB-C) connector. You can check them out here.

All of these FLIR cameras are meant to be connected directly to a smartphone, which uses the official FLIR app to communicate with the camera and show the images on screen, with the option to save them for later processing.

The FLIR One camera connected to a phone for normal operation

Something that got my attention with these FLIR cameras is that they all come with a rechargeable battery. You need to keep it charged in order to use it, as it doesn’t use the phone battery to power itself.

USB-C female connector for charging its battery on the right, and the power button and LED power indicator in the middle

After you install the official FLIR app, charge the camera, and connect it to the phone, you need to turn the camera on and wait for a few seconds. If everything goes right, you should be able to see and save thermal images on your phone, like this one:

Thermal view of a pot with boiling water. The FLIR app tries to blend it with the RGB camera, but it’s not really well aligned out of the box.

After you save the FLIR images you can extract the temperature information per pixel as a CSV file, and also the thermal and visual images, using an open source library: read_thermal. You can get the temperature in Celsius per pixel saved as a CSV file like this (and you can also get the thermal and RGB images):

python flir_image_extractor.py -p -i FLIR_IMAGE.jpg -csv temps.csv

Thermal view of the scene

As far as I know, there is no way to use any of these FLIR One thermal cameras in real time connected to a computer. You can only use them connected to a phone, with the official FLIR app. After you save the images you can then process them on a computer with the previously mentioned library.

After using the FLIR One for a bit I realised it has some pros and cons. It’s great that you can get the absolute temperature per pixel while post-processing the images. It’s also handy to have an RGB image alongside the thermal data. But the fact that it cannot be used directly from a computer, and that you need to constantly keep its battery charged, makes the FLIR One thermal cameras a poor fit for real-time computer vision applications.

Fortunately, there is another brand that makes USB thermal cameras, Seek Thermal:

Seek Thermal CompactPRO camera

Seek offers two models, the Seek Compact and the Seek Compact Pro. For each model they also offer a version with a telephoto lens, marked with an XR suffix. The Seek Compact Pro has a higher resolution (320 x 240) compared to the Seek Compact (206 x 156). The field of view of the Compact Pro is 32°, whereas the Compact Pro XR has only a 24° field of view. For general computer vision tasks I found the wider field of view and higher resolution of the Seek Compact Pro to be the best fit.

Seek also has an official app that you can use with your phone, and it works similarly to the FLIR app:

Thermal view of a coffee pot. Note that Seek cameras don’t come with an extra RGB camera.

The main advantage of the Seek cameras is that you can also use them on your computer with this great open source project: libseek-thermal. If you’re going to connect the Seek camera to your computer, you should also grab a USB-C extender. If your computer doesn’t have a USB-C connector, then you can simply grab a USB converter cable.

Seek Thermal CompactPRO camera with a USB-C extender

You can now connect the camera to a computer, and by using libseek-thermal you should be able to see thermal images in real time (note that, as of the time of writing, the library doesn’t provide absolute temperature readings):

Real-time view of a coffee pot from the CompactPro camera connected to a computer.

So, to summarise, these two cameras are great for doing computer vision but each has some pros and cons, and the best one for you will depend on your specific requirements.

If you need to do real-time processing, then the best camera would be the Seek Thermal CompactPRO (the one with USB-C) connected to your computer through either a USB-C extender or a USB converter cable. You won’t have to worry about charging any batteries, and you’ll be able to get the thermal images in real time on your computer. The downside is that you won’t get absolute thermal information. This means you won’t know the exact temperature, only which areas are relatively hotter or colder.

If you don’t need real-time processing, then any FLIR One camera that is compatible with your phone should be fine. You will need to keep the battery charged and use your phone to save the images with the official FLIR app. But once you have those images on your computer, you can post-process them and get the absolute temperature value for every pixel. You will also get an RGB view of the scene, which might be useful for certain applications.
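As a small sketch of that post-processing step, here’s how you might load the exported CSV into Python and pull out some basic statistics. This assumes the file written by flir_image_extractor.py has a header row and one row per pixel with x, y and temperature-in-Celsius columns; adjust the column indices if your file differs:

import numpy as np

# Per-pixel temperatures exported earlier with flir_image_extractor.py
# (skip the header row; columns assumed to be x, y, temperature in Celsius)
data = np.genfromtxt("temps.csv", delimiter=",", skip_header=1)
temps = data[:, 2]

print(f"Min temperature:  {temps.min():.1f} C")
print(f"Max temperature:  {temps.max():.1f} C")
print(f"Mean temperature: {temps.mean():.1f} C")

# Image coordinates of the hottest pixel
hottest = data[temps.argmax()]
print(f"Hottest pixel at x={int(hottest[0])}, y={int(hottest[1])}")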

Posted in Computer Vision, Open Source, Photography, Programming.
