

Install Jekyll without sudo

When using the default installation of Jekyll, it will ask for sudo access when creating a new project:

Your user account is not allowed to install to the system RubyGems.
You can cancel this installation and run:

    bundle install --path vendor/bundle

to install the gems into ./vendor/bundle/, or you can enter your password
and install the bundled gems to RubyGems using sudo.

Password:

That’s not good. Here’s how you can install Jekyll without requiring sudo access.

First, make sure that you have installed Ruby. This command prints the current environment, including Gem Home:

bundle env

By default Gem Home is set to a directory that requires sudo access. You can change it by adding these two lines to ~/.bashrc (or equivalent):

export GEM_HOME=$HOME/gems
export PATH=$HOME/gems/bin:$PATH

Now open a new console, restart, or log in again, and run the same command again:

bundle env

If it’s showing the changes you just made, then you’ll be able to install Jekyll locally:

gem install bundler jekyll

And now you can proceed to use jekyll as usual:

jekyll new blog

Posted in Programming.



OpenVINO allows you to colourise greyscale images

One of the things that you can do with OpenVINO is to add colour to a greyscale image. For example, you could turn this greyscale image of an elephant into a colour one:

Given an input greyscale image (left), OpenVINO allows you to produce a colourised version of it (right)

It supports fully automatic and also user-guided image colourisation by using two different types of pre-trained models. This means that you could have an application that allows the user to fine-tune the end result by simply clicking and selecting the colours they want to be included in the colourised image.

If you would like to know more about how to do this, and other amazing applications of OpenVINO, especially about Deep Learning in Computer Vision, with source code and explanations, make sure to sign up here. I will be sending more information about this project to the people in my list first, before anyone else.

Posted in Computer Vision, OpenVINO, Photography.



Learn What OpenVINO Is and How to Use It

On Thursday 10th of December 2020 I presented an online webinar about OpenVINO. Here are some slides from the presentation.

Posted in Computer Vision, IoT, OpenVINO.



Align text images with OpenCV using Python

Most optical character recognition (OCR) software first aligns the image properly before detecting the text in it. Here I’ll show you how to do that with OpenCV:

Initial image on the left, aligned image on the right

First we need to import opencv:

import cv2

Let’s read an image in. You probably will have a color image, so first we need to convert it to gray scale (one channel only).

image = cv2.imread("text_rotated.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Initial image, converted to grayscale(one channel)

We’re interested in the text itself and nothing else, so we’re going to create an image that represents this. Non-zero pixels represent what we want, and zero pixels represent background. To do that, we will threshold the grayscale image, so that only the text has non-zero values. The explanation of the inputs is simple: gray is the grayscale image used as input. The next argument is the threshold value, set to 0, although it is ignored because we’re using the THRESH_OTSU flag, so the value is calculated by the Otsu algorithm instead (it’s 170 in this case). The next argument is 255, which is the value assigned to the pixels that pass the threshold. Finally, the flags indicate that we want a binary output (all or nothing), and that we want the output reversed (since the text is black in the original). The returned values are the threshold actually used, as calculated by the Otsu algorithm (otsu_thresh), and the thresholded image itself (thresh).

otsu_thresh, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
By thresholding the initial image, we now have an image that has non-zero values for the pixels that represent text, although it still has some noise
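
To make the Otsu step less of a black box, here is a rough pure Python sketch of the idea: try every possible threshold on the grey level histogram and keep the one that best separates the two groups of pixels (the largest between-class variance). The names otsu_threshold and binary_inv are my own for this illustration, and this is not how OpenCV implements it internally, so the exact value it picks may differ from the one cv2.threshold returns.

```python
def otsu_threshold(histogram):
    """Return a threshold t that maximises the between-class variance.

    histogram[i] is the number of pixels with grey level i (0..255).
    Pixels with value <= t form one class, pixels with value > t the other.
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    weight_low = 0  # number of pixels at or below the candidate threshold
    sum_low = 0     # sum of the grey levels of those pixels

    for level, count in enumerate(histogram):
        weight_low += count
        sum_low += level * count
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue  # one class is empty, nothing to separate
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = level
    return best_threshold


def binary_inv(pixels, threshold, max_value=255):
    """Same rule as THRESH_BINARY_INV: dark pixels become max_value."""
    return [max_value if p <= threshold else 0 for p in pixels]


# Toy histogram: dark text around level 50, bright paper around level 200
hist = [0] * 256
hist[50] = 100
hist[200] = 100
t = otsu_threshold(hist)
print(t, binary_inv([50, 200], t))
```

Any threshold between the two peaks separates this toy histogram equally well, so the sketch simply keeps the first one it finds.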

First we need to remove those extra white pixels outside of the area of the text. We can do that with the morphological operator Open, which basically erodes the image (makes the white areas smaller), and then dilates it back (makes the white areas larger again). By doing that any small dots, or noise in the image, will be removed. The larger the kernel size, the more noise will be removed. You can choose the kernel size either by hand, or with an iterative method that checks, for example, the ratio of non-zero to zero pixels for each value, and selects the one that maximizes that metric.

kernel_size = 4
ksize=(kernel_size, kernel_size)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, ksize)
thresh_filtered = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
We now have an image that only represents text, without any external noise
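
If you want to see why Open removes small dots but keeps the larger shapes, here is a minimal pure Python sketch on a tiny binary image stored as nested lists. The functions erode, dilate and morph_open are hypothetical helpers written for this illustration. They assume an odd, square kernel and simply ignore pixels outside the image, which is a simplification compared to what cv2.morphologyEx does.

```python
def _window(img, y, x, r):
    """Yield the pixels of the (2r+1) x (2r+1) window centred on (y, x)."""
    h, w = len(img), len(img[0])
    for yy in range(max(0, y - r), min(h, y + r + 1)):
        for xx in range(max(0, x - r), min(w, x + r + 1)):
            yield img[yy][xx]


def erode(img, k):
    """Keep a pixel only if every pixel in its k x k window is set."""
    r = k // 2
    return [[int(all(_window(img, y, x, r))) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img, k):
    """Set a pixel if any pixel in its k x k window is set."""
    r = k // 2
    return [[int(any(_window(img, y, x, r))) for x in range(len(img[0]))]
            for y in range(len(img))]


def morph_open(img, k):
    """Erode first (small dots vanish), then dilate back what survived."""
    return dilate(erode(img, k), k)


# 7x7 image: a 3x3 block of "text" plus one isolated noise pixel at (5, 5)
image = [[0] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 1
image[5][5] = 1

opened = morph_open(image, 3)
for row in opened:
    print("".join("#" if v else "." for v in row))
```

The isolated pixel does not survive the erosion, so the dilation has nothing to grow back at that position, while the block is restored to its original size.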

Now that we have an image that represents the text, we’re interested in knowing the angle in which it is rotated. There are a few different ways of doing this. Since OpenCV has a convenient function that gives you the minimum rectangle that contains all non-zero values, we could use that. Let’s see how it works:

nonZeroCoordinates = cv2.findNonZero(thresh_filtered)
imageCopy = image.copy()
for pt in nonZeroCoordinates:
    imageCopy = cv2.circle(imageCopy, (pt[0][0], pt[0][1]), 1, (255, 0, 0))
In blue are all the pixels selected as text in the original image. We don’t need to be exact here and select every text pixel, but we don’t want any non text pixel to be blue.

We can now use those coordinates and ask OpenCV to give us the minimum rectangle that contains them:

box = cv2.minAreaRect(nonZeroCoordinates)
boxPts = cv2.boxPoints(box)
for i in range(4):
    # boxPoints returns floats, and cv2.line expects integer pixel coordinates
    pt1 = (int(boxPts[i][0]), int(boxPts[i][1]))
    pt2 = (int(boxPts[(i+1)%4][0]), int(boxPts[(i+1)%4][1]))
    cv2.line(imageCopy, pt1, pt2, (0,255,0), 2, cv2.LINE_AA)
In green is the minimum rectangle that contains all the blue points, which represents text in this case

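The estimated angle can then be simply retrieved from the returned rectangle. Note: Remember to double check the returned angle, as it might be different to what you’re expecting. In the OpenCV versions this was written against, the function minAreaRect returns angles between -90 and 0 degrees. Newer releases may report the angle in a different range, so print it and check it before applying the correction below.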

angle = box[2]
if(angle < -45):
    angle = 90 + angle
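
The reason for that correction is that the same rectangle can be described with two angles that are 90 degrees apart (with width and height swapped), and we want the one that needs the smallest rotation. Here is a small sketch of that rule as a function, assuming the -90 to 0 convention described above. The name skew_from_min_area_rect is just something I made up for this example.

```python
def skew_from_min_area_rect(angle):
    """Map an angle in [-90, 0) to the equivalent one in [-45, 45].

    A rectangle reported at -80 degrees is the same rectangle as one
    reported at 10 degrees with its sides swapped, and rotating the text
    by 10 degrees is what we actually want.
    """
    if angle < -45.0:
        return 90.0 + angle
    return angle


for reported in (-80.0, -45.0, -10.0):
    print(reported, "->", skew_from_min_area_rect(reported))
```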

Once you have the estimated angle, it’s time to rotate the image back. First we calculate a rotation matrix based on the angle, and then we apply it. The rotation angle is expressed in degrees, and positive values mean a counter-clockwise rotation.

h, w, c = image.shape
scale = 1.
center = (w/2., h/2.)
M = cv2.getRotationMatrix2D(center, angle, scale)
rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)
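
In case you are curious about what is inside M, this is a pure Python sketch of the 2x3 matrix described in the OpenCV documentation for getRotationMatrix2D. The helpers rotation_matrix_2d and apply_affine are my own names for this illustration. The sketch shows two things: the center stays where it is, and a positive angle moves a point to the right of the center upwards, which is counter-clockwise on screen because the y axis of an image points down.

```python
import math


def rotation_matrix_2d(center, angle_degrees, scale=1.0):
    """Build the 2x3 affine matrix for a rotation around center."""
    cx, cy = center
    a = scale * math.cos(math.radians(angle_degrees))
    b = scale * math.sin(math.radians(angle_degrees))
    return [
        [a, b, (1.0 - a) * cx - b * cy],
        [-b, a, b * cx + (1.0 - a) * cy],
    ]


def apply_affine(M, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (
        M[0][0] * x + M[0][1] * y + M[0][2],
        M[1][0] * x + M[1][1] * y + M[1][2],
    )


M = rotation_matrix_2d((100.0, 50.0), 90.0)
print(apply_affine(M, (100.0, 50.0)))  # the center does not move
print(apply_affine(M, (110.0, 50.0)))  # a point to the right of the center
```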

And that’s it. You now have an aligned image, ready to be parsed for OCR, or any other application.

Posted in Computer Vision, OpenCV, Programming.



Align text images with OpenCV

Most optical character recognition (OCR) software first aligns the image properly before detecting the text in it. Here I’ll show you how to do that with OpenCV:

Initial image on the left, aligned image on the right

First we need to include the required header files:

#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>
#include <iomanip>
#include <string>

Let’s read an image in. You probably will have a color image, so first we need to convert it to gray scale (one channel only).

cv::Mat image = cv::imread( "text_rotated.png", cv::IMREAD_COLOR);
cv::Mat gray;
cv::cvtColor(image, gray, cv::COLOR_BGR2GRAY);
Initial image, converted to grayscale(one channel)

We’re interested in the text itself and nothing else, so we’re going to create an image that represents this. Non-zero pixels represent what we want, and zero pixels represent background. To do that, we will threshold the grayscale image, so that only the text has non-zero values. The explanation of the inputs is simple: gray is the grayscale image used as input, and thresh is where the thresholded image will be stored. The next argument is the threshold value, set to 0, although it is ignored because we’re using the THRESH_OTSU flag, so the value is calculated by the Otsu algorithm instead. The next argument is 255, which is the value assigned to the pixels that pass the threshold. Finally, the flags indicate that we want a binary output (all or nothing), and that we want the output reversed (since the text is black in the original).

cv::Mat thresh;
cv::threshold(gray, thresh, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);
By thresholding the initial image, we now have an image that has non-zero values for the pixels that represent text, although it still has some noise

First we need to remove those extra white pixels outside of the area of the text. We can do that with the morphological operator Open, which basically erodes the image (makes the white areas smaller), and then dilates it back (makes the white areas larger again). By doing that any small dots, or noise in the image, will be removed. The larger the kernel size, the more noise will be removed. You can choose the kernel size either by hand, or with an iterative method that checks, for example, the ratio of non-zero to zero pixels for each value, and selects the one that maximizes that metric.

int kernel_size = 4;
cv::Mat thresh_filtered;
cv::Mat kernel = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(kernel_size, kernel_size));
cv::morphologyEx(thresh, thresh_filtered, cv::MORPH_OPEN, kernel);
We now have an image that only represents text, without any external noise

Now that we have an image that represents the text, we’re interested in knowing the angle in which it is rotated. There are a few different ways of doing this. Since OpenCV has a convenient function that gives you the minimum rectangle that contains all non-zero values, we could use that. Let’s see how it works:

cv::Mat nonZeroCoordinates;
// Use the filtered image, so that the noise removed above is not counted as text
cv::findNonZero(thresh_filtered, nonZeroCoordinates);
cv::Mat imageCopy = image.clone();

for (size_t i = 0; i < nonZeroCoordinates.total(); i++ ) 
{
    cv::Point pt = nonZeroCoordinates.at<cv::Point>(i);
    cv::circle(imageCopy, pt, 1, cv::Scalar(255, 0, 0), 1);
}
In blue are all the pixels selected as text in the original image. We don’t need to be exact here and select every text pixel, but we don’t want any non text pixel to be blue.

We can now use those coordinates and ask OpenCV to give us the minimum rectangle that contains them:

cv::RotatedRect box = cv::minAreaRect(nonZeroCoordinates);
cv::Point2f vertices[4];
box.points(vertices);
for (int i = 0; i < 4; i++) 
{
    cv::line(imageCopy, vertices[i], vertices[(i+1)%4], cv::Scalar(0,255,0), 2);
}

In green is the minimum rectangle that contains all the blue points, which represents text in this case

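The estimated angle can then be simply retrieved from the returned rectangle. Note: Remember to double check the returned angle, as it might be different to what you’re expecting. In the OpenCV versions this was written against, the function minAreaRect returns angles between -90 and 0 degrees. Newer releases may report the angle in a different range, so print it and check it before applying the correction below.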

float angle = box.angle;
if (angle < -45.0f)
{
   angle = (90.0f + angle);
}

Once you have the estimated angle, it’s time to rotate the image back. First we calculate a rotation matrix based on the angle, and then we apply it. The rotation angle is expressed in degrees, and positive values mean a counter-clockwise rotation.

cv::Point2f center((image.cols) / 2.0f, (image.rows) / 2.0f);
double scale = 1.;
cv::Mat M = cv::getRotationMatrix2D(center, angle, scale);
cv::Mat rotated;
cv::warpAffine(image, rotated, M, image.size(), cv::INTER_CUBIC, cv::BORDER_REPLICATE);

And that’s it. You now have an aligned image, ready to be parsed for OCR, or any other application.

Posted in Computer Vision, OpenCV, Programming.



Installing OpenCV 4.5.0 in Ubuntu 20.04 LTS

OpenCV was initially released about 20 years ago. It’s one of the most well established computer vision libraries in the world, with thousands of algorithm implementations ready to be used in commercial and research applications.

In this guide I’ll show you how to install OpenCV 4.5.0 in your Ubuntu 20.04 LTS and how to create computer vision applications with C++ and Python.

face detection

Note: I have noticed some copies of my posts elsewhere, so make sure that you are reading this from the original source, at samontab dot com, accessible from here so that you don’t miss the comments.

First, make sure you have the latest software installed:

sudo apt-get update
sudo apt-get upgrade

Now, you need to install some dependencies, such as support for reading and writing video files, drawing on the screen, some needed tools, etc… This step is very easy, you only need to write the following command in the Terminal:

sudo apt-get install build-essential cmake python3-numpy python3-dev python3-tk libavcodec-dev libavformat-dev libavutil-dev libswscale-dev libavresample-dev libdc1394-dev libeigen3-dev libgtk-3-dev libvtk7-qt-dev

Time to download and compile OpenCV 4.5.0:

cd ~
wget https://github.com/opencv/opencv/archive/4.5.0.tar.gz
tar -xvzf 4.5.0.tar.gz
rm 4.5.0.tar.gz
cd opencv-4.5.0
mkdir build;cd build
cmake -DBUILD_EXAMPLES=ON ..

configuration for opencv

Check that the above command produces no error and that in particular it reports FFMPEG as YES. If this is not the case you will not be able to read or write videos. Double check that every feature you want present in OpenCV is reported correctly. If you’re missing something, make sure to install the missing package and run the cmake line again. In some cases you might need to delete the whole build directory first to get a correct detection. Once you’re happy with what’s shown then we’ll build it. After running the following lines you can go grab a coffee as it will take a long time.

make -j4
sudo make install
echo '/usr/local/lib' | sudo tee --append /etc/ld.so.conf.d/opencv.conf
sudo ldconfig
echo 'PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig' >> ~/.bashrc
echo 'export PKG_CONFIG_PATH' >> ~/.bashrc
source ~/.bashrc

After that’s finished you should be able to use OpenCV with C++ and Python. Let’s do a quick check:

python3
import cv2
cv2.__version__

python_opencv

You should see no errors and 4.5.0 as the version.
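
If you prefer a script over the interactive check, here is a small sketch that prints the version and also repeats the FFMPEG check from the cmake step, this time against the installed library. It uses cv2.getBuildInformation(), which returns the build summary as text. The helper ffmpeg_enabled is a name I made up, and it assumes the summary contains a line that looks like "FFMPEG: YES", which may not hold for every build.

```python
def ffmpeg_enabled(build_info):
    """Return True if the build summary reports FFMPEG as YES.

    Assumes a line of the form "FFMPEG:   YES" somewhere in the text.
    """
    for line in build_info.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "FFMPEG":
            return value.strip().upper().startswith("YES")
    return False


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 could not be imported, check the installation steps")
    else:
        print("OpenCV version:", cv2.__version__)
        print("FFMPEG enabled:", ffmpeg_enabled(cv2.getBuildInformation()))
```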

Now let’s run some Python demos. You can start all of them from a central place:

cd ~/opencv-4.5.0/samples/python
python3 demo.py

python examples

You can also execute any of the already compiled C++ demos. For example let’s run the intelligent scissors demo:

cd ~/opencv-4.5.0/build/bin
./example_cpp_intelligent_scissors

If you click on the image and move your mouse you’ll see that the selection moves to match the object edges. Once you have finished selecting the contour of the object, right-click and the object will be selected.

intelligent scissors

Now you’ll learn how to compile an application that uses OpenCV with C++ using cmake. Let’s copy another example source code and prepare it for compilation.

cd ~
mkdir sample
cd sample
cp ~/opencv-4.5.0/samples/cpp/text_skewness_correction.cpp .
cp ~/opencv-4.5.0/samples/data/imageTextR.png .

Now we’re going to create a cmake file that will allow us to compile the sample. Create a text file called CMakeLists.txt and put this content in it:

cmake_minimum_required (VERSION 3.0.2)
project (text_skewness_correction)
find_package(OpenCV REQUIRED)
include_directories( ${OpenCV_INCLUDE_DIRS} )
add_executable (${PROJECT_NAME} text_skewness_correction.cpp)
target_link_libraries (${PROJECT_NAME} ${OpenCV_LIBS})

OK, now we’re ready to build it and run it:

mkdir build
cd build
cmake ..
make
./text_skewness_correction ../imageTextR.png

rotated text alignment

Now you have OpenCV 4.5.0 correctly installed in your Ubuntu 20.04 LTS system, and you know how to create applications using C++ and Python.

Posted in Computer Vision, Open Source, OpenCV.



Minimal install of OpenVINO 2020.4 in Kubuntu 20.04 LTS for C++ inference on CPU

The official installer of OpenVINO 2020.4 requires Ubuntu 18.04 LTS, but since Kubuntu 20.04 LTS is already out I wanted to use that instead. Here’s what you need to do to get a minimal install for C++ inference on the CPU.

First make sure you have everything up to date:

sudo apt-get update
sudo apt-get upgrade

You’ll need git installed to get the source code. Also, python3 comes already pre-installed with 20.04 but not python, so we’re going to assign any python calls to python3.

sudo apt-get install git python-is-python3

Now get the OpenVINO source code into your home directory (or wherever you prefer). Don’t download the zip file directly from github as there are some 3rd party dependencies that need to get downloaded as well. That’s the --recursive part of the following command. The specific --branch we’re using is the same as the official installer uses, releases/2020/4.

cd ~
git clone --recursive --branch releases/2020/4 https://github.com/openvinotoolkit/openvino.git

Install the dependencies by running the included script:

~/openvino/install_dependencies.sh

Before continuing you need to change some of the files you just downloaded. I’m using xdg-open which will invoke your preferred application but you can use whatever you prefer instead like vi, nano, etc.

First open up the gna_helper.cpp file.

xdg-open ~/openvino/inference-engine/src/gna_plugin/gna_helper.cpp

Update the code for these functions as follows:

void profilerRtcStart(intel_gna_profiler_rtc *p) {
    if (nullptr == p) return;
    clearTimeB(p->passed);
    clearTimeB(p->stop);
    //ftime(&p->start);
    timespec start;
    clock_gettime(CLOCK_REALTIME, &start);
    p->start.time = start.tv_sec;
    p->start.millitm = start.tv_nsec/1000000;
}

void profilerRtcStop(intel_gna_profiler_rtc *p) {
    if (nullptr == p) return;
    //ftime(&p->stop);
    timespec stop;
    clock_gettime(CLOCK_REALTIME, &stop);
    p->stop.time = stop.tv_sec;
    p->stop.millitm = stop.tv_nsec/1000000;
}

Now open the execution_engine.cpp file:

xdg-open ~/openvino/inference-engine/thirdparty/ade/sources/ade/source/execution_engine.cpp

In line 141, simply update this:

//return std::move(ret);
return ret;

Now you’re ready to build it. And this will take a long time.

  • ENABLE_MKL_DNN=ON means we want the CPU plugin
  • ENABLE_CLDNN=OFF means we don’t want the GPU plugin
cd ~/openvino
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_MKL_DNN=ON -DENABLE_CLDNN=OFF -DENABLE_OPENCV=OFF -DENABLE_PYTHON=OFF ..
make -j4

You should now be able to run the compiled applications, for example to get information about your device you can run the following:

~/openvino/bin/intel64/Release/hello_query_device

It should output detailed information about your device. Check out the Intel Model Zoo for models and samples that you can use.

Posted in Open Source, OpenVINO, Programming.



Installing OpenCV 3.2.0 with contrib modules in Ubuntu 16.04 LTS

UPDATE: You can also install OpenCV 4.5.0 in Ubuntu 20.04 LTS.

OpenCV 3.2.0 has been out for a while and contains many improvements and exciting new features, so it’s time to update this guide using the latest Ubuntu 16.04 LTS.

A big change in OpenCV 3.2.0 is that many of the newest algorithms now reside separately in the contrib repository.

Some of these modules include Face Recognition, RGB-Depth processing, Image Registration, Saliency, Structure From Motion, Tracking, and much more.

So let’s install OpenCV 3.2.0 with the contrib modules and other good stuff by executing the following code on the command line:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential libgtk2.0-dev libjpeg-dev libtiff5-dev libjasper-dev libopenexr-dev cmake python-dev python-numpy python-tk libtbb-dev libeigen3-dev yasm libfaac-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev libvorbis-dev libxvidcore-dev libx264-dev libqt4-dev libqt4-opengl-dev sphinx-common texlive-latex-extra libv4l-dev libdc1394-22-dev libavcodec-dev libavformat-dev libswscale-dev default-jdk ant libvtk5-qt4-dev
cd ~
mkdir opencv
cd opencv
wget https://github.com/opencv/opencv/archive/3.2.0.tar.gz
tar -xvzf 3.2.0.tar.gz
wget https://github.com/opencv/opencv_contrib/archive/3.2.0.zip
unzip 3.2.0.zip
cd opencv-3.2.0

But before we build it, we need to fix one problem currently present in the contrib modules, specifically in the freetype module, which allows you to draw UTF-8 strings. If you are getting an error similar to ImportError: /usr/local/lib/libopencv_freetype.so.3.2: undefined symbol: hb_shape, this will fix it:

sed -i 's/${freetype2_LIBRARIES} ${harfbuzz_LIBRARIES}/${FREETYPE_LIBRARIES} ${HARFBUZZ_LIBRARIES}/g' ../opencv_contrib-3.2.0/modules/freetype/CMakeLists.txt

Now that the problem is solved, we are ready to build OpenCV.

mkdir build
cd build
cmake -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D BUILD_EXAMPLES=ON -D WITH_QT=ON -D WITH_OPENGL=ON -D WITH_VTK=ON -DCMAKE_BUILD_TYPE=RELEASE -DOPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-3.2.0/modules ..
make
sudo make install
echo '/usr/local/lib' | sudo tee --append /etc/ld.so.conf.d/opencv.conf
sudo ldconfig
echo 'PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig' >> ~/.bashrc
echo 'export PKG_CONFIG_PATH' >> ~/.bashrc
source ~/.bashrc

Now you should be able to compile with the OpenCV libraries, including the contrib repositories.

For example, let’s compute some Fine-Grained Saliency, which is available in the saliency module of the contrib repository:

cd ~
mkdir saliency
cd saliency
cp ../opencv/opencv_contrib-3.2.0/modules/saliency/samples/computeSaliency.cpp .
cp ../opencv/opencv-3.2.0/samples/data/Megamind.avi .
g++ -o computeSaliency `pkg-config opencv --cflags` computeSaliency.cpp `pkg-config opencv --libs`
./computeSaliency FINE_GRAINED Megamind.avi 23

Original Image:

Computed Saliency:

Posted in Open Source, OpenCV, Programming.


Interfacing Intel RealSense F200 with OpenCV

RealSense from Intel is an interesting technology that brings 3D cameras into the mainstream.
Although the RealSense SDK provides a good starting point for many applications, some users would prefer to have a bit more control over the images. In this post, I’ll describe how to access the raw streams from an Intel RealSense F200 camera, and how to convert those images into OpenCV cv::Mat objects.

Here is the video with all the explanations

And here are the files used in the video:

01-alignedSingleThreaded.cpp:

#include <pxcsensemanager.h>
#include <opencv2/opencv.hpp>

PXCSenseManager *pxcSenseManager;

PXCImage * CVMat2PXCImage(cv::Mat cvImage)
{
    PXCImage::ImageInfo iinfo;
    memset(&iinfo,0,sizeof(iinfo));
    iinfo.width=cvImage.cols;
    iinfo.height=cvImage.rows;

    PXCImage::PixelFormat format;
    int type = cvImage.type();
    if(type == CV_8UC1)
        format = PXCImage::PIXEL_FORMAT_Y8;
    else if(type == CV_8UC3)
        format = PXCImage::PIXEL_FORMAT_RGB24;
    else if(type == CV_32FC1)
        format = PXCImage::PIXEL_FORMAT_DEPTH_F32;

    iinfo.format = format;

    PXCImage *pxcImage = pxcSenseManager->QuerySession()->CreateImage(&iinfo);

    PXCImage::ImageData data;
    pxcImage->AcquireAccess(PXCImage::ACCESS_WRITE, format, &data);

    data.planes[0] = cvImage.data;

    pxcImage->ReleaseAccess(&data);
    return pxcImage;
}

cv::Mat PXCImage2CVMat(PXCImage *pxcImage, PXCImage::PixelFormat format)
{
    PXCImage::ImageData data;
    pxcImage->AcquireAccess(PXCImage::ACCESS_READ, format, &data);

    int width = pxcImage->QueryInfo().width;
    int height = pxcImage->QueryInfo().height;
    if(!format)
        format = pxcImage->QueryInfo().format;

    int type;
    if(format == PXCImage::PIXEL_FORMAT_Y8)
        type = CV_8UC1;
    else if(format == PXCImage::PIXEL_FORMAT_RGB24)
        type = CV_8UC3;
    else if(format == PXCImage::PIXEL_FORMAT_DEPTH_F32)
        type = CV_32FC1;
    else if(format == PXCImage::PIXEL_FORMAT_DEPTH)
        type = CV_16UC1;

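    //Copy the pixels so that the returned image stays valid after access to the SDK buffer is released
    cv::Mat ocvImage = cv::Mat(cv::Size(width, height), type, data.planes[0]).clone();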

    pxcImage->ReleaseAccess(&data);
    return ocvImage;
}

int main(int argc, char* argv[])
{
    //Define some parameters for the camera
    cv::Size frameSize = cv::Size(640, 480);
    float frameRate = 60;

    //Create the OpenCV windows and images
    cv::namedWindow("IR", cv::WINDOW_NORMAL);
    cv::namedWindow("Color", cv::WINDOW_NORMAL);
    cv::namedWindow("Depth", cv::WINDOW_NORMAL);
    cv::Mat frameIR = cv::Mat::zeros(frameSize, CV_8UC1);
    cv::Mat frameColor = cv::Mat::zeros(frameSize, CV_8UC3);
    cv::Mat frameDepth = cv::Mat::zeros(frameSize, CV_8UC1);

    //Initialize the RealSense Manager
    pxcSenseManager = PXCSenseManager::CreateInstance();

    //Enable the streams to be used
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_IR, frameSize.width, frameSize.height, frameRate);
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_COLOR, frameSize.width, frameSize.height, frameRate);
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_DEPTH, frameSize.width, frameSize.height, frameRate);

    //Initialize the pipeline
    pxcSenseManager->Init();

    bool keepRunning = true;
    while(keepRunning)
    {
        //Acquire all the frames from the camera
        pxcSenseManager->AcquireFrame();
        PXCCapture::Sample *sample = pxcSenseManager->QuerySample();

        //Convert each frame into an OpenCV image
        frameIR = PXCImage2CVMat(sample->ir, PXCImage::PIXEL_FORMAT_Y8);
        frameColor = PXCImage2CVMat(sample->color, PXCImage::PIXEL_FORMAT_RGB24);
        cv::Mat frameDepth_u16 = PXCImage2CVMat(sample->depth, PXCImage::PIXEL_FORMAT_DEPTH);
        frameDepth_u16.convertTo(frameDepth, CV_8UC1);

        cv::Mat frameDisplay;
        cv::equalizeHist(frameDepth, frameDisplay);

        //Display the images
        cv::imshow("IR", frameIR);
        cv::imshow("Color", frameColor);
        cv::imshow("Depth", frameDisplay);

        //Check for user input
        int key = cv::waitKey(1);
        if(key == 27)
            keepRunning = false;

        //Release the memory from the frames
        pxcSenseManager->ReleaseFrame();
    }

    //Release the memory from the RealSense manager
    pxcSenseManager->Release();

    return 0;
}

02-unalignedSingleThreaded.cpp:

#include <pxcsensemanager.h>
#include <opencv2/opencv.hpp>

cv::Mat PXCImage2CVMat(PXCImage *pxcImage, PXCImage::PixelFormat format)
{
    PXCImage::ImageData data;
    pxcImage->AcquireAccess(PXCImage::ACCESS_READ, format, &data);

    int width = pxcImage->QueryInfo().width;
    int height = pxcImage->QueryInfo().height;

    if(!format)
        format = pxcImage->QueryInfo().format;

    int type;
    if(format == PXCImage::PIXEL_FORMAT_Y8)
        type = CV_8UC1;
    else if(format == PXCImage::PIXEL_FORMAT_RGB24)
        type = CV_8UC3;
    else if(format == PXCImage::PIXEL_FORMAT_DEPTH_F32)
        type = CV_32FC1;

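    //Copy the pixels so that the returned image stays valid after access to the SDK buffer is released
    cv::Mat ocvImage = cv::Mat(cv::Size(width, height), type, data.planes[0]).clone();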

    pxcImage->ReleaseAccess(&data);
    return ocvImage;
}

int main(int argc, char* argv[])
{
    //Define some parameters for the camera
    cv::Size frameSize = cv::Size(640, 480);
    float frameRate = 60;

    //Create the OpenCV windows and images
    cv::namedWindow("IR", cv::WINDOW_NORMAL);
    cv::namedWindow("Color", cv::WINDOW_NORMAL);
    cv::namedWindow("Depth", cv::WINDOW_NORMAL);
    cv::Mat frameIR = cv::Mat::zeros(frameSize, CV_8UC1);
    cv::Mat frameColor = cv::Mat::zeros(frameSize, CV_8UC3);
    cv::Mat frameDepth = cv::Mat::zeros(frameSize, CV_8UC1);

    //Initialize the RealSense Manager
    PXCSenseManager *pxcSenseManager = PXCSenseManager::CreateInstance();

    //Enable the streams to be used
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_IR, frameSize.width, frameSize.height, frameRate);
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_COLOR, frameSize.width, frameSize.height, frameRate);
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_DEPTH, frameSize.width, frameSize.height, frameRate);

    //Initialize the pipeline
    pxcSenseManager->Init();

    bool keepRunning = true;
    while(keepRunning)
    {
        //Acquire any frame from the camera
        pxcSenseManager->AcquireFrame(false);
        PXCCapture::Sample *sample = pxcSenseManager->QuerySample();

        //Convert each frame into an OpenCV image
        //You need to make sure that the image is there first
        if(sample->ir)
            frameIR = PXCImage2CVMat(sample->ir, PXCImage::PIXEL_FORMAT_Y8);
        if(sample->color)
            frameColor = PXCImage2CVMat(sample->color, PXCImage::PIXEL_FORMAT_RGB24);
        if(sample->depth)
            PXCImage2CVMat(sample->depth, PXCImage::PIXEL_FORMAT_DEPTH_F32).convertTo(frameDepth, CV_8UC1);

        //Display the images
        cv::imshow("IR", frameIR);
        cv::imshow("Color", frameColor);
        cv::imshow("Depth", frameDepth);

        //Check for user input
        int key = cv::waitKey(1);
        if(key == 27)
            keepRunning = false;

        //Release the memory from the frames
        pxcSenseManager->ReleaseFrame();
    }

    //Release the memory from the RealSense manager
    pxcSenseManager->Release();

    return 0;
}

03-unalignedMultiThreaded.cpp:

#include <pxcsensemanager.h>
#include <iostream>
#include <opencv2/opencv.hpp>

cv::Mat frameIR;
cv::Mat frameColor;
cv::Mat frameDepth;
cv::Mutex framesMutex;

cv::Mat PXCImage2CVMat(PXCImage *pxcImage, PXCImage::PixelFormat format)
{
    PXCImage::ImageData data;
    pxcImage->AcquireAccess(PXCImage::ACCESS_READ, format, &data);

    int width = pxcImage->QueryInfo().width;
    int height = pxcImage->QueryInfo().height;

    if(!format)
        format = pxcImage->QueryInfo().format;

    int type;
    if(format == PXCImage::PIXEL_FORMAT_Y8)
        type = CV_8UC1;
    else if(format == PXCImage::PIXEL_FORMAT_RGB24)
        type = CV_8UC3;
    else if(format == PXCImage::PIXEL_FORMAT_DEPTH_F32)
        type = CV_32FC1;

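    //Copy the pixels so that the returned image stays valid after access to the SDK buffer is released
    cv::Mat ocvImage = cv::Mat(cv::Size(width, height), type, data.planes[0]).clone();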

    pxcImage->ReleaseAccess(&data);
    return ocvImage;
}

class FramesHandler:public PXCSenseManager::Handler
{
public:
    virtual pxcStatus PXCAPI OnNewSample(pxcUID, PXCCapture::Sample *sample)
    {
        framesMutex.lock();
            if(sample->ir)
                frameIR = PXCImage2CVMat(sample->ir, PXCImage::PIXEL_FORMAT_Y8);
            if(sample->color)
                frameColor = PXCImage2CVMat(sample->color, PXCImage::PIXEL_FORMAT_RGB24);
            if(sample->depth)
                PXCImage2CVMat(sample->depth, PXCImage::PIXEL_FORMAT_DEPTH_F32).convertTo(frameDepth, CV_8UC1);
        framesMutex.unlock();
        return PXC_STATUS_NO_ERROR;
    }
};

int main(int argc, char* argv[])
{
    cv::Size frameSize = cv::Size(640, 480);
    float frameRate = 60;

    cv::namedWindow("IR", cv::WINDOW_NORMAL);
    cv::namedWindow("Color", cv::WINDOW_NORMAL);
    cv::namedWindow("Depth", cv::WINDOW_NORMAL);
    frameIR = cv::Mat::zeros(frameSize, CV_8UC1);
    frameColor = cv::Mat::zeros(frameSize, CV_8UC3);
    frameDepth = cv::Mat::zeros(frameSize, CV_8UC1);

    PXCSenseManager *pxcSenseManager = PXCSenseManager::CreateInstance();

    //Enable the streams to be used
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_IR, frameSize.width, frameSize.height, frameRate);
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_COLOR, frameSize.width, frameSize.height, frameRate);
    pxcSenseManager->EnableStream(PXCCapture::STREAM_TYPE_DEPTH, frameSize.width, frameSize.height, frameRate);

    FramesHandler handler;
    pxcSenseManager->Init(&handler);
    pxcSenseManager->StreamFrames(false);

    //Local images for display
    cv::Mat displayIR = frameIR.clone();
    cv::Mat displayColor = frameColor.clone();
    cv::Mat displayDepth = frameDepth.clone();

    bool keepRunning = true;
    while(keepRunning)
    {
        framesMutex.lock();
            displayIR = frameIR.clone();
            displayColor = frameColor.clone();
            displayDepth = frameDepth.clone();
        framesMutex.unlock();

        cv::imshow("IR", displayIR);
        cv::imshow("Color", displayColor);
        cv::imshow("Depth", displayDepth);

        int key = cv::waitKey(1);
        if(key == 27)
            keepRunning = false;
    }
    //Stop the frame acquisition thread
    pxcSenseManager->Close();

    pxcSenseManager->Release();

    return 0;
}

04-alignedMultiThreaded.cpp:

#include <pxcsensemanager.h>
#include <iostream>
#include <opencv2/opencv.hpp>

cv::Mat frameIR;
cv::Mat frameColor;
cv::Mat frameDepth;
cv::Mutex framesMutex;

cv::Mat PXCImage2CVMat(PXCImage *pxcImage, PXCImage::PixelFormat format)
{
    PXCImage::ImageData data;
    pxcImage->AcquireAccess(PXCImage::ACCESS_READ, format, &data);

    int width = pxcImage->QueryInfo().width;
    int height = pxcImage->QueryInfo().height;

    if(!format)
        format = pxcImage->QueryInfo().format;

    int type;
    if(format == PXCImage::PIXEL_FORMAT_Y8)
        type = CV_8UC1;
    else if(format == PXCImage::PIXEL_FORMAT_RGB24)
        type = CV_8UC3;
    else if(format == PXCImage::PIXEL_FORMAT_DEPTH_F32)
        type = CV_32FC1;

    //Clone while access is still held: this cv::Mat constructor only wraps
    //data.planes[0], which is not guaranteed to stay valid after ReleaseAccess()
    cv::Mat ocvImage = cv::Mat(cv::Size(width, height), type, data.planes[0]).clone();

    pxcImage->ReleaseAccess(&data);
    return ocvImage;
}

class FramesHandler:public PXCSenseManager::Handler
{
public:
    virtual pxcStatus PXCAPI OnNewSample(pxcUID, PXCCapture::Sample *sample)
    {
        framesMutex.lock();
            frameIR = PXCImage2CVMat(sample->ir, PXCImage::PIXEL_FORMAT_Y8);
            frameColor = PXCImage2CVMat(sample->color, PXCImage::PIXEL_FORMAT_RGB24);
            PXCImage2CVMat(sample->depth, PXCImage::PIXEL_FORMAT_DEPTH_F32).convertTo(frameDepth, CV_8UC1);
        framesMutex.unlock();
        return PXC_STATUS_NO_ERROR;
    }
};

int main(int argc, char* argv[])
{
    cv::Size frameSize = cv::Size(640, 480);
    float frameRate = 60;

    cv::namedWindow("IR", cv::WINDOW_NORMAL);
    cv::namedWindow("Color", cv::WINDOW_NORMAL);
    cv::namedWindow("Depth", cv::WINDOW_NORMAL);
    frameIR = cv::Mat::zeros(frameSize, CV_8UC1);
    frameColor = cv::Mat::zeros(frameSize, CV_8UC3);
    frameDepth = cv::Mat::zeros(frameSize, CV_8UC1);

    PXCSenseManager *pxcSenseManager = PXCSenseManager::CreateInstance();

    //Enable the streams to be used
    PXCVideoModule::DataDesc ddesc={};
    ddesc.deviceInfo.streams = PXCCapture::STREAM_TYPE_IR | PXCCapture::STREAM_TYPE_COLOR | PXCCapture::STREAM_TYPE_DEPTH;

    pxcSenseManager->EnableStreams(&ddesc);

    FramesHandler handler;
    pxcSenseManager->Init(&handler);
    pxcSenseManager->StreamFrames(false);

    //Local images for display
    cv::Mat displayIR = frameIR.clone();
    cv::Mat displayColor = frameColor.clone();
    cv::Mat displayDepth = frameDepth.clone();

    bool keepRunning = true;
    while(keepRunning)
    {
        framesMutex.lock();
            displayIR = frameIR.clone();
            displayColor = frameColor.clone();
            displayDepth = frameDepth.clone();
        framesMutex.unlock();

        cv::imshow("IR", displayIR);
        cv::imshow("Color", displayColor);
        cv::imshow("Depth", displayDepth);

        int key = cv::waitKey(1);
        if(key == 27)
            keepRunning = false;
    }
    //Stop the frame acquisition thread
    pxcSenseManager->Close();

    pxcSenseManager->Release();

    return 0;
}

Posted in Computer Vision, IoT, OpenCV, Photography, Programming.


Understanding how virtual make-up apps work

Virtual make-up is an interesting technology that can be used to help decide which cosmetics to buy, to enhance portraits, or just for fun.

Since the final color of an applied cosmetic depends on both the color of the cosmetic and the color of the skin, most of the time people have to go to a store and try the products on themselves to see how they would look. With virtual make-up technology, this can be conveniently simulated on a computer or a mobile phone. The only thing needed is a photograph of the person's face looking towards the camera, and the software can simulate how a particular make-up would look on that particular skin color. This can also be applied to a live camera feed for a more realistic application with real-time rendering.

To create an application like this, the first step is to estimate where the faces are in the photograph. This can be solved with computer vision. In the general case, this problem is called object detection. You need to define what your object looks like, and then train an algorithm with many images of the object. Most computer vision algorithms that perform this task assume that the face appears in a roughly frontal view, with few or no objects covering it. This is because most faces have a similar structure when viewed from the front, whereas profile or back views of the head change a lot from person to person because of hair styles, among other things.

In order to capture the facial structure, these algorithms are usually trained with features such as Haar-like, LBP, or HOG. Once trained, the algorithm is able to detect faces in images.
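
To give an idea of what a Haar-like feature is, here is a minimal sketch in plain C++ (no OpenCV). It builds an integral image, which lets you sum any rectangle of pixels with four lookups, and then evaluates a two-rectangle feature: the sum of the upper half of a window minus the sum of the lower half. The names IntegralImage, buildIntegral, rectSum and haarTwoRectVertical are made up for this illustration; a real detector evaluates many such features at many positions and scales, and this is not the code of any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

//Summed-area table with one extra row and column of zeros:
//at(y, x) is the sum of all pixels in rows < y and columns < x
struct IntegralImage
{
    std::size_t width;
    std::size_t height;
    std::vector<long long> sums;

    long long at(std::size_t y, std::size_t x) const
    {
        return sums[y * (width + 1) + x];
    }
};

//Build the table from a row-major 8-bit greyscale image
IntegralImage buildIntegral(const std::vector<unsigned char> &pixels,
                            std::size_t width, std::size_t height)
{
    assert(pixels.size() == width * height);
    IntegralImage ii{width, height,
                     std::vector<long long>((width + 1) * (height + 1), 0)};
    for(std::size_t y = 0; y < height; ++y)
    {
        long long rowSum = 0;
        for(std::size_t x = 0; x < width; ++x)
        {
            rowSum += pixels[y * width + x];
            ii.sums[(y + 1) * (width + 1) + (x + 1)] = ii.at(y, x + 1) + rowSum;
        }
    }
    return ii;
}

//Sum of the w x h rectangle whose top-left pixel is (x, y)
long long rectSum(const IntegralImage &ii, std::size_t x, std::size_t y,
                  std::size_t w, std::size_t h)
{
    assert(x + w <= ii.width && y + h <= ii.height);
    return ii.at(y + h, x + w) - ii.at(y, x + w) - ii.at(y + h, x) + ii.at(y, x);
}

//Two-rectangle Haar-like feature: upper half minus lower half of the window
long long haarTwoRectVertical(const IntegralImage &ii, std::size_t x, std::size_t y,
                              std::size_t w, std::size_t h)
{
    assert(h % 2 == 0);
    return rectSum(ii, x, y, w, h / 2) - rectSum(ii, x, y + h / 2, w, h / 2);
}
```

A window placed over a dark band above a bright band (for example, eyes above cheeks) gives a strongly negative value, which is the kind of contrast pattern a trained detector can learn to look for.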

So, for example, let’s say that you start with an image like this:

[Image: the original photograph]

After detecting the face, you will end up with a region of interest on the image. Something like this:

[Image: the photograph with the detected face region outlined]

Now, inside that region of interest, we need to detect specific face landmarks. These landmarks represent the position of different parts of the face, such as the eyes, mouth, and eyebrows. This is, again, an object detection problem. You need to train an algorithm with many annotated images and a specific number of facial landmarks. Then, you can use this trained algorithm to detect those facial features in a new image. Like this, for example:

[Image: the face with the detected landmark points]

Once you have the position of those landmarks, you need to design your own make-up and align it to those features. Then you can blend the original image together with your designed make-up. Here are some basic examples with a few different colors:

[Images: the photograph with make-up applied in different colors]
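
The alignment step can be sketched in a few lines of standard C++. This is only one simple way to do it, assuming the make-up was designed on a template face with two known reference points (say, the two eye centres) and that the same two points were detected in the photograph. From the two pairs it computes a similarity transform (rotation, uniform scale and translation) that maps any template coordinate into the photograph. Points are stored as complex numbers to keep the arithmetic short; Point, Similarity and fitSimilarity are names invented for this example.

```cpp
#include <cassert>
#include <complex>

//A 2D point: x in the real part, y in the imaginary part
using Point = std::complex<double>;

//Maps p to a * p + b: a holds rotation and uniform scale, b the translation
struct Similarity
{
    Point a;
    Point b;

    Point apply(Point p) const
    {
        return a * p + b;
    }
};

//Find the transform that sends template points t1, t2 onto face points f1, f2
Similarity fitSimilarity(Point t1, Point t2, Point f1, Point f2)
{
    assert(t1 != t2);
    Point a = (f2 - f1) / (t2 - t1);
    return Similarity{a, f1 - a * t1};
}
```

For example, if the template eyes are at (30, 40) and (70, 40) and the detected eyes are at (120, 200) and (200, 200), the transform doubles the size and shifts the design, so a template point at (50, 60) lands on (160, 240) in the photograph. With more than two landmarks you would normally fit the transform to all of them, or warp the design piece by piece.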

The position of the detected landmarks and the design of the make-up are crucial to make it appear realistic. On top of that, there are many different computer vision techniques that can be applied in order to blend the make-up into the face in a more realistic manner.
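
The most basic of those blending techniques is plain alpha blending, sketched below in standard C++ as an illustration only. Each output channel is a weighted average of the skin pixel and the make-up color, which is also why the same cosmetic looks different on different skin tones. The Rgb struct and the blendChannel and blendPixel functions are hypothetical names for this example, and the opacity value alpha is something you would choose per product or per pixel (for example, fading towards the edge of the lips).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Rgb
{
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

//out = (1 - alpha) * skin + alpha * makeup, rounded to the nearest integer
std::uint8_t blendChannel(std::uint8_t skin, std::uint8_t makeup, double alpha)
{
    double value = (1.0 - alpha) * skin + alpha * makeup;
    return static_cast<std::uint8_t>(std::lround(value));
}

//alpha = 0 keeps the skin untouched, alpha = 1 paints the make-up color as is
Rgb blendPixel(Rgb skin, Rgb makeup, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    return Rgb{blendChannel(skin.r, makeup.r, alpha),
               blendChannel(skin.g, makeup.g, alpha),
               blendChannel(skin.b, makeup.b, alpha)};
}
```

Blending a skin pixel of (200, 160, 140) with a lipstick color of (180, 30, 60) at alpha 0.5 gives (190, 95, 100). Running the same lipstick over a darker or lighter skin pixel gives a different result, which is the effect the application is trying to simulate.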

Posted in Computer Vision, Photography.