ONNX files in OpenCV

I have been aware of OpenCV’s ‘dnn’ module for some time. The last time we tried to use it in a project was a number of years ago, and it didn’t seem to be ready for what we needed (or perhaps we just misunderstood it and didn’t give it a good enough look).

Aside from that, I’ve been using .ONNX (Open Neural Network eXchange) files for a while now. My standard usage of these is to transport a trained model from PyTorch – for example a ResNet classifier – onto a Jetson Nano, NX or Orin. PyTorch can export as .ONNX, and TensorRT on the Jetson can import them, so it’s been literally an ‘exchange’ file format for me.

However, pulling these two things together, I have recently learned that OpenCV’s ‘dnn’ module can load directly from .ONNX files, specifically including ResNet models such as the ResNet18 classifier I have recently trained for a client.

There are a few ‘tricks’ required to prepare images to be classified, and it took me a fair amount of research (including some trial-and-error, and using ChatGPT – that was a day I can never get back…) but it works now: I can classify images, using an ONNX file, in OpenCV, from either C++ or Python.

This means that models that I originally trained for Jetson hardware can now be used on any platform with OpenCV. I will be testing this on a Raspberry Pi 5 shortly to gauge performance.

Currently, it’s using the CPU only, although it does use all available CPU cores. I believe GPU inference is also supported given a suitably-compiled OpenCV: I may try that next.

Computer Vision with OpenCV on a Raspberry Pi

This week I have taken delivery of a Raspberry Pi 2, and a Pi camera module:  Total cost around UKP50.  The aim of the experiment is to see whether the Pi is powerful enough to be used for computer vision applications in the real world.  More of that over the coming days, but the short version is:  Yes it is.

I also needed several other Pi-related components (again, more details of the fun we’re having at a later date).  For various reasons mostly to do with who had what in stock, I split the purchases between two UK companies – 4Tronix, who supply all sorts of superb robotics stuff for Pi and Arduino, and The Pi Hut, who as the name implies sell all things Pi-related.  Both orders were handled quickly, and I recommend both companies highly.

Setting up the new Pi took 2 minutes, and attaching the camera module is easy, if slightly fiddly.

I used the ‘picamera’ module and was getting images displayed on screen, and saved to the filesystem, all within a further few minutes.  The ‘picamera’ module appears to be a very well written library, and the API is certainly powerful.

It was then time to build OpenCV.  This is a slightly more involved process (it has to be built from source), which took a few minutes of hands-on time, followed by about 4 hours of waiting for it to compile.  A quick experiment then showed OpenCV working properly from both C++ and Python.

The picamera module can process images in such a way that they can be handled by OpenCV – the interface between the two is straightforward.  As such, within a few more minutes I was grabbing images live from the Pi camera module, and processing them with normal OpenCV Python calls.  I don’t yet know what would be involved in getting images from the camera from C++, but with a Python interface this good, it may not be necessary to worry about it (Python can of course call C/C++ routines anyway).
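For illustration, a minimal sketch of that interface using picamera’s `PiRGBArray` helper, which hands back a NumPy array in the BGR order OpenCV expects.  The function name `grab_bgr_frame` and the default resolution are assumptions, and the code can only run on a Pi with the camera attached.

```python
def grab_bgr_frame(resolution=(640, 480)):
    """Capture one frame from the Pi camera as an OpenCV-style BGR array."""
    # Imported here because picamera only installs on a Raspberry Pi.
    import picamera
    import picamera.array

    with picamera.PiCamera() as camera:
        camera.resolution = resolution
        with picamera.array.PiRGBArray(camera) as stream:
            # format="bgr" avoids a separate RGB -> BGR conversion later.
            camera.capture(stream, format="bgr")
            # Copy so the array stays valid after the stream is closed.
            return stream.array.copy()
```

The returned array can be passed straight to calls such as `cv2.cvtColor` or `cv2.imshow`.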

Initial impressions are that it all works beautifully.  On the *initial* setup, it seems to take about one second to capture a frame from the camera, but the good news is that OpenCV processing (standard pre-processing such as blurring, and Canny edge detection) is faster than I’d expect from a computer this size.  After playing with a few settings, I am now able to increase the frame rate to many frames per second at capture, and around 4 FPS even including some OpenCV work (colour conversion, blur, and Canny edge detection) – bearing in mind some of those are compute-intensive tasks, I think that’s impressive.

So yes:  The Raspberry Pi 2 and the Pi camera module are certainly suitable for computer vision tasks using OpenCV, and I have two contracts lined up already to work on this.

Some more OpenCV tricks

A busy month of OpenCV contracting for a number of clients, including some work in areas of OpenCV I’ve not used much, if at all, before (non-chargeable, of course – I only charge for productive time).

I am now more familiar than I ever thought I’d be with the HoughLines(P) and HoughCircles functions – the former of which is more complex than it first seems.  Like many things in computer vision, it takes some coaxing to get good results, and even more coaxing to get really robust results across a range of ‘real-life’ images in the problem domain.

I have also worked a lot this month with the whole ‘camera calibration’ suite of functions, and then followed that up by gaining experience with the ‘project image points into the plane’ routines, which can lead to some interesting ‘augmented reality’ applications.  However, in my case, I’ve used them to simply determine exactly where (in the 2D image) a specific point in the 3D space would appear.  It works very well, and I have a project lined up ready to put this into action.

I’ve revisited one of my ‘favourite’ (i.e. most used) parts of the library:  contour finding, and associated pre- and post-processing, but this time all from Python.

During the last few days, I’ve started looking at 2D pose estimation:  specifically in this case, trying to determine the location of a known set of 2D points in a target image, allowing for possible translation, rotation and scale changes.  Not finished with that one, yet.

Last (but not least – this isn’t going to go away) I’ve been making an effort to learn Git.  I was pleased to find this simple guide, which at least let me get on with my work while I learn the rest.

OpenCV with Python – first impressions

I’ve spent a month or so making an effort to learn Python, mostly by forcing myself to do any new ‘prototype’ vision / OpenCV work in the language.  This has cost me some money – I only charge for ‘productive’ time, not ‘learning’ time, and at times the temptation to go back to ‘nice familiar C++’ has been great.  But I’ve made good progress with Python, and I’m glad I’ve stuck at it.  Apart from anything else, the language itself isn’t hard to pick up.

The pros and cons from a computer vision perspective are roughly as expected.  It can be slower to run, but depending on how the code is written, it’s not a big difference.  Once ‘inside’ the OpenCV functions, the speed appears to be about the same (as you’d expect:  it’s just a wrapper for the same code), but any code actually run in Python needs careful planning, and if large amounts of computation were going to be done, C++ would no doubt still be the best bet.

But anything it lacks in runtime speed, it certainly makes up for in speed of development.  As a prototyping language, I think I’m already more productive in Python than C++ (and that’s after 20+ years of C++, and a month of part-time Python).  There will always be more to learn, of course, but I think I’m at the point where the learning curve is beginning to get less steep.