How to Train Your Camera


[Norwegian version - Norsk versjon]

Image recognition made simple: how do you recognise a specific object in a camera feed?


The Challenge

This week I was challenged to suggest a solution to the following problem:

“Given a simple computer, like a Raspberry Pi, a camera and a bird feeder, create a system that can identify the birds feeding there throughout the day.”

Since the question was more theoretical in nature, I decided to limit this post to introducing the building blocks and explaining how you would typically build such a system.

Recognising objects is a pretty common task these days, and it has been solved in quite a few different ways. Let’s have a look at how this works.

TL;DR There are traditional approaches to image recognition and object detection, such as those in OpenCV, and there are techniques based on deep learning, such as those in TensorFlow. Want to know more? Keep on reading.


A Tiny Bit of Theory

The more traditional principles behind OpenCV are well described in this blog post, as well as in this tutorial for detecting (brace yourselves!) cats in images.

This post, in turn, explains how image recognition works in TensorFlow, using a model called a deep convolutional neural network. Like OpenCV, TensorFlow will also let you train your own image classifier.
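To give a feel for what the "convolutional" part means, here is a minimal sketch in plain Python. It is not how TensorFlow implements it; the `convolve2d` helper, the toy image, and the hand-written kernel are all made up for illustration. The idea is that a small filter (kernel) slides across the image and gives a strong response wherever the pattern it looks for shows up. A real network learns many such kernels from the training images and stacks them in layers.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (valid positions only) and return the response map.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0
            for ky in range(kh):
                for kx in range(kw):
                    total += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 grayscale image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

response = convolve2d(image, vertical_edge)
for row in response:
    print(row)  # each row is [0, 18, 0]: the edge sits in the middle column
```

The response is zero in the flat areas and large exactly where the dark region meets the bright one, which is the kind of low-level feature the first layers of such a network tend to pick up.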

Training your own model

Someone has to train the model that you will be using for image classification (recognition).

Where to Start?

Basically, you have two choices: train your own model, or find one that has been trained by someone else. It is usually worth looking for a pre-trained model first. However, the more specific your image classification requirements are, the higher the chances that you will need to train your own. This is the same whether you go for OpenCV or TensorFlow.

Since we want to run the model on a relatively low-end computer, you might consider doing the processing in the Cloud. However, it should also be possible to run OpenCV and TensorFlow on the latest Raspberry Pis.


Doing Image Recognition

By now, you should know a bit about the theory. So, let’s have a quick look into how we can train the models for our needs.

TL;DR You feed the model quite a few pictures of an object, and a similar number of images without that object.
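As a small illustration of the "similar number" part, here is a sketch that trims two lists of image paths to the same size before training. The `balance_samples` helper and the file names are hypothetical; in practice the lists would come from your own image folders.

```python
import random


def balance_samples(positives, negatives, seed=0):
    """Return equally sized random selections of positive and negative image paths."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    size = min(len(positives), len(negatives))
    return rng.sample(positives, size), rng.sample(negatives, size)


# Hypothetical file names, for illustration only.
with_bird = [f"bird_{i}.jpg" for i in range(120)]
without_bird = [f"empty_feeder_{i}.jpg" for i in range(300)]

pos, neg = balance_samples(with_bird, without_bird)
print(len(pos), len(neg))  # 120 120
```

Throwing away the surplus images is the simplest option. Whether it is the best one depends on how many pictures you have to begin with.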

Say you want to use the model to recognise birds outside your house in Norway. A good starting point would be to get a list of the species you are most likely to see in your backyard and collect as many pictures of each as possible.
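Once the pictures are collected, a common next step (my assumption here, not something the challenge requires) is to keep part of them aside so you can check how well the trained model does on images it has never seen. The sketch below labels each image with its species and splits the set per species. The `split_by_species` helper, the species picked, and the file names are illustrative only.

```python
import random


def split_by_species(images_by_species, validation_fraction=0.2, seed=0):
    """Split labelled images into training and validation sets, species by species."""
    rng = random.Random(seed)
    train, validation = [], []
    for species, paths in images_by_species.items():
        shuffled = list(paths)
        rng.shuffle(shuffled)
        # Keep at least one validation image per species, even for small sets.
        n_val = max(1, int(len(shuffled) * validation_fraction))
        validation += [(path, species) for path in shuffled[:n_val]]
        train += [(path, species) for path in shuffled[n_val:]]
    return train, validation


# Hypothetical collection: species name -> image files gathered for it.
images_by_species = {
    "great tit": [f"great_tit_{i}.jpg" for i in range(50)],
    "blue tit": [f"blue_tit_{i}.jpg" for i in range(40)],
    "bullfinch": [f"bullfinch_{i}.jpg" for i in range(10)],
}

train, validation = split_by_species(images_by_species)
print(len(train), len(validation))  # 80 20
```

Splitting per species, rather than over the whole pile at once, makes sure that even the species with few pictures ends up in both sets.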


Your trained model in action.

Here is what you can do:

  1. Start by looking at Wikipedia for a list of Norwegian birds, or at the Norwegian Encyclopaedia (in Norwegian).
  2. Search the web for images of each bird species. You might want to automate that task, and make sure you pick images with the right copyright permissions.
  3. Use those images to train your model.
  4. Set up your Raspberry Pi with a camera and the bird feeder, and get ready to identify. You might want to optimise the software not to do the image classification all the time, but only when movement is detected.
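The "only when movement is detected" idea from step 4 can be sketched with simple frame differencing: compare each frame with the previous one and run the classifier only if enough pixels have changed. This is a plain-Python illustration under a few assumptions: frames are grayscale images stored as lists of rows, the thresholds are arbitrary, and `classify` is a placeholder standing in for your trained model. On a real Raspberry Pi you would read frames from the camera and most likely use a library such as OpenCV for the comparison.

```python
def motion_detected(previous, current, pixel_threshold=25, changed_fraction=0.02):
    """Return True if enough pixels differ between two grayscale frames."""
    total = 0
    changed = 0
    for prev_row, cur_row in zip(previous, current):
        for prev_px, cur_px in zip(prev_row, cur_row):
            total += 1
            if abs(prev_px - cur_px) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total >= changed_fraction


def classify(frame):
    """Placeholder for the trained model; a real system would run inference here."""
    return "unknown bird"


def process_stream(frames):
    """Classify only the frames that differ noticeably from the previous frame."""
    sightings = []
    previous = None
    for index, frame in enumerate(frames):
        if previous is not None and motion_detected(previous, frame):
            sightings.append((index, classify(frame)))
        previous = frame
    return sightings


# Toy 8x8 frames: an empty feeder, and the same scene with a bright 3x3 "bird".
empty = [[10] * 8 for _ in range(8)]
bird = [row[:] for row in empty]
for y in range(2, 5):
    for x in range(2, 5):
        bird[y][x] = 200

frames = [empty, empty, bird, bird, empty]
print(process_stream(frames))  # [(2, 'unknown bird'), (4, 'unknown bird')]
```

Note that the classifier is also triggered at frame 4, when the bird leaves, since that is a change too. A real setup would probably want to compare against a background image or ignore low-confidence results.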

Pro tip: It can be a bit challenging to take a good picture of our feathered friends, so make sure your camera is well placed and the feeder is in a well-lit location. Obviously, do this without disturbing the wildlife.

In case you wonder whether similar systems have been implemented, or whether it is even possible: the answer is yes. Here are some links to inspire further reading:

Now, try putting it all together, and let me know how it goes!

Good luck!


Rustam Mehmandarov


Passionate Computer Scientist