
Interesting times in image processing

You’ve all seen the awesome Photosynth demo on TED. (If you haven’t, do.)

There are several interesting things in it. I especially liked the infinite-resolution Seadragon demo, with its startling claim that the only thing that should limit the speed of the application is the number of pixels being thrown around on-screen, not the size of the underlying image.
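I don’t know how Seadragon is actually built, but one common way to get that property is a tiled image pyramid: the image is stored at successively halved resolutions, each cut into fixed-size tiles, and the viewer only fetches the tiles that overlap the screen. Here is a minimal sketch of the arithmetic, assuming a 256-pixel tile size; the function names are my own illustrations, not anything from Seadragon.

```python
import math

TILE = 256  # assumed tile edge in pixels (hypothetical, not Seadragon's actual value)


def pyramid_levels(width, height):
    """Count the levels needed if each level halves the one before,
    stopping once the whole image fits inside a single tile."""
    levels = 1
    while width > TILE or height > TILE:
        width = math.ceil(width / 2)
        height = math.ceil(height / 2)
        levels += 1
    return levels


def visible_tiles(view_x, view_y, view_w, view_h):
    """Return the (column, row) indices of the tiles overlapping a viewport.

    Coordinates are in pixels at whichever pyramid level is being shown.
    Note that the image's own size never appears in this calculation.
    """
    first_col = view_x // TILE
    last_col = (view_x + view_w - 1) // TILE
    first_row = view_y // TILE
    last_row = (view_y + view_h - 1) // TILE
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# A 1024x768 window needs a dozen or so tiles however large the image is.
print(len(visible_tiles(0, 0, 1024, 768)))
print(pyramid_levels(1_000_000, 1_000_000))
```

The number of tiles depends only on the window size and where it sits on the tile grid, while the number of levels grows only logarithmically with the image size, which is at least consistent with the claim in the demo.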

But the star of the show is arguably the ability to recognise features in photos, in order to composite them together intelligently. I presume the same technology would allow the computer to recognise known features or places, adding a semantic layer to images that is currently absent. Imagine having your holiday snaps automatically tagged with the correct place names and landmarks and, from those, geodata.

Simultaneously, lots of companies (and certainly lots of government agencies) are working on facial recognition. You can already use Riya to search and tag your photo collection.

I expect this to be offered by Picasa, Flickr, iPhoto etc. within two years or so, whether they buy the technology or develop it from scratch. And I certainly expect it to help power Google searches within the same timeframe. (In the meantime, they’re building up a semantic layer around photos by other means, e.g. the delightful Image Labeler.)

I’ll leave it for a different post (or commenters) to explore the implications for privacy.

Detecting faces (though not identifying whose they are) in photos is already becoming common in cameras, e.g. my Canon Ixus 70. In tests, it picked out the faces of sandstone angels in a cemetery, and in the office we were able to draw rudimentary faces on paper that the camera recognised.

Riya have also extended their technology into shopping: their proof-of-concept, like.com, lets you search for shoes, handbags, clothes etc. on the basis of “likeness”. I don’t think it’s a silver bullet for the online shopping experience, but it’s certainly valuable (to the user, and to them as a business).

Here are a few more interesting things. Given sufficient processing power and enough photos (and neither is hard to come by these days), you can perform what seems like magic.

  • Scene completion using millions of photographs
    “The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid.”
  • Content-Aware Image Sizing
    “It demonstrates a software application that resizes images in such a way that the content of the image is preserved intelligently.” Has to be seen to be believed.
  • Reconstructing 3D models from 2D photographs, e.g. Fotowoosh

These are all things that our brains are capable of doing without thinking, but we are gradually developing the processing power, the visual memory (a repository of images), and the clever algorithms to make them possible for computers too.
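To give a flavour of what one of those clever algorithms can look like: content-aware resizing of the kind listed above is often done with a technique called seam carving, which repeatedly removes the connected top-to-bottom path of pixels that the eye is least likely to miss. I’m assuming the demo works roughly along these lines. What follows is my own toy sketch on a tiny greyscale grid, not the demo’s actual code, and the function names are made up for illustration.

```python
def energy(img):
    """Crude 'importance' map: how much each pixel differs from its neighbours."""
    h, w = len(img), len(img[0])
    result = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            result[y][x] = abs(right - left) + abs(down - up)
    return result


def find_vertical_seam(e):
    """Dynamic programming: the cheapest top-to-bottom path, moving at most
    one column left or right per row. Returns one column index per row."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Walk back up from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img):
    """Return a copy of img that is one pixel narrower."""
    seam = find_vertical_seam(energy(img))
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


# A dull background (0) with a bright two-pixel-wide stripe (9) down the middle.
picture = [[0, 0, 0, 9, 9, 0, 0, 0] for _ in range(4)]
narrower = remove_vertical_seam(picture)
print(narrower[0])
```

Each call makes the picture one pixel narrower by removing a strip of dull background, so the stripe survives intact. A real implementation would work on colour images and a better energy measure, but the idea is the same.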