Tuesday, March 21, 2006

Managing Pictures

Went to Hampstead Heath with Besim for some finger-freezing photography. In the pitch dark. We looked for the parrots, but in vain; however, we came up with some nice ideas for photo software. Current problems with photo software:

  • multiple upload with metadata
  • keeping metadata with photos
  • retaining ordering of photos in an album
  • tagging photos + searching
Basically, for all its spangliness, Google Picasa is nearly useless for photographers, as you have to type in the tags/keywords by hand. I want a program on my PC which will auto-classify my images (at least to some extent).

I reckon I can see a way, using Flickr, to bring all of these things together. Originally, I was going to use Google Images, but there is no Image API, so screw them. Flickr has a nice clean XML-based API. I see the full application interactions looking like this:

  1. Use desktop app to sort photos into albums.
  2. Desktop app scans Flickr for photos...
  3. ...and uses their metadata to suggest tags...
  4. making use of a feature-recognition algorithm.
  5. User then submits batch of photos to Flickr with metadata.
  6. Tag-suggestion gets better over time as learning data improves.
  7. User can build external photo site from Flickr RSS feeds.
  8. Synch tags, albums, photos between the desktop app and Flickr.
Basically, you get Flickr (and its users) to do most of the work of 'identifying' images, by using RSS-feed-by-tag data. All the neural network has to do is to generate some kind of similarity metric for the photos, rip the tags from the best matches, and use those for the suggestions.
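
To make the "rip the tags from the best matches" idea a bit more concrete, here is a rough sketch in Python. It assumes each photo has already been boiled down to a list of numbers (a feature vector) by whatever feature-recognition step ends up being used, and it swaps in plain cosine similarity as a stand-in for the neural network's similarity metric. The function names, the `library` data and the `k`/`max_tags` parameters are all made up for illustration; nothing here talks to Flickr.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_tags(query_features, tagged_photos, k=3, max_tags=5):
    """Suggest tags for a new photo from its k most similar tagged photos.

    tagged_photos is a list of (feature_vector, tags) pairs, e.g. built from
    photos already pulled down by tag. Each tag is scored by summing the
    similarity of every best-match photo that carries it.
    """
    best_matches = sorted(
        ((cosine_similarity(query_features, feats), tags)
         for feats, tags in tagged_photos),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]

    votes = defaultdict(float)
    for score, tags in best_matches:
        for tag in tags:
            votes[tag] += score

    # Highest score first; ties broken alphabetically so the output is stable.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:max_tags]]


# Hypothetical learning data: (feature vector, tags) for already-tagged photos.
library = [
    ([0.9, 0.1, 0.0], ["parrot", "bird"]),
    ([0.8, 0.2, 0.1], ["bird", "heath"]),
    ([0.0, 0.1, 0.9], ["night", "london"]),
]

print(suggest_tags([1.0, 0.1, 0.0], library, k=2))
```

Weighting each tag by similarity means a tag shared by several close matches floats to the top, which is roughly the behaviour you would want from the suggestions as the learning data grows.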

In the future, you could auto-submit full-size images - with metadata - to stock photo libraries, having tagged them beforehand. In fact, basing tags on existing images could well be a very good thing, as your image will be placed alongside other similar images automagically.

There is quite some mileage in this, and I suspect that most of the code is already written. The magic would be in the glue.


  • Absolutely, this builds on what we talked about a while ago. I think you also need to consider the 'Wikipedia problem': you must incorporate some way to validate or assess a tag's relevance. Of course, we all know who's good at relevance classification, don't we :)


    By Blogger Pondskater, at 11:08 am  

  • Yes, I agree, there is a possibility of keyword pollution here. However, I think I have a solution involving a combination of central server + desktop processing.
    Basically, a user would be able to tweak the accuracy of the keyword suggestion by doing a "pick n mix" from the suggestions from their own machine and those from a central server, which would collate image/keyword metrics from any user who wants to use it.

    By Blogger Skellywag, at 3:25 pm  
