
Hierarchical Keyword Madness in Digital Photos

Thursday, September 4th, 2008

Adobe, Apple, Featured, Google, Microsoft, Tech

Context

Digital Asset Management (DAM) is becoming a hot topic these days. When you have thousands of photos on your system and you want to tag your memories, you want to do it only once, as it is quite a tedious job.

But you do not necessarily want a large flat bunch of tags, for the same reason that having a lot of sibling folders on your system does not help. More likely, you will want to nest these tags in an orderly manner, as you would nest folders.

For instance, instead of having your photos tagged as « Maricler, California, San Francisco, trees, streets and sidewalks », you may prefer to have:

people
   Maricler
USA
   California
      San Francisco
      Los Angeles
      Santa Barbara
nature
   trees
city
   streets and sidewalks
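
For readers who think in code, the tree above can be sketched as a nested dictionary. This is only an illustration: the names `KEYWORDS` and `leaf_paths` are hypothetical and do not come from any of the tools discussed here.

```python
# Hypothetical nested-dict representation of the keyword tree shown above.
# An empty dict marks a keyword with no children.
KEYWORDS = {
    "people": {"Maricler": {}},
    "USA": {
        "California": {
            "San Francisco": {},
            "Los Angeles": {},
            "Santa Barbara": {},
        }
    },
    "nature": {"trees": {}},
    "city": {"streets and sidewalks": {}},
}


def leaf_paths(tree, prefix=()):
    """Yield one tuple per leaf keyword, with its ancestors listed first."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


paths = list(leaf_paths(KEYWORDS))
for path in paths:
    print(" > ".join(path))
```

Each printed line is the full path to one tag, which is exactly the information a flat list of tags loses.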

 

Tools

In the realm of digital photos, some popular applications that enable photo tagging are:

Picasa 2.7
Picasa 3 (beta)
Microsoft Windows Live Photo Gallery
Microsoft Pro Photo Tools
Microsoft Expression Media 2
Adobe Photoshop Elements (versions 5, 6 and probably 7)
Adobe Photoshop Lightroom 2

But now, read this carefully:

Your tagging system will not survive tools that use proprietary catalogs to store your tags.

As I said, you want to tag your thousands of photos only once. Fortunately, standards exist that let you embed these tags directly into your JPEG images. You may not have noticed, but a lot of information is already stored in your photo files. For example, your digital camera writes this kind of information into your JPEG files (extract only):

<exif:ExifVersion>0220</exif:ExifVersion>
<exif:ExposureTime>1/320</exif:ExposureTime>
<exif:ShutterSpeedValue>8321928/1000000</exif:ShutterSpeedValue>
<exif:FNumber>56/10</exif:FNumber>
<exif:ApertureValue>4970854/1000000</exif:ApertureValue>
<exif:DateTimeOriginal>2007-10-06T18:49:18-04:00</exif:DateTimeOriginal>
<exif:DateTimeDigitized>2007-10-06T18:49:18-04:00</exif:DateTimeDigitized>
...
</exif:Flash>
<exif:FlashpixVersion>0100</exif:FlashpixVersion>
<exif:ColorSpace>1</exif:ColorSpace>
<exif:ComponentsConfiguration>
<rdf:Seq>
   <rdf:li>1</rdf:li>
   <rdf:li>2</rdf:li>
   <rdf:li>3</rdf:li>
   <rdf:li>0</rdf:li>
</rdf:Seq>
</exif:ComponentsConfiguration>
<exif:CompressedBitsPerPixel>5/1</exif:CompressedBitsPerPixel>

 

Along with this kind of information, your tags can be embedded in a similar manner.
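
Before moving on to tags, a side note on reading the camera values above: they are stored as fractions, and two of them use the APEX scale. The sketch below parses them with Python's standard `fractions` module; the `exif` dictionary and the `rational` helper are hypothetical names, and the values are simply copied from the extract.

```python
from fractions import Fraction

# Values copied from the EXIF extract above.
exif = {
    "ExposureTime": "1/320",
    "ShutterSpeedValue": "8321928/1000000",
    "FNumber": "56/10",
    "ApertureValue": "4970854/1000000",
}


def rational(text):
    """Parse an EXIF-style 'numerator/denominator' string."""
    return Fraction(text)


values = {name: rational(text) for name, text in exif.items()}

exposure_s = float(values["ExposureTime"])  # exposure time in seconds
f_number = float(values["FNumber"])         # aperture as an f-number

# APEX conversions: exposure time = 2 ** -Tv, f-number = 2 ** (Av / 2).
exposure_from_apex = 2 ** -float(values["ShutterSpeedValue"])
f_number_from_apex = 2 ** (float(values["ApertureValue"]) / 2)

print(exposure_s, exposure_from_apex)
print(f_number, f_number_from_apex)
```

The two lines printed should each show a pair of nearly identical numbers, since the camera records the same setting twice in different units.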

If you just want a bunch (or bag) of tags, you are probably fine. Problems start when you want hierarchical tags!

Hierarchical tags

Only semi-standards appear to exist for preserving hierarchical tags. For instance, Microsoft is not even consistent with itself: Expression Media 2 uses the symbol « | » as a tag separator to reflect hierarchy, whereas the popular Windows Live Photo Gallery uses « / » (…). Adobe Lightroom uses the « | » symbol by default.

Just for fun, look at this short extract from the metadata section embedded in a sample picture (original tags in French):

<rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/">
 <dc:title>
    <rdf:Alt>
       <rdf:li xml:lang="x-default">Maricler sur Haight Street</rdf:li>
    </rdf:Alt>
 </dc:title>
 <dc:subject>
    <rdf:Bag>
       <rdf:li>Californie</rdf:li>
       <rdf:li>Maricler</rdf:li>
       <rdf:li>San Francisco</rdf:li>
       <rdf:li>USA</rdf:li>
       <rdf:li>arbres</rdf:li>
       <rdf:li>lieux</rdf:li>
       <rdf:li>nature</rdf:li>
       <rdf:li>pays</rdf:li>
       <rdf:li>personnes</rdf:li>
       <rdf:li>rues et trottoirs</rdf:li>
       <rdf:li>règne végétal</rdf:li>
       <rdf:li>thèmes</rdf:li>
       <rdf:li>urbains</rdf:li>
    </rdf:Bag>
 </dc:subject>
 <dc:description>

<rdf:Description rdf:about=""
    xmlns:MicrosoftPhoto="http://ns.microsoft.com/photo/1.0">
 <MicrosoftPhoto:LastKeywordXMP>
    <rdf:Bag>
       <rdf:li>personnes/Maricler</rdf:li>
       <rdf:li>lieux/pays/USA/Californie/San Francisco</rdf:li>
       <rdf:li>thèmes/nature/règne végétal/arbres</rdf:li>
       <rdf:li>thèmes/lieux/urbains/rues et trottoirs</rdf:li>
    </rdf:Bag>
 </MicrosoftPhoto:LastKeywordXMP>
</rdf:Description>

<rdf:Description rdf:about=""
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/">
 <lr:hierarchicalSubject>
    <rdf:Bag>
       <rdf:li>lieux|pays|USA|Californie|San Francisco</rdf:li>
       <rdf:li>personnes|Maricler</rdf:li>
       <rdf:li>thèmes|lieux|urbains|rues et trottoirs</rdf:li>
       <rdf:li>thèmes|nature|règne végétal|arbres</rdf:li>
    </rdf:Bag>
 </lr:hierarchicalSubject>
</rdf:Description>

 

For those of you with a coder’s mindset, it is not too hard to see that this metadata is a complete mess. For instance, the last section of this extract is written by Adobe Lightroom 2 (despite the « lightroom/1.0 » appearing in the xmlns). When you save the metadata of an image, Lightroom 2.0 puts the above information in your JPEG file. Microsoft Expression Media 2 will certainly understand it and show the tag hierarchy properly, but neither Microsoft Windows Live Photo Gallery nor the quite recent Pro Photo Tools will.
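
To make the mismatch concrete, here is a minimal sketch that reads both hierarchical sections with Python's standard `xml.etree.ElementTree`. It rests on some assumptions: the XML is a trimmed copy of the extract above, the enclosing `rdf:RDF` element is added only so the fragment parses, and the names `NS`, `XMP` and `keyword_paths` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the extract; the rdf URI is the
# standard RDF syntax namespace.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "lr": "http://ns.adobe.com/lightroom/1.0/",
    "MicrosoftPhoto": "http://ns.microsoft.com/photo/1.0",
}

# Trimmed copy of the extract. The enclosing rdf:RDF element is an
# assumption, added so that the fragment is well-formed XML.
XMP = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
    xmlns:MicrosoftPhoto="http://ns.microsoft.com/photo/1.0">
 <MicrosoftPhoto:LastKeywordXMP>
    <rdf:Bag>
       <rdf:li>personnes/Maricler</rdf:li>
       <rdf:li>lieux/pays/USA/Californie/San Francisco</rdf:li>
       <rdf:li>thèmes/nature/règne végétal/arbres</rdf:li>
       <rdf:li>thèmes/lieux/urbains/rues et trottoirs</rdf:li>
    </rdf:Bag>
 </MicrosoftPhoto:LastKeywordXMP>
</rdf:Description>
<rdf:Description rdf:about=""
    xmlns:lr="http://ns.adobe.com/lightroom/1.0/">
 <lr:hierarchicalSubject>
    <rdf:Bag>
       <rdf:li>lieux|pays|USA|Californie|San Francisco</rdf:li>
       <rdf:li>personnes|Maricler</rdf:li>
       <rdf:li>thèmes|lieux|urbains|rues et trottoirs</rdf:li>
       <rdf:li>thèmes|nature|règne végétal|arbres</rdf:li>
    </rdf:Bag>
 </lr:hierarchicalSubject>
</rdf:Description>
</rdf:RDF>"""


def keyword_paths(root, tag, sep):
    """Return the set of keyword paths stored under `tag`, split on `sep`."""
    items = root.findall(f".//{tag}/rdf:Bag/rdf:li", NS)
    return {tuple(item.text.split(sep)) for item in items}


root = ET.fromstring(XMP)
ms = keyword_paths(root, "MicrosoftPhoto:LastKeywordXMP", "/")
lr = keyword_paths(root, "lr:hierarchicalSubject", "|")
print(ms == lr)
```

Once each section is split on its own separator, the two sets of paths can be compared directly. The same hierarchy is stored twice, under two property names and two separators, and an application that knows only one of them sees no hierarchy at all.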

In fact, Lightroom 2.0 also writes a flat bag of those tags (without hierarchical information) for the rest of the universe, which does not see the symbol « | » as a keyword (tag) separator.
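
The flat bag can be reproduced from the hierarchical entries by collecting every level of every path. The sketch below is not Lightroom's actual code, only a guess that happens to match the extract above; `flat_keywords` is a hypothetical name.

```python
# Hierarchical entries copied from the lr:hierarchicalSubject section above.
hierarchical = [
    "lieux|pays|USA|Californie|San Francisco",
    "personnes|Maricler",
    "thèmes|lieux|urbains|rues et trottoirs",
    "thèmes|nature|règne végétal|arbres",
]


def flat_keywords(paths, sep="|"):
    """Collect every level of every path into one flat, de-duplicated set."""
    return {part for path in paths for part in path.split(sep)}


flat = flat_keywords(hierarchical)
print(sorted(flat))
```

The result holds the same thirteen keywords as the dc:subject bag in the extract. Note that « lieux » appears at two different levels of the hierarchy but only once in the flat set, which is precisely the information that gets lost.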

On the other hand, Picasa 3 seems to be agnostic in this regard. It has a new feature hidden in the menu “Tools/Experimental/Show tag as album…” that will show “tag/subtag1/sub-subtag2” as well as “tag|subtag1/sub-subtag2” without any complaint, but not in a really useful manner.
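
A tolerant reader could accept both separators at once. The sketch below shows one way to do that; it is not how Picasa works, and the names `SEPARATORS`, `split_keyword` and `normalize` are hypothetical. It also assumes that no keyword legitimately contains « | » or « / », which will not hold for every collection.

```python
import re

# Separators seen in the extracts above: "|" (Lightroom, Expression Media)
# and "/" (Windows Live Photo Gallery).
SEPARATORS = re.compile(r"[|/]")


def split_keyword(path):
    """Split a hierarchical keyword on either separator, dropping empty parts."""
    return [part.strip() for part in SEPARATORS.split(path) if part.strip()]


def normalize(path, sep="|"):
    """Rewrite a hierarchical keyword so that it uses a single separator."""
    return sep.join(split_keyword(path))


print(split_keyword("tag|subtag1/sub-subtag2"))
print(normalize("tag/subtag1/sub-subtag2"))
```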

Why fuss about this matter?

In our family, we have two desktops and one laptop (all PCs) linked to an Apple Time Capsule used as network-attached storage (NAS); it works well, by the way!

Should my wife decide to tag a photo, I want to see that change on my system or on the laptop. But if our tagging system depends on a proprietary catalog, I will not be able to see the change unless there is a way to share that catalog, which is quite difficult with current applications. If, on the other hand, the tags are written inside the image file, then everything is fine: the information for that photo (tags, caption, title, author, etc.) travels with it, whatever application we use.

That’s why Digital Asset Management (DAM) is important. You do the work once for a given photo and that’s all.

In the meantime, the big players in the field will have to talk to each other and get their acts together.

If you have had to battle with this tag madness and have any recommendations, do not hesitate to leave a comment, as I am still undecided on this matter!

 


This post was written by:

zakoops - who has written 9 posts on dharma blues.



9 Comments For This Post

  1. Fieryneck Says:

    Great article! I have been battling with this for some time now…

    I just wish I could find some free (or cheap) hierarchical tagging software that saves the tags in the photo files.

    Thanks for the article…

  2. zakoops Says:

    Fieryneck,

    Since I wrote this article, I’ve been informed of The Metadata Working Group which is a consortium of some companies like Adobe, Microsoft, Canon, Nokia and Sony.

    The main purpose of this group is to clean up the mess mentioned in my post. To that end, a guideline was published in September 2008. Alas, at the end of page 32 of this guideline, it says:

    « Hierarchical keywords are not covered. However it’s well understood that this is an important use case even in the context of the consumer and will be added to future versions of this document. There are existing solutions available e.g. Adobe Bridge, Adobe Lightroom as well as Microsoft Expression Media and Windows Live Photo Gallery that have introduced hierarchical keyword workflows specific to their needs. »

    I guess it will take some months before they act on this problem and set a standard for hierarchical keywords.

    In the meantime, if you’re looking for a free solution and you’re on a Windows box, I would go for Windows Live Photo Gallery, where hierarchical keywords are neatly implemented!

    Thanks for your comments!

  3. Tim Says:

    Well said.

    I’ve been using Photoshop Elements Organizer with some success, but fear the problem that you cite: It doesn’t store the hierarchical tags in the photo itself.

    Further, I run Linux, so finding exactly what I want is often difficult (that’s not to say that there’s not great stuff on Linux, au contraire, just not for this particular use-case).

    Keep us all posted!

    -Tim

  4. Sherwood Botsford Says:

    Part of the solution is decent import/export tools. If there is a means of exporting the metadata in a standard format, then you have a chance of pulling that data out, beating up on it to get it into a different form, and shoving it back in.

    Further: Pulling the data out of the file is slow, especially compared to a database. The ideal solution is to have belt and suspenders: It’s in a database for fast access, it’s embedded in the picture tags for robustness. So you work with your images in the database, but your software has a separate application that can run in background or as a service/daemon that syncs the database and the pictures.

    Good DAM needs a couple other features:

    1. If I copy an image, the information gets copied with it. With embedded tags, this happens automatically.

    2. If I edit an image, the system needs to figure out what was done.
    * Adjusting tonal values — processing tag changes, subject tags don’t change.
    * Resize. Dimension tags, and possibly suitability tags, change.
    * Light Crop — Amber warning. Subject tags may change.
    * Heavy Crop — Red warning. Subject tags almost certain to change.

    For these warning situations, the two images should be presented to the editor, who is asked to verify the tags.

    3. External programs. Dealing with other programs that can rummage through the same filing system is tricky. This is one reason that embedded tags are preferred. Some external programs play fast and loose with metadata. Cautious testing required.

    A valuable feature in a DAM program is the ability to copy/paste metadata. Some metadata you want to apply to a group of pix. E.g., you want to say that ALL of this folder should be tagged “Yugoslavia 2008”. Or if you used Photoshop to turn a color print into a black and white pic, you want to copy most of the metadata from the original to the copy.

    Good DAM systems allow flexible keyword=value tags. For example, you may require that every photo has a value for the keyword “Location”. This forces you to explicitly say that it is ‘unknown’ if you don’t know. With GPS-enabled cameras this will be increasingly rare, but it will still be necessary to translate 48.999932 -115.334821 to “Grandmother’s farm”.

    If you run a tree farm (as I do), you want a keyword “species” for a lot of your pictures. It should be able to take multiple values (if you have a mixed group of spruce and fir…).

    I’m finding that the more metadata I can assign to a photo, the more useful the photo becomes. If a picture is worth a thousand words, then a picture with metadata is worth a million.

  5. zakoops Says:

    Thank you for your comments.

    Very constructive!

    Since my post, I’m now using Adobe Lightroom 2 which allows for a nice balance between embedded metadata and an external database.

  6. Hui Munda Says:

    What theme is this webpage using? I know it’s built on a website engine, but I have never seen this palette before.

  7. zakoops Says:

    It’s a WordPress theme called “Fresh News”, still available at the WooThemes web site.

    But you need your own server to use such a theme. I do not think you could use it on LiveJournal!

  8. Halt Seven Says:

    Thanks for covering this topic so well. It’s very confusing. I had been using Windows Live Gallery on my PC and it was, frankly, the best tool I’ve encountered for hierarchical keywording - very swift and intuitive.

    Unfortunately, I’ve switched over to a Mac (actually, in the big picture that’s a huge blessing, but my only regret is that I can’t use Windows Live Gallery on it). I’m presently using Adobe Bridge to tag my photos because it embeds keywords into the metadata. iPhoto only uses a proprietary database (though you have the choice to export the keywords with photos), and it is very difficult to maintain a hierarchical structure in it. Bridge works okay, but is not nearly as flexible as Windows Live Gallery - if you want to change a keyword from, say, “Chicagi” to “Chicago,” it literally only changes the photo you have selected; it will not automatically change all instances.

    I hope you keep posting about this topic - you clearly understand the issues!

    Thanks!

  9. zakoops Says:

    To Halt Seven:

    Thank you for your constructive comments. They remind me that things always go slowly in any standardization process, as my article was written nearly 2 years ago and nothing has really changed since then (sigh).

    Since then, we also have a big iMac and, if I can find the time, the last Windows box will also be replaced by another iMac (that’s the reason Windows Live Gallery was dropped a few months ago). That also makes your comments more relevant to our immediate needs.

    One license of Adobe’s Lightroom lets you have both a Windows and a Mac version (provided they are not used simultaneously): you download a PC version and then you download a Mac version for the same license key. That is why we now use Lightroom. We discovered that keywords (even hierarchical ones) travel well between both platforms! And Lightroom can write to the metadata section of a photo file (menu « Save Metadata to File »), not just to its internal database.

    That’s where we stand for now, but when time is available (time again…) I will take note of your experiments on the Mac in your latest articles (Keyword DAM on iPhoto vs. Bridge as well as More on DAM Digital Asset Management, iPhoto and Bridge). Both articles appear to cover every aspect of this Hierarchical Keyword Madness in Digital Photos!

1 Trackback For This Post

  1. More on DAM Digital Asset Management, iPhoto and Bridge | Halt7 Says:

    [...] Dharma Blues gives an excellent synopsis of the reason why DAM is so important, as well as why it’s so troublesome. It’s amazing this topic isn’t being treated more seriously with greater standardization. [...]
