Image titles sometimes import with a lang="x-default" prefix

Submitted by an unknown user

Assigned to Lucas Beeler

Description

---- Reported by shotwell-maint@gnome.bugs 2010-11-07 17:05:00 -0800 ----

Original Redmine bug id: 2773
Original URL: http://redmine.yorba.org/issues/2773
Searchable id: yorba-bug-2773
Original author: Nyall - Original description:

Using shotwell 0.7.2 from the yorba ppa on maverick:

I'm seeing a lot of my photos import with the title field prefixed by the string lang="x-default". Looking at the EXIF information for these photos doesn't show this same prefix:

$ exiftool 20091113_141150.JPG
ExifTool Version Number : 8.15
File Name : 20091113_141150.JPG
…
XMP Toolkit : Image::ExifTool 8.15
Description : Outside the Chester Beatty Library, Dublin.
Title : Chester Beatty
…

I've attached a screenshot showing how this same photo appears in shotwell, as well as the sample image showing this problem.

---- Additional Comments From shotwell-maint@gnome.bugs 2013-05-01 11:39:00 -0700 ----

History

Comment 1

Updated by Jim Nelson about 3 years ago

Priority set to High

Comment 2

Updated by Adam Dingle about 3 years ago

This was also reported on the mailing list, where the reporter said that it occurs whenever the name includes non-ASCII characters:

http://lists.yorba.org/pipermail/shotwell/2010-November/001248.html

Comment 3

Updated by Adam Dingle almost 3 years ago

Status changed from Open to Review
Assignee changed from Anonymous to Lucas Beeler

Comment 4

Updated by Adam Dingle almost 3 years ago

Priority deleted (High)

Comment 5

Updated by Adam Dingle almost 3 years ago

Note Lucas's research on this, which I'm quoting here:

2773 occurs because a simple text node isn't the only possible child of the dc:description and dc:title nodes defined by the XML grammar for RDF/XMP. These XML elements can have simple text children, as shown below:

<dc:description>Forget the big bang approach. When it comes to demonstrating the value of knowledge management, a piecemeal strategy works best.</dc:description>

Or they can have a so-called "LangAlt" element as their child. The LangAlt element allows different title/description values to be expressed for different languages. The LangAlt element looks like this:

<dc:title>
  <rdf:Alt>
    <rdf:li xml:lang="de-DE">Sonnenuntergang am Strand</rdf:li>
    <rdf:li xml:lang="en-US">Sunset on the beach</rdf:li>
  </rdf:Alt>
</dc:title>

In this case, the LangAlt element carries strings in both English and German.
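
To make the structure concrete, here is a minimal Python sketch that reads the per-language strings out of a LangAlt element using only the standard library. It is an illustration, not Shotwell's code: the function name `lang_alt_titles` is hypothetical, and the `rdf:Description` wrapper with its namespace declarations is added only so the fragment parses on its own (the namespace URIs are the standard RDF and Dublin Core ones).

```python
import xml.etree.ElementTree as ET

# Standard RDF and Dublin Core namespace URIs.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The LangAlt example from above, wrapped so it is a parseable document.
SAMPLE = """<rdf:Description
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>
    <rdf:Alt>
      <rdf:li xml:lang="de-DE">Sonnenuntergang am Strand</rdf:li>
      <rdf:li xml:lang="en-US">Sunset on the beach</rdf:li>
    </rdf:Alt>
  </dc:title>
</rdf:Description>"""


def lang_alt_titles(xml_text):
    """Return a dict mapping language code to title (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    titles = {}
    for li in root.findall("dc:title/rdf:Alt/rdf:li", NS):
        # Assumption: entries without xml:lang are treated as x-default.
        titles[li.get(XML_LANG, "x-default")] = (li.text or "").strip()
    return titles
```

Calling `lang_alt_titles(SAMPLE)` yields one entry per `rdf:li`, keyed by its `xml:lang` value.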

Of course, we don't deal with XMP directly; our XMP handling is entirely mediated by Exiv2. Running the exiv2 command-line utility on a photo with XMP description or title data wrapped in a LangAlt element yields output that looks like this:

Xmp.dc.title LangAlt 2 lang="de-DE" Sonnenuntergang am Strand, lang="en-US" Sunset on the beach

Or, in the case of the bad photo that the bug-reporting user contributed, like this:

Xmp.dc.title LangAlt 1 lang="x-default" Chester Beatty

The point is that, in this case, the actual value of Xmp.dc.title isn't text but a comma-delimited list of strings in various languages.
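
As a rough illustration of what "properly parsing" that list could involve, the Python sketch below splits a value of the form shown above into (language, text) pairs. It assumes the value always looks like `lang="CODE" text, lang="CODE" text`; the function name `parse_lang_alt` is hypothetical and the pattern is not taken from Exiv2 or Shotwell. A lookahead is used instead of a plain split on commas because a title may itself contain a comma.

```python
import re

# One entry: lang="CODE", whitespace, then text up to the next
# ', lang="' separator or the end of the string.
_ENTRY = re.compile(r'lang="([^"]*)"\s+(.*?)(?=,\s*lang="|$)')


def parse_lang_alt(value):
    """Split an Exiv2-style LangAlt string into (language, text) pairs.

    Hypothetical helper; assumes the textual format shown above.
    """
    return [(lang, text.strip()) for lang, text in _ENTRY.findall(value)]
```

For the two sample values above this gives two pairs and one pair respectively, and a title such as "Dublin, Ireland" stays in one piece.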

So what I'm proposing is this. For 0.8, let's just use regexp match/substitute functions to strip the text lang="x-default" out of the XMP strings we get back from Exiv2. In the future, however, let's properly parse the comma-delimited list and attempt to select an appropriate title given the user's operating system language.

What makes this tricky, however, is what policy to use. For example, consider not XMP titles but XMP keywords. Let's say that there are four keywords: two in English and two in German. Does this photo have four tags, or only the two tags appropriate to the user's operating system language? In the latter case, German users would see only the two German tags and English users would see only the two English tags. While being language-appropriate seems clever, one issue is that users would actually see different sets of tags depending on their operating system language. I don't know the answers to these questions, so what I'm proposing is that we use the quick and dirty regexp-stripping solution on the first title in the LangAlt list for 0.8, then develop a policy and an implementation of that policy for a later release.
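
The quick fix described here might look something like the following Python sketch (Shotwell itself is written in Vala, so this only illustrates the idea). Both function names and both patterns are hypothetical: the first strips a leading lang="..." qualifier, the second additionally cuts the string at the next entry so that only the first title in the list survives.

```python
import re

# A leading lang="..." qualifier, e.g. lang="x-default".
_LANG_PREFIX = re.compile(r'^\s*lang="[^"]*"\s*')
# The start of any following entry in the comma-delimited list.
_NEXT_ENTRY = re.compile(r',\s*lang="[^"]*"\s')


def strip_lang_prefix(value):
    """Remove a leading lang="..." qualifier; plain strings pass through."""
    return _LANG_PREFIX.sub("", value, count=1)


def first_title(value):
    """Return only the first title of a LangAlt-style string."""
    stripped = strip_lang_prefix(value)
    return _NEXT_ENTRY.split(stripped, maxsplit=1)[0].strip()
```

With the reporter's value, `first_title('lang="x-default" Chester Beatty')` returns the bare title, and a string with no qualifier is returned unchanged.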

Comment 6

Updated by Adam Dingle almost 3 years ago

The quick and dirty regexp fix sounds good to me for 0.8.

Comment 7

Updated by Jim Nelson almost 3 years ago

I just sent an email discussing a helper class in Exiv2 for exactly this problem: http://www.exiv2.org/doc/classExiv2_1_1LangAltValue.html

Comment 8

Updated by Jim Nelson almost 3 years ago

Status changed from Review to 5
Resolution set to fixed
% Done changed from 0 to 100

It turns out we don't need to modify gexiv2 for this problem. gexiv2 already has a function to return multiple string values for a particular tag, which works with alt lang values in XMP. The language code is not returned, so one cannot be picked, but for now we were planning just on using the first one anyway, so it all works out.

It still might be useful to be able to find a string by language code: #2964 (closed)
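
For illustration only, one possible lookup-by-language policy is sketched below in Python. Nothing here reflects a decision made in this ticket or in #2964: the function name `pick_title`, the fallback order (exact match, same primary language, x-default, first entry) and the input shape (a dict of language code to string) are all assumptions.

```python
def pick_title(titles, preferred_lang):
    """Pick one string from a {language code: text} dict.

    Hypothetical policy: exact match, then same primary language,
    then x-default, then whatever entry comes first.
    """
    if not titles:
        return None
    # Accept both "en_US" and "en-US" spellings, case-insensitively.
    wanted = preferred_lang.replace("_", "-").lower()
    by_lower = {code.lower(): text for code, text in titles.items()}
    if wanted in by_lower:
        return by_lower[wanted]
    primary = wanted.split("-")[0]
    for code, text in by_lower.items():
        if code.split("-")[0] == primary:
            return text
    if "x-default" in by_lower:
        return by_lower["x-default"]
    return next(iter(titles.values()))
```

Falling back to the first entry matches the behaviour described above, where the first returned value is used when no language can be chosen.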

r2473

Comment 9

Updated by Charles Lindsay 7 months ago

Status changed from 5 to Fixed

--- Bug imported by chaz@yorba.org 2013-11-25 21:48 UTC ---

This bug was previously known as bug 2773 at http://redmine.yorba.org/show_bug.cgi?id=2773
Imported an attachment (id=261880)
Imported an attachment (id=261881)

Unknown Component Using default product and component set in Parameters Unknown version " in product shotwell. Setting version to "!unspecified". Unknown milestone "unknown in product shotwell. Setting to default milestone for this product, "---". Setting qa contact to the default for this product. This bug either had no qa contact or an invalid one.

Resolution: RESOLVED FIXED