Image titles sometimes import with a lang="x-default" prefix
Submitted by an unknown user
Assigned to Lucas Beeler
Link to original bug (#717073)
Description
---- Reported by shotwell-maint@gnome.bugs 2010-11-07 17:05:00 -0800 ----
Original Redmine bug id: 2773
Original URL: http://redmine.yorba.org/issues/2773
Searchable id: yorba-bug-2773
Original author: Nyall -
Original description:
Using shotwell 0.7.2 from the yorba ppa on maverick:
I'm seeing a lot of my photos import with the title field prefixed by the string lang="x-default". Looking at the exif information for these photos doesn't show this same prefix:
$ exiftool 20091113_141150.JPG
ExifTool Version Number : 8.15
File Name : 20091113_141150.JPG
…
XMP Toolkit : Image::ExifTool 8.15
Description : Outside the Chester Beatty Library, Dublin.
Title : Chester Beatty
…
I've attached a screenshot showing how this same photo appears in shotwell, as well as the sample image showing this problem.
---- Additional Comments From shotwell-maint@gnome.bugs 2013-05-01 11:39:00 -0700 ----
History
Comment 1
Updated by Jim Nelson about 3 years ago
- Priority set to High
Comment 2
Updated by Adam Dingle about 3 years ago
This was also reported on the mailing list, where the reporter said that it occurs whenever the name includes non-ASCII characters:
http://lists.yorba.org/pipermail/shotwell/2010-November/001248.html
Comment 3
Updated by Adam Dingle almost 3 years ago
- Status changed from Open to Review
- Assignee changed from Anonymous to Lucas Beeler
Comment 4
Updated by Adam Dingle almost 3 years ago
- Priority deleted (High)
Comment 5
Updated by Adam Dingle almost 3 years ago
Note Lucas's research on this, which I'm quoting here:
2773 occurs because a simple text node isn't the only possible
child of the dc:description and dc:title nodes defined by the XML
grammar for RDF/XMP. These XML elements can have simple text children,
as shown below:
<dc:description>Forget the big bang approach. When it
comes to demonstrating the value
of knowledge management, a piecemeal
strategy works best.</dc:description>
Or they can have a so-called "LangAlt" element as their child.
The LangAlt element allows different title/description values to be
expressed for different languages. The LangAlt element looks like
this:
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="de-DE">Sonnenuntergang am Strand</rdf:li>
<rdf:li xml:lang="en-US">Sunset on the beach</rdf:li>
</rdf:Alt>
</dc:title>
In this case, the LangAlt element carries strings in both English and German.
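To make the two shapes concrete, here is a minimal sketch in Python (not Shotwell's Vala code) that reads either a simple text child or a LangAlt child with the standard library XML parser. The function name read_lang_alt and the SAMPLE constant are hypothetical, and the namespace declarations are added here only so the fragment parses on its own; in a real XMP packet they are declared higher up in the document.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF, Dublin Core, and the built-in xml: prefix.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The LangAlt example from above, with namespace declarations added (assumption)
# so that it can be parsed as a standalone fragment.
SAMPLE = f"""<dc:title xmlns:dc="{DC_NS}" xmlns:rdf="{RDF_NS}">
  <rdf:Alt>
    <rdf:li xml:lang="de-DE">Sonnenuntergang am Strand</rdf:li>
    <rdf:li xml:lang="en-US">Sunset on the beach</rdf:li>
  </rdf:Alt>
</dc:title>"""


def read_lang_alt(xml_text):
    """Return a dict mapping language code to text for one dc:* element.

    A simple text child is returned under the empty-string key.
    """
    root = ET.fromstring(xml_text)
    items = root.findall(f"{{{RDF_NS}}}Alt/{{{RDF_NS}}}li")
    if not items:
        # No rdf:Alt child: the element carries plain text.
        return {"": (root.text or "").strip()}
    return {li.get(XML_LANG, ""): (li.text or "").strip() for li in items}


if __name__ == "__main__":
    print(read_lang_alt(SAMPLE))
```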
Of course, we don't deal with XMP directly; our XMP handling is
entirely moderated by Exiv2. Running the exiv2 command-line utility on
a photo with XMP description or title data wrapped in a LangAlt
element yields output that looks like this:
Xmp.dc.title LangAlt 2
lang="de-DE" Sonnenuntergang am Strand, lang="en-US" Sunset on the
beach
Or in the case of the bad photo that the bug reporting user
contributed, like this:
Xmp.dc.title LangAlt 1
lang="x-default" Chester Beatty
The point is that, in this case, the actual value of Xmp.dc.title
isn't text but a comma-delimited list of strings in various languages.
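As an illustration of that structure, the following Python sketch splits such a value into (language, text) pairs. It assumes the value has exactly the printed form shown above, that is, lang="..." markers separated by commas; split_lang_alt is a hypothetical helper, not an Exiv2 or gexiv2 function.

```python
import re

# One alternative: a lang="..." marker, then text up to the next marker or the end.
_ALT = re.compile(r'lang="([^"]*)"\s+(.*?)(?=,\s*lang="|$)', re.S)


def split_lang_alt(value):
    """Split an Exiv2-style LangAlt string into (language, text) pairs.

    A value without any lang marker is returned as a single pair with an
    empty language code.
    """
    pairs = [
        # Collapse whitespace so text wrapped across lines is rejoined.
        (m.group(1), " ".join(m.group(2).split()))
        for m in _ALT.finditer(value)
    ]
    return pairs or [("", value.strip())]


if __name__ == "__main__":
    print(split_lang_alt('lang="x-default" Chester Beatty'))
```

A comma inside a title is kept, since the split only happens where a comma is followed by another lang marker.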
So what I'm proposing is this. For 0.8, let's just use regexp
match/substitute functions to strip the text lang="x-default" out of
the XMP strings we get back from Exiv2. In the future, however, let's
properly parse the comma-delimited list and attempt to select an
appropriate title given the user's operating system language. What
makes this tricky however is what policy to use. For example, consider
not XMP titles but XMP keywords. Let's say that there are 4 keywords:
two in English and two in German. Does this photo have four tags or
only two tags appropriate to the user's operating system language? For
example, German users would see only the two German tags and English
users would see only the two English tags. While being
language-appropriate seems clever, one issue is that users would
actually see different sets of tags depending on their operating
system language. I don't know the answers to these questions, so what
I'm proposing is we use the quick and dirty regexp-stripping solution
on the first title in the LangAlt list for 0.8, then develop a policy
and an implementation of that policy for a later release.
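A rough sketch of what the quick and dirty approach could look like, written in Python rather than Vala. The name quick_title is hypothetical, and stripping any leading lang marker (not only x-default) is an assumption made here, not something decided in this bug.

```python
import re

# A leading lang="..." marker as printed by Exiv2 for LangAlt values.
_LANG_PREFIX = re.compile(r'^\s*lang="[^"]*"\s*')
# The start of the next alternative in the comma-delimited list.
_NEXT_ALT = re.compile(r',\s*lang="')


def quick_title(raw):
    """Strip the leading lang marker and keep only the first alternative."""
    text = _LANG_PREFIX.sub("", raw, count=1)
    nxt = _NEXT_ALT.search(text)
    if nxt:
        # Drop everything from the second alternative onward.
        text = text[:nxt.start()]
    return text.strip()


if __name__ == "__main__":
    print(quick_title('lang="x-default" Chester Beatty'))
```

Plain titles without a marker pass through unchanged, so the function can be applied to every title string regardless of how it was stored.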
Comment 6
Updated by Adam Dingle almost 3 years ago
The quick and dirty regexp fix sounds good to me for 0.8.
Comment 7
Updated by Jim Nelson almost 3 years ago
I just sent an email discussing a helper class in Exiv2 for exactly this problem: http://www.exiv2.org/doc/classExiv2_1_1LangAltValue.html
Comment 8
Updated by Jim Nelson almost 3 years ago
- Status changed from Review to 5
- Resolution set to fixed
- % Done changed from 0 to 100
It turns out we don't need to modify gexiv2 for this problem. gexiv2 already has a function to return multiple string values for a particular tag, which works with alt lang values in XMP. The language code is not returned, so one cannot be picked, but for now we were planning just on using the first one anyway, so it all works out.
It still might be useful to be able to find a string by language code: #2964 (closed)
r2473
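For reference, one possible lookup-by-language policy, sketched in Python. This is only an illustration: it assumes (language, text) pairs are available, which the gexiv2 function mentioned above does not provide, and both the name pick_title and the fallback order are hypothetical rather than an agreed policy.

```python
def pick_title(alternatives, preferred):
    """Pick one text from a list of (language, text) pairs.

    Fallback order (an assumption, not a decided policy):
    exact language match, same primary language, x-default, first entry.
    Returns None when the list is empty.
    """
    by_lang = {lang.lower(): text for lang, text in alternatives}
    wanted = preferred.lower()

    # 1. Exact match such as "en-US".
    if wanted in by_lang:
        return by_lang[wanted]

    # 2. Same primary language, e.g. "en-GB" requested and "en-US" available.
    primary = wanted.split("-")[0]
    for lang, text in alternatives:
        if lang.lower().split("-")[0] == primary:
            return text

    # 3. The x-default entry, if present.
    if "x-default" in by_lang:
        return by_lang["x-default"]

    # 4. Whatever comes first.
    return alternatives[0][1] if alternatives else None


if __name__ == "__main__":
    titles = [("de-DE", "Sonnenuntergang am Strand"), ("en-US", "Sunset on the beach")]
    print(pick_title(titles, "en-GB"))
```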
Comment 9
Updated by Charles Lindsay 7 months ago
- Status changed from 5 to Fixed
--- Bug imported by chaz@yorba.org 2013-11-25 21:48 UTC ---
This bug was previously known as bug 2773 at http://redmine.yorba.org/show_bug.cgi?id=2773 Imported an attachment (id=261880) Imported an attachment (id=261881)
Unknown Component Using default product and component set in Parameters Unknown version " in product shotwell. Setting version to "unspecified". Unknown milestone "unknown in product shotwell. Setting to default milestone for this product, "---". Setting qa contact to the default for this product. This bug either had no qa contact or an invalid one.
Resolution: RESOLVED FIXED