Commit ea314d91 authored by Claude Paroz

trunk removed

svn path=/trunk/; revision=1146
parent e4c95481
By Categories:
* GNOME 2.14
  * desktop
  * developer-libs
* GNOME 2.15
  * desktop
  * developer-libs
  * proposed
* Office
* Fifth Toe
* Extras
ID: epiphany
Descriptive name (@ID): Epiphany Web Browser
Maintainers: Marco Pesenti Gritti <>, Christian Persch <>
Branches (HEAD): gnome-2-14, HEAD
Bugzilla Product (@ID): epiphany
CVS root:
CVS module (@ID): epiphany
PO dirs (po): po
DOC dirs (help): help
<module id="epiphany">
<_description>Epiphany Web Browser</_description><!-- defaults to ID -->
<maintainer><name>Marco Pesenti Gritti</name></maintainer>
<maintainer><name>Christian Persch</name></maintainer>
<cvs-module>epiphany</cvs-module><!-- defaults to ID -->
<branches><!-- defaults only to HEAD -->
<product>epiphany</product><!-- defaults to ID -->
<component>general</component><!-- defaults to "general" -->
<domains><!-- defaults to a single "po" -->
<domain priority="1">po</domain>
<domain priority="0">po-properties</domain>
To debug this web application on the command line with the Python debugger,
insert "import pdb; pdb.set_trace()" where you want the program to enter the
debugger, go to the app root directory and type:
$ export PATH_INFO=/url/you/want/to/debug
$ python
PATH_INFO should not include the domain name.
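As a minimal sketch of where the breakpoint could go: the function names below are hypothetical and not part of damned-lies; the only assumption taken from the text above is that the application reads the requested URL from the PATH_INFO environment variable.

```python
import os


def requested_path(environ=os.environ):
    """Return the URL path a CGI-style app would serve, defaulting to '/'."""
    return environ.get("PATH_INFO", "/")


def handle(environ=os.environ):
    """Hypothetical request handler, used only to show where to break."""
    path = requested_path(environ)
    # Uncomment the next line to stop here and inspect 'path' interactively:
    # import pdb; pdb.set_trace()
    return "serving %s" % path
```

With PATH_INFO exported as shown above, calling handle() from the command line drops into the debugger at the marked line once the pdb statement is uncommented.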
Judging Translation Support Levels:
Not everything that can be counted counts; and not everything that
counts can be counted. //George Gallup
- Level 0:
  not fulfilling even Level 1 requirements
- Level 1:
  check percentage for every PO file EXCLUDING:
  - irrelevant modules (gtk+/po-properties, gstreamer, xkeyboard-config, gnome-applets/po-locations ...)
  - translations exclusively for files
  - special checks for what we know we can exclude?
- Level 2 (fully usable UI translation: no gconf or non-visible things translated)
  - xkeyboard-config
  - iso_codes
  - gnome-applets/po-locations (at least the relevant part: top-level
    countries and entries for relevant country for a language)
- Level 3 (complete UI translation)
  - translations as well
  - gnome-applets/po-locations (full translation)
  - gtk+/po-properties
  - gstreamer
  - scrollkeeper
- Level 4 (UI + docs)
  - Level 2 + docs translation
- Level 5 (full UI+docs translation)
  - Level 3 + docs translation
Level 0: unsupported
Level 1: partially supported
Level 2/3: supported
Level 4: excellent support
Level 5: "Gnome badge of honour" or "we owe you a beer" award :)
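The mapping from level to label above could be expressed as a small lookup. This is only a sketch; SUPPORT_LABELS and support_label are hypothetical names, not existing damned-lies code.

```python
# Labels copied from the list above; levels 2 and 3 share one label.
SUPPORT_LABELS = {
    0: "unsupported",
    1: "partially supported",
    2: "supported",
    3: "supported",
    4: "excellent support",
    5: "Gnome badge of honour",
}


def support_label(level):
    """Return the human-readable label for a support level (0 to 5)."""
    if level not in SUPPORT_LABELS:
        raise ValueError("unknown support level: %r" % (level,))
    return SUPPORT_LABELS[level]
```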
List of messages untranslated but present in POT
Extend, stripping untranslated messages so we can inform
translators what they need to translate without forcing them to look
at the PO file (stop after eg. 10 differences?)
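One way this could look, as a rough sketch only: the function name is hypothetical, and the regular expression is a deliberate simplification that handles only entries whose msgid and msgstr each fit on one line (multi-line entries are skipped rather than misreported).

```python
import re

# A single-line msgid followed by a single-line msgstr. The negative
# lookahead rejects entries whose msgstr continues on the next line.
_ENTRY = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$(?!\n")', re.MULTILINE)


def untranslated_msgids(po_text, limit=10):
    """Return up to 'limit' msgids whose msgstr is empty."""
    found = []
    for match in _ENTRY.finditer(po_text):
        msgid, msgstr = match.group(1), match.group(2)
        if msgid and not msgstr:
            found.append(msgid)
            if len(found) >= limit:
                break
    return found
```

A real implementation would more likely reuse msgattrib or a proper PO parser; the sketch only shows the "stop after N" idea.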
Not regenerating old stuff [DONE]
"regenerate" attribute on <branch>: keep branches, but don't regenerate for them
Speed optimizations
- update only single module, branch, domain, language (to speed up on
PO file CVS commits)
- check if POT diff between old and new POT files is null, AND PO file
hasn't changed (compare mtimes), when there is no need to go through
entire update procedure (in [DONE]
- allow to process only a single module/branch on e-mail notifications
(procmail + cvs commit notifications): almost ready, only need a
short script to start this with module/branch parameters [DONE]
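The "skip when nothing changed" check could be sketched as follows. The function names and the idea of comparing against the time of the last statistics run are assumptions; the sketch also ignores the POT-Creation-Date header line, since that line changes every time a POT file is regenerated and would otherwise make every diff non-empty.

```python
def _significant_lines(pot_text):
    """Drop the creation-date header, which differs on every regeneration."""
    return [line for line in pot_text.splitlines()
            if not line.startswith('"POT-Creation-Date:')]


def can_skip_update(old_pot_text, new_pot_text, po_mtime, stats_mtime):
    """True when the POT content is unchanged and the PO file is not newer
    than the last statistics run (both times as seconds since the epoch)."""
    pot_same = _significant_lines(old_pot_text) == _significant_lines(new_pot_text)
    return pot_same and po_mtime <= stats_mtime
```

The modification times would typically come from os.path.getmtime() on the PO file and on whatever records the previous run.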
SVN, bazaar, tarball source code
Support other methods than CVS of fetching source code.
Better error reporting for docs
- parse documents looking for mediaobjects, look for missing figures in translations, etc.
speed testing
Input data: two random PO files of 833/844 messages with 21 differences
Measured user time using time(1).
USE_DIFFLIB (ms): 94 82 92 88 77 76 87
average: 85.2 ~ 80
NO_DIFFLIB(ms) : 65 59 59 60 67 61 66 61
average: 62.3 ~ 60
- fuzzy_matching = 0
- generate_docs = 0
~16 minutes for 5442 UI strings, 843 documentation strings.
GNOME is in the range of 100K UI strings, 20K doc strings, which is
roughly 20 times this size, so estimated initial run-time would be
~5.5 hours.
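The estimate above is a plain linear extrapolation, which can be checked with a few lines. The function name is hypothetical; the numbers are the ones quoted above.

```python
def estimated_hours(measured_minutes, scale):
    """Scale a measured run time linearly and convert minutes to hours."""
    return measured_minutes * scale / 60.0


ui_ratio = 100000 / 5442.0    # GNOME UI strings relative to the measured set
doc_ratio = 20000 / 843.0     # GNOME doc strings relative to the measured set
hours = estimated_hours(16, 20)   # 16 minutes scaled by the rough factor of 20
```

Both ratios land near the factor of 20 used in the text, and 16 minutes times 20 is 320 minutes, a little over 5.3 hours, which the text rounds up to ~5.5.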
1. Copy all source code to where you want it to sit, eg. to DAMNED_DIR
2. Look at DAMNED_DIR/ and edit it
- scratchdir: where to do CVS checkouts to, put generated documentation, etc.
- webroot: URL path under which damned-lies will appear on the web; the
domain is not needed, so if it sits at the root of the domain, use an empty string
- database_connection: SQLObject connection string; examples for SQLite and MySQL are given
- notifications_to: where to send string freeze break notifications
- WHOAREWE: "from" address in any mails sent
- WHEREAREWE: mentioned in any automatic notices as where the notice is coming from
- modules_xml, releases_xml, teams_xml: Modify if you're not setting it up for Gnome.
- fuzzy_matching: 0 - fast, bad for translators; 1 - slow, good for translators.
- DEBUG: lots of output on stderr
3. Look at DAMNED_DIR/.htaccess and change RewriteBase to defaults.webroot
4. Recreate .xml files from
$ cd DAMNED_DIR && make
5. Initialise database: "python DAMNED_DIR/"
6. First run of DAMNED_DIR/ (may take a long time and use at least 20 GB of disk space, so make sure you have enough):
$ cd DAMNED_DIR && python ./
7. Make sure:
- os.path.join(defaults.scratchdir, "POT") is available via defaults.webroot + "/POT" and
- os.path.join(defaults.scratchdir, "xml") is available via defaults.webroot + "/xml"
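To make step 7 concrete, here is a small sketch of how the two directories relate to their URLs. The values of scratchdir and webroot are made-up examples, and exported_dirs is a hypothetical helper, not part of damned-lies.

```python
import os

# Example values only; the real ones live in the settings edited in step 2.
scratchdir = "/var/tmp/damned-lies"
webroot = ""    # empty string when served from the root of the domain


def exported_dirs(scratchdir, webroot):
    """Map each directory that must be web-visible to its URL path."""
    return {
        os.path.join(scratchdir, name): webroot + "/" + name
        for name in ("POT", "xml")
    }
```

Each key is a directory on disk and each value is the URL path the web server (for example via an Alias or a symlink) would need to serve it under.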
SINGLE MODULE UPDATE (useful for crontab+procmail setup):
- All branches for a module:
$ cd DAMNED_DIR && make && python ./ MODULEID
- A single branch:
$ cd DAMNED_DIR && make && python ./ MODULEID BRANCH
cd ~/public_html/damned-lies && make && python ./ epiphany gnome-2-12
cd ~/public_html/damned-lies && make && python ./ nautilus
.htaccess \
ChangeLog \
Makefile \
index.html \ \ \ \ \ \ \ \ \ \
data/cyan-bar.png \
data/download.png \
data/error.png \
data/green-bar.png \
data/info.png \
data/main.css \
data/nobody.png \
data/purple-bar.png \
data/red-bar.png \
data/warn.png \
templates/language-release-doc-stats-modules.tmpl \
templates/language-release-doc-stats.tmpl \
templates/language-release-ui-stats-modules.tmpl \
templates/language-release-ui-stats.tmpl \
templates/language-release.tmpl \
templates/list-languages.tmpl \
templates/list-modules.tmpl \
templates/list-releases.tmpl \
templates/list-teams.tmpl \
templates/module.tmpl \
templates/release.tmpl \
templates/show-stats.tmpl \
po/C/%.xml: po/*.po
	(cd po && intltool-merge -x -m . ../$< `basename $@`)
all: po/C/gnome-modules.xml po/C/translation-teams.xml po/C/releases.xml po/C/people.xml
	(cd po && make)
dist: $(FILES)
	@mkdir -p damned-lies-$(VERSION) && \
	for file in $(FILES); do DIR=`dirname "damned-lies-$(VERSION)/$$file"` && mkdir -p $$DIR && cp $$file "damned-lies-$(VERSION)/$$file"; done && \
	tar cvzf damned-lies-$(VERSION).tar.gz damned-lies-$(VERSION) && \
	rm -rf damned-lies-$(VERSION)
release: dist
	MYVERSION=`echo $(VERSION) | sed -e 's/\./_/g'` && cvs tag "damned_lies_$$MYVERSION"
Damned Lies: better statistics for Gnome
There are lies, damned lies, and statistics! — not[1] Benjamin Disraeli
Damned Lies is designed to provide translation status for intltool
(for UI translation) and gnome-doc-utils (for docs) using modules.
Currently, the only supported source-code fetching method is via CVS,
but I plan to support others (SVN, tarballs) as well.
It also provides a nice outlook on all releases you've got, but this
is only useful for complicated compilations of software such as Gnome.
On naming
Well, we are obviously more correct than just any statistics out
there, but still giving out crap (lies), so we must be somewhere
in between: damned lies it is!
Short: Apache2 + mod_rewrite, Python + SQLObject + CheetahTemplate, intltool
Python 2.2? 2.3? 2.4?
Python modules:
o SQLObject for database access (package python-sqlobject on Debian/Ubuntu)
Tested only with 0.6.1
o Any of sqlitedb, MySQLdb, postgresdb modules
o Cheetah for HTML templating (package python-cheetah on Debian/Ubuntu)
Programs (need to be in PATH):
o msgfmt
o msgmerge
o intltool-update (>= 0.37)
o xml2po
o vcs programs depending on module repositories (currently supported: cvs,
subversion, hg, git)
intltool-update requirements:
o Perl
o XML::Parser
o intltool-extract
o xgettext
xml2po requirements:
o libxml2 + Python bindings
# -*-python-*-
# Copyright (C) 1999-2002 The ViewCVS Group. All Rights Reserved.
# By using this file, you agree to the terms and conditions set forth in
# the LICENSE.html file which can be found at the top level of the ViewCVS
# distribution or at
# Contact information:
# Greg Stein, PO Box 760, Palo Alto, CA, 94302
# -----------------------------------------------------------------------
# parse/handle the various Accept headers from the client
# -----------------------------------------------------------------------
import re
import string
def language(hdr):
    "Parse an Accept-Language header."
    # parse the header, storing results in a _LanguageSelector object
    return _parse(hdr, _LanguageSelector())
# -----------------------------------------------------------------------
_re_token = re.compile(r'\s*([^\s;,"]+|"[^"]*")+\s*')
_re_param = re.compile(r';\s*([^;,"]+|"[^"]*")+\s*')
# the name part needs "+" so that multi-letter names (qs, level, charset) match
_re_split_param = re.compile(r'([^\s=]+)\s*=\s*(.*)')
def _parse(hdr, result):
    # quick exit for empty or not-supplied header
    if not hdr:
        return result
    pos = 0
    while pos < len(hdr):
        name = _re_token.match(hdr, pos)
        if not name:
            raise AcceptParseError()
        a = result.item_class(name.group(1).lower())
        pos = name.end()
        while 1:
            # are we looking at a parameter?
            match = _re_param.match(hdr, pos)
            if not match:
                break
            param = match.group(1)
            pos = match.end()
            # split up the pieces of the parameter
            match = _re_split_param.match(param)
            if not match:
                # the "=" was probably missing
                continue
            pname = match.group(1).lower()
            if pname == 'q' or pname == 'qs':
                try:
                    a.quality = float(match.group(2))
                except ValueError:
                    # bad float literal
                    pass
            elif pname == 'level':
                try:
                    a.level = float(match.group(2))
                except ValueError:
                    # bad float literal
                    pass
            elif pname == 'charset':
                a.charset = match.group(2).lower()
        # record the parsed item, then step over the separating comma
        result.append(a)
        if hdr[pos:pos+1] == ',':
            pos = pos + 1
    return result
class _AcceptItem:
    def __init__(self, name):
        self.name = name
        self.quality = 1.0
        self.level = 0.0
        self.charset = ''
    def __str__(self):
        s = self.name
        if self.quality != 1.0:
            s = '%s;q=%.3f' % (s, self.quality)
        if self.level != 0.0:
            s = '%s;level=%.3f' % (s, self.level)
        if self.charset:
            s = '%s;charset=%s' % (s, self.charset)
        return s
class _LanguageRange(_AcceptItem):
    def __repr__(self):
        # assumption: the body is missing in the source; show the range name
        return self.name
    def matches(self, tag):
        "Match the tag against self. Returns the qvalue, or None if non-matching."
        if tag == self.name:
            return self.quality
        # are we a prefix of the available language-tag
        name = self.name + '-'
        if tag[:len(name)] == name:
            return self.quality
        return None
class _LanguageSelector:
    """Instances select an available language based on the user's request.

    Languages found in the user's request are added to this object with the
    append() method (they should be instances of _LanguageRange). After the
    languages have been added, then the caller can use select_from() to
    determine which user-request language(s) best matches the set of
    available languages.

    Strictly speaking, this class is pretty close for more than just
    language matching. It has been implemented to enable q-value based
    matching between requests and availability. Some minor tweaks may be
    necessary, but simply using a new 'item_class' should be sufficient
    to allow the _parse() function to construct a selector which holds
    the appropriate item implementations (e.g. _LanguageRange is the
    concrete _AcceptItem class that handles matching of language tags).
    """
    item_class = _LanguageRange
    def __init__(self):
        self.requested = [ ]
    def select_from(self, avail):
        """Select one of the available choices based on the request.

        Note: if there isn't a match, then the first available choice is
        considered the default. Also, if a number of matches are equally
        relevant, then the first-requested will be used.

        avail is a list of language-tag strings of available languages
        """
        # tuples of (qvalue, language-tag)
        matches = [ ]
        # try matching all pairs of desired vs available, recording the
        # resulting qvalues. we also need to record the longest language-range
        # that matches since the most specific range "wins"
        for tag in avail:
            longest = 0
            final = 0.0
            # check this tag against the requests from the user
            for want in self.requested:
                qvalue = want.matches(tag.lower())
                #print 'have %s. want %s. qvalue=%s' % (tag, want.name, qvalue)
                if qvalue is not None and len(want.name) > longest:
                    # we have a match and it is longer than any we may have had.
                    # the final qvalue should be from this tag.
                    final = qvalue
                    longest = len(want.name)
            # a non-zero qvalue is a potential match
            if final:
                matches.append((final, tag))
        # if there are no matches, then return the default language tag
        if not matches:
            return avail[0]
        # get the highest qvalue and its corresponding tag
        # (sorting puts the best (qvalue, tag) pair last)
        matches.sort()
        qvalue, tag = matches[-1]
        # if the qvalue is zero, then we have no valid matches. return the
        # default language tag.
        if not qvalue:
            return avail[0]
        # if there are two or more matches, and the second-highest has a
        # qvalue equal to the best, then we have multiple "best" options.
        # select the one that occurs first in self.requested
        if len(matches) >= 2 and matches[-2][0] == qvalue:
            # remove non-best matches
            while matches[0][0] != qvalue:
                del matches[0]
            #print "non-deterministic choice", matches
            # sequence through self.requested, in order
            for want in self.requested:
                # try to find this one in our best matches
                for qvalue, tag in matches:
                    # lowercase the tag, as in the first pass above
                    if want.matches(tag.lower()):
                        # this requested item is one of the "best" options
                        ### note: this request item could match *other* "best" options,
                        ### so returning *this* one is rather non-deterministic.
                        ### theoretically, we could go further here, and do another
                        ### search based on the ordering in 'avail'. however, note
                        ### that this generally means that we are picking from multiple
                        ### *SUB* languages, so I'm all right with the non-determinism
                        ### at this point. stupid client should send a qvalue if they
                        ### want to refine.
                        return tag
        # return the best match
        return tag
    def append(self, item):
        self.requested.append(item)
class AcceptParseError(Exception):
    pass
def _test():
    s = language('en')
    assert s.select_from(['en']) == 'en'
    assert s.select_from(['en', 'de']) == 'en'
    assert s.select_from(['de', 'en']) == 'en'
    # Netscape 4.x and early version of Mozilla may not send a q value
    s = language('en, ja')
    assert s.select_from(['en', 'ja']) == 'en'
    s = language('fr, de;q=0.9, en-gb;q=0.7, en;q=0.6, en-gb-foo;q=0.8')
    assert s.select_from(['en']) == 'en'
    assert s.select_from(['en-gb-foo']) == 'en-gb-foo'
    assert s.select_from(['de', 'fr']) == 'fr'
    assert s.select_from(['de', 'en-gb']) == 'de'
    assert s.select_from(['en-gb', 'en-gb-foo']) == 'en-gb-foo'
    assert s.select_from(['en-bar']) == 'en-bar'
    assert s.select_from(['en-gb-bar', 'en-gb-foo']) == 'en-gb-foo'
    # non-deterministic. en-gb;q=0.7 matches both avail tags.
    #assert s.select_from(['en-gb-bar', 'en-gb']) == 'en-gb'
#!/usr/bin/env python
import defaults
from releases import Releases, get_aggregate_stats
# Percentage which qualifies a language to be considered 'supported'
supported_limit = 80
if __name__ == "__main__":
    import sys, os
    if len(sys.argv) >= 2:
        if os.access(defaults.modules_xml, os.R_OK):
            defaults.DEBUG = 0
            releases = sys.argv[1:]
            for release in releases:
                status = get_aggregate_stats(release)
                languages = 0
                total = 0
                for stats in status:
                    # assumption: the truncated third term is the untranslated
                    # count, and its key is named 'ui_untranslated'
                    mytotal = (stats['ui_translated'] + stats['ui_fuzzy'] +
                               stats['ui_untranslated'])
                    if (100*stats['ui_translated']/mytotal >= supported_limit):
                        languages += 1
                    if mytotal > total:
                        total = mytotal
                print("%s: %d langs, %d msgs" % (release, languages, total))
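The counting loop above can be tried without a database by feeding it plain dictionaries. This is only a sketch: count_supported is a hypothetical helper, the 'ui_untranslated' key is an assumption about what get_aggregate_stats returns, and the guard against an empty total is an addition that the script above does not have.

```python
def count_supported(status, supported_limit=80):
    """Return (number of supported languages, largest message total)."""
    languages = 0
    total = 0
    for stats in status:
        mytotal = (stats["ui_translated"] + stats["ui_fuzzy"]
                   + stats["ui_untranslated"])
        if mytotal == 0:
            continue  # nothing to translate; avoids dividing by zero
        if 100 * stats["ui_translated"] / mytotal >= supported_limit:
            languages += 1
        total = max(total, mytotal)
    return languages, total


sample = [
    {"ui_translated": 90, "ui_fuzzy": 5, "ui_untranslated": 5},
    {"ui_translated": 10, "ui_fuzzy": 0, "ui_untranslated": 90},
    {"ui_translated": 0, "ui_fuzzy": 0, "ui_untranslated": 0},
]
print("%d langs, %d msgs" % count_supported(sample))
```

Only the first sample language reaches the 80% threshold, so the call reports one supported language out of a 100-message total.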