.gitlab-ci.yml · 7e381982872d74b5b135c183ee9188586bb83645 · GNOME / Geary · GitLab

ImapDb.Database: Register new ICU-based tokeniser for FTS · 7e381982

Michael Gratton authored Nov 13, 2020 and committed Jan 19, 2021

The SQLite tokeniser does not handle scripts that do not use spaces
for word breaking (CJK, Thai, etc.), so searching in those languages
does not work well.

This adds a custom SQLite tokeniser based on ICU that breaks words for
all languages supported by that library, and applies NFKC_Casefold to
handle normalisation, case folding, and dropping of ignorable
characters.

Fixes #121
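
The normalisation step described above can be approximated with the Python standard library. This is only an illustrative sketch, not the code from this commit: the helper name `approx_nfkc_casefold` is hypothetical, and filtering on the `Cf` category is an assumption that stands in for ICU's handling of default-ignorable characters. Word breaking for CJK or Thai needs ICU itself and is not shown.

```python
import unicodedata


def approx_nfkc_casefold(text: str) -> str:
    """Roughly approximate ICU's NFKC_Casefold (hypothetical helper)."""
    # Compatibility-normalise, then case-fold.
    folded = unicodedata.normalize("NFKC", text).casefold()
    # Case folding can produce non-normalised text, so normalise again.
    folded = unicodedata.normalize("NFKC", folded)
    # Assumption: dropping format characters (category Cf, e.g. the soft
    # hyphen) approximates removal of default-ignorable code points.
    return "".join(ch for ch in folded if unicodedata.category(ch) != "Cf")


if __name__ == "__main__":
    for sample in ("Ｇｅａｒｙ", "Straße", "co\u00adoperate"):
        print(repr(sample), "->", repr(approx_nfkc_casefold(sample)))
```

With this kind of folding, indexed text and search terms are reduced to the same form before comparison, so a query typed in full-width or upper-case letters can still match.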

