The search function as implemented in Rhythmbox searches for exact phrases, such
that a search for "Lazy Days" is different from a search for "Lazy  Days" (with
two spaces). I think a better search method is to search for each word, with
whitespace ignored as immaterial. Then it wouldn't matter if someone searched
for "Lazy Days" or "Days Lazy" or "Lazy   Days". It should then search for all
the songs with the words "days" and "lazy" in them.
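
A minimal sketch of the behaviour requested above, written in Python rather than Rhythmbox's own C and not taken from any patch. The query is split on whitespace and a title matches when every word occurs somewhere in it, in any order. The function name `matches_all_words` is hypothetical.

```python
def matches_all_words(query: str, text: str) -> bool:
    """Return True if every whitespace-separated word of query occurs in text."""
    # str.split() with no argument collapses runs of whitespace and drops
    # leading/trailing blanks, so "  Lazy    Days " yields just two words.
    words = query.casefold().split()
    haystack = text.casefold()
    # Note: an empty query has no words, so it matches every text.
    return all(word in haystack for word in words)


titles = ["Lazy Days", "Those Lazy Hazy Crazy Days of Summer", "Lazy Sunday"]
for query in ("Lazy Days", "Days Lazy", "  Lazy    Days "):
    print(repr(query), [t for t in titles if matches_all_words(query, t)])
```

All three queries select the same two titles; "Lazy Sunday" is rejected because it lacks "days".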
I agree - I would really like to have the search box contents treated as a list
of space-separated search terms as opposed to one search term.
As well as changing orders as above, I think the different terms should be
searched for globally.
So for example to locate the album "true" by artist "trinity roots" it would
(depending on other items in your library) probably be enough to enter
roots true
or maybe even roo tru.
I found that this form of searching was the most valuable feature for me in
iTunes. As the search updates, it allows you to interactively adjust what you're
typing to narrow in on what you want. I found this interactive search experience
really a new feature in computer interfaces and I think it's great.
I'm not sure if the search update in Rhythmbox would be fast enough for this kind
of behaviour, but it would be good to have anyway, for the reasons outlined above.
Not sure if this should really be called 'fuzzy' searching though - since to me
this implies Madman-style search interpretation (matching spelling mistakes and
providing best match etc.) which is not what I would like at all...
It would also be great if I could find stuff like "Die Ärzte" by typing
"die arzte" (just leaving out the umlaut on a foreign keyboard) or "die aerzte"
(which is how German people would write it).
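
One possible way to get both spellings to match, sketched in Python under some assumptions: each name is indexed under two folded keys, one with the diacritics simply stripped and one with the German "ae/oe/ue" expansion applied first. The helper names and the expansion table are made up for this example; Rhythmbox itself is written in C and would need an equivalent built on its own string routines.

```python
import unicodedata

# German convention for writing umlauts without the diacritic. This table is
# an assumption of the sketch; other languages use other conventions.
# ("ß" needs no entry: str.casefold() already turns it into "ss".)
GERMAN_EXPANSIONS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def strip_accents(text: str) -> str:
    """Decompose characters and drop the combining marks: 'ärzte' -> 'arzte'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_keys(text: str) -> set:
    """Return the folded spellings under which text should be findable."""
    lowered = unicodedata.normalize("NFC", text).casefold()
    return {
        strip_accents(lowered),                               # "die arzte"
        strip_accents(lowered.translate(GERMAN_EXPANSIONS)),  # "die aerzte"
    }


def matches(query: str, text: str) -> bool:
    folded_query = strip_accents(query.casefold())
    return any(folded_query in key for key in search_keys(text))


for query in ("die arzte", "die aerzte", "Die Ärzte", "die artze"):
    print(query, matches(query, "Die Ärzte"))
```

The first three queries match and the misspelled "die artze" does not, since this folding deliberately leaves spelling mistakes alone.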
I agree that this is a good idea. I think the current behaviour is non-intuitive.
Users are used to the concept that searching for:
The Rolling Stones
will search for 3 terms.
And if they want to search for a number of words in a row, it's:
"The Rolling Stones"
which will search for 1 term.
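
The convention described above could be parsed roughly like this. It is a Python sketch, not Rhythmbox code, and `parse_terms` is a hypothetical name. A double-quoted phrase becomes one term and every other word becomes its own term.

```python
import re

# Either a double-quoted phrase (group 1) or a bare run of non-space
# characters (group 2). The first alternative is tried first.
TERM_RE = re.compile(r'"([^"]*)"|(\S+)')


def parse_terms(query: str) -> list:
    """Split a search-box string into folded terms, keeping quoted phrases whole."""
    terms = []
    for phrase, word in TERM_RE.findall(query):
        # An unbalanced quote falls through to the bare-word branch, so
        # strip stray quote characters instead of searching for them.
        term = phrase.strip() if phrase else word.strip('"')
        if term:
            terms.append(term.casefold())
    return terms


print(parse_terms('The Rolling Stones'))
print(parse_terms('"The Rolling Stones"'))
print(parse_terms('dylan "rolling stone"'))
```

The unquoted query yields three terms, the quoted one yields a single term, and the mixed query yields "dylan" plus the phrase "rolling stone".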
I find the current implementation limiting.
For example, let's say I have Bob Dylan's "Like a Rolling Stone", plus a number
of Rolling Stones tracks. I go to find the Dylan track via the search bar.
Searching for:
Rolling Stone
will find a number of tracks.
Searching for:
Rolling Stone Dylan
won't find any tracks.
Someone started working on this a few days ago, with some guidance via IRC. They
were using the Levenshtein distance algorithm - which means that it can match
even with extra letters inserted, letters deleted or letters replaced.
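
For reference, the textbook form of that algorithm looks like the following. This is a generic Python sketch, not the code from the patch under discussion.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn a into b."""
    # previous[j] is the distance between the prefix of a processed so far
    # and the first j characters of b; only two rows are kept in memory.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("dylan", "dilan"))
print(levenshtein("kitten", "sitting"))
```

Each pair of strings costs O(len(a) * len(b)) operations, which is why running this against every entry in a library is noticeably slower than a substring search. Note that under this definition two adjacent letters typed in the wrong order count as two edits, not one.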
Last time we talked they had it mostly working. It was fairly slow, but there
are a couple of obvious optimizations we can make, which will provide
order-of-magnitude improvements to speed.
I'll provide an update next time I hear from them.
I haven't heard from them in about a week, so I'm attaching the most up to date
copy of the patch that I had - so it doesn't get lost.
The patch does a number of things:
- adds functions to rb-string-helpers which compute the "distance" between two
  strings using some "Party Pattern" code written by Benjamin Otte two years ago
  (using the Levenshtein distance algorithm)
- stores the PartyPatterns in RBRefStrings, so they don't have to be generated
  every time a query is run
- adds a new operator, RHYTHMDB_QUERY_PROP_FUZZY_MATCH, which does fuzzy
  matching
- adds a new property, RHYTHMDB_PROP_SEARCH_STRING, which can (currently) only
  be used by the fuzzy operators, and causes them to match against genre, artist,
  album and title (for the search box). This lets you specify more than one of
  those at the same time.
- changes the search box to use RHYTHMDB_QUERY_PROP_FUZZY_MATCH and
  RHYTHMDB_PROP_SEARCH_STRING.
This implements fuzzy matching which ignores all punctuation and case
differences, and matches words separately. The search box also does this
with genre/artist/album/title at once, so "pris end" will match "Prisoner of
Society" by "The Living End". By fiddling around with the values returned from
party_pattern_cost_replace, party_pattern_cost_delete and
party_pattern_cost_insert, as well as the threshold in
evaluate_conjunctive_subquery, it can be made to match even with missing
characters or extra ones inserted. With the current (fairly arbitrary) values
it will ignore one missing or extra character per search.
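
To illustrate how per-operation costs and a threshold interact, here is a rough Python sketch of weighted approximate substring matching. It is not the PartyPattern code: the constants only stand in for whatever the three party_pattern_cost_* functions return, their values are invented, and which of "insert" and "delete" means an extra or a missing character in the query is an assumption.

```python
import string

# Invented values, chosen so that one missing or one extra character is
# tolerated per search but a wrong character is not.
COST_INSERT = 1   # query has a character the library text lacks
COST_DELETE = 1   # query is missing a character the library text has
COST_REPLACE = 2  # query has a different character
THRESHOLD = 1     # total cost tolerated across all words of one search


def normalize(text: str) -> str:
    """Fold case and drop ASCII punctuation."""
    return "".join(ch for ch in text.casefold() if ch not in string.punctuation)


def best_substring_cost(word: str, text: str) -> int:
    """Cheapest weighted edit cost between word and any substring of text."""
    # Row 0 is all zeros, so a match may start anywhere in text for free.
    previous = [0] * (len(text) + 1)
    for wc in word:
        current = [previous[0] + COST_INSERT]
        for j, tc in enumerate(text, start=1):
            current.append(min(
                previous[j] + COST_INSERT,     # wc has no counterpart in text
                current[j - 1] + COST_DELETE,  # tc was left out of the word
                previous[j - 1] + (0 if wc == tc else COST_REPLACE),
            ))
        previous = current
    # The match may also end anywhere, so take the minimum of the last row.
    return min(previous)


def fuzzy_match(query: str, *fields: str) -> bool:
    haystack = normalize(" ".join(fields))
    total = sum(best_substring_cost(w, haystack) for w in normalize(query).split())
    return total <= THRESHOLD


track = ("Punk", "The Living End", "The Living End", "Prisoner of Society")
for query in ("pris end", "prsoner", "prissoner", "prizoner"):
    print(query, fuzzy_match(query, *track))
```

With these values "pris end", "prsoner" (one letter missing) and "prissoner" (one letter extra) match, while "prizoner" (one letter wrong) does not. A fixed threshold also lets any single-character query match everything, so a real implementation would probably scale it with the length of the search term.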
Although doing fuzzy matching is slower than looking for a substring, this
doesn't feel that much slower (I haven't done profiling yet, so I have no
numbers). This is probably because it does one match for all four properties
(genre/album/artist/title) rather than doing a substring search four times.
There is an additional obvious optimisation that could be done: not having to
rebuild the PartyPattern of the search term for every entry.
Some things that need to be done:
- fix the copyright/licence stuff in rb-string-helpers.
- rename some things, and fix some code style issues.
- check whether it can match "è" with "e" and the like.
- implement the above optimisation, and then do some profiling to see what
  speed difference this patch causes.
This is a variation on the earlier patch, which doesn't have any "fuzzy
matching" but can match against multiple properties, and hence is simpler. For
example, entering "cou row liv" will find my tracks by "Counting Crows"
from "Live at the Wiltern Theatre".
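
The idea of matching each term against several properties at once can be sketched as below. This is Python for illustration only, not the C from the patch. The `Track` class and `search` function are hypothetical, and the genres and the Counting Crows title in the sample library are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Track:
    genre: str
    artist: str
    album: str
    title: str

    def search_string(self) -> str:
        # One folded string covering every searchable property. Terms never
        # contain spaces, so a term cannot accidentally span two properties.
        return " ".join((self.genre, self.artist, self.album, self.title)).casefold()


def search(query: str, tracks: list) -> list:
    terms = query.casefold().split()  # split the query once, not once per track
    return [t for t in tracks if all(term in t.search_string() for term in terms)]


# Placeholder library data for the example.
library = [
    Track("Rock", "Counting Crows", "Live at the Wiltern Theatre", "Track 1"),
    Track("Rock", "Bob Dylan", "Highway 61 Revisited", "Like a Rolling Stone"),
    Track("Rock", "The Rolling Stones", "Aftermath", "Paint It Black"),
]

for query in ("cou row liv", "Rolling Stone", "Rolling Stone Dylan"):
    print(query, [t.artist for t in search(query, library)])
```

Here "cou row liv" finds the Counting Crows track, "Rolling Stone" finds both the Dylan and the Rolling Stones tracks, and "Rolling Stone Dylan" narrows the result to the Dylan track alone.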
A better version of the second patch has been committed to CVS.
If someone wants to add real "fuzzy matching" that ignores spelling mistakes and
the like, it should be easy enough to add the necessary bits from the first patch.
I've unmarked 338824 as a duplicate, and am re-titling this bug to be the other bit, since the two features are different and are likely to be done separately.