Limit the types of text file that we index
We currently index anything with a mime type that matches text/*, controlled by the 90-text-generic.rule file:
[ExtractorRule]
ModulePath=libextract-text.so
MimeTypes=text/*
FallbackRdfTypes=nfo:Document;nfo:PlainTextDocument;
There are many files with text/* mime types that it makes no sense to index for full-text search. These include source code and video game data files. Such files are often very large, so indexing them causes big performance problems and clutters search results.
We may be able to simply change the rule to match text/plain.
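As a rough sketch of that change, the rule file might look like the following. Only the MimeTypes line differs from the current rule. This is untested, and it assumes that no text/* subtype we still want indexed relies on this generic rule.

```ini
[ExtractorRule]
ModulePath=libextract-text.so
# Assumption: match only text/plain instead of every text/* subtype
MimeTypes=text/plain
FallbackRdfTypes=nfo:Document;nfo:PlainTextDocument;
```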