GNOME / tracker-miners, issue #78
Issue created Aug 27, 2019 by Sam Thursfield (@sthursfield, Maintainer)

Limit the types of text file that we index

We currently index any file whose MIME type matches text/*. This is controlled by the 90-text-generic.rule file:

[ExtractorRule]
ModulePath=libextract-text.so
MimeTypes=text/*
FallbackRdfTypes=nfo:Document;nfo:PlainTextDocument;

There are many files with text/* MIME types that it makes no sense to index for full-text search, including source code and video game data files. These files are often very large, so indexing them causes serious performance problems and clutters search results.

We may be able to simply change the rule to match only text/plain.
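
As a rough sketch of that proposal, the rule might end up looking like the fragment below. It assumes that only the MimeTypes key needs to change and that the other keys stay exactly as they are in the current 90-text-generic.rule. It does not account for any other text/* subtypes that might still be worth indexing.

```ini
[ExtractorRule]
ModulePath=libextract-text.so
# Assumption: match only plain text instead of every text/* subtype
MimeTypes=text/plain
FallbackRdfTypes=nfo:Document;nfo:PlainTextDocument;
```

With a rule like this, files reported as other text/* subtypes would no longer be picked up by the generic text extractor unless another rule matches them.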
