Add Wordlist Scoring to GNOME Crosswords Editor
Mentors
- jrb@gnome.org
- federico@gnome.org
- tanmayp@gnome.org — GSoC alum!
Project length
Long (~350 hours)
Could be short (~175 hours) if necessary, with fewer milestones reached
Description
GNOME Crosswords Editor uses word lists to fill crossword grids. These lists lack the metadata needed to make more interesting puzzles. We'd like to find, gather, and encode that metadata alongside the lists so that the editor can choose appropriate words.
For more information on the problem, please read this design doc.
This project will involve finding ways to calculate, measure, and encode values for each of the five traits listed there. We will then use those values when creating a puzzle to try to produce more interesting grids.
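To give a feel for the kind of work involved, here is a minimal sketch of calculating and encoding one value per word. It assumes, purely for illustration, that a log-scaled corpus frequency is a useful signal; the actual traits are defined in the design doc and may look quite different. The names `count_words` and `frequency_scores`, the 0 to 100 range, and the tiny in-memory corpus are all hypothetical and are not part of the existing Crosswords code.

```python
import io
import math
import re
from collections import Counter

# Hypothetical tokenizer: runs of ASCII letters only.
WORD_RE = re.compile(r"[A-Za-z]+")


def count_words(lines):
    """Count word occurrences from an iterable of text lines.

    Reads one line at a time, so it also works on a file object
    without loading the whole corpus into memory.
    """
    counts = Counter()
    for line in lines:
        counts.update(w.upper() for w in WORD_RE.findall(line))
    return counts


def frequency_scores(counts, wordlist, max_score=100):
    """Map each word to an integer in 0..max_score.

    Uses a log scale so that very common words do not drown out
    everything else. Words missing from the corpus score 0.
    """
    top = max(counts.values(), default=0)
    scores = {}
    for word in wordlist:
        n = counts.get(word, 0)
        if n == 0 or top == 0:
            scores[word] = 0
        else:
            scores[word] = round(max_score * math.log1p(n) / math.log1p(top))
    return scores


# Stand-in for a large text file opened with open(path, encoding="utf-8").
corpus = io.StringIO("the cat sat\nthe cat ran\nthe end\n")
counts = count_words(corpus)
scores = frequency_scores(counts, ["THE", "CAT", "END", "ZYZZYVA"])
```

A small integer per word is only one possible encoding; how the values are actually stored alongside the wordlist is part of what the project needs to work out.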
Requirements
- The biggest requirement for this project is a love of words and a certain amount of comfort with uncertainty. I will expect the intern to do some independent research and exploration.
- This project will mostly be in Python, and will involve a fair amount of data analysis as well as coding.
- Some low-level C programming knowledge is a strong plus, as is the ability to use a hex editor. The current wordlist is stored in a custom data structure.
- This project will involve working with some large data sets — possibly as big as 20 GB. A relatively fast machine with sufficient disk space will be required, as well as the ability to download large files.
Communication
Primarily Matrix. The main Crosswords channel can be found at https://matrix.to/#/#crosswords:gnome.org
We will also use GitLab issues and email as appropriate.
Occasional video conferencing when necessary.