100% CPU usage during photo import in large collection (no indexes in DB (!))
@socketpair
Submitted by Марк Коренберг Link to original bug (#742670)
Description
I have straced shotwell, see that man-many-many reads on sqlite DB file. Next, I figure out what cause that. I choose GDB and see that many times Shotwell spend in function detecting duplicates.
It executes query like
SELECT id FROM PhotoTable WHERE filename=? OR ((thumbnail_md5=? or md5=?) and file_format=?))
(not all conditions are always used). Next I see indexes on that table, and discover only ONE (!):
CREATE INDEX PhotoEventIDIndex ON PhotoTable (event_id);
Next, I added indexes by hand CREATE INDEX PhotoEventIDIndex ON PhotoTable (event_id);
sqlite> CREATE unique INDEX mmarkk1 on PhotoTable (filename);
sqlite> CREATE unique INDEX mmarkk2 on PhotoTable (thumbnail_md5, file_format); Error: UNIQUE constraint failed: PhotoTable.thumbnail_md5, PhotoTable.file_format; <--- WTF?, but that is another history...
sqlite> CREATE INDEX mmarkk2 on PhotoTable (thumbnail_md5, file_format);
sqlite> CREATE unique INDEX mmarkk3 on PhotoTable (md5, file_format);
sqlite> CREATE unique INDEX mmarkk4 on PhotoTable (md5, thumbnail_md5, file_format);
These indexes are superfluous, but guarantee help me in my case. After that, Shotwell use 100% IO of my SD-card (instead of 100% CPU, as in situation before adding indexes).
I have DB of size 13 MB. After adding indexes it become 22 MB :). It is acceptable for my 200-GB collection of photos.
sqlite> select count(*) from PhotoTable; 36403
Also, I use Shotwell 5+ years, and maybe bug in DB upgrade procedures, that did not add indexes...
Version: 0.20.x
Resolution: RESOLVED FIXED