Cuneiform: Split text areas before OCR
Cuneiform tends to stop reading a page when it reaches a large non-readable area. As a result, when Cuneiform is used, not all of the keywords are extracted.
A way to work around this problem would be to split the text areas prior to OCR.
For instance, unpaper can do that (ocrfeeder uses it).
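
To illustrate the idea, here is a minimal sketch of one simple way to split a page into text areas: cut it into horizontal bands wherever several consecutive rows contain no ink. It is not how unpaper or ocrfeeder work internally. The function name `split_text_bands`, the `min_gap` parameter and the 0/1 pixel representation are assumptions made for this example.

```python
def split_text_bands(rows, min_gap=2):
    """Return (start, end) row ranges (end exclusive) that contain ink.

    rows:    list of pixel rows, each a list of 0 (background) / 1 (ink).
    min_gap: number of consecutive blank rows needed to separate two bands.
    Both names are hypothetical; a real page would be binarized first.
    """
    bands = []
    start = None   # first row of the band currently being built
    blank_run = 0  # consecutive blank rows seen since the last inked row
    for y, row in enumerate(rows):
        if any(row):
            if start is None:
                start = y
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= min_gap:
                # Close the band just after its last inked row.
                bands.append((start, y - blank_run + 1))
                start = None
                blank_run = 0
    if start is not None:
        # Page ended inside a band; drop any trailing blank rows.
        bands.append((start, len(rows) - blank_run))
    return bands


page = [
    [0, 0],
    [1, 0],
    [1, 1],
    [0, 0],
    [0, 0],
    [0, 1],
    [0, 0],
]
bands = split_text_bands(page, min_gap=2)
```

Each band could then be cropped and passed to the OCR engine separately, so that a non-readable area in one band would not prevent the others from being read.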