@@ -11,24 +11,24 @@ a number added in case of name collision. `idx` can be a string too.
 In every folder you have:
 
 * For image documents:
-  * `paper.<X>.jpg` : A page in JPG format (X starts at 1). It's the original page (as scanned or imported).
-  * `paper.<X>.edited.jpg` : The page after post-processing and editing. (Paperwork >= 2.0 only)
-  * `paper.<X>.words` (optional) : A
+  * `paper.<X>.jpg`: A page in JPG format (X starts at 1). It's the original page (as scanned or imported).
+  * `paper.<X>.edited.jpg` (optional): The page after post-processing and editing. (Paperwork >= 2.0 only)
+  * `paper.<X>.words` (optional): A
     [hOCR](https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0/preview)
     file, containing all the words found on the page using the OCR (optional, but required for indexing ; can be regenerated with the options "Redo OCR (...)").
-  * `paper.<X>.thumb.jpg` (optional, generated and updated automatically) : A thumbnail version of the page (faster to load).
+  * `paper.<X>.thumb.jpg` (optional, generated and updated automatically): A thumbnail version of the page (faster to load).
     Starting with Paperwork 2.0, only paper.1.thumb.jpg is used.
-  * labels (optional) : a text file containing the labels applied on this document (text + label color)
-  * `extra.txt` (optional) : extra keywords added by the user
+  * labels (optional): a text file containing the labels applied on this document (text + label color)
+  * `extra.txt` (optional): extra keywords added by the user
 * For PDF documents:
-  * `doc.pdf` : the document
-  * `labels` (optional) : a text file containing the labels applied on this document
-  * `extra.txt` (optional) : extra keywords added by the user
-  * `paper.<X>.words` (optional) : A
+  * `doc.pdf`: the document
+  * `labels` (optional): a text file containing the labels applied on this document
+  * `extra.txt` (optional): extra keywords added by the user
+  * `paper.<X>.words` (optional): A
     [hOCR](https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0/preview)
     file, containing all the words found on the page using the OCR. Some PDF contains crap instead
     of the real text, so running the OCR on them can sometimes be useful.
-  * `paper.<X>.edited.jpg` : The page after editing. (Paperwork >= 2.0 only)
+  * `paper.<X>.edited.jpg` (optional): The page after editing. (Paperwork >= 2.0 only)
   * `page_map.csv` (optional): Created if the user move the page inside the PDF file. doc.pdf is not actually modified,
     only this mapping file. Pages are reordered on-the-fly when the document is displayed or exported. (Paperwork >= 2.0 only)
 
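
To illustrate the folder layout this hunk documents, here is a minimal Python sketch that inspects one document folder and reports which of the listed files are present. It is not Paperwork's own code: the function name `describe_document`, the returned dictionary keys, and the rule "a folder without `doc.pdf` is an image document" are assumptions made for the example. Only the file names come from the text above.

```python
import re
from pathlib import Path

# Matches the per-page files named in the documentation:
# paper.<X>.jpg, paper.<X>.edited.jpg and paper.<X>.words.
# Thumbnails (paper.<X>.thumb.jpg) are deliberately not matched.
PAGE_FILE_RE = re.compile(r"^paper\.(\d+)\.(jpg|edited\.jpg|words)$")


def describe_document(folder):
    """Summarize one document folder (hypothetical helper, not a Paperwork API)."""
    folder = Path(folder)
    found = {"jpg": [], "edited.jpg": [], "words": []}
    for entry in folder.iterdir():
        match = PAGE_FILE_RE.match(entry.name)
        if match:
            # Group 1 is the page number X, group 2 is the file kind.
            found[match.group(2)].append(int(match.group(1)))
    return {
        # Assumption: a folder holding doc.pdf is a PDF document,
        # anything else is treated as an image document.
        "type": "pdf" if (folder / "doc.pdf").is_file() else "image",
        "original_pages": sorted(found["jpg"]),
        "edited_pages": sorted(found["edited.jpg"]),
        "ocr_pages": sorted(found["words"]),
        "has_labels": (folder / "labels").is_file(),
        "has_extra_keywords": (folder / "extra.txt").is_file(),
        "has_page_map": (folder / "page_map.csv").is_file(),
    }
```

For a PDF document, `original_pages` stays empty because the pages live inside `doc.pdf`. The sketch does not open the PDF or parse `page_map.csv`, whose format is not described here.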