idea: Loading metadata from DIN 5008 formatted letters

Background information:

This is probably only really relevant for German users, but in Germany we have a document formatting standard called DIN 5008 which defines some things like how sender/return addresses on letters are formatted and how the additional metadata information at the top right of a letter is covered.

At least the sender + recipient information would be pretty accurate on most professional (e.g. usually most businesses or government) letters as that is placed kinda accurately to fit in the window on the letter envelopes, so the post can see where to send it.

The norm defines

the location of areas (header, recipient, subject, etc) and page margins
some styling (e.g. allowed font sizes and families)
some standard abbreviations
formatting of addresses, dates, etc.

and areas I think are interesting for paperwork and especially the search:

subject (automatic document title that's easier to search for if you have a lot of documents)
sender information
- could maybe be grouped in searches, one use-case would be searching your old address when you move and have all companies that ever sent anything to your address be grouped
- could offer buttons to quickly start writing a mail or letter to the sender of documents (though that's probably out of scope for paperwork)
- could be imported/synced with CardDav (even more out of scope)
letter send date (could ask to use it, see discussions in #338)
recipient (for environments where there is more than one person/entity using paperwork on the same account, although that's probably a rare case)

Sample:

and template by the German post: https://www.deutschepost.de/de/b/briefvorlagen/normbrief-din-5008-vorlage.html

What I suggest from all of this:

first there needs to be either some kind of algorithm that detects where grouped areas in the cover page are or search for them explicitly at hardcoded (norm standardized) locations
have some more additional metadata; in parallel to #623 having the title be auto-detected by format would allow to automatically parse a lot of letters and avoid work by the user

Disadvantages of this: this might be making paperwork too much like an E-Mail client, just with letters / PDFs. As these days most letters also have E-Mail equivalents being sent, having an alternative tool to import the few missing letters into E-Mails might make more sense.

I would guess other countries also have standards where to put recipients and other information on the letters, so having some generic way where it checks for these sections would be useful.