@kaia @Suiseiseki all the PDFs were created with the same .docx template, so I wrote a program that just reads all the characters with their size and tries to guess whether what it's reading is a headline or not. It skipped the Impressum and had a bunch of other special handling. Tables were still garbled nonsense, hyphens everywhere, page numbers in the text etc., but the embeddings turned out OK.
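A rough sketch of that kind of size-based guess, not the actual program: it assumes the PDF extraction step already yields `(character, font_size)` pairs in reading order, treats the most common size as body text, and flags noticeably larger runs as headlines. The name `guess_headlines` and the `ratio` threshold are made up for illustration.

```python
from collections import Counter
from itertools import groupby


def guess_headlines(chars, ratio=1.15):
    """Split (character, font_size) pairs into same-size runs.

    Returns a list of (text, is_headline) tuples. A run counts as a
    headline when its size is at least `ratio` times the body size.
    The threshold is an assumption, not a value from the original post.
    """
    chars = list(chars)
    if not chars:
        return []

    # Assume the most frequent size among visible characters is body text.
    sizes = Counter(size for ch, size in chars if not ch.isspace())
    body_size = sizes.most_common(1)[0][0] if sizes else chars[0][1]

    runs = []
    # Consecutive characters with the same size form one run.
    for size, group in groupby(chars, key=lambda pair: pair[1]):
        text = "".join(ch for ch, _ in group).strip()
        if text:
            runs.append((text, size >= body_size * ratio))
    return runs


if __name__ == "__main__":
    sample = [(c, 18.0) for c in "Intro"] + [(c, 11.0) for c in " Some body text here."]
    for text, is_headline in guess_headlines(sample):
        print("HEADLINE" if is_headline else "body    ", text)
```

Real PDFs would still need the special handling mentioned above (tables, hyphenation, page numbers); this only covers the headline guess.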
Got my hands on the source .docx files a while later, and those I could actually get the text from the way I wanted, once I stopped reading Microsoft's documentation for the format and just looked through the files in a text editor to figure out how to parse them.
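For anyone wanting to try the same thing, here is a minimal sketch under a few assumptions, not the parser described above. A .docx is a ZIP archive whose main body lives in `word/document.xml`, with paragraphs as `w:p` elements and text in `w:t` elements. The function name `docx_paragraphs` is hypothetical, and the sketch ignores styles, tables layout, headers, and footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix in document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path_or_file):
    """Return the non-empty paragraphs of a .docx as plain strings.

    Accepts a file path or a binary file-like object.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    paragraphs = []
    for p in root.iter(W + "p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

Called as `docx_paragraphs("some_file.docx")` (the path is a placeholder), it returns one string per paragraph. Paragraphs nested inside text boxes may show up twice with this approach, so a real parser would need to handle those separately.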