You should check your PDF document's fonts first with the help of the pdffonts utility. It is part of the Xpdf package for Windows and can be used without installing, straight from a DOS box (command prompt).

In order to successfully extract text (or copy'n'paste it) from a PDF, the font should either use a standard encoding (not a Custom one), or it should have a /ToUnicode table associated with it inside the PDF.

pdffonts returns some basic information about the fonts used by your PDF.

Example output:

    $ pdffonts -f 3 -l 5 sample.pdf
    name                 type          encoding    emb sub uni object ID
    -------------------- ------------- ----------- --- --- --- ---------
    IADKRB+Arial-BoldMT  CID TrueType  Identity-H  yes yes yes     10  0
    SSKFGJ+ArialMT       CID TrueType  Custom      yes yes no      11  0

The command above asked for the fonts used in the page range 3 (first page to check) to 5 (last page to check).

In the above case, both fonts are embedded as subsets (indicated by the XYZABC+ prefixes to their names, as well as by the yes in the emb and sub columns).

The font SSKFGJ+ArialMT uses a custom encoding, but the PDF has no /ToUnicode table for this font, as indicated by the no entry in the column headed uni. Hence it is not easy to extract text that is shown with this font (extraction would require manual reverse engineering - but then you can also just "read" the PDF pages).

You should also check first whether copy'n'pasting of text works if you use a simple text file as the target (not an MS Word document).
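The font check described above can be scripted if you have many PDFs to inspect. The following is only a minimal sketch in Python, not part of pdffonts itself: the function names (run_pdffonts, parse_pdffonts, hard_to_extract) are made up for illustration, and the parser assumes the column layout shown in the example output (name, type, encoding, emb, sub, uni, object ID). Other pdffonts builds may print different columns, in which case the slicing would need adjusting. It also assumes pdffonts is on your PATH.

```python
import subprocess

# Sample rows copied from the example output above.
SAMPLE = """\
IADKRB+Arial-BoldMT CID TrueType Identity-H yes yes yes 10 0
SSKFGJ+ArialMT CID TrueType Custom yes yes no 11 0
"""


def run_pdffonts(pdf_path, first_page, last_page):
    """Run pdffonts on a page range and return its text output.

    Assumes the pdffonts executable can be found on PATH.
    """
    result = subprocess.run(
        ["pdffonts", "-f", str(first_page), "-l", str(last_page), pdf_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_pdffonts(text):
    """Parse pdffonts rows into dictionaries (assumed column layout)."""
    fonts = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) < 8 or parts[0] == "name" or set(line.strip()) <= {"-", " "}:
            continue
        # Read from the right, because the type column may contain spaces
        # (for example "CID TrueType"). The last two tokens are the object ID.
        encoding, emb, sub, uni = parts[-6:-2]
        fonts.append({
            "name": parts[0],
            "type": " ".join(parts[1:-6]),
            "encoding": encoding,
            "emb": emb == "yes",
            "sub": sub == "yes",
            "uni": uni == "yes",
        })
    return fonts


def hard_to_extract(fonts):
    """Names of fonts with a Custom encoding and no /ToUnicode table."""
    return [f["name"] for f in fonts
            if f["encoding"] == "Custom" and not f["uni"]]


if __name__ == "__main__":
    for name in hard_to_extract(parse_pdffonts(SAMPLE)):
        print("Text extraction is likely to fail for font:", name)
```

To check a real file, you would pass the output of run_pdffonts("sample.pdf", 3, 5) to parse_pdffonts instead of SAMPLE. Treat the result as a hint only: a font that passes this check can still fail to copy'n'paste correctly for other reasons.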