
Korean (non-latin) font cannot be extracted from exported PDF if Embed subsets is checked



I have a document (Font Embed Example.afpub) using Noto Sans Korean [KR] Bold (Google Fonts) and Noto Sans (Google Fonts) but still only using Latin characters.

image.thumb.png.a54383ca24be9eb467bc7fec9e567be6.png

Both fonts are listed as 'Installable' in Windows font settings:

image.png.acec1e9627f52c37c2cde93fa25b4c45.png

However, when exporting the document as a PDF from Publisher with the Subset fonts option checked, the exported PDF (Font Embed Example (subset).pdf) is missing all styles (i.e. bold and regular) of the KR font only:

image.png.3f0192a1eadc5177e1afe92cff310046.png

The regular Noto Sans font however embeds perfectly well:

image.thumb.png.369cdfcb3b135f1273d6a400b7d5a655.png

image.png.7724ebd16eaa77866c444566f8733f01.png

Unchecking Embed subsets fixes the issue (Font Embed Example (non-subset).pdf), but the file size grows from 1MB to almost 8MB because the Korean font is quite large, and that is too large for some file upload limits. I think this may be a bug in the PDF exporter. I haven't tested other non-Latin fonts, but they may give similar results.

I'm running the latest Publisher version on Windows 11. The same issue also occurred on my laptop running Windows 10.

Edited by Luca Huelle

  • Staff

Hi Luca, welcome to the forums!

This seems to be a problem only with Adobe software, I'm afraid. The font works fine elsewhere, and there are many reports of failures with it in many of Adobe's apps.

https://www.google.com/search?q=Noto+Sans+adobe+problem&rlz=1C1ONGR_en-GBGB1005GB1005&oq=Noto+Sans+adobe+problem&aqs=chrome..69i57j33i160l2.6633j0j4&sourceid=chrome&ie=UTF-8

Lee


Hi, thank you for having a look at the issue!

I don't think any of the issues I looked through in those search results are really related to my bug. The people reporting them were using the wrong fonts, got garbled output, had issues with the Adobe Fonts platform, or eventually discovered it was actually a bug in InDesign and not the PDF reader.


Could you give some examples of "elsewhere"? I've tried viewing the PDF in Microsoft Edge, Firefox, Sejda and PDF Chef (online). None displayed the font correctly; some managed a few of the individual letters, but never all of them.

image.png.704ee6d153a02ebc4eb4861bc748010a.png

Edited by Luca Huelle

9 hours ago, LeeThorpe said:

This seems to be a problem only with Adobe software I'm afraid

Ah, no.
There are definitely some problems in this PDF.
Note: I did check the fonts and did not find any issues.

In PDF-XChange Editor it displays the Noto Sans KR Bold as Regular, and shows both Noto Sans KR Regular & Bold as not embedded. So it is having trouble figuring out the embedded fonts.

Ahh ... as I am writing I see @Luca Huelle just posted above.

Nitro Pro displays this:
Nitro.Pro-displays-this.thumb.png.2251d972e07ec651b172429c3eaff30f.png

FlexiPDF (and InfixPDF) display this:
FlexiPDF-displays-this.thumb.png.421b8a2338e3404ae998417f8d264c26.png

FlexiPDF (and Infix) have a useful feature to remap characters.
You can also use it to look at the codes behind the individual characters.
Here is the font that does display properly - Noto Sans Italic.
The codes info for the selected D is displayed below the table.
FlexiPDF_Noto.Sans-Italic-codes.thumb.png.fb4cd805709bbc36b7346f9611618397.png

Here is the same table for the Noto Sans KR Bold embedded font.
The B is selected, but it cannot connect the glyph to the codes.
The other characters are there in the table, but nothing displays.
FlexiPDF_Noto.Sans-KR-Bold-codes.thumb.png.21f14e3e0272586ebf4f97e5656458b2.png


If I open the PDF in PDF Debugger and look at the fonts in the resources,
the Noto Sans Italic shows Glyphs: 14, which is the correct number for that embedded font.
The Noto Sans KR Regular and Bold both show Glyphs: 24861, when the numbers of glyphs actually used are 13 and 12.
Not all of those are listed in the table, but most of the glyphs that are listed are not used in the PDF.
A subset should contain only the glyphs used.
So the info in the PDF is wacko.
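
As a rough sketch of that sanity check (not anything the PDF tools do internally): a subset should report roughly one glyph per distinct character used, plus the `.notdef` glyph. The function names, the sample text, and the `slack` allowance are all made up for illustration, and the one-character-one-glyph assumption ignores ligatures and other substitutions.

```python
def expected_subset_glyphs(text: str) -> int:
    """Rough glyph count for a subset: distinct characters plus .notdef.

    Assumption: one glyph per character (no ligatures or alternates).
    """
    return len(set(text)) + 1


def subset_count_is_plausible(reported: int, text: str, slack: int = 2) -> bool:
    """True if the glyph count a PDF tool reports is near the expected count."""
    return reported <= expected_subset_glyphs(text) + slack


# Hypothetical sample text; the real document's text will differ.
sample = "Bold sample text"
print(expected_subset_glyphs(sample))            # distinct characters + .notdef
print(subset_count_is_plausible(13, sample))     # a count in the right range
print(subset_count_is_plausible(24861, sample))  # the count reported for the KR fonts
```

A count in the tens of thousands for a dozen characters is what suggests the font was not really reduced to a subset, or that its tables still describe the full font.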

To test if OTF vs. TTF is an issue, I converted the Noto Sans KR Regular and Bold fonts from OTF to TTF.
The TTF fonts appear to work in APub when exported to PDF.

In PDF-XChange Editor the OTF Noto Sans KR Bold still displays as Regular.
But the TTF versions both work.
APub-test-in-PDF-Exchange.png.ed9f1f8a2d2f430599e0ccd3cc86a417.png

In FlexiPDF, the OTF Bold is still bad, and the TTF fonts work.
APub-test-in-FlexiPDF.png.5d0b59d636c8bc7d69f7be80852e0c01.png

Note that I also exported the same test text to PDF from Word and LibreOffice.
Both PDFs display correctly in all PDF editors, the glyph counts are
reasonable when examined in PDF Debugger, and I can see all the correct
codes/glyphs in the FlexiPDF Remap fonts dialog.

So it appears something goes wrong when the OTF is sub-setted and embedded.
It may be that the CIDs are not correct for a sub-setted font (they look like the ones from the original full font).
That would explain why embedding the full, not-sub-setted font works.

Regardless, something is not right which is probably why Acrobat is balking.


Thank you so much for the in-depth investigation, LibreTraining! Hopefully this gets a step closer to the root cause.

So am I right in saying that it does seem to be something to do with sub-setting and embedding an OTF font? Does this happen with other OTF fonts, or just with Noto KR and other international fonts, which have a larger number of glyphs than perhaps expected because they include non-Latin glyphs?


There have been problems with sub-setting for a long time.
So I do not think the issue is these fonts, or non-Latin characters.
OTF and TTF fonts are embedded differently so it may just be the OTFs.
I do not remember the details now, but this has come up over and over.
And it seems that sometimes it works, and sometimes it does not.
The usual workaround is to turn off sub-setting.
But with a font this big that is kinda painful.

It would take a lot more testing to narrow down the cause.
But I suspect it is related to the CIDs not being handled correctly.
Inside PDFs the characters are identified by a Character ID (CID).
And those are not necessarily the same as the Glyph ID inside the font.
When a sub-setted font is embedded in a PDF it is really a little mini-font.
In your PDF the TTF font embedded has assigned CIDs 1-14, for those 14 characters.
The embedded OTF fonts appear to still have the CIDs from the original full font.
I do not know if that is correct, but the PDFs produced by other applications do not have this. They have low, consecutive CIDs like the TTF example above.
So I am guessing the issue is related, since you can see in the displayed PDF that the correct glyphs are not being connected to the correct codes.
And embedding the full font works (so it is not a font problem).
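
To make that guess concrete, here is a toy sketch in Python of how a mismatch between the codes written in the page text and the IDs in the embedded mini-font would make glyph lookups fail. This is not how Publisher or any PDF library works internally; the function names and the glyph IDs are invented for illustration.

```python
def subset_renumbered(used_gids):
    """Mini-font table with low consecutive CIDs: {new_cid: original_gid}."""
    return {cid: gid for cid, gid in enumerate(sorted(set(used_gids)), start=1)}


def subset_keeping_ids(used_gids):
    """Mini-font table that keeps the full font's IDs: {gid: gid}."""
    return {gid: gid for gid in sorted(set(used_gids))}


def look_up(codes, table):
    """Resolve each code from the page text; None means no glyph was found."""
    return [table.get(code) for code in codes]


# Hypothetical glyph IDs for three characters in a large CJK font.
used = [35, 812, 20417]

renumbered = subset_renumbered(used)  # {1: 35, 2: 812, 3: 20417}
kept = subset_keeping_ids(used)       # {35: 35, 812: 812, 20417: 20417}

# Codes and table agree: every glyph is found.
print(look_up([1, 2, 3], renumbered))
print(look_up(used, kept))

# Codes use one numbering, the table the other: nothing is found.
print(look_up(used, renumbered))
```

Either numbering can work as long as the page text and the embedded font agree on it. The symptom in the viewers above looks like the last case, where the two sides disagree.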

All of this PDF stuff is very complex, and I quickly get kinda lost when rummaging around inside PDFs. Reading the specs is quite an ordeal (and confusing).
But I am fairly sure something is wrong here.


Thank you for your explanation - clearly there's a lot to unpack here and it's all very complicated 😅! But still thank you for your explanations 🥰

If you could provide the TTF fonts that'd be great since that seems to be a suitable workaround for now - assuming it converted the Korean characters suitably as well (which I guess it might not have done given that might have been the whole reason the newer otf type was used in the first place... but it's worth a shot regardless).


7 hours ago, Luca Huelle said:

If you could provide the TTF fonts that'd be great since that seems to be a suitable workaround for now - assuming it converted the Korean characters suitably as well (which I guess it might not have done given that might have been the whole reason the newer otf type was used in the first place... but it's worth a shot regardless).

OK. Give these a try.
I only converted the Regular and Bold.

Noto Sans KR TTF fonts.zip

Let me know if you have any issues.
I did change the name to include TTF, so you can have them installed at the same time as the OTF fonts.

