
Find and replace strange letters



Greetings team

I have a small document that I pulled in as a PDF (if anyone has a clean way to convert or import Corel files, I'm all ears). The client asked for the font to be changed from Calibri to Arial, and with that change every "ti" in the document was replaced with a "ĕ". I wasn't too worried, since a quick find and replace should do the trick. But the character will not copy to the clipboard, which is more annoying and leaves me wondering where this alien character is coming from. I opened Character Map in Windows and found the little bugger in the Arial font. Great, now to find and replace. But no, the search doesn't find this character, so I have no idea what this odd little letter is or why it has sneaked into my text.

There was a little twist. As I finished typing the above, I went in and fixed the first ones by hand, and then found the usual suspect: in the rest of the document the "ti" had been replaced with the combined (ligature) character. To be safe I replaced those as well, and there are a lot fewer red underlines now. It makes me wonder whether this is something that happens a lot. Can't we get a one-button fix for bigger documents, a "fix text import issues" command? Hmmmm, I think it again requires some kicking around by the community.
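For anyone who can get the imported text out as plain text, the find-and-replace described above can be sketched outside the application. This is only a sketch under assumptions: it works on an ordinary Python string, the `REPAIRS` table and the function names are made up for illustration, and it does not use any Affinity feature or API. Listing the non-ASCII characters with their code points first is a way to pin down what the stray character actually is when it will not copy or search.

```python
import unicodedata
from collections import Counter

# Hypothetical mapping: stray character -> the letters it replaced.
# The "ti" case is the one described in this post; extend as needed.
REPAIRS = {"ĕ": "ti"}


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"


def find_suspects(text: str) -> Counter:
    """Count every non-ASCII character so that stray ones stand out."""
    return Counter(ch for ch in text if ord(ch) > 127)


def repair(text: str, repairs: dict[str, str] = REPAIRS) -> str:
    """Replace each stray character with the letters it stands for."""
    for bad, good in repairs.items():
        text = text.replace(bad, good)
    return text


if __name__ == "__main__":
    sample = "The posiĕon of the secĕon"
    # Show each suspicious character with its count and code point.
    for ch, count in find_suspects(sample).items():
        print(count, repr(ch), describe(ch))
    print(repair(sample))
```

If the character that shows up in the listing has a different code point from the one picked in Character Map, that would explain why searching for the Character Map version finds nothing.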

text error.png

Strange letters.afpub


Hypothesizing here:

The "ti" combination would commonly produce a ligature of some sort. When the PDF file was written, chances are the ligature was given a single code point, which happens to map to that particular variant of the letter "e" in whatever encoding the PDF claimed to be in. If the font was embedded with the document, the ligature could simply have been assigned to an unused code point and would have rendered correctly (which is the true purpose of a PDF file: to look good as-is, not to be easy to edit; it is "digital paper", not a proper editing format). But when the file was imported into the Affinity product, which does not use the embedded font, the code would have been read as the "e" character, and that character was used instead of the ligature that was encoded in the PDF.

Note that this form of substitution would make sense if it allows the application to use a single-byte encoding for the text instead of a multi-byte one, as this in theory saves space and reduces the size of the PDF file with no impact to its visual appearance - again, the primary purpose of a PDF file which is not produced for reasons of accessibility or searching.  My guess is that if you opened that PDF in Preview / Acrobat Reader / some other PDF viewer with search capability and tried searching for one of the words that has "ti" in it, it probably wouldn't come up in the search either.
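To make the guess above concrete, here is a toy model of it. Everything in it is an assumption for illustration: the code `0x80`, both lookup tables, and the names are invented, and it is not how any particular PDF writer or Affinity itself is known to work. It only shows how the same stored codes can read as "ti" through one table and as a stray letter through another.

```python
# Toy model of the hypothesis: a PDF text run is a sequence of codes, and
# what those codes mean depends on which table is used to interpret them.
# Both tables and the code 0x80 are made up for illustration.

ASCII_PRINTABLE = {code: chr(code) for code in range(0x20, 0x7F)}

# What the embedded subset font draws: code 0x80 is the "ti" ligature glyph.
EMBEDDED_FONT = {**ASCII_PRINTABLE, 0x80: "ti"}

# What an importer might assume when it ignores the embedded font and
# interprets the same code through some standard encoding (hypothetical).
IMPORTER_GUESS = {**ASCII_PRINTABLE, 0x80: "ĕ"}


def decode(codes, table):
    """Turn a sequence of codes into text using the given table.

    Codes missing from the table become U+FFFD (replacement character).
    """
    return "".join(table.get(code, "\uFFFD") for code in codes)


# "action" as the exporting application might have stored it:
# a, c, <ti ligature>, o, n
ACTION = [0x61, 0x63, 0x80, 0x6F, 0x6E]

if __name__ == "__main__":
    print(decode(ACTION, EMBEDDED_FONT))   # how the page looks in a viewer
    print(decode(ACTION, IMPORTER_GUESS))  # what lands in the imported text
```

Under this model the stored codes are the same in both cases, which is why the page looks right in a viewer while the imported or searched text does not.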


7 hours ago, fde101 said:

If the font was embedded with the document, the ligature could simply have been assigned to an unused code point and would have rendered correctly (which is the true purpose of a PDF file: to look good as-is, not to be easy to edit; it is "digital paper", not a proper editing format). But when the file was imported into the Affinity product, which does not use the embedded font, the code would have been read as the "e" character, and that character was used instead of the ligature that was encoded in the PDF.

Thanks, yes, that all makes a lot of sense. In this case (and in many others, I assume) the PDF ends up being the only way to get a reasonable level of layout into Publisher, so in that sense it's about the quickest way to fix it. The fact that I couldn't copy and paste the offending character was a bit of a pain. Perhaps it should be reported as a bug for review, since the system font was missing that character. Hmmmmm. I think this may be more of a system error for which there may be no workaround.


42 minutes ago, JeffreyK said:

the PDF ends up being the only way to get a reasonable level of layout into Publisher

Check the export settings in the originating program to see if there is an option to embed the complete font instead of a subset, or perhaps not to embed the fonts at all, given that the embedded fonts aren't used by Publisher anyway. With those settings it might export something that imports a bit more cleanly.

 

43 minutes ago, JeffreyK said:

The fact that I couldn't copy and paste the offending character was a bit of a pain.

Yeah, that is one piece of this that I can't really explain.


  • 6 months later...

I recently encountered the same problem. "ti" became "G" and "tt" became "l'". It happened when I exported to create a PDF. Because I was working with a 300-page document, this was a hassle. It turned out that several chapters written quite some time ago were in the Calibri font and were scattered throughout. Was this ever resolved?

 

Michele

2021-03-20_12-45-00.png


2 hours ago, 'Chele said:

I recently encountered the same problem. "ti" became G and tt became l'. Happened when I exported to create a pdf.

Typically this happens when opening a PDF as the document source, not when exporting to PDF. What was the document's original source?


  • 2 weeks later...

I have had to turn off standard ligatures in the typography controls to prevent exported PDF files from having ligatures replaced with garbage characters when the document is opened in another application. As far as I can tell, this must be done for each text frame on each page. My document was only 32 pages, and not all of them had text, so this was manageable. However, if the document were at all lengthy, this would be a nightmare. There needs to be a global way to turn off ligatures for the whole document. If this exists, I have not been able to find it yet.

Richard Bryan


7 minutes ago, R2B said:

I have had to turn off standard ligatures in the typography controls to prevent exported pdf files from having ligatures replaced with garbage characters when the document is opened in another application.

Or, you could click the More... button when exporting, and at the bottom, for Embed Fonts, choose All Fonts and make sure that Subset Fonts is not selected.

-- Walt
Designer, Photo, and Publisher V1 and V2 at latest retail and beta releases
PC:
    Desktop:  Windows 11 Pro, version 23H2, 64GB memory, AMD Ryzen 9 5900 12-Core @ 3.00 GHz, NVIDIA GeForce RTX 3090 

    Laptop:  Windows 11 Pro, version 23H2, 32GB memory, Intel Core i7-10750H @ 2.60GHz, Intel UHD Graphics Comet Lake GT2 and NVIDIA GeForce RTX 3070 Laptop GPU.
iPad:  iPad Pro M1, 12.9": iPadOS 17.4.1, Apple Pencil 2, Magic Keyboard 
Mac:  2023 M2 MacBook Air 15", 16GB memory, macOS Sonoma 14.4.1


12 minutes ago, R2B said:

There needs to be a global way to turn off ligatures for the whole document. If this exists, I have not been able to find it yet.

Assuming that you:

  • Are using Text Styles for all your text, and
  • Use the standard Text Styles supplied by Serif, or your own hierarchical set that all derive from a base style ("Base" in the default set of text styles)

then just edit the Base style and turn off the Standard Ligatures option there.

-- Walt

5 hours ago, walt.farrell said:

Assuming that you:

  • Are using Text Styles for all your text, and
  • Use the standard Text Styles supplied by Serif, or your own hierarchical set that all derive from a base style ("Base" in the default set of text styles)

then just edit the Base style and turn off the Standard Ligatures option there.

I'm not using any styles, so there is no base style to edit.

Richard Bryan
