
Strange behaviour when capturing hyphens from pdf


I have noticed that when importing PDF documents into Publisher, some hyphenated words "swallow up" their hyphens. This is best explained by reference to the attached files:

OrigP6.pdf is the PDF file that I imported into Publisher. I added yellow highlighting at the problem words.

OrigP6Beta1.9.0.874.afpub is how it appears after import. The yellow highlighting shows the hyphenated words where the hyphen is misplaced. This import did not have "favour editable text over fidelity" selected.

OrigP6Beta1.9.0.874-opt2.afpub is how it appears after import with "favour editable text over fidelity" selected. Interestingly, there are additional hyphens that did not import tidily, as well as other minor issues (generally highlighted) and one that was an issue in the first import but not in the second.

Page6ExPub.pdf is a PDF export of OrigP6Beta1.9.0.874.afpub.

I guess one should not expect perfect import of PDF documents, but this hyphenation issue is hard to spot and has led to a large amount of additional work to sort it out (especially as I used the maximum fidelity option, which puts a new paragraph at the end of every line).

This issue also exists in earlier versions of Publisher, but I include it here as this beta is directed at improving pdf import among other things.

OrigP6.pdf

OrigP6Beta1.9.0.874.afpub

OrigP6Beta1.9.0.874-opt2.afpub

Page6ExPub.pdf


My input PDF was also produced from QuarkXPress. I have tried capturing other PDF files (presumably not created in QuarkXPress), but have not been able to reproduce this hyphenation issue with them. All the Quark files I have captured did exhibit this phenomenon, however.

Given the importance of QuarkXPress in the desktop publishing domain, I hope that Affinity will see fit to address the issue. It might just be a simple fix!


  • 2 weeks later...

I've also tested this on the Mac (10.15.7 Catalina) versions of Affinity Publisher beta v1.9.0.887 and Affinity Designer beta v1.9.0.9.
The PDF was produced in QuarkXPress v13.0 (PDF version 1.4), and both apps show the same problem.

It appears the tracking on every hyphenated word gets screwed up when imported into either Publisher or Designer, but I guess they're both using the same PDF import code.
The document was auto-hyphenated in QuarkXPress, and after importing it into Publisher/Designer, the character tracking around every hyphen, even the ones that look OK, is a negative value somewhere between -300 and -500.

There is also a mysterious extra space inserted before every hyphen character, like "BE -FORE". It appears if you select the hyphenated word and set the tracking to zero.
Maybe Quark uses that "space-hyphen" combination to mark that the word has been auto-hyphenated, and then uses negative tracking to hide the extra space? Sounds a bit wild, but who knows?

Example: in the word "BE -\nFORE" the tracking values are zero except for the space and the paragraph break where they dive to -450.
Like this: "0,0,-450,0,-450,0,0,0,0". 
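To make the pattern concrete, here is a minimal Python sketch that flags the characters carrying suspicious tracking. It only works on the example values quoted above; the function name and the -300 cut-off (taken from the range I observed) are my own assumptions, and it does not read anything from Publisher or from the PDF itself.

```python
def flag_suspect_tracking(text, tracking, threshold=-300):
    """Return (index, char, tracking) for characters at or below the threshold.

    `threshold` is an assumed cut-off based on the observed -300..-500 range.
    """
    if len(text) != len(tracking):
        raise ValueError("need exactly one tracking value per character")
    return [
        (i, ch, t)
        for i, (ch, t) in enumerate(zip(text, tracking))
        if t <= threshold
    ]


# The example from the post: one tracking value per character.
text = "BE -\nFORE"
tracking = [0, 0, -450, 0, -450, 0, 0, 0, 0]

suspects = flag_suspect_tracking(text, tracking)
for index, char, value in suspects:
    print(index, repr(char), value)
```

With the example values, only the space and the paragraph break are flagged; the hyphen itself has zero tracking.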

You can easily produce the same "look" by typing some characters in a text frame in Publisher or Designer, selecting them, and setting the tracking to -450. They will overlap each other, which is the expected behaviour when setting crazy values.

The PDF I used as an example in my previous post displays correctly in Acrobat Reader, so I assume either AR caps the tracking values if they're too low, or they get screwed up in Publisher's/Designer's import code?


  • Staff

Hi @microjez,

Sorry for the delayed reply. 

Where did you get your font from? The one supplied by Adobe works fine: https://fonts.adobe.com/fonts/schoolbook?red=a . I tried opening the PDF, manually replacing the missing "Century Schoolbook" with "Schoolbook", and the hyphens were in the right place. I wonder if it's a problem with your version of the font.

 


The PDF I imported was produced in QuarkXPress and was inherited, so I do not know where the original publisher obtained his fonts, but I assume they came bundled with Quark.

I am assuming the font I used in my copy of Publisher was provided with the package. I have not downloaded any additional fonts.

I hope this helps!


Thanks for your post. My import screen looks a bit different (see uploaded image). Could this be because I have been using Century Schoolbook already? If Century Schoolbook does not come "with" Publisher, I have no idea where I got it from. I am sure Century Schoolbook was the original font, as I was able to view it in a trial version of Quark.

I watched your video with interest and noticed that you pointed out the Positioning and Transform area. Stepping through the string "bug-" character by character, I encountered a variety of values for VA, such as 0%, -537% and -80%. By manually changing these values I could retrieve the hyphen from "inside" the "g", although, as avolo mentioned, there is an extra space before the hyphen. So there is a way of fixing such occurrences afterwards, but only if I can spot them! It took my co-editor and me several rounds of proofreading to find them all, and while it is good that Publisher is versatile enough to tweak the text using Positioning and Transform, it is unfortunate that this needs to be done manually. Of course, one could just blame QuarkXPress!
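For anyone facing the same proofreading chore, a rough Python sketch along the following lines might help to list candidate words instead of hunting for them by eye. It assumes the imported text has been copied out as plain text with line breaks intact, and that the extra space before the hyphen survives that copy; I have not confirmed either, and the function name and sample text are made up for illustration.

```python
import re

# A word fragment, the stray space, a hyphen at the end of the line,
# then the rest of the word at the start of the next line.
SPACE_HYPHEN = re.compile(r"(\w+) -\n(\w+)")


def find_space_hyphen_breaks(text):
    """Return (line_number, rejoined_word) for each 'space-hyphen' line break."""
    hits = []
    for match in SPACE_HYPHEN.finditer(text):
        line_no = text.count("\n", 0, match.start()) + 1
        hits.append((line_no, match.group(1) + "-" + match.group(2)))
    return hits


# Hypothetical sample, not taken from the attached files.
sample = "It happened BE -\nFORE the bug -\nfix was made.\nA well-known case."
hits = find_space_hyphen_breaks(sample)
for line_no, word in hits:
    print(line_no, word)
```

Ordinary hyphenated words such as "well-known" are left alone, because they have no space before the hyphen and no line break after it.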

image.png


  • Staff

I found an interesting bug (now logged with our developers) with this, which seems to be caused by the afpub files you've attached. If I open any of your afpub files, the font somehow appears in my list of installed fonts, even though it's not installed. If I then try to open the PDF, I get the hyphen in the wrong place. Restarting the app seems to fix the issue, until I open that afpub file again.


  • 2 weeks later...
On 12/30/2020 at 6:02 PM, avolo said:

I have exactly the same problem in Affinity Publisher v1.8.6.

The PDF I'm importing was produced in QuarkXPress v13.0 (PDF version 1.4).

I have attached two pics showing the problem.

 

Original.png

Imported.png

This bug is still present in the Mac RC beta versions of Designer (1.9.0.9) and Publisher (1.9.0.902).
The font used in the example was plain old Times New Roman, and Acrobat Reader displays it correctly.
Will this be fixed for the 1.9 release?
