pdussart Posted March 10, 2019 Dear all, If I open a PDF file with Affinity 249 on Windows 10, hyphens disappear in some cases, while this issue does not exist with Adobe Reader, Microsoft Word, Edge, Scribus, etc. I think that during the open process, the character before the hyphen gets transformed into some combination with the previous character. See the input PDF file, the description of findings with screenshots, and the .afpub file in the attachments. Regards, Philippe Attachments: Affinity issue Open PDF.afpub, Affinity issue Open PDF.pdf, mania201962p04.pdf
Wosven Posted March 10, 2019 The hyphen is transformed into a "tiret conditionnel" (conditional hyphen), perhaps because, when opening a PDF, there is no way to distinguish one from a regular hyphen. Since we use a lot of hyphens in French, it would be better to import only regular hyphens and manually delete the ones added by hyphenation.
kenmcd Posted March 10, 2019 It appears that the APub PDF import does not recognize the soft hyphen there: U+00AD SOFT HYPHEN [SHY] (discretionary hyphen). I am not sure why the author would use a soft hyphen there, but that was the Unicode ID I got when I edited the PDF, copied the characters, and pasted them into a Unicode identifier tool. I then edited the original PDF and replaced that character with a regular hyphen, and it imported fine. This does not seem to be an APub error. The soft hyphen should only appear if it is needed at the end of a line, so I do not know how it ended up being the required dash/hyphen between those words in that PDF. @Wosven You posted while I was testing. How does a soft hyphen or conditional hyphen end up in the middle of a sentence in French? How is it used? I do not understand.
Wosven Posted March 10, 2019 @LibreTraining They don't, unless we need them and manually insert some to add a hyphenation that doesn't follow the rule set in the paragraph style (usually for aesthetic reasons). That is one reason to have them in a text: if the text flows differently, they disappear. Some can be imported from Word too, and they can appear/be visible in ID (why not in APub?), and we need to delete them. (I didn't check their Unicode value; I thought they were converted to regular hyphens when exporting as PDF or copying and pasting.) But they shouldn't be used in words where they need to be visible, like compound words: Jean-Marie (J.-M.), sac-à-main… Another strange point in this PDF is that there aren't any regular spaces either.
kenmcd Posted March 10, 2019 Yes, that is how I would expect soft-hyphens to be used. So it is the same as I am familiar with in English. No special use. But it is weird that a soft-hyphen appears in this compound word. I would expect to see a non-breaking hyphen, or a regular hyphen, but not a soft-hyphen. The PDF document info says it was created with Scribus 1.5.4. How can a soft-hyphen appear in the middle of a sentence? I'm soooo confused.
Wosven Posted March 10, 2019 Same effect with a PDF created with Scribus 1.5.3! Soft hyphens and non-breaking spaces instead of hyphens and spaces. Attachment: Document-1.pdf [Edit] Same problem with Scribus 1.5.4, with copied or typed text. Attachment: Document-2.pdf
kenmcd Posted March 10, 2019 Well, that is really odd. Apparently this has been an issue with Scribus for a while. I found an unanswered post in the Scribus forum from Nov. 2017: "Hyphen-minus changed to soft hyphen in PDF".
pdussart Posted March 11, 2019 Dear all, I have read different things about "soft hyphen" etc. I have no clue what this is. I have used this version of Scribus and the other tools mentioned without any issue with my German printers for PDF-X3 documents: about 20 magazines and four books. I even took the risk of using Publisher 145 for printing a book. It worked fine! (My old PC was on Windows 8.1 at the time; I got a new PC with Windows 10 in January.) As far as I am concerned, I just use the hyphen that shows on my keyboard. No Alt+ combination or anything "clever". Note: Same issue with Affinity Photo and Affinity Designer. I can only conclude that there is a "glitch" somewhere with the Affinity tools. Note 2: I am using common fonts like Arial, Times New Roman, Helvetica and Verdana. Nothing fancy. Regards, Philippe (Edited March 11, 2019 by pdussart: additional note about fonts used)
kenmcd Posted March 11, 2019 The problem is not an Affinity "glitch." The problem is that Scribus is producing PDFs with incorrect characters. Any application importing these broken PDFs is going to import these wrong characters faithfully, as written. The PDFs will print properly, but that does not mean they are structurally correct. There is no rational reason that I am aware of to justify replacing all spaces with non-breaking spaces.
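As a possible workaround for text that has already been extracted or copied from such a PDF, the two wrong characters mentioned in this thread could be mapped back to their ordinary counterparts. The following is a minimal Python sketch under stated assumptions, not a feature of Affinity or Scribus: the function name `normalize_pdf_text` and the sample string are hypothetical, and the blanket replacement assumes that every U+00AD was meant to be a visible hyphen and every U+00A0 a regular space, which would also destroy any legitimate soft hyphens or no-break spaces (French typography does use the latter before some punctuation marks).

```python
# Assumed mapping, based on the substitutions reported in this thread:
# soft hyphen -> hyphen-minus, no-break space -> regular space.
REPLACEMENTS = {
    "\u00ad": "-",  # SOFT HYPHEN -> HYPHEN-MINUS
    "\u00a0": " ",  # NO-BREAK SPACE -> SPACE
}

# Build the translation table once so it can be reused for every call.
_TABLE = str.maketrans(REPLACEMENTS)


def normalize_pdf_text(text):
    """Replace soft hyphens and no-break spaces with regular ones."""
    return text.translate(_TABLE)


# Hypothetical sample standing in for text copied from the affected PDF.
broken = "Jean\u00adMarie\u00a0porte\u00a0un\u00a0sac\u00adà\u00admain"
print(normalize_pdf_text(broken))
```

This only cleans up text after the fact; it does not change what the PDF itself contains.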
pdussart Posted March 16, 2019 Dear all, I have no way to display the control characters of PDF files, so I cannot say whether Scribus writes a PDF file correctly or not (see above). But I have another example of conversion issues, with the Word -> PDF -> Publisher process.

1) Create a page with Microsoft Word (bought Jan 2019, new PC on Windows 10). Set Word so that all control characters are shown.
2) Save the Word document as PDF (with the option to rasterize if not embedded) or as PDF/A. Both were tested; the results in Affinity look the same.
3) Open the PDF in Publisher 247 with the option "Group Lines in text frame".

The result is "strange": Publisher creates text frames that do not seem consistent with the Word control characters. Some additional logic seems to take place; e.g. in some cases (after 10 spaces) a new frame is created, regardless of any paragraph-end character. I think that Publisher should not create a new frame after a certain number of spaces, but after a paragraph end.

I would also suggest adding a third option when opening PDF files: "Convert blocks of text" into a "structured" set of frames. Ideally, the categories of blocks would be: heading, core text (in one or more columns), footer. This would be a useful option for publishing simple documents such as books or magazines containing "structured" pages of text. Alternatively, a third option could be "For each page, put all text in a single text frame", while respecting the original visual position as displayed by Acrobat Reader.

Also, a simple tool to merge text frames would be welcome:
- Ctrl + right-click on frames to select them, then merge.
- Option 1: Keep the X/Y positions of the text as displayed by Acrobat Reader.
- Option 2: Wrap the text in the merged frame.

Two PDF files are attached, plus 5 screenshots. Note: check the sequence of the file names. Regards, Philippe Attachments: Publisher PDF import Text Frames Example A.pdf, Publisher PDF import Text Frames Example B.pdf