
Hyphenation exceptions dictionary



I realize that we're limited to the Hunspell hyphenation dictionaries and that Hunspell doesn't allow user hyphenation dictionaries like it does for spelling, but would it be possible for us to create an Affinity-specific exceptions list in Settings (like AutoCorrect) that Affinity could use to override what Hunspell would do?

For example, Hunspell (US English) hyphenates cardiopulmonary as car~diopul~monary, which is awful; it should be car~dio~pul~mo~nary.

Users could enter words with tildes at the desired break points. And to prevent a word from being hyphenated, enter it without any breakpoints. For example, Hunspell hyphenates Facebook as Face~book but a product name should never be hyphenated.
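To make the request concrete, here is a minimal sketch of how such an exceptions list could work. It is only an illustration under stated assumptions: `stand_in_hunspell` is a hypothetical placeholder that returns the break points quoted in this post, not a real Hunspell or Affinity call, and all function names are made up.

```python
def parse_exception(entry):
    """Turn 'car~dio' into ('cardio', [3]); an entry without tildes means 'never break'."""
    parts = entry.split("~")
    breaks, pos = [], 0
    for part in parts[:-1]:
        pos += len(part)
        breaks.append(pos)
    return "".join(parts).lower(), breaks


def build_exceptions(entries):
    """Map each lowercased word to its user-specified break positions."""
    return dict(parse_exception(e) for e in entries)


def stand_in_hunspell(word):
    """Hypothetical stand-in for the dictionary hyphenator (hard-coded for this demo)."""
    known = {"cardiopulmonary": [3, 9], "facebook": [4]}
    return known.get(word.lower(), [])


def hyphenate(word, exceptions, fallback=stand_in_hunspell):
    """Use the user's exception if one exists, otherwise fall back to the dictionary."""
    key = word.lower()
    breaks = exceptions[key] if key in exceptions else fallback(word)
    pieces, start = [], 0
    for b in breaks:
        pieces.append(word[start:b])
        start = b
    pieces.append(word[start:])
    return "~".join(pieces)


exceptions = build_exceptions(["car~dio~pul~mo~nary", "Facebook"])
print(hyphenate("cardiopulmonary", {}))          # car~diopul~monary (dictionary result)
print(hyphenate("cardiopulmonary", exceptions))  # car~dio~pul~mo~nary (override)
print(hyphenate("Facebook", exceptions))         # Facebook (never hyphenated)
```

The lookup is case-insensitive so that one entry covers a word at the start of a sentence as well; whether a real implementation should do that is a design choice.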

Thank you

Download a free manual for Publisher 2.4 from this forum - expanded 300-page PDF

My system: Affinity 2.4.2 for macOS Sonoma 14.4.1, MacBook Pro 14" (M1 Pro)


On 2/19/2024 at 4:33 AM, MikeTO said:

I realize that we're limited to the Hunspell hyphenation dictionaries and that Hunspell doesn't allow user hyphenation dictionaries as it does for spelling, but would it be possible for us to create an Affinity-specific exceptions list in Settings (like Autocorrect) that Affinity could use to override what Hunspell would do?

For example, Hunspell (US English) hyphenates cardiopulmonary as car~diopul~monary, which is awful; it should be car~dio~pul~mo~nary.

Users could enter words with tildes at the desired break points. To prevent a word from being hyphenated, enter it without any breakpoints. For example, Hunspell hyphenates Facebook as Face~book, but a product name should never be hyphenated.

Thank you

The question is why are we limited to Hunspell hyphenation dictionaries? Are there no other free-to-use dictionaries that could be implemented in Affinity in a supplementary fashion?


10 minutes ago, Archangel said:

The question is why are we limited to Hunspell hyphenation dictionaries? Are there no other free-to-use dictionaries that could be implemented in Affinity in a supplementary fashion?

Because that's what Serif has chosen to use for spell checking and hyphenation.

Could they implement something else? Certainly, if they chose to.

-- Walt
Designer, Photo, and Publisher V1 and V2 at latest retail and beta releases
PC:
    Desktop:  Windows 11 Pro, version 23H2, 64GB memory, AMD Ryzen 9 5900 12-Core @ 3.00 GHz, NVIDIA GeForce RTX 3090 

    Laptop:  Windows 11 Pro, version 23H2, 32GB memory, Intel Core i7-10750H @ 2.60GHz, Intel UHD Graphics Comet Lake GT2 and NVIDIA GeForce RTX 3070 Laptop GPU.
iPad:  iPad Pro M1, 12.9": iPadOS 17.4.1, Apple Pencil 2, Magic Keyboard 
Mac:  2023 M2 MacBook Air 15", 16GB memory, macOS Sonoma 14.4.1


6 hours ago, Archangel said:

The question is why are we limited to Hunspell hyphenation dictionaries? Are there no other free-to-use dictionaries that could be implemented in Affinity in a supplementary fashion?

There might be other free options for specific languages but Hunspell is the only global dictionary system that is free to use, especially for hyphenation.



A bit more info for those interested.

Hyphenation systems use pattern matching rather than word lists because they were created in the early '80s when memory and storage were limited. I believe the patterns in the Hunspell US English hyphenation dictionary were generated from the 1966 Webster's Pocket Dictionary of 50K words. The UK English hyphenation dictionary was generated from an Oxford University Press dictionary with 115K words and it has more than twice as many patterns. Now that memory and storage are no longer limited, it would be better to generate many more patterns from larger dictionaries or switch to word lists but that would require paid licenses so we're stuck with what was created long ago. Many exceptions have been added to the US hyphenation dictionary to compensate for its limited patterns but far fewer have been needed for the UK dictionary. UK English hyphenation is pretty decent but US English hyphenation is somewhat limited.
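For anyone curious what "pattern matching" means here, the following is a rough sketch of the Liang-style scheme that TeX-derived hyphenation dictionaries use. The handful of patterns is a toy subset chosen for illustration and is not taken from the Hunspell files, and the function names and the `left_min`/`right_min` defaults are assumptions. Digits in a pattern score the gap between letters, the highest score at each gap wins, and an odd score permits a break.

```python
def parse_pattern(pattern):
    """Split 'hy3ph' into ('hyph', [0, 0, 3, 0, 0]): one score per gap around the letters."""
    letters, scores = "", [0]
    for ch in pattern:
        if ch.isdigit():
            scores[-1] = int(ch)
        else:
            letters += ch
            scores.append(0)
    return letters, scores


# Toy pattern set for illustration only.
PATTERNS = dict(parse_pattern(p) for p in
                ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"])


def break_points(word, patterns=PATTERNS, left_min=2, right_min=2):
    """Return the positions j where word[:j] | word[j:] is an allowed break."""
    padded = "." + word.lower() + "."          # dots mark the word boundaries
    levels = [0] * (len(padded) + 1)           # levels[i] scores the gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            scores = patterns.get(padded[start:end])
            if scores:
                for k, score in enumerate(scores):
                    levels[start + k] = max(levels[start + k], score)
    # The gap before word[j] is the gap before padded[j + 1]; odd scores allow a break.
    return [j for j in range(left_min, len(word) - right_min + 1)
            if levels[j + 1] % 2 == 1]


def show(word):
    """Render the word with tildes at the allowed break points."""
    cuts = [0] + break_points(word) + [len(word)]
    return "~".join(word[a:b] for a, b in zip(cuts, cuts[1:]))


print(show("hyphenation"))  # hy~phen~ation
```

A word that matches few patterns gets few break points, which is why a small pattern set produces results like car~diopul~monary.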

For Canadians and Australians who don't have their own English hyphenation dictionary: Setting the spelling language to English Canada or Australia and leaving hyphenation language set to Auto will default to US English hyphenation. Canadian hyphenation is a combination of US and UK hyphenation usage and I assume Australian hyphenation is similar so you would likely get better results by changing Character > Language > Hyphenation from "Auto" to "English (United Kingdom)".

For Americans unhappy with the quality of US hyphenation, you could consider trying UK hyphenation. It would introduce some hyphenation errors but would fix others. If you do this, you should closely inspect the hyphenation results.



Summary of the below for those who don't want to read: I recommend that Canadian and Australian users switch from Auto to English (UK) hyphenation for better results. And Americans might want to switch, too, because Hunspell's US hyphenation dictionary is so poor.

I decided to test Canadian and Australian hyphenation to provide a solid recommendation for Canadian and Australian users in my manual. Grammar sites will usually tell you that our hyphenation is similar to American hyphenation but with differences for compound words and adjectives followed by gerunds. And they'll tell you that UK hyphenation rules require hyphens after prefixes (pre-trial) whereas Americans will omit the hyphen. But these are all grammatical issues, and we type these hyphens manually. These differences in rules don't impact automatic paragraph hyphenation.

In the absence of Canadian or Australian Hunspell hyphenation dictionaries, Publisher should therefore default to American hyphenation, which it does. But as I wrote above, the American hyphenation dictionary is very limited compared to the UK one and doesn't work well. It's not Publisher's fault that Hunspell has a poor American dictionary; the Auto setting is doing what it should. I wanted to know whether there were downsides to switching to the UK hyphenation dictionary.

I turned to Microsoft Word which is as close as we have to a gold standard for automatic hyphenation. I created a test file of various words, including some words specific to US and UK dictionaries, and some jargon words that aren't in the dictionary and which were flagged as misspellings, and compared the automatic breakpoints for US, UK, Canadian, and Australian English. Every word hyphenated the same with all four languages. I'm sure there are differences, but I didn't find them in my testing.

I then repeated the test in Publisher, comparing the US and UK dictionaries, the only ones available, to see which would be better for Canadians and Australians, and my determination is that Canadian and Australian users should definitely switch to the UK dictionary. In fact, I think Americans should switch, too, because the UK dictionary hyphenates US spellings better than the US dictionary. Remember that the hyphenation dictionary isn't a dictionary of spellings but of letter patterns generated from UK spellings. It just turns out those UK patterns work pretty well on American spellings, and better than the patterns generated from the smaller American dictionary.

I recommend giving it a try with one of your documents - edit the body or base text style and change hyphenation language from Auto to English (United Kingdom) and compare the hyphenation.

Edited by MikeTO
corrected typo



Just now, walt.farrell said:

Sorry, but something seems wrong with that. If English (US) gives better results, and American users should switch, too, how does that work if Hunspell's US Hyphenation dictionary is so poor?

I am thinking the (US) is a typo or an autofill mistake.

Am I correct @MikeTO ?

Mac Pro (Late 2013) Mac OS 12.7.4 
Affinity Designer 2.4.1 | Affinity Photo 2.4.1 | Affinity Publisher 2.4.1 | Beta versions as they appear.

I have never mastered color management, period, so I cannot help with that.

