
Nightmare on Word number headings imported to AfPub from Word (Mac 2021)



I'm having a serious problem with importing a long document from Word with three heading levels.

I imported what looks like a perfectly formatted Word document as per screenshot 1. What I get is as per screenshot 2.

I've tried to fix this in many ways, changing styles in Word and trying to fix it in AfPub, over many hours and days, but to no avail. This is setting me badly behind schedule.

I hope someone might be able to help me with this. I'd be ever so grateful, as I dread having to go back to an InD subscription.

Charles

P.S. The text being in blue is another matter, but I can fix that, although the cause remains a mystery.

Screenshot 2.png

Screenshot 1.png

Link to comment
Share on other sites

I would think there is something in the Word DOCX file that has created the various Heading styles. I would search the Publisher file for what is shown in this screenshot:

ScreenShot2024-01-19at9_06_54AM.png

Check in Word to see if those errant headings are defined.

 

Mac Pro (Late 2013) Mac OS 12.7.4 
Affinity Designer 2.4.1 | Affinity Photo 2.4.1 | Affinity Publisher 2.4.1 | Beta versions as they appear.

I have never mastered color management, period, so I cannot help with that.

Link to comment
Share on other sites

If your workflow is importing a Word document that is more or less fully style-tagged, I would import that file into a Publisher document from which all default (unused) styles have been deleted. That avoids having multiple styles with identical or nearly identical names cluttering the layout and confusing the formatting of the document. Publisher cannot signal conflicting styles and let the user decide at import time whether to use the Publisher-defined styles with matching or similar names, map the imported styles to existing ones, or overwrite the Publisher styles with the imported definitions, and so avoid what you describe as a "style nightmare".

So if you have a good arrangement in Word already, I would recommend importing into a document that is as much as possible cleared of in-built styles.

EDIT: This would be a workable solution even if the Word styles are not well defined. Just having the source text tagged with paragraph and character style names, and finalizing the actual style definitions in Publisher, should work well. It is the style name conflicts that are probably the biggest nuisance in preparing a layout.

Link to comment
Share on other sites

12 hours ago, lacerto said:

If your workflow is importing a Word document that is more or less fully style-tagged, I would import that file into a Publisher document from which all default (unused) styles have been deleted. That avoids having multiple styles with identical or nearly identical names cluttering the layout and confusing the formatting of the document. Publisher cannot signal conflicting styles and let the user decide at import time whether to use the Publisher-defined styles with matching or similar names, map the imported styles to existing ones, or overwrite the Publisher styles with the imported definitions, and so avoid what you describe as a "style nightmare".

So if you have a good arrangement in Word already, I would recommend importing into a document that is as much as possible cleared of in-built styles.

EDIT: This would be a workable solution even if the Word styles are not well defined. Just having the source text tagged with paragraph and character style names, and finalizing the actual style definitions in Publisher, should work well. It is the style name conflicts that are probably the biggest nuisance in preparing a layout.

 

Thanks lacerto. I've been deleting all Publisher styles before importing for a good while now: a lesson learnt early. I'll see what more I can do with the Word styles now, and follow up on what the others responding suggest.

Link to comment
Share on other sites

1 hour ago, charlesbewlay said:

I've been deleting all Publisher styles before importing for a good while now: a lesson early learnt.

Ok, fine. Have you checked whether your document contains obsolete (old but no longer used) style definitions? Publisher might import all styles whether they are used or not. [The screenshot is from the Windows version, but you should have something like this in the macOS version too]:

image.png.6c249b596bc8cf16abfcc3db21480f26.png
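
A rough way to check for leftover style definitions without Word is to look inside the .docx itself, which is a zip archive of XML parts. The Python sketch below is only an illustration: the function name `unused_styles` is made up, and it assumes the usual `word/styles.xml` and `word/document.xml` parts are present. It ignores headers, footers, footnotes, and styles that are referenced only through "based on" or "next style" links, so its output is a list of candidates, not a verdict.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's {uri}tag form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def unused_styles(docx_path):
    """Return names of styles defined in the .docx but not applied in the main text."""
    with zipfile.ZipFile(docx_path) as z:
        styles = ET.fromstring(z.read("word/styles.xml"))
        body = ET.fromstring(z.read("word/document.xml"))

    # Collect every defined style, skipping defaults (applied implicitly, e.g. Normal)
    defined = {}
    for style in styles.iter(W + "style"):
        if style.get(W + "default") in ("1", "true"):
            continue
        style_id = style.get(W + "styleId")
        name = style.find(W + "name")
        defined[style_id] = name.get(W + "val") if name is not None else style_id

    # Collect style ids explicitly applied to paragraphs, runs and tables
    used = set()
    for tag in ("pStyle", "rStyle", "tblStyle"):
        for el in body.iter(W + tag):
            used.add(el.get(W + "val"))

    return sorted(name for sid, name in defined.items() if sid not in used)
```

Calling `unused_styles("my-document.docx")` (a placeholder file name) returns the display names of styles that nothing in the main text appears to use.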

Link to comment
Share on other sites

13 hours ago, MikeTO said:

I'd need to see a sample of the document to figure it out. I created this test document using the multi-level list feature of Word and it imported perfectly into Publisher.

test.docx

 

Thanks Mike. I note you have used List Paragraph rather than Heading 1, 2, etc., and your sample is fine. I duplicated what you did and can even add a first level. But in List Styles (screenshot), 'No list' is selected under numbering, whereas when I try to replicate that I have to select 1/1.1/1.11. Mighty mysterious.

I attach my sample, the Word version and the AfPub version. Even though all changes are accepted in Word's Review tab, and in Tracking there is no markup and nothing is selected in the options, AfPub seems to import a lot of chaos after the bottom of page 14, and also adds deleted pages in the prelims. None of that happens if I import the full document.

Any way forward?

Screenshot 2024-01-20 at 09.55.28.png

BOCRA sample for forum.afpub Sample UNDERSTANDING COMMS FINAL GALLEY PROOF.docx

Link to comment
Share on other sites

I downloaded your Word file and it has revision marks, which Affinity Publisher reads in. You should accept all revisions (on the Review tab) and stop tracking and then import the cleaned file:

image.png.19d9c39f34b8ddcad2c9104d1279becb.png

(Again, this is probably a bit different on macOS.)

Note that Publisher also imports all hidden text so if you have obsolete styles there, these styles would also be imported.
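
If Word itself hangs on the file, one way to confirm that revision marks are still present is to count the tracked-change elements in the .docx directly. This is a minimal Python sketch under stated assumptions: `count_tracked_changes` is a made-up name, only the main part `word/document.xml` is inspected, and only the common WordprocessingML revision elements are counted.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's {uri}tag form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Common revision elements: insertions, deletions, moves and formatting changes
REVISION_TAGS = ("ins", "del", "moveFrom", "moveTo", "rPrChange", "pPrChange")


def count_tracked_changes(docx_path):
    """Return {element name: count} for revision markup found in the main text."""
    with zipfile.ZipFile(docx_path) as z:
        root = ET.fromstring(z.read("word/document.xml"))

    counts = {}
    for tag in REVISION_TAGS:
        n = sum(1 for _ in root.iter(W + tag))
        if n:
            counts[tag] = n
    return counts
```

An empty result suggests the main text carries no tracked changes; anything else means revisions still need to be accepted or rejected before placing the file in Publisher.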

Link to comment
Share on other sites

15 hours ago, Old Bruce said:

I would think there is something in the Word DOCX file that has created the various Heading styles. I would search for 

ScreenShot2024-01-19at9_06_54AM.png.f6e502d38c8dae3564ab84c589873a72.png

in the Publisher file.

Check in Word to see if those errant headings are defined.

 

 

Nope, nothing like that in the Word version. I also deleted unused styles, but Word keeps a great many anyhow. Looking at what lacerto just posted, it seems to me that Word on Windows gives more control than Word on the Mac.

Link to comment
Share on other sites

I would recommend, as I learned back in my apprenticeship, always loading only unformatted text (e.g. saved as a *.txt file, which by design cannot carry formatting). Formatting often causes problems and should always be done in the DTP software, not in the text editor.
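
For a file that Word can no longer open reliably, the plain text can also be pulled out of the .docx without Word. This is a rough Python sketch, not a replacement for Word's own "save as text": `docx_to_plain_text` is a made-up name, only `word/document.xml` is read, and tabs, line breaks, footnotes and list numbers are ignored. Text inside tracked deletions is stored in a different element and is therefore left out.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's {uri}tag form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_plain_text(docx_path):
    """Return the main text of a .docx, one line per paragraph, with no formatting."""
    with zipfile.ZipFile(docx_path) as z:
        root = ET.fromstring(z.read("word/document.xml"))

    lines = []
    for paragraph in root.iter(W + "p"):
        # Join the text of all runs in this paragraph
        lines.append("".join(t.text or "" for t in paragraph.iter(W + "t")))
    return "\n".join(lines)
```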

Link to comment
Share on other sites

I had a look at the Word document that you posted, and when I tried to accept changes and stop tracking, Word stopped responding. The same happens in Word on macOS, and LibreOffice Writer cannot handle the file, either.

But I took it to Apple's Pages, which auto-accepted all tracked changes in tables and other places where it cannot do tracking. I then manually accepted all remaining changes in Pages and also removed the comments. You can find a cleaned file attached. I suggest that you go through the same process yourself (Pages is a free app on the App Store) to check that what Pages does is correct. [EDIT: It does not seem so, so the original document might just be corrupt and require manual cleaning.]

Anyway, when I imported the cleaned document in Publisher, the style nightmare seems to be over.

image.png.6767cc23002043025fa1062d77dce393.png

Sample UNDERSTANDING COMMS FINAL GALLEY PROOF_cleaned.docx

Link to comment
Share on other sites

1 hour ago, lacerto said:

I downloaded your Word file and it has revision marks, which Affinity Publisher reads in. You should accept all revisions (on the Review tab) and stop tracking and then import the cleaned file:

image.png.19d9c39f34b8ddcad2c9104d1279becb.png

(Again, this is probably a bit different on macOS.)

Note that Publisher also imports all hidden text so if you have obsolete styles there, these styles would also be imported.

 

The original text had all changes accepted and was fine. The sample is crashing Word even after restarting both the app and the machine.

I've now dropped the original into Publisher and cut 100+ pages to make a sample, so that's attached.

Sample 2 BOCRA Pages.afpub

Link to comment
Share on other sites

27 minutes ago, lacerto said:

I had a look on the Word document that you posted, and when I tried to accept changes and stop tracking, Word stops responding. The same happens also on macOS Word, and LibreOffice Writer cannot handle the file, either.

But I took it to Apple's Pages, which auto-accepted all tracking changes in tables and other places where it cannot do tracking, and then I accepted manually all changes in Pages, and removed also comments. You can find attached a cleaned file. I suggest that you do the same process yourself (Pages is free app on App Store) to see that what Pages does is correct. [EDIT: It does not seem so, so the original document might just be corrupt, and require manual cleaning.]

Anyway, when I imported the cleaned document in Publisher, the style nightmare seems to be over.

image.png.6767cc23002043025fa1062d77dce393.png

Sample UNDERSTANDING COMMS FINAL GALLEY PROOF_cleaned.docx 24.16 kB · 0 downloads

 
 

Looks great what you've done! Thanks. Something wrong with the Word sample for sure. I'll try to replicate, but will be away for a couple of hours or so.

Link to comment
Share on other sites

It is interesting that Pages seems to be able to handle the cleaned Word file just fine -- see attached the PDF that I exported from Pages.

Sample UNDERSTANDING COMMS FINAL GALLEY PROOF.pdf

However, if I try to import in Publisher, I get oddly messed up heading numberings, numbered paragraphs, etc. 

If I import in InDesign, results are better but not nearly the same as in Pages.

UPDATE: I looked at the second Publisher sample you posted, and its heading numbering is incorrect, so the paragraph styles should be defined to have numbered lists with correct levels, and then the existing style assignments should be reapplied to get numbering right. 

UPDATE2: I fixed the major heading numbering styles (heading 1, heading 2 and heading 3) and automatically reapplied styles by using Find Replace (searching heading 1, and replacing with the same style, etc.):

Sample 2 BOCRA Pages_fixed.afpub

But there are many lists that still need to be fixed, not just numbering but also formatting.

Note that  you can restart numbering from the currently selected paragraph by using the option in the Paragraph panel:

image.png.ac63cb2d99752ed53a696f4ccab07739.png

Link to comment
Share on other sites

An afterthought: This was a really interesting problem because Apple Pages could basically fix the issues with a broken Word document (its ability to auto-accept table changes and handle Word tracking so well is a powerful feature and makes Pages a valuable Word document tool). It is also interesting that it seemed to handle numbered paragraphs so well. Without having the original Word document, it is difficult to say what actually caused the numbering chaos in Publisher (e.g., losing the number formatting). As mentioned, InDesign imported the sample Word document better, but it was necessary to redefine the heading styles there, too. Yet the .docx file exported by Pages appeared to behave correctly in Word.

EDIT: I forgot to mention that I seem to have found a bug in the UI of macOS Publisher when trying to Find and Replace heading styles. When there are lots of styles, it is not possible to scroll the hidden styles into view as it is on Windows, where there is an arrow at the bottom of the list and the mouse wheel can also be used to scroll these kinds of lists. I could not find an alternative way of adding a formatting style to the Find and Replace boxes, so I did this task in the Windows version.

EDIT: I realized that it is possible to use the arrow keys, but it is of course pretty awkward...

UPDATE: Actually, LibreOffice Writer could also be used to fix the Word document [accept all changes and stop tracking], and when the file was saved back to Word format, the numbering problem was also fixed [but only when importing into e.g. InDesign; Affinity Publisher appears to have bugs in importing lists]. There was a List Paragraph style with an alphabetical list style; removing it fixed the insanely long lists in a snap. However, there are probably paragraphs where this list styling was applied intentionally, so careful revision is needed to make this document work as expected.

Link to comment
Share on other sites

3 hours ago, charlesbewlay said:

Well, as I learnt, as a book publisher, produce any number of galley proofs before dropping into pages. I've never had problems like I'm encountering now.

Well, I'm not a book publisher. I'm a media designer. I learned to prepare images in Photoshop, create graphics in Freehand, later in Illustrator, and DTP using Quark XPress, later InDesign, about twenty years ago, and now I practice it in Publisher. And we always used to create the text in text editors first, save it unformatted and then load it into the DTP software to do all the layout work, including the formatting, because this is a well-ordered and reliable workflow that prevents needless problems.

Link to comment
Share on other sites

5 minutes ago, iconoclast said:

because this is a well ordered and reliable workflow, that prevents needless problems.

We have done DTP for decades and practically never do it like this. We typically import cleaned Word documents (images etc. stripped), keeping local formatting and unifying everything else (basically using a single font and size when applicable), and then tag paragraph styles, which we apply in layout with our own scripts, using already defined InDesign formatting. This means that we would normally discard list formatting but might make an exception if the text has a very complex hierarchy.

When working with multi-language texts, sometimes including RTL ones, with legal texts containing huge amount of footnotes, scientific texts containing equations, etc., normally with hectic schedules, there is really no alternative for us, and this workflow has never broken or caused any serious issues.

The document included in this post was somehow corrupted and would ideally require some fixes and preparation before being imported in layout, but as was shown, it was not so badly damaged that it could not be used even as it is. 

But everyone with years of experience in publishing business has of course their personal preferences and optimized workflows they want to use whenever possible.

Link to comment
Share on other sites

I did a lot of testing and in a nutshell, importing multi-level lists from MS Word doesn't work.

For Serif, this simple multi-level list in MS Word imports incorrectly into Publisher. With a blank document with all text styles deleted, place this test file into a frame. The list of styles will include some nonsense styles and the heading styles won't be properly defined so the lists will be broken.

testing.docx

For Serif, I found a second bug while looking at Charles' document. Tables from Word files aren't formatted with text styles after placing into Publisher; they are set to No Style. With a blank document with all text styles deleted, place this test file into a frame. The table text will be formatted as No Style after placing, although it's formatted as Heading 2 in MS Word.

test.docx

For Charles:

I believe the issue with being unable to scroll the style list is a known bug.

Yes, you must accept all tracking changes before importing text into Publisher. I will add a tip to that effect in my manual. The sample Word file hung MS Word for me, too, when I tried to accept all changes, requiring a force quit. I fixed it with Pages as suggested above so I could play with it but I don't recommend that - Pages made a mess of the text styles and the headings became formatted with the Page Number style.

It's going to take some effort to make this work in Publisher but here's how to do it.

  1. Open the document in Word. Add a temporary paragraph outside of the table and format it as style "Table Left". That style is used in your tables but nowhere else and because Publisher doesn't style the table text the text style won't be created in Publisher. You need this style so create a temporary paragraph formatted with the style to ensure the style is imported. Save the file.
  2. Place the modified file into Publisher.
  3. Delete that temporary paragraph and format the table text as "Table Left". This will solve the problem of all the table text and paragraphs styled as "TEXT" being formatted as lists.
  4. Go to the first paragraph numbered 0.1 and use Paragraph > Bullets and Numbering to fix it. Deselect Restart Numbering and change the list name from "6" to "2" which is the name Publisher assigned to the parent list. Now it will be numbered 1.2.
  5. Using the Text Styles panel, click the menu icon to the right of Heading 2 and choose Update Heading 2. For any other mis-numbered Heading 2 paragraphs, just re-apply Heading 2 to them to clear the formatting overrides.
  6. The next problem will be the first 0.0.1 paragraph. Deselect Restart Numbering and change the list name from "9" to "2". 
  7. Using the Text Styles panel, click the menu icon to the right of Heading 3 and choose Update Heading 3. For any other mis-numbered Heading 3 paragraphs, just re-apply Heading 3 to them to clear the formatting overrides.

This should clean it up although it will take some effort.
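
Before starting the manual fixes above, it may help to see how the heading styles are bound to lists in the Word file itself. In a .docx, a style that carries numbering stores a list id (`numId`) and a level (`ilvl`) in `word/styles.xml`. The Python sketch below just reads those values; `style_numbering` is a made-up name, and whether the list names Publisher shows ("2", "6", "9") correspond to these ids is an assumption I have not verified. For a healthy multi-level setup I would expect Heading 1, 2 and 3 to share one list id with levels 0, 1 and 2.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's {uri}tag form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def style_numbering(docx_path):
    """Map style name -> (list id, level) for every style that carries list numbering."""
    with zipfile.ZipFile(docx_path) as z:
        styles = ET.fromstring(z.read("word/styles.xml"))

    result = {}
    for style in styles.iter(W + "style"):
        num_pr = style.find(f"{W}pPr/{W}numPr")
        if num_pr is None:
            continue  # this style has no numbering of its own
        name = style.find(W + "name")
        label = name.get(W + "val") if name is not None else style.get(W + "styleId")
        num_id = num_pr.find(W + "numId")
        ilvl = num_pr.find(W + "ilvl")
        result[label] = (
            num_id.get(W + "val") if num_id is not None else None,
            int(ilvl.get(W + "val")) if ilvl is not None else 0,  # omitted level means 0
        )
    return result
```

If the heading styles come back with different list ids, that would at least be consistent with the headings restarting at 0.1 and 0.0.1 after import.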

Cheers

Download a free manual for Publisher 2.4 from this forum - expanded 300-page PDF

My system: Affinity 2.4.2 for macOS Sonoma 14.4.1, MacBook Pro 14" (M1 Pro)

Link to comment
Share on other sites

13 minutes ago, MikeTO said:

For Serif, this simple multi-level list in MS Word imports incorrectly into Publisher. With a blank document with all text styles deleted, place this test file into a frame. The list of styles will include some nonsense styles and the heading styles won't be properly defined so the lists will be broken.

I found this in the Edit Text Styles panel for heading 1. The heading 1 1 is set for Next style. If I set that Next style to Normal and then delete unused styles all the "nonsense styles" will disappear.

ScreenShot2024-01-20at9_14_17AM.png.34d9b2c8804269a215ef88f8d93b91fa.png

Some caveats are that I don't have the various fonts defined in the various text styles, and also I have to use Pages and LibreOffice because I do not own Word. Not having access to Word, I cannot see if there is something set to create extra numbered lists.

Mac Pro (Late 2013) Mac OS 12.7.4 
Affinity Designer 2.4.1 | Affinity Photo 2.4.1 | Affinity Publisher 2.4.1 | Beta versions as they appear.

I have never mastered color management, period, so I cannot help with that.

Link to comment
Share on other sites

7 hours ago, lacerto said:

It is interesting that Pages seems to be able to handle the cleaned Word file just fine -- see attached the PDF that I exported from Pages.

Sample UNDERSTANDING COMMS FINAL GALLEY PROOF.pdf 205.11 kB · 1 download

However, if I try to import in Publisher, I get oddly messed up heading numberings, numbered paragraphs, etc. 

If I import in InDesign, results are better but not nearly the same as in Pages.

UPDATE: I looked at the second Publisher sample you posted, and its heading numbering is incorrect, so the paragraph styles should be defined to have numbered lists with correct levels, and then the existing style assignments should be reapplied to get numbering right. 

UPDATE2: I fixed the major heading numbering styles (heading 1, heading 2 and heading 3) and automatically reapplied styles by using Find Replace (searching heading 1, and replacing with the same style, etc.):

Sample 2 BOCRA Pages_fixed.afpub

But there are many lists that still need to be fixed, not just numbering but also formatting.

Note that  you can restart numbering from the currently selected paragraph by using the option in the Paragraph panel:

image.png.ac63cb2d99752ed53a696f4ccab07739.png

 
 

Lacerto, I'm gobsmacked!!! How can I ever thank you enough?? The other paras are a relatively easy fix. I'll sleep better tonight.

I'd been trying the Pages route, but that also had problems. At least it now does footnotes, but no indexing (powerful in AfPub!). It seems there is no import option, only Open, so that gives page layout problems, and using Convert to Page Layout just erases all text. Anyhow, that's another story, but it has its uses for sure, so thanks for reawakening me to it as well.

Link to comment
Share on other sites

4 hours ago, iconoclast said:

Well, I'm not a book publisher. I'm a media designer. I learned to prepare images in Photoshop, create graphics in Freehand, later in Illustrator, and DTP using Quark XPress, later InDesign, about twenty years ago, and now I practice it in Publisher. And we always used to create the text in text editors first, save it unformatted and then load it into the DTP software to do all the layout work, including the formatting, because this is a well-ordered and reliable workflow that prevents needless problems.

 

Yes, we are in different worlds really. I'd do the same as you if graphics were a big part of what I do. I started with Pagemaker, and a bit of Freehand!

Link to comment
Share on other sites

13 hours ago, charlesbewlay said:

Lacerto, I'm gobsmacked!!! How can I ever thank you enough?? The other paras are a relatively easy fix. I'll sleep better tonight.

You're welcome. I am happy to be able to help, and always learn something myself, too! In addition to Pages, I found LibreOffice very useful, too, it could be used to accept all revision marks and stop tracking, though I am not sure that the results were expected (as it also accepted deletions, which possibly were not intentional -- I suppose tracking was actually left on inadvertently, when preparing a version just for the forum review). UPDATE: LibreOffice would also support indexes and most other Word features and would therefore probably be the best tool to fix corrupted Word documents.

It was really odd that Pages could do numbering so well -- as you mentioned Pages yourself, could it be that the text was at some point in Pages???

I wish you get it sorted out -- it is not a small task to learn to use new tools, even if much might be familiar from other similar apps! 

Link to comment
Share on other sites

4 hours ago, lacerto said:

LibreOffice […] would therefore probably be the best tool to fix corrupted Word documents.

It's definitely commonly advised to use LibreOffice to open a corrupted Word file and resave it as .docx, as this software interprets all Word features quite well and re-encodes them at export.

For example:
https://answers.microsoft.com/fr-fr/msoffice/forum/all/ouverture-du-fichier-impossible-fichier-corrompu/9a89a23c-b68e-414c-9f20-83ac6b67b493
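
The resave can also be done without opening the Writer window, using LibreOffice's command-line converter. Here is a small Python sketch under a few assumptions: the helper names are made up, `soffice` is assumed to be on the PATH (on macOS it usually lives inside the LibreOffice app bundle, so the full path may need to be passed instead), and the output goes to a separate folder so the original file is not overwritten. As far as I know this only re-encodes the file; accepting or rejecting tracked changes would still be a manual step in Writer.

```python
import subprocess
from pathlib import Path


def resave_command(src, outdir, soffice="soffice"):
    """Build the LibreOffice command line that re-exports src as .docx into outdir."""
    return [soffice, "--headless", "--convert-to", "docx",
            "--outdir", str(outdir), str(src)]


def resave_docx(src, outdir, soffice="soffice"):
    """Re-export a Word file through LibreOffice and return the path of the new copy."""
    src, outdir = Path(src), Path(outdir)
    # The converter writes <outdir>/<same base name>.docx, so keep the folders apart
    if outdir.resolve() == src.parent.resolve():
        raise ValueError("choose a different output folder so the original is not overwritten")
    outdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(resave_command(src, outdir, soffice), check=True)
    return outdir / (src.stem + ".docx")
```

For example, `resave_docx("galley.docx", "resaved")` (placeholder names) would leave the re-encoded copy at `resaved/galley.docx`.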

 

Affinity Suite 2.4 – Monterey 12.7.4 – MacBookPro 14" 2021 M1 Pro 16Go/1To

I apologise for any approximations in my English. It is not my mother tongue.

Link to comment
Share on other sites

12 hours ago, Oufti said:

It's definitely commonly advised to use LibreOffice to open a corrupted Word file and resave it as .docx, as this software interprets all Word features quite well and re-encodes them at export.

Yes, it seems so. However, in the end using Word to "fix" the tracking issue was adequate. The issue that caused Word to become unresponsive could be avoided simply by rejecting (instead of accepting) the last changes that the author had made to hide part of the content, and especially by rejecting one fatal formatting change. That change probably caused the crash because, if accepted, an alpha-formatted list was applied to the rest of the text paragraphs (including those inside tables), creating absurdly long alpha indexes. After doing this, tracking could be stopped and the document behaved just fine. So this "error" was a kind of secondary issue, not related to the author's actual dilemma with Publisher importing numbered lists incorrectly.

The numbering issue itself was something that could (a bit surprisingly perhaps) be improved when taking the document to LibreOffice Writer and simply just letting it save the document. On the other hand, Pages could import numbering correctly even without this fix so it is unclear whether there were initially any errors. But after the file was saved using LibreOffice Writer, InDesign could import the document so that numbering was retained and related styles correctly defined (with only two exceptions at the start of the document). Publisher however still imported incorrectly the lists encoded by LibreOffice Writer.

As a general note, Publisher still appears to lack tools that make it easier to fix numbering errors, most importantly the ability to flatten autonumbering to text and, it seems, also the ability to apply a "continue numbering" command (as the counterpart of restarting numbering, which is supported). When the document hierarchy is several levels deep and the numbering schemes are complex, it is often easiest to go through auto-numbered lists one by one in sequence, first making numbering corrections using automation if possible, and thereafter flattening the list (which is often also useful for checking formatting integrity, since the indexes of auto-numbered lists cannot be selected). Sometimes a lot of time can be saved by simply flattening lists and typing in the correct custom indexes that do not come naturally and cause problems.

Converting autonumbering to text is also supported in Word, but I think the feature can only be accessed via a VBA macro (which could of course be placed on any of the toolbars or menus). In case someone is interested, here are the required macro commands (operable in both the Windows and macOS versions of Word):

' Convert automatic numbering to plain text throughout the active document.
Sub ConvertListsToText()
    ActiveDocument.ConvertNumbersToText
End Sub

' Convert automatic numbering to plain text in the current selection only.
Sub ConvertActiveListToText()
    Selection.Range.ListFormat.ConvertNumbersToText
End Sub

 

Link to comment
Share on other sites
