Lukas G.'s Content - Affinity

[1.9.0.902] Merge - Character encoding problem ?

Lukas G. replied to Delden's topic in [ARCHIVE] Publisher beta on macOS threads

I see - I honestly assumed that was a typo, because there is no such thing as a well-defined "ANSI encoding". But now that part of your answer makes sense to me - we were seeing the same thing, just referring to it by a different name. "ANSI encoding" seems to be a historical misnomer, common in the Windows world, for Windows CP-1252 (which is almost identical to what later became ISO-8859-1): (From https://en.wikipedia.org/wiki/Windows-1252) Also: (From https://en.wikipedia.org/wiki/ANSI_character_set)
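To make the "almost identical" part concrete, here is a minimal sketch that compares how Python's built-in `cp1252` and `latin-1` codecs decode each of the 256 possible byte values. The helper name `differing_bytes` is made up for this illustration; it only shows where the two code pages disagree and says nothing about what Publisher does internally.

```python
def differing_bytes(enc_a: str = "cp1252", enc_b: str = "latin-1") -> list[int]:
    """Return the byte values that the two single-byte encodings decode differently."""
    differing = []
    for value in range(256):
        raw = bytes([value])
        try:
            a = raw.decode(enc_a)
        except UnicodeDecodeError:
            a = None  # this byte is unassigned in enc_a
        try:
            b = raw.decode(enc_b)
        except UnicodeDecodeError:
            b = None  # this byte is unassigned in enc_b
        if a != b:
            differing.append(value)
    return differing


diff = differing_bytes()
print(f"{len(diff)} byte values differ, from 0x{diff[0]:02X} to 0x{diff[-1]:02X}")
print(b"\x80".decode("cp1252"))  # the euro sign lives in that range in CP-1252
```

Everything outside the 0x80 to 0x9F range decodes identically, which is why accented Western European letters such as é come out the same in both.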

September 24, 2021
39 replies

[1.9.0.902] Merge - Character encoding problem ?

Lukas G. replied to Delden's topic in [ARCHIVE] Publisher beta on macOS threads

@Mooxo a different workaround could be to just save your CSV with Latin-1 (ISO-8859-1) encoding from Numbers.app, instead of UTF-8. That seems to be what Publisher uses as its fallback encoding, so this workaround might work as well, and would not even require you to add an accented character at the top. Edit: I believe "Western (ISO Latin 1)" is what Numbers.app calls ISO-8859-1.
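If re-exporting from Numbers.app is inconvenient, the same re-encoding can be done with a few lines of Python. This is only a sketch of the idea: the function name `reencode_csv` and the file names are made up for the example, and it assumes the input really is valid UTF-8.

```python
import tempfile
from pathlib import Path


def reencode_csv(src: Path, dst: Path) -> None:
    """Rewrite a UTF-8 CSV file as ISO-8859-1 (Latin-1), leaving the content unchanged."""
    text = src.read_bytes().decode("utf-8")
    # Raises UnicodeEncodeError if the data contains characters that
    # Latin-1 cannot represent (for example "€" or "œ").
    dst.write_bytes(text.encode("iso-8859-1"))


# Small demonstration with throwaway files (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "contacts-utf8.csv"
    dst = Path(tmp) / "contacts-latin1.csv"
    src.write_bytes("name\nRené\n".encode("utf-8"))
    reencode_csv(src, dst)
    latin1_bytes = dst.read_bytes()

print(latin1_bytes)
```

Note that this only works as long as every character in the data exists in Latin-1; anything outside it makes the encode step fail instead of silently writing wrong bytes.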

September 23, 2021
39 replies

[1.9.0.902] Merge - Character encoding problem ?

Lukas G. replied to Delden's topic in [ARCHIVE] Publisher beta on macOS threads

It's not ASCII encoding that Publisher decided to go with in this case though, it's Latin-1 (ISO-8859-1). 0xC3 0xA9 is what a UTF-8 encoded é character looks like, and when it's incorrectly decoded as Latin-1, it comes out as Ã©.
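This is easy to reproduce in Python. The snippet below is just an illustration of the byte-level effect, using only the built-in codecs:

```python
original = "é"

# UTF-8 encodes é as the two bytes 0xC3 0xA9.
utf8_bytes = original.encode("utf-8")
print(utf8_bytes.hex(" "))  # c3 a9

# Decoding those same bytes as Latin-1 yields one character per byte.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # Ã©

# As long as nothing was lost, the mistake can be reversed.
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired)  # é
```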

September 23, 2021
39 replies

[1.9.0.902] Merge - Character encoding problem ?

Lukas G. replied to Delden's topic in [ARCHIVE] Publisher beta on macOS threads

@walt.farrell that makes a lot of sense. I've also had a look at the test data, and inspected it with a quick Python script. And it is a valid UTF-8 encoded CSV file. The only non-ASCII characters that occur are é and û, and they are both properly UTF-8 encoded, everywhere. So my hypothesis from above, that mixed encodings in the same file throw off Publisher's character set sniffing, is wrong. But the sniffing heuristic only looking at the first 4k of data sounds very plausible. That's clearly a bug then in my opinion - at least as long as there isn't an option for the user to just select the encoding to be used, and avoid the need for any character set sniffing altogether.
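For anyone who wants to run a similar check on their own file, here is a minimal sketch of that kind of inspection. It is not the exact script used on the test data; the function name `inspect_utf8`, the `SNIFF_WINDOW` constant and the sample data are made up for the example, and the 4096-byte window is only the figure suggested in this thread, not a confirmed implementation detail of Publisher.

```python
SNIFF_WINDOW = 4096  # assumption: the "first 4k" mentioned in this thread


def inspect_utf8(data: bytes):
    """Return (sorted non-ASCII characters, offset of the first non-ASCII byte or None).

    Raises UnicodeDecodeError if the data is not valid UTF-8.
    """
    text = data.decode("utf-8")  # strict decoding doubles as the validity check
    chars = sorted({ch for ch in text if ord(ch) > 127})
    first_offset = next((i for i, b in enumerate(data) if b > 0x7F), None)
    return chars, first_offset


# Made-up sample: about 8 KB of plain ASCII rows, then one row with accents.
sample = ("name,city\n" + "a,b\n" * 2000 + "René,Août\n").encode("utf-8")

chars, first_offset = inspect_utf8(sample)
print("non-ASCII characters:", chars)
print("first non-ASCII byte at offset:", first_offset)
print("outside the sniffing window:", first_offset is not None and first_offset >= SNIFF_WINDOW)
```

For a real file, replace `sample` with `Path("yourfile.csv").read_bytes()`. If the first non-ASCII byte sits beyond the window, a sniffer that only reads the first 4k sees nothing but ASCII and has no evidence for UTF-8.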

September 23, 2021
39 replies

[1.9.0.902] Merge - Character encoding problem ?

Lukas G. replied to Delden's topic in [ARCHIVE] Publisher beta on macOS threads

Exactly. And those characters earlier in the file might not have been enough to bias the other software (that apparently reads the file properly) towards "this looks like ISO-8859-1", maybe because the majority of characters are indeed UTF-8 encoded. Or some of them hard-defaulted to UTF-8, which is not an unreasonable assumption these days. And if you're really dealing with mixed encodings in the same file, there really is no right or wrong in terms of the exact implementation of the encoding detection mechanism. Some use sophisticated strategies like lookup tables for character frequencies in different languages, others are happy with the first encoding that sort of works and doesn't result in unprintable characters. The theory about mixed encodings in the data is still just speculation on my part, but I've seen stranger things with real-world data.
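To illustrate the simple end of that spectrum, here is a sketch of a "first encoding that sort of works" detector. It is a toy written for this post, not how Publisher or any particular application does it; the name `guess_encoding` and the candidate order are assumptions.

```python
from typing import Optional, Sequence


def guess_encoding(data: bytes,
                   candidates: Sequence[str] = ("utf-8", "iso-8859-1")) -> Optional[str]:
    """Return the first candidate that decodes cleanly without unprintable characters."""
    for encoding in candidates:
        try:
            text = data.decode(encoding)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        # Accept ordinary line breaks and tabs, reject other control characters.
        if all(ch.isprintable() or ch in "\r\n\t" for ch in text):
            return encoding
    return None


print(guess_encoding("café".encode("utf-8")))       # pure UTF-8
print(guess_encoding("café".encode("iso-8859-1")))  # pure Latin-1
print(guess_encoding(b"caf\xc3\xa9,caf\xe9"))       # both mixed in one file
```

The mixed case is the interesting one: a single Latin-1 byte makes strict UTF-8 decoding fail, so the whole file falls through to ISO-8859-1, and every properly UTF-8 encoded character in it then shows up garbled.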

February 1, 2021
39 replies

[1.9.0.902] Merge - Character encoding problem ?

Lukas G. replied to Delden's topic in [ARCHIVE] Publisher beta on macOS threads

@Delden this is just a shot in the dark, but is it possible that your entire data set could contain text encoded in different encodings? So, UTF-8 encoded strings as well as ISO-8859-1 ones in the same file for example? I'm asking because CSV has no metadata that allows you to define the character encoding that's been used. You obviously have to select one when producing the CSV, but there's no way to include that information in the CSV itself - it has no header or any sort of metadata whatsoever. Therefore, software reading CSV often has to guess / recognize the encoding that's been used (charset sniffing). Now, if Affinity Publisher does this (and it almost has to, since it currently doesn't seem to even allow you to explicitly specify the encoding on ingest), such a heuristic could be thrown way off if there are mixed encodings used in the same file. That could explain why your full data set shows the problem, but the extract doesn't (for me at least, works fine here). I would maybe quickly check if you can still reproduce the issue with your own extract. If not, that could be an indication that some other data in your full data set is throwing off some kind of character set detection in AP. (For what it's worth - the ô in your extract is correctly encoded as a UTF-8 multi-byte character (0xC3 0xB4). What your merged result looks like is exactly what UTF-8 looks like when it's accidentally decoded as ISO-8859-1 (Latin-1)).
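One way to test the mixed-encoding theory on the full data set is to decode the file line by line and report the lines that are not valid UTF-8. The sketch below assumes the file is mostly UTF-8; the function name `find_non_utf8_lines` and the sample rows are made up for the illustration.

```python
def find_non_utf8_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that fail strict UTF-8 decoding."""
    bad_lines = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad_lines.append(number)
    return bad_lines


# Made-up sample: line 2 is UTF-8, line 3 was written as ISO-8859-1.
sample = (
    b"name\n"
    + "Rhône\n".encode("utf-8")        # ô becomes 0xC3 0xB4
    + "Côte\n".encode("iso-8859-1")    # ô becomes the single byte 0xF4
)

print(find_non_utf8_lines(sample))
```

For a real file, pass in `Path("full-data.csv").read_bytes()`. An empty result means the file is consistently valid UTF-8, which would rule out the mixed-encoding explanation.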

February 1, 2021
39 replies

Data Merge Instructions

Lukas G. replied to kevinslimp's topic in Pre-V2 Archive of Desktop Questions (macOS and Windows)

This is an example for the shortest / simplest workflow I can think of. There's obviously more to it, but this should hopefully get you started:

1) Define a data source
To do this, go to the Document menu and choose Data Merge Manager.... There, add your .csv file by clicking on the page icon in the bottom left of that dialog. In the Source section on the right hand side, check that Delimiter and Quote character are set appropriately for your CSV dialect. Make sure the proper number of fields (columns) and records have been parsed. Then Close that dialog for now via the button on the bottom right.

2) Place your Data Merge Layout
Select the new blue / grey Data Merge Layout Tool in the toolbar to the left. For me it's below the picture frame icon of the Place Image tool. With that tool selected, draw up a grid on your document - exact dimensions and positioning aren't crucial yet. You now get a horizontal context toolbar at the top (above your artboard / rulers). In there, you'll want to at least set up the number of Rows and Columns according to your needs. This will determine the number of instances data merge will produce. Now fine tune the positioning and dimensions of your data merge layout (the grid you're seeing).

3) Draw your template graphic
With the Data Merge Layout still selected (!), draw a rectangle in the top left cell. You should see that rectangle getting duplicated immediately across all other cells. The top left cell is considered your template, and having the Data Merge Layout selected when drawing makes sure that the content you draw then gets nested inside the Data Merge Layout in the layer panel on the right hand side. Moving existing objects into the Data Merge Layout "Group" in the layer panel is another way to bring existing objects into the DML template.

4) Show fields panel
Bring up the fields panel. If it's not visible, select the View menu at the top, and choose Studio -> Fields. At the very bottom of the fields panel (you might have to scroll) you should see a section "Data Merge - NameOfYourFile.csv".

5) Add fields to text objects
Now draw up an Art Text object using the Artistic Text Tool. Again in the top left cell, with the Data Merge Layout selected. When you get the blinking cursor, insert a field from your data by double clicking on that field in the fields panel. It should be displayed as a <Field Name> placeholder in your Art Text.

6) Generate
Now open the Data Merge Manager again from the Document menu. Click the Generate button. This should generate as many instances of your template as there are records or cells (whichever is lower, I think). The generated instances are produced as an entirely new unsaved document. You can save / print it, or switch back to the template document using View -> Views.

Hope this helps!

February 1, 2021
5 replies
