Posted

Hello,

I'm trying to edit a document created by `pdfsandwich`. This tool generates invisible text on top of a scanned page so that the text can be searched and selected. I've tried to export this document to PDF, but after doing so the text is nowhere to be found (I've reimported the PDF into Publisher and only the images are there; there are no text layers, no text as curves, nor anything similar).

I've also tried creating a document (attached) with invisible text (both by setting the font color to invisible as well as making the layer itself invisible). I was expecting all text to be selectable (Test 1, 2 and 3) but I can't make it happen.

The only PDF export feature I can think of that might affect this is "Include Invisible Layers", which is marked.

If this is some sort of bug, does anybody know of a workaround? I need to release this to my client as soon as possible, so hopefully I don't have to start from scratch...

untitled.afpub

Posted (edited)

Try setting the text colour to something and the layer opacity to 1%
This example has black text over the image, the text is searchable and selectable
Using Publisher V1 & Chrome to view the pdf

InvisibleText.jpg

Edited by David in Яuislip
color corrected to colour

Microsoft Windows 11 Home, Intel i7-1360P 2.20 GHz, 32 GB RAM, 1TB SSD, Intel Iris Xe
Affinity Photo - 24/05/20, Affinity Publisher - 06/12/20, KTM Superduke - 27/09/10

Posted

Thanks for that workaround @David in Яuislip, I'll try it in V2 and report back.

However I'd still like to know if this is a bug, or if there is a better way to do this because this workaround will have me modifying 200 pages by hand, which is also going to take a long time...

Thanks again, really appreciate it!

Posted
On 4/25/2023 at 1:29 PM, arcnor said:

if there is a better way to do this because this workaround will have me modifying 200 pages by hand, which is also going to take a long time...

If your text has a text style applied, you are (probably) in luck. Just change the colour fill for that text style.

----------
Windows 10 / 11, Complete Suite Retail and Beta

Posted
1 hour ago, joe_l said:

If your text has a text style applied, you are (probably) in luck. Just change the colour fill for that text style.

It sounds like the OP is opening a PDF file created by another application. In that case, none of the incoming text will have a Text Style applied, as PDF does not support Text Styles.

-- Walt
Designer, Photo, and Publisher V1 and V2 at latest retail and beta releases
PC:
    Desktop:  Windows 11 Pro 23H2, 64GB memory, AMD Ryzen 9 5900 12-Core @ 3.00 GHz, NVIDIA GeForce RTX 3090 

    Laptop:  Windows 11 Pro 23H2, 32GB memory, Intel Core i7-10750H @ 2.60GHz, Intel UHD Graphics Comet Lake GT2 and NVIDIA GeForce RTX 3070 Laptop GPU.
    Laptop 2: Windows 11 Pro 24H2,  16GB memory, Snapdragon(R) X Elite - X1E80100 - Qualcomm(R) Oryon(TM) 12 Core CPU 4.01 GHz, Qualcomm(R) Adreno(TM) X1-85 GPU
iPad:  iPad Pro M1, 12.9": iPadOS 18.3.1, Apple Pencil 2, Magic Keyboard 
Mac:  2023 M2 MacBook Air 15", 16GB memory, macOS Sequoia 15.0.1

Posted
On 4/25/2023 at 11:32 AM, arcnor said:

I've also tried creating a document (attached) with invisible text (both by setting the font color to invisible as well as making the layer itself invisible). I was expecting all text to be selectable (Test 1, 2 and 3) but I can't make it happen.

Hi @arcnor If you are using Publisher 2 have you tried Select/Select Object/Frame Text or Art Text? It works over multiple pages. If that doesn't do what you want it might be worth creating a dummy pdf using pdfsandwich and sharing it on this forum for people to investigate.

Windows 10 Pro, I5 3.3G PC 16G RAM

Posted

Hi @MickRose, my initial message has a test file attached already.

As for what you mention, I think you might have misunderstood me. What I want is selectable invisible text on the PDF output, it has nothing to do with selecting in the editor itself (but I used those selection tools to perform the workaround, for sure).

Posted

Hi @arcnor sorry - I still don't understand. You supplied an afpub file which can be opened in Publisher 2, and the hidden text layers on the 2 document pages can easily be selected and the text colour changed etc. Do you want to select text from your PDF within your PDF editor (which is?) or within Publisher? If you want to use Acrobat Pro to select invisible objects, there is PitStop Professional, but that is very expensive.


Posted
18 minutes ago, MickRose said:

PDF viewers are just that - viewers. They are not designed to edit within the PDF.

But some do allow you to select and copy text, which you can then paste elsewhere. I think that @arcnor is trying to accomplish that.


Posted

Hi @arcnor - Are you hoping that Publisher will make a 3 layer PDF file and that a PDF viewer can select individual layers? If so then I think you will be disappointed. There are 2 issues here. Firstly Affinity Publisher layers are not layers in the sense that other dtp programs (such as InDesign) use layers - they are more like an ordered object list. As far as I know Affinity Publisher does not create layers which are user selectable in Acrobat. If I am wrong perhaps others will say so. Secondly you would need to check whether your PDF viewer can access the layers within the PDF. I think that is unlikely but I might be wrong.


Posted
On 4/25/2023 at 11:32 AM, arcnor said:

I'm trying to edit a document created by `pdfsandwich`. This tool generates invisible text on top of a scanned page so that the text can be searched and selected. I've tried to export this document to PDF, but after doing so the text is nowhere to be found (I've reimported the PDF into Publisher and only the images are there; there are no text layers, no text as curves, nor anything similar).

As far as I'm aware, pdfsandwich generates invisible text below the scanned image, but it does so using a glyphless font, i.e., a font with a single glyph that occupies empty space, making it truly invisible. When you open an OCR-generated pdf created using pdfsandwich in Publisher, you will see that the text layers have no fill or stroke colour assigned to them.

The glyphless font works in conjunction with a CIDToGIDMap, which maps each Character Identifier (CID, i.e., a number identifying a character within a character collection, used by fonts whose glyphs have no names) to the corresponding Glyph Identifier (GID, i.e., the index of a glyph within the font). Together with a mapping from character codes to Unicode code points, this allows the invisible text to become searchable in the pdf file.
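To make the mapping concrete, here is a minimal sketch of the table a CIDToGIDMap stream holds. In the PDF format the stream stores one 2-byte big-endian glyph index per CID, at byte offset 2 × CID. Pointing every CID at the same blank glyph is an assumption about how a glyphless font would be set up, not a confirmed dump of pdfsandwich output, and the function names are made up for illustration.

```python
import struct


def build_cid_to_gid_map(num_cids: int, gid: int) -> bytes:
    """Return an uncompressed CIDToGIDMap table sending every CID to `gid`.

    Each entry is a 2-byte big-endian glyph index stored at offset 2 * CID.
    Mapping all CIDs to one blank glyph is an illustrative assumption.
    """
    if not 0 <= gid <= 0xFFFF:
        raise ValueError("glyph index must fit in 16 bits")
    return struct.pack(">H", gid) * num_cids


def gid_for_cid(table: bytes, cid: int) -> int:
    """Look up the glyph index a given CID resolves to."""
    return struct.unpack_from(">H", table, 2 * cid)[0]


# Example: 65536 CIDs, all resolving to glyph 1 (the single empty glyph).
table = build_cid_to_gid_map(65536, 1)
```

Because every character resolves to the same empty glyph, nothing is painted on the page, while the character codes themselves stay in the content stream for a reader to search.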

On 4/25/2023 at 11:32 AM, arcnor said:

I've also tried creating a document (attached) with invisible text (both by setting the font color to invisible as well as making the layer itself invisible). I was expecting all text to be selectable (Test 1, 2 and 3) but I can't make it happen.

Making an 'actual' font invisible by setting the font colour to none in Publisher will treat the text as though it isn't included and it won't be searchable in a pdf.

Toggling the layer visibility off in Publisher so the text isn't 'visible' will result in an exported pdf with searchable text. However, this also triggers a warning message in most pdf readers, e.g., Acrobat Reader and Foxit Reader, saying "The result occurred on a layer that is not currently visible. Would you like to make the layer visible now?" The text will also still be visible in both the macOS Finder and Apple Preview, which doesn't support Layers.

So neither of these options is going to give you what you want.

In your original file, Untitled.pdf, all the text is selectable, but that is because the 'Test 2' and 'Test 3' text are set to 1% opacity and are therefore still visible, which you can see when zooming in to the text in Acrobat Reader.

On 4/25/2023 at 11:32 AM, arcnor said:

The only PDF export feature I can think of that might affect this is "Include Invisible Layers", which is marked.

The Layers referenced in the pdf export settings are Layers created using the 'Add Layer' option at the bottom of the Layers Panel. These Layers can be toggled on and off in pdf readers such as Acrobat to show or hide content within the pdf file, which is useful for maps where you may want to see different geographical characteristics, how boundaries have changed over time, and so on.

On 4/25/2023 at 11:32 AM, arcnor said:

If this is some sort of bug, does anybody know of a workaround? I need to release this to my client as soon as possible, so hopefully I don't have to start from scratch...

If your objective is simply to make the invisible text visible in the pdf generated using pdfsandwich, then all you need to do is open it in Publisher, hide the image layer at the bottom of the layer stack, select the text layers, give them a colour and, if the default mapped font is not suitable, change it to something else.

If your objective is to create 'invisible' text using Publisher in a similar fashion to pdfsandwich and then export this to a pdf so it is truly 'invisible' but searchable in a pdf reader, then this is not possible. The closest you can likely get is to use the same colour text as your document background, so the text is not visible in the pdf reader but can be searched, like the file attached (e.g., search for Alice in Acrobat Reader). Obviously the text can still be copied and pasted anywhere and is therefore not 'secure', if that is the intention.

Can you elaborate a little further on what it is you need to achieve, i.e., is it simply to make the invisible text generated by pdfsandwich visible or is it to be able to create your own document with invisible text?

Alice in Wonderland.pdf

Affinity Designer 2.6.0 | Affinity Photo 2.6.0 | Affinity Publisher 2.6.0
MacBook Pro M3 Max, 36 GB Unified Memory, macOS Sonoma 14.6.1, Magic Mouse
HP ENVY x360, 8 GB RAM, AMD Ryzen 5 2500U, Windows 10 Home, Logitech Mouse

Posted

Thank you @Hangman, that makes a lot of sense. And thank you as well @MickRose, I think you still misunderstood what I was trying to do but I really appreciate you taking the time to help.

3 minutes ago, Hangman said:

In your original file, Untitled.pdf, all the text is selectable, but that is because the 'Test 2' and 'Test 3' text are set to 1% opacity and are therefore still visible, which you can see when zooming in to the text in Acrobat Reader.

Yeah, this is using the workaround above, because I couldn't find a better one.

4 minutes ago, Hangman said:

If your objective is to create 'invisible' text using Publisher in a similar fashion to pdfsandwich and then export this to a pdf so it is truly 'invisible' but searchable in a pdf reader, then this is not possible. The closest you can likely get is to use the same colour text as your document background, so the text is not visible in the pdf reader but can be searched, like the file attached (e.g., search for Alice in Acrobat Reader). Obviously the text can still be copied and pasted anywhere and is therefore not 'secure', if that is the intention.

This is exactly what I want, and it's really sad it's not possible to make it work. I feel that playing with the color is not really a good solution, because there is no perfect match between what the text shows and what the image below it has, so I guess for now I'll have to continue using the 1% opacity trick.

But given your explanation of what pdfsandwich does, maybe I can create a tool that will preprocess an existing file and just replace everything with a glyphless font + CIDToGIDMap in the same way. I'll have to brush up on my PDF knowledge, but that should be doable (for future projects; as I mentioned, I had to provide something for my client ASAP).
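As a starting point for such a tool, here is a rough sketch that writes a one-page PDF by hand with text that is present but never painted. Note that it swaps in a simpler technique than the glyphless font: it uses the PDF text rendering mode operator `3 Tr` (neither fill nor stroke) with the standard Helvetica font. The function name, page size and text position are made up for illustration, it only handles ASCII text, and it does not touch an existing file or place the scanned image.

```python
def build_invisible_text_pdf(text: str, width: int = 612, height: int = 792) -> bytes:
    """Build a single-page PDF whose text is searchable but not drawn.

    Sketch only: ASCII text, one line, fixed position, no image underneath.
    """
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # "3 Tr" selects text rendering mode 3: neither fill nor stroke.
    content = f"BT /F1 12 Tf 3 Tr 72 720 Td ({escaped}) Tj ET".encode("ascii")

    page = (
        f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
        "/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>"
    ).encode("ascii")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        page,
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # mandatory free entry for object 0
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    with open("invisible_text.pdf", "wb") as handle:
        handle.write(build_invisible_text_pdf("Test 1"))
```

Opening the resulting file in a reader should show a blank page on which a search for "Test 1" still finds a hit, though this has only been reasoned through, not checked against every viewer. A real tool would more likely use a PDF library to rewrite the text objects of the Publisher export rather than build the file from scratch.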

Thanks again!
