[PDF Import Export] invisible Text not selectable after export


I just ran into a problem: PDFs that had selectable text lose that ability when exported from Publisher 2.x (Final 2.1 and Beta 2.2.1931).

The documents were scanned with OCR using Epson Document Capture Pro v3.3.3. An example is attached.

One problem is that the image is on top and the text has no style or color.

The only workaround I have found is to give the text a color and set its opacity to 1%.

But that is still ugly.
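
For anyone who wants to check whether a scanned PDF carries this kind of hidden text layer, here is a minimal Python sketch. It assumes the OCR layer is written with PDF text rendering mode 3 (the `3 Tr` operator, meaning neither fill nor stroke), which is common for OCR output but is not confirmed here for Document Capture Pro. The function names are made up for illustration, and the byte scan only handles plain or Flate-compressed content streams, so treat it as a rough check rather than a real PDF parser.

```python
import re
import zlib

# Text rendering mode operator in a content stream, e.g. b"3 Tr".
TR_OPERATOR = re.compile(rb"(?<![\w.])(\d)\s+Tr(?![A-Za-z])")
# Raw bytes between the "stream" and "endstream" keywords of an object.
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def find_render_modes(pdf_bytes: bytes) -> list[int]:
    """Collect every text rendering mode set in the file's streams."""
    modes = []
    for match in STREAM.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes after the data.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            data = raw  # not Flate-compressed: scan the bytes as they are
        modes.extend(int(m.group(1)) for m in TR_OPERATOR.finditer(data))
    return modes


def has_invisible_text(pdf_bytes: bytes) -> bool:
    """True if any stream switches to mode 3 (invisible text)."""
    return 3 in find_render_modes(pdf_bytes)
```

Calling `has_invisible_text(open("pdf-example-selectable-text.pdf", "rb").read())` on the attached example would show whether the assumption holds for that file.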

 

Layer_Art_Text.png

 

 

While searching I found no bug report, but after a while I came across a discussion of the exact same problem, with the same workaround:

 

As I was unable to find an existing bug report, I thought it would be worth opening one, as I consider this a core feature.

 

 

pdf-example-selectable-text.pdf

System                         Notebook Lenovo P50
CPU: AMD Ryzen 3900x           CPU: Intel i7-6700HQ
RAM: 64 GB                     RAM: 32 GB
GPU: RTX 3080 TI               GPU: NVIDIA Quadro M2000M 4 GB
SSD: Samsung 980 PRO 1 TB      SSD: Samsung 850 Pro 500GB
OS: Windows 11 Edu x64         OS: Windows 10 x64
TFT: 1 x Samsung C49RG94SSU    TFT: 2 x Lenovo


Hi @af-user,

Is there any reason why you can't simply hide or delete the scanned layer, or move it to the bottom of the layer stack, and then just give the text layers a colour fill? Doing so exports your pdf with selectable, editable text.

What do you want to do with the exported document, and is there any need to keep the scanned layer, i.e., the layer at the top of the layer stack?

Affinity Designer 2.5.5 | Affinity Photo 2.5.5 | Affinity Publisher 2.5.5
Affinity Designer Beta 2.6.0.2861 | Affinity Photo Beta 2.6.0.2861 | Affinity Publisher Beta 2.6.0.2861

MacBook Pro M3 Max, 36 GB Unified Memory, macOS Sonoma 14.6.1, Magic Mouse
HP ENVY x360, 8 GB RAM, AMD Ryzen 5 2500U, Windows 10 Home, Logitech Mouse


Yes, this is what I do: I move the image to the back, give the text a color, and set its opacity to 1%.

The need is to have a visually unmodified, correct representation of the scanned document while still being able to search and copy text.

The question here is whether, when such a document is imported, Publisher could either do these steps automatically for a better import or, ideally, keep the full functionality of the original document.

 

To be more specific: in my case these are legal or signed documents, where the exact image must be preserved because OCR errors could do harm.

OCR text is still useful for searching and indexing documents.

 

Just for reference, there was a very funny "feature" in the past involving Xerox scanners, with an entertaining video about it (worth watching):

https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_are_switching_written_numbers_when_scanning


2 hours ago, af-user said:

The need is to have a visually unmodified, correct representation of the scanned document while still being able to search and copy text.

So, the pdf you uploaded is fully searchable and you can copy the text, though there are a number of typos from the OCR interpretation that could affect searchability. Is this the direct output from your Epson Document Capture Pro v3.3.3, or a version you've opened in Publisher to change the colour of the text and then re-exported?

 


The document was the scanned one from Document Capture Pro, not from Affinity.


As the pdf is searchable, is there a reason for opening it in Publisher, or is it simply to change the text colour in case someone needs to copy and paste the text?

I can open the pdf in Publisher, change the text colour and re-export it as a pdf and the text is still searchable and selectable…

Which pdf export settings are you using?


Putting multiple documents or pages together, adjusting sizes, etc.

Of course I could do it in other tools, but that is not what I want to do.

Oh, that sounds interesting. I exported with multiple presets, like digital hq and others, but the text was not selectable for me after exporting.

Will try again.

... ah, I see you changed the text color. Yes, then it works, as stated before. But in my opinion the import process should take care of this as well as possible.


If you Place rather than Open the scanned pdf files in Publisher, maintaining PDF Passthrough, that will allow you to put multiple documents/pages together and adjust the size of the scanned documents on each page. When exported from Publisher, the pdf will remain searchable and the text selectable, allowing you to copy and paste it.

There is also no need to change the colour of the invisible text because when you copy and paste text from the pdf into another document, e.g., Word, Google Docs etc., the text appears visible.

I don't know if that gives you what you are looking for. If not, let us know, as there are other possible options.


Indeed, that allows me to select the text, and I can still adjust the size.

In that case it is a much better workaround, as long as no other edits are needed!

 

Thanks!

 


Exactly, it becomes a little problematic if you need to make edits, inasmuch as the text, when opened in Publisher, suffers from tracking issues between accented and unaccented characters... plus some of the text is incorrectly interpreted, resulting in typos, but it is still possible...

This is a file I opened in Publisher, tweaked the scanned text in, and then exported from Publisher. The text is both searchable and selectable...

ocr-publisher-export.pdf

