
Regex for tagged text


uneMule


Hello everyone.
If you, too, sometimes receive text containing HTML tags, here is a small expression that will let you clean it up.

regex_html.afpub

Toujours pas !
Windows 10 Pro 21H2 - Intel Core i7-3630QM CPU @ 2.40GHz - 16 Gb Ram - GeForce GT 650M - Intel HD 4000
Affinity Photo | Affinity Designer | Affinity Publisher | 2


Hello @walt.farrell

You made an interesting suggestion in an earlier topic for limiting the search area with two delimiting tags, but I can't find that topic anymore.


I mentioned it several times, but apparently described the details only once:

 

-- Walt
Designer, Photo, and Publisher V1 and V2 at latest retail and beta releases
PC:
    Desktop:  Windows 11 Pro, version 23H2, 64GB memory, AMD Ryzen 9 5900 12-Core @ 3.00 GHz, NVIDIA GeForce RTX 3090 

    Laptop:  Windows 11 Pro, version 23H2, 32GB memory, Intel Core i7-10750H @ 2.60GHz, Intel UHD Graphics Comet Lake GT2 and NVIDIA GeForce RTX 3070 Laptop GPU.
iPad:  iPad Pro M1, 12.9": iPadOS 17.4.1, Apple Pencil 2, Magic Keyboard 
Mac:  2023 M2 MacBook Air 15", 16GB memory, macOS Sonoma 14.4.1


@walt.farrell

Your memory works better than mine.
My expression (?<=@@)(?:.*)(\<les\>)(?:.*)(?=@@) selects only one occurrence, although I can replace $1 with something else. The @@ are the delimiting tags.


I don't think I had even considered using lookahead or lookbehind in that earlier discussion.

-- Walt

2 minutes ago, walt.farrell said:

I don't think I had even considered using lookahead or lookbehind in that earlier discussion.

No, I'm the one who went in that direction.


@walt.farrell  (%%%.*?)(\bTom\b)(.*%%%)

Well, no... I find only one occurrence, and the (.*%%%) does not match across \n or \r.
Or maybe I'm just tired, too!


You're right. With something like this:

Find:    (%%%.*?)(sed)(.*%%%)
Replace: \1===\3

you will only find one occurrence at a time. So you first press Find; if there is an occurrence of "sed" somewhere in the delimited area, Find selects the entire text string from the opening %%% through the ending %%%.

You then press Replace, and the first occurrence of "sed" is replaced by "===".

You then press Find again, and if there is a search result, you press Replace. Repeat until the Find fails.

Note: You can just keep pressing Replace after the first Find without pressing Find again. But you never know when you are finished, as the result list is never cleared that way.
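
The Find, Replace, Find again routine above can be sketched outside Affinity as a loop. This is only an illustration in Python, assuming that Python's re module behaves like Affinity's regex engine for this pattern (not verified here); the function name replace_inside_markers and the sample text are made up for the example.

```python
import re

# Same pattern as in the Find field above.
PATTERN = re.compile(r"(%%%.*?)(sed)(.*%%%)")


def replace_inside_markers(text: str, replacement: str = "===") -> tuple[str, int]:
    """Do one Find + Replace at a time until Find fails.

    Returns the final text and the number of replacements made.
    The replacement must not itself contain the searched word,
    otherwise the loop would never end.
    """
    passes = 0
    while True:
        # count=1 mimics pressing Replace once on the current Find result.
        new_text, found = PATTERN.subn(
            lambda m: m.group(1) + replacement + m.group(3), text, count=1
        )
        if found == 0:  # Find fails: nothing left between the markers
            return text, passes
        text = new_text
        passes += 1


sample = "sed %%% blabla sed blabla sed blabla.%%% sed"
result, passes = replace_inside_markers(sample)
print(result)  # sed %%% blabla === blabla === blabla.%%% sed
print(passes)  # 2
```

Each pass replaces the first remaining "sed" between the markers; the two "sed" outside the %%% markers are left alone.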

-- Walt

49 minutes ago, walt.farrell said:

Note: You can just keep pressing Replace after the first Find without pressing Find again. But you never know when you are finished, as the result list is never cleared that way.

Thanks, I'm testing it.
It's true that this issue has been raised before: dynamic updating of the search results list (and of the occurrence count).


49 minutes ago, uneMule said:

the dynamic updating of the search list (as well as the number of hits).

Yes, that has come up. The list updates if you press Find again. It does not update if you simply keep pressing Replace.

It is, of course, inconvenient to have to click Find, then Replace, then Find, then Replace, ..., when you could simply click Find once, then Replace, Replace, ....

-- Walt

@walt.farrell

This works in this case: %%% blabla sed blabla sed blabla.%%%

but not in this one: %%% blabla sed blabla sed blabla.\n.blabla ... blabla.%%%

I think the reason is the same: (.*%%%) does not match across \n or \r.

And [\n.]*%%% finds nothing. I wonder whether the . becomes a literal . inside []?

What do you think about this: (%%%.*?)(\bsed\b)([[:print:]]*%%%)
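
On the [\n.] question: in the common regex flavours a dot inside square brackets is a literal dot, so [\n.] matches only a line break or a full stop, not "any character". A small check in Python, assuming Python's re module stands in for Affinity's engine here (the sample text is made up; [[:print:]] is left out because Python's re does not support POSIX classes):

```python
import re

sample = "%%% blabla sed blabla.\nblabla.%%%"

# Inside [...] the dot is literal: only the two full stops and the
# line break are found, none of the letters or spaces.
literal_class = re.findall(r"[\n.]", sample)
print(literal_class)  # ['.', '\n', '.']

# So [\n.]* cannot bridge ordinary text between "sed" and the closing marker.
fails = re.search(r"sed[\n.]*%%%", sample)
print(fails)  # None

# An alternation outside the brackets does mean "any character or a line break".
works = re.search(r"sed(?:.|\n)*%%%", sample)
print(works is not None)  # True
```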


1 hour ago, uneMule said:

For me, the reason is the same (.*%%%) doesn't look for any \n or \r

In the Find field, click the Format cog icon, select Regular Expression Options, and enable "Dot matches paragraph break".

Sorry; forgot that bit of information.

-- Walt

3 minutes ago, walt.farrell said:

Sorry; forgot that bit of information.

No problem. Of course.


  • 2 months later...
On 6/19/2021 at 1:40 AM, walt.farrell said:

In the Find field, click the Format cog icon, select Regular Expression Options, and enable "Dot matches paragraph break".

@walt.farrell Good evening.

Or put (?s) at the beginning of the pattern. That lets you set the option within the pattern itself, which is neater.
(?s)(%%%.*?)(\bsed\b)(.*%%%)
I'm a bit late to this :)
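
For anyone who wants to try the difference outside Affinity, here is a minimal sketch in Python. It assumes Python's re module treats (?s) the same way as Affinity's engine, and that the re.DOTALL flag is the counterpart of the "Dot matches paragraph break" option; the sample text is made up.

```python
import re

sample = "%%% blabla sed blabla.\nblabla ... blabla.%%%"

# Default: the dot stops at the line break, so the closing %%% is never reached.
default = re.search(r"(%%%.*?)(\bsed\b)(.*%%%)", sample)
print(default)  # None

# (?s) at the start of the pattern lets the dot match line breaks too.
inline = re.search(r"(?s)(%%%.*?)(\bsed\b)(.*%%%)", sample)
print(inline.group(2))  # sed

# Same effect with the option set outside the pattern.
flagged = re.search(r"(%%%.*?)(\bsed\b)(.*%%%)", sample, flags=re.DOTALL)
print(flagged.group(0) == inline.group(0))  # True
```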


Hi @walt.farrell

It's me again.

This is a new discovery. Especially for me :)
This regex works, without any additional option:
(%%%[\s\S]*?)(\bsed\b)([\s\S]*%%%)

"...you can use a character class such as [\s\S] to match any character. This character matches a character that is either a whitespace character (including line break characters), or a character that is not a whitespace character."
Source : https://www.regular-expressions.info/dot.html
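
A quick way to check this outside Affinity is the short Python sketch below. It assumes Python's re module handles [\s\S] like Affinity's engine does; the sample text is made up.

```python
import re

# No flag and no (?s): [\s\S] matches any character, line breaks included.
pattern = re.compile(r"(%%%[\s\S]*?)(\bsed\b)([\s\S]*%%%)")

sample = "%%% blabla\nsed blabla.\nblabla.%%% trailing sed"

match = pattern.search(sample)
print(match.group(2))  # sed

# One replacement, as with a single press of Replace.
replaced = pattern.sub(r"\g<1>===\g<3>", sample, count=1)
print(replaced)
```

One thing to keep in mind: the last group is greedy, so it runs to the last %%% in the text. If the text holds several marked regions, a single match can span all of them.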

