Seneca

GREP bug


Peter Kahrel wrote a nice little grep that applies a background colour to every second paragraph in a story.

see: https://indesignsecrets.com/adding-shading-to-alternate-paragraphs-with-grep-find-change.php?utm_source=CreativePro+Network+List&utm_campaign=2b69fc0643-InDesign_Secrets_Tip_of_the_Week_1_28_2016_COPY_01&utm_medium=email&utm_term=0_98b7b678b4-2b69fc0643-262960537

I tried to reproduce this example in Publisher but unfortunately Publisher doesn't select the consecutive 2 paragraphs correctly and instead selects the whole text.

The grep in question is:

Find: \r.+\r\K.

Replace: Shaded Paragraph Style

Once this bug is sorted out, this grep will work as per the article above.

Of course one would need to define a Shaded Para Style for GREP to apply it to the selection.

On 4/12/2019 at 8:44 AM, Seneca said:

I tried to reproduce this example in Publisher but unfortunately Publisher doesn't select the consecutive 2 paragraphs correctly and instead selects the whole text. 

That regex, as written, doesn't work with Publisher's regex engine, because there "." also matches a new-line or a paragraph break (\r).

The regex needs to be
     \r.+?\r\K.
where the ? stops the ".+" from being a greedy match. Without the ? the regex matches (approximately) everything from the first \r to the last \r.

Edit: Or, alternatively, \r[^\r]+\r\K.
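
To see the greedy-versus-lazy difference outside Publisher, here is a minimal Python sketch. It rests on a few assumptions: the five-word sample text is made up, Python's re module stands in for Publisher's engine (its "." matches \r by default, which happens to mirror the behavior described above), and because re has no \K a capturing group marks the character that \K. would select.

```python
import re

# Five one-word paragraphs joined by \r: an assumed stand-in for a Publisher story.
text = "One\rTwo\rThree\rFour\rFive"


def match_starts(pattern, s):
    """Return the index of the captured character for every match."""
    return [m.start(1) for m in re.finditer(pattern, s)]


# Python's re has no \K, so group 1 plays the role of the "." after \K.
greedy = match_starts(r"\r.+\r(.)", text)       # ".+" runs on to the last \r
lazy = match_starts(r"\r.+?\r(.)", text)        # ".+?" stops at the next \r
negated = match_starts(r"\r[^\r]+\r(.)", text)  # class cannot cross a \r at all

print(greedy)   # [19]
print(lazy)     # [8, 19]
print(negated)  # [8, 19]
```

Index 8 is the T of Three and 19 is the F of Five, so the lazy and negated-class forms hit every second paragraph, while the greedy form runs from the first \r to the last and finds only the final one.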

 


-- Walt

Windows 10 Home, version 1903 (18362.356), 16GB memory, Intel Core i7-6700K @ 4.00Gz, GeForce GTX 970
Affinity Photo 1.7.3.481 and 1.8.0.486 Beta   / Affinity Designer 1.7.3.481 and 1.8.0.486 Beta  / Affinity Publisher 1.7.3.481 and 1.8.0.499 Beta

13 minutes ago, MikeW said:

I'm away from the computer right now, but the decimal point should not include a new line.

It can match new-line, or not, depending on either options chosen by the user (and passed through by the program) or options coded into the program that invokes the regex parser.

In Notepad++, for example, the program gives the user the option of how that aspect of regex parsing should work. Publisher does not give us that option, and seems to hard-code it as ". will match new-line". But there's a user-controllable way around that. By adding (?-s) at the start of the regex the user can reverse that behavior.

However, I suspect there is a bug, though I haven't figured out what it is. I've attached a .afpub file with some text.

A regex Find for either "\r[^\r]+\r\K."  (without the " marks) or "(?-s)\r.+\r\K." should give the same results, but they don't.
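
As a rough illustration of why those two forms ought to agree, here is a Python sketch. The assumptions are worth spelling out: Python's re does not accept a leading (?-s), so its default mode plays that role and (?s) switches dot-matches-newline on; \n stands in for the paragraph break, since "." in Python never excludes \r; the sample text is invented and a capturing group replaces \K.

```python
import re

# \n is the separator here because Python's "." excludes only \n by default.
text = "One\nTwo\nThree\nFour\nFive"


def captured_starts(pattern, s):
    """Return the index of the captured character for every match."""
    return [m.start(1) for m in re.finditer(pattern, s)]


# Default mode: "." cannot cross a separator (the role "(?-s)" plays in the thread).
dot_off = captured_starts(r"\n.+\n(.)", text)
# "(?s)" turns dot-matches-newline on, so the greedy ".+" swallows separators.
dot_on = captured_starts(r"(?s)\n.+\n(.)", text)
# The negated class never crosses a separator, whatever the dot mode is.
negated = captured_starts(r"\n[^\n]+\n(.)", text)

print(dot_off)  # [8, 19]
print(dot_on)   # [19]
print(negated)  # [8, 19]
```

With the dot prevented from matching the separator, the plain greedy pattern and the negated-class pattern land on the same characters, which is the equivalence one would expect from an engine that honors (?-s).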

 

 

regex-dot-s.afpub


-- Walt


Yes, Notepad++ has the switch for . matching a new line. It is a distinct switch one needs to turn on. Absent such a switch, I believe the default behavior should be that . does not match a new line; should Serif desire to add the switch, then all is well and good.

And as far as that goes, one shouldn't need the . after the \K. The cursor should just be placed at the start of the appropriate paragraph, rather than selecting its first character as in ID. I frequently use the same expression in UltraEdit, sans the . following the \K, to insert tagged-text paragraph style tags where every other paragraph cycles between two p.styles.

fwiw, I see no difference here between:

\r.+?\r\K.
\r[^\r]+\r\K.
(?-s)\r.+\r\K.

All select the first character in every other paragraph.
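
For the case without the trailing ., where only a position is wanted, a Python sketch along these lines may help. Everything here is illustrative: Python's re lacks \K, so the end of each \r[^\r]+\r match serves as the cursor position, and the <pstyle:Shaded> tag is a hypothetical placeholder, not UltraEdit's or InDesign's actual tagged-text syntax.

```python
import re

# Invented sample: five one-word paragraphs separated by \r.
text = "One\rTwo\rThree\rFour\rFive"
TAG = "<pstyle:Shaded>"  # hypothetical tag, for illustration only

# One match covers "break, skipped paragraph, break".
pair = re.compile(r"\r[^\r]+\r")

# The end of each match is where a bare \K would leave the cursor.
positions = [m.end() for m in pair.finditer(text)]

# Re-emit the matched text and append the tag at that position.
tagged = pair.sub(lambda m: m.group(0) + TAG, text)

print(positions)     # [8, 19]
print(repr(tagged))  # 'One\rTwo\r<pstyle:Shaded>Three\rFour\r<pstyle:Shaded>Five'
```

The tag ends up at the start of the third and fifth paragraphs without any character being consumed from them.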

 

31 minutes ago, MikeW said:

Yes, Notepad++ has the switch for . matching a new line. It is a distinct switch one needs to turn on. Absent such a switch, I believe the default behavior should be that . does not match a new line; should Serif desire to add the switch, then all is well and good.

Either they could add the external switch (probably under the cog icon along with the other Find options) or they could simply require users to specify (?s) if the user wants the "." to match newline and paragraph break.

12 minutes ago, v_kyr said:

Usually for reg exp a "." matches any single character except newline!

Yes, usually that's true.


-- Walt

35 minutes ago, MikeW said:

And as far as that goes, one shouldn't need the . after the \K. The cursor should just be placed at the start of the appropriate paragraph, rather than selecting its first character as in ID. I frequently use the same expression in UltraEdit, sans the . following the \K, to insert tagged-text paragraph style tags where every other paragraph cycles between two p.styles.

Yes, the behavior in Publisher is strange. Without the . the cursor is set, but it's set just before the visible paragraph break character. We could see in a prior beta (before Serif updated the results list to show the encoded characters rather than \n, etc.) that a paragraph break is a \n\r internally in the file. I suspect that the cursor is being set improperly, after the \n and before the \r:

[Attached image]


-- Walt


I hate non-standard behavior, especially for something as old-school as regular expressions. One should not deviate from the general standards, because that's just confusing. Either program it to behave correctly or not at all. Strange workarounds and mismatched behavior are a no-go.


☛ Affinity Designer 1.7.3 ◆ Affinity Photo 1.7.3 ◆ OSX El Capitan

44 minutes ago, MikeW said:

fwiw, I see no difference here between:

\r.+?\r\K.
\r[^\r]+\r\K.
(?-s)\r.+\r\K.

All select the first character in every other paragraph.

For me, the first 2 give identical results. The 3rd selects only the Y at the start of the 5th paragraph.

(And now, somehow, I've managed to break Find completely. Neither normal nor regex searches find anything. After restarting Publisher, a normal Find worked, but none of those 3 regexes would work, and as soon as I tried them even a normal Find was broken again.)


-- Walt
