philipt18 Posted September 2

I'm trying to do a regex search to find all special characters, such as ligatures and accented characters. I haven't found a great way to do this, but I've been playing with it and came up with:

[^a-zA-Z0-9‘’”“;.,-—–() !\n]

The idea is to exclude alphanumeric characters, punctuation, spaces, as well as the newline character. This works up to a point, but I still have almost 7000 matches. Some of these are great, such as finding non-curved quotes. However, many of the matches are footnotes and, I think, index references. Is there a way to exclude Affinity-specific references like those?
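One thing that may be worth checking in the pattern above: in most regex dialects, a hyphen between two characters inside a bracket expression defines a range, so the sequence comma, hyphen, em dash can be read as "everything from U+002C to U+2014" instead of three literal characters. If Affinity's engine behaves this way (not confirmed in this thread), accented letters and œ fall inside the excluded range and never match, while straight quotes still do. The sketch below uses Python's re module purely as a stand-in for illustration; the sample string and the variable names are hypothetical.

```python
import re

# The pattern as posted: ",-—" is parsed as a range from "," (U+002C)
# to "—" (U+2014), so most accented letters are silently excluded.
original = re.compile(r"[^a-zA-Z0-9‘’”“;.,-—–() !\n]")

# Same pattern with the hyphen escaped so it is a literal character.
escaped = re.compile(r"[^a-zA-Z0-9‘’”“;.,\-—–() !\n]")

sample = 'café "quoted" œuvre - done'  # hypothetical test text

original_hits = original.findall(sample)
escaped_hits = escaped.findall(sample)

print(original_hits)  # ['"', '"']
print(escaped_hits)   # ['é', '"', '"', 'œ']
```

Escaping the hyphen (or moving it to the very end of the bracket expression) is the usual way to make it literal; whether this changes the match count in Affinity would need to be tested there.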
MikeTO Posted September 2

To find all non-ASCII characters, use:

[\x7f-\xff]

You can pair this with a specific text style to eliminate more matches.
philipt18 Posted September 6 (Author)

On 9/2/2024 at 6:06 PM, MikeTO said:
To find all non-ASCII characters, use: [\x7f-\xff] You can pair this with a specific text style to eliminate more matches.

Thank you, that works pretty well. There is one odd thing happening, however. I have some ligature characters in the text, including æ and œ. I looked those up and they both fall within the 7F-FF range (æ is E6 and œ is 9C), but the search finds only æ, not œ. If I search for them directly, I find 15 instances each of æ and œ. Only the æ matches show up in the [\x7f-\xff] search. I assume this is actually a bug. I've attached a document which illustrates the problem. Presumably it's not just œ that isn't showing up, but I haven't tested anything else.

Attachment: ligature search bug.afpub
philipt18 Posted September 6 (Author)

So I've been doing a bit of testing, and it seems the range 128-159 (hex 80-9F) is not found by Affinity. That is presumably because that range is not defined in ISO-8859-1, while Windows-1252 does define it (and includes œ there). Apparently œ was added in ISO-8859-15. That's the only thing that makes sense to me. I guess I can just add those characters to the search, but part of what I'm trying to do is find characters I'm not expecting.
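A likely explanation, assuming the search engine compares Unicode code points and not Windows-1252 bytes: œ is byte 9C only in Windows-1252, while its Unicode code point is U+0153, which lies above \xff. The following Python sketch illustrates the difference; it says nothing definitive about Affinity's internals, and the sample text is hypothetical.

```python
import re

text = "æ œ é"  # hypothetical sample

# Unicode code points: æ and é are in Latin-1, œ is not.
code_points = {ch: hex(ord(ch)) for ch in "æœé"}
print(code_points)  # {'æ': '0xe6', 'œ': '0x153', 'é': '0xe9'}

# Byte values depend on the legacy encoding chosen.
cp1252_byte = "œ".encode("cp1252")        # b'\x9c'
latin9_byte = "œ".encode("iso8859_15")    # b'\xbd'
try:
    "œ".encode("latin-1")
    in_latin1 = True
except UnicodeEncodeError:
    in_latin1 = False                     # œ does not exist in ISO-8859-1

# A range capped at \xff misses œ; a wider range catches it.
narrow = re.findall(r"[\x7f-\xff]", text)
wide = re.findall(r"[\x80-\uffff]", text)
print(narrow)  # ['æ', 'é']
print(wide)    # ['æ', 'œ', 'é']
```

Under that assumption, widening the upper bound of the range is enough to catch characters that Windows-1252 places in 80-9F.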
Felix Kasza Posted September 6

[\x{0080}-\x{ffff}] will find both æ and œ.

The regex engine in APub makes me fairly happy (it does not yet do Unicode scripts or Unicode blocks, but the other \p{…} stuff seems to be there); the regrettable part is that there is no Unicode class that would let you select letters minus ASCII letters.

Also, the && syntax does not work as I had hoped: [[:graph:]&&[^a-zA-Z0-9‘’”“;.,-—–()!]] should find your ligatures, not to mention the code point á or its "a" + combining accent version. Regrettably, the engine seems to dislike "&&" in a bracket expression, dooming also attempts like [\p{Letter}&&[\x{007f}-\x{ffff}]] and [[:graph:]&&[:^ascii:]]. Set subtraction syntax also fails: [[:graph:]-[:ascii:]]

O Serif: It would be nice to know which regex engine you use. Thanks!
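When an engine lacks class intersection or subtraction, a negative lookahead placed in front of a broader class is a common way to approximate "this set minus that set". The sketch below shows the idea in Python's re module, which is only a stand-in; whether Affinity accepts lookaheads in this form is untested here, and the sample text is hypothetical. Note that Python's \w means "word character" (letters, digits, underscore), so this is an approximation of "letters minus ASCII letters", not an exact equivalent of \p{Letter}.

```python
import re

# (?![\x00-\x7F]) rejects any position where the next character is ASCII;
# \w then matches a word character, so only non-ASCII ones remain.
non_ascii_word_char = re.compile(r"(?![\x00-\x7F])\w")

sample = "naïve œuvre ﬁn x1"  # hypothetical test text

hits = non_ascii_word_char.findall(sample)
print(hits)  # ['ï', 'œ', 'ﬁ']
```

In Python, a combining accent on its own is not a word character, so a decomposed "a" + combining accent would need a separate pattern.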
walt.farrell Posted September 6

5 minutes ago, Felix Kasza said:
It would be nice to know which regex engine you use.

Boost. But I don't think they've ever documented exactly which level of the engine, nor exactly which initialization parameters they pass into it (since it can operate in several different modes).