Author Topic: Struggeling with RegExp - Experts needed.  (Read 4469 times)

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Struggeling with RegExp - Experts needed.
« on: November 28, 2006, 09:38:06 pm »
Dear all,
since Thomas is not available, I decided to ask here, hoping to find some RegExp experts. I'm trying to fix a bug in macrosmanager.cpp concerning the new $relative() and $absolute() macros. This is how far I have come:
The RegExp "m_re_path" does not seem to extract the string correctly. From e.g.
$relative($(TARGET_OUTPUT_DIR))lib$(TARGET_OUTPUT_BASENAME).a
*should* be extracted:
$relative($(TARGET_OUTPUT_DIR))
but *is* extracted:
$relative($(TARGET_OUTPUT_DIR)
(Notice the missing trailing bracket.)
I thought changing the RegExp from
([^$]|^)(\$(absolute|relative)\(([^)]+)\))
into:
([^$]|^)(\$(absolute|relative)\(([^)]+\))\))
might solve the problem, but it doesn't, as it seems the intended functionality (the argument can contain nested macros or plain strings) gets lost. Any ideas???
With regards, Morten.
Ps.: For those having more insight: The guess that MakeRelativeTo() might be the cause of the bug is wrong. It works properly, also for paths with a trailing path separator.
« Last Edit: November 28, 2006, 09:39:45 pm by MortenMacFly »
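The behaviour described in this post can be reproduced outside the IDE. The following is a minimal sketch, assuming Python's `re` module as a stand-in for wxRegEx (the two engines may differ in details); the helper `extract` and the `plain` input are hypothetical and only serve the illustration. It suggests that the second pattern fixes the nested case but requires two closing parentheses, so a plain argument no longer matches.

```python
import re

# Patterns copied from the post; Python's re is an assumed stand-in for wxRegEx.
ORIGINAL = re.compile(r'([^$]|^)(\$(absolute|relative)\(([^)]+)\))')
ATTEMPT = re.compile(r'([^$]|^)(\$(absolute|relative)\(([^)]+\))\))')

def extract(pattern, text):
    """Return capture group 2 (the whole macro call), or None if nothing matches."""
    m = pattern.search(text)
    return m.group(2) if m else None

nested = '$relative($(TARGET_OUTPUT_DIR))lib$(TARGET_OUTPUT_BASENAME).a'
plain = '$relative(obj/Release)'  # hypothetical input without a nested macro

print(extract(ORIGINAL, nested))  # $relative($(TARGET_OUTPUT_DIR)   <- closing bracket missing
print(extract(ATTEMPT, nested))   # $relative($(TARGET_OUTPUT_DIR))
print(extract(ORIGINAL, plain))   # $relative(obj/Release)
print(extract(ATTEMPT, plain))    # None  <- plain arguments are lost
```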

Offline dmoore

  • Developer
  • Lives here!
  • *****
  • Posts: 1576
Re: Struggeling with RegExp - Experts needed.
« Reply #1 on: November 28, 2006, 11:20:43 pm »
Shouldn't the inner substitution $(TARGET_OUTPUT_DIR) be found and substituted first? So your m_re_path regex should exclude cases where group 4's text contains nested macro variables until after they have been substituted?

something like
Code
"([^$]|^)(\$(absolute|relative)\((?!.*\$\(.*\).*)(.*)\))"

where
Code
(?!.*\$\(.*\).*)

excludes patterns that have unsubstituted macros
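A quick way to see what the lookahead accepts and rejects is to run it on a few inputs. This is only a sketch, assuming Python's `re` in place of wxRegEx; the substituted path `bin/Release/` and the file name `libfoo.a` are made-up values. Note that in this engine the lookahead scans the rest of the line, not just the macro argument, so a macro appearing later in the string also blocks the match.

```python
import re

# Pattern copied from the post; Python's re is an assumed stand-in for wxRegEx.
LOOKAHEAD = re.compile(r'([^$]|^)(\$(absolute|relative)\((?!.*\$\(.*\).*)(.*)\))')

def extract(text):
    """Return capture group 2 (the whole macro call), or None if nothing matches."""
    m = LOOKAHEAD.search(text)
    return m.group(2) if m else None

# Nothing substituted yet: the nested $(...) makes the lookahead reject the match.
unsubstituted = '$relative($(TARGET_OUTPUT_DIR))lib$(TARGET_OUTPUT_BASENAME).a'
# Hypothetical result after every $(...) has been expanded.
substituted = '$relative(bin/Release/)libfoo.a'
# Only the inner macro expanded: a $(...) later on the line still blocks the match.
partial = '$relative(bin/Release/)lib$(TARGET_OUTPUT_BASENAME).a'

print(extract(unsubstituted))  # None
print(extract(substituted))    # $relative(bin/Release/)
print(extract(partial))        # None
```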
« Last Edit: November 29, 2006, 12:10:08 am by dmoore »

joat

  • Guest
Re: Struggeling with RegExp - Experts needed.
« Reply #2 on: November 28, 2006, 11:30:31 pm »
Try ([^$]|^)(\$(absolute|relative)\(([^)]+))\)\)
Ignore that (my brain was kinda fried since I wrote it after staying up two days straight writing a research paper).

Let me make sure I've got this straight. The problem is that you want to have stuff nested in $relative() and extract $relative() with whatever is inside of it? How flexible is the accepted nesting? Would more than one item be allowed in a set of braces such as $relative(something()somethingelse()) ? If not, and only one item is allowed inside each set of braces like $relative(something(somethingelse())) then ([^$]|^)(\$(absolute|relative)\(([^)]+))\)+ should work since all of the closing braces would be together. To be on the safe side, it might also be a good idea to use some code to verify that all open braces are closed before and after applying the regex in this case.
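The suggested safety check could look roughly like the sketch below. It assumes Python's `re` in place of wxRegEx, and the helpers `parens_balanced` and `extract` as well as the second test input are hypothetical. The second input shows why the check is useful: `\)+` can swallow a closing brace that belongs to the surrounding text.

```python
import re

# Pattern copied from the post; Python's re is an assumed stand-in for wxRegEx.
CLOSING_RUN = re.compile(r'([^$]|^)(\$(absolute|relative)\(([^)]+))\)+')

def parens_balanced(text):
    """True if every '(' has a matching ')' and none closes too early."""
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def extract(text):
    """Return the macro call including its run of closing braces, or None."""
    m = CLOSING_RUN.search(text)
    # Group 2 starts at the '$'; the match end includes the trailing ')' run.
    return text[m.start(2):m.end()] if m else None

good = extract('$relative($(TARGET_OUTPUT_DIR))lib$(TARGET_OUTPUT_BASENAME).a')
bad = extract('($relative(x))')  # hypothetical: macro wrapped in extra braces

print(good, parens_balanced(good))  # $relative($(TARGET_OUTPUT_DIR)) True
print(bad, parens_balanced(bad))    # $relative(x)) False
```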
« Last Edit: November 30, 2006, 09:16:33 am by joat »

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Struggeling with RegExp - Experts needed.
« Reply #3 on: November 28, 2006, 11:57:52 pm »
Morten, you are very lucky ;).
In revision 3300 I committed a little plugin I had sitting here for some time which just might help you: a (very) simple regular expressions testbed. You give it a regex, a test input and click a button for the results :).
Be patient!
This bug will be fixed soon...

Offline dmoore

  • Developer
  • Lives here!
  • *****
  • Posts: 1576
Re: Struggeling with RegExp - Experts needed.
« Reply #4 on: November 29, 2006, 12:03:28 am »
Quote from: mandrav
Morten, you are very lucky ;).
In revision 3300 I committed a little plugin I had sitting here for some time which just might help you: a (very) simple regular expressions testbed. You give it a regex, a test input and click a button for the results :).

Even more reason to make the C::B editor support "Advanced" wxRegex for find and replace (my POSIX fix is not nearly enough). The only reason Scintilla doesn't support "advanced" regexes natively is licensing concerns. Since C::B already uses wxWidgets with advanced regexes embedded, licensing should not be a concern. I may have time to put up a patch next week if no one else beats me to it. ;)
« Last Edit: November 29, 2006, 12:08:18 am by dmoore »

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: Struggeling with RegExp - Experts needed.
« Reply #5 on: November 29, 2006, 12:43:01 am »
Neither of your proposals is good, unfortunately.

Joat's proposal will grab the right string, but it misses the point because it will not work with anything that is not quoted.
Dmoore's proposal does not work because variables cannot be expanded first (without some nasty hack), as that would replace $relative with an empty string and leave the brackets untouched.

The problem is so complicated because the regex is greedy. According to the documentation, you can use +? for a non-greedy match, so \(.+?\) would be the right thing. However, I tried that and it did not compile...
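For comparison, here is a sketch of how the greedy and non-greedy quantifiers behave on the example input. It assumes Python's `re`, which does compile `+?`; wxRegEx may behave differently, and both patterns are simplified illustrations rather than the actual m_re_path. In this engine the non-greedy form stops at the first closing bracket and the greedy form runs to the last one on the line.

```python
import re

# Simplified illustrative patterns (not m_re_path); Python's re is an assumption.
LAZY = re.compile(r'\$(absolute|relative)\((.+?)\)')
GREEDY = re.compile(r'\$(absolute|relative)\((.+)\)')

text = '$relative($(TARGET_OUTPUT_DIR))lib$(TARGET_OUTPUT_BASENAME).a'

lazy_match = LAZY.search(text).group(0)
greedy_match = GREEDY.search(text).group(0)

print(lazy_match)    # $relative($(TARGET_OUTPUT_DIR)
print(greedy_match)  # $relative($(TARGET_OUTPUT_DIR))lib$(TARGET_OUTPUT_BASENAME)
```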
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline dmoore

  • Developer
  • Lives here!
  • *****
  • Posts: 1576
Re: Struggeling with RegExp - Experts needed.
« Reply #6 on: November 29, 2006, 11:15:57 pm »
Quote from: thomas
Dmoore's proposal does not work because variables cannot be expanded first (without some nasty hack), as that would replace $relative with an empty string and leave the brackets untouched.

I didn't claim a nasty hack wouldn't be required, but that's what happens when you try to be too cute with macro syntax :) (don't get me wrong, I like cute). I guess I am not sure how general these absolute/relative functions need to be. For instance, should this work:

(1) $relative($(LIBNAME($LIBTYPE)))

or only this:

(2) $relative($(LIBNAME))

and this

(3) $relative($LIBNAME)

(there's also the % syntax to worry about, or was that removed?)

if all of (1), (2) and (3) are required then you can see why I proposed what I did (because (1) wouldn't make any sense until all other substitutions had been made), albeit with the side effect that you would need to avoid substitution of "$relative" or "$absolute". If only (2) and (3) are required then you could capture what you need with:

Code
([^$]|^)(\$(absolute|relative)\((\$\([^\(\)]+\)|[^\(\)]+)\))

(you may also want to allow for whitespace in the brackets with a few " *" scattered about)
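The pattern above can be checked against the three numbered cases. This is a minimal sketch, assuming Python's `re` as a stand-in for wxRegEx; the helper `extract` is hypothetical. Under that assumption, cases (2) and (3) are captured, case (1) is not, and the string from the first post yields the complete $relative(...) call.

```python
import re

# Pattern copied from the post; Python's re is an assumed stand-in for wxRegEx.
ONE_LEVEL = re.compile(
    r'([^$]|^)(\$(absolute|relative)\((\$\([^\(\)]+\)|[^\(\)]+)\))'
)

def extract(text):
    """Return capture group 2 (the whole macro call), or None if nothing matches."""
    m = ONE_LEVEL.search(text)
    return m.group(2) if m else None

case1 = extract('$relative($(LIBNAME($LIBTYPE)))')  # two levels of nesting
case2 = extract('$relative($(LIBNAME))')            # one $(...) inside
case3 = extract('$relative($LIBNAME)')              # bare $NAME inside
first_post = extract('$relative($(TARGET_OUTPUT_DIR))lib$(TARGET_OUTPUT_BASENAME).a')

print(case1)       # None
print(case2)       # $relative($(LIBNAME))
print(case3)       # $relative($LIBNAME)
print(first_post)  # $relative($(TARGET_OUTPUT_DIR))
```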

PS: I'm not sure how non-greedy search would help in this case. Isn't the problem that the regex isn't greedy enough when there are sub-braces?
« Last Edit: November 29, 2006, 11:21:26 pm by dmoore »