Regular expression hang  
Sander Bos





PostPosted: Regular Expressions, Regular expression hang Top

Hi there,

if I execute this sample code

Console.WriteLine("Start");

Regex regularExpression = new Regex( );

regularExpression.Match( );

Console.WriteLine("Before hang");

regularExpression.Match( );

Console.WriteLine("After");

the second Match call hangs the process: 'After' is never reached, and the Match call just takes 100% CPU. (It does not matter that it is the second Match call; the first call only shows that the regular expression does work on a similar but smaller string.) I wonder whether this is a known issue. I found this forum while looking for a known-issues page.

(I am using .NET v2.0.50727)

Kind regards,

--Sander.



 
 
Lepaca






Use lazy search:

<%!(?<declaration>.*?)%>


 
 
Sander Bos






Hello Lepaca,

thanks for your answer.

In fact I had already figured out as much, but the regex library still should not hang like this, should it? Or am I 'saying' something with my expression that just tells the regular expression system to hang?

I thought the lazy option would only cause it to not find a later %> in the input, but since there is only one in my input, it should not make a difference for this input string?

Kind regards,

--Sander.


 
 
Lepaca






The problem is in (.|[\s]).
"." also matches whitespace, so the two alternatives overlap, and this confuses the regex engine.
This is essentially a bug in the Regex engine.
You can write:
<%!(?<declaration>.*)%>
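
A minimal sketch of one way to avoid the overlapping alternation while still matching across line breaks: a single character class such as [\s\S] matches any character, newlines included, so each character can only be consumed one way. The original pattern is not shown in the thread, so the pattern, the class name AnyCharDemo, and the sample input below are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnyCharDemo
{
    // [\s\S] is one character class that matches every character,
    // newlines included, so there is no alternation for the engine
    // to retry. Illustrative pattern, not the one from the original post.
    public static readonly Regex Declaration =
        new Regex(@"<%!(?<declaration>[\s\S]*?)%>");

    // Returns the captured declaration text, or null when nothing matches.
    public static string Extract(string input)
    {
        Match m = Declaration.Match(input);
        return m.Success ? m.Groups["declaration"].Value : null;
    }

    public static void Main()
    {
        string input = "<%!\nfoo\n%>   trailing text";
        Console.WriteLine(Extract(input) == "\nfoo\n"); // True
    }
}
```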

The difference between lazy and greedy search shows with this string:
<%! a %> b %> c %>
With greedy (.*) you find <%! a %> b %> c %>
With lazy (.*?) you find <%! a %>
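
The greedy/lazy difference above can be checked with a short program. This is only a sketch: it assumes the two patterns are <%!(?<declaration>.*)%> and <%!(?<declaration>.*?)%>, and the class name GreedyVsLazy and the helper FirstMatch are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Returns the text of the first match of pattern in input
    // (an empty string when there is no match).
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "<%! a %> b %> c %>";

        // Greedy: .* runs to the end, then backs up to the last %>.
        Console.WriteLine(FirstMatch(@"<%!(?<declaration>.*)%>", input));
        // prints: <%! a %> b %> c %>

        // Lazy: .*? stops at the first %> it can reach.
        Console.WriteLine(FirstMatch(@"<%!(?<declaration>.*?)%>", input));
        // prints: <%! a %>
    }
}
```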



 
 
Sander Bos





Hello Lepaca,

once again thank you for your answer.

One more small remark from my end, for completeness, on why I use .|\s: I think this is another issue with the Regex library. In real life I use the Multiline option, and then I have this example program:

Regex regularExpression1 = new Regex( , RegexOptions.Multiline);
Regex regularExpression2 = new Regex( , RegexOptions.Multiline);
string testString1 = @"
<%! foo %>
";
string testString2 = @"
<%!
foo
%>
";
Console.WriteLine(regularExpression1.Match(testString1).Success);
Console.WriteLine(regularExpression1.Match(testString2).Success);
Console.WriteLine(regularExpression2.Match(testString1).Success);
Console.WriteLine(regularExpression2.Match(testString2).Success);

You can see that regularExpression1 and regularExpression2 are mostly the same, except that one uses . to match and the other uses .|\s. The .-only one does not give me a match for testString2, which contains line breaks (so the output of this program is three True's followed by a False).

Kind regards,

--Sander.


 
 
Lepaca






"." matches every character except newline.
If you also want to match newline, you can use (.|\n) or the Singleline option.
Note that Singleline and Multiline are different options, and you can use both at the same time.
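
As a rough illustration of the point above: RegexOptions.Singleline changes what "." matches, while RegexOptions.Multiline only changes where ^ and $ match. The pattern, the input, and the class name SinglelineDemo are assumptions chosen for this sketch; they are not taken from the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Tries a "."-based pattern against input that contains line breaks.
    public static bool Matches(RegexOptions options)
    {
        string input = "<%!\nfoo\n%>";
        return Regex.IsMatch(input, @"<%!(?<declaration>.*?)%>", options);
    }

    public static void Main()
    {
        // Without Singleline, "." cannot cross the newline characters.
        Console.WriteLine(Matches(RegexOptions.None));       // False
        Console.WriteLine(Matches(RegexOptions.Multiline));  // False

        // With Singleline, "." matches newline as well.
        Console.WriteLine(Matches(RegexOptions.Singleline)); // True
        Console.WriteLine(Matches(RegexOptions.Singleline | RegexOptions.Multiline)); // True
    }
}
```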

 
 
Sander Bos





Thanks, I was not aware of the Singleline option.

--Sander.


 
 
OmegaMan





Hi Sander, did the posts resolve the problem?

If so, mark the post(s) that helped you as the answer(s), so that when others search the forums, they are more inclined to look at a successful post than an unsuccessful one. In the search results, answered posts are bubbled to the top, ahead of the unanswered ones.

Or post what it is that you want the regex to do. Give us sample data and what is expected in return, and maybe the regex could be crafted to work. Thanks.




 
 
Sander Bos






Hello OmegaMan,

I have received other messages to mark an answer as answering the question. I did do some marking then but not as an answer.

I had very interesting responses from Lepaca indicating how I could change the regexp, but I had already found an alternative working syntax before I made my first post. My problem was not that I was stuck with a regexp, but that I had been looking for an hour for where my code was hanging (you don't expect it to be in Regex.Match).

What I wanted to know in my original question (which includes sample code and data (in the code)), was whether it is a bug in the Regex engine, and where I should look to see whether this is a known issue or not (since I did not find where to look for that, that was part of the question).

Kind regards,

--Sander.
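
For readers who hit a similar hang: later .NET versions (4.5 and up, so not the v2.0 runtime used in this thread) let you pass a match timeout to the Regex constructor, which turns a runaway match into a RegexMatchTimeoutException. The sketch below is an assumption-laden illustration: the pattern and input are hypothetical stand-ins for the ones missing from the original post, TimeoutDemo and TryIsMatch are made-up names, and whether the timeout actually fires depends on the runtime version and the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeoutDemo
{
    // Returns true/false for a completed match attempt, or null when the
    // engine gave up because the time limit was exceeded.
    public static bool? TryIsMatch(string pattern, string input, TimeSpan limit)
    {
        try
        {
            Regex regex = new Regex(pattern, RegexOptions.None, limit);
            return regex.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }

    public static void Main()
    {
        // Hypothetical pattern with overlapping alternatives, and an input
        // with a run of whitespace after the only %>. Not the original data.
        string pattern = @"<%!(?<declaration>(.|[\s])*)%>";
        string input = "<%! foo %>" + new string(' ', 40);

        bool? result = TryIsMatch(pattern, input, TimeSpan.FromSeconds(1));
        Console.WriteLine(result.HasValue ? result.Value.ToString() : "timed out");
    }
}
```

Returning null on a timeout keeps the caller in control: it can log the input, fall back to a simpler pattern, or report an error instead of spinning at 100% CPU.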