ID: 0005537
Project: CMake
Category:
View Status: public
Date Submitted: 2007-08-19 01:58
Last Update: 2008-10-01 17:05
Reporter: Brandon Van Every
Assigned To: Bill Hoffman
Priority: normal
Severity: crash
Reproducibility: always
Status: closed
Resolution: won't fix
Platform:
OS:
OS Version:
Product Version:
Target Version:
Fixed in Version:
Summary: 0005537: REGEX MATCH and MATCHALL can be pathologically slow
Description: STRING(REGEX MATCH ...) and STRING(REGEX MATCHALL ...) are pathologically slow when given regexes of the form "([a-z]+ *)+\r?\n", even for tiny inputs such as 30 characters. This pattern is used to detect a line containing a whitespace-separated list of words, which is extremely important when parsing files. With larger files, the regex can be so slow that CMake appears to hang indefinitely. Even a 3GHz PC with 1GB RAM can be brought to its knees.

More generally, patterns of the form "([^a]+a*)+a" exhibit the problem. A workaround is to express the pattern as "[^a]+(a+[^a]+)*a*". A .zip file containing a reproducer script and a sample input file is attached.
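
To illustrate the workaround on the word-list pattern from the description, here is a minimal sketch in Python. It is not CMake code: Python's `re` module is a different engine, used here only because it is also a backtracking matcher. The flattened pattern `FAST` is my own assumption, written in the same spirit as the workaround above (one level of repetition inside the group instead of two), and the helper `same_language` is hypothetical. The sketch only checks that both patterns accept the same short strings; it makes no timing claims.

```python
import itertools
import re

# Nested-quantifier pattern quoted in the report.
SLOW = re.compile(r"([a-z]+ *)+\r?\n")

# Flattened rewrite (an assumption modelled on the reporter's workaround):
# a word, then zero or more (spaces + word) groups, then optional trailing spaces.
FAST = re.compile(r"[a-z]+( +[a-z]+)* *\r?\n")


def same_language(max_len=7, alphabet="a \n"):
    """Return True if both patterns fully match exactly the same strings
    among all strings over `alphabet` up to `max_len` characters."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            text = "".join(chars)
            if bool(SLOW.fullmatch(text)) != bool(FAST.fullmatch(text)):
                return False
    return True


if __name__ == "__main__":
    print(same_language())
    print(bool(FAST.fullmatch("alpha beta  gamma \n")))
```

The inputs are kept short on purpose, so the nested pattern stays cheap to evaluate during the comparison. Whether the same rewrite removes the slowdown in CMake's own regex engine is what the reporter's workaround suggests, but it is not verified here.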
Tags: No tags attached.
Attached Files: slow.zip (1,237 bytes) 2007-08-19 01:58

 Relationships

  Notes
(0010034)
Brandon Van Every (reporter)
2007-12-31 04:46

Patterns of the form "a([^x]+)+a" where 'x' is any character other than 'a' also exhibit the problem. The problem appears to be due to the 2 levels of +. It also happens with ( *)+ and ( +)* and ( *)*. It doesn't happen when there's only 1 level of + or *.
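
A rough way to see why two levels of + hurt, assuming the engine is a plain backtracking matcher (this is an assumption about the cause, not something confirmed in this report): for a pattern such as (x+)+, a run of n matching characters can be divided between the inner and outer + in many different ways, and when the overall match fails the matcher may end up trying all of them. The hypothetical helper below just counts those divisions; it does not run any regex.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n):
    """Count the ways to cut a run of n characters into consecutive
    non-empty groups, i.e. the ways (x+)+ could consume the run."""
    if n == 0:
        return 1  # base case: nothing left to cut
    # The first group takes k characters; the remainder is cut recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        print(n, count_splits(n))
```

The count doubles with every extra character (it equals 2^(n-1) for n >= 1), which is consistent with the observation that even inputs of around 30 characters can be very slow, while a single level of + or * has only one way to consume the run.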

 Issue History
Date Modified | Username | Field | Change
2007-08-19 01:58 | Brandon Van Every | New Issue |
2007-08-19 01:58 | Brandon Van Every | File Added: slow.zip |
2007-12-17 17:56 | Bill Hoffman | Status | new => assigned
2007-12-17 17:56 | Bill Hoffman | Assigned To | => Bill Hoffman
2007-12-31 04:46 | Brandon Van Every | Note Added: 0010034 |
2008-10-01 17:05 | Bill Hoffman | Status | assigned => closed
2008-10-01 17:05 | Bill Hoffman | Resolution | open => won't fix

