
Answer by Wiktor Stribiżew for why is my regex greedy

Remember that a regex engine parses strings from left to right.

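You have blocks of substrings that are delimited with edit and next. Since the first edit block can be matched first, the engine starts there, and then [\s\S]*? expands up to the first occurrence of service "ALL", which is in the second block. A lazy quantifier only keeps the match as short as possible from a given start position; it does not move the start position forward.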
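The behaviour described above can be reproduced with a small sketch. The sample text and the lazy-only pattern are assumptions made for illustration (the original question's input and regex are not shown here); Python's re module is used only because [\s\S]*? behaves the same way in it.

```python
import re

# Hypothetical sample input: two edit/next blocks, only the second
# one contains service "ALL".
TEXT = (
    'edit 1\n'
    '    set service "HTTP"\n'
    'next\n'
    'edit 2\n'
    '    set service "ALL"\n'
    'next\n'
)

# Assumed shape of the problematic pattern: lazy quantifiers only.
NAIVE = re.compile(r'edit[\s\S]*?service "ALL"[\s\S]*?next')

naive_match = NAIVE.search(TEXT)

# The match begins at the first "edit" and runs across the block
# boundary until service "ALL" is found in the second block.
print(repr(naive_match.group(0)))
```

The printed match starts at edit 1 and includes the unrelated "HTTP" block, which is why the pattern looks greedy even though every quantifier is lazy.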

You might fix the regex using a tempered greedy token:

edit(?:(?!edit)[\s\S])*?service ("ALL")[\s\S]*?next
    ^^^^^^^^^^^^^^^^^^^^

See this regex demo.

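The (?:(?!edit)[\s\S])*? construct matches any char ([\s\S]), 0+ repetitions, as few as possible (*?), as long as that char does not start an edit char sequence. The negative lookahead (?!edit) is checked before each char is consumed, so the match cannot run across the start of another edit.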
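A minimal sketch of the tempered greedy token in action, assuming a hypothetical two-block sample (the real input from the question is not shown here). The pattern is the one given above; Python's re module is used for illustration.

```python
import re

# Hypothetical sample input: only the second block has service "ALL".
TEXT = (
    'edit 1\n'
    '    set service "HTTP"\n'
    'next\n'
    'edit 2\n'
    '    set service "ALL"\n'
    'next\n'
)

# The tempered greedy token (?:(?!edit)[\s\S])*? refuses to consume a
# char that begins another "edit", so a match cannot span two blocks.
TEMPERED = re.compile(r'edit(?:(?!edit)[\s\S])*?service ("ALL")[\s\S]*?next')

tempered_match = TEMPERED.search(TEXT)

print(repr(tempered_match.group(0)))  # the whole matched block
print(tempered_match.group(1))        # the text captured in Group 1
```

The attempt that starts at edit 1 fails when the token reaches edit 2, so the engine retries from later positions and the match begins at edit 2.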

However, if edit or next happens to appear as plain text inside a block, you will get incorrect matches. A safer regex will look like this:

(?m)^\h*edit \d+(?:(?!^\h*edit)[\s\S])*?service ("ALL")[\s\S]*?\R\h*next$

See the regex demo

Details

  • (?m)^ - start of a line
  • \h* - 0+ horizontal whitespaces
  • edit \d+ - edit, space and 1+ digits
  • (?:(?!^\h*edit)[\s\S])*? - any text, as little as possible, that does not run into an edit at the start of a line (optionally preceded with 0+ horizontal whitespaces), up to the first...
  • service ("ALL") - service "ALL" substring ("ALL" is captured into Group 1)
  • [\s\S]*? - any 0+ chars, as few as possible
  • \R - a line break
  • \h* - 0+ horizontal whitespaces
  • next - a literal substring
  • $ - end of a line.
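
The breakdown above can be tried out with a rough Python sketch. Note the assumptions: Python's re module does not support \h or \R, so this adaptation substitutes [ \t] for \h and \r?\n for \R (both are narrower approximations of the original tokens), and it passes re.MULTILINE instead of the inline (?m). The sample text is hypothetical and uses "\n" line endings.

```python
import re

# Hypothetical sample input: the second block contains the word "edit"
# inside a comment, which trips up the simpler pattern.
TEXT = (
    'edit 1\n'
    '    set service "HTTP"\n'
    'next\n'
    'edit 2\n'
    '    set comments "edit me later"\n'
    '    set service "ALL"\n'
    'next\n'
)

# Simpler tempered pattern from earlier in the answer.
SIMPLE = re.compile(r'edit(?:(?!edit)[\s\S])*?service ("ALL")[\s\S]*?next')

# Python adaptation of the safer pattern (assumption: [ \t] stands in
# for \h and \r?\n stands in for \R).
SAFER = re.compile(
    r'^[ \t]*edit \d+'               # edit + digits at the start of a line
    r'(?:(?!^[ \t]*edit)[\s\S])*?'   # text that hits no line-initial edit
    r'service ("ALL")'               # the required substring, Group 1
    r'[\s\S]*?\r?\n[ \t]*next$',     # up to a line holding only "next"
    re.MULTILINE,
)

simple_match = SIMPLE.search(TEXT)
safer_match = SAFER.search(TEXT)

print(repr(simple_match.group(0)))  # starts in the middle of the comment
print(repr(safer_match.group(0)))   # starts at the "edit 2" line
```

With this sample, the simpler pattern begins its match at the edit inside the comment, while the line-anchored version returns the whole edit 2 block.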
