9 月 262022
 

Source: How to find patterns across multiple lines using grep?

Grep is an awkward tool for this operation.

pcregrep which is found in most of the modern Linux systems can be used as

pcregrep -M  'abc.*(\n|.)*efg' test.txt

where -M--multiline allow patterns to match more than one line

There is a newer pcre2grep also. Both are provided by the PCRE project.


Without the need to install the grep variant pcregrep, you can do a multiline search with grep.

$ grep -Pzo "(?s)^(\s*)\N*main.*?{.*?^\1}" *.c

Explanation:

-P activate perl-regexp for grep (a powerful extension of regular expressions)

-z Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. That is, grep knows where the ends of the lines are, but sees the input as one big line. Beware this also adds a trailing NUL char if used with -o, see comments.

-o print only matching. Because we’re using -z, the whole file is like a single big line, so if there is a match, the entire file would be printed; this way it won’t do that.

In regexp:

(?s) activate PCRE_DOTALL, which means that . finds any character or newline

\N find anything except newline, even with PCRE_DOTALL activated

.*? find . in non-greedy mode, that is, stops as soon as possible.

^ find start of line

\1 backreference to the first group (\s*). This is a try to find the same indentation of method.

As you can imagine, this search prints the main method in a C (*.c) source file.

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

(required)

(required)

CAPTCHA Image
Play CAPTCHA Audio
Reload Image