Features of Grep

In my last blog posting I wrote about features of the less pager. It's one of those very common Linux utilities that everyone uses but no one reads the man page for. Just like less, everyone uses grep, but there are a lot of features you’re missing out on if you haven’t read the man page.

grep is a very, very old command. Its name comes from a command that was run in the ed editor: if you wanted to print the lines of a file that matched a regular expression (re), you’d run ‘g/re/p’ in ed, which stood for globally search for a regular expression and print the matching lines. vi and sed users might recognize the syntax and other basic ed commands, since both adopted a lot from ed.

You should already know the basics of grep (if not, check the video to the right). Here are some simple features you probably already know:

  • -r recursively search all the directories
  • -l print only filenames that match, don’t print the matching lines
  • -h print only the matching lines, don’t print the filenames
  • -i ignore case in your pattern so ‘foo’ will match FOO, Foo, and foo
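To see those four flags side by side, here is a minimal sketch. It assumes GNU grep and uses a scratch directory with made-up files (a.txt and sub/b.txt are hypothetical names, not anything from a real project):

```shell
# Scratch directory with hypothetical sample files
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
printf 'foo one\nbar two\n' > "$tmp/a.txt"
printf 'FOO three\nbaz four\n' > "$tmp/sub/b.txt"

# -r: search every file under the directory; output is "filename:line"
recursive=$(grep -r 'foo' "$tmp")

# -l: print only the names of files that contain a match
names=$(grep -rl 'foo' "$tmp")

# -h: print only the matching lines, with no filename prefix
lines=$(grep -rh 'foo' "$tmp")

# -i: ignore case, so the FOO in sub/b.txt matches too
# (LC_ALL=C sort just makes the order of the two lines predictable)
nocase=$(grep -rih 'foo' "$tmp" | LC_ALL=C sort)

printf '%s\n' "$recursive" "$names" "$lines" "$nocase"
rm -rf "$tmp"
```

The flags combine freely, which is why -rih and -rl work as single arguments.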

It is often useful to get some context around the line you’re searching for:

  • -A x print x lines after the match
  • -B x print x lines before the match
  • -C x print x lines before and after the match
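A quick sketch of the three context flags, assuming a hypothetical five-line file so it is easy to see which neighbours get printed:

```shell
# Hypothetical sample file: one word per line
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp/lines.txt"

# -A 1: the match plus one line after it
after=$(grep -A 1 'three' "$tmp/lines.txt")

# -B 1: one line before the match, then the match
before=$(grep -B 1 'three' "$tmp/lines.txt")

# -C 1: one line on each side of the match
around=$(grep -C 1 'three' "$tmp/lines.txt")

printf '%s\n---\n%s\n---\n%s\n' "$after" "$before" "$around"
rm -rf "$tmp"
```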

I often need to do something to a list of files that match some pattern. You could use grep -l and pipe that to xargs, but you’ll run into problems if there are weird characters (spaces, for example) in the filenames. Luckily grep has a -Z flag that uses the null character to separate filenames, which pairs with xargs -0 like this: grep -lZ foo ./* | xargs -0 ls -l
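Here is a small sketch of why the null separator matters. It assumes GNU grep (where -Z means --null) and a hypothetical directory containing one filename with spaces in it; basename stands in for whatever command you actually want to run on each file:

```shell
# Hypothetical files, one with spaces in its name
tmp=$(mktemp -d)
printf 'foo\n' > "$tmp/plain.txt"
printf 'foo\n' > "$tmp/name with spaces.txt"
printf 'bar\n' > "$tmp/other.txt"

# Without -Z and -0, xargs would split "name with spaces.txt" into
# three separate arguments. With them, each filename arrives intact.
matched=$(grep -lZ 'foo' "$tmp"/* | xargs -0 -n 1 basename)

printf '%s\n' "$matched"
rm -rf "$tmp"
```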

By default grep uses Basic Regular Expressions (BRE), which work great for metacharacters like ., *, ^, and $. But what about when you want to use metacharacters like ?, +, and |? You can use those with BRE, but you have to prefix them with a \, like \?. This is awkward to type and hard to read. Luckily grep can do Extended Regular Expressions (ERE), where you can use those metacharacters without having to escape them. You can use ERE by giving grep the -E flag or by just running egrep instead. I prefer using egrep. If you prefer Perl Compatible Regular Expressions (PCRE) you can use the -P flag.
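The difference is easiest to see with the same search written both ways. This is a sketch assuming GNU grep (the backslash-escaped \?, \+, and \| forms in BRE are GNU extensions) and two hypothetical word lists:

```shell
# Hypothetical sample files
tmp=$(mktemp -d)
printf 'color\ncolour\ncat\ndog\nbird\n' > "$tmp/words.txt"
printf 'a+b\naab\n' > "$tmp/plus.txt"

# BRE: ? and | only act as metacharacters when escaped
bre=$(grep 'colou\?r\|dog' "$tmp/words.txt")

# ERE: the same pattern with no backslashes
ere=$(grep -E 'colou?r|dog' "$tmp/words.txt")

# An unescaped + is a literal plus sign in BRE ...
bre_plus=$(grep 'a+b' "$tmp/plus.txt")

# ... but means "one or more" in ERE
ere_plus=$(grep -E 'a+b' "$tmp/plus.txt")

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$bre" "$ere" "$bre_plus" "$ere_plus"
rm -rf "$tmp"
```

The last two searches are the ones to watch out for: the same pattern text matches different lines depending on which flavour is in effect.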

What happens if the string you're searching for contains one of those metacharacters? If you're searching for the string '/* some comment */' it is annoying having to escape the metacharacters, but grep has a nice feature that will search for your exact string and ignore what would be metacharacters in regular expressions. If you invoke grep with -F you won't have to worry about escaping any characters; it will search for exactly what you put between your single quotes. You can also use fgrep to do the same thing.
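A short sketch of the comment search, assuming a hypothetical code.c file. The unescaped regular expression version finds nothing here, because the * characters are read as repetition operators rather than literal asterisks:

```shell
# Hypothetical source file containing the comment we want to find
tmp=$(mktemp -d)
printf '/* some comment */\nint x = 1;\nplain text\n' > "$tmp/code.c"

# -F: treat the pattern as a fixed string, metacharacters and all
fixed=$(grep -F '/* some comment */' "$tmp/code.c")

# Same text as a regular expression: no line matches, so grep prints
# nothing and exits non-zero (hence the "|| true")
as_regex=$(grep '/* some comment */' "$tmp/code.c" || true)

printf 'fixed:    %s\nas regex: %s\n' "$fixed" "$as_regex"
rm -rf "$tmp"
```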

Want to know how many lines in a file match a string or regular expression? You can do grep 'foo' ./bar.txt | wc -l, but you can also do the entire thing in grep with grep -c 'foo' ./bar.txt.
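One thing worth keeping in mind is that -c counts matching lines, not individual matches. The sketch below assumes a hypothetical bar.txt where one line contains foo twice. The -o flag (print each match on its own line) is not part of POSIX, but GNU grep supports it:

```shell
# Hypothetical file: "foo" appears three times across two lines
tmp=$(mktemp -d)
printf 'foo\nbar\nfoo foo\nbaz\n' > "$tmp/bar.txt"

# -c: number of lines that contain at least one match
matching_lines=$(grep -c 'foo' "$tmp/bar.txt")

# -o prints every match on its own line, so counting those lines
# gives the number of individual occurrences instead
# ($(( ... )) strips any padding some wc implementations add)
occurrences=$(( $(grep -o 'foo' "$tmp/bar.txt" | wc -l) ))

printf 'lines: %s, occurrences: %s\n' "$matching_lines" "$occurrences"
rm -rf "$tmp"
```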

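I always like having the pattern I search for highlighted so I can spot it more easily. Adding --color to your grep command works, but it is even easier to add "export GREP_OPTIONS='--color=auto'" to your bashrc so it happens automatically. Newer versions of GNU grep have deprecated and then dropped GREP_OPTIONS, though, so there an alias such as alias grep='grep --color=auto' does the same job.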
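If you are wondering why auto is the usual choice rather than always, here is a rough sketch assuming GNU grep. With auto, grep only emits color escape sequences when its output goes to a terminal, so pipes and command substitutions stay clean:

```shell
esc=$(printf '\033')   # the escape character that starts a color code

# auto: output is captured here, not a terminal, so no color codes
plain=$(printf 'foo bar\n' | grep --color=auto 'foo')

# always: color codes are emitted even when the output is captured
forced=$(printf 'foo bar\n' | grep --color=always 'foo')

case "$forced" in
  *"$esc"*) has_color=yes ;;
  *)        has_color=no ;;
esac

printf 'auto gives: %s\nalways adds escape codes: %s\n' "$plain" "$has_color"
```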
