APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

grep in depth

I love reading manual pages, especially for commands I use every day. I'm always discovering things I've forgotten and sometimes find new features that arrived while I wasn't paying attention. For some reason, I hadn't looked at 'grep' in quite a while. That's probably because my ordinary use is trivial, and I tend to use Perl for more complex needs. However, modern GNU grep has many options that may be unexpected for old Unix hands.

Two of the nicest of the new grep options are the ones that give you context. For example:



# old style grep
$ grep wtoday foo
drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
# context options
$ grep -A3 -B3 wtoday foo
drwxr-xr-x   162 apl   staff     5508 May  1  2003 wlog
drwxr-xr-x     2 apl   staff       68 Apr 24  2003 wlogs
drwx------   273 apl   staff     9282 Sep 23 08:05 world
drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
-rw-rw-r--     1 apl   staff       16 May 27  2003 www.labels
-rwxr-xr-x     1 apl   staff      114 Nov 30 05:00 x
-rwxr-xr-x     1 apl   staff      114 Mar 11  2005 x.back
 

You don't have to specify A and B separately; use -C instead. I use that so much that I have a function in my shell for it:

around() {
  # "$@" keeps patterns and file names containing spaces intact
  grep -C3 "$@"
}
 

Old Unix folk are used to mixing standard input with actual files for grep:

$ zcat foo.gz | grep  wtoday - foo
(standard input):drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
foo:drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
 

But now you can change that "(standard input)" label:

$ zcat foo.gz | grep --label stdin  wtoday - foo
stdin:drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
foo:drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
 

And if you don't want any file names, use -h:

$ zcat foo.gz | grep  -h  wtoday - foo
drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
 

Have you ever been annoyed by this?

$ grep wtoday foo*
foo:drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
Binary file foo.tar matches
foo.txt:wtoday
 

You can suppress that with -I:

$ grep -I wtoday foo*
foo:drwxr-xr-x     4 apl   staff      136 Jan 14 06:50 wtoday
foo.txt:wtoday
 

There are similar flags to determine how to handle device files and directories; see "-d" and "-D" options in the man page.
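Here is a small sketch of those two flags, assuming GNU grep. The scratch directory and the file names in it are made up for the demonstration:

```shell
# Hypothetical scratch directory; these names are invented for the demo.
tmp=$(mktemp -d)
mkdir "$tmp/subdir"
printf 'wtoday\n' > "$tmp/notes.txt"
printf 'wtoday\n' > "$tmp/subdir/deep.txt"

# -d skip: quietly ignore directories named on the command line.
skipped=$(cd "$tmp" && grep -d skip wtoday *)
echo "$skipped"

# -d recurse: descend into them instead (the same thing -r does).
recursed=$(cd "$tmp" && grep -d recurse wtoday * | sort)
echo "$recursed"

# -D skip: ignore devices, FIFOs and sockets (/dev/null stands in for one here).
devskip=$(cd "$tmp" && grep -D skip wtoday /dev/null notes.txt)
echo "$devskip"

rm -rf "$tmp"
```

The "-D skip" form is mostly useful when a wildcard might pick up a FIFO or device that grep would otherwise sit and wait on.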

The "-l" and "-L" flags are interesting. Old Unix hands are of course accustomed to "-l"; we use it all the time for things like this:

vi `grep -l href *.html`
 

The capital "-L" instead prints the names of files that DO NOT match, which is a surprisingly common need once you realize that you can get this so easily now.
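To see the two flags side by side, here is a minimal sketch, assuming GNU grep. The two HTML files are invented for the example:

```shell
# Hypothetical scratch directory with one file that has a link and one that doesn't.
tmp=$(mktemp -d)
printf '<a href="x">x</a>\n' > "$tmp/with.html"
printf '<p>no links here</p>\n' > "$tmp/without.html"

# -l lists the files that contain a match ...
has_href=$(cd "$tmp" && grep -l href *.html)
# ... and -L lists the files that contain none.
no_href=$(cd "$tmp" && grep -L href *.html)

echo "match:    $has_href"
echo "no match: $no_href"

rm -rf "$tmp"
```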

Another handy feature is "-m", which stops reading each file after a given number of matching lines. This gives some idea of the content without cluttering up the screen with extraneous output:

$ grep  -m1 w foo*
foo:-rw-r--r--     1 apl   wheel  1480148 Jun 24  2004 103055x.jpg
foo.ad:unknown:*:99:99:Unknown User:/dev/null:/dev/null
foo.af:www:*:70:70:World Wide Web Server:/Library/WebServer:/dev/null
foo.ai:apl:*:501:20:Anthony Lawrence:/Users/apl:/bin/bash
foobar:new=`echo $@ | tac`
 

While we're thinking about suppressing extraneous output, the "-o" flag outputs only the part of each line that matched, not the whole line:

$ grep -h -o 'http:[^"]*'  *html
http://lwf.ncdc.noaa.gov/servlets/DLYP (1.02a)
http://pagead2.googlesyndication.com/pagead/show_ads.js
http://www.amazon.com/exec/obidos/ASIN/0131492470/aplawrencescouni
http://www.w3.org/TR/html4/loose.dtd
http://netsourced.com/
http://mailhost.ssmh.org/cphone.html
http://submitexpress.com/metatag.html -->
http://www.php.net/
http://www.hping.org/visitors
 

This type of thing would have required a "sed" before. Using sed isn't so bad, but before we had "-r" for a recursive grep, searching a whole directory tree meant pairing grep with find.
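As a rough sketch of the difference "-r" makes: the find command below is just one common way of doing it the old way, not necessarily what anyone actually typed, and the directory tree is invented for the example. The "-r" line assumes GNU grep.

```shell
# Hypothetical directory tree for the demo.
tmp=$(mktemp -d)
mkdir -p "$tmp/site/docs"
printf 'href here\n'  > "$tmp/site/index.html"
printf 'href there\n' > "$tmp/site/docs/page.html"

# Without -r: let find walk the tree and hand the files to grep.
old_way=$(cd "$tmp" && find site -type f -exec grep -l href {} + | sort)

# With -r: grep walks the tree itself.
new_way=$(cd "$tmp" && grep -rl href site | sort)

echo "$old_way"
echo "$new_way"

rm -rf "$tmp"
```

Both commands list the same two files; the second is simply a lot less typing.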

GNU grep has -Z for the same reasons find has -print0; not all of us are careful with our file names.
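A minimal sketch of why that matters, assuming GNU grep and an xargs that understands "-0". The file names are invented, and one of them deliberately contains a space:

```shell
# Hypothetical scratch directory; "my page.html" has a space in its name on purpose.
tmp=$(mktemp -d)
printf 'href\n' > "$tmp/my page.html"
printf 'href\n' > "$tmp/plain.html"

# Plain -l: xargs splits on whitespace, so "my page.html" arrives as two arguments.
broken=$(cd "$tmp" && grep -l href *.html | xargs -n 1 printf '[%s]\n')

# -lZ ends each name with a NUL byte, and xargs -0 splits only on NUL.
safe=$(cd "$tmp" && grep -lZ href *.html | xargs -0 -n 1 printf '[%s]\n')

echo "without -Z:"; echo "$broken"
echo "with -Z:";    echo "$safe"

rm -rf "$tmp"
```

The brackets are only there to show where xargs thinks each file name begins and ends.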

This short article doesn't begin to cover all of grep, but it might cause you to go take a fresh look at the man page, or to upgrade your ancient grep if it lacks these features.





6 comments




© Anthony Lawrence







Thu Oct 30 01:49:00 2014: 12542   prat



What is foo doing here? Why are you using it?
For example:
zcat foo.gz | grep wtoday - foo
I can understand that you are grepping for the word (wtoday) in the compressed file (foo.gz), but then what is the -foo at the end doing? I see it everywhere in the `grep in depth` article.

thanks,







Thu Oct 30 07:36:20 2014: 12543   TonyLawrence



It's not "grep -foo"

It's "grep - foo"

The "grep -" is "grep from stdin" (which would be from the pipe in the example quoted).







Wed Nov 5 16:42:56 2014: 12546   prat



Still not clear, dear. What is foo? What is grep - foo?



Wed Nov 5 16:50:37 2014: 12547   TonyLawrence



Telling grep to find "foo" in its stdin



Sat Dec 6 17:40:18 2014: 12567   anonymous



Sorry, but it's still not clear to me.
You said `Telling grep to find "foo" in its stdin`, but as far as I know you are grepping for 'wtoday' in the foo.gz file, right? So again, my old question is: what is foo doing there? Can you explain it with another example? Sorry for the trouble.



Sat Dec 6 18:59:56 2014: 12569   TonyLawrence



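We used zcat on the .gz file, sending its output down the pipe to grep's stdin. The "-" tells grep to read that stdin, and "foo" is a second, ordinary file, so grep looks for "wtoday" in both places.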
