How can I recursively grep through sub-directories?

How to search through sub-directories whether or not your Unix has recursive (GNU) grep.


You must mean that your ancient Unix doesn't have GNU grep, right? If you do have a modern "grep", just go do a "man grep"; you don't need to read this (though you may want to just so you really appreciate GNU grep). Just add "-r" (with perhaps --include) and grep will search through subdirectories.



Say you wanted to search for "perl" in only *.html files in the current directory and every subdirectory. You could do:

grep -r --include="*.html" perl .
 

("." is current directory)

You don't even need the --include; "grep -r perl ." will search all files. If you had a directory structure like this:

./alpha
./beta
./beta/dog
./beta/dog/perlinhere.html
 

either invocation would search "perlinhere.html" looking for "perl" inside. So would:

grep -r --include="*.html" perl b*
 

But this of course would not (because perlinhere.html is not under "a*"):

grep -r --include="*.html" perl a*
 

You can also use --exclude= to search every file except the ones that match your pattern.
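For instance, to skip editor backup files while still searching everything else (the pattern here is just an example):

grep -r --exclude="*~" perl .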

(BSD grep has "-d recurse". That also works in GNU grep and is equivalent to "-r")
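In other words, on a system that has both options, these are interchangeable:

grep -d recurse perl .
grep -r perl .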

Easy enough, isn't it?

For those without a modern grep

But if you are on some old Unix without recursive capabilities in its grep, it gets very hard. The problem with all the responses that invariably pop up for this type of question is that none of them are ever truly fast and most of them aren't truly robust.

Typically, the answer is to use find, xargs, and grep. That's horribly slow for a full filesystem search, and it's painfully difficult to construct a pipeline that avoids searching binaries if you don't want to, won't get stuck on named pipes, and won't blow up on funky filenames (beginning with "-", or containing spaces, punctuation and so on). There are ways around all these things, but they are all ugly.

BTW, something that almost never gets mentioned but that I will frequently use under conditions where it is appropriate is a simple

 
 grep pattern * */* */*/* 2>/dev/null
 

It's not useful much beyond that depth, and may not even be good at that except for certain starting points, but it's faster than any find/xargs pipeline can ever be if the set of files is small enough.

The simplistic approach using find is

 find /whereveryouwanttostart -exec grep whatever {} /dev/null \;

(the extra /dev/null makes grep print the name of any file that matches)
 

That's not necessarily very efficient. Using xargs can help

 find . | xargs grep whatever
 

But it also has bugs: xargs splits its input on whitespace and quotes, so filenames containing spaces or newlines will break it (and in a bare file list, a name beginning with "-" can be taken as an option). Fixing that can be a little nasty.
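If your find and xargs happen to be the GNU versions, the null-separator options sidestep the filename problems entirely; a sketch (GNU-only flags):

 # NUL separators survive spaces and newlines in names; the leading "./"
 # that "find ." produces keeps a "-" at the start of a name from being
 # taken as an option, and the extra /dev/null makes grep print file names
 find . -type f -print0 | xargs -0 grep whatever /dev/null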

You may not want to grep binary files:

 find . -type f -print | xargs file | grep -i text | cut -f1 -d: | xargs grep whatever
 

That's pretty awful, but it's what you have to get into if you have special cases. Special cases are what makes this question more difficult. If you have a small number of files and subdirs to search, the simple approach may work fine for you. If not, you have to get more creative.
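One way to "get more creative" on an old system is to give up on xargs and test each file individually. A sketch, with the caveat that "while read" still mangles names with leading blanks or embedded backslashes:

 #!/bin/sh
 # walk regular files one at a time, skipping anything "file" doesn't call text
 find /where/to/start -type f | while read f
 do
        t=`file "$f"`
        case "$t" in
        *text*) grep whatever "$f" /dev/null ;;
        esac
 done

It's slow (one "file" and one "grep" per file), but it won't hang on named pipes and it survives odd names better than a naive xargs pipeline.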

November, 2008: Excuse the interruption, but there is something new to talk about and I didn't want you to have to go all the way to the comments to find it. It's called "ack", it's written in Perl, and it addresses the things this page talks about. Find it at http://betterthangrep.com/.
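To give you the flavor (as ack existed then): it recurses by default and skips binaries and version-control directories, so the common cases collapse to almost nothing:

 ack perl          # search recursively from the current directory
 ack --html perl   # only files ack classifies as HTML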

Bill Campbell offers this Perl script:

 I have a perlscript I call ``textfiles'' that I use for many
 things like this:
        textfiles dirname [dirname... ] | xargs ...
 
 Essentially it runs ``gfind @ARGV -type f'', then uses perl's -T
 option on each file to determine whether it's a text file.
 
 My textfiles script also has options to add options to the gnu
 find command like -xdev, -mindepth, and -maxdepth.
 
 Hell, it's short so I'm attaching it for anybody who wants to use
 it.  It does assume that the gnu version of find is in your PATH
 named gfind (I make a symlink to /usr/bin/find on Linux systems
 so that it works there as well).
 
 
 #!/usr/local/bin/perl
 eval ' exec /usr/local/bin/perl -S $0 "$@" '
        if $running_under_some_shell;
 
 # $Header: /u/usr/cvs/lbin/textfiles,v 1.7 2000/06/22 18:29:08 bill Exp $
 # $Date: 2000/06/22 18:29:08 $
 # @(#) $Id: textfiles,v 1.7 2000/06/22 18:29:08 bill Exp $
 # 
 #      find text files
 
 ( $progname = $0 ) =~ s!.*/!!; # save this very early
 
 $USAGE = "
 # Find text files
 #
 #   Usage: $progname [-v] [file [file...]]
 #
 # Options   Argument    Description
 #   -f                  Follow symlinks
 #   -M      maxdepth    maxdepth argument to gfind
 #   -m      mindepth    mindepth argument to gfind
 #   -x                  Don't cross device boundaries
 #   -v                  Verbose
 #
 ";
 
 sub usage {
        die join("\n",@_) .
        "\n$USAGE\n";
 }
 
 do "getopts.pl";
 
 &usage("Invalid Option") unless do Getopts("fM:m:xvV");
 
 $verbose = '-v' if $opt_v;
 $suffix = $$ unless $opt_v;
 
 $\ = "\n";     # use newlines as separators.
 
 # use current directory if there aren't any arguments
 push(@ARGV, '.') unless defined($ARGV[0]);
 
 $args = join(" ", @ARGV);
 $xdev = '-xdev' if $opt_x;
 $opt_f = '-follow' if $opt_f;
 $opt_m = "-mindepth $opt_m" if $opt_m;
 $opt_M = "-maxdepth $opt_M" if $opt_M;
 $cmd = "gfind @ARGV -type f $xdev $opt_f $opt_m $opt_M |";
 print STDERR "cmd = >$cmd<" if $verbose;
 
 open(INPUT, $cmd);
 while(<INPUT>) {
        chop($name = $_);
        print STDERR "testing $name..." if $verbose;
        print $name if -T $name;
 }
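Put to work in the spirit of this page, the script might be driven like so (the starting directory and search string are just examples):

 textfiles /usr/local/src | xargs grep socket /dev/null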
 
 

John Dubois also comments on Glimpse:

Glimpse indexes files by the words contained in the file. Then when you want to search all of the files, it only runs its equivalent of grep (agrep) on the files that contain the words you're looking for. You can search for partial words too, though it takes longer. I have the man pages, include files, rfcs, source trees, my home directory, web pages, etc. all separately glimpse-indexed.
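Roughly, the workflow looks like this (the directory is an example; see the glimpse man pages for details):

 glimpseindex ~/rfcs    # build the index (kept in $HOME by default)
 glimpse -i sockets     # search the indexed files, case-insensitively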

Binaries & man pages for OpenServer are at ftp://deepthought.armory.com/pub/scobins/glimpse.tar.Z

A front end that allows you to easily search any of multiple glimpse databases is at: ftp://ftp.armory.com/pub/scripts/search

See also: Find with -execdir and Grep in Depth.




Comments

Thu Feb 28 20:45:18 2008: 3730   duraimuthu


To recursively grep in Unix, create the following shell script and run it with a parameter, like './new.sh string':

#!/bin/bash
for i in $( du -a | awk '{print $2}' ); do
    grep $1 $i
done






Thu Feb 28 21:00:13 2008: 3733   TonyLawrence

Goodness: isn't "du" a bit much just to get a list of file names?

Example

time du -a

real 0m10.676s user 0m0.395s sys 0m5.512s

time find .

real 0m1.272s user 0m0.192s sys 0m0.698s

Also: that's going to run into problems with odd file names (just as find and xargs do). Really it's no different (other than being slow) than the "find" example without "xargs" given above, and of course "xargs" makes that much more efficient - but still suffers from the various problems covered above.

On Mac OS X, "mdfind" answers *most* situations, but not all.
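For example:

 mdfind -onlyin . perl

searches Spotlight's index rather than crawling the filesystem, which is why it's fast, and also why it misses anything Spotlight doesn't index.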

Recursive grep remains difficult and problematical.











Tue Jul 1 19:06:32 2008: 4385   anonymous


This is much more efficient; the wildcards in grep might not work as nicely as they do for find:

find <path> -name <filenamepattern> | xargs grep <searchstring>



Tue Jul 1 21:34:41 2008: 4386   TonyLawrence

Sigh.. did you even READ the article?



Wed Nov 26 23:41:43 2008: 4821   TonyLawrence

Whoopee: something new, and it's GREAT!

http://petdance.com/ack/

Found at http://mike.hostetlerhome.com/ ("Where Are The Wise Men?")

Thanks, Mike!



Sat Nov 29 01:01:35 2008: 4829   AlexWorden


Ummm... how about:



grep -RH --include "abc.foo" searchTerm *



-R tells grep to search recursively

-H tells grep to print out the filename it found a match in (how retarded is it that this isn't the default!)

--include lets you specify filename patterns you want to search



You'll need to supply the * pattern at the end to cause grep to include everything in the current directory and below in its search - otherwise it'll wait for standard input.







Sat Nov 29 01:20:43 2008: 4830   TonyLawrence

If you read the entire article you'd understand that no matter WHAT you do, recursive grep is difficult.



Wed Dec 10 14:49:20 2008: 4891   anonymous


I used this one for finding expressions in SQL scripts (note that -type f must come before -print0, or it never filters anything):
find . -type f -name "*.sql" -print0 | xargs -0 grep "expression"

Read the "UNUSUAL FILENAMES" section in find's manpage






Tue Mar 24 03:03:17 2009: 5831   anonymous

The article is based on the premise that some UNIX systems don't have GNU tools available (e.g. Solaris), so things like grep -R, or find -print0 and xargs -0, are also not available.

Nice article.



Tue Mar 24 12:01:50 2009: 5832   TonyLawrence

Well, heck: all they had to do was read the first paragraph :-)







Thu May 28 11:30:01 2009: 6420   anonymous

Many thanks to Alex Worden for giving the example for using grep -r (very strange syntax).



Thu Jan 14 12:57:04 2010: 7908   GilRoitto



find . | xargs ... works poorly when there are strange characters in the filenames. This did the trick for me:

find . -print0 | xargs -0 grep whatever






Tue Jan 19 22:48:54 2010: 7926   AlexOD



I miss having the GUI multi-file recursive grep of Codewright. It even helped build regexps for me.

Thanks for getting me back to finding meaningfully again.



Sun Feb 14 23:53:13 2010: 8088   MrBastion



Alex suggested:



>> grep -RH --include "abc.foo" searchTerm *



Regarding the recursive grep code Alex posted, you said:



>> If you read the entire article you'd understand that no matter

>> WHAT you do, recursive grep is difficult.



In your article you gave 3 reasons not to use grep recursive: speed, elegance, and robustness.



I can't see any problem with elegance or robustness (unless you run an older Unix system, in which case obviously it won't work, but that isn't a problem for me). As for speed, I don't know; it seems fine to me. The fact that I can very simply use '--include' to provide the file name pattern to search (e.g. "*.html" or "*.{html,php}") makes it simple and fast enough for me.



I am using the following code that is partly mine and partly from another website (mehtanirav.com). I am a little green when it comes to Unix scripting so be tolerant ;)



#!/bin/sh
# usage:
#   /src/rpl.sh folderContainingFiles/ oldText newText "*.html"
#   /src/rpl.sh ./ '=\"\/' "=\"http\:\/\/www.myweb.com\/" "*[php,html]"

### help screen
# if an argument is missing, output help info
if [ -z "$4" ]; then
    echo "purpose:"
    echo " search recursively and replace text"
    echo "usage:"
    echo " /src/rpl.sh folderContainingFiles/ oldText newText file-name-pattern"
    echo "examples:"
    echo " /src/rpl.sh ./ \"2002-2009\" \"2002-2010\" \"*.html\""
    echo " /src/rpl.sh ./ \"2000\" \"2010\" \"*[php,html]\""
    echo "to replace '=\"/' with '=\"http://www.myweb.com/' use:"
    echo " /src/rpl.sh ./ '=\\\"\/' \"=\\\"http\:\/\/www.myweb.com\/\" \"*[php,html]\" "
    echo ""
    exit 0
fi

echo "grep -rl \"$2\" $1 --include=\"$4\" "
grep -rl "$2" $1 --include="$4" |
while read filename
do
    (
        echo $filename
        # use a backup file in case everything goes wrong
        sed "s/$2/$3/g;" $filename > $filename.xx
        if [ "$?" -ne "0" ]; then
            echo "**ERROR** sed failed on $filename"
            echo "** Error here: sed \"s/$2/$3/g;\" $filename > $filename.xx "
        else
            # my samba had odd issues copying over the old file (why?!) so I just delete the old file:
            rm $filename
            # copying instead of renaming should retain the old file permissions
            # (well, that is what one forum post said; I didn't test it)
            cp $filename.xx $filename
            rm $filename.xx
        fi
    )
done













Mon Feb 15 12:31:12 2010: 8091   TonyLawrence



The very first paragraph says:

You must mean that your ancient Unix doesn't have GNU grep, right? If you do, just go do a "man grep"; you don't need to read this (though you may want to just so you really appreciate GNU grep). Just add "-r" and grep will search through subdirectories.


The comments about "speed, elegance, and robustness" have to do with OLD SYSTEMS THAT DON'T HAVE GNU GREP.

Sheesh :-)



Wed Oct 12 16:42:22 2011: 10006   MikeW



I think the issue is that the syntax is non-obvious to the new user, who might expect

grep -r "perl" *.html

(search recursively for string "perl" in all '.html' files) to work.
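Spelled out, the trap and the fix (as the article explains):

grep -r perl *.html                 # the shell expands *.html in the current directory only
grep -r --include="*.html" perl .   # matches *.html files at every depth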



Wed Sep 5 07:21:13 2012: 11267   P



The --include flag was a great help. Thanks!
I went through many sites trying to find a way to search a string recursively in files of a particular type. Everyone talked about the find command; nobody could give a grep command example.



Thu Jul 18 07:46:01 2013: 12235   IrfanRashid



Thanks! It solved my problem.

This post tagged:

- FAQ
- Popular
- Shell