Sun Microsystems, Inc.
 
Chapter 4

Searching Files

This chapter describes how to search directories and files for keywords and strings by using the grep command.

Searching for Patterns With grep

To search for a particular character string in a file, use the grep command. The basic syntax of the grep command is:

$ grep string file

In this example, string is the word or phrase you want to find, and file is the file to be searched.


Note - A string is one or more characters. A single letter is a string, as is a word or a sentence. Strings can include blank spaces, punctuation, and invisible (control) characters.


For example, to find Edgar Allan Poe's telephone extension, type grep, all or part of his name, and the file containing the information:

$ grep Poe extensions
Edgar Allan Poe     x72836
$

Note that more than one line might match the pattern you give.

$ grep Allan extensions
David Allan         x76438
Edgar Allan Poe     x72836
$ grep Al extensions
Louisa May Alcott   x74236
David Allan         x76438
Edgar Allan Poe     x72836
$
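The searches above can be tried against a small stand-in for the extensions file. This is only a sketch: the file contents are assumed from the output shown in this chapter, the temporary directory exists just to keep the example self-contained, and -c (print a count of matching lines) is a standard grep option that this chapter does not otherwise cover.

```shell
#!/bin/sh
# Throwaway copy of the extensions file (contents assumed from the examples above).
workdir=$(mktemp -d)
cat > "$workdir/extensions" <<'EOF'
Louisa May Alcott   x74236
David Allan         x76438
Edgar Allan Poe     x72836
EOF

# Only one line contains "Poe".
poe=$(grep Poe "$workdir/extensions")
echo "$poe"

# Two lines contain "Allan"; -c prints the number of matching lines.
allan_count=$(grep -c Allan "$workdir/extensions")
echo "$allan_count"

rm -r "$workdir"
```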

grep is case sensitive; that is, you must match the pattern with respect to uppercase and lowercase letters:

$ grep allan extensions
$ grep Allan extensions
David Allan         x76438
Edgar Allan Poe     x72836
$

Note that the first search printed nothing because no line in the file contains allan with a lowercase a.
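When the capitalization of the text is not known in advance, the standard -i option, which this chapter does not otherwise cover, tells grep to ignore case. A minimal sketch, with file contents assumed from the example above:

```shell
#!/bin/sh
# Sample data assumed from the chapter's extensions file.
workdir=$(mktemp -d)
printf '%s\n' \
    'David Allan         x76438' \
    'Edgar Allan Poe     x72836' > "$workdir/extensions"

# Case-sensitive search: no line contains lowercase "allan", so grep prints
# nothing and returns exit status 1.
sensitive_status=0
grep allan "$workdir/extensions" || sensitive_status=$?

# -i ignores case, so both lines match.
insensitive=$(grep -i allan "$workdir/extensions")
echo "$insensitive"

rm -r "$workdir"
```

The exit status is useful in scripts: grep returns 0 when at least one line matches and 1 when none does.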

grep as a Filter

You can use the grep command as a filter with other commands, enabling you to filter out unnecessary information from the command output. To use grep as a filter, you must pipe the output of the command through grep. The symbol for pipe is "|".

The following example displays files that end in ".ps" and were last modified in the month of September.

$ ls -l *.ps | grep Sep

The first part of this command line produces a list of files ending in .ps.

$ ls -l *.ps
-rw-r--r--   1 user2    users     833233 Jun 29 16:22 buttons.ps
-rw-r--r--   1 user2    users      39245 Sep 27 09:38 changes.ps
-rw-r--r--   1 user2    users     608368 Mar  2  2000 clock.ps
-rw-r--r--   1 user2    users     827114 Sep 13 16:49 commands.ps
$

The second part of the command line pipes that list through grep, looking for the pattern Sep.

| grep Sep

The search provides the following results.

$ ls -l *.ps | grep Sep
-rw-r--r--   1 user2    users      39245 Sep 27 09:38 changes.ps
-rw-r--r--   1 user2    users     827114 Sep 13 16:49 commands.ps
$
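Because grep matches the pattern anywhere on the line, a filter such as grep Sep also keeps any line whose file name or owner happens to contain Sep. The sketch below shows one possible way to narrow the match by including the surrounding spaces in the pattern. It uses a canned listing in place of real ls output; the lines are assumed from the example above, and Sept-notes.ps is a hypothetical file added for illustration.

```shell
#!/bin/sh
# Canned listing standing in for `ls -l *.ps` (Sept-notes.ps is hypothetical).
listing() {
    printf '%s\n' \
        '-rw-r--r--   1 user2    users     833233 Jun 29 16:22 buttons.ps' \
        '-rw-r--r--   1 user2    users      39245 Sep 27 09:38 changes.ps' \
        '-rw-r--r--   1 user2    users       1024 Jun 30 11:05 Sept-notes.ps' \
        '-rw-r--r--   1 user2    users     827114 Sep 13 16:49 commands.ps'
}

# Plain filter: also keeps Sept-notes.ps, because "Sep" appears in its name.
loose=$(listing | grep Sep)

# Quoted pattern with spaces on both sides: matches only the month column here.
strict=$(listing | grep ' Sep ')
echo "$strict"
```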

grep With Multiword Strings

To find a pattern that is more than one word long, enclose the string with single or double quotation marks.

$ grep "Louisa May" extensions
Louisa May Alcott   x74236
$

The grep command can search for a string in groups of files. When you name more than one file to search, grep prints the name of the file, followed by a colon, before each line that matches the pattern.

$ grep ar *
actors:Humphrey Bogart
alaska:Alaska is the largest state in the United States.
wilde:book.  Books are well written or badly written.
$
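The difference between searching one file and searching several can be seen in a short sketch. The two files are recreated in a temporary directory, with contents assumed from the output above.

```shell
#!/bin/sh
# Recreate two of the sample files (contents assumed from the example output).
workdir=$(mktemp -d)
echo 'Humphrey Bogart' > "$workdir/actors"
echo 'Alaska is the largest state in the United States.' > "$workdir/alaska"

# One file named: the matching line is printed as-is.
single=$(cd "$workdir" && grep ar actors)
echo "$single"

# Two files named: each matching line is prefixed with "filename:".
multi=$(cd "$workdir" && grep ar actors alaska)
echo "$multi"

rm -r "$workdir"
```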

Searching for Lines Without a Certain String

To search for all the lines of a file that do not contain a certain string, use the -v option to grep. The following example shows how to search through all the files in the current directory for lines that do not contain the letter e.

$ ls
actors    alaska    hinterland    tutors    wilde
$ grep -v e *
actors:Mon Mar 14 10:00 PST 1936
wilde:That is all.
$
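A search with -v and the same search without it split a file between them: every line appears in exactly one of the two results. A small sketch, using a two-line sample file whose contents are assumed for illustration:

```shell
#!/bin/sh
# Two-line sample file (contents are an assumption, not the chapter's actual file).
workdir=$(mktemp -d)
printf '%s\n' \
    'The artist is the creator of beautiful things.' \
    'That is all.' > "$workdir/wilde"

# Lines that contain the letter e.
with_e=$(grep e "$workdir/wilde")

# Lines that do not contain the letter e.
without_e=$(grep -v e "$workdir/wilde")

echo "$with_e"
echo "$without_e"

rm -r "$workdir"
```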

Using Regular Expressions With grep

You can also use the grep command to search for targets that are defined as patterns by using regular expressions. Regular expressions consist of letters and numbers, in addition to characters with special meaning to grep. Many of these special characters, called metacharacters, also have special meaning to the shell. When you use a grep regular expression at the command prompt, surround the regular expression with single quotation marks so that the shell passes it to grep unchanged. To search for a character that grep would otherwise treat as a metacharacter (such as . * [ ^ $ or \), precede it with a backslash (\). See "Searching for Metacharacters" for more information on escaping metacharacters.

  • A caret (^) metacharacter indicates the beginning of the line. The following command finds any line in the file list that starts with the letter b.

    $ grep '^b' list
  • A dollar-sign ($) metacharacter indicates the end of the line. The following command displays any line in which b is the last character on the line.

    $ grep 'b$' list

    The following command displays any line in the file list where b is the only character on the line.

    $ grep '^b$' list
  • Within a regular expression, dot (.) matches any single character. The following command matches any three-character string with "an" as the first two characters, so it finds lines containing "any," "and," "management," and "plan" followed by a space (because spaces count, too).

    $ grep 'an.' list

  • When an asterisk (*) follows a character, grep interprets the asterisk as "zero or more instances of that character." When the asterisk follows a regular expression, grep interprets the asterisk as "zero or more instances of characters matching the pattern."

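    Because it includes zero occurrences, the asterisk can create a confusing command output. The following command looks as though it finds lines containing the letters "qu," but because u* also matches zero occurrences of u, it displays every line that contains a q, whether or not a u follows it.

    $ grep 'qu*' list

    To require at least one occurrence of a character, type the character once before the starred character. The following command finds all lines containing at least one "n."

    $ grep 'nn*' list

    Similarly, the following command finds all lines containing the pattern "nn," that is, two or more consecutive n characters.

    $ grep 'nnn*' list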
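  • To match zero or more occurrences of any character in list, type the following command. The quotation marks keep the shell from expanding .* into file names. Because the pattern also matches an empty string, this command displays every line in the file.

    $ grep '.*' list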
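The metacharacters in the list above can be compared side by side on a small sample file. This is a sketch only: the contents of list are hypothetical, chosen so that each pattern selects a different set of lines.

```shell
#!/bin/sh
# Hypothetical contents for the file "list".
workdir=$(mktemp -d)
list="$workdir/list"
printf '%s\n' 'b' 'bob' 'band' 'plan b' 'queen' 'Iraq' 'inn' > "$list"

starts_b=$(grep '^b' "$list")    # lines that begin with b
ends_b=$(grep 'b$' "$list")      # lines that end with b
only_b=$(grep '^b$' "$list")     # lines consisting of b alone
an_dot=$(grep 'an.' "$list")     # "an" followed by any character, including a space
q_star=$(grep 'qu*' "$list")     # q followed by zero or more u characters
nn=$(grep 'nnn*' "$list")        # two or more consecutive n characters

printf '%s\n' "$starts_b" "$ends_b" "$only_b" "$an_dot" "$q_star" "$nn"

rm -r "$workdir"
```

Note that 'qu*' selects Iraq as well as queen, because the asterisk allows zero occurrences of u.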

Searching for Metacharacters

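To use the grep command to search for a character that grep treats as a metacharacter, such as . * [ ^ $ or \, precede the character with a backslash (\) and enclose the pattern in single quotation marks. The backslash tells grep to ignore (escape) the special meaning of the metacharacter, and the quotation marks keep the shell from interpreting characters such as & ! * and ? before grep sees them.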

For example, the following expression matches lines that start with a period, and is useful when searching for nroff or troff formatting requests (which begin with a period).

$ grep '^\.' file
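The effect of the backslash is easiest to see by running the same search with and without it. The sketch below uses a short sample file of formatting requests and text; the file name and contents are assumptions made for illustration. Single quotes keep the shell from removing the backslash before grep sees it.

```shell
#!/bin/sh
# Hypothetical troff-style input: two formatting requests and one text line.
workdir=$(mktemp -d)
printf '%s\n' '.SH NAME' 'grep searches files' '.PP' > "$workdir/page"

# Unescaped dot: '^.' matches any line that has at least one character,
# so all three lines are counted.
any_char=$(grep -c '^.' "$workdir/page")

# Escaped dot: '^\.' matches only lines that begin with a literal period.
literal=$(grep '^\.' "$workdir/page")

echo "$any_char"
echo "$literal"

rm -r "$workdir"
```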

Table 4-1 lists common search pattern elements you can use with grep.

Table 4-1 grep Search Pattern Elements

Character   Matches

^           The beginning of a text line
$           The end of a text line
.           Any single character
[...]       Any single character in the bracketed list or range
[^...]      Any single character not in the list or range
*           Zero or more occurrences of the preceding character or regular expression
.*          Zero or more occurrences of any single character
\           Escapes the special meaning of the next character

Note that you can also use these search characters in vi text editor searches.
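The bracketed forms in Table 4-1 are not demonstrated elsewhere in this chapter, so the following sketch shows one way they might be used. The sample lines are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample lines.
workdir=$(mktemp -d)
printf '%s\n' 'x72836' 'x7a001' 'y72836' 'Allan' 'allan' > "$workdir/sample"

# [0-9] matches any one digit: lines that start with x7 followed by a digit.
digit=$(grep '^x7[0-9]' "$workdir/sample")

# [^0-9] matches any one character that is not a digit.
not_digit=$(grep '^x7[^0-9]' "$workdir/sample")

# [Aa] matches either A or a, which allows for both capitalizations.
either_case=$(grep '[Aa]llan' "$workdir/sample")

printf '%s\n' "$digit" "$not_digit" "$either_case"

rm -r "$workdir"
```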

 
 
 