Article 26.4 gives a tutorial introduction to regular expressions. This article is intended for those of you who just need a quick listing of regular expression syntax as a refresher from time to time. It also includes some simple examples. The characters in Table 26.6 have special meaning only in search patterns.
Pattern | What Does it Match? |
---|---|
`.` | Match any single character except newline. |
`*` | Match any number (or none) of the single character that immediately precedes it. The preceding character can also be a regular expression. For example, since `.` (dot) means any character, `.*` means "match any number of any character." |
`^` | Match the following regular expression at the beginning of the line. |
`$` | Match the preceding regular expression at the end of the line. |
`[ ]` | Match any one of the enclosed characters. A hyphen (`-`) indicates a range of consecutive characters. |
`\{`*n*`,`*m*`\}` | Match a range of occurrences of the single character that immediately precedes it. The preceding character can also be a regular expression. `\{`*n*`\}` matches exactly *n* occurrences, `\{`*n*`,\}` matches at least *n* occurrences, and `\{`*n*`,`*m*`\}` matches any number of occurrences between *n* and *m*. |
`\` | Turn off the special meaning of the character that follows. |
`\( \)` | Save the pattern enclosed between `\(` and `\)` into a special holding space. Up to nine patterns can be saved on a single line. They can be "replayed" in substitutions by the escape sequences `\1` to `\9`. |
`\< \>` | Match characters at beginning (`\<`) or end (`\>`) of a word. |
`+` | Match one or more instances of preceding regular expression. |
`?` | Match zero or one instances of preceding regular expression. |
`\|` | Match the regular expression specified before or after. |
`( )` | Apply a match to the enclosed group of regular expressions. |
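
As a rough illustration of the search metacharacters above, the following sketch runs a few patterns through `grep`. The sample word list and the variable names are made up for this example, and `grep -E` is assumed as a stand-in for `egrep` when the extended `?` operator is needed.

```shell
# Hypothetical sample input, one word or number per line.
words='bag
Bag
bug
bugs
big bag
80286
8086'

# ^, $, [ ] and . together: three-character lines shaped like b?g or B?g.
three_letter=$(printf '%s\n' "$words" | grep '^[Bb].g$')

# * applies to the preceding character: "bug" followed by zero or more "s".
bug_lines=$(printf '%s\n' "$words" | grep -c 'bugs*')

# \{n\} interval: lines made of exactly five digits.
five_digits=$(printf '%s\n' "$words" | grep '^[0-9]\{5\}$')

# ? needs extended syntax (egrep, written here as grep -E).
intel=$(printf '%s\n' "$words" | grep -E '^80[23]?86$')

printf '%s\n' "$three_letter" "$bug_lines" "$five_digits" "$intel"
```

Anchoring with `^` and `$` is what keeps `bugs` and `big bag` out of the first result; without the anchors, the pattern would match anywhere in the line.
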
The characters in Table 26.7 have special meaning only in replacement patterns.
Pattern | What Does it Do? |
---|---|
`\` | Turn off the special meaning of the character that follows. |
`\`*n* | Restore the *n*th pattern previously saved by `\(` and `\)`. *n* is a number from 1 to 9. |
`&` | Re-use the text matched by the search pattern as part of the replacement pattern. |
`~` | Re-use the previous replacement pattern in the current replacement pattern. |
`\u` | Convert first character of replacement pattern to uppercase. |
`\U` | Convert replacement pattern to uppercase. |
`\l` | Convert first character of replacement pattern to lowercase. |
`\L` | Convert replacement pattern to lowercase. |
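
A minimal sketch of the replacement metacharacters that `sed` shares with `ex` (`&`, `\` and the saved patterns `\1`, `\2`). The input line and variable names are invented for illustration. `~`, `\u`, `\U`, `\l` and `\L` are left out because they are not part of standard `sed`.

```shell
# Hypothetical input line.
line='Fortran 77'

# & stands for whatever the search pattern matched.
labeled=$(printf '%s\n' "$line" | sed 's/[0-9][0-9]*/Item &:/')

# \& is a literal ampersand; the second, unescaped & is the matched text.
literal=$(printf '%s\n' "$line" | sed 's/Fortran/C \& &/')

# \( \) save two pieces; \2 and \1 replay them in swapped order.
swapped=$(printf '%s\n' "$line" | sed 's/\([A-Za-z]*\) \([0-9]*\)/\2 \1/')

printf '%s\n' "$labeled" "$literal" "$swapped"
```
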
When used with `grep` or `egrep`, regular expressions are surrounded by quotes. (If the pattern contains a `$`, you must use single quotes; e.g., `'pattern'`.) When used with `ed`, `ex`, `sed`, and `awk`, regular expressions are usually surrounded by `/` (although any delimiter works).
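
The quoting and delimiter rules can be sketched as follows. The input strings are hypothetical, and the comma is just one possible alternative delimiter for the `sed` substitute command.

```shell
# Single quotes keep the shell from touching the $ in the pattern.
end_count=$(printf 'a bag\nbag of tricks\n' | grep -c 'bag$')

# A different delimiter avoids escaping every / in a pathname.
new_path=$(printf '/usr/local/bin\n' | sed 's,/usr/local,/opt,')

printf '%s\n' "$end_count" "$new_path"
```
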
Table 26.8 has some example patterns.
Pattern | What Does it Match? |
---|---|
`bag` | The string `bag`. |
`^bag` | `bag` at beginning of line. |
`bag$` | `bag` at end of line. |
`^bag$` | `bag` as the only word on line. |
`[Bb]ag` | `Bag` or `bag`. |
`b[aeiou]g` | Second letter is a vowel. |
`b[^aeiou]g` | Second letter is a consonant (or uppercase or symbol). |
`b.g` | Second letter is any character. |
`^...$` | Any line containing exactly three characters. |
`^\.` | Any line that begins with a `.` (dot). |
`^\.[a-z][a-z]` | Same, followed by two lowercase letters (e.g., troff requests). |
`^\.[a-z]\{2\}` | Same as previous, grep or sed only. |
`^[^.]` | Any line that doesn't begin with a `.` (dot). |
`bugs*` | `bug`, `bugs`, `bugss`, etc. |
`"word"` | A word in quotes. |
`"*word"*` | A word, with or without quotes. |
`[A-Z][A-Z]*` | One or more uppercase letters. |
`[A-Z]+` | Same, egrep or awk only. |
`[A-Z].*` | An uppercase letter, followed by zero or more characters. |
`[A-Z]*` | Zero or more uppercase letters. |
`[a-zA-Z]` | Any letter. |
`[^0-9A-Za-z]` | Any symbol (not a letter or a number). |
`[567]` | One of the numbers `5`, `6`, or `7`. |
egrep or awk pattern: | |
`five\|six\|seven` | One of the words `five`, `six`, or `seven`. |
`80[23]?86` | One of the numbers `8086`, `80286`, or `80386`. |
`compan(y\|ies)` | One of the words `company` or `companies`. |
ex or vi pattern: | |
`\<the` | Words like `theater` or `the`. |
`the\>` | Words like `breathe` or `the`. |
`\<the\>` | The word `the`. |
sed or grep pattern: | |
`0\{5,\}` | Five or more zeros in a row. |
`[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}` | US social security number (*nnn*-*nn*-*nnnn*). |
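
A few of the example patterns can be tried out with `grep`, as in this sketch. The sample text is invented, and the `\<` and `\>` word anchors are assumed to be supported by the local `grep` (GNU grep accepts them; the table lists them as ex or vi patterns).

```shell
# Hypothetical sample text.
text='the theater
breathe
other
123-45-6789
12-345-6789
.sp 2
plain text'

# \<the\> matches "the" only as a whole word.
word_the=$(printf '%s\n' "$text" | grep -c '\<the\>')

# Interval expressions: three digits, two digits, four digits.
ssn=$(printf '%s\n' "$text" | grep '[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}')

# A literal dot at the start of the line, then two lowercase letters.
requests=$(printf '%s\n' "$text" | grep '^\.[a-z]\{2\}')

printf '%s\n' "$word_the" "$ssn" "$requests"
```
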
The following examples show the metacharacters available to `sed` or `ex`. (`ex` commands begin with a colon.) A space is marked by `␣`; a TAB is marked by *tab*.
Command | Result |
---|---|
`s/.*/( & )/` | Redo the entire line, but add parentheses. |
`s/.*/mv & &.old/` | Change a wordlist into mv commands. |
`/^$/d` | Delete blank lines. |
`:g/^$/d` | ex version of previous. |
`/^[␣`*tab*`]*$/d` | Delete blank lines, plus lines containing only spaces or TABs. |
`:g/^[␣`*tab*`]*$/d` | ex version of previous. |
`s/␣␣*/␣/g` | Turn one or more spaces into one space. |
`:%s/␣␣*/␣/g` | ex version of previous. |
`:s/[0-9]/Item &:/` | Turn a number into an item label (on the current line). |
`:s` | Repeat the substitution on the first occurrence. |
`:&` | Same. |
`:sg` | Same, but for all occurrences on the line. |
`:&g` | Same. |
`:%&g` | Repeat the substitution globally. |
`:.,$s/Fortran/\U&/g` | Change word to uppercase, on current line to last line. |
`:%s/.*/\L&/` | Lowercase entire file. |
`:s/\<./\u&/g` | Uppercase first letter of each word on current line (useful for titles). |
`:%s/yes/No/g` | Globally change a word to `No`. |
`:%s/Yes/~/g` | Globally change a different word to `No` (previous replacement). |
`s/die or do/do or die/` | Transpose words. |
`s/\([Dd]ie\) or \([Dd]o\)/\2 or \1/` | Transpose, using hold buffers to preserve case. |
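
Several of the `sed` commands above can be checked from the shell, as in the sketch below. The input strings and variable names are hypothetical. The POSIX class `[[:space:]]` is used here as an assumption-laden substitute for a bracket list holding a literal space and TAB, since a literal TAB is easy to lose when copying.

```shell
# Delete blank lines and lines holding only spaces or TABs.
cleaned=$(printf 'first\n\n \t \nlast\n' | sed '/^[[:space:]]*$/d')

# Turn one or more spaces into one space.
squeezed=$(printf 'too    many   spaces\n' | sed 's/  */ /g')

# Change a wordlist entry into an mv command.
mv_cmd=$(printf 'notes\n' | sed 's/.*/mv & &.old/')

# Transpose two words while keeping each word's original case.
transposed=$(printf 'Die or do\n' | sed 's/\([Dd]ie\) or \([Dd]o\)/\2 or \1/')

printf '%s\n' "$cleaned" "$squeezed" "$mv_cmd" "$transposed"
```

The case-changing escapes (`\U`, `\L`, `\u`, `\l`) and `~` are not exercised here because standard `sed` does not define them; they belong to the `ex` commands in the table.
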
- from O'Reilly & Associates' UNIX in a Nutshell (SVR4/Solaris)