This is the world's best introduction to sed - the superman of UNIX stream editing. I originally wrote this introduction for my second e-book, but later decided to make it part of the free e-book preview and republish it here as an article.

Introduction to sed

Mastering sed can be reduced to understanding and manipulating the four spaces of sed. These four spaces are:

  • Input Stream
  • Pattern Space
  • Hold Buffer
  • Output Stream

Think of the spaces this way - sed reads the input stream and produces the output stream. Internally it has the pattern space and the hold buffer. Sed reads data from the input stream until it finds the newline character \n. Then it places the data read so far, without the newline, into the pattern space. Most sed commands operate on the data in the pattern space. The hold buffer is there for your convenience. Think of it as a temporary buffer. You can copy or exchange data between the pattern space and the hold buffer. Once sed has executed all the commands, it outputs the pattern space, adds a \n at the end, and starts over with the next line of input.

It's possible to modify the behavior of sed with the -n command line switch. When -n is specified, sed doesn't output the pattern space automatically and you have to print it explicitly with either the p or the P command.
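To watch the automatic output step and the -n switch in isolation, here is a minimal sketch. The two-line input is made up for illustration and is fed through a pipe, so no file is needed.

```shell
# Without -n, p prints the pattern space and then sed prints it again
# automatically at the end of the script, so every line appears twice.
doubled=$(printf 'one\ntwo\n' | sed 'p')

# With -n the automatic print is off, so only the explicit p produces output.
single=$(printf 'one\ntwo\n' | sed -n 'p')

printf '%s\n---\n%s\n' "$doubled" "$single"
```

The first result has four lines and the second has two, which is the whole difference -n makes.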

Let's look at several examples to understand the four spaces and sed. These are just examples to illustrate what sed looks like and what it's all about.

Here is the simplest possible sed program:

sed 's/foo/bar/'

This program replaces the first occurrence of the text "foo" with "bar" on every line. Here is how it works. Suppose you have a file with these lines:

abc
foo
123-foo-456

Sed opens the file as the input stream and starts reading the data. After reading "abc" it finds a newline \n. It places the text "abc" in the pattern space and then applies the s/foo/bar/ command. Since the pattern space contains "abc" and there is no "foo" anywhere in it, sed does nothing to the pattern space. At this moment sed has executed all the commands (in this case just one). The default action when all the commands have been executed is to print the pattern space, followed by a newline. So the output from the first line is "abc\n".

Now sed reads in the second line "foo" and executes s/foo/bar/. This replaces "foo" with "bar". The pattern space now contains just "bar". The end of the script has been reached and sed prints out the pattern space, followed by a newline. The output from the second line is "bar\n".

Now the 3rd line is read in. The pattern space is now "123-foo-456". Since there is "foo" in the text, the s/foo/bar/ is successful and the pattern space is now "123-bar-456". The end is reached and sed prints the pattern space. The output is "123-bar-456\n".

All the lines of the input have been read at this moment and sed exits. The output from running the script on our example file is:

abc
bar
123-bar-456

In this example we never used the hold buffer because there was no need for temporary storage.
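The walk-through above can be reproduced in a shell. This sketch pipes the same three lines into sed instead of using a file, and adds one made-up line, "foo foo", to show that s replaces only the first match on a line unless you append the g flag.

```shell
# The three sample lines from the walk-through, fed to sed on standard input.
out=$(printf 'abc\nfoo\n123-foo-456\n' | sed 's/foo/bar/')
printf '%s\n' "$out"

# Illustrative extra case: only the first "foo" on a line is replaced...
first_only=$(printf 'foo foo\n' | sed 's/foo/bar/')
# ...unless the g (global) flag is added to the s command.
every=$(printf 'foo foo\n' | sed 's/foo/bar/g')
printf '%s\n%s\n' "$first_only" "$every"
```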

Before we look at an example with temporary storage, let's take a look at three command line switches: -n, -e and -i. First -n.

If you specify -n to sed, like this:

sed -n 's/foo/bar/'

Then sed will no longer print the pattern space when it reaches the end of the script. So if you run this program on our sample file above, there will be no output. You must use sed's p command to force sed to print the line:

sed -n 's/foo/bar/; p'

As you can see, sed commands are separated by the ; character. You can also use the -e switch to separate the commands:

sed -n -e 's/foo/bar/' -e 'p'

It's the same as if you used ;. Next, let's take a look at the -i command line switch. It's available in modern sed implementations such as GNU sed, and it makes sed edit the file in place, meaning it reads the contents of the file, executes the commands, and places the new contents back in the file.

Here is an example. Suppose you have a file called "users", with the following content:

pkrumins:hacker
esr:guru
rms:geek

And you wish to replace the ":" symbol with ";" in the whole file. Then you can do it as easily as:

sed -i 's/:/;/' users

It silently executes the s/:/;/ command on every line of the file, replacing the first ":" on each line (every line in this file has only one). Be very careful when using -i, as it's destructive and not reversible! It's always safer to run sed without -i, check the output, and then replace the file yourself.

Alternatively, you can append a file extension to the -i switch. This way sed will make a backup copy of the file before it makes in-place modifications.

For example, if you specify -i.bak, like this:

sed -i.bak 's/:/;/' users

Then sed will create users.bak before modifying the contents of the users file.
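Here is a sketch of both safe approaches: letting sed keep a backup with -i.bak, and running sed without -i and replacing the file yourself. It assumes a sed that supports -i (GNU sed, for example) and works in a scratch directory created with mktemp. The name users.new is made up for the example.

```shell
# Work in a throwaway directory so no real file is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'pkrumins:hacker\nesr:guru\nrms:geek\n' > users

# In-place edit that first saves the original as users.bak.
sed -i.bak 's/:/;/' users
edited=$(cat users)
backup=$(cat users.bak)

# The manual alternative: write to a new file, inspect it, then move it
# over the original. Restore the original from the backup first.
cp users.bak users
sed 's/:/;/' users > users.new
mv users.new users
manual=$(cat users)

printf '%s\n' "$edited"
```

Both routes end with the same contents in users. The difference is that the manual route gives you a chance to look at users.new before anything is overwritten.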

Actually, before we look at the hold buffer, let's take a look at addresses and ranges. Addresses allow you to restrict sed commands to certain lines, or ranges of lines.

The simplest address is a single number that limits sed commands to the given line number:

sed '5s/foo/bar/'

This limits the s/foo/bar/ command to the 5th line of the file or input stream. So if there is a "foo" on the 5th line, it will be replaced with "bar". No other lines will be touched.

Addresses can also be inverted by putting a ! after the address. To match all lines that are not the 5th line (lines 1-4, plus lines 6-...), do this:

sed '5!s/foo/bar/'

The inversion can be applied to any address.
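A small sketch of a line-number address and its inversion. To keep the input short it uses line 2 rather than line 5, and the three identical "foo" lines are made up for the example.

```shell
# Only line 2 is changed.
only_second=$(printf 'foo\nfoo\nfoo\n' | sed '2s/foo/bar/')

# With !, every line except line 2 is changed.
all_but_second=$(printf 'foo\nfoo\nfoo\n' | sed '2!s/foo/bar/')

printf '%s\n---\n%s\n' "$only_second" "$all_but_second"
```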

Next, you can also limit sed commands to a range of lines by specifying two numbers, separated by a comma:

sed '5,10s/foo/bar/'

In this one-liner the s/foo/bar/ is executed only on lines 5 - 10, inclusive. Here is a quick, useful one-liner. Suppose you want to print lines 5 - 10 in the file. You can first disable implicit line printing with the -n command line switch, and then use the p command on lines 5 - 10:

sed -n '5,10p'

This will execute the p command only on lines 5 - 10. No other lines will be output. Pretty neat, isn't it?

There is a special address $ that matches the last line of the file. Here is an example that prints the last line of the file:

sed -n '$p'

As you can see, the p command has been limited to $, which is the last line of input.
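The range and last-line addresses can be tried on generated input. This sketch assumes the seq utility is available to produce the numbers 1 through 12, one per line.

```shell
# Print only lines 5 through 10 of a 12-line input.
middle=$(seq 1 12 | sed -n '5,10p')

# Print only the last line.
last=$(seq 1 12 | sed -n '$p')

printf '%s\n---\n%s\n' "$middle" "$last"
```

The single quotes matter for '$p': inside double quotes the shell would try to expand $p as a variable before sed ever saw it.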

Next, an address can also be a single regular expression, written like this: /regex/. If you specify a regex before a command, then the command will only get executed on lines that match the regex. Check this out:

sed -n '/a\+b\+/p'

Here the p command will get called only on lines that match the a\+b\+ regular expression, which means one or more letters "a" followed by one or more letters "b". For example, it prints lines like "ab", "aab", "aaabbbbb", "foo-123-ab", etc. Note how the + has to be escaped. That's because sed uses basic regular expressions by default (and \+ itself is a GNU sed extension to them). In GNU sed you can enable extended regular expressions by using the -r command line switch; -E is a spelling that both GNU and BSD sed accept:

sed -rn '/a+b+/p'

This way you don't need to escape meta-characters like +, ( and ).

There is also an expression to match a range between two regexes. Here is an example:
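To compare the two regex flavors, here is a sketch that runs both forms on the same made-up input. It assumes GNU sed, which accepts both \+ in basic regular expressions and the -r switch.

```shell
# Sample lines: two contain one or more "a" followed by one or more "b".
input=$(printf 'ab\nfoo-123-aab\nba\nxyz\n')

# Basic regular expressions: + must be written as \+ (GNU sed).
basic=$(printf '%s\n' "$input" | sed -n '/a\+b\+/p')

# Extended regular expressions with -r: a plain + works.
extended=$(printf '%s\n' "$input" | sed -rn '/a+b+/p')

printf '%s\n' "$basic"
```

Both commands select the same two lines. "ba" is skipped because the "a" has to come before the "b".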

sed '/foo/,/bar/d'

This one-liner matches all lines from the first line that matches the /foo/ regex through the next line that matches the /bar/ regex, inclusive, and applies the d command, which stands for delete. In other words, it deletes the range of lines that starts at a line matching /foo/ and ends at the first later line matching /bar/. If another line matches /foo/ after that, a new range begins there.
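A sketch of the range delete on made-up input. The lines from the one containing "foo" through the next one containing "bar" are removed, and everything else is printed.

```shell
# "foo", "two" and "bar" fall inside the range and are deleted.
deleted=$(printf 'one\nfoo\ntwo\nbar\nthree\n' | sed '/foo/,/bar/d')
printf '%s\n' "$deleted"
```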

Now let's take a look at the hold buffer. Suppose you have a problem where you want to print the line before the line that matches a regular expression. How do you do this? If sed didn't have a hold buffer, things would be tough, but with the hold buffer we can always save the current line to the hold buffer, and then let sed read in the next line. Now if the next line matches the regex, we just print the hold buffer, which holds the previous line. Easy, right?

The command for copying the current pattern space to the hold buffer is h. The command for copying the hold buffer back to the pattern space is g. The command for exchanging the hold buffer and the pattern space is x. We just have to choose the right commands to solve this problem. Here is the solution:

sed -n '/regex/{x;p;x}; h'

It works this way - every line gets copied to the hold buffer with the h command at the end of the script. However, for every line that matches the /regex/, we exchange the hold buffer with the pattern space by using the x command, print it with the p command, and then exchange the buffers back, so that if the next line matches the /regex/ again, we could print the current line.
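To see what h, g and x do on their own, here is a sketch with a made-up two-line input. The scripts are not meant to be useful; they only make the movement between the pattern space and the hold buffer visible.

```shell
# 1h copies line 1 into the hold buffer; 2g overwrites line 2 in the
# pattern space with that saved copy, so "a" is printed twice.
copied=$(printf 'a\nb\n' | sed '1h;2g')

# x swaps the two buffers on every line. The hold buffer starts out empty,
# so line 1 prints as an empty line and line 2 prints the saved "a".
swapped=$(printf 'a\nb\n' | sed 'x')

printf '%s\n---\n%s\n' "$copied" "$swapped"
```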

Also notice the command grouping. Several commands can be grouped and executed only for a specific address or range. In this one-liner the command group is {x;p;x} and it gets executed only if the current line matches /regex/.

Note that this one-liner doesn't work correctly if the first line of the input matches /regex/. The hold buffer is still empty at that point, so sed prints an empty line. To fix this, we can limit the p command to all lines that are not the first line with the 1! inverted address match:

sed -n '/regex/{x;1!p;x}; h'

Notice the 1!p. This says - call the p command on all the lines that are not the 1st line. This prevents anything from being printed when the first line matches /regex/.
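Here is a sketch of the finished one-liner on made-up input, with "test" standing in for regex. A ; is placed before the closing } because some sed implementations require it. The second half counts output lines to show what 1! changes when the very first line matches.

```shell
# Print the line that comes before each line matching /test/.
before=$(printf 'Yay\ntest\nNo\ntset\n' | sed -n '/test/{x;1!p;x;}; h')
printf '%s\n' "$before"

# When line 1 matches, the hold buffer is still empty. Without 1! sed
# prints one empty line; with 1! it prints nothing.
without_guard=$(printf 'test\nother\n' | sed -n '/test/{x;p;x;}; h' | wc -l | tr -d ' ')
with_guard=$(printf 'test\nother\n' | sed -n '/test/{x;1!p;x;}; h' | wc -l | tr -d ' ')
printf '%s %s\n' "$without_guard" "$with_guard"
```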

Well, that's it! I think this introduction explains the most important concepts in sed, including various command line switches, the four spaces and various sed commands.

If you wish to learn more, I suggest you get a copy of my "Sed One-Liners Explained" e-book. The e-book contains exactly 100 well-explained one-liners. Once you work through them, you'll have rewired your brain to "think in sed". In other words, you'll have learned how to manipulate the pattern space, the hold buffer and you'll know when to print the data to get the results that you need.

Have fun!

Comments

September 19, 2011, 04:55

good work kit, nice post!

October 08, 2011, 04:28

A few years ago I wrote a post on programming patterns in sed: http://tech.bluesmoon.info/2008/09/programming-patterns-in-sed.html

Always a fun tool to have in your toolbox.

Patillotes
October 26, 2011, 16:42

Do you know "Minimal Perl"? It's a book mostly about Perl one-liners and using Perl as a replacement for sed, grep, awk and find.

October 28, 2011, 08:29

I didn't know about this book. I just took a look at the preview and it looks awesome!

November 21, 2011, 14:50

Got up to your last command on AIX 5.3 and ran into problems. I tried to run it as written and kept getting a parse error. Adding a semicolon after the second x got rid of the parse error, and it seems to be matching something, but it is printing a blank line.

Command:
sed -n '/test/{x;p;x;}; h' < /tmp/sedtest

File:
cat /tmp/sedtest
Yay
test
No
tset

July 17, 2012, 13:17

Peteris,
-i option is not working with sed. Is it platform specific? What are other synonyms of it? I am using HP-UX BTW.

Thanks,
Chirag.

July 17, 2012, 14:22

It's an option for modern sed implementations. HPUX doesn't have it. You just have to do what -i does yourself, like: sed ... file > file-copy; mv file-copy file

Visu
April 11, 2013, 20:12

How to write a sed command for:
match a word from the output of a command.

fw_printenv | grep bootcmd
When I execute the above command I get the following output:
bootcmd=run sdboot3

Now I need something of the sort
fw_printenv | grep bootcmd | sed ......
which gives sdboot3 and no newline at the end.

Kevin
March 04, 2014, 23:40

There's probably an easier way than this, but...

Command: sed -n '/bootcmd=run /{s/bootcmd=run \(.*\)/\1/;p;}' sedtemp.txt

File: cat sedtemp.txt
not it
not it2
not it3
bootcmd=run sdboot3
not the line
nte the droid your looking fO
bootcmd=run sdbootanotherone
r

Output:
sdboot3
sdbootanotherone

August 24, 2014, 02:15

Peteris has a nice blog. Here are some ways:

$ cmd | grep bootcmd | grep -o sdboot3 # GNU grep

$ cmd | sed -n "s/.*bootcmd.*sdboot3.*/sdboot3/p"

If you want to eliminate the newline, use tr command.

Daniel Goldman - http://www.sed-book.com/

Ankit Gupta
May 20, 2013, 05:41

Great Introduction to sed Peteris. Understood all but the hold buffer concept.

Clark
September 06, 2013, 17:53

This post is a good contribution. Other tutorials I have read made my eyes go cross just before I fell asleep.

Steven Eckhoff
September 25, 2013, 18:05

Thanks!

This is a great intro.

Carl
February 12, 2014, 17:08

Great post! Clear, concise, to the point - just the way I like it. Keep up the good work.

July 12, 2014, 02:16

Very well done! Thank you!

November 26, 2014, 17:00

I wanted to thank you for this great read!! I definitely enjoyed every little bit of it, I have you bookmarked to check out all the new stuff you post.
