This article is part of the article series "Perl One-Liners Explained."

This is the seventh part of a nine-part article on Perl one-liners. Perl is not Perl without regular expressions, so in this part I will come up with and explain various Perl regular expressions. Please see part one for the introduction of the series.

Perl one-liners is my attempt to create "perl1line.txt" that is similar to "awk1line.txt" and "sed1line.txt" that have been so popular among Awk and Sed programmers, and Unix sysadmins. I will release the perl1line.txt in the next part of the series.

The article on Perl one-liners consists of nine parts:

After I am done with the next part of the article, I will release the whole article series as a pdf e-book! Please subscribe to my blog to be the first to get it. You can also follow me on Twitter.

And here are today's one-liners:

109. Match something that looks like an IP address.

/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/

This regex doesn't guarantee that the thing that got matched is in fact a valid IP. All it does is match something that looks like an IP: four numbers of one to three digits each, separated by dots. For example, it matches a valid IP and it also matches an invalid IP such as 923.844.1.999.

Here is how it works. The ^ at the beginning of regex is an anchor that matches the beginning of string. Next \d{1,3} matches one, two or three consecutive digits. The \. matches a dot. The $ at the end is an anchor that matches the end of the string. It's important to use both ^ and $ anchors, otherwise strings like foo213.3.1.2bar would also match.

This regex can be simplified by grouping the first three repeated \d{1,3}\. expressions:

/^(\d{1,3}\.){3}\d{1,3}$/

110. Test if a number is in range 0-255.

/^([0-9]|[0-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$/

Here is how it works. A number can have one, two or three digits. If it's a one-digit number then we allow it to be anything [0-9]. If it's a two-digit number, we also allow it to be any combination of [0-9][0-9]. However if it's a three-digit number, it has to be either one hundred-something or two hundred-something. If it's one hundred-something, then 1[0-9][0-9] matches it. If it's two hundred-something then it's either something up to 249, which is matched by 2[0-4][0-9], or it's 250-255, which is matched by 25[0-5].

111. Match an IP address.

my $ip_part = qr/([0-9]|[0-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])/;
if ($ip =~ /^($ip_part\.){3}$ip_part$/) {
    say "valid ip";
}

This regexp combines the previous two. The qr/.../ operator compiles the regular expression, and the result is stored in the $ip_part variable. Then $ip_part is used to match the four parts of the IP address.

112. Check if the string looks like an email address.


This regex makes sure that the string looks like an email address. Notice that I say "looks like". It doesn't guarantee it is an email address. Here is how it works - first it matches something up to the @ symbol, then it matches as much as possible until it finds a dot, and then it matches some more. If this succeeds, then it's something that at least looks like an email address with the @ symbol and a dot in it.

For example, cats@catonmat doesn't match because the regex can't match the dot \. that is necessary.

A much more robust way to check if a string is a valid email would be to use the Email::Valid module (here $email holds the address to check):

use Email::Valid;
print (Email::Valid->address($email) ? 'valid email' : 'invalid email');

113. Check if the string is a decimal number.

Checking if the string is a number is really difficult. I based my regex and explanation on the one in Perl Cookbook.

Perl offers \d that matches digits 0-9. So we can start with:

/^\d+$/

This regex matches one or more digits \d starting at the beginning of the string ^ and ending at the end of the string $. However this doesn't match numbers such as +3 and -3. Let's modify the regex to match them:

/^[+-]?\d+$/

Here the [+-]? means match an optional plus or a minus before the digits. This now matches +3 and -3 but it doesn't match -0.3. Let's add that:

/^[+-]?\d+\.?\d*$/

Now we have expanded the previous regex by adding \.?\d*, which matches an optional dot followed by zero or more digits. Now we're in business and this regex also matches numbers like -0.3 and 0.3.

A much better way to match a decimal number is to use the Regexp::Common module that offers various useful regexes. For example, to match an integer you can use $RE{num}{int} from Regexp::Common.

How about positive hexadecimal numbers? Here is how:

/^0x[0-9a-f]+$/i

This matches the hex prefix 0x followed by the hex number itself. The /i flag at the end makes sure that the match is case insensitive. For example, 0x5af matches, 0X5Fa matches but 97 doesn't, because it's just a decimal number.

It's better to use $RE{num}{hex} because it supports negative numbers, decimal places and number grouping.

Now how about octal? Here is how:

/^0[0-7]+$/

Octal numbers are prefixed by 0, which is followed by octal digits 0-7. For example, 013 matches but 09 doesn't, because it's not a valid octal number.

It's better to use $RE{num}{oct} because of the same reasons as above.

Finally binary:

/^[01]+$/

Binary base consists of just 0s and 1s. For example, 010101 matches but 210101 doesn't, because 2 is not a valid binary digit.

It's better to use $RE{num}{bin} because of the same reasons as above.

114. Check if a word appears twice in the string.

/(word).*\1/

This regex matches word followed by something or nothing at all, followed by the same word. Here the (word) captures the word in group 1 and \1 refers to contents of group 1, therefore it's almost the same as writing /(word).*word/

For example, silly things are silly matches /(silly).*\1/, but silly things are boring doesn't, because silly is not repeated in the string.

115. Increase all numbers by one in the string.

$str =~ s/(\d+)/$1+1/ge

Here we use the substitution operator s///. It matches all integers (\d+), puts them in capture group 1, then it replaces them with their value incremented by one $1+1. The g flag makes sure it finds all the numbers in the string, and the e flag evaluates $1+1 as a Perl expression.

For example, this 1234 is awesome 444 gets turned into this 1235 is awesome 445.

116. Extract HTTP User-Agent string from the HTTP headers.

/^User-Agent: (.+)$/

HTTP headers are formatted as Key: Value pairs. It's very easy to parse such strings, you just instruct the regex engine to save the Value part in $1 group variable.

For example, if the HTTP headers contain,

Host: localhost:8000
Connection: keep-alive
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_0_0; en-US)
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

Then the regular expression will extract the Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_0_0; en-US) string.

117. Match printable ASCII characters.

/[ -~]/

This is really tricky and smart. To understand it, take a look at man ascii. You'll see that space starts at value 0x20 and the ~ character is 0x7e. All the characters between a space and ~ are printable. This regular expression matches exactly that. The [ -~] defines a range of characters from space till ~. This is my favorite regexp of all time.

You can invert the match by placing ^ as the first character in the group:

/[^ -~]/

This matches the opposite of [ -~].

118. Match text between two HTML tags.

m|<strong>([^<]*)</strong>|

This regex matches everything between <strong>...</strong> HTML tags. The trick here is the ([^<]*), which matches as much as possible until it finds a < character, which starts the next tag.

Alternatively you can write:

m|<strong>(.*?)</strong>|

But this is a little different. For example, if the HTML is <strong><em>hello</em></strong> then the first regex doesn't match anything because <strong> is immediately followed by a <, so ([^<]*) matches the empty string and the </strong> that must come next isn't there. The second regex matches <em>hello</em> because the (.*?)</strong> matches as little as possible until it finds </strong>, which happens to be <em>hello</em>.

However don't use regular expressions for matching and parsing HTML. Use modules like HTML::TreeBuilder to accomplish the task more cleanly.

119. Replace all <b> tags with <strong>

$html =~ s|<(/)?b>|<$1strong>|g

Here I assume that the HTML is in variable $html. Next the <(/)?b> matches the opening and closing <b> tags, captures the optional closing tag slash in group $1 and then replaces the matched tag with either <strong> or </strong>, depending on if it was an opening or closing tag.

120. Extract all matches from a regular expression.

my @matches = $text =~ /regex/g;

Here the regular expression gets evaluated in list context, which makes it return all the matches. The matches are put in the @matches variable.

For example, the following regex extracts all numbers from a string:

my $text = "10 hello 25 moo 31 foo";
my @nums = $text =~ /\d+/g;

@nums now contains (10, 25, 31).

Perl one-liners explained e-book

I've now written the "Perl One-Liners Explained" e-book based on this article series. I went through all the one-liners, improved explanations, fixed mistakes and typos, added a bunch of new one-liners, added an introduction to Perl one-liners and a new chapter on Perl's special variables. Please take a look:

Have Fun!

Thanks for reading the article! In the next part I am releasing the perl1line.txt that will contain all the one-liners in a single file.

Testling now has moved to Testling-CI

Don't use anything described below as it doesn't work anymore

All questions about Testling-CI:

We're happy to announce Headless Testling. Headless Testling lets you run your JavaScript tests locally through jsdom and remotely with Testling.

We put headless testling on npm so installing it is as easy as:

npm install -g testling

And now you can run your tests locally!

$ testling test.js 

Or run them on real browsers by specifying --browsers argument:

$ testling test.js --browsers=iexplore/7.0,iexplore/8.0,firefox/3.5

For example, if your test.js is this:

var test = require('testling');

test('json parse', function (t) {
    t.deepEqual(JSON.parse('[1,2]'), [1,2]);
});

Then running testling headlessly you'll get the following output:

node/jsdom                      1/1  100 % ok

But running it with --browsers=iexplore/7.0,iexplore/8.0,firefox/3.5, the test will get executed on native IE7, IE8 and Firefox 3.5:

Bundling...  done

iexplore/7.0        0/1    0 % ok
  Error: 'JSON' is undefined
    at [anonymous]() in /test.js : line: 4, column: 5
    at [anonymous]() in /test.js : line: 3, column: 29
    at test() in /test.js : line: 3, column: 1

  > t.deepEqual(JSON.parse('[1,2]'), [1,2]);

iexplore/8.0        1/1  100 % ok
firefox/3.5         1/1  100 % ok

total               2/3   66 % ok

You can follow the development of headless testling at testling's github repo.

Follow the founders of Browserling on GitHub, Twitter, Plurk, Google+ and Facebook!

And subscribe to my blog for Browserling announcements and all kinds of other awesome blog posts!

We have amazing news at Browserling. We just launched a new product called Testling! Testling is an automated cross-browser JavaScript testing tool. You write the JavaScript test, and we run it on all the browsers behind the scenes and report the results.

It's super easy to use. All you need is curl. Try this: create a file test.js with the following contents:

var test = require('testling');

test('json parse', function (t) {
    t.deepEqual(JSON.parse('[1,2]'), [1,2]);
});

And now do this:

curl -sSNT test.js,iexplore/8.0,chrome/14.0

This will run the JSON test in Internet Explorer 7, Internet Explorer 8 and Chrome 14. It's well known that IE7 does not have a JSON object so the test will fail. However IE8 and Chrome have the global JSON object and the test will succeed:

Here is the output:

Bundling...  done

iexplore/7.0        0/1    0 % ok
  Error: 'JSON' is undefined
    at [anonymous]() in /test.js : line: 4, column: 5
    at [anonymous]() in /test.js : line: 3, column: 29
    at test() in /test.js : line: 3, column: 1

  > t.deepEqual(JSON.parse('[1,2]'), [1,2]);

iexplore/8.0        1/1  100 % ok
chrome/14.0         1/1  100 % ok

total               2/3   66 % ok

It precisely shows what the error was on IE7 - JSON is undefined in test.js, on line 4.

Testling supports all the major browsers:

  • Internet Explorer 6, 7, 8 and 9 (all native).
  • Chrome 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.
  • Firefox 3.0, 3.5, 3.6, 4.0, 5.0, 6.0, 7.0.
  • Opera 10.0, 10.5, 11.0, 11.5.
  • Safari 5.0.5, 5.1.

When you run a test, you can specify which browsers to run tests on via the ?browsers query parameter. For example, to run the test on IE6, Opera 11 and Safari 5.1, do this:

curl -sSNT test.js,opera/11.0,safari/5.1

You can script interactions with websites, too. Here is an example that uses jQuery to submit a login form and compare the greeting text:

var test = require('testling');

test('testling login test', function (t) {
    t.createWindow('', function (win, $) {

        var form = $('#form')[0];

        t.submitForm(form, function (win, $) {
            t.ok($('#welcome p:first').text() === "Login successful.", "login failed");
        });
    });
});

Submitting a form only works in IE8, IE9, Chrome 7-14 and all Firefox browsers. Opera, IE6 and IE7 can't do that yet because of implementation limitations, but we're solving this problem right now.

See the Testling documentation for detailed information on how to write tests!

This is the world's best introduction to sed - the superman of UNIX stream editing. Originally I wrote this introduction for my second e-book, however later I decided to make it a part of the free e-book preview and republish it here as this article.

Introduction to sed

Mastering sed can be reduced to understanding and manipulating the four spaces of sed. These four spaces are:

  • Input Stream
  • Pattern Space
  • Hold Buffer
  • Output Stream

Think about the spaces this way - sed reads the input stream and produces the output stream. Internally it has the pattern space and the hold buffer. Sed reads data from the input stream until it finds the newline character \n. Then it places the data read so far, without the newline, into the pattern space. Most of the sed commands operate on the data in the pattern space. The hold buffer is there for your convenience. Think about it as a temporary buffer. You can copy or exchange data between the pattern space and the hold buffer. Once sed has executed all the commands, it outputs the pattern space and adds a \n at the end.

It's possible to modify the behavior of sed with the -n command line switch. When -n is specified, sed doesn't output the pattern space and you have to explicitly print it with either p or P commands.

Let's look at several examples to understand the four spaces and sed. These are just examples to illustrate what sed looks like and what it's all about.

Here is the simplest possible sed program:

sed 's/foo/bar/'

This program replaces the first occurrence of the text "foo" with "bar" on every line. Here is how it works. Suppose you have a file with these lines:

abc
foo
123-foo-456

Sed opens the file as the input stream and starts reading the data. After reading "abc" it finds a newline \n. It places the text "abc" in the pattern space and now it applies the s/foo/bar/ command. Since we have "abc" in the pattern space and there is no "foo" anywhere, sed does nothing to the pattern space. At this moment sed has executed all the commands (in this case just one). The default action when all the commands have been executed is to print the pattern space, followed by newline. So the output from the first line is "abc\n".

Now sed reads in the second line "foo" and executes s/foo/bar/. This replaces "foo" with "bar". The pattern space now contains just "bar". The end of the script has been reached and sed prints out the pattern space, followed by newline. The output from the second line is "bar\n".

Now the 3rd line is read in. The pattern space is now "123-foo-456". Since there is "foo" in the text, the s/foo/bar/ is successful and the pattern space is now "123-bar-456". The end is reached and sed prints the pattern space. The output is "123-bar-456\n".

All the lines of the input have been read at this moment and sed exits. The output from running the script on our example file is:

abc
bar
123-bar-456

In this example we never used the hold buffer because there was no need for temporary storage.
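You can reproduce this walk-through without creating a file by piping the three sample lines ("abc", "foo" and "123-foo-456") into sed. The sample_input function is just a convenience I made up for this sketch:

```shell
# sample_input is a hypothetical helper that prints the three example lines.
sample_input() {
    printf 'abc\nfoo\n123-foo-456\n'
}

sample_input | sed 's/foo/bar/'
# prints:
# abc
# bar
# 123-bar-456
```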

Before we look at an example with temporary storage, let's take a look at three command line switches: -n, -e and -i. First -n.

If you specify -n to sed, like this:

sed -n 's/foo/bar/'

Then sed will no longer print the pattern space when it reaches the end of the script. So if you run this program on our sample file above, there will be no output. You must use sed's p command to force sed to print the line:

sed -n 's/foo/bar/; p'

As you can see, sed commands are separated by the ; character. You can also use the -e switch to separate the commands:

sed -n -e 's/foo/bar/' -e 'p'

It's the same as if you used ;. Next, let's take a look at the -i command line argument. This one forces sed to do in-place editing of the file, meaning it reads the contents of the file, executes the commands, and places the new contents back in the file.
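Here is a short sketch that shows the three variants of -n next to each other, using the same three sample lines as before (sample_input is a made-up helper):

```shell
# sample_input is a hypothetical helper that prints the three example lines.
sample_input() {
    printf 'abc\nfoo\n123-foo-456\n'
}

sample_input | sed -n 's/foo/bar/'               # prints nothing
sample_input | sed -n 's/foo/bar/; p'            # prints all three lines
sample_input | sed -n -e 's/foo/bar/' -e 'p'     # same output as the previous line
```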

Here is an example. Suppose you have a file called "users", with the following content:


And you wish to replace the ":" symbol with ";" in the whole file. Then you can do it as easily as:

sed -i 's/:/;/g' users

It will silently execute the s/:/;/g command on all lines in the file and do all substitutions. Be very careful when using -i as it's destructive and it's not reversible! It's always safer to run sed without -i, and then replace the file yourself.

Alternatively you can specify a file extension to the -i switch. This way sed will make a backup copy of the file before it makes in-place modifications.

For example, if you specify -i.bak, like this:

sed -i.bak 's/:/;/g' users

Then sed will create users.bak before modifying the contents of users file.
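If you want to try in-place editing safely, do it in a scratch directory first. The sketch below makes up a tiny "users" file (its contents are purely hypothetical), runs sed with -i.bak, and shows both the modified file and the backup. It uses the g flag so that every ":" on a line is replaced, not just the first one:

```shell
# inplace_demo is a hypothetical helper; the file contents are invented
# for illustration only. Everything happens in a throwaway directory.
inplace_demo() {
    workdir=$(mktemp -d) || return 1
    printf 'alice:1000:/home/alice\nbob:1001:/home/bob\n' > "$workdir/users"

    sed -i.bak 's/:/;/g' "$workdir/users"

    echo 'users:'
    cat "$workdir/users"        # modified in place
    echo 'users.bak:'
    cat "$workdir/users.bak"    # untouched backup copy

    rm -rf "$workdir"
}

inplace_demo
```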

Actually, before we look at the hold buffer, let's take a look at addresses and ranges. Addresses allow you to restrict sed commands to certain lines, or ranges of lines.

The simplest address is a single number that limits sed commands to the given line number:

sed '5s/foo/bar/'

This limits the s/foo/bar/ only to the 5th line of file or input stream. So if there is a "foo" on the 5th line, it will be replaced with "bar". No other lines will be touched.

The addresses can also be inverted with the ! after the address. To match all lines that are not the 5th line (lines 1-4, plus lines 6-...), do this:

sed '5!s/foo/bar/'

The inversion can be applied to any address.

Next, you can also limit sed commands to a range of lines by specifying two numbers, separated by a comma:

sed '5,10s/foo/bar/'

In this one-liner the s/foo/bar/ is executed only on lines 5 - 10, inclusive. Here is a quick, useful one-liner. Suppose you want to print lines 5 - 10 in the file. You can first disable implicit line printing with the -n command line switch, and then use the p command on lines 5 - 10:

sed -n '5,10p'

This will execute the p command only on lines 5 - 10. No other lines will be output. Pretty neat, isn't it?

There is a special address $ that matches the last line of the file. Here is an example that prints the last line of the file:

sed -n '$p'

As you can see, the p command has been limited to $, which is the last line of input.
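The numeric addresses are easy to experiment with on generated input. In this sketch I use line 2 instead of line 5 to keep the input short, and I assume the seq utility is available to produce numbered lines:

```shell
# A single line number, and its inversion with !
printf 'foo\nfoo\nfoo\n' | sed '2s/foo/bar/'      # only line 2 becomes bar
printf 'foo\nfoo\nfoo\n' | sed '2!s/foo/bar/'     # every line except line 2 becomes bar

# A range of lines, and the special last-line address $
seq 1 20 | sed -n '5,10p'                         # prints 5 through 10
seq 1 20 | sed -n '$p'                            # prints 20
```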

Next, there is also a single regular expression address match like this /regex/. If you specify a regex before a command, then the command will only get executed on lines that match the regex. Check this out:

sed -n '/a\+b\+/p'

Here the p command will get called only on lines that match a\+b\+ regular expression, which means one or more letters "a" followed by one or more letters "b". For example, it prints lines like "ab", "aab", "aaabbbbb", "foo-123-ab", etc. Note how the + has to be escaped. That's because sed uses basic regular expressions by default (and \+ itself is a GNU extension to them). You can enable extended regular expressions by using the -r command line switch (-E is the more portable spelling):

sed -rn '/a+b+/p'

This way you don't need to escape meta-characters like +, ( and ).
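Here is a sketch of the regex address on a few sample lines. I assume a sed that accepts -E for extended regular expressions, and I also show aa*bb*, which expresses "one or more a followed by one or more b" in plain basic regular expressions without relying on \+:

```shell
# regex_sample is a hypothetical helper that prints four test lines.
regex_sample() {
    printf 'ab\nfoo-123-ab\nba\naaabbbbb\n'
}

regex_sample | sed -En '/a+b+/p'     # extended regular expressions
regex_sample | sed -n '/aa*bb*/p'    # the same match with basic regular expressions
# both print:
# ab
# foo-123-ab
# aaabbbbb
```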

There is also an expression to match a range between two regexes. Here is an example,

sed '/foo/,/bar/d'

This one-liner matches all lines between the first line that matches "/foo/" regex and the first line that matches "/bar/" regex, inclusive. It applies the d command that stands for delete. In other words, it deletes a range of lines between the first line that matches "/foo/" and the first line after "/foo/" that matches "/bar/", inclusive.
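A tiny experiment makes the inclusive range visible. The input lines here are made up for the sketch:

```shell
# Lines from the first /foo/ match through the next /bar/ match are deleted.
printf 'one\nfoo\ntwo\nbar\nthree\n' | sed '/foo/,/bar/d'
# prints:
# one
# three
```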

Now let's take a look at the hold buffer. Suppose you have a problem where you want to print the line before the line that matches a regular expression. How do you do this? If sed didn't have a hold buffer, things would be tough, but with the hold buffer we can always save the current line to the hold buffer, and then let sed read in the next line. Now if the next line matches the regex, we would just print the hold buffer, which holds the previous line. Easy, right?

The command for copying the current pattern space to the hold buffer is h. The command for copying the hold buffer back to the pattern space is g. The command for exchanging the hold buffer and the pattern space is x. We just have to choose the right commands to solve this problem. Here is the solution:

sed -n '/regex/{x;p;x}; h'

It works this way - every line gets copied to the hold buffer with the h command at the end of the script. However, for every line that matches the /regex/, we exchange the hold buffer with the pattern space by using the x command, print it with the p command, and then exchange the buffers back, so that if the next line matches the /regex/ again, we could print the current line.

Also notice the command grouping. Several commands can be grouped and executed only for a specific address or range. In this one-liner the command group is {x;p;x} and it gets executed only if the current line matches /regex/.

Note that this one-liner doesn't work if the first line of the input matches /regex/. To fix this, we can limit the p command to all lines that are not the first line with the 1! inverted address match:

sed -n '/regex/{x;1!p;x}; h'

Notice the 1!p. This says - call the p command on all the lines that are not the 1st line. This prevents anything from being printed in case the first line matches /regex/.
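Here is the hold buffer one-liner in action on a few made-up lines, with /match/ standing in for /regex/. The print_line_before name is hypothetical, and I add a ; before the closing } because some sed implementations require it:

```shell
# print_line_before is a hypothetical helper; /match/ stands in for /regex/.
# h saves every line to the hold buffer; on a matching line, x;1!p;x prints
# the previously saved line (except when the match is on line 1).
print_line_before() {
    sed -n '/match/{x;1!p;x;}; h'
}

printf 'one\ntwo\nmatch here\nthree\n' | print_line_before    # prints: two
printf 'match first\nnext\n' | print_line_before              # prints nothing
```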

Well, that's it! I think this introduction explains the most important concepts in sed, including various command line switches, the four spaces and various sed commands.

If you wish to learn more, I suggest you get a copy of my "Sed One-Liners Explained" e-book. The e-book contains exactly 100 well-explained one-liners. Once you work through them, you'll have rewired your brain to "think in sed". In other words, you'll have learned how to manipulate the pattern space, the hold buffer and you'll know when to print the data to get the results that you need.

Have fun!

If you enjoy my writing, you can subscribe to my blog, follow me on Twitter or Google+.

This article is part of the article series "Sed One-Liners Explained."

I love writing about programming and I am happy to announce my second e-book called "Sed One-Liners Explained". This book is based on my popular "Sed One-Liners Explained" article series that has been read over 500,000 times.

I reviewed all the one-liners in the series, fixed various mistakes, greatly improved the explanations, added a bunch of new one-liners, bringing the total count to 100, and added three new chapters - an introduction to sed, a summary of sed addresses and ranges, and a chapter on debugging sed scripts with sed-sed.

Table of Contents

The e-book is 98 pages long and it explains exactly 100 one-liners. It's divided into the following chapters:

  • Preface.
  • 1. Introduction to sed.
  • 2. Line Spacing.
  • 3. Line Numbering.
  • 4. Text Conversion and Substitution.
  • 5. Selective Printing of Certain Lines.
  • 6. Selective Deletion of Certain Lines.
  • 7. Special sed Applications.
  • Appendix A. Summary of All sed Commands.
  • Appendix B. Addresses and Ranges.
  • Appendix C. Debugging sed Scripts with sed-sed.
  • Index.

What's sed?

Sed is the superman of UNIX stream editing. It's a small utility that's present on every UNIX system and it transforms one stream of text into another. Let's take a look at several practical examples that sed can carry out easily. All these examples and many more are explained in the e-book.

I have also made the first chapter of the book, Introduction to sed, freely available. Please download the e-book preview to read it. The introductory chapter explains the general principles of sed, introduces the four spaces of sed, addresses and ranges, and various command line flags.

Example 1: Replace "lamb" with "goat" on every line

sed 's/lamb/goat/'

This one-liner uses the famous s/.../.../ command. The s command substitutes the text in the first part of the command with the text in the second part. In this one-liner it replaces "lamb" with "goat".

A very detailed explanation of how sed reads the lines, how it executes the commands and how the printing happens is presented in the freely available introduction chapter. Please take a look.

Example 2: Replace only the second occurrence of "lamb" with "goat" on every line

sed 's/lamb/goat/2'

Sed is the only tool that I know that takes a numeric argument to the s command. The numeric argument, in this case 2, specifies which occurrence of the text to replace. In this example only the 2nd occurrence of "lamb" gets replaced with "goat".
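A quick comparison of Example 1 and Example 2 on the same made-up input line shows what the numeric argument changes:

```shell
echo 'lamb lamb lamb' | sed 's/lamb/goat/'      # prints: goat lamb lamb
echo 'lamb lamb lamb' | sed 's/lamb/goat/2'     # prints: lamb goat lamb
```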

Example 3: Number the lines in a file

sed = file | sed 'N; s/\n/: /'

This one-liner is actually two one-liners. The first one uses the = command that inserts a line containing the line number before every original line in the file. Then this output gets piped to the second sed command that joins two adjacent lines with the N command. When joining lines with the N command, a newline character \n is placed between them. Therefore it uses the s command to replace this newline \n with a colon followed by a space ": ".

So for example, if the file contains lines:

hello world
good job
sunny day

Then after running the one-liner, the result is going to be:

1: hello world
2: good job
3: sunny day
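The example can be reproduced without a file by piping the three lines in. In this sketch the first sed reads standard input instead of a file, and number_lines is a name I made up:

```shell
# number_lines is a hypothetical wrapper around the two-stage one-liner.
# The first sed emits "N" and "line" on alternating lines; the second joins each pair.
number_lines() {
    sed = | sed 'N; s/\n/: /'
}

printf 'hello world\ngood job\nsunny day\n' | number_lines
```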

Example 4: Delete every 2nd line

sed 'n;d'

This one-liner uses the n command that prints the current line (actually the current pattern space, see the introduction chapter for in-depth explanation), deletes it, and reads the next line. Then sed executes the d command that deletes the current line without printing. This way the 1st line gets printed, the 2nd line gets deleted, then the 3rd line gets printed again, then the 4th gets deleted, etc.
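You can watch the alternation on numbered input. This sketch assumes the seq utility is available and uses an even number of lines to keep the example simple:

```shell
seq 1 6 | sed 'n;d'
# prints:
# 1
# 3
# 5
```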

Example 5: ROT 13 encode every line

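sed 'y/abcdefghijklmnopqrstuvwxyz/nopqrstuvwxyzabcdefghijklm/
     y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/NOPQRSTUVWXYZABCDEFGHIJKLM/'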

Here the y/set1/set2/ command is used. The y command substitutes elements in the set1 with the corresponding elements in the set2. The first y command replaces all lowercase letters with their 13-char-shifted counterparts, and the second y command does the same for the uppercase letters. So for example, character a gets replaced by n, b gets replaced by o, character Z gets replaced by M, etc.
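Here is a runnable sketch of the two y commands as described, one for lowercase and one for uppercase letters, wrapped in a made-up rot13 function. Since ROT13 is its own inverse, applying it twice gives back the original text:

```shell
# rot13 is a hypothetical wrapper; each y command maps a letter to the
# letter 13 positions further along in the alphabet.
rot13() {
    sed 'y/abcdefghijklmnopqrstuvwxyz/nopqrstuvwxyzabcdefghijklm/
         y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/NOPQRSTUVWXYZABCDEFGHIJKLM/'
}

echo 'Hello' | rot13            # prints: Uryyb
echo 'Hello' | rot13 | rot13    # prints: Hello
```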

Sed is actually very powerful. It's as powerful as a Turing machine, meaning you can write any computer program in it. Check out these programs written in sed. Run them as sed -f file.sed:

After you read the e-book you'll be able to understand all these complex programs!

Book Preview

See the quality of my work before you buy the e-book. I have made the first chapter, Introduction to sed, freely available. The preview also includes the full table of contents, preface and the first page of chapter two.

Buy it now!

The price of the e-book is $9.95 and it can be purchased via PayPal:

After you have made the payment, my automated e-book processing system will send you the PDF e-book in a few minutes!

Tweet about my book!

Help me spread the word about my new book! I prepared a special link that you can use to tweet about it:

What's next?

I am not stopping here. I love writing about programming and my next book is going to be "Perl One-Liners Explained", based on my "Perl One-Liners Explained" article series. Expect this book in a few months!


Enjoy the book and don't forget to leave comments about it!

Also if you're interested, take a look at my first e-book called "Awk One-Liners Explained". It's written in the same style as this e-book and it teaches practical Awk through many examples.

Finally, if you enjoy my blog, you can subscribe to my blog, follow me on Twitter or Google+.