This is the introduction to perl one-liners. Originally I wrote this introduction for my third e-book, however later I decided to make it a part of the free e-book preview and republish it here as this article.

Introduction to Perl one-liners

Perl one-liners are small and awesome Perl programs that fit in a single line of code and they do one thing really well. These things include changing line spacing, numbering lines, doing calculations, converting and substituting text, deleting and printing certain lines, parsing logs, editing files in-place, doing statistics, carrying out system administration tasks, updating a bunch of files at once, and many more. Perl one-liners will make you the shell warrior. Anything that took you minutes to solve, will now take you seconds!

Let's look at several examples to get more familiar with one-liners. Here is one:

perl -pi -e 's/you/me/g' file

This one-liner replaces all occurrences of the text you with me in the file file. Very useful if you ask me. Imagine you are on a remote server and have this file and you need to do the replacement. You can either open it in text editor and execute find-replace or just do it through command line and, bam, be done with it.

The -e argument is the best argument. It allows you to specify the Perl code to be executed right on the command line. In this one-liner the code says, do the substitution (s/find/replace/flags command) and replace you with me globally (g flag). The -p argument makes sure the code gets executed on every line, and that the line gets printed out after that. The -i argument makes sure that file gets edited in-place, meaning Perl opens the file, executes the substitution for each line, prints the output to a temporary file, and then replaces the original file.

How about doing the same replacement in multiple files? Just specify them on the command line!

perl -pi -e 's/you/me/g' file1 file2 file3

Now let's do the same replacement only on lines that match we. It's as simple as this:

perl -pi -e 's/you/me/g if /we/' file

Here we use the conditional if /we/. This makes sure that s/you/me/g gets executed only on lines that match the regular expression /we/. The regular expression here can be anything. Let's say you want to execute the substitution only on lines that have digits in them. You can then use the /\d/ regular expression that matches numbers:

perl -pi -e 's/you/me/g if /\d/' file

Now how about finding all repeated lines in a file?

perl -ne 'print if $a{$_}++' file

This one-liner records the lines seen so far in the %a hash and keeps the counter of how many times it has seen the same line. The $a{$_}++ creates elements in the %a hash automagically. When it sees a repeated line, the value of that hash element is defined and greater than zero, so if $a{$_} is true, and it prints the line. This one-liner also uses the -n command line argument that loops over the input but unlike -p doesn't print the lines automatically, so you have to use print explicitly.

How about numbering lines? Super simple! Perl has the $. special variable that automatically maintains the current line number. You can just print it out together with the line:

perl -ne 'print "$. $_"'

You can also achieve the same by using -p argument and modifying the $_ variable, which contains the entire line:

perl -pe '$_ = "$. $_"'

Here each line gets replaced by the string "$. $_", which is the current line number followed by the line itself.

How about we combine the previous two one-liners and create one that numbers repeated lines? Here we go:

perl -ne 'print "$. $_" if $a{$_}++'

Now let's do something different. Let's sum up all the numbers on each line. We'll use the sum function from the List::Util CPAN module. You can install it as easily as running perl -MCPAN -e'install List::Util'.

perl -MList::Util=sum -alne 'print sum @F'

The -MList::Util command line argument imports the List::Util module, and the =sum part of it imports the sum function from it. Next -a enables automatic splitting of fields into the @F array. The -l argument makes sure that print outputs a newline at the end of each line. Finally the sum @F sums up all the elements in the @F list and print prints it out, followed by a newline (that was added by the -l argument).

How about finding the date 1299 days ago? That's also solvable by a simple one-liner:

perl -MPOSIX -le '
  @now = localtime;
  $now[3] -= 1299;
  print scalar localtime mktime @now
'

This one-liner didn't quite fit in one line, but that's just because my blog has narrow space for content. What happens here is we modify the 4th element of the structure returned by localtime, which happens to be days. So we just subtract 1299 days from the current day. Then we reassembles it into a new time through localtime mktime @now and print it in scalar context that prints human readable time.

How about generating an 8 letter password? Again, solvable elegantly with a one-liner:

perl -le 'print map { (a..z)[rand 26] } 1..8'

The a..z generates a list of letters from a to z (total 26). Then we randomly choose one of the characters through generating a random number in range 0-25, and then we repeat this whole process 8 times!

Here is another one. Suppose you want to quickly find the decimal number that corresponds to an IP address. You can use the unpack function and find it really quickly:

perl -le 'print unpack("N", 127.0.0.1)'

This one-liner uses a vstring, which is a version literal. The IP address 127.0.0.1 is treated as a vstring, which is basically the numbers 127, 0, 0, 1 concatenated together. Next the unpack function unpacks them to a single decimal number.

Btw, I once created a cheat sheet with all the pack/unpack format strings. Get it here: perl pack, unpack cheat sheet.

Now how about calculations? Let's find the sum of the numbers in the first column:

perl -lane '$sum += $F[0]; END { print $sum }'

Here the lines automatically get split up into fields through the -a argument. The fields can be now accessed through the @F array. The first element of the array, $F[0], is the first column. So all we have to do is sum all the columns up through $sum += $F[0]. When the Perl program ends its job, it executes any code in the special END block. In this case we print out the total sum there. Really easy!

Now let's find out how many packets have gone through all the iptables rules. It's this simple:

iptables -L -nvx | perl -lane '
  $_ = $F[0]; $pkts += $_ if /^\d/; END { print $pkts }
'

The iptables program outputs the packets as the first field. All we do is check if the first field is numeric (because it also outputs labels header), and if so, sum the packets up on the first column, just like in the previous one-liner.

How about getting a list of the names of all users on the system?

perl -a -F: -lne 'print $F[4]' /etc/passwd

Combining -a with -F argument allows you to specify the character that lines should be split on. In this case it's the colon, which is the record separator of /etc/passwd. Next we just print the 5th field $F[4], which is the real name of the user. Really quick and easy.

As you can see, knowing Perl one-liners lets you accomplish many tasks quickly. If you wish to learn more and become really fast in the shell, I suggest you get a copy of my "Perl One-Liners Explained" e-book. The e-book contains 130 unique one-liners and many of them are presented in several different ways, so the total number of one-liners in this book is over 200.

If you enjoy my writing, you can subscribe to my blog, follow me on Twitter or Google+.

I remembered about quines today and as I hadn't ever written one before, I decided to write a quine in javascript for node.js. A quine is a computer program which takes no input and produces a copy of its own source code as its only output.

Turns out it was easier than expected. The key idea in writing a quine is having data that contains the whole program, which is then modified to match the program and printed.

Here is how my quine for node.js looks:

var data = "module.exports = function () { console.log(\"var data = \\\"%s\\\"\\n%s\", data.replace(/\\\\/g, \"\\\\\\\\\").replace(/\"/g, \"\\\\\\\"\"), data.replace(/{/, \"{\\n   \").replace(/}$/, \"\\n}\")); }"
module.exports = function () {
    console.log("var data = \"%s\"\n%s", data.replace(/\\/g, "\\\\").replace(/"/g, "\\\""), data.replace(/{/, "{\n   ").replace(/}$/, "\n}"));
}

If you require this module and call the exported function, you get the same module back:

> require('./index.js')()

Output:

var data = "module.exports = function () { console.log(\"var data = \\\"%s\\\"\\n%s\", data.replace(/\\\\/g, \"\\\\\\\\\").replace(/\"/g, \"\\\\\\\"\"), data.replace(/{/, \"{\\n   \").replace(/}$/, \"\\n}\")); }"
module.exports = function () {
    console.log("var data = \"%s\"\n%s", data.replace(/\\/g, "\\\\").replace(/"/g, "\\\""), data.replace(/{/, "{\n   ").replace(/}$/, "\n}")); 
}

Node-quine on my github: https://github.com/pkrumins/node-quine.

Update

Turns out a quine in node.js that prints itself can be written much simpler this way (by maciejmalecki):

function quine() { console.log(quine.toString()) }

SubStack came up with this solution:

(function f() { console.log('(' + f.toString() + ')()') })()

And here is Erik Lundin's quine:

module.exports = function () {
    console.log('module.exports = ' + arguments.callee.toString());
}

Isaacs offers this version of qunie that adds cli support. However this doesn't count as quine is not allowed to read the source file of itself. But I'm including it here anyway as it's fun:

var data = require('fs').readFileSync(module.filename, 'utf8');
module.exports = function () {
    console.log(data);
};
if (module === require.main) module.exports();

Hooray for quines!

I was re-reading Hacker's Delight and on page 202 I found a poem about division that I had forgotten about.

I think that I shall never envision
An op unlovely as division.

An op whose answer must be guessed
And then, through multiply, assessed;

An op for which we dearly pay,
In cycles wasted every day.

Division code is often hairy;
Long division's downright scary.

The proofs can overtax your brain,
The ceiling and floor may drive you insane.

Good code to divide takes a Knuthian hero,
But even God can't divide by zero!

Henry S. Warren, author of Hacker's Delight.

I was implementing password authentication for VNC in node.js the other day and faced a problem where it would just never successfully authenticate. I checked my implementation several times and it seemed fine. Then I tried to implement it in Python just to see if I was missing something obvious. It also didn't work. Then I tried doing the authentication through openssl and it was also giving a result I didn't expect. Finally I found some old C code and wrote node-des, which worked. I still haven't figured out why it didn't work in node, python and openssl. Perhaps my readers can help me understand what was going on!

Here is a quick overview of how VNC password authentication works. On the VNC server you set the authentication password. When a VNC client connects, the VNC server generates random 16 byte garbage called challenge and sends it to the client. The client DES encrypts the challenge with the password as the key. The result is called the response and is also 16 bytes in size. The client then sends it to the VNC server. The VNC server then also DES encrypts challenge with the password and compares it with the response. If challenge matches response, the session gets authenticated.

Let's say the password set on the VNC server is browsers, and the challenge that the VNC server sends to the client is the following 16 bytes:

0c fd 6b 8c 5c 04 b3 e5 01 40 b9 de 33 9e 0d db

Then DES encrypting these 16 bytes with the password browsers should produce the following response:

44 ee fc 48 83 bb 95 f1 c0 82 c3 e2 98 c3 6a 4a

I sniffed this data from a real VNC client - VNC server communication and I use this as a test case for node-des as it's the correct result. However now let's look at what happens if I encrypt the challenge with node's crypto module, with Python and with openssl.

First node's crypto module. I wrote the following program to DES encrypt the 16 challenge challenge bytes with the password browsers.

var crypto = require('crypto');
var challenge = Buffer([
   0x0c, 0xfd, 0x6b, 0x8c,
   0x5c, 0x04, 0xb3, 0xe5,
   0x01, 0x40, 0xb9, 0xde,
   0x33, 0x9e, 0x0d, 0xdb
]);

var response = crypto
  .createCipher('des', 'browsers')
  .update(challenge);

console.log(Buffer(response));

When I run it, it produces 27 byte long output and it's not what I'd expect:

<Buffer c3 a9 72 c2 8b c3 a2 10 56 c2 8f c3 94 12 c2 a3 c3 a3 33 c2 ba c3 9e c2 a7 c2 96>

So node.js doesn't produce the expected 16 byte result. In fact I tried all the possible variations of the des algorithm that openssl (which node.js bases its crypto module on) supports. Perhaps I chose the wrong des implementation as there are many?

var crypto = require('crypto');
var challenge = Buffer([
   0x0c, 0xfd, 0x6b, 0x8c,
   0x5c, 0x04, 0xb3, 0xe5,
   0x01, 0x40, 0xb9, 0xde,
   0x33, 0x9e, 0x0d, 0xdb
]);

var algos = ("des des-cbc des-cfb des-ecb des-ede " +
            "des-ede-cbc des-ede-cfb des-ede-ofb " +
            "des-ede3 des-ede3-cbc des-ede3-cfb " +
            "des-ede3-ofb des-ofb des3 desx").split(/\s+/);

algos.forEach(function (algo) {
    console.log(algo);
    try { 
        var response = crypto
          .createCipher(algo, 'browsers')
          .update(challenge)

        console.log(Buffer(response));
    }
    catch (x) {
        console.log("Crypto doesn't support:", algo);
    }
});

It produced the following output and not a single result was what I'd expect:

des
<Buffer c3 a9 72 c2 8b c3 a2 10 56 c2 8f c3 94 12 c2 a3 c3 a3 33 c2 ba c3 9e c2 a7 c2 96>
des-cbc
<Buffer c3 a9 72 c2 8b c3 a2 10 56 c2 8f c3 94 12 c2 a3 c3 a3 33 c2 ba c3 9e c2 a7 c2 96>
des-cfb
<Buffer 24 43 c2 99 11 c2 8c c3 ac c3 b2 42 c3 b1 77 c3 a2 c2 96 05 c3 b3 c3 94 02>
des-ecb
<Buffer c2 aa 09 c3 98 c3 92 c2 8c 69 68 c2 97 c2 88 09 c2 ae c3 89 4f 3f c3 9f c3 83>
des-ede
<Buffer c2 84 c2 86 3f c3 ba 2b c2 ba 4b c2 bd 50 05 c3 a8 c3 9d 37 66 15 51>
des-ede-cbc
<Buffer 7d 60 c2 8d 4e 27 c3 b2 0f 6f 0e 48 54 60 38 2e c2 b3 7e>
des-ede-cfb
<Buffer 43 4d c2 ac 19 c3 92 c2 81 c2 a2 3f c2 9a c2 be c2 8a c2 b3 72 44 37 34>
des-ede-ofb
<Buffer 43 4d c2 ac 19 c3 92 c2 81 c2 a2 3f 49 c3 a4 04 24 2c 77 05 40>
des-ede3
<Buffer c2 a4 c2 84 c2 b2 2e 45 c2 90 c2 af c2 9c 21 5b c2 bd c3 aa c2 84 c3 9b 4c 76>
des-ede3-cbc
<Buffer c3 81 5c c2 b9 28 c2 be c3 b6 74 c3 89 c3 89 c2 8c 62 c2 ab 3a c2 95 6c 3f>
des-ede3-cfb
<Buffer 3f c3 8d 22 c3 8e 5d 01 29 c2 8b c2 85 c2 b5 11 75 6f c2 b5 5b c3 9c>
des-ede3-ofb
<Buffer 3f c3 8d 22 c3 8e 5d 01 29 c2 8b c2 8c 64 51 c2 8e c2 a7 2f c2 a8 c2 97>
des-ofb
<Buffer 24 43 c2 99 11 c2 8c c3 ac c3 b2 42 09 68 1b c3 bd 2d 79 c2 ba c3 a1>
des3
<Buffer c3 81 5c c2 b9 28 c2 be c3 b6 74 c3 89 c3 89 c2 8c 62 c2 ab 3a c2 95 6c 3f>
desx
<Buffer 4c 45 c2 bb c2 bd 78 c2 bd c3 b5 c2 b9 c3 96 c3 b7 c3 b0 6d c3 b9 c3 8d c3 a9 2d>

I had no idea what was going on at this point, so I decided to try Python and see if I can get the result I expect through Python's DES algorithm. Worst case I could spawn Python each time I need to authenticate at VNC. I used the pyDes module for my experiment and installed it to virtualenv as following:

$ virtualenv pydes
$ cd pydes
$ . ./bin/activate
$ easy_install pydes

Then I wrote the following test program:

from pyDes import des

challenge = "\x0c\xfd\x6b\x8c" \
            "\x5c\x04\xb3\xe5" \
            "\x01\x40\xb9\xde" \
            "\x33\x9e\x0d\xdb";

response = des("browsers").encrypt(challenge);

print ["0x%x" % ord(b) for b in response]

The output was 16 bytes but not what I'd expect:

['0xd0', '0x6f', '0x0', '0x1a', '0x73', '0xb8', '0x24', '0xc6', '0xf1', '0x72', '0x81', '0x6e', '0x6d', '0x59', '0xc8', '0xa7']

Then I thought, perhaps the endianness was incorrect, so I reversed the words in the challenge:

from pyDes import des

challenge = "\xfd\x0c\x8c\x6b" \
            "\x04\x5c\xe5\xb3" \
            "\x40\x01\xde\xb9" \
            "\x9e\x33\xdb\x0d";

response = des("browsers").encrypt(challenge);

print ["0x%x" % ord(b) for b in response]

But no, this was also giving some other result that wouldn't work with VNC:

['0x2e', '0xe8', '0xec', '0x7c', '0xde', '0xf7', '0x36', '0x56', '0xf0', '0x60', '0xbe', '0x6f', '0xb', '0xf8', '0x9b', '0x76']

Being totally clueless, I decided to use the plain command line openssl tool just to see if node.js or Python were messing up the binary data or something like that. As I had sniffed the challenge from the VNC session, I put it in a file challenge and ran the openssl tool as following:

$ openssl des -pass pass:browsers -in challenge -e -nosalt | hexdump
0000000 72e9 e28b 5610 d48f a312 33e3 deba 96a7
0000010 d253 826d 9b86 3620                    

The output was again different, 24 bytes in size and it didn't match the response.

At this point I had had enough with all these different results and I started searching the web for the simplest possible DES implementation as I wasn't up to implementing DES myself. I found some C code that was public domain and wrote node-des that produces the results that I expect and that VNC server accepts.

var des = require('../build/Release/des.node');
var challenge = Buffer([
   0x0c, 0xfd, 0x6b, 0x8c,
   0x5c, 0x04, 0xb3, 0xe5,
   0x01, 0x40, 0xb9, 0xde,
   0x33, 0x9e, 0x0d, 0xdb
]);

var response = des.encrypt('browsers', challenge);

console.log(response);

Output:

$ node des.js
<SlowBuffer 44 ee fc 48 83 bb 95 f1 c0 82 c3 e2 98 c3 6a 4a>

Victory! It worked and my VNC code was functional! The question is, why did every single attempt to DES encrypt with node, python and openssl fail? Any ideas? Let's get to the bottom of this in the comments!

Update

One of my readers noticed that many of the words in node's output were of the form cx xx, so he asked me if I was sure that node wasn't converting the output to utf8? I was pretty sure it wasn't, buffers are buffers of raw bytes after all, but this looked too suspicious. I looked up in the documentation and found out that the default encoding for the buffers was utf8...

So I modified the original program and replaced console.log(Buffer(response)) with console.log(Buffer(response, 'binary')).

var crypto = require('crypto');
var challenge = Buffer([
   0x0c, 0xfd, 0x6b, 0x8c,
   0x5c, 0x04, 0xb3, 0xe5,
   0x01, 0x40, 0xb9, 0xde,
   0x33, 0x9e, 0x0d, 0xdb
]);

var response = crypto
  .createCipher('des', 'browsers')
  .update(challenge);

console.log(Buffer(response, 'binary'));

Now I do get 16 bytes in the output but it still doesn't match the sniffed response:

<Buffer e9 72 8b e2 10 56 8f d4 12 a3 e3 33 ba de a7 96>

I also modified the program that runs through all the DES algorithms by adding the 'binary' encoding to the Buffer's constructor. The output now is 16 bytes for all algorithms:

des
<Buffer e9 72 8b e2 10 56 8f d4 12 a3 e3 33 ba de a7 96>
des-cbc
<Buffer e9 72 8b e2 10 56 8f d4 12 a3 e3 33 ba de a7 96>
des-cfb
<Buffer 24 43 99 11 8c ec f2 42 f1 77 e2 96 05 f3 d4 02>
des-ecb
<Buffer aa 09 d8 d2 8c 69 68 97 88 09 ae c9 4f 3f df c3>
des-ede
<Buffer 84 86 3f fa 2b ba 4b bd 50 05 e8 dd 37 66 15 51>
des-ede-cbc
<Buffer 7d 60 8d 4e 27 f2 0f 6f 0e 48 54 60 38 2e b3 7e>
des-ede-cfb
<Buffer 43 4d ac 19 d2 81 a2 3f 9a be 8a b3 72 44 37 34>
des-ede-ofb
<Buffer 43 4d ac 19 d2 81 a2 3f 49 e4 04 24 2c 77 05 40>
des-ede3
<Buffer a4 84 b2 2e 45 90 af 9c 21 5b bd ea 84 db 4c 76>
des-ede3-cbc
<Buffer c1 5c b9 28 be f6 74 c9 c9 8c 62 ab 3a 95 6c 3f>
des-ede3-cfb
<Buffer 3f cd 22 ce 5d 01 29 8b 85 b5 11 75 6f b5 5b dc>
des-ede3-ofb
<Buffer 3f cd 22 ce 5d 01 29 8b 8c 64 51 8e a7 2f a8 97>
des-ofb
<Buffer 24 43 99 11 8c ec f2 42 09 68 1b fd 2d 79 ba e1>
des3
<Buffer c1 5c b9 28 be f6 74 c9 c9 8c 62 ab 3a 95 6c 3f>
desx
<Buffer 4c 45 bb bd 78 bd f5 b9 d6 f7 f0 6d f9 cd e9 2d>

But still no match.

Solution

Ryan (in the comments below) found out why this was happening. Turns out that the VNC authentication reverses the order of bits in every byte of the password.

I was basing my implementation on the RFB protocol's specification and it doesn't say a word about it.

I'm quoting that document:

The server sends a random 16-byte challenge. The client encrypts the challenge with DES, using a password supplied by the user as the key, and sends the resulting 16-byte response.

There is no way I could have guessed this. How often do you think of reversing bits in every byte of the password? Never.

Many people are surprised when they hear that sed is Turing complete. How come a text filtering program is Turing complete they ask? Turns out sed is like a small assembly language that has a comparison operation, a branching operation and a temporary buffer. If you translate a problem into a textual representation then these operations are enough to make sed Turing complete!

I didn't want to invent a new proof so I'll just present a proof by Christophe Blaess. The proof may sound silly but it works. Back in the day Christophe wrote an implementation of a Turing machine in sed [download turing.sed]. Now since any programming language that can implement a Turing machine is Turing complete leads us to conclude that sed is also Turing complete. ■ Haha. That's it!

Christophe Blaess offers his own introduction to Turing machines and a description of how his sed implementation works in his article Implementation of a Turing Machine as Sed Script.

Here is an example program for his Turing machine that increments a binary number. The program has the initial tape configuration on the first line of the program, and the program itself below it. In this example the initial tape configuration is the binary number 10010111 (151) and the program increments it by one.

| 10010111

# State 0
0  R0
011R1
000R1

# State 1
1  L2
100R1
111R1

# State 2
2 1R3
201R3
210L2

# State 3
3  RF
300R3
311R3 

To run this program save it to a file increment.tm and run it through turing.sed as following:

$ sed -f turing.sed < example.tm

You'll get the following output:

(0) | |10010111
(0) |1|0010111
(1) 1|0|010111
(1) 10|0|10111
(1) 100|1|0111
(1) 1001|0|111
(1) 10010|1|11
(1) 100101|1|1
(1) 1001011|1|
(1) 10010111| |
(2) 1001011|1|
(2) 100101|1|0
(2) 10010|1|00
(2) 1001|0|000
(3) 10011|0|00
(3) 100110|0|0
(3) 1001100|0|
(3) 10011000| |
(F) 10011000 | |
Final state F reached... end of processing

The output shows the state and tape changes as the program is executed. When it's done the final state F is reached and the computation process terminates producing 10011000 (152) as a result as expected.

Since you can write any program in sed, people have done so and written tetris, sokoban and many other programs. Take a look at these:

Impressive!

The Busy Beaver

If you liked this article, you'll also like my article about the busy beaver. The busy beaver is a Turing machine that puts 1's on an initially blank tape. The busy beaver problem is to find an n-state Turing machine that puts as many 1's on the tape as possible. It hasn't been solved for a 5-state Turing machine yet and theorists doubt that it shall ever be computed for 6 states!

My sed book

I wrote this fun book on sed a while ago called "Sed One-Liners Explained." If you don't know sed yet and wish to learn it, my book teaches it through 100 practical and well-explained examples. Take a look!