This article is part of the article series "Awk One-Liners Explained."

Hello everyone! I have awesome news - I have written my first e-book ever! It's called "Awk One-Liners Explained" and it's based on the "Famous Awk One-Liners Explained" article series that I wrote here on my blog a few years ago and that has been read over 800,000 times!

I went through all the one-liners in the article series, improved the explanations, fixed the mistakes, added an introduction chapter to Awk one-liners, and two new chapters on the most commonly used Awk special variables and on idiomatic Awk.

Table of Contents

The e-book is 58 pages long and contains exactly 70 well-explained Awk one-liners. It's divided into the following chapters:

  • Preface.
  • 1. Introduction to Awk One-Liners.
  • 2. Line Spacing.
  • 3. Numbering and Calculations.
  • 4. Text Conversion and Substitution.
  • 5. Selective Printing and Deleting of Certain Lines.
  • 6. String and Array Creation.
  • Appendix A: Awk Special Variables.
  • Appendix B: Idiomatic Awk.
  • Index.

What is awk?

Awk is an awesome little program that's present on nearly every Unix machine. It's designed to carry out various text processing tasks easily, such as numbering lines, replacing certain words, and deleting or printing certain lines.

Let's take a look at several examples.

Example 1: Print the second column from a file

awk '{ print $2 }'

That's all there is to it. Awk automatically splits each line into columns and puts each column in variables $1, $2, $3, etc. This one-liner prints just the 2nd column, which is in variable $2.

You can also specify the symbol or word that you wish to split on with the -F command line switch. This switch is explained in more detail in the e-book and in the last example below.
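
As a small illustration of -F, here is a sketch that splits comma-separated input. The sample data and the variable names are made up for this example.

```shell
# Made-up sample data: name,age pairs separated by commas.
sample='alice,30
bob,25'

# -F, tells awk to split each line on commas instead of whitespace,
# so $2 is the part after the first comma.
ages=$(printf '%s\n' "$sample" | awk -F, '{ print $2 }')
echo "$ages"
```

The same pattern works with any single-character separator, such as the colon used for /etc/passwd.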

Example 2: Number lines in a file

awk '{ print NR ": " $0 }' file

The whole line itself goes into variable $0. This one-liner prints it but prepends the NR special variable and a colon ": " before it. The special variable NR always contains the current line number.

There are many other special variables and they're all explained in the e-book and summarized in the appendix.

Example 3: Count the number of words in a file

awk '{ total = total + NF } END { print total }'

Here another special variable is used: NF, which stands for the number of fields (or columns, or words) in the current line. This one-liner sums NF over all lines and prints the total in the END block, just before quitting.
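
To see NR and NF side by side, here is a minimal sketch on a three-line sample. The sample text is invented, and the "+ 0" in the END block is an addition of mine so that empty input prints 0 rather than an empty line.

```shell
# Invented sample input: three lines with 3, 2 and 1 words.
sample='one two three
four five
six'

# NR is the current line number, NF the number of fields on that line.
printf '%s\n' "$sample" | awk '{ print NR ": " NF " fields" }'

# Sum NF over all lines and print the total in the END block.
total=$(printf '%s\n' "$sample" | awk '{ total += NF } END { print total + 0 }')
echo "total words: $total"
```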

Example 4: Print only lines shorter than 64 characters

awk 'length < 64'

This one-liner uses the length function to determine the length of the current line. If the current line is less than 64 characters long, then length < 64 evaluates to true, which instructs awk to print the line.

Finally, let's take a look at an example that compares an Awk program with an equivalent C program. Suppose you want to print the list of all users on your system. With Awk it's as simple as this one-liner:

awk -F: '{ print $1 }' /etc/passwd

This one-liner says, "Take each line from /etc/passwd, split it on the colon and print the first field of each line." Very straightforward and easy to write once you know Awk!

Suppose you didn't know Awk. Then you'd have to write it in some other language, such as C. Compare the example above with the example in C language:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_LEN 1024

int main () {
    char line [MAX_LINE_LEN];
    FILE *in = fopen ("/etc/passwd", "r");
    if (!in) exit (EXIT_FAILURE);
    while (fgets(line, MAX_LINE_LEN, in) != NULL) {
        char *sep = strchr(line , ':');
        if (!sep) exit (EXIT_FAILURE);
        *sep = '\0';
        printf("%s\n", line);
    }
    fclose(in);
    return EXIT_SUCCESS ;
}

This is much longer, and you have to compile the program before you can run it. If you make a mistake, you have to fix it and recompile. That's why one-liners are called one-liners! They are short, easy to write and they do one thing really well. I am pretty sure you're starting to see how mastering Awk and one-liners can make you much more efficient when working in the shell and with text files in general.

Once you read the e-book and work through the examples, you'll be able to solve the most common text processing tasks, such as joining lines in a file, numbering lines, replacing certain words and printing certain lines.

Book preview

I prepared a book preview that contains the first 11 pages of the book. It includes the table of contents, preface, introduction and the first page of the second chapter.

Buy it now!

The price of the e-book is just $5.95 and you can buy it through PayPal.


After you have made the payment, my automated e-book processing system will send the PDF e-book to you in a few minutes!

Testimonials

Iain Dooley, CEO and founder of Working Software LTD:

It never ceases to amaze me that, even though I spend 50% - 70% of my day using a *nix command line and have done so for the past 6 years, there are still countless thousands of useful tools and tips to learn, each with their corresponding little productivity boosts. The trouble, of course, is finding the time to organise and prioritise them, deciding which are the most important and useful to learn. "Awk One Liners Explained" is a fantastic resource of curated examples that helped me rapidly pick up a few cool tricks that have already provided many times the value I initially paid for the book. Any professional who spends time working with *nix systems can benefit from this book.

Tweet about my book!

I am really excited about my book and I would appreciate your help spreading the word via Twitter! Here is a quick link for tweeting:

My other e-books!

I am so passionate about programming and writing about programming that I have now written my second e-book called "Sed One-Liners Explained". It's written in the same style as this e-book and it explains sed, the Superman of Unix stream editing. Sed One-Liners Explained contains 100 well-explained one-liners and it's 98 pages long. Take a look!

And I am not stopping here - I am going to release several other books. My next e-book is called "Perl One-Liners Explained" and it's based on my "Famous Perl One-Liners Explained" article series.

Enjoy!

Enjoy the book and let me know what you think about it!

Sincerely,
Peteris Krumins (@pkrumins on Twitter)

You have probably all heard the following story about Jobs meeting Knuth for the first time in 1983.

Steve had managed to get Don Knuth, the legendary Stanford professor of computer science, to give a lunchtime lecture to the Mac team. Knuth is the author of at least a dozen books, including the massive and somewhat impenetrable trilogy "The Art of Computer Programming."

I was sitting in Steve's office when Lynn Takahashi, Steve's assistant, announced Knuth's arrival. Steve bounced out of his chair, bounded over to the door and extended a welcoming hand.

"It's a pleasure to meet you, Professor Knuth," Steve said. "I've read all of your books."

"You're full of shit," Knuth responded.

This story is actually not true.

I was watching Randall Munroe's (from xkcd) talk at Google when suddenly Knuth appeared out of nowhere. Randall used the opportunity and asked him to verify this story. Knuth said he gets asked this all the time and that it was not true. I cut that part of the video and put it on YouTube:

So there we have it. The story is fun but not true.

I was just watching a friend of mine work with git, and he'd always type all the git commands in full, like git status and git push. I realized that he must not be the only one to do so, so I decided to write this quick blog post and encourage everyone to create Huffman coded aliases for the most commonly used commands. Instead of typing git status, alias it to gs. Instead of git add, alias it to ga, etc.

Here are a bunch of aliases that I created for 99% of git commands I ever use:

alias ga='git add'
alias gp='git push'
alias gl='git log'
alias gs='git status'
alias gd='git diff'
alias gdc='git diff --cached'
alias gm='git commit -m'
alias gma='git commit -am'
alias gb='git branch'
alias gc='git checkout'
alias gra='git remote add'
alias grr='git remote rm'
alias gpu='git pull'
alias gcl='git clone'

Here is my typical workflow with these commands:

$ vim file.c
$ gd                     # git diff
$ ga file.c              # git add file.c
$ gm "added feature x"   # git commit -m "added feature x"
$ ...
$ gp                     # git push

Short and sweet!
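
If you want to pick the shortest aliases for the commands you really use the most, one possible approach is to count git subcommands in your shell history. The sketch below runs on a made-up history sample; in practice you might feed it the output of history or your ~/.bash_history instead, which is an assumption about your setup.

```shell
# Made-up sample of shell history lines.
sample_history='git status
git add file.c
git status
git commit -m "added feature x"
git status
git push
ls -l
git add other.c'

# Keep only git commands, count each subcommand, and list the most
# frequent first (ties are ordered alphabetically by subcommand).
git_usage=$(printf '%s\n' "$sample_history" |
    awk '$1 == "git" { count[$2]++ }
         END { for (cmd in count) print count[cmd], cmd }' |
    sort -k1,1nr -k2,2)

echo "$git_usage"
```

The subcommands at the top of the list are the ones that deserve the shortest aliases.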

This article is part of the article series "Perl One-Liners Explained."

This is the sixth part of a nine-part article on famous Perl one-liners. In this part I will create various one-liners for selective printing and deleting of certain lines. See part one for an introduction to the series.

Famous Perl one-liners is my attempt to create "perl1line.txt" that is similar to "awk1line.txt" and "sed1line.txt" that have been so popular among Awk and Sed programmers and Linux sysadmins.

The article on famous Perl one-liners will consist of nine parts:

The selective printing and selective deleting of certain lines is actually the same process. If you wish to delete certain lines, you just print the lines you're interested in. Or the other way around! For example, to delete all lines with even line numbers, print the odd lines, and to delete odd lines print the even lines.

After I am done with the 8 parts of the article, I will release the whole article series as a pdf e-book! Please subscribe to my blog to be the first to get it!

Awesome news: I have written an e-book based on this article series. Check it out:

Here are today's one-liners:

82. Print the first line of a file (emulate head -1).

perl -ne 'print; exit'

This is the simplest one-liner so far. Here Perl reads the first line into the $_ variable thanks to the -n option, then it calls the print statement, which prints the contents of $_. And then it just exits. That's it. The first line got printed and that's all we wanted.

83. Print the first 10 lines of a file (emulate head -10).

perl -ne 'print if $. <= 10'

This one-liner uses the $. special variable. This variable stands for "current line number." Each time Perl reads in the next line, it increments $. by one. Therefore it's very simple to understand what this one-liner does: it prints the line if the line number is less than or equal to 10.

This one-liner can also be written the other way around, without the if statement,

perl -ne '$. <= 10 && print'

Here the print statement gets called only if the boolean expression $. <= 10 is true, and it's true only if the current line number is less than or equal to 10.

84. Print the last line of a file (emulate tail -1).

perl -ne '$last = $_; END { print $last }'

Printing the last line of the file is a bit trickier, because you always have to keep the most recently read line in memory. In this one-liner we always save the current line in $_ to the $last variable. When a Perl program ends, it always executes the code in the END block. By that time the most recently read line is the last line of the file, so print $last prints it.

Another way to do the same is,

perl -ne 'print if eof'

This one-liner uses the eof function that returns 1 if the next read will return end of file. Since the next read after the last line in the file will really return eof, this one-liner does what it's supposed to do.

85. Print the last 10 lines of a file (emulate tail -10).

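perl -ne 'push @a, $_; @a = @a[@a-10..$#a] if @a > 10; END { print @a }'

Now this is tricky. Here we push each line to the @a array, and once the array holds more than 10 lines we replace it with a slice of itself. We do @a = @a[@a-10..$#a], which means: replace @a with the last 10 elements of @a. @a-10 is evaluated in scalar context here and it returns the number of elements in the array minus 10. $#a is the last index in the @a array. And @a[@a-10..$#a] takes the last 10 elements of the array, so @a never contains more than the 10 last elements. The if @a > 10 guard matters for the first few lines: without it @a-10 is negative, negative indexes count from the end of the array, and files with fewer than 10 lines come out with the first line duplicated.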

Here is an example. Suppose @a contains ("line1", "line2", "line3", "line4"). And let's say we want to print last 4 lines of the file. Now when we read the 5th line, the array becomes ("line1", "line2", "line3", "line4", "line5"). At this moment @a-4 is 1, because @a in scalar context is 5. The $#a however is 4 because that's the last index in the array. Now taking the slice, @a[@a-4..$#a] is @a[1..4], which drops the front element from the array and the @a array becomes ("line2", "line3", "line4", "line5").

86. Print only lines that match a regular expression.

perl -ne '/regex/ && print'

Here /regex/ is short for $_ =~ /regex/. Since the -n option puts every line in the $_ variable, /regex/ returns true on all lines that match the regex. When that happens, print prints the line.

87. Print only lines that do not match a regular expression.

perl -ne '!/regex/ && print'

This is the same as the previous one-liner, except the regular expression match has been negated. So all the lines that don't match the regex get printed.

88. Print the line before a line that matches a regular expression.

perl -ne '/regex/ && $last && print $last; $last = $_'

In this one-liner every line gets saved to the $last variable. When the current line matches /regex/ and there has been a previous line in $last, print $last prints that previous line. Then the current line gets assigned to the $last variable via $last = $_.

89. Print the line after a line that matches a regular expression.

perl -ne 'if ($p) { print; $p = 0 } $p++ if /regex/'

Here we set the variable $p if the line matches a regex. It indicates that the next line should be printed. Now when the next line is read in and $p is set, then that line gets printed and $p gets set to 0 again to reset the state.

90. Print lines that match regex AAA and regex BBB in any order.

perl -ne '/AAA/ && /BBB/ && print'

This one-liner is basically the same as one-liner #86 above. Here we test if a line matches two regular expressions instead of one. If a line matches both regexes, then it gets printed.

91. Print lines that don't match regexes AAA and BBB.

perl -ne '!/AAA/ && !/BBB/ && print'

This one-liner is almost the same as one-liner #87. Here we test that a line matches neither of the two regular expressions. If it doesn't match /AAA/ and it doesn't match /BBB/, then we print it.

92. Print lines that match regex AAA followed by regex BBB followed by CCC.

perl -ne '/AAA.*BBB.*CCC/ && print'

Here we simply chain regexes AAA, BBB and CCC with .*, which stands for match anything or nothing at all. If AAA is followed by BBB and that is followed by CCC then we print the line. It also matches AAABBBCCC with nothing in between the regexes.

93. Print lines that are 80 chars or longer.

perl -ne 'print if length >= 80'

This one-liner prints all lines that are 80 chars or longer. In Perl you can sometimes omit the parentheses () for function calls. In this one we omitted the parentheses for the length function call. In fact, length, length() and length($_) are the same.

94. Print lines that are less than 80 chars in length.

perl -ne 'print if length < 80'

This is the opposite of previous one-liner. It checks if the length of a line is less than 80 characters.

95. Print only line 13.

perl -ne '$. == 13 && print && exit'

As I explained in one-liner #83, the $. special variable stands for "current line number". So if $. has value 13, then we print the line and exit.

96. Print all lines except line 27.

perl -ne '$. != 27 && print'

Just like in the previous one-liner, we check the current line number. If it's not 27 we print the line, otherwise we skip it.

Another way to write the same is to reverse print and $. != 27 and use the if statement,

perl -ne 'print if $. != 27'

97. Print only lines 13, 19 and 67.

perl -ne 'print if $. == 13 || $. == 19 || $. == 67'

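If you have Perl 5.10 or later then you can use the ~~ smart match operator,

perl -ne 'print if int($.) ~~ [13, 19, 67]'

The smart matching operator ~~ appeared in Perl 5.10. You can do all kinds of smart matching with it, for example, check if two arrays are the same, if an array contains an element, and many other use cases (see perldoc perlsyn). In this particular one-liner we use int($.) ~~ [13, 19, 67], which determines if the numeric value of $. is in the list (13, 19, 67). The right-hand side has to be an array or an array reference. A plain parenthesized list (13, 19, 67) would be evaluated in scalar context and only its last element would be compared. It's basically short for grep { $_ == int($.) } (13, 19, 67). If the check succeeds the line gets printed. Note that smart matching has been marked experimental since Perl 5.18, so newer Perls print a warning when you use it.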

98. Print all lines between two regexes (including lines that match regex).

perl -ne 'print if /regex1/../regex2/'

This one-liner uses the flip-flop operator, which becomes true when a line matches regex1 and becomes false after another line matches regex2. Therefore this one-liner prints all lines between (and including) lines that match regex1 and regex2.

99. Print all lines from line 17 to line 30.

perl -ne 'print if $. >= 17 && $. <= 30'

This one-liner is very simple to understand. The $. variable stands for the current line number, so it checks if the current line number is greater than or equal to 17 and less than or equal to 30.

I just thought of another way to write it,

perl -ne 'print if int($.) ~~ [17..30]'

This one-liner uses the Perl 5.10 (and later) smart matching operator ~~. It basically asks, is the current line number in the list (17, 18, 19, ..., 30)? If it is, the smart match succeeds and the line gets printed. The range is wrapped in [ ] because ~~ needs an array or an array reference on its right-hand side.

You can write the same idea in older Perls as follows,

perl -ne 'print if grep { $_ == $. } 17..30'

What happens here is grep checks if the current line number is in the list (17, 18, ..., 30). The if evaluates grep in scalar context, where it returns the number of matching elements. If the line number is in the list, that's 1, which is true, and the line gets printed. Otherwise it's 0, which is false, and nothing gets printed.

100. Print the longest line.

perl -ne '$l = $_ if length($_) > length($l); END { print $l }'

This one-liner keeps the longest line seen so far in the $l variable. In case the current line $_ exceeds the length of currently longest line, it gets replaced. Just before exiting, the END block is executed and it prints the longest line $l.

101. Print the shortest line.

perl -ne '$s = $_ if $. == 1; $s = $_ if length($_) < length($s); END { print $s }'

This one-liner is the opposite of the previous one. But as we're finding the minimum and $s is not defined for the first line, we have to set it to first line explicitly. Otherwise it's the same just with the length check reversed length($_) < length($s).

102. Print all lines that contain a number.

perl -ne 'print if /\d/'

This one-liner uses the regular expression \d, which stands for "match a digit", and checks if a line contains one. If it does, the check succeeds, and the line gets printed.

103. Find all lines that contain only a number.

perl -ne 'print if /^\d+$/'

This one-liner is very similar to the previous one, but instead of matching a digit anywhere on the line, it anchors the match to the beginning of the line and to the end of the line. The regular expression ^\d+$ means "match one or more digits that start at the beginning of the line and end at the end of the line".

104. Print all lines that contain only letters.

perl -ne 'print if /^[[:alpha:]]+$/'

This one-liner checks if the line contains only letters and if it does, it prints it. Here [[:alpha:]] stands for "match any letter". You could also write the same as [a-zA-Z] (if you live in an ASCII world).

105. Print every second line.

perl -ne 'print if $. % 2'

This one-liner prints the 1st, 3rd, 5th, 7th, etc. lines. It does so because $. % 2 is true when the current line number is odd, and it's false when the current line number is even.

106. Print every second line, starting with the second line.

perl -ne 'print if $. % 2 == 0'

This one-liner is very similar to the previous one, but instead of printing the 1st, 3rd, 5th, etc. lines, it prints the 2nd, 4th, 6th, etc. lines. It prints them because $. % 2 == 0 is true when the current line number is 2, 4, 6, ....

107. Print all lines that repeat.

perl -ne 'print if ++$a{$_} == 2'

This one-liner keeps track of the lines it has seen so far and it also keeps the count of how many times it has seen the line before. If it sees the line the 2nd time, it prints it out because ++$a{$_} == 2 is true. If it sees the line more than 2 times, it just does nothing because the count for this line has gone beyond 2 and the result of the print check is false.

108. Print all unique lines.

perl -ne 'print unless $a{$_}++'

Here the lines get printed only if the hash value $a{$_} for the line is 0. Every time Perl reads in a line, it increments the value for the line and that makes sure that only never before seen lines get printed.

Perl one-liners explained e-book

I've now written the "Perl One-Liners Explained" e-book based on this article series. I went through all the one-liners, improved explanations, fixed mistakes and typos, added a bunch of new one-liners, added an introduction to Perl one-liners and a new chapter on Perl's special variables. Please take a look:

Have Fun!

Thanks for reading the article! The next part is going to be about various interesting, intriguing, silly and crazy regular expressions, because Perl is all about regular expressions.

Several weeks ago my friend Madars was in an airport in the Netherlands and he wanted to log in to his server via ssh. It turned out that their public internet had only ports 80 and 443 open, so he couldn't do that. He asked me if I could proxy either port 80 or 443 to his server. Surely, I had a solution. I modified the tcp proxy server that I had written for my Turn any Linux computer into SOCKS5 proxy in one command article and did:

sudo ./tcp-proxy2.pl 443 madars-server.com:22

This proxied port 443 on my server to the ssh port of madars-server.com. Now Madars could do

ssh -p 443 catonmat.net

and he got connected to his server. Mission accomplished.

Here is the code of tcp-proxy2.pl,

#!/usr/bin/perl
#
use warnings;
use strict;

use IO::Socket::INET;
use IO::Select;

my @allowed_ips = ('all', '10.10.10.5');
my $ioset = IO::Select->new;
my %socket_map;

my $debug = 1;

sub new_conn {
    my ($host, $port) = @_;
    return IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port
    ) || die "Unable to connect to $host:$port: $!";
}

sub new_server {
    my ($host, $port) = @_;
    my $server = IO::Socket::INET->new(
        LocalAddr => $host,
        LocalPort => $port,
        ReuseAddr => 1,
        Listen    => 100
    ) || die "Unable to listen on $host:$port: $!";
}

sub new_connection {
    my $server = shift;
    my $remote_host = shift;
    my $remote_port = shift;

    my $client = $server->accept;
    my $client_ip = client_ip($client);

    unless (client_allowed($client)) {
        print "Connection from $client_ip denied.\n" if $debug;
        $client->close;
        return;
    }
    print "Connection from $client_ip accepted.\n" if $debug;

    my $remote = new_conn($remote_host, $remote_port);
    $ioset->add($client);
    $ioset->add($remote);

    $socket_map{$client} = $remote;
    $socket_map{$remote} = $client;
}

sub close_connection {
    my $client = shift;
    my $client_ip = client_ip($client);
    my $remote = $socket_map{$client};
    
    $ioset->remove($client);
    $ioset->remove($remote);

    delete $socket_map{$client};
    delete $socket_map{$remote};

    $client->close;
    $remote->close;

    print "Connection from $client_ip closed.\n" if $debug;
}

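sub client_ip {
    my $client = shift;
    # peeraddr is the address of the connecting side; sockaddr would
    # return the local address of the accepted socket instead.
    return inet_ntoa($client->peeraddr);
}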

sub client_allowed {
    my $client = shift;
    my $client_ip = client_ip($client);
    return grep { $_ eq $client_ip || $_ eq 'all' } @allowed_ips;
}

die "Usage: $0 <local port> <remote_host:remote_port>" unless @ARGV == 2;

my $local_port = shift;
my ($remote_host, $remote_port) = split ':', shift();


print "Starting a server on 0.0.0.0:$local_port\n";
my $server = new_server('0.0.0.0', $local_port);
$ioset->add($server);

while (1) {
    for my $socket ($ioset->can_read) {
        if ($socket == $server) {
            new_connection($server, $remote_host, $remote_port);
        }
        else {
            next unless exists $socket_map{$socket};
            my $remote = $socket_map{$socket};
            my $buffer;
            my $read = $socket->sysread($buffer, 4096);
            if ($read) {
                $remote->syswrite($buffer);
            }
            else {
                close_connection($socket);
            }
        }
    }
}

Download tcp-proxy2.pl

Download link: tcp proxy 2 (tcp-proxy2.pl)
Download URL: http://www.catonmat.net/download/tcp-proxy2.pl

I also pushed the tcp-proxy2.pl to a new GitHub repository: tcp-proxy2.pl on GitHub.

Enjoy!