This article is part of the article series "CommandLineFu One-Liners Explained."

Hey everyone, this is the fourth article in the series that explains the most popular commandlinefu one-liners.

The first three parts covered one-liners 1-30.

And here are today's one-liners:

31. Quickly access ASCII table.

$ man 7 ascii

Ever forgotten the code for some ASCII character or escape sequence? Look no further: man ascii contains the 7-bit ASCII table. Take a look at it online.

Linux man pages are full of gems like these. One day I actually went through all the man pages to find the most interesting ones. An article about them is upcoming, but before I get it published, here are the section 7 man pages that are most packed with wild and cool info:

  • man 7 bootparam - a tutorial and reference of Linux kernel boot parameters.
  • man 7 charsets - a Linux programmer's view of character sets and internationalization.
  • man 7 glob - how pathname globbing works.
  • man 7 hier - description of the Linux file system hierarchy.
  • man 7 operator - C language operator precedence and associativity table.
  • man 7 regex - basic and extended regular expression overview.
  • man 7 suffixes - lists common Linux file suffixes and the associated file types.
  • man 7 time - overview of time and timers.
  • man 7 units - kilo, kibi, mega, mebi, giga, gibi, ... - decimal (SI) and binary system of units.
  • man 7 utf8 - description of UTF-8 encoding.
  • man 7 url - description of URIs, URLs and URNs.

There are a lot more interesting man pages but these stood out from the rest.

32. Simple timer.

$ time read

This one-liner can be used as a simple timer. For example, if you wish to time something, you can execute it when the event starts and press the return key when the event ends. It will output the time the event took to finish.

Here is how this one-liner works. First the time command times any command that is supplied to it. In this case the command supplied to it is read that reads a line from the standard input. As soon as you press enter, read finishes and time reports how long it took.

If you get annoyed by having to press enter, you can specify that read should return after having read 1 character,

$ time read -N 1

Now you can press any key to stop the timer.

If you wish to run a timer for a specific number of seconds, you can add the -t flag for a timeout,

$ time read -t 60

This would stop the timer after 60 seconds.
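
The same "start the clock, wait, stop the clock" idea can be sketched in Python. This is only an illustration, not what the time command does internally; the helper name timed is made up for this example.

```python
import time

def timed(action):
    """Run action() and return how many seconds it took (wall-clock)."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start

# Interactive use, similar in spirit to "time read":
#   elapsed = timed(input)   # input() blocks until you press Enter
#   print(f"real {elapsed:.3f}s")
```

Passing input as the action makes the call block until Enter is pressed, which plays the role of read in the shell version.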

33. Shutdown a Windows machine.

$ net rpc shutdown -I IP_ADDRESS -U username%password

Everyone knows the net command, right? We all used to net use \\ip\ipc$ *, right? :)

Anyway, the net command comes as part of Samba, which allows you, for example, to use Linux in a Windows workgroup.

This particular one-liner executes the shutdown command on a Windows computer located at IP_ADDRESS as the user username with the password password.

To reboot a machine use the -r switch to net rpc:

$ net rpc shutdown -r -I IP_ADDRESS -U username%password

If you're on an unsecured network, don't forget about the good old nmblookup and smbclient tools that come with Samba.

34. Execute a command independently from the current shell.

$ (cd /tmp && ls)

This one-liner illustrates subshells. Here the commands cd /tmp and ls are executed but they do not affect the current shell. If you had done just cd /tmp && ls, your current shell would have changed directory to /tmp but in this one-liner it happens in a subshell and your current shell is not affected.

Surely, this is only a toy example. If you wanted to know what's in /tmp, you'd do just ls /tmp.
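
The "child changes directory, parent stays put" behavior is not specific to the shell. Here is a small Python sketch of it, assuming a Python interpreter is available as sys.executable; the function name cwd_seen_by_child is made up for this example.

```python
import os
import subprocess
import sys
import tempfile

def cwd_seen_by_child(path):
    """Start a child process in `path` and return the directory it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=path,             # affects the child only, like cd inside ( ... )
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

parent_before = os.getcwd()
child_dir = cwd_seen_by_child(tempfile.gettempdir())
parent_after = os.getcwd()

# The child ran in the temp directory; the parent never moved.
print(child_dir)
print(parent_before == parent_after)
```

Just like with the parentheses in the shell, the working directory change lives and dies with the child process.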

Actually, talking about cd, be aware of pushd and popd commands. They allow you to maintain a stack of directories you want to return to later. For example,

/long/path/is/long$ pushd .
/long/path/is/long$ cd /usr
/usr$ popd 
/long/path/is/long$

Or even shorter, passing the directory you're gonna cd to directly to pushd,

/long/path/is/long$ pushd /usr
/usr$ popd 
/long/path/is/long$

Another cool trick is to use cd - to return to the previous directory. Here is an example,

/home/pkrumins$ cd /tmp
/tmp$ cd -
/home/pkrumins$

35. Tunnel your SSH connection via intermediate host.

$ ssh -t reachable_host ssh unreachable_host

This one-liner creates an ssh connection to unreachable_host via reachable_host. It does so by executing ssh unreachable_host on reachable_host. The -t forces ssh to allocate a pseudo-tty, which is necessary for working interactively in the second ssh to unreachable_host.

This one-liner can be generalized. You can tunnel through an arbitrary number of ssh servers:

$ ssh -t host1 ssh -t host2 ssh -t host3 ssh -t host4 ...

Now catch me if you can. ;)

36. Clear the terminal screen.

$ CTRL+l

Pressing CTRL+l (that's small L) clears the screen leaving the current line at the top of the screen.

CTRL+l also takes a numeric argument: first press ESC, then a number, let's say 21, and then press CTRL+l. According to the readline documentation, with an argument the command just refreshes the current line without clearing the screen.

$ ESC 21 CTRL+l

CTRL+l outputs a special "clear-screen" sequence to the terminal. The same can be achieved with the tput command,

$ tput clear

Another way to clear the terminal (usually when the screen gets garbled) is to use the reset command,

$ reset

37. Hear when the machine comes back online.

$ ping -a IP

Ever had a situation where you needed to know when a system comes back up after a reboot? Up until now you probably launched ping and either followed the timeouts until the system came back, or left it running and occasionally checked its output to see if the host was up. But that is unnecessary: the -a option makes ping audible! As soon as the host at IP is back, ping will beep!

38. List 10 most often used commands.

$ history | awk '{a[$2]++}END{for(i in a){print a[i] " " i}}' | sort -rn | head

The person who wrote it has the Unix mindset right. They combine several shell commands to get the result they want.

First, history outputs all the commands the person has executed. Next, awk counts how many times each value of the second column $2 (the command name) appears in the output. Once history has output all the commands and awk has counted them, awk loops over all the commands and outputs the count a[i], followed by a space and the command itself. Then sort takes this input, sorts it numerically (-n) and reverses the order (-r), so that the most frequent commands are on top. Finally head outputs the first 10 lines, which are the 10 most frequent commands.

If you want to see more than 10 commands (or fewer), change head to head -20 for 20 commands or head -5 for 5 commands.
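
If the awk part looks cryptic, here is roughly the same counting logic as a Python sketch. It assumes input lines shaped like history output (a number, then the command); the function name top_commands and the sample lines are made up for illustration.

```python
from collections import Counter

def top_commands(history_lines, limit=10):
    """Count the command word (second field) of history-style lines."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        if len(fields) >= 2:       # fields[0] is the history number
            counts[fields[1]] += 1
    return counts.most_common(limit)

sample = [
    "  1  ls -l",
    "  2  cd /tmp",
    "  3  ls",
    "  4  git status",
    "  5  ls -a",
]

for command, count in top_commands(sample):
    print(count, command)
```

The Counter plays the role of the awk array a, and most_common does the job of sort -rn | head.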

39. Check gmail for new mail.

$ curl -u you@gmail.com --silent "https://mail.google.com/mail/feed/atom" |
  perl -ne \
  '
    print "Subject: $1 " if /<title>(.+?)<\/title>/ && $title++;
    print "(from $1)\n" if /<email>(.+?)<\/email>/;
  '

Gmail is cool because they offer an Atom feed for new mail. This one-liner instructs curl to retrieve the feed and authenticate as you@gmail.com. You'll be prompted for a password after you execute the command. Next it feeds the output to perl. Perl extracts the title (subject) of each email and the sender's email. These two items are printed to stdout.

Here is the output when I run the command,

Subject: i heard you liked windows! (from gates@microsoft.com)
Subject: got root? (from bofh@underground.org)

40. Watch Star-Wars via telnet.

$ telnet towel.blinkenlights.nl

Needs no explaining. Just telnet to the host to watch ASCII Star-Wars.

And here is another one,

$ telnet towel.blinkenlights.nl 666

Connecting on port 666 will spit out BOFH excuses.

That's it for today.

I hope you enjoyed the 4th part of the article. Tune in next time for the 5th part.

Oh, and I'd love if you followed me on Twitter!

It's interesting how the term "functor" means completely different things in various programming languages. Take C++ for example. Everyone who has mastered C++ knows that you call an object of a class that implements operator() a functor. Now take Standard ML. In ML functors are mappings from structures to structures. Now Haskell. In Haskell functors are just homomorphisms over containers. And in Prolog functor means the atom at the start of a structure. They are all different. Let's take a closer look at each one.

Functors in C++

Functors in C++ are short for "function objects." Function objects are instances of C++ classes that have the operator() defined. If you define operator() on C++ classes you get objects that act like functions but can also store state. Here is an example,

#include <iostream>
#include <string>

class SimpleFunctor {
    std::string name_;
public:
    SimpleFunctor(const char *name) : name_(name) {}
    void operator()() { std::cout << "Oh, hello, " << name_ << std::endl; }
};

int main() {
    SimpleFunctor sf("catonmat");
    sf();  // prints "Oh, hello, catonmat"
}

Notice how I was able to call sf() in the main function, even though sf was an object? That's because I defined operator() in SimpleFunctor's class.

Most often functors in C++ are used as predicates, fake closures or comparison functions in STL algorithms. Here is another example. Suppose you have a list of integers and you wish to find the sum of all even ones, and the sum of all odd ones. It's a perfect job for a functor and the for_each algorithm.

#include <algorithm>
#include <iostream>
#include <list>

class EvenOddFunctor {
    int even_;
    int odd_;
public:
    EvenOddFunctor() : even_(0), odd_(0) {}
    void operator()(int x) {
        if (x%2 == 0) even_ += x;
        else odd_ += x;
    }
    int even_sum() const { return even_; }
    int odd_sum() const { return odd_; }
};

int main() {
    EvenOddFunctor evenodd;
    
    int my_list[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    evenodd = std::for_each(my_list,
                  my_list+sizeof(my_list)/sizeof(my_list[0]),
                  evenodd);

    std::cout << "Sum of evens: " << evenodd.even_sum() << "\n";
    std::cout << "Sum of odds: " << evenodd.odd_sum() << std::endl;

    // output:
    // Sum of evens: 30
    // Sum of odds: 25
}

Here an instance of EvenOddFunctor gets passed to the for_each algorithm. The for_each algorithm iterates over each element in my_list and calls the functor. After it's done, it returns a copy of the evenodd functor that contains the sums of evens and odds.
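
For comparison, the same "object that acts like a function but keeps state" idea exists in other languages too. Here is a sketch in Python, where the __call__ method plays the role of operator(). The class name EvenOddSums is made up for this example and is not part of the C++ code above.

```python
class EvenOddSums:
    """Callable object that keeps running sums of even and odd numbers."""

    def __init__(self):
        self.even = 0
        self.odd = 0

    def __call__(self, x):
        if x % 2 == 0:
            self.even += x
        else:
            self.odd += x

sums = EvenOddSums()
for value in range(1, 11):
    sums(value)    # the object is called like a function

print("Sum of evens:", sums.even)  # Sum of evens: 30
print("Sum of odds:", sums.odd)    # Sum of odds: 25
```

One difference worth noting: the Python loop mutates sums directly, while std::for_each works on a copy of the functor and hands it back as the return value.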

Functors in Standard ML

Loosely speaking in object-oriented terms, functors in ML are generic implementations of interfaces. In ML terms, functors are part of the ML module system and they let you compose structures.

Here is an example, suppose you want to write a plugin system and you wish all the plugins to implement the required interface, which, for simplicity, includes only the perform function. In ML you can first define a signature for plugins,

signature Plugin =
sig
    val perform : unit -> unit
end;

Now that we have defined an interface (signature) for plugins, we can implement two Plugins, let's say LoudPlugin and SilentPlugin. The implementation is done via structures,

structure LoudPlugin :> Plugin =
struct
    fun perform () = print "DOING JOB LOUDLY!\n"
end;

And the SilentPlugin,

structure SilentPlugin :> Plugin =
struct
    fun perform () = print "doing job silently\n"
end;

Now we get to functors. Remember functors in ML take structures as their arguments, so we can write one that takes Plugin as its argument,

functor Performer(P : Plugin) =
struct
    fun job () = P.perform ()
end;

This functor takes a structure matching Plugin as its argument P and uses it in the job function, which calls the perform function of P.

Now let's use the Performer functor. Remember that a functor returns a structure,

structure LoudPerformer = Performer(LoudPlugin);
structure SilentPerformer = Performer(SilentPlugin);

LoudPerformer.job ();
SilentPerformer.job ();

This outputs,

DOING JOB LOUDLY!
doing job silently

This is probably the simplest possible example of Standard ML functors.

Functors in Haskell

Functors in Haskell are what real functors are supposed to be. Haskell functors resemble mathematical functors from category theory. In category theory a functor F is a mapping between two categories such that the structure of the category being mapped over is preserved, in other words, it's a homomorphism between two categories.

In Haskell this definition is implemented as a simple type class,

class Functor f where
  fmap :: (a -> b) -> f a -> f b

Looking back at the ML example, a type class in Haskell is like a signature, except it's defined on types. It defines what operations a type has to implement to be an instance of this class. In this case, however, the Functor is not defined over types but over type constructors f. It says that a Functor is something that implements the fmap function, which takes a function from type a to type b and a value of type f a (a type constructed from type constructor f applied to type a), and returns a value of type f b.

To understand what it does, think of fmap as a function that applies an operation to each element in some kind of a container.

The simplest example of functors is regular lists and the map function that maps a function to each element in the list.

Prelude> map (+1) [1,2,3,4,5]
[2,3,4,5,6]

In this simple example, the Functor's fmap function is just map and type constructor f is [] - the list type constructor. Therefore the Functor instance for lists is defined as

instance Functor [] where
  fmap = map

Let's see if it really is true by using fmap instead of map in the previous example,

Prelude> fmap (+1) [1,2,3,4,5]
[2,3,4,5,6]

But notice that the Functor definition does not say anything about preserving the structure! The type class cannot enforce it, so any sensible Functor instance is implicitly expected to satisfy the functor laws, which are part of the definition of a mathematical functor. There are two rules on fmap:

fmap id = id
fmap (g . h) = fmap g . fmap h

The first rule says that mapping the identity function over every element in a container has no effect. The second rule says that a composition of two functions over every item in a container is the same as first mapping one function, and then mapping the other.
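
To make the two rules concrete, here is a small Python sketch that spot-checks them for lists. The names fmap, compose, identity, inc and double are made up for this example, and checking a few sample values is of course not a proof that the laws hold in general.

```python
def fmap(g, xs):
    """fmap for lists: apply g to every element, keep the list shape."""
    return [g(x) for x in xs]

def compose(g, h):
    """Function composition: compose(g, h)(x) == g(h(x))."""
    return lambda x: g(h(x))

def identity(x):
    return x

def inc(x):
    return x + 1

def double(x):
    return x * 2

xs = [1, 2, 3, 4, 5]

# Rule 1: fmap id = id
law1 = fmap(identity, xs) == xs

# Rule 2: fmap (g . h) = fmap g . fmap h
law2 = fmap(compose(inc, double), xs) == fmap(inc, fmap(double, xs))

print(law1, law2)  # True True
```

An fmap that, say, reversed the list or dropped elements would still have a plausible type, but it would break the first rule, which is exactly what "preserving the structure" rules out.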

Another example that illustrates Functors vividly is operations over trees. Think of a tree as a container; then fmap maps a function over the tree's values while preserving the tree's structure.

Let's define a Tree first,

data Tree a = Node (Tree a) (Tree a)
            | Leaf a
              deriving Show

This definition says that a Tree of type a is either a Node of two Trees (left and right branches) or a Leaf. The deriving Show expression allows us to inspect the Tree via show function.

Now we can define a Functor over Trees,

instance Functor Tree where
    fmap g (Leaf v) = Leaf (g v)
    fmap g (Node l r) = Node (fmap g l) (fmap g r)

This definition says that fmap of a function g over a Leaf with value v is just a Leaf of g applied to v. And fmap of g over a Node with left branch l and right branch r is just a Node whose branches are fmap g applied to l and to r.

Now let's illustrate how fmap works on trees. Let's construct a tree with String leaves and map the length function over them to find out the length of each leaf.

Prelude> let tree = (Node (Node (Leaf "hello") (Leaf "foo")) (Leaf "baar"))
Prelude> fmap length tree
Node (Node (Leaf 5) (Leaf 3)) (Leaf 4)

Here I constructed the following tree,

           *
          / \
         /   \
        *  "baar"
       / \
      /   \
     /     \
    /       \
 "hello"  "foo"

And mapped length function over it, producing,

           *
          / \
         /   \
        *     4
       / \     
      /   \
     /     \
    /       \
   5         3

Another way of saying what fmap does is that it lifts a function from the "normal world" into the "f world."

In fact Functor is the most fundamental type class in Haskell because Monads, Applicatives and Arrows are all Functors. As I like to say it, Haskell starts where the functors start.

If you wish to learn more about Haskell type classes, start with the excellent article Typeclassopedia (starts at page 17).

Functors in Prolog

Finally, functors in Prolog. Functors in Prolog are the simplest of all. They refer to two things. The first is the atom at the start of the structure. Here is an example, given an expression,

?- likes(mary, pizza)

the functor is the first atom - likes.

The second is the built-in predicate called functor. It returns the functor and the arity of a structure. For example,

?- functor(likes(mary, pizza), Functor, Arity).
Functor = likes
Arity = 2

That's it for functors in Prolog.

Conclusion

This article demonstrated how a simple term like "functor" can refer to completely different things in various programming languages. Therefore when you hear the term "functor", it's important to know the context it's being mentioned in.

I thought I'd do a shorter article on catonmat this time. It goes hand in hand with my upcoming article series on "100% technical guide to anonymity" and it's much easier to write larger articles in smaller pieces. Then I can edit them together and produce the final article.

This article will be interesting for those who didn't know it already -- you can turn any Linux computer into a SOCKS5 (and SOCKS4) proxy in just one command:

ssh -N -D 0.0.0.0:1080 localhost

And it doesn't require root privileges. The ssh command starts up dynamic (-D) port forwarding on port 1080 and talks to the clients via the SOCKS5 or SOCKS4 protocols, just like a regular SOCKS5 proxy would! The -N option makes sure ssh stays idle and doesn't execute any commands on localhost.

If you also wish the command to go into the background as a daemon, then add the -f option:

ssh -f -N -D 0.0.0.0:1080 localhost

To use it, just make your software use SOCKS5 proxy on your Linux computer's IP, port 1080, and you're done, all your requests now get proxied.

Access control can be implemented via iptables. For example, to allow only people from the IP 1.2.3.4 to use the SOCKS5 proxy, add the following iptables rules:

iptables -A INPUT --src 1.2.3.4 -p tcp --dport 1080 -j ACCEPT
iptables -A INPUT -p tcp --dport 1080 -j REJECT

The first rule says, allow anyone from 1.2.3.4 to connect to port 1080, and the other rule says, deny everyone else from connecting to port 1080.

Of course, executing iptables requires root privileges. If you don't have root privileges, and you don't want to leave your proxy open (and you really don't want to do that), you'll have to use some kind of simple TCP proxy wrapper to do access control.

Here, I wrote one in Perl. It's called tcp-proxy.pl and it uses IO::Socket::INET to abstract sockets, and IO::Select to do connection multiplexing.

#!/usr/bin/perl
#

use warnings;
use strict;

use IO::Socket::INET;
use IO::Select;

my @allowed_ips = ('1.2.3.4', '5.6.7.8', '127.0.0.1', '192.168.1.2');
my $ioset = IO::Select->new;
my %socket_map;

my $debug = 1;

sub new_conn {
    my ($host, $port) = @_;
    return IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port
    ) || die "Unable to connect to $host:$port: $!";
}

sub new_server {
    my ($host, $port) = @_;
    my $server = IO::Socket::INET->new(
        LocalAddr => $host,
        LocalPort => $port,
        ReuseAddr => 1,
        Listen    => 100
    ) || die "Unable to listen on $host:$port: $!";
}

sub new_connection {
    my $server = shift;
    my $client = $server->accept;
    my $client_ip = client_ip($client);

    unless (client_allowed($client)) {
        print "Connection from $client_ip denied.\n" if $debug;
        $client->close;
        return;
    }
    print "Connection from $client_ip accepted.\n" if $debug;

    my $remote = new_conn('localhost', 55555);
    $ioset->add($client);
    $ioset->add($remote);

    $socket_map{$client} = $remote;
    $socket_map{$remote} = $client;
}

sub close_connection {
    my $client = shift;
    my $client_ip = client_ip($client);
    my $remote = $socket_map{$client};
    
    $ioset->remove($client);
    $ioset->remove($remote);

    delete $socket_map{$client};
    delete $socket_map{$remote};

    $client->close;
    $remote->close;

    print "Connection from $client_ip closed.\n" if $debug;
}

sub client_ip {
    my $client = shift;
    # peerhost is the address of the remote end; sockaddr would be our own local address.
    return $client->peerhost;
}

sub client_allowed {
    my $client = shift;
    my $client_ip = client_ip($client);
    return grep { $_ eq $client_ip } @allowed_ips;
}

print "Starting a server on 0.0.0.0:1080\n";
my $server = new_server('0.0.0.0', 1080);
$ioset->add($server);

while (1) {
    for my $socket ($ioset->can_read) {
        if ($socket == $server) {
            new_connection($server);
        }
        else {
            next unless exists $socket_map{$socket};
            my $remote = $socket_map{$socket};
            my $buffer;
            my $read = $socket->sysread($buffer, 4096);
            if ($read) {
                $remote->syswrite($buffer);
            }
            else {
                close_connection($socket);
            }
        }
    }
}

To use it, you'll have to make a change to the previous configuration. Instead of running ssh SOCKS5 proxy on 0.0.0.0:1080, you'll need to run it on localhost:55555,

ssh -f -N -D 55555 localhost

After that, run the tcp-proxy.pl,

perl tcp-proxy.pl &

The TCP proxy will start listening on 0.0.0.0:1080 and will forward only connections from the IPs in the @allowed_ips list to localhost:55555.
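
The access check itself is tiny. Here is a Python sketch of the same allow-list idea, assuming the peer address is the (ip, port) tuple that socket.accept() hands back for an IPv4 connection. The names ALLOWED_IPS and client_allowed mirror the Perl script but are otherwise just illustrative.

```python
ALLOWED_IPS = {"1.2.3.4", "5.6.7.8", "127.0.0.1", "192.168.1.2"}

def client_allowed(peer_address, allowed=ALLOWED_IPS):
    """Return True if the connecting client's IP is on the allow-list.

    peer_address is the (ip, port) pair of the remote end, for example
    the second value returned by socket.accept().
    """
    ip, _port = peer_address
    return ip in allowed

print(client_allowed(("1.2.3.4", 40000)))   # True
print(client_allowed(("9.9.9.9", 40000)))   # False
```

The important detail is that the check must look at the address of the remote end of the connection, not at the local address the server is listening on.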

Another possibility is to use another computer instead of your own as exit node. What I mean is you can do the following:

ssh -f -N -D 1080 other_computer.com

This will set up a SOCKS5 proxy on localhost:1080 but when you use it, ssh will automatically tunnel your requests (encrypted) via other_computer.com. This way you can hide what you're doing on the Internet from anyone who might be sniffing your link. They will see that you're doing something but the traffic will be encrypted so they won't be able to tell what you're doing.

That's it. You're now the proxy king!

Download tcp-proxy.pl

Download link: tcp proxy (tcp-proxy.pl)
Download URL: http://www.catonmat.net/download/tcp-proxy.pl

I also pushed tcp-proxy.pl to GitHub: tcp-proxy.pl on GitHub. It would also be a nifty project to generalize it into a program that redirects between any number of host:port pairs, not just two.

PS. I will probably also write "A definitive guide to ssh port forwarding" some time in the future because it's an interesting but little understood topic.

var http = require('http');

http.createServer(function(request, response) {
  var proxy = http.createClient(80, request.headers['host'])
  var proxy_request = proxy.request(request.method, request.url, request.headers);
  proxy_request.addListener('response', function (proxy_response) {
    proxy_response.addListener('data', function(chunk) {
      response.write(chunk, 'binary');
    });
    proxy_response.addListener('end', function() {
      response.end();
    });
    response.writeHead(proxy_response.statusCode, proxy_response.headers);
  });
  request.addListener('data', function(chunk) {
    proxy_request.write(chunk, 'binary');
  });
  request.addListener('end', function() {
    proxy_request.end();
  });
}).listen(8080);

This is just amazing. In 20 lines of node.js code and 10 minutes of time I was able to write an HTTP proxy. And it scales well, too. It's not a blocking HTTP proxy; it's event-driven and asynchronous, meaning hundreds of people can use it simultaneously and it will work well.

To get the proxy running all you have to do is download node.js, compile it, and run the proxy program via the node program:

$ ./configure --prefix=/home/pkrumins/installs/nodejs-0.1.92
$ make
$ make install

$ PATH=$PATH:/home/pkrumins/installs/nodejs-0.1.92/bin

$ node proxy.js

And from here you can take this proxy wherever your imagination takes you. For example, you can start by adding logging:

var http = require('http');
var sys  = require('sys');

http.createServer(function(request, response) {
  sys.log(request.connection.remoteAddress + ": " + request.method + " " + request.url);
  var proxy = http.createClient(80, request.headers['host'])
  var proxy_request = proxy.request(request.method, request.url, request.headers);
  proxy_request.addListener('response', function (proxy_response) {
    proxy_response.addListener('data', function(chunk) {
      response.write(chunk, 'binary');
    });
    proxy_response.addListener('end', function() {
      response.end();
    });
    response.writeHead(proxy_response.statusCode, proxy_response.headers);
  });
  request.addListener('data', function(chunk) {
    proxy_request.write(chunk, 'binary');
  });
  request.addListener('end', function() {
    proxy_request.end();
  });
}).listen(8080);

Next, you can add a regex-based host blacklist in 15 additional lines:

var http = require('http');
var sys  = require('sys');
var fs   = require('fs');

var blacklist = [];

fs.watchFile('./blacklist', function(c,p) { update_blacklist(); });

function update_blacklist() {
  sys.log("Updating blacklist.");
  blacklist = fs.readFileSync('./blacklist').split('\n')
              .filter(function(rx) { return rx.length })
              .map(function(rx) { return RegExp(rx) });
}

http.createServer(function(request, response) {
  for (i in blacklist) {
    if (blacklist[i].test(request.url)) {
      sys.log("Denied: " + request.method + " " + request.url);
      response.end();
      return;
    }
  }

  sys.log(request.connection.remoteAddress + ": " + request.method + " " + request.url);
  var proxy = http.createClient(80, request.headers['host'])
  var proxy_request = proxy.request(request.method, request.url, request.headers);
  proxy_request.addListener('response', function(proxy_response) {
    proxy_response.addListener('data', function(chunk) {
      response.write(chunk, 'binary');
    });
    proxy_response.addListener('end', function() {
      response.end();
    });
    response.writeHead(proxy_response.statusCode, proxy_response.headers);
  });
  request.addListener('data', function(chunk) {
    proxy_request.write(chunk, 'binary');
  });
  request.addListener('end', function() {
    proxy_request.end();
  });
}).listen(8080);

update_blacklist();

Now to block proxy users from using Facebook, just append facebook.com to the blacklist file:

$ echo 'facebook.com' >> blacklist

The proxy server will automatically notice the changes to the file and update the blacklist.
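
The matching logic of the blacklist is easy to try out in isolation. Here is a Python sketch of it, assuming one regular expression per line and a match anywhere in the URL, which is how RegExp test behaves in the JavaScript above. The function names parse_blacklist and host_allowed are chosen for this example.

```python
import re

def parse_blacklist(text):
    """Turn the contents of a blacklist file into compiled patterns."""
    return [re.compile(line) for line in text.split("\n") if line]

def host_allowed(url, blacklist):
    """A URL is allowed when no pattern matches anywhere in it."""
    return not any(rx.search(url) for rx in blacklist)

blacklist = parse_blacklist("facebook.com\n")

print(host_allowed("http://www.facebook.com/", blacklist))  # False
print(host_allowed("http://www.catonmat.net/", blacklist))  # True
```

Keep in mind that the entries are regular expressions, so the dot in facebook.com matches any character. If you want a literal dot, write it as facebook\.com in the blacklist file.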

Surely, a proxy server without IP control is no proxy server, so let's add that as well:

var http = require('http');
var sys  = require('sys');
var fs   = require('fs');

var blacklist = [];
var iplist    = [];

fs.watchFile('./blacklist', function(c,p) { update_blacklist(); });
fs.watchFile('./iplist', function(c,p) { update_iplist(); });

function update_blacklist() {
  sys.log("Updating blacklist.");
  blacklist = fs.readFileSync('./blacklist').split('\n')
              .filter(function(rx) { return rx.length })
              .map(function(rx) { return RegExp(rx) });
}

function update_iplist() {
  sys.log("Updating iplist.");
  iplist = fs.readFileSync('./iplist').split('\n')
           .filter(function(ip) { return ip.length });
}

http.createServer(function(request, response) {
  var allowed_ip = false;
  for (i in iplist) {
    if (iplist[i] == request.connection.remoteAddress) {
      allowed_ip = true;
      break;
    }
  }

  if (!allowed_ip) {
    sys.log("IP " + request.connection.remoteAddress + " is not allowed");
    response.end();
    return;
  }

  for (i in blacklist) {
    if (blacklist[i].test(request.url)) {
      sys.log("Denied: " + request.method + " " + request.url);
      response.end();
      return;
    }
  }

  sys.log(request.connection.remoteAddress + ": " + request.method + " " + request.url);
  var proxy = http.createClient(80, request.headers['host'])
  var proxy_request = proxy.request(request.method, request.url, request.headers);
  proxy_request.addListener('response', function(proxy_response) {
    proxy_response.addListener('data', function(chunk) {
      response.write(chunk, 'binary');
    });
    proxy_response.addListener('end', function() {
      response.end();
    });
    response.writeHead(proxy_response.statusCode, proxy_response.headers);
  });
  request.addListener('data', function(chunk) {
    proxy_request.write(chunk, 'binary');
  });
  request.addListener('end', function() {
    proxy_request.end();
  });
}).listen(8080);

update_blacklist();
update_iplist();

By default the proxy server will not allow any connections, so add all the IPs you want the proxy to be accessible from to the iplist file:

$ echo '1.2.3.4' >> iplist

Finally, let's refactor the code a little:

var http = require('http');
var sys  = require('sys');
var fs   = require('fs');

var blacklist = [];
var iplist    = [];

fs.watchFile('./blacklist', function(c,p) { update_blacklist(); });
fs.watchFile('./iplist', function(c,p) { update_iplist(); });

function update_blacklist() {
  sys.log("Updating blacklist.");
  blacklist = fs.readFileSync('./blacklist').split('\n')
              .filter(function(rx) { return rx.length })
              .map(function(rx) { return RegExp(rx) });
}

function update_iplist() {
  sys.log("Updating iplist.");
  iplist = fs.readFileSync('./iplist').split('\n')
           .filter(function(rx) { return rx.length });
}

function ip_allowed(ip) {
  for (i in iplist) {
    if (iplist[i] == ip) {
      return true;
    }
  }
  return false;
}

function host_allowed(host) {
  for (i in blacklist) {
    if (blacklist[i].test(host)) {
      return false;
    }
  }
  return true;
}

function deny(response, msg) {
  response.writeHead(401);
  response.write(msg);
  response.end();
}

http.createServer(function(request, response) {
  var ip = request.connection.remoteAddress;
  if (!ip_allowed(ip)) {
    msg = "IP " + ip + " is not allowed to use this proxy";
    deny(response, msg);
    sys.log(msg);
    return;
  }

  if (!host_allowed(request.url)) {
    msg = "Host " + request.url + " has been denied by proxy configuration";
    deny(response, msg);
    sys.log(msg);
    return;
  }

  sys.log(ip + ": " + request.method + " " + request.url);
  var proxy = http.createClient(80, request.headers['host'])
  var proxy_request = proxy.request(request.method, request.url, request.headers);
  proxy_request.addListener('response', function(proxy_response) {
    proxy_response.addListener('data', function(chunk) {
      response.write(chunk, 'binary');
    });
    proxy_response.addListener('end', function() {
      response.end();
    });
    response.writeHead(proxy_response.statusCode, proxy_response.headers);
  });
  request.addListener('data', function(chunk) {
    proxy_request.write(chunk, 'binary');
  });
  request.addListener('end', function() {
    proxy_request.end();
  });
}).listen(8080);

update_blacklist();
update_iplist();

Again, it's amazing how fast you can write server software in node.js and JavaScript. It would have probably taken me a day to write the same in C. I love how fast you can prototype software nowadays.

Download proxy.js

Download link: proxy server written in node.js
Download URL: http://www.catonmat.net/download/proxy.js

I am gonna build this proxy up, so I also put it on GitHub: proxy.js on GitHub

Happy proxying!


Another week and another top ten one-liners from commandlinefu explained.

This is the third post in the series already, covering one-liners 21-30. See the previous two posts for the introduction to the series and one-liners 1-20.

Update: Russian translation now available.

#21. Display currently mounted file systems nicely

$ mount | column -t

The file systems are not that important here. The column -t command is what is important. It takes the input and formats it into multiple columns so that all columns are aligned vertically.

Here is how the mounted filesystem list looks without the column -t command:

$ mount

/dev/root on / type ext3 (rw)
/proc on /proc type proc (rw)
/dev/mapper/lvmraid-home on /home type ext3 (rw,noatime)

And now with the column -t command:

$ mount | column -t

/dev/root                 on  /      type  ext3   (rw)
/proc                     on  /proc  type  proc   (rw)
/dev/mapper/lvmraid-home  on  /home  type  ext3   (rw,noatime)

You can improve this one-liner now by also adding column titles:

$ (echo "DEVICE - PATH - TYPE FLAGS" && mount) | column -t

DEVICE                    -   PATH   -     TYPE   FLAGS
/dev/root                 on  /      type  ext3   (rw)
/proc                     on  /proc  type  proc   (rw)
/dev/mapper/lvmraid-home  on  /home  type  ext3   (rw,noatime)

Columns 2 and 4 are not really necessary. We can use the awk text-processing utility to get rid of them:

$ (echo "DEVICE PATH TYPE FLAGS" && mount | awk '$2=$4="";1') | column -t

DEVICE                    PATH   TYPE   FLAGS
/dev/root                 /      ext3   (rw)
/proc                     /proc  proc   (rw)
/dev/mapper/lvmraid-home  /home  ext3   (rw,noatime)
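
The awk program '$2=$4="";1' is terse, so here is a minimal sketch of the same idea in long form. The sample_mount and strip_on_type names are made up for illustration, and the mount output is hard-coded so that the snippet runs anywhere:

```shell
# Hypothetical helper: prints two lines that look like mount output.
sample_mount() {
  printf '%s\n' \
    '/dev/root on / type ext3 (rw)' \
    '/proc on /proc type proc (rw)'
}

# Long form of awk '$2=$4="";1': blank fields 2 and 4, then print the line.
strip_on_type() {
  awk '{ $2 = ""; $4 = ""; print }'
}

sample_mount | strip_on_type
```

Blanking a field makes awk rebuild the line with single spaces between the fields, so doubled spaces are left where fields 2 and 4 used to be. That doesn't matter here, because column -t treats a run of whitespace as a single separator.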

Finally, we can wrap it in a shell function so that we can always enjoy the nice output from mount. Let's call it nicemount:

$ nicemount() { (echo "DEVICE PATH TYPE FLAGS" && mount | awk '$2=$4="";1') | column -t; }

Let's try it out:

$ nicemount

DEVICE                    PATH   TYPE   FLAGS
/dev/root                 /      ext3   (rw)
/proc                     /proc  proc   (rw)
/dev/mapper/lvmraid-home  /home  ext3   (rw,noatime)

It works!

#22. Run the previous shell command but replace every "foo" with "bar"

$ !!:gs/foo/bar

I explained this type of one-liner in one-liner #5 already. Please take a look there for a longer discussion.

To summarize, what happens here is that !! recalls the previously executed shell command and :gs/foo/bar substitutes (the s modifier) all (the g flag) occurrences of foo with bar. The !! construct is called an event designator.
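
History expansion only works in an interactive shell, so the following sketch imitates the substitution with sed instead. It is not how bash implements it, just a way to see the difference the g flag makes. The prev variable and the two function names are made up for illustration:

```shell
# Pretend this was the previous command.
prev='cp foo.txt /backup/foo.txt'

# Like !!:s/foo/bar - only the first occurrence is replaced.
subst_first() { printf '%s\n' "$1" | sed 's/foo/bar/'; }

# Like !!:gs/foo/bar - every occurrence is replaced.
subst_all() { printf '%s\n' "$1" | sed 's/foo/bar/g'; }

subst_first "$prev"
subst_all "$prev"
```

The first call prints cp bar.txt /backup/foo.txt and the second prints cp bar.txt /backup/bar.txt.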

#23. Top for files

$ watch -d -n 1 'df; ls -FlAt /path'

This one-liner watches for file changes in directory /path. It uses the watch command that executes the given command periodically. The -d flag tells watch to display differences between the command calls (so you can see which files get added or removed in /path). The -n 1 flag tells it to execute the command every second.

The command to execute is df; ls -FlAt /path, which is actually two commands executed one after the other. First, df outputs the filesystem disk space usage, and then ls -FlAt lists the files in /path. The -F argument to ls tells it to classify files, appending */=>@| to the filenames to indicate whether they are executables *, directories /, sockets =, doors >, symlinks @, or named pipes |. The -l argument uses the long listing format, -A lists all files including hidden ones but leaves out . and .., and -t sorts the files by modification time, newest first.

A special note about doors: they are a Solaris thing that acts like a pipe, except that a door launches the program that is supposed to be the receiving party. A plain pipe would block until the other party opens it, but a door launches the other party itself.

Actually the output is nicer if you specify the -h argument to df so that it is human-readable. You can also join the arguments to watch together, making them -dn1. Here is the final version:

$ watch -dn1 'df -h; ls -FlAt /path'

Another note: -d in BSD is --differences.
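
If watch is not available, the periodic execution can be approximated with a plain loop. This is only a rough sketch: watch_sketch is a made-up name, it runs a fixed number of rounds instead of forever, and it neither clears the screen nor highlights differences the way watch -d does.

```shell
# watch_sketch COUNT INTERVAL COMMAND...
# Runs COMMAND COUNT times, sleeping INTERVAL seconds between the runs.
watch_sketch() {
  count=$1
  interval=$2
  shift 2
  while [ "$count" -gt 0 ]; do
    "$@"
    count=$((count - 1))
    # Don't sleep after the last run.
    if [ "$count" -gt 0 ]; then sleep "$interval"; fi
  done
}

# Roughly what watch -n 1 'df -h; ls -FlAt /path' does, limited to 3 rounds:
#   watch_sketch 3 1 sh -c 'df -h; ls -FlAt /path'
watch_sketch 2 0 echo tick
```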

#24. Mount a remote folder through SSH

$ sshfs name@server:/path/to/folder /path/to/mount/point

That's right, you can mount a remote directory locally via SSH! You'll first need to install two programs however:

  • FUSE, which allows filesystems to be implemented in userspace programs, and
  • the sshfs client, which uses FUSE and sftp (the SSH File Transfer Protocol - it comes with OpenSSH, and is on your system already) to access the remote host.

And that's it, now you can use sshfs to mount remote directories via SSH.

To unmount, use fusermount:

$ fusermount -u /path/to/mount/point

#25. Read Wikipedia via DNS

$ dig +short txt <keyword>.wp.dg.cx

This is probably the most interesting one-liner today. David Leadbeater created a DNS server which, when queried for the TXT record type, returns a short plain-text version of a Wikipedia article. Here is his presentation on how he did it.

Here is an example, let's find out what "hacker" means:

$ dig +short txt hacker.wp.dg.cx

"Hacker may refer to: Hacker (computer security), someone involved
in computer security/insecurity, Hacker (programmer subculture), a
programmer subculture originating in the US academia in the 1960s,
which is nowadays mainly notable for the free software/" "open
source movement, Hacker (hobbyist), an enthusiastic home computer
hobbyist http://a.vu/w:Hacker"

The one-liner uses dig, the standard sysadmin's utility for DNS troubleshooting, to do the DNS query. The +short option makes it output only the returned text response, and txt makes it query the TXT record type.
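
Notice the stray /" "open in the middle of the output above. A single string inside a TXT record can hold at most 255 bytes, so a longer answer is split into several quoted strings and dig prints them one after another. Here is a small sketch that glues them back together with sed. The join_txt name is made up, and the input is a shortened sample rather than a live DNS answer:

```shell
# Remove the '" "' seams between the chunks, then the outer quotes.
join_txt() {
  sed -e 's/" "//g' -e 's/^"//' -e 's/"$//'
}

printf '%s\n' '"free software/" "open source movement"' | join_txt
```

With a real answer you would pipe the dig output through it, for example dig +short txt hacker.wp.dg.cx | join_txt.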

This one-liner is actually alias-worthy, so let's make a shell function for it:

$ wiki() { dig +short txt "$1.wp.dg.cx"; }

Try it out:

$ wiki hacker

"Hacker may refer to: Hacker (computer security), ..."

It works!

If you don't have dig, you can use host, which also performs DNS lookups:

$ host -t txt hacker.wp.dg.cx

#26. Download a website recursively with wget

$ wget --random-wait -r -p -e robots=off -U Mozilla www.example.com

This one-liner does what it says. Here is the explanation of the arguments:

  • --random-wait - vary the time between requests between 0.5 and 1.5 times the value given with --wait.
  • -r - turn on recursive retrieving.
  • -p - download all page requisites (images, stylesheets, etc.) that are needed to display the pages.
  • -e robots=off - ignore robots.txt.
  • -U Mozilla - set the "User-Agent" header to "Mozilla". Though a better choice is a real User-Agent like "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729)".

Some other useful options are:

  • --limit-rate=20k - limit the download speed to 20 KB/s.
  • -o logfile.txt - write wget's log messages to logfile.txt.
  • -l 0 - remove the recursion depth limit (which is 5 by default).
  • --wait=1h - be sneaky, download one file every hour.

#27. Copy the arguments of the most recent command

ALT + . (or ESC + .)

This keyboard shortcut works in the shell's emacs editing mode only. It copies the last argument from the last command to the current command. Here is an example:

$ echo a b c
a b c

$ echo <Press ALT + .>
$ echo c

If you repeat the command, it copies the last argument from the command before the last, then if you repeat again, it copies the last argument from the command before the command before the last, etc.

Here is an example:

$ echo 1 2 3
1 2 3
$ echo a b c
a b c

$ echo <Press ALT + .>
$ echo c

$ echo <Press ALT + .> again
$ echo 3

However, if you wish to get the 1st or 2nd or n-th argument, use the digit-argument command ALT + 1 (or ESC + 1) or ALT + 2 (or ESC + 2), etc. Here is an example:

$ echo a b c
a b c

$ echo <Press ALT + 1> <Press ALT + .>
$ echo a
a

$ echo <Press ALT + 2> <Press ALT + .>
$ echo b
b

See my article on Emacs Editing Mode Keyboard Shortcuts for a tutorial and a cheat sheet of all the shortcuts.

#28. Execute a command without saving it in the history

$ <space>command

This one-liner works at least in bash; I haven't tested other shells.

If you start your command with a space, it won't be saved to bash history (the ~/.bash_history file). This behavior is controlled by the $HISTIGNORE shell variable. Mine is set to HISTIGNORE="&:[ ]*", which means don't save repeated commands to history, and don't save commands that start with a space to history. The values in $HISTIGNORE are colon-separated.
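
As a sketch, here is what the relevant lines in ~/.bashrc could look like. The HISTIGNORE value is the one quoted above. HISTCONTROL is a second bash variable that can do the same job without patterns; you only need one of the two, and they are shown together purely for comparison:

```shell
# Fragment for ~/.bashrc (bash only).

# Pattern-based: "&" matches a repeat of the previous command,
# "[ ]*" matches any command that starts with a space.
HISTIGNORE="&:[ ]*"

# Keyword-based alternative: ignorespace skips commands that start
# with a space, ignoredups skips repeats, ignoreboth does both.
HISTCONTROL=ignoreboth
```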

If you're interested, see my article "The Definitive Guide to Bash Command Line History" for a short tutorial on how to work with shell history and a summary cheat sheet.

#29. Show the size of all sub folders in the current directory

$ du -h --max-depth=1

The --max-depth=1 argument causes du to summarize disk usage statistics for directories that are depth 1 from the current directory, that is, all directories in the current directory. The -h argument makes the summary human-readable, that is, it displays 5.0M instead of 5120 (the size in 1 KB blocks).

If you are interested in both sub folder size and file size in the current directory, you can use the shorter:

$ du -sh *
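
To see the biggest entries last, the human-readable sizes can be piped through sort -h. This is a sketch that assumes a sort with the -h (human-numeric) option, such as the one in GNU coreutils. The sample_du helper and its sizes are made up so that the snippet doesn't depend on your disk:

```shell
# Hypothetical helper: prints tab-separated lines shaped like du -h output.
sample_du() {
  printf '%s\t%s\n' 4.0K ./notes 1.2G ./videos 15M ./photos
}

# -h compares 4.0K < 15M < 1.2G instead of comparing the bare numbers.
sample_du | sort -h

# With real data:
#   du -h --max-depth=1 | sort -h
```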

#30. Display the top ten running processes sorted by memory usage

$ ps aux | sort -nk +4 | tail

This is certainly not the best way to display the top ten processes that consume the most memory, but, hey, it works.

It takes the output of ps aux, sorts it numerically by the 4th column (%MEM) and then uses tail to output the last ten lines, which happen to be the processes with the biggest memory consumption.

If I were to find out who consumes the most memory, I'd simply use htop or top and not ps.
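
Here is a small sketch of the sort-and-tail step on made-up data, so you can see what happens to each line. The sample_ps and top_mem names are hypothetical, the sample has fewer columns than real ps aux output (but %MEM is still the 4th), and the sort key is written in the more conventional -k 4 form:

```shell
# Hypothetical helper: a few lines shaped like ps aux output.
sample_ps() {
  printf '%s\n' \
    'USER PID %CPU %MEM COMMAND' \
    'root 1 0.0 0.1 init' \
    'www 812 1.5 12.4 httpd' \
    'db 955 0.7 30.2 mysqld' \
    'me 1203 0.1 2.3 bash'
}

# Sort numerically on the 4th column and keep the last N lines (default 10).
top_mem() {
  sort -n -k 4 | tail -n "${1:-10}"
}

sample_ps | top_mem 2
```

The header line has no number in the 4th column, so it counts as 0 and sorts to the top, where tail cuts it off once there are more than ten processes.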

Bonus one-liner: Start an SMTP server

$ python -m smtpd -n -c DebuggingServer localhost:1025

This one-liner starts an SMTP server on port 1025. It uses Python's standard library smtpd (specified by -m smtpd) and passes it three arguments - -n, -c DebuggingServer and localhost:1025.

The -n argument tells Python not to setuid (change user) to "nobody" - it makes the code run under your user.

The -c DebuggingServer argument tells Python to use the DebuggingServer class as the SMTP implementation, which prints each message it receives to stdout.

The localhost:1025 argument tells Python to start the SMTP server on localhost, port 1025.

However, if you wish to start it on the standard port 25, you'll have to use the sudo command, because only root is allowed to start services on ports 1-1023. These are also known as privileged ports.

$ sudo python -m smtpd -n -c DebuggingServer localhost:25

This one-liner was coined by Evan Culver. Thanks to him!

That's it for today,

but be sure to come back the next time for "Yet Another Ten One-Liners from CommandLineFu Explained!"