Last time I showed how to override functions in shared libraries by compiling your own shared library and preloading it via the LD_PRELOAD environment variable. Today I'll show you how to call the original function from the overridden function.

First let's review the code example that we used in the previous article. We had a program called prog.c that simply used fopen:

#include <stdio.h>

int main(void) {
    printf("Calling the fopen() function...\n");

    FILE *fd = fopen("test.txt", "r");
    if (!fd) {
        printf("fopen() returned NULL\n");
        return 1;
    }

    printf("fopen() succeeded\n");

    return 0;
}

Today let's write a shared library from a file called myfopen.c that overrides the fopen used by prog.c and calls the original fopen from the C standard library:

#define _GNU_SOURCE

#include <stdio.h>
#include <dlfcn.h>

FILE *fopen(const char *path, const char *mode) {
    printf("In our own fopen, opening %s\n", path);

    FILE *(*original_fopen)(const char*, const char*);
    original_fopen = dlsym(RTLD_NEXT, "fopen");
    return (*original_fopen)(path, mode);
}

This shared library exports the fopen function that prints the path and then uses dlsym with the RTLD_NEXT pseudohandle to find the original fopen function. We must define the _GNU_SOURCE feature test macro in order to get the RTLD_NEXT definition from <dlfcn.h>. RTLD_NEXT finds the next occurrence of a function in the search order after the current library.

We can compile this shared library this way:

gcc -Wall -fPIC -shared -o myfopen.so myfopen.c -ldl

Now when we preload it and run prog we get the following output that shows that test.txt was successfully opened:

$ LD_PRELOAD=./myfopen.so ./prog
Calling the fopen() function...
In our own fopen, opening test.txt
fopen() succeeded
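The wrapper above looks up the original fopen on every call and never checks what dlsym returned. Here's a slightly more defensive sketch of the same idea, assuming you'd rather cache the pointer and fail cleanly when the lookup comes back empty. The fopen_fn typedef and the messages are my own choices for illustration, and the caching is not written with threads in mind:

```c
#define _GNU_SOURCE

#include <stdio.h>
#include <dlfcn.h>

/* Pointer type matching fopen's signature; the name is ours, not libc's. */
typedef FILE *(*fopen_fn)(const char *, const char *);

FILE *fopen(const char *path, const char *mode) {
    /* Cache the lookup so dlsym runs only on the first call. */
    static fopen_fn original_fopen = NULL;

    if (!original_fopen) {
        original_fopen = (fopen_fn)dlsym(RTLD_NEXT, "fopen");
        if (!original_fopen) {
            /* No other fopen was found later in the search order. */
            const char *err = dlerror();
            fprintf(stderr, "could not find the original fopen: %s\n",
                    err ? err : "symbol not found");
            return NULL;
        }
    }

    /* Log to stderr so the program's own stdout stays untouched. */
    fprintf(stderr, "In our own fopen, opening %s\n", path);
    return original_fopen(path, mode);
}
```

It builds and preloads the same way as the version above.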

This is really useful if you need to change how a part of a program works or do some advanced debugging. Next time we'll look at how LD_PRELOAD is implemented.

Remember my previous two blog posts about publishing 30 and 20 of my projects to github? Here are another 10 projects in no particular order that I've written since then.

I really love writing open source projects. Everyone should publish all their projects to github. Good or bad. Like my friend just said: "Publish more open source, not less. 'Too much noise' is a terrible excuse. Publish wisdom so we may use. Publish mistakes so we may learn."

If you find my projects interesting, consider following me on github! Thank you!

Shorten urls with bitly without api

This is a Perl module that shortens urls via bitly without using their API. I once had to shorten more than a few hundred urls quickly so I wrote this.

Here's an example of how to use it:

use feature 'say';
use bitly;

my $bitly = bitly->new('username', 'password');
my $url = $bitly->shorten('http://www.url.com');

unless ($url) {
    say $bitly->{error};
}
else {
    say $url;
}

Speak text files to wav via Microsoft Speech API

I once had this idea of converting my blog posts to mp3. So I wrote a C++ program that uses Microsoft's Speech API. It takes a txt file as input and produces a wav file as output.

Usage:

speak.exe <voice name> <text file> <wav file>
-or-
speak.exe --list-voices

Compile this project with Microsoft Visual Studio 2008 or later.

Bonus: At first I tried using Loquendo SDK and created a TextToWav project but their speech engine wasn't too great so I created speak-text-files-to-wav.

Load status server for Windows

This simple C++ program creates a single-threaded TCP server that replies with information in JSON format about the machine's CPU, disk, and memory usage on Windows. I created this because I had to monitor Windows servers at Browserling.

Here's an example:

C:\bin\> load-status-server.exe
Started server on port 7000

Now if you connect to server.com:7000, it will send you JSON with the current server load info:

$ nc server.com 7000
{
    "memory": {
        "usage": 71,
        "total_physical": 1068388352,
        "free_physical": 302112768,
        "total_paging_file": 3096842240,
        "free_paging_file": 2155446272,
        "total_virtual": 2147352576,
        "free_virtual": 2130919424,
        "free_extended_virtual": 0
    },
    "cpu": {
        "load": 5
    },
    "disk": {
        "free_user": 22858027008,
        "free_total": 22858027008,
        "total": 42947571712
    }
}

Compile this project with Visual Studio or mingw. It uses just Win32 calls.

HWND finder

We started working on a new product at Browserling that takes screenshots so I wrote a bunch of code that finds browser window HWNDs. Then I decided to open source some parts of it and created HWND finder. The idea is to have something like jQuery's syntax for finding HWNDs. We delayed launching this screenshot product so I haven't made any updates.

Here's an example:

#include "hwnd-finder.h"

HwndFinder hf;
HWND rendererHwnd = hf.find("Chrome_WidgetWin_1 > Chrome_WidgetWin_0 > Chrome_RenderWidgetHostHWND");

This finds the Chrome renderer's window handle. Here Chrome_RenderWidgetHostHWND is a child of Chrome_WidgetWin_0, which in turn is a child of Chrome_WidgetWin_1, the top-level window.

node-number-range

This is a node.js module that streams number ranges. Here are all the ranges it supports:

var range = require('number-range');

* range(10) - range from 0 to 9
* range(-10, 10) - range from -10 to 9 (-10, -9, ... 0, 1, ... 9)
* range(-10, 10, 2) - range from -10 to 8, skipping every 2nd element (-10, -8, ... 0, 2, 4, 6, 8)
* range(10, 0, 2) - reverse range from 10 to 1, skipping every 2nd element (10, 8, 6, 4, 2)
* range(10, 0) - reverse range from 10 to 1
* range('5..50') - range from 5 to 49
* range('50..44') - range from 50 to 45
* range('1,1.1..4') - range from 1 to 4 with increment of 0.1 (1, 1.1, 1.2, ... 3.9)
* range('4,3.9..1') - reverse range from 4 to 1 with decrement of 0.1
* range('[1..10]') - range from 1 to 10 (all inclusive)
* range('[10..1]') - range from 10 to 1 (all inclusive)
* range('[1..10)') - range from 1 to 9
* range('[10..1)') - range from 10 to 2
* range('(1..10]') - range from 2 to 10
* range('(10..1]') - range from 9 to 1
* range('(1..10)') - range from 2 to 9
* range('[5,10..50]') - range from 5 to 50 with a step of 5 (all inclusive)
* range('10..') - infinite range starting from 10
* range('(10..') - infinite range starting from 11

Very cool stuff. Especially the infinite ranges, which use the process.nextTick trick.
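To make the numeric forms in the list concrete, here's a minimal sketch in C of how I read range(start, end, step): the end is excluded, and the direction is inferred from whether start is above or below end. This is not the module's code. The number_range name, the output buffer, and the cap parameter are all made up for the illustration, and it covers neither the string forms nor the infinite ranges:

```c
#include <stddef.h>

/* Fill out[] with the half-open range from start towards end (end excluded)
   and return how many values were written, never more than cap.
   step is a positive magnitude; direction is inferred from start and end. */
size_t number_range(long start, long end, long step, long *out, size_t cap) {
    size_t n = 0;

    if (step <= 0)
        return 0;   /* a zero or negative step would never terminate sensibly */

    if (start <= end) {
        for (long v = start; v < end && n < cap; v += step)
            out[n++] = v;   /* counting up: -10, -8, ... 8 */
    } else {
        for (long v = start; v > end && n < cap; v -= step)
            out[n++] = v;   /* counting down: 10, 8, 6, 4, 2 */
    }
    return n;
}
```

With this reading, number_range(10, 0, 2, buf, 16) writes 10, 8, 6, 4, 2 into buf, which matches the range(10, 0, 2) line above.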

HTTP::Async::Retry

It's almost HTTP::Async::Retry. It's actually just an async_retry.pm file that you can drop into your project to do a quick hack. With a bit of effort it could be HTTP::Async::Retry.

I once had to scrape a lot of information so I used my favorite language Perl and the HTTP::Async module. A lot of URLs would time out as I was creating thousands of connections per second. At first I simply copied the retry code from hack to hack but at some point I'd had enough, so I wrote async_retry.pm that abstracts away the retries.

Here's an example:

use warnings;
use strict;

use HTTP::Request;
use async_retry qw/async_retry/;

my @urls = (
    'http://www.google.com/1',
    'http://www.google.com/2',
    'http://www.google.com/3',
    'http://www.google.com/',
    'http://www.google.com/5',
);

async_retry(
    {
        retries => 5
    },
    [
        map { HTTP::Request->new(GET => $_) } @urls
    ],
    sub {
        my ($req, $res) = @_;
        print $res->base, "\n";
    }
);

This code tries to get all those Google urls and retries each failed one up to 5 times. Once a url succeeds, or still fails after the retries, it calls the callback with the HTTP::Request and HTTP::Response objects.

HTML Keyboard Widget

This is just an on-screen keyboard widget for Browserling. I wrote it because people with weird keyboard layouts couldn't input various English characters in Browserling. We'll add it to Browserling soon (it's a planned feature.)

Here's what it looks like:

You can try a live demo here.

Cached browser badges

This project just creates cached browser badges for Testling, so that we don't have to generate them again as it's costly.

~/.ssh/authorized_keys ssh key manager

This project manages public ssh keys in the ~/.ssh/authorized_keys file. We'll use this at Browserling to manage the ssh keys for tunnels so that you can add, remove and list the keys. (It's a planned feature.)

Here's an example:

var sshManager = require('ssh-key-manager');
sshManager.addKey('pkrumins', 'ssh-rsa AAAAB3NzaC1y...', function (err) {
    if (err) {
        console.log(err);
        return;
    }
});

node-tree-kill

This is a node.js module that kills all processes in the process tree, including the given root process.

Here's an example:

var kill = require('tree-kill');
kill(301, 'SIGKILL');

In this example we kill all the child processes of the process with pid 301, including the process with pid 301 itself.

This module currently works on Linux only as it uses ps -o pid --no-headers --ppid PID to find the child pids of PID.

GitHub is awesome!

Push all your projects to github all the time! Don't let your project rot on your hard drive! Publish it to github! Publish wisdom so we may use. Publish mistakes so we may learn.

And just another reminder, I'd love if you followed me on github and twitter! :)

This is going to be a super short and super simple tutorial for beginners about LD_PRELOAD. If you're familiar with LD_PRELOAD, you'll learn nothing new. Otherwise keep reading!

Did you know you could override the C standard library's functions (such as printf, fopen, etc) with your own version of these functions in any program? In this article I'll teach you how this can be done through the LD_PRELOAD environment variable.

Let's start with a simple C program (prog.c):

#include <stdio.h>

int main(void) {
    printf("Calling the fopen() function...\n");

    FILE *fd = fopen("test.txt","r");
    if (!fd) {
        printf("fopen() returned NULL\n");
        return 1;
    }

    printf("fopen() succeeded\n");

    return 0;
}

The code above simply makes a call to the standard fopen function and then checks its return value. Now, let's compile and execute it:

$ ls
prog.c  test.txt

$ gcc prog.c -o prog

$ ls
prog  prog.c  test.txt

$ ./prog
Calling the fopen() function...
fopen() succeeded

Now let's write our own version of fopen that always fails:

#include <stdio.h>

FILE *fopen(const char *path, const char *mode) {
    printf("Always failing fopen\n");
    return NULL;
}

Let's call this file myfopen.c, and let's compile it as a shared library:

gcc -Wall -fPIC -shared -o myfopen.so myfopen.c

Now we can simply set LD_PRELOAD when running the program:

$ LD_PRELOAD=./myfopen.so ./prog
Calling the fopen() function...
Always failing fopen
fopen() returned NULL

As you can see, fopen got replaced with our own version that always fails. This is really handy if you have to debug or replace certain parts of libc or any other shared library.
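The trick isn't limited to fopen. As a sketch, assuming you wanted a dynamically linked program to see a frozen clock, the same approach could be applied to time(). The file name mytime.c and the FAKE_NOW constant are made up for this example:

```c
#include <time.h>

/* Hypothetical fixed timestamp, chosen only for illustration
   (2001-09-09 01:46:40 UTC). */
#define FAKE_NOW ((time_t)1000000000)

/* Same signature as libc's time(); when preloaded, this definition is
   found before the libc one. */
time_t time(time_t *t) {
    if (t)
        *t = FAKE_NOW;   /* also fill in the caller's variable, like the real one */
    return FAKE_NOW;
}
```

Saved as mytime.c, it compiles into a shared library with the same gcc flags as myfopen.c and is preloaded the same way. Programs that read the clock through other calls, such as gettimeofday or clock_gettime, wouldn't be affected by this.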

Next time I'll write about how LD_PRELOAD works internally.

We all know the regular expression character classes, right? There are 12 standard classes:

[:alnum:]  [:digit:]  [:punct:]
[:alpha:]  [:graph:]  [:space:]
[:blank:]  [:lower:]  [:upper:]
[:cntrl:]  [:print:]  [:xdigit:]

But have you seen a visual representation of what these classes match? Probably not. Therefore I created a visualization that illustrates which part of the ASCII set each character class matches. Call it a cheat sheet if you like:


small version, large version

A bunch of programs that I used

Just for my own reference, in case I ever need them again, here are the one-liners I used to create this cheat sheet:

perl -nle 'printf "%08b - %08b\n", map { hex "0x".(split / /)[0], hex "0x".(split / /)[1] } $_ '
perl -nle 'printf "%03o - %03o\n", map { (split / /)[0], (split / /)[1] } $_'

And I used this perl program to generate and check the red/green matches:

use warnings;
use strict;

my $red = "\e[31m";
my $green = "\e[32m";
my $clear = "\e[0m";

my ($start, $end) = @ARGV;

die 'start or end not given' unless defined $start && defined $end;

my @classes = qw/alnum alpha blank cntrl digit graph lower print punct space upper xdigit/;

for (map { chr } $start..$end) {
    for my $class (@classes) {
        print "${green}1${clear}" if /[[:$class:]]/;
        print "${red}0${clear}" unless /[[:$class:]]/;
    }
    print "\n"
}

Credits

I was inspired to create this visualization when I saw a similar table for C's ctype.h character classification functions.
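For comparison, here's a rough C counterpart of the Perl program above, built on the ctype.h functions that inspired the cheat sheet. It's a sketch under the assumption that the default "C" locale is in effect and only ASCII bytes are fed in; the class_row name and the 13-byte output buffer are my own choices:

```c
#include <ctype.h>

/* Write twelve '0'/'1' flags for byte c into out, followed by a NUL.
   Column order follows the Perl script's @classes list:
   alnum alpha blank cntrl digit graph lower print punct space upper xdigit */
void class_row(unsigned char c, char out[13]) {
    int flags[12] = {
        isalnum(c), isalpha(c), isblank(c), iscntrl(c),
        isdigit(c), isgraph(c), islower(c), isprint(c),
        ispunct(c), isspace(c), isupper(c), isxdigit(c)
    };

    /* The ctype functions return any nonzero value for a match,
       so normalize each result to a single character. */
    for (int i = 0; i < 12; i++)
        out[i] = flags[i] ? '1' : '0';
    out[12] = '\0';
}
```

Calling class_row('a', row) gives "110001110001": a lowercase a is alnum, alpha, graph, lower, print and xdigit, and nothing else.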

February 24, 2013

TCP Traceroute

Did you know you could traceroute over the TCP protocol?

The regular traceroute usually uses either the ICMP or the UDP protocol. Unfortunately firewalls and routers often block ICMP completely or disallow ICMP echo requests (ping requests), and/or block various UDP ports.

However, firewalls and routers rarely drop TCP traffic on port 80 because it's the web's port.

Check this out. Let's try to traceroute www.microsoft.com using ICMP protocol:

# traceroute -I www.microsoft.com  
traceroute to www.microsoft.com (65.55.57.27), 30 hops max, 60 byte packets
 1  50.57.125.2 (50.57.125.2)  0.552 ms  0.647 ms  0.742 ms
 2  core1-aggr701a-3.ord1.rackspace.net (184.106.126.50)  0.415 ms  0.555 ms  0.653 ms
 3  corea.ord1.rackspace.net (184.106.126.128)  0.707 ms  0.873 ms  0.984 ms
 4  bbr1.ord1.rackspace.net (184.106.126.147)  1.345 ms  1.341 ms  1.337 ms
 5  * * *
 6  204.152.140.33 (204.152.140.33)  3.614 ms  3.747 ms  3.244 ms
 7  xe-0-2-0-0.ch1-96c-2b.ntwk.msn.net (207.46.46.49)  3.319 ms  4.019 ms  4.010 ms
 8  ge-7-0-0-0.co1-64c-1a.ntwk.msn.net (207.46.40.94)  53.543 ms  53.105 ms  53.074 ms
 9  xe-5-2-0-0.co1-96c-1b.ntwk.msn.net (207.46.40.165)  52.942 ms  52.710 ms  52.670 ms
10  * * *
11  * * *
12  * * *
13  * * *

We get lots of * * * and we have no idea how the packets reach www.microsoft.com.

Now let's try UDP traceroute:

# traceroute -U www.microsoft.com
traceroute to www.microsoft.com (65.55.57.27), 30 hops max, 60 byte packets
 1  50.57.125.2 (50.57.125.2)  0.529 ms  0.599 ms  0.662 ms
 2  core1-aggr701a-3.ord1.rackspace.net (184.106.126.50)  0.480 ms  0.571 ms  0.658 ms
 3  corea.ord1.rackspace.net (184.106.126.128)  0.507 ms corea.ord1.rackspace.net (184.106.126.124)  0.463 ms  0.569 ms
 4  bbr1.ord1.rackspace.net (184.106.126.145)  1.345 ms  1.322 ms  1.290 ms
 5  * * *
 6  * 204.152.140.35 (204.152.140.35)  2.697 ms *
 7  xe-0-2-0-0.ch1-96c-2b.ntwk.msn.net (207.46.46.49)  3.665 ms ge-7-0-0-0.co1-64c-1a.ntwk.msn.net (207.46.40.94)  53.363 ms  52.597 ms
 8  xe-3-1-0-0.co1-96c-1b.ntwk.msn.net (207.46.33.190)  52.284 ms  52.643 ms xe-0-1-0-0.co1-96c-1a.ntwk.msn.net (207.46.33.177)  52.665 ms
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *

Same. Finally let's try traceroute over TCP protocol port 80:

# traceroute -T -p 80 www.microsoft.com
traceroute to www.microsoft.com (65.55.57.27), 30 hops max, 60 byte packets
 1  50.57.125.2 (50.57.125.2)  0.540 ms  0.629 ms  0.709 ms
 2  core1-aggr701a-3.ord1.rackspace.net (184.106.126.50)  0.486 ms  0.604 ms  0.691 ms
 3  corea.ord1.rackspace.net (184.106.126.128)  0.511 ms corea.ord1.rackspace.net (184.106.126.124)  0.564 ms  0.810 ms
 4  bbr1.ord1.rackspace.net (184.106.126.147)  1.339 ms  1.310 ms bbr1.ord1.rackspace.net (184.106.126.145)  1.307 ms
 5  chi-8075.msn.net (206.223.119.27)  3.619 ms  2.560 ms  2.528 ms
 6  * 204.152.140.35 (204.152.140.35)  3.640 ms *
 7  ge-7-0-0-0.co1-64c-1a.ntwk.msn.net (207.46.40.94)  52.523 ms xe-0-2-0-0.ch1-96c-2b.ntwk.msn.net (207.46.46.49)  3.825 ms xe-1-2-0-0.ch1-96c-2b.ntwk.msn.net (207.46.46.53)  3.355 ms
 8  xe-0-1-0-0.co1-96c-1a.ntwk.msn.net (207.46.33.177)  61.042 ms  61.032 ms  60.457 ms
 9  * * xe-5-2-0-0.co1-96c-1b.ntwk.msn.net (207.46.40.165)  100.069 ms
10  65.55.57.27 (65.55.57.27)  53.868 ms  53.038 ms  52.097 ms

A full network path to www.microsoft.com!

There are various traceroute implementations, and if your system doesn't have one that supports the TCP protocol, I suggest you either get the new modern implementation of traceroute, or get tcptraceroute by Michael Toren.