This article is part of the article series "MIT Introduction to Algorithms."

This is the second post in an article series about MIT's lecture course "Introduction to Algorithms."

I changed my mind a little about how I will be posting these lecture reviews. In the first post I said I would review two or three lectures at a time, but I have decided to group them by topic instead, which is more logical. Some lectures stand alone, and Lecture 3 is one of them.

Lecture three is all about a powerful technique in algorithm design called "Divide and Conquer." I won't be able to describe it better than the CLRS book, which says: "Divide and Conquer algorithms break the problem into several subproblems that are similar to the original problem but smaller in size, solve the subproblems recursively, and then combine these solutions to create a solution to the original problem."

There are three steps to applying the Divide and Conquer technique in practice:

  • Divide the problem into one or more subproblems.
  • Conquer subproblems by solving them recursively. If the subproblem sizes are small enough, however, just solve the subproblems in a straightforward manner.
  • Combine the solutions to the subproblems into the solution for the original problem.

An example of Divide and Conquer is the Merge Sort algorithm covered in lecture one:

  • Divide: Divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
  • Conquer: Sort the two subsequences recursively using Merge Sort.
  • Combine: Merge the two sorted subsequences to produce the sorted answer.

The recursion stops when the sequence to be sorted has length 1, in which case there is nothing else to be done, since every sequence of length 1 is already in sorted order!
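
To make the three steps concrete, here is a minimal Python sketch of Merge Sort (my own illustration, not the lecture's pseudo-code):

def merge_sort(seq):
    # Base case: a sequence of length 0 or 1 is already sorted.
    if len(seq) <= 1:
        return seq
    # Divide: split the sequence into two halves of roughly n/2 elements.
    mid = len(seq) // 2
    left = merge_sort(seq[:mid])    # Conquer: sort each half recursively.
    right = merge_sort(seq[mid:])
    # Combine: merge the two sorted halves.
    return merge(left, right)

def merge(left, right):
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]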

Lecture three presents various other algorithms which use the powerful Divide and Conquer technique.

It starts with the running time analysis of the Merge Sort algorithm and shows the general structure of recurrence equations generated by Divide and Conquer algorithms.

Then it moves on to the Binary Search algorithm, which finds an element in a sorted array in time proportional to the logarithm of the array's length; in asymptotic notation, its worst-case running time is O(lg(n)).

A naive algorithm that just scans through all elements of the array would have a time complexity of O(n), which is much, much worse!
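
For reference, here is a minimal iterative binary search in Python (again my own sketch, not code from the lecture):

def binary_search(array, x):
    # Assumes array is sorted in ascending order.
    # Returns the index of x, or -1 if x is not present.
    lo, hi = 0, len(array) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if array[mid] == x:
            return mid
        elif array[mid] < x:
            lo = mid + 1      # x can only be in the right half
        else:
            hi = mid - 1      # x can only be in the left half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))   # 3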

The next algorithm presented solves the problem stated as follows: given a number x and an integer n, compute x to the power n.

A naive algorithm would loop from 1 to n, multiplying the result by x each time:

Naive-Power(x, n):
  result = 1
  for i = 1 to n:
    result = x * result
  return result

This leads to a running time of O(n).

A much more clever approach is to use an algorithm called Recursive Powering. The idea for calculating x^n is to express x^n as:

  • x^(n/2) · x^(n/2), if n is even, or
  • x^((n-1)/2) · x^((n-1)/2) · x, if n is odd.

Here is the pseudo-code of this algorithm:

Recursive-Power(x, n):
  if n == 1
    return x
  if n is even
    y = Recursive-Power(x, n/2)
    return y*y
  else
    y = Recursive-Power(x, (n-1)/2)
    return y*y*x

Doing asymptotic analysis of this algorithm leads to O(lg(n)) running time, which is a great improvement over O(n).
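
Here is the same idea as a runnable Python sketch (my own code; it also handles n = 0, which the lecture's pseudo-code leaves out):

def recursive_power(x, n):
    # Computes x to the power n (n >= 0) using O(lg(n)) multiplications.
    if n == 0:
        return 1
    if n == 1:
        return x
    if n % 2 == 0:
        y = recursive_power(x, n // 2)
        return y * y
    else:
        y = recursive_power(x, (n - 1) // 2)
        return y * y * x

print(recursive_power(2, 10))  # 1024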

The lecture continues with four algorithms on how to compute Fibonacci Numbers! You should definitely watch that! Man, it's four algorithms! It's at 17:49 in the lecture!

I won't give the pseudocode for these here, but they are: the naive recursive algorithm, the bottom-up algorithm, naive recursive squaring and recursive squaring!
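
To give a taste of two of them, here is a minimal Python sketch of the bottom-up algorithm and of recursive squaring via the 2x2 Fibonacci matrix (my own code, not the lecture's):

def fib_bottom_up(n):
    # O(n) time: build the sequence from the bottom up.
    a, b = 0, 1                 # F(0), F(1)
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_recursive_squaring(n):
    # O(lg(n)) time: [[1,1],[1,0]]^n = [[F(n+1),F(n)],[F(n),F(n-1)]],
    # and the matrix power is computed by recursive squaring.
    def mat_mult(A, B):
        return [[A[0][0]*B[0][0] + A[0][1]*B[1][0],
                 A[0][0]*B[0][1] + A[0][1]*B[1][1]],
                [A[1][0]*B[0][0] + A[1][1]*B[1][0],
                 A[1][0]*B[0][1] + A[1][1]*B[1][1]]]
    def mat_pow(M, k):
        if k == 1:
            return M
        half = mat_pow(M, k // 2)
        result = mat_mult(half, half)
        if k % 2 == 1:
            result = mat_mult(result, M)
        return result
    if n == 0:
        return 0
    return mat_pow([[1, 1], [1, 0]], n)[0][1]

print(fib_bottom_up(10))            # 55
print(fib_recursive_squaring(10))   # 55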

The next two topics are Matrix Multiplication the naive way, giving O(n^3) running time, and the improved Strassen's Algorithm, which reduces the number of multiplications, yielding a running time of approximately O(n^2.81).
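
For reference, here is the naive algorithm as a minimal Python sketch (my own code). Strassen's trick, covered in the lecture, is to multiply the 2x2 block matrices with 7 recursive multiplications instead of 8, which is where the O(n^lg(7)) ≈ O(n^2.81) bound comes from.

def naive_matrix_multiply(A, B):
    # Multiplies two n x n matrices in O(n^3) time with three nested loops.
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

print(naive_matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]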

The last topic covered in this lecture is the VLSI (Very Large Scale Integration) layout problem: given a number of electronic gates and chips, how to position them on a circuit board so as to minimize the area they occupy.

Now that I have described the lecture in some detail, you're welcome to watch it:

Topics covered in lecture 3:

  • [01:25] Divide and conquer method.
  • [04:54] Example of Divide and conquer applied to Merge sort.
  • [06:45] Running time analysis of Merge sort.
  • [09:55] Binary search algorithm.
  • [12:00] Analysis of binary search running time.
  • [13:15] Algorithm for powering a number (recursive powering).
  • [17:49] Algorithms for computing Fibonacci numbers (FBs).
  • [19:04] Naive recursive algorithm (exponential time) for computing FBs.
  • [22:45] Bottom-up algorithm for computing FBs.
  • [24:25] Naive recursive squaring algorithm for FBs (doesn't work because of floating point rounding errors).
  • [27:00] Recursive squaring algorithm for FBs.
  • [29:54] Proof by induction of the recursive squaring FB algorithm.
  • [34:00] Matrix multiplication algorithms.
  • [35:45] Naive (standard) algorithm for multiplying matrices.
  • [37:35] Divide and conquer algorithm for multiplying matrices.
  • [40:00] Running time analysis of divide and conquer matrix multiplication algorithm.
  • [43:09] Strassen's matrix multiplication algorithm.
  • [50:00] Analysis of Strassen's algorithm.
  • [55:25] VLSI layout problem.
  • [57:50] Naive embedding algorithm for VLSI.
  • [01:05:10] Divide and conquer algorithm for laying out VLSI.

Lecture 3 notes:

Lecture 3, page 1 of 2.

Lecture 3, page 2 of 2.

Have fun dividing and conquering! The next post will be about two lectures on sorting algorithms!

Ps. The lectures are taught from the CLRS book (also called "Introduction to Algorithms").

This article is part of the article series "MIT Introduction to Algorithms."

I just finished watching the last lecture of MIT's "Introduction to Algorithms" course. Having a great passion for all aspects of computing, I decided to share everything I learned with you, my dear readers! This is the first post in an article series about this course.

As I wrote earlier, I am very serious about watching video lectures. If they are math-intensive, I usually take notes as if I were in the classroom. Lectures in this course were exactly like that -- logarithms, big-o's, thetas, expectations, and all the other math guys fighting with each other on the blackboards.

There are 23 video lectures in total, each around 1 hour 20 minutes long. I will be posting about 2 - 3 lectures at a time, which will result in approximately 10 blog posts. Each post will contain an annotated lecture, along with the embedded video and a time-stamped list of topics covered in the lecture. I will also post the notes I took myself as I watched the lectures (I actually just bought a scanner (Canon CanoScan 4400F) just for this purpose!).

Understanding and designing effective algorithms is a very important skill for a top-notch programmer. You can still do well without knowing much about algorithms, but knowing them makes you superior. There are two kinds of people: those who can design effective algorithms and those who can't. ;)

Let's start with Lecture 1 of this course.

Lecture 1: Analysis of Algorithms

The first lecture is given by the famous professor Charles E. Leiserson. He is the L in CLRS. If that doesn't ring a bell - CLRS is one of the most popular books on algorithms!

He starts the lecture by explaining what this course and algorithms will be all about. He says that this course will be about "Analysis of Algorithms" and states: "Analysis of algorithms is the theoretical study of computer program performance and resource usage".

Designing great software is not just about performance. Charles presents a list of 12 things that can be more important than performance. Just for comparison with the algorithms guru, what do you think can be more important than performance?

Here is the list of things more important than performance that Charles presented:

  • modularity,
  • correctness,
  • maintainability,
  • security,
  • functionality,
  • robustness,
  • user-friendliness,
  • programmer's time,
  • simplicity,
  • extensibility,
  • reliability, and
  • scalability.

He also asks, "Why study algorithms and performance at all?" He and the students answer:

  • Sometimes performance is correlated with user-friendliness.
  • Performance draws the line between the feasible and the infeasible.
  • Algorithms give us a language for talking about program behavior.
  • Performance can be used to "pay" for other things, such as security, features and user-friendliness.

The lecture continues with the definition of the Sorting Problem - given a sequence (a1, a2, ..., an) of numbers, permute them in such a way that a1 <= a2 <= ... <= an.

There are various algorithms that solve this problem. Two of them are presented in the lecture - Insertion Sort and Merge Sort.

The running time of these algorithms is analyzed by introducing Asymptotic Analysis and Recursion Trees.
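
If you have never seen Insertion Sort, here is a minimal Python sketch of it (my own code, not the lecture's pseudo-code):

def insertion_sort(array):
    # Sorts the array in place in O(n^2) worst-case time.
    for j in range(1, len(array)):
        key = array[j]
        i = j - 1
        # Shift elements of the sorted prefix that are greater than key
        # one position to the right, then drop key into the gap.
        while i >= 0 and array[i] > key:
            array[i + 1] = array[i]
            i -= 1
        array[i + 1] = key
    return array

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]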

Here is the whole lecture:

Topics covered in lecture 1:

  • [17:15] Main topic of the course - Analysis of algorithms.
  • [19:00] What's more important than performance?
  • [22:03] Why study algorithms and performance?
  • [27:45] The sorting problem.
  • [29:30] Insertion sort algorithm.
  • [34:30] Example of Insertion sort.
  • [36:25] Running time of algorithms.
  • [39:39] Definition of worst-case, average-case and best-case types of analysis.
  • [46:50] How to analyze the Insertion sort's worst-case running time?
  • [49:28] BIG IDEA - Asymptotic analysis.
  • [50:49] Asymptotic notation - theta notation.
  • [57:14] Insertion sort analysis.
  • [01:02:42] Is Insertion sort fast?
  • [01:03:40] Merge sort algorithm.
  • [01:05:25] Example of Merge subroutine of Merge sort.
  • [01:08:15] Analysis of Merge sort's running time.
  • [01:10:55] Recurrence equation for Merge sort.
  • [01:13:15] Recursion tree solution of the Merge sort's recurrence equation.

Lecture 1 notes:

Lecture 1, page 1 of 2.

Lecture 1, page 2 of 2.

Lecture 2: Asymptotic Notation

Lecture 2, on the other hand, is given by the genius professor Erik Demaine. He is the youngest professor in the history of MIT! He became a professor at MIT at 20! Wow!

This lecture is all about mathematical notation (Asymptotic Notation) used in the analysis of algorithms. It's the big-o notation, big omega notation, theta notation, small-o and small-omega notation.

The second half of the lecture is devoted to solving recurrence equations. Three methods are presented (a quick worked example of the Master method follows the list):

  • Substitution method,
  • Recursion-tree method, and
  • The Master method.
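
As a quick illustration of the Master method (my own example, not a transcript of the lecture): Merge Sort's recurrence is T(n) = 2T(n/2) + O(n), so a = 2, b = 2 and f(n) = O(n). Comparing f(n) with n^(log_b(a)) = n^(log_2(2)) = n, we see they grow at the same rate, which is case 2 of the Master theorem, so the solution is T(n) = O(n*lg(n)).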

Here is the whole lecture:

Topics covered in lecture 2:

  • [01:25] Big-o (upper bounds) notation.
  • [03:58] Set definition of big-o notation.
  • [05:25] The meaning of O(h(n)) in notation f(n) = g(n) + O(h(n)).
  • [10:20] Big-omega (lower bounds) notation.
  • [11:40] Analogies of O, Ω and Θ to comparison operations of real numbers.
  • [12:28] Theta (tight bounds) notation.
  • [13:40] Small-o and small-omega notation.
  • [17:03] Solving recurrences: substitution method.
  • [37:56] Recursion-tree method.
  • [49:00] The Master method.
  • [01:02:00] Proof sketch of the Master method.

Lecture 2 notes:

Lecture 2, page 1 of 2.

Lecture 2, page 2 of 2.

Lecture 2. Sketch of the Master theorem proof.

Have fun absorbing all this information! Until next post!

Ps. It turned out that the lectures were not available anywhere but MIT's OCW website. I found that they were released under a CC license, which allowed me to upload them to Google Video, so I can embed them in these posts!

Pps. The lectures are taught from the CLRS book (also called "Introduction to Algorithms").

This article is part of the article series "Musical Geek Friday."

After a few weeks of silence, the Musical Geek Friday is back! This time it's a song dedicated to all the hackers and the crackers!

The song is written and performed by a guy calling himself Zearle. It's very interesting to read his autobiography, where, among other crazy things, he says that during his childhood the FBI was closing in on his family, and that his first years were spent underground, moving from one small town to another.

The Song for Hackers and Crackers was recorded in 1999 and it is included in his "Class War" music album (downloadable for free).

He says the song was written in praise of warez groups. Thanks to them, he was able to refine his computer skills and get software for making music.

The music tracks he uses in his songs are mostly created by unsigned hiphop beatmakers with R&B/soul, reggae, folk, and acid jazz backgrounds.

This song is NSFW -- Not Suitable for Work, as it contains explicit language. Though, you can still listen to it on your headphones :)

Contextually, this song is similar to another song I posted on Musical Geek Friday #2: Leech Access is Coming At You.

Here it is! Song for Hackers and Crackers:

[audio:http://www.catonmat.net/download/zearle-hackers_and_crackers.mp3]

Download this song: song for hackers and crackers.mp3 (musical geek friday #13)
Downloaded: 58274 times

Download lyrics: song for hackers and crackers lyrics (musical geek friday #13)
Downloaded: 3553 times

Song lyrics (I censored the explicit language, see the 'download lyrics' link above for the uncensored version):

UH! This is dedicated to all the hackers and the crackers.
You know what i'm sayin'.
If you hear this, want you to.. bump this and do your dirt dog, do your dirt.

You can fill a stadium with cats using s**t cracked by Radium.
Liquid Sky will never die.
Before the crackers and the hackers, life was wacker.
Software we could never stack for.
We were floored by the evil hoards.
All the best s**t we could never afford.
The internet and mp3, they set us free.
I'm downloading Windows ME right now off FTP.
You b*****s will never find me!
Because I'm everywhere, see.
Yo dog, you got something new?
Hit me off on ICQ.
I'll help you back in jive.
I got some top-notch s**t on i-drive.
You shut one site down, we multiply.
Like a phoenix from a different place we rise.
Then you got the squares;
"Yo dude, your stealing, don't you care?"
L-O-L, b***h. I got a terabyte of warez.
I like my files zip, sit, or rar.
I download seven hundred segments from a thousand sites.
I drink coffee, snort speed; I'll stay up all night.
I'll hack a companies site, and blank it white.
Ghost ISP's; Find me, please...

Dedicated to the hackers and the crackers.
This is dedicated to the hackers and the crackers.
The ones that set us free.
Thanks G for setting us free.
This is dedicated to the hackers and the crackers.

I see in binary, I speak source code.
Step on my toes, I'll post a million jpg's.
If you in a fag pose.
And your digital stance getting firewalled hoe.
Cause I'm ridin' the net and my six four (old school car sixty four is the year).
I was a dick with my 56(K).
Now with my cable -- I'm able to get that stable.
On the out my name is Ace and I'm a Leo.
On the digital highway, my name is Neo and I'm a hero.
In a flash I'll school you on burning dreamcasts.
You need some ISO's?
Let me through my hard drive rifle.
Our exchange, you could never stifle.
With a digital hug, you just caught the lovebug.
I've bootlegged your CD.
I caused the fight between UN and Jay Z.
You see G?
It's all gonna be free.
Whether we take it with force or we take it nicely.
You feel that rattle in you bones?
When I tell you we just hacked DOW Jones?
And NASDAQ leaves you fighting on your back?

Cause I'm the he who loves to hack and crack.
Cause I'm the he who loves to hack and crack.
This is dedicated to the hackers and the crackers.
Serial codes, source codes, ISO's, rar's, zip's, sit's.
This is dedicated to the hackers and the crackers.

Your encryption is primitive egyptian.
I'll do more with a 486 and a Plextor.
We've won when six billion got Athlons.
And we tell each other how to get it on.
Cyber-army's and the Pentagon were storming.
I just found out who killed JFK.
The smoking gun will have it within a day.
All the lies they've been faster.
Go check the name "truth" on Napster.
Digital-disaster.
The p****s will never find me in this matrix.
A million keyboard voices all named Morpheus.
This is dedicated to those who set me free.
Got a buzz? Your always my cuz.
Radium, if i could only say to them; Thanks.
Kalisto, you know!
Utopia, and all my digital dogs.
My netgangs, my cybergangsters, my I/O-warriors, my computercomrades.
This is for you, this is dedicated to the hackers and the crackers.
I wouldn't be able to do s**t without ya'll man.
I'll be sittin' in front of my f**king computer doing a goddamn thing, playing games.
You have all made it possible. This is dedicated to you!

Download "Song for Hackers and Crackers"

Download this song: song for hackers and crackers.mp3 (musical geek friday #13)
Downloaded: 58274 times

Download lyrics: song for hackers and crackers lyrics (musical geek friday #13)
Downloaded: 3553 times

Click to listen:
[audio:http://www.catonmat.net/download/zearle-hackers_and_crackers.mp3]

Have fun and until next geeky Friday! :)

A few years ago I worked as a Linux system administrator at a small (few hundred users) Internet service provider. Among all the regular system administrator duties, I also had the privilege to write various software and tools for Linux. One of my tasks was to write a tool to record how much traffic each of the clients was using.

The network for this provider was laid out in a very simple way. The gateway to the Internet was a single Linux box, which was a router, a firewall and performed traffic shaping. Now it had to be extended to do traffic accounting as well.

Simplified network diagram; all that matters is that the gateway is a Linux box.

At that time I had already mastered IPTables, and I had noticed that when listing the existing rules, iptables would display a packet count and a total byte count for each rule. I thought, yeah, why not use this for accounting? So I did. I created an empty rule (one with no target, so packets just pass through it and continue down the chain) for each user's IP address, and a script that extracted the byte counts.

Here is a detailed explanation of how I did it exactly.

First, let's see what iptables shows us when we have just booted up.

# iptables -L -n -v -x
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

At this moment we have no rules added. That's alright, let's just get familiar with the output we will be interested in once we have some rules. Notice the 'pkts' and 'bytes' columns. The 'pkts' column is the total number of packets matched by the rule, and the 'bytes' column is the total number of bytes matched by the rule. Notice also the three so-called "chains" - INPUT, FORWARD and OUTPUT. The INPUT chain is for packets destined to the Linux box itself, the OUTPUT chain is for packets leaving the Linux box (generated by programs running on it), and the FORWARD chain is for packets passing through the box.

You might also be interested in the command line arguments that I used:

  • -L lists all the rules.
  • -n does not resolve IP addresses (numeric output).
  • -v shows the packet and byte counters.
  • -x displays exact counter values (otherwise they get abbreviated to 200K, 3M, etc).

A more serious firewall might have the FORWARD chain filled up with various entries already. So as not to mess with them, let's create a new traffic accounting chain called TRAFFIC_ACCT:

# iptables -N TRAFFIC_ACCT
# iptables -L -n -v -x
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain TRAFFIC_ACCT (0 references)
    pkts      bytes target     prot opt in     out     source               destination

Now let's redirect all the traffic going through the machine to match the rules in the TRAFFIC_ACCT chain:

# iptables -I FORWARD -j TRAFFIC_ACCT
# iptables -L -n -v -x
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 TRAFFIC_ACCT  all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain TRAFFIC_ACCT (1 references)
    pkts      bytes target     prot opt in     out     source               destination

A side note: if you had a Linux desktop computer, then you could insert the same rule in the INPUT chain (iptables -I INPUT -j TRAFFIC_ACCT), as all the packets would be destined for your computer.

The iptables -L command can actually take the name of a chain, to list the rules of just that chain. From now on we will only be interested in the rules of the TRAFFIC_ACCT chain:

# iptables -L TRAFFIC_ACCT -n -v -x
Chain TRAFFIC_ACCT (1 references)
    pkts      bytes target     prot opt in     out     source               destination

Now, to illustrate the main idea, we can play with the rules. For example, let's do a breakdown of traffic by the tcp, udp and icmp protocols. To do that, we append three rules to the TRAFFIC_ACCT chain - one matching the tcp protocol, one matching udp, and the last one matching icmp.

# iptables -A TRAFFIC_ACCT -p tcp
# iptables -A TRAFFIC_ACCT -p udp
# iptables -A TRAFFIC_ACCT -p icmp

After some time has passed, let's look at what we have:

# iptables -L TRAFFIC_ACCT -n -v -x
Chain TRAFFIC_ACCT (1 references)
    pkts      bytes target     prot opt in     out     source               destination
    4356  2151124            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0
     119    15964            udp  --  *      *       0.0.0.0/0            0.0.0.0/0
       3      168            icmp --  *      *       0.0.0.0/0            0.0.0.0/0

We see that 4356 tcp packets totaling 2151124 bytes (about 2 megabytes) have passed through the firewall, along with 119 udp packets and 3 icmp packets!

You can zero the counters with the iptables -Z command:

# iptables -Z TRAFFIC_ACCT
# iptables -L TRAFFIC_ACCT -n -v -x
Chain TRAFFIC_ACCT (1 references)
    pkts      bytes target     prot opt in     out     source               destination
       0        0            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0
       0        0            udp  --  *      *       0.0.0.0/0            0.0.0.0/0
       0        0            icmp --  *      *       0.0.0.0/0            0.0.0.0/0

You can remove all the rules from the TRAFFIC_ACCT chain with the iptables -F command:

# iptables -F TRAFFIC_ACCT
# iptables -L TRAFFIC_ACCT -n -v -x
Chain TRAFFIC_ACCT (1 references)
    pkts      bytes target     prot opt in     out     source               destination

Another fun example you can do is count how many actual connections have been made:

# iptables -A TRAFFIC_ACCT -p tcp --syn
# iptables -L TRAFFIC_ACCT -n -v -x
Chain TRAFFIC_ACCT (1 references)
    pkts      bytes target     prot opt in     out     source               destination
       5      276            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp flags:0x16/0x02

This shows us that 5 tcp packets which start connections (SYN packets) have been sent - that is, 5 connection attempts. Pretty neat, isn't it?

What I did when I was working as a sysadmin was add a rule for each user's IP address to the TRAFFIC_ACCT chain. Then I periodically listed the counters, recorded the traffic, and zeroed them out.

You can even create two chains, TRAFFIC_ACCT_IN and TRAFFIC_ACCT_OUT, to match incoming and outgoing traffic separately:

# iptables -N TRAFFIC_ACCT_IN
# iptables -N TRAFFIC_ACCT_OUT
# iptables -I FORWARD -i eth0 -j TRAFFIC_ACCT_IN
# iptables -I FORWARD -o eth0 -j TRAFFIC_ACCT_OUT
# iptables -L -n -v -x
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 TRAFFIC_ACCT_OUT  all  --  *      eth0    0.0.0.0/0            0.0.0.0/0
       0        0 TRAFFIC_ACCT_IN  all  --  eth0   *       0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination

Chain TRAFFIC_ACCT_IN (1 references)
    pkts      bytes target     prot opt in     out     source               destination

Chain TRAFFIC_ACCT_OUT (1 references)
    pkts      bytes target     prot opt in     out     source               destination

For example, to record incoming and outgoing traffic usage of IP addresses 192.168.1.2 and 192.168.1.3 you would do:

# iptables -A TRAFFIC_ACCT_IN --dst 192.168.1.2
# iptables -A TRAFFIC_ACCT_IN --dst 192.168.1.3
# iptables -A TRAFFIC_ACCT_OUT --src 192.168.1.2
# iptables -A TRAFFIC_ACCT_OUT --src 192.168.1.3

And to list the rules:

# iptables -L TRAFFIC_ACCT_IN -n -v -x
Chain TRAFFIC_ACCT_IN (1 references)
    pkts      bytes target     prot opt in     out     source               destination
     368   362120            all  --  *      *       0.0.0.0/0            192.168.1.2
      61     9186            all  --  *      *       0.0.0.0/0            192.168.1.3

# iptables -L TRAFFIC_ACCT_OUT -n -v -x
Chain TRAFFIC_ACCT_OUT (1 references)
    pkts      bytes target     prot opt in     out     source               destination
     373    22687            all  --  *      *       192.168.1.2          0.0.0.0/0
     101    44711            all  --  *      *       192.168.1.3          0.0.0.0/0

That concludes it. You can see that it is trivial to do accurate traffic accounting on a Linux machine. In a future post, I might publish a program which displays the traffic in a nice, visual manner.

At the moment you can output the traffic in a nice manner with this combination of iptables and awk commands:

# iptables -L TRAFFIC_ACCT_IN -n -v -x | awk '$1 ~ /^[0-9]+$/ { printf "IP: %s, %d bytes\n", $8, $2 }'
IP: 192.168.1.2, 1437631 bytes
IP: 192.168.1.3, 449554 bytes

# iptables -L TRAFFIC_ACCT_OUT -n -v -x | awk '$1 ~ /^[0-9]+$/ { printf "IP: %s, %d bytes\n", $7, $2 }'
IP: 192.168.1.2, 88202 bytes
IP: 192.168.1.3, 244848 bytes
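
If you prefer Python to awk, here is a rough sketch of the kind of periodic recording script I mean (my own reconstruction, not the exact tool I wrote back then; it assumes the TRAFFIC_ACCT_IN and TRAFFIC_ACCT_OUT chains from above and needs to run as root):

import subprocess
import time

def read_counters(chain):
    # Parse `iptables -L <chain> -n -v -x` and return a {ip: bytes} dict.
    output = subprocess.check_output(
        ["iptables", "-L", chain, "-n", "-v", "-x"]).decode()
    counters = {}
    for line in output.splitlines():
        fields = line.split()
        # Rule lines start with the numeric packet counter; for these plain
        # rules the last two fields are the source and the destination.
        if fields and fields[0].isdigit():
            nbytes = int(fields[1])
            src, dst = fields[-2], fields[-1]
            ip = dst if chain.endswith("_IN") else src
            counters[ip] = counters.get(ip, 0) + nbytes
    return counters

def record(chain):
    for ip, nbytes in read_counters(chain).items():
        # Here you would write to a database; I just print.
        print("%s  %s  %d bytes" % (chain, ip, nbytes))
    subprocess.check_call(["iptables", "-Z", chain])  # zero out the counters

while True:
    record("TRAFFIC_ACCT_IN")
    record("TRAFFIC_ACCT_OUT")
    time.sleep(300)  # record traffic every 5 minutes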

I first learned IPTables from this tutorial. It's probably the best tutorial one can find on the subject.

If you did not understand any parts of the article, please let me know in the comments. I will update the post and explain those parts.

This is the third post in an article series about Python video lectures. The previous two posts covered learning the basics of Python and learning Python design patterns.

This video lecture is given by Google's "Über Tech Lead" Alex Martelli. In this video he talks about the most important language changes in each of the Python versions 2.2, 2.3, 2.4 and 2.5.

I am actually using an older version of Python, version 2.3.4. This lecture gave me a good insight into what new features to expect when I upgrade to a newer version of Python.

Here it is:

Interesting information from lecture:

  • [01:30] There are many implementations of Python - Jython, IronPython, pypy and CPython.
  • [03:02] Python 2.2 was a backwards-compatible revolution. It introduced new-style objects, descriptors, iterators and generators, nested scopes, a lot of new modules in standard library.
  • [04:12] New rule for introducing extra features: 2.N.* has no extra features with respect to 2.N.
  • [04:32] Python 2.2 highlights: metaclasses, closures, generators and iterators.
  • [05:35] Python 2.3 was a stable version of Python with no changes to the language.
  • [06:05] Python 2.3 had a lot of optimizations, tweaks and fixes, such as import-from-zip, Karatsuba multiplication algorithm, and new stdlib modules - bz2, csv, datetime, heapq, itertools, logging, optparse, textwrap, timeit, and many others.
  • [08:50] Python 2.3 highlights: zip imports, sum builtin, enumerate builtin, extended slices, universal newlines.
  • [09:50] Python 2.4 added two new language features - generator expressions and decorators. New builtins were added - sorted, reversed and set, frozenset. New modules - collections, cookielib, decimal, subprocess.
  • [13:00] Example of generator expressions and decorators (see the short sketch after this list).
  • [13:37] Example of sorted() and reversed() builtins.
  • [16:40] Python 2.5 was also evolution of language. It came with full support for RAII (with statement), introduced two new builtins - any and all, unified exceptions and added a ternary operator. New modules - ctypes, xml.etree, functools, hashlib, sqlite3, wsgiref, and others.
  • [18:40] Python 2.5 optimizations.
  • [23:25] RAII - Resource Acquisition Is Initialization.
  • [25:30] Examples of RAII.
  • [31:05] Python RAII is better than C++'s. Python's RAII can distinguish exception exits from normal ones.
  • [33:29] Example of writing your own context manager.
  • [36:30] Example of writing a RAII ready type with contextlib.
  • [38:05] Following Python's Zen, "Flat is better than nested", use contextlib.nested for multiple resources.
  • [40:40] Generator enhancements - yield can be inside a try clause, yield is now an expression (almost co-routines!).
  • [44:50] Python 2.5 absolute/relative imports.
  • [47:00] Joke - "If you exceed 200 dots when using relative imports, you have a serious psychological problem".
  • [47:45] Python 2.5 try/except/else/finally.
  • [48:55] Python 2.5 if/else ternary operator.
  • [49:35] Python 2.5 exceptions are new style.
  • [51:15] Python 2.5 any and all builtins.
  • [54:00] collections.defaultdict subclasses dict and overrides __missing__.
  • [56:55] ctypes is probably the most dangerous addition to Python. One mistake and you crash.
  • [01:01:30] hashlib replaces md5 and sha modules, and adds sha-(224|256|384|512). Uses OpenSSL as accelerator (if available).
  • [01:02:29] The lecture got cut here, but the presentation still had two slides on sqlite3 and wsgiref!
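
To make a few of these features concrete, here is a tiny Python sketch of my own (not taken from Alex's slides) showing a decorator, a generator expression, and a hand-written context manager for the with statement:

import time
from contextlib import contextmanager

# A decorator (Python 2.4): wraps a function to time its calls.
def timed(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print("%s took %.6f seconds" % (func.__name__, time.time() - start))
        return result
    return wrapper

@timed
def sum_of_squares(n):
    # A generator expression (Python 2.4): no intermediate list is built.
    return sum(i * i for i in range(n))

# A context manager (Python 2.5): the with statement guarantees cleanup,
# which is the RAII-style resource handling the lecture talks about.
@contextmanager
def opened(path):
    f = open(path)
    try:
        yield f
    finally:
        f.close()

print(sum_of_squares(1000))           # 332833500

with opened("/etc/hostname") as f:    # hypothetical example file
    print(f.readline().strip())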

Here is the timeline of Python versions. I hope Alex doesn't mind that I took it from his presentation. :)

python timeline of versions 2.2, 2.3 and 2.5

If you don't know what new-style objects are about, see these two tutorials:

Have fun writing better code in Python!