I wrote an awesome node.js module for use at StackVM - a module called node-lazy that does lazy list processing through events!

It comes in really handy when you need to treat a stream of events like a list. The best use case currently is returning a lazy list from an asynchronous function and having data pumped into it via events. In asynchronous programming you can't just return a regular list because you don't yet have the data for it. The usual solution so far has been to provide a callback that gets called when the data is available. But doing it this way you lose the power of chaining functions and creating pipes, which leads to interfaces that are not that nice. (See the 2nd example below to see how it improved the interface in one of my modules.)

Check out this toy example:

var Lazy = require('lazy');

var lazy = new Lazy;
lazy
  .filter(function (item) {
    return item % 2 == 0
  })
  .take(5)
  .map(function (item) {
    return item*2;
  })
  .join(function (xs) {
    console.log(xs);
  });

This code says that lazy is going to be a lazy list that keeps only the even numbers, takes the first five of them, multiplies each of them by 2, and then calls the join function (think of join as in threads) on the final list.

And now you can emit data events at some point later:

[0,1,2,3,4,5,6,7,8,9,10].forEach(function (x) {
  lazy.emit('data', x);
});

The output will be produced by the join function, which will output the expected [0, 4, 8, 12, 16].
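To get a feel for how a list like this can be driven by events, here is a minimal sketch. It is not node-lazy's actual code: TinyLazy and its derive and end helpers are names made up for illustration, and the sketch assumes nothing beyond node's built-in events module.

```javascript
const { EventEmitter } = require('events');

// TinyLazy is a hypothetical, stripped-down stand-in for a lazy list.
// It is NOT node-lazy's implementation, just an illustration of the idea.
class TinyLazy extends EventEmitter {
  // Create a downstream list; `wire` decides which items get forwarded.
  derive(wire) {
    const next = new TinyLazy();
    wire(next);
    // When this list ends, the downstream list ends too.
    this.on('end', () => next.end());
    return next;
  }

  // Emit 'end' at most once.
  end() {
    if (this.ended) return;
    this.ended = true;
    this.emit('end');
  }

  filter(pred) {
    return this.derive(next =>
      this.on('data', x => { if (pred(x)) next.emit('data', x); }));
  }

  map(fn) {
    return this.derive(next =>
      this.on('data', x => next.emit('data', fn(x))));
  }

  // Forward the first n items, then end the downstream list.
  take(n) {
    return this.derive(next => {
      let seen = 0;
      this.on('data', x => {
        if (seen >= n) return;
        seen += 1;
        next.emit('data', x);
        if (seen === n) next.end();
      });
    });
  }

  // Collect items and hand the whole array to cb once the list ends.
  join(cb) {
    const xs = [];
    this.on('data', x => xs.push(x));
    this.on('end', () => cb(xs));
  }
}

const source = new TinyLazy();
let result;
source
  .filter(x => x % 2 === 0)
  .take(5)
  .map(x => x * 2)
  .join(xs => { result = xs; console.log(xs); });

// Data arrives only now, after the whole pipeline has been described.
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10].forEach(x => source.emit('data', x));
```

Each stage is just an emitter that listens to the previous one, so nothing runs until data is emitted into the source. In this sketch, take is what ends the list, which in turn triggers join.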

And here is a real-world example. Some time ago I wrote a hash database for node.js called node-supermarket (think of key-value store except greater). It had an interface similar to a list: you could .forEach on the stored elements, .filter them, etc. But being asynchronous in nature, it led to the following code, littered with callbacks and temporary lists:

var Store = require('supermarket');

var db = new Store({ filename : 'users.db', json : true });

var users_over_20 = [];
db.filter(
  function (user, meta) {
    // predicate function
    return meta.age > 20;
  },
  function (err, user, meta) {
    // function that gets executed when predicate is true
    if (users_over_20.length < 5)
      users_over_20.push(meta);
  },
  function () {
    // done function, called when all records have been filtered

    // now do something with users_over_20
  }
)

This code selects the first five users who are over 20 years old and stores them in users_over_20.

But now we changed the node-supermarket interface to return lazy lists, and the code became:

var Store = require('supermarket');

var db = new Store({ filename : 'users.db', json : true });

db.filter(function (user, meta) {
    return meta.age > 20;
  })
  .take(5)
  .join(function (xs) {
    // xs contains the first 5 users who are over 20!
  });

This is so much nicer!
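The underlying trick, returning something list-like before any data exists, can be sketched with node's built-in EventEmitter. This is only an illustration under assumptions: the records array and the filterRecords function are hypothetical and have nothing to do with node-supermarket's real storage or API.

```javascript
const { EventEmitter } = require('events');

// Hypothetical in-memory records standing in for a database file.
const records = [
  { name: 'alice', age: 31 },
  { name: 'bob', age: 17 },
  { name: 'carol', age: 45 }
];

// Hypothetical asynchronous query: it returns an emitter right away and
// pumps the matching records into it on a later turn of the event loop.
function filterRecords(pred) {
  const list = new EventEmitter();
  setImmediate(() => {
    records.filter(pred).forEach(r => list.emit('data', r));
    list.emit('end');
  });
  return list;
}

const received = [];
const list = filterRecords(r => r.age > 20);
list.on('data', r => received.push(r.name));
list.on('end', () => console.log(received));

// At this point the caller already holds `list`, but no data has been
// emitted yet, so `received` is still empty.
```

The caller gets a value back immediately and can attach handlers (or chain further stages) to it, instead of passing a predicate, an item callback and a done callback all at once.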

If you wish to try node-lazy just do npm install lazy! Alternatively, if you don't have npm, you can git clone http://github.com/pkrumins/node-lazy.git and set your NODE_PATH environment variable to point to that directory.

Enjoy and follow the future node-lazy developments in its github repo - node-lazy at github!


I was interviewed recently on 2011-10-04 by Michael Matuzak from Lambdaphant. I copied the interview here in case Michael's website ever changes or goes down. (Update: Michael's website changed. Here is the original interview from the archive.)

This first interview is with Peteris Krumins who runs the blog catonmat. Peteris is also a founder along with James Halliday of StackVM.

How did you get started in programming?

I don't really know. All I remember is that I have always wanted to be a programmer. From the first day I learned about computer programming, whenever it was, I wanted to be a programmer. I didn't really get started until I met this person on IRC around 1996, who knew everything about computers and programming, and he helped me a bunch with getting started with Unix and C programming.

I've read that you initially wrote an IRC client as a first project. What language was that in?

That was the first kind-of-a-big-project, not really the very first. I had messed around with various languages and written tens of toy programs before. So at that time I was doing mIRC scripting, messing with Eggdrop bots and decided to create my own IRC client. I first tried to write it in C++ and MFC, but that was beyond me at that age. Visual Basic, on the other hand, was really straightforward, and I wrote a fully working IRC client pretty quickly.

What methods did you use to teach yourself knowing very little about programming?

I'd program by reading tons of source code of other programs. At that time I didn't have Internet access, so I'd carry a pack of floppies around with me and fill them with source code of various programs whenever I had access to the Internet somewhere, and then compile and study them at home.

What advice would you give to kids interested in programming now?

My advice is to start programming in a language with light syntax. I remember I couldn't really understand the C++ syntax, with all the template and class stuff, while mIRC scripting and Visual Basic were really straightforward. It was also important that I saw the results quickly, so I'd recommend kids use a language that can create a GUI really easily. Perhaps I'd even recommend using the same Visual Basic and just creating all kinds of toy programs, like animated games, twitter clients, network chat programs and similar small programs.

Like most hackers you use many different languages. I'm sure you try and use the best tool for the job to get work done (unless you are purposefully using the wrong tool to learn), but let's imagine that you had to pick one language to use for the rest of your life. What would that be?

It would be Haskell because you can't get more functional than that and I love functional programming, it makes the code so elegant and keeps your mind busy trying to come up with the most beautiful abstractions.

Over the last couple of years your blog has become pretty popular. The great content is significant in making that happen. How did you go about gaining readers, and do you think the programming subreddit and HN were significant in gaining popularity?

Yes, the great content is absolutely the key. If you write about something passionately and thoroughly people will notice your blog and start following. So I knew from the first day that social media was the way to gain popularity. I'd submit all my posts to reddit, and later when Hacker News was created, I'd submit them there too, and also ask my friends to Stumble my posts and I'd submit them to delicious, and tell everyone on IRC, and post my articles to Linux and related forums. But even with all this effort after the first year of blogging I only had 1000 RSS subscribers, it took another year to get to 7000 and then another year to 12000.

James Halliday and yourself are working on a start-up called StackVM. What gave you the idea for StackVM? How long until users will see an official beta?

It was actually James's idea. He first created a working prototype in Haskell in January 2010, and then he showed it to me. At that time we both knew each other already and had talked about startups and ycombinator and I really enjoyed James's passion for functional programming. And then in March James offered to do a startup. I had actually wanted to do a startup since 2004 or 2005, when I remember telling my friends about Paul Graham and his startup essays but I hadn't met the right person. This was finally a great opportunity to do what I had dreamed about and so it all started. I am moving to Oakland to work together with James in a few weeks and after a few more weeks we'll have the official beta. We already have quite a powerful server for doing beta!

How are you liking start-up life?

It's not any different than the regular life so I love it.

Are you going to Start-up school this year? If so who are you most excited to see speak?

No, I am not going. It takes place on October 16 and I'll be in the Bay Area only in early November.

What is your favorite music to hack away to?

No music actually required.

What is your favorite fiction book?

I don't have a favorite fiction book. I have been focusing on reading academic and scientific books. From these books I'd recommend The New Turing Omnibus, which contains 66 awesome lightweight articles on various fundamental computer science topics.

Remember my previous article "I pushed 30 of my projects to GitHub"? Well, I just gathered 20 more projects that I had done (or did recently) and pushed them all to GitHub.

Quick note on GitHub - GitHub is the best invention ever for programmers. Nothing stimulates you more than pushing more and more projects to GitHub and seeing people forking them, following them, finding and fixing bugs for you. I wouldn't be doing so much coding if it weren't for GitHub.

If you like my projects, I'd love it if you followed me on github! Oh, and also on twitter! Thank you!

Right, so here are the new projects:

node-video

This is a node.js module for recording HTML5 Theora/Ogg videos. It's written in C++ and is a first-class citizen of node.js, meaning that it's fully asynchronous. It uses the libtheora and libogg libraries for recording.

I wrote this module for my StackVM startup so that anyone could record virtual machine video screencasts. See StackVM demo video #1 at around 1min 23secs, where I demo this module in action.

node-png

This is another node.js module for producing PNG images from raw RGB/BGR/RGBA/BGRA buffers. It's also written in C++, is asynchronous, and uses libpng to produce images. I also wrote it for StackVM. I added a concept of stacked-pngs to the library where many virtual machine screen updates get stacked together to produce the final image, but that is a topic for a separate post.

node-jpeg

This is also a node.js module for producing JPEG images from raw RGB buffers. It uses libjpeg (or libjpeg-turbo, which is much faster than libjpeg), it's written in C++ and is asynchronous.

This module was also written for StackVM and it will be used in cases when the client has a really slow connection. In that case the virtual machine screen updates get downsampled to quality and size that the client is able to receive.

node-gif

This is a module for node.js for producing GIF images. I like this module the most because it can be used to record what I call "gifcasts". Gifcasts are screencasts that get recorded to animated gifs. Here is an example gifcast that I recorded - A gifcast of me plurking from Windows XP.

This module is also written in C++ and uses giflib.

node-image

This is a module for node.js that unifies node-png, node-jpeg and node-gif. So instead of requiring all three modules, you just var Image = require('image') and then can do things like:

var png = Image.encodeSync('png', buffer); // or
var jpeg = Image.encodeSync('jpeg', buffer); // etc.
var gif = Image.encodeSync('gif', buffer);

node-supermarket

Node-supermarket is like a regular key-value store (hash-table), except greater. It uses node-sqlite as the underlying storage engine, which gives it unprecedented stability. This library doesn't end here. The plan is to create an object store, where you can just dump whole js objects, restore them back, map, filter and fold on them, etc.

supermarket-cart

Supermarket-cart stores connect sessions in a supermarket key-value store.

node-base64

This is a node.js module for doing base64 encoding/decoding. I wrote it because half a year ago when I started working on StackVM, node.js didn't have base64 encoding functions and all other modules were terribly broken for binary data. So I named this module "base64 module that actually works."
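For comparison, current versions of node.js ship base64 support in the built-in Buffer class, so a binary-safe round trip can be sketched without any extra module. Note that this uses Buffer's own API, not node-base64's, and the sample bytes are made up for illustration.

```javascript
// Arbitrary binary data, including bytes that are not valid text.
const bytes = Buffer.from([0x00, 0xff, 0x10, 0x80, 0x7f]);

// Encode the raw bytes to a base64 string.
const encoded = bytes.toString('base64');

// Decode the base64 string back into raw bytes.
const decoded = Buffer.from(encoded, 'base64');

console.log(encoded);
console.log(decoded.equals(bytes)); // true when the round trip is lossless
```

The point of the check at the end is the same one the module was written for: the decoded bytes have to match the input exactly, including zero bytes and bytes above 0x7f.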

nodejs-proxy

This is an HTTP proxy written in node.js. It has access control and URL blacklists. I wrote it for fun.

Perl TCP Proxy

This is a TCP proxy written in Perl. I wrote it as a helper program for my "Turn any Linux computer into SOCKS5 proxy in one command" post.

node-chess

This is my, James's and Joshua's node.js knockout entry - an online chess game (half-working).

catonmat.net blog engine

I wrote a new catonmat.net engine in Python. I used Werkzeug, SQLAlchemy, Mako, Pygments, Memcached, Sphinx and repoze.profile to make it as awesome as it is. The design followed the "50 ideas for the new catonmat.net website."

Social Scraper

This is an older project from 2007 that I found on my hard drive. It's a social media website scraper (and also some popular news site scraper). It used to scrape data from boingboing, del.icio.us, digg, flickr, furl, reddit, simpy, stumbleupon and wired.

The Little Schemer Book Review

This is a book review of The Little Schemer. The book is a dialogue between you and the authors about interesting examples of Scheme programs and it teaches you to think recursively.

If anyone asks me which book I recommend for learning the basics of Lisp, I recommend this one (and The Seasoned Schemer, see below). It's very fun to read and can be read in one evening.

The Seasoned Schemer Book Review

This is a book review of The Seasoned Schemer. This book continues where The Little Schemer ended and introduces more advanced programming and Scheme concepts such as accumulators, letrec, letcc, call/cc and generators.

Where The Little Schemer can be read in one evening, this book will take one whole day.

The Reasoned Schemer Book Review

This is a book review of The Reasoned Schemer. It is not yet a full book review, though: so far I have only had time to go through the first few chapters. It's really complicated and takes a lot of effort to understand. One of the authors is Oleg Kiselyov, which instantly makes this book so conceptually difficult that it may take one full week to comprehend some of the topics.

Here is how I summarize this book:

The goal of the book is to show the beauty of relational programming. The authors of the book believe that it is natural to extend functional programming to relational programming. They demonstrate this by extending Scheme with a few new constructs, thereby combining the benefits of both styles. This extension also captures the essence of Prolog, the most well-known logic programming language.

The Little MLer Book Review

This is a book review of The Little MLer. The Little MLer book has two goals. The first and primary goal is to teach you to think recursively about types and programs. The second goal is to expose you to two important topics concerning large programs: dealing with exceptional situations and composing program components.

Having learned the concept of functors in ML, I realized that various programming languages like to call all kinds of unrelated things "functors". So I wrote a post "On Functors".

More!

These are not all the projects that I have pushed to GitHub since last time, but the others are not that interesting. Just for completeness, they are:

  • php2000 - written in 2000, a php routing engine via require().
  • webdev-template - a small webdev template with reset css.
  • node-bufferdiff - compares two node.js buffers fast.
  • node-time - time functions for node.js (had forgotten about Date object).
  • node-jsmin - javascript minification node.js module.
  • node-async - simplest possible asynchronous node.js C++ module (useful as an example).
  • rfb-protocols - implements hextile rfb decoder to RGB buffer in C++.

This is actually more than 20 projects, but not all of them count. :) Anyway, hope you find some of them useful and until the next post!

And just another reminder, I'd love it if you followed me on github and twitter! :)

So I participated in the 48 hour Node.js Knockout competition together with James Halliday and Joshua Holbrook. Our team was called Dark Knights and we created an online chess application called Node Chess.

We didn't quite manage to completely finish the game and it has several bugs, like the turns don't alternate and the king can be captured, but it's crazy awesome anyway. If both players follow the rules, it all works correctly. Castling works, pawn promotion works, and so does en passant capture. Try it and if you find it awesome, please vote! Oh, and it works only in Chrome. We were under time pressure and at one point it stopped working under Firefox and we did not get around to fixing it.

Here is how the game looks,


A chess game between pkrumins and someone. King's Indian Defence.

Joshua did all the awesome vector graphics work. I did the chess engine work, and James used his amazing dnode node.js module to blend client and server code together. James has actually redefined how web development happens. Instead of writing server code and client code, as we are so used to, with his dnode framework it's now possible to use the same code on both the server and the client side! Much less hassle and simply ingenious!

Here is the same game in perspective view, the highlighted squares are the available moves,


The same game in perspective view.

And the moves are animated, too! The pawns shoot the opponent pieces and the queen stabs them. Try it!

Right, so my reflections on the competition.

It was well organized, and we were sent access to a Joyent deployment server and a Heroku server early on together with instructions. It turned out that Heroku didn't support Socket.IO or websockets. Win for Joyent. Pretty much everyone went with Joyent as far as I know. We had some technical difficulties at the start with deploying our code, but the guys at #node.js helped us and we got our app running pretty quickly.

We pushed the code to 3 Git repositories: our own GitHub repositories (pkrumins, substack, jesusabdullah), the node knockout's private repository for the judges, and the deployment repository on Joyent. Joyent was configured so that as you pushed your code to its Git repo, the hooks in it would restart the node.js service and then you'd be instantly running the latest version of your code.

So I'd make changes and push to my GitHub repo, and James would pull from me. He'd make changes, I'd pull from him, and the same went for Joshua. It went pretty flawlessly. We had like 12 merge errors total, but those were all resolved within a minute or two.

Now some numbers. We're actually amazed by our performance. Check out these numbers:

$ git log | grep Author | wc -l
429

429 commits! Can you believe that? 429 commits in 2 days! That's 9 commits per hour on average! That is what I call hacking!

My commits:

$ git log | grep Author | grep Peteris | wc -l
169

I did 3.5 commits per hour on average. And funnily, James and Joshua each had 130 commits:

$ git log | grep Author | grep James | wc -l
130
$ git log | grep Author | grep Joshua | wc -l
130

That's 2.7 commits per hour on average! Amazing! But we also slept between the competition days. On both days we got about 4 hours of clean sleep, shrinking our competition time to 40 hours. Then our average becomes 10.7 commits per hour! Wowsers!
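The averages above are easy to double check. Here is a small sketch of the arithmetic, where the perHour helper and the variable names are made up for illustration and the commit counts are the ones reported by the git log commands:

```javascript
// Commit counts as reported by the git log commands above.
const commits = { total: 429, peteris: 169, james: 130, joshua: 130 };

// Hypothetical helper: commits per hour, rounded to one decimal place.
function perHour(count, hours) {
  return Math.round((count / hours) * 10) / 10;
}

const fullHours = 48;           // length of the competition
const awakeHours = 48 - 2 * 4;  // minus about 4 hours of sleep on each day

console.log(perHour(commits.total, fullHours));    // team average over 48 hours
console.log(perHour(commits.peteris, fullHours));  // my average
console.log(perHour(commits.james, fullHours));    // James's (and Joshua's) average
console.log(perHour(commits.total, awakeHours));   // team average over 40 hours
```

The three individual counts also add up to the 429 total, so nobody else committed to the repository during the competition.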

Here is a graph, made with Raphael.js, that shows our git commit activity by hour, starting from 3am UTC Aug 28 to 3am UTC Aug 30:


Team "Dark Knights" git commit activity by hour.

Our peak commit intensity was at 9pm the last night, when we did 23 commits in one hour. Our team was also widely spread out. I am in Riga, Latvia, James is in Kenai, Alaska, and Joshua is in Fairbanks, Alaska. Yet we managed to keep the same schedule. I'd go to bed at noon (noon for me is 3pm UTC, see the graph above), while James and Joshua at midnight, and we'd wake up several hours later and keep hacking!

Total number of code lines written:

$ wc -l `find . -name '*.js' -o -name '*.html' -o -name '*.css' | \
  egrep -v '(jquery|raphael)'`
3074 total

So we wrote 3074 lines in two days, which according to git breaks down into added vs. deleted lines as follows:

$ git log --numstat | grep '^[0-9]' | \
  egrep '(\.js|\.css|\.html|jquery|raphael)' | \
  awk '{a+=$1;d+=$2}END{print "Added: " a, "Deleted: " d}'
Added: 5210 Deleted: 2042

Hmm, 5210-2042 doesn't quite add up to 3074 but is close enough. Of these 3074 lines of code, the non-empty ones were:

$ cat `find . -name '*.js' -o -name '*.html' -o -name '*.css' | \
  egrep -v '(jquery|raphael)'` | perl -nle 'print if /\S/' | \
  wc -l
2659

So 2659 real lines of code in 2 days! Talk about productivity! And that's just code alone. Joshua also did 50 artworks,

$ find . -name '*.svg' -o -name '*.png' | wc -l
50

Total number of file changes:

$ git log --shortstat | grep 'files changed' | \
  awk '{t+=$1}END{print t}'  
724

We communicated in IRC, in our #stackvm startup channel. Here are some statistics on how much stuff went on in our IRC channel:

$ (
  grep -v '^0[012]:' '#stackvm.08-28.log';
  cat '#stackvm.08-29.log';
  grep '^0[012]' '#stackvm.08-30.log'
  ) | wc -l
5069

So 5069 events happened during the challenge. That's 105 events per hour on average. We have a special bot in the channel, lulzbot, that tells us when we commit, for example:

05:59 < lulzbot-X> Whoa Nelly! New commits to pkrumins/node-chess (master)!
05:59 < lulzbot-X>     * Peteris Krumins: MoveGenerator stub
05:59 < lulzbot-X>     * Peteris Krumins: abstract pieces
05:59 < lulzbot-X> githubs: http://github.com/pkrumins/node-chess/tree/master

Here lulzbot informed us that I committed MoveGenerator stub and abstracted pieces in node-chess repo.

Out of these 5069 events, we talked this much,

$ (
  grep -v '^0[012]:' '#stackvm.08-28.log';
  cat '#stackvm.08-29.log';
  grep '^0[012]' '#stackvm.08-30.log'
  ) | egrep -i '^< pkrumins|substack|jesus' | wc -l
2682

So we spoke 2682 times, or about 55.9 times per hour on average over the 48 hours. We also asked quite a lot of questions:

$ (
  grep -v '^0[012]:' '#stackvm.08-28.log';
  cat '#stackvm.08-29.log';
  grep '^0[012]' '#stackvm.08-30.log';
  ) | grep '?$' | wc -l
246

246 questions, for example (random selection):

< pkrumins> wait, are we including a version of socket.io.js in dnode?
< jesusabdullah> but: simplified pieces for thumbs--yea or nay?
< pkrumins> is anyone working ont he problem where the opponent cant make moves?
< SubStack> pkrumins did you see how I just dumped the node EventEmitter code into our lib/?
< SubStack> does S create a row?
< jesusabdullah> pkrumins: You fixing the board?
< pkrumins> how does resizing in raphael happen?

My chess code wasn't the easiest to write and to make sure it works correctly, I wrote 52 expresso tests,

$ expresso 

   100% 52 tests

Without tests I would have never got that chess code right.

That's about it. The competition was awesome, A++ would participate again. Hope they organize node.js knockout next year, too!

I hope you enjoyed my post and don't forget to vote for our project! Your vote is so important to us. Thank you!

Hey everyone. We at StackVM just finished recording the 2nd demo video. The 2nd video shows all the cool new features we have recently built - user login system, chatting and sharing of virtual machines by just dragging and dropping. Also this time James Halliday joins me from Fairbanks, Alaska!

Here is the video #2,

StackVM brings virtual machines to the web. Join #stackvm on FreeNode to discuss!

If you haven't seen the first video, see my StackVM startup announcement post!

During the past few weeks we have also written two new node.js libraries for use at StackVM:

We did not demo gifcasts in this video but I am going to do a separate video in the next week or two showing just that. They're pretty awesome!

In a few weeks we'll also post the 3rd demo video. In that video we plan to show the virtual network editor that lets you network virtual machines by just dragging and dropping! Be sure to subscribe to catonmat's rss feed and follow me on twitter to know when the video is out!

See you!

Ps. Join #stackvm on FreeNode to discuss StackVM with me and James! We're there 24/7!