Last time I explained how YouTube videos can be downloaded with the gawk programming language by fetching the YouTube page where the video is displayed and finding out how the Flash video player retrieves the FLV (Flash video) media file.
This time I’ll use the Perl programming language, which is my favorite language at the moment, and write a one-liner that downloads a YouTube video.
Instead of parsing the YouTube video page, let’s look at how an embedded YouTube video player on a third-party website gets the video.
Let’s go to this cool video and look at the embed HTML code.
For this video it looks as follows:
<object width="425" height="350"><param name="movie" value="http://www.youtube.com/v/qg1ckCkm8YI"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/qg1ckCkm8YI" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"></embed></object>
95% of this code is boring, the only interesting part is this URL:
http://www.youtube.com/v/qg1ckCkm8YI
Let’s load this in a browser. As we do, we get redirected to another URL:
http://www.youtube.com/jp.swf?video_id=qg1ckCkm8YI&eurl=&iurl=http://img.youtube.com/vi/qg1ckCkm8YI/default.jpg&t=OEgsToPDskJCPW5DvMKeM3srnQ5e0LSY
So far we have no information about how the Flash player will retrieve the video. The only thing we know is that ‘iurl’ stands for ‘image URL’ and is the location of the thumbnail image.
Let’s sniff the traffic again, this time with an excellent (though commercial) Internet Explorer plugin, ‘HttpWatch Professional’.
This plugin displays all the requests the browser makes, whether HTTP or HTTPS, and presents them clearly, which makes our job much quicker than using Ethereal.
The Firefox alternative to this tool is the Live HTTP Headers extension, which does basically the same as HttpWatch Professional, but its output takes more time to understand.
Here is what we see with HttpWatch Professional when we load the URL in the browser:
We see that to get the video, the browser first requested:
http://www.youtube.com/get_video?video_id=qg1ckCkm8YI&t=OEgsToPDskJ3bp4DEiMuxUmjx7oumUec&eurl=
then got redirected to:
http://cache.googlevideo.com/get_video?video_id=qg1ckCkm8YI
and then another time to:
http://74.125.13.83/get_video?video_id=qg1ckCkm8YI
This is exactly what we saw in the previous article on downloading videos with gawk!
Now let’s write a Perl one-liner that retrieves this video file!
What is a one-liner, you might ask? Well, my definition of a one-liner is a program you are willing to type out without saving it to disk.
First of all we will need some Perl packages (modules) that make working with the HTTP protocol easier. Two widely used ones are available in Perl’s module archive (CPAN): LWP and WWW::Mechanize.
WWW::Mechanize is built on top of LWP, so let’s go to a higher level of abstraction and use this module.
The WWW::Mechanize package is not part of the Perl core distribution, so you’ll have to install it.
To do it, type
perl -MCPAN -eshell
in your console, and when the CPAN shell appears, type
install WWW::Mechanize
to get the module installed.
If everything goes fine, CPAN will tell you that the module was installed.
I don’t want to go into the details of the Perl language again, nor into the details of the WWW::Mechanize package.
If you want to learn Perl, I recommend this article as a starter, these books, and of course perldoc. Once you learn the basics, you can quickly pick up the WWW::Mechanize package by reading the documentation and FAQ and trying the examples.
Now finally let’s write the one-liner. So what do we have to do?
First we have to retrieve
http://www.youtube.com/v/qg1ckCkm8YI
then follow the redirect (which WWW::Mechanize will do for us), then get the ‘t’ identifier from the query string, and finally request and save the output of
http://www.youtube.com/get_video?video_id=qg1ckCkm8YI&t=OEgsToPDskJ3bp4DEiMuxUmjx7oumUec&eurl=
That’s it!
So here is the final version which can probably be made even shorter:
perl -MWWW::Mechanize -e '$_ = shift; s#http://|www\.|youtube\.com/|watch\?|v=|##g; $m = WWW::Mechanize->new; ($t = $m->get("http://www.youtube.com/v/$_")->request->uri) =~ s/.*&t=(.+)/$1/; $m->get("http://www.youtube.com/get_video?video_id=$_&t=$t", ":content_file" => "$_.flv")'
A little longer than a usual one-liner but does the job nicely. To keep it short, there is no error checking!
To use this one-liner, just copy it to the command line and specify the URL of a YouTube video (or just the ID of the video, or a variation of the URL, such as one without ‘http://’). Like this:
perl -MWWW::Mechanize -e '...' http://www.youtube.com/watch?v=l69Vi5IDc0g
or just
perl -MWWW::Mechanize -e '...' l69Vi5IDc0g
Let’s spread this one-liner over multiple lines and see what it does, as it is not documented.
One could split it into multiple lines by hand, but that’s not what humans are for; let’s make Perl do it. By adding -MO=Deparse to the command line, we get Perl’s own rendering of the source code (I added the line numbers myself):
use WWW::Mechanize;
1) $_ = shift @ARGV;
2) s[http://|www\.|youtube\.com/|watch\?|v=|][]g;
3) $m = 'WWW::Mechanize'->new;
4) ($t = $m->get("http://www.youtube.com/v/$_")->request->uri) =~ s/.*&t=(.+)/$1/;
5) $m->get("http://www.youtube.com/get_video?video_id=$_&t=$t", ':content_file', "$_.flv");
So our one-liner is actually 5 lines.
On line 1 we put the first element of @ARGV into the special variable $_ so that we can take advantage of it and save some typing.
On line 2 we keep just the ID of the video by removing parts of the URL one by one, so a user can specify the video URL in various formats, like ‘www.youtube.com/watch?v=ID’, or just ‘youtube.com/watch?v=ID’, or just ‘v=ID’, or even just ‘ID’. The ID gets stored in the special $_ variable.
On line 3 we create a WWW::Mechanize object that we are going to use twice.
Line 4 needs more explanation because we are doing so much in it. First it retrieves the embedded video URL I talked about earlier. The server redirects us away, so we have to look at the last request’s location. We save this location in variable $t and then reduce it to just the ‘t’ ID.
As a YouTube video is uniquely specified by two IDs, the video ID and the ‘t’ ID, on line 5 we retrieve the file and tell WWW::Mechanize to save its contents to the file ID.flv. WWW::Mechanize handles redirects for us, so everything should work. Indeed, I tested it and it worked.
Can you golf it shorter?
I golfed it a little myself, here is what I came up with:
perl -MWWW::Mechanize -e '$_ = shift; ($y, $i) = m#(http://www\.youtube\.com)/watch\?v=(.+)#; $m = WWW::Mechanize->new; ($t = $m->get("$y/v/$i")->request->uri) =~ s/.*&t=(.+)/$1/; $m->get("$y/get_video?video_id=$i&t=$t", ":content_file" => "$i.flv")'
To use this one-liner you must specify the full URL of the YouTube video, like this one:
http://www.youtube.com/watch?v=l69Vi5IDc0g
This one-liner saves the “http://www.youtube.com” string in variable $y and the ID of the video in variable $i. The $y comes in handy because we don’t have to write out the full YouTube URL; instead we use $y.
July 27th, 2007 at 7:34 am
All of the methods you used from WWW::Mechanize were inherited from LWP…
Here’s a first go at a golf, very similar to yours:
perl -MLWP -e '($y,$i)=shift=~/^(.+m)\/.+v=(.+)/;($m=LWP::UserAgent->new)->get("$y/get_video?video_id=$i&t=".($m->get("$y/v/$i")->request->uri=~/&t=(.+)/)[0],":content_file"=>"$i.flv")' 'http://www.youtube.com/watch?v=l69Vi5IDc0g'
July 27th, 2007 at 7:38 am
With some whitespace:
perl -MLWP -e '($y,$i) = shift =~ /^(.+m)\/.+v=(.+)/; ($m = LWP::UserAgent->new) ->get("$y/get_video?video_id=$i&t=" . ($m->get("$y/v/$i") ->request->uri =~ /&t=(.+)/)[0], ":content_file" => "$i.flv")'
July 27th, 2007 at 7:46 am
Sorry to spam, I’m new to the intertubes.
Third time’s the charm (perltidied):
perl -MLWP -e' ( $y, $i ) = shift =~ /^(.+m)\/.+v=(.+)/; ( $m = LWP::UserAgent->new )->get( "$y/get_video?video_id=$i&t=" . ( $m->get("$y/v/$i")->request->uri =~ /&t=(.+)/ )[0], ":content_file" => "$i.flv" ) '
July 27th, 2007 at 7:52 am
A little shorter:
perl -MWWW::Mechanize -e'$y="http://youtube.com";($i)=pop=~/\w+$/g;$m=new WWW::Mechanize;$m->get("$y/v/$i")->request->uri=~/&t=.+/;$m->get("$y/get_video?video_id=$i$&",":content_file"=>"$i.flv")'
July 27th, 2007 at 7:54 am
heh, same problem as Saldane. here it is with unnecessary \ns after semicolons:
perl -MWWW::Mechanize -e'$y="http://youtube.com"; ($i)=pop=~/\w+$/g; $m=new WWW::Mechanize; $m->get("$y/v/$i")->request->uri=~/&t=.+/; $m->get("$y/get_video?video_id=$i$&",":content_file"=>"$i.flv")'
July 27th, 2007 at 8:13 am
Intermediate Perl
http://www.flazx.com/ebook4407.php
July 27th, 2007 at 6:42 pm
nice.. and quoted for windows shell:
perl -MLWP -e"$y='http://youtube.com';($i)=pop=~/\w+$/g;($m=new LWP::UserAgent)->get(qq{$y/v/$i})->request->uri=~/&t=.+/;$m->get(qq{$y/get_video?video_id=$i$&},':content_file',$i.'.flv')" "l69Vi5IDc0g"
July 27th, 2007 at 6:54 pm
Sweet! Thanks for golfing
I noticed the comments do not look good at all. I will fix the design so that code does not get cut off when it runs over the edge.
July 28th, 2007 at 1:44 am
That is nice. A friend of mine always boasted about Perl and how good it is.
________________
http://www.FreeOpenMoko.com
July 31st, 2007 at 11:46 pm
[…] I said, we will be creating the tool in Perl programming language. In the previous post about YouTube I used the WWW::Mechanize package. I can tell you in advance that it will not work this time there […]
August 12th, 2007 at 3:29 am
I have to say, that I could not agree with you in 100% regarding o.us poetry, but it’s just my opinion, which could be wrong
August 12th, 2007 at 6:50 am
Daniel, what do you mean by ‘o.us poetry’?
August 15th, 2007 at 9:31 am
problem, video -CrLh0xR3FM causes problems
there is a 0xR in it, and perl recognizes this as unicode. Any way you can quote it to take in the whole string?
August 30th, 2007 at 6:22 am
no comment.
August 30th, 2007 at 6:24 am
No Comment….
September 14th, 2007 at 9:27 am
[…] See also Peteris’ excellent articles on Downloading YouTube Videos and Perls’ Special […]
November 4th, 2007 at 2:54 am
I put together the following korn script from your perl code… It downloads the video and converts it to a DVD-style MPEG. Good work; I hope others will find it useful!
#!/bin/ksh
set -e
if [ -z "$1" ]; then
  echo "Please supply quoted URL as argument."
  exit 1
fi
URL="$1"
FILE=`echo "$URL" | awk -F 'v=' '{print $2}'`
perl -MWWW::Mechanize -e '$_ = shift; ($y, $i) = m#(http://www\.youtube\.com)/watch\?v=(.+)#; $m = WWW::Mechanize->new; ($t = $m->get("$y/v/$i")->request->uri) =~ s/.*&t=(.+)/$1/; $m->get("$y/get_video?video_id=$i&t=$t", ":content_file" => "$i.flv")' "$URL"
mencoder -of mpeg -mpegopts format=dvd -ofps 30000/1001 -oac lavc -ovc lavc -srate 48000 -af lavcresample=48000 -vf scale=704:480,expand=720:480 -lavcopts acodec=ac3:abitrate=192:vcodec=mpeg2video:vrc_buf_size=1835:vrc_maxrate=9800:vbitrate=1856:keyint=18:aspect=4/3 -o "$FILE.mpeg" "$FILE.flv"
November 4th, 2007 at 6:26 am
Greg, thanks for the script
November 7th, 2007 at 11:07 am
I am not able to download youtube video using VBScript file. I am still getting the .dll error even though I opened IE and did the required change? Could you tell me why is that?
Thanks.
-Manish
November 7th, 2007 at 10:34 pm
Kankani, what .dll error?
November 14th, 2007 at 5:54 pm
[…] by Downloading YouTube videos with a Perl one-liner, I’ve put together a piece of code to do the same thing with Groovy. Not as succinct as Perl. […]
February 3rd, 2008 at 5:06 am
I’m not a programmer so I guess I’ll stick to How To Download YouTube Videos The Easy Way For Free.
April 1st, 2008 at 6:05 pm
Hi,
this is great and it works perfectly (I’m working on Windows). I have one question: is it possible to display the percentage completed, to know how long it will take to finish? If it could be done, that’d be awesome.
Thanks a lot
April 19th, 2008 at 9:48 am
After reading the comment from Vinniemc I thought of taking a stab at showing some kind of progress indicator.
To show a progress indicator I had to find a mechanism where LWP::UserAgent would call my function after it received each chunk of the file. I was delighted to find in LWP::UserAgent’s perldoc that it’s possible to specify a callback to LWP::UserAgent’s get() method via the special field name “:content_cb”. After trying unsuccessfully to use this callback functionality I went back to the LWP perldoc. On re-reading I found that it’s not possible to use the options “:content_file” and “:content_cb” at the same time!
After searching some more I found the lwp cookbook, which has an example of manually processing HTTP responses. Based on that I was able to hack up the progress indicator. Unfortunately it hardly qualifies as a one-liner anymore! In my attempt to keep the script small it has become a little obfuscated and might be difficult to understand. So here is the code:
perl -MLWP -e '$_ = shift; ($y, $i) = m#(http://www\.youtube\.com)/watch\?v=(.+)#; $m = LWP::UserAgent->new; ($t = $m->get("$y/v/$i")->request->uri) =~ s/.*&t=(.+)/$1/; open($fh,">$i.flv");binmode($fh);$t1=$t2=time;print "\n";$res = $m->request(HTTP::Request->new(GET => "$y/get_video?video_id=$i&t=$t"),sub { ($c,$res) = @_;$br += length($c);$t2 = time;if($t2 > $t1){if ($res->content_length) {printf STDERR "%d%% - ",100*$br/$res->content_length;$t1= $t2;}}print $fh $c;});close($fh);print "\n";' http://www.youtube.com/watch?v=l69Vi5IDc0g
July 3rd, 2008 at 3:49 am
I want to download videos and movies.
July 9th, 2008 at 6:55 am
Why Perl? You can even do it with bash!
And you also get to download the mp4 format and a download o-meter for free.
July 23rd, 2008 at 10:24 am
wow,perl is so strong.
August 3rd, 2008 at 6:34 pm
Very good..
It was very useful.