This article is part of the article series "Perl One-Liners Explained."

This is the fifth part of a nine-part article on famous Perl one-liners. In this part I will create various one-liners for text conversion and substitution. See part one for an introduction to the series.

Famous Perl one-liners is my attempt to create "perl1line.txt" that is similar to "awk1line.txt" and "sed1line.txt" that have been so popular among Awk and Sed programmers.

The article on famous Perl one-liners will consist of nine parts:

Alright then, here are today's one-liners:

Text conversion and substitution

62. ROT13 a string.

'y/A-Za-z/N-ZA-Mn-za-m/'

This one-liner uses the y operator (also known as the tr operator) to do ROT13. The y and tr operators do string transliteration. Given y/SEARCH/REPLACE/, the operator replaces each occurrence of a character found in the SEARCH list with the character at the same position in the REPLACE list.

In this one-liner A-Za-z creates the following list of characters:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

And N-ZA-Mn-za-m creates this list:

NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm

If you look closely you'll notice that the second list is actually the first list offset by 13 characters. Now the y operator translates each character in the first list to a character in the second list, thus performing the ROT13 operation.

If you wish to ROT13 the whole file then do this:

perl -lpe 'y/A-Za-z/N-ZA-Mn-za-m/' file

The -p argument puts each line of the file in the $_ variable, the y operator does ROT13 on it, and -p prints $_ out. The -l argument strips the trailing newline from each input line and appends it again on output.

Note: remember that applying ROT13 twice gives back the original string, i.e., ROT13(ROT13(string)) == string.
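As a cross-check, here is a minimal Python sketch (not one of the Perl one-liners) that builds the same two character lists and pairs them position by position, the way y/SEARCH/REPLACE/ does. The names SEARCH, REPLACE and rot13 are made up for this illustration.

```python
import string

# The two lists that A-Za-z and N-ZA-Mn-za-m expand to.
SEARCH = string.ascii_uppercase + string.ascii_lowercase
REPLACE = (string.ascii_uppercase[13:] + string.ascii_uppercase[:13]
           + string.ascii_lowercase[13:] + string.ascii_lowercase[:13])

# str.maketrans pairs characters position by position, like y///.
ROT13_TABLE = str.maketrans(SEARCH, REPLACE)


def rot13(text):
    # Letters are transliterated; every other character passes through.
    return text.translate(ROT13_TABLE)


if __name__ == "__main__":
    print(REPLACE)
    print(rot13("Hello, World!"))
    print(rot13(rot13("Hello, World!")))
```

Running rot13 twice returns the input unchanged, because each letter is shifted by 13 + 13 = 26 positions in a 26-letter alphabet.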

63. Base64 encode a string.

perl -MMIME::Base64 -e 'print encode_base64("string")'

This one-liner uses the MIME::Base64 module that is in the core (no need to install it, it comes with Perl). This module exports the encode_base64 function that takes a string and returns base64 encoded version of it.

To base64 encode the whole file do the following:

perl -MMIME::Base64 -0777 -ne 'print encode_base64($_)' file

Here the -0777 argument together with -n causes Perl to slurp the whole file into the $_ variable. Then the file gets base64 encoded and printed out, just like the string example above.

If we didn't slurp the file and encoded it line-by-line we'd get a mess.
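To see why, here is a small Python sketch (an illustration, not the Perl one-liner itself) that encodes a made-up two-line input once as a whole and once line by line. The per-line pieces are joined here only to make the comparison easy; Perl's encode_base64 would additionally put each piece on its own line.

```python
import base64

data = b"one\ntwo\n"  # hypothetical two-line file contents

# Encode the whole input at once, as slurping with -0777 does.
whole = base64.b64encode(data)

# Encode each line separately, as a plain line-by-line loop would.
per_line = b"".join(base64.b64encode(line)
                    for line in data.splitlines(keepends=True))

print(whole)
print(per_line)
```

Each line is padded with = on its own, so padding ends up in the middle of the output and the result is not the base64 encoding of the file.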

64. Base64 decode a string.

perl -MMIME::Base64 -le 'print decode_base64("base64string")'

The MIME::Base64 module also exports decode_base64 function that takes a base64-encoded string and decodes it.

The whole file can be similarly decoded by:

perl -MMIME::Base64 -ne 'print decode_base64($_)' file

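There is no need to slurp the whole file into $_ here. Every full line of a file produced by encode_base64 is 76 characters long, which is a multiple of 4, so each line is made of complete base64 groups and decodes correctly on its own. Only the last line may be shorter.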
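The following Python sketch illustrates the same property with Python's base64 module, which also wraps encoded output at 76 characters per line. The 200-byte payload is an arbitrary example.

```python
import base64

data = bytes(range(200))  # hypothetical binary payload

# encodebytes inserts a newline after every 76 characters of output.
encoded = base64.encodebytes(data)
lines = encoded.splitlines()

# Decode every line independently and glue the pieces back together.
decoded = b"".join(base64.b64decode(line) for line in lines)

print([len(line) for line in lines])
print(decoded == data)
```

Because 76 is divisible by 4, no 4-character base64 group is ever split across two lines, which is what makes line-by-line decoding safe.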

65. URL-escape a string.

perl -MURI::Escape -le 'print uri_escape($string)'

You'll need to install the URI::Escape module as it doesn't come with Perl. The module exports two functions, uri_escape and uri_unescape. The first one does URL-escaping (sometimes also referred to as URL encoding), and the other does URL-unescaping (URL decoding). Here $string is a placeholder for the text you want to escape.

66. URL-unescape a string.

perl -MURI::Escape -le 'print uri_unescape($string)'

This one-liner uses the uri_unescape function from URI::Escape module to do URL-unescaping.

67. HTML-encode a string.

perl -MHTML::Entities -le 'print encode_entities($string)'

This one-liner uses the encode_entities function from HTML::Entities module. This function encodes HTML entities. For example, < and > get turned into &lt; and &gt;.

68. HTML-decode a string.

perl -MHTML::Entities -le 'print decode_entities($string)'

This one-liner uses the decode_entities function from HTML::Entities module.

69. Convert all text to uppercase.

perl -nle 'print uc'

This one-liner uses the uc function, which by default operates on the $_ variable and returns an uppercase version of it.

Another way to do the same is to use -p command line option that enables automatic printing of $_ variable and modify it in-place:

perl -ple '$_=uc'

The same can also be achieved by applying the \U escape sequence in string interpolation:

perl -nle 'print "\U$_"'

The \U escape causes everything after it (up to the first occurrence of \E, if there is one) to be upper-cased.

70. Convert all text to lowercase.

perl -nle 'print lc'

This one-liner is very similar to the previous. Here the lc function is used that converts the contents of $_ to lowercase.

Or, using escape sequence \L and string interpolation:

perl -nle 'print "\L$_"'

Here \L causes everything after it (until the first occurrence of \E) to be lower-cased.

71. Uppercase only the first letter of each line.

perl -nle 'print ucfirst lc'

The one-liner first applies the lc function to the input that makes it lower case and then uses the ucfirst function that upper-cases only the first character.

It can also be done via escape codes and string interpolation:

perl -nle 'print "\u\L$_"'

First the \L lower-cases the whole line, then \u upper-cases the first character.

72. Invert the letter case.

perl -ple 'y/A-Za-z/a-zA-Z/'

This one-liner transliterates capital letters A-Z to lowercase letters a-z, and lowercase letters to uppercase letters, thus switching the case.

73. Camel case each line.

perl -ple 's/(\w+)/\u$1/g'

This is a lousy Camel Casing one-liner. It takes each word and upper-cases the first letter of it. It fails on possessive forms like "friend's car". It turns them into "Friend'S Car".

An improvement is:

s/(?<!['])(\w+)/\u$1/g

This checks that the character before the word is not a single quote '. But I am sure it still fails on some more exotic examples.
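Here is a rough Python equivalent of both substitutions, useful for trying them on sample input. It is a sketch for comparison only; the function names are made up, and Python has no \u escape, so the first letter is upper-cased by hand.

```python
import re


def upper_first(match):
    # Upper-case only the first character of the matched word.
    word = match.group(1)
    return word[0].upper() + word[1:]


def capitalize_words(line):
    # Counterpart of s/(\w+)/\u$1/g
    return re.sub(r"(\w+)", upper_first, line)


def capitalize_words_skip_quote(line):
    # Counterpart of the improved version with the (?<!') lookbehind.
    return re.sub(r"(?<!')(\w+)", upper_first, line)


if __name__ == "__main__":
    print(capitalize_words("friend's car"))
    print(capitalize_words_skip_quote("friend's car"))
```

The lookbehind only rejects a match that starts right after a quote, so the "s" in "friend's" is left alone while "friend" and "car" are still capitalized.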

74. Strip leading whitespace (spaces, tabs) from the beginning of each line.

perl -ple 's/^[ \t]+//'

This one-liner deletes all whitespace from the beginning of each line. It uses the substitution operator s. Given s/REGEX/REPLACE/ it replaces the matched REGEX by the REPLACE string. In this case the REGEX is ^[ \t]+, which means "match one or more space or tab at the beginning of the string" and REPLACE is nothing, meaning, replace the matched part with empty string.

The pattern [ \t]+ can actually be replaced by \s+, which matches any whitespace (including tabs and spaces):

perl -ple 's/^\s+//'

75. Strip trailing whitespace (spaces, tabs) from the end of each line.

perl -ple 's/[ \t]+$//'

This one-liner deletes all whitespace from the end of each line.

Here the REGEX of the s operator says "match one or more space or tab at the end of the string." The REPLACE part is empty again, which means to erase the matched whitespace.

76. Strip whitespace from the beginning and end of each line.

perl -ple 's/^[ \t]+|[ \t]+$//g'

This one-liner combines the previous two. Notice that it specifies the global /g flag to the s operator. It's necessary because we want it to delete whitespace at the beginning AND end of the string. If we didn't specify it, it would only delete whitespace at the beginning (assuming it exists) and not at the end.
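The effect of the /g flag can be tried out with this small Python sketch, where count=1 plays the role of a substitution without /g. The function names are invented for the example.

```python
import re

# Same alternation as in the one-liner: leading OR trailing blanks.
TRIM = re.compile(r"^[ \t]+|[ \t]+$")


def trim_first_match(line):
    # Without /g: only the first match is replaced.
    return TRIM.sub("", line, count=1)


def trim_all_matches(line):
    # With /g: every match is replaced.
    return TRIM.sub("", line)


if __name__ == "__main__":
    print(repr(trim_first_match("  foo bar\t ")))
    print(repr(trim_all_matches("  foo bar\t ")))
```

If a line has no leading whitespace, the first match is the trailing run, so the single-match version would still clean that line completely.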

77. Convert UNIX newlines to DOS/Windows newlines.

perl -pe 's|\n|\r\n|'

This one-liner substitutes the Unix newline \n (LF) with the Windows newline \r\n (CRLF) on each line. Remember that the s operator can use anything for delimiters. In this one-liner it uses vertical pipes to delimit REGEX from REPLACE to improve readability.

78. Convert DOS/Windows newlines to UNIX newlines.

perl -pe 's|\r\n|\n|'

This one-liner does the opposite of the previous one. It takes Windows newlines CRLF and converts them to Unix newlines LF.

79. Convert UNIX newlines to Mac newlines.

perl -pe 's|\n|\r|'

Apple Macintoshes used to use \r CR as newlines. This one-liner converts UNIX's \n to Mac's \r.

80. Substitute (find and replace) "foo" with "bar" on each line.

perl -pe 's/foo/bar/'

This one-liner uses the s/REGEX/REPLACE/ command to substitute "foo" with "bar" on each line.

To replace all "foos" with "bars", add the global /g flag:

perl -pe 's/foo/bar/g'

81. Substitute (find and replace) "foo" with "bar" on lines that match "baz".

perl -pe '/baz/ && s/foo/bar/'

This one-liner is equivalent to:

while (defined($line = <>)) {
  if ($line =~ /baz/) {
    $line =~ s/foo/bar/;
  }
  print $line;  # -p prints every line, changed or not
}

It puts each line in the variable $line, checks if the line matches "baz", and if it does, replaces "foo" with "bar" in it. Because of -p, every line is printed whether or not it was changed.

Perl one-liners explained e-book

I've now written the "Perl One-Liners Explained" e-book based on this article series. I went through all the one-liners, improved explanations, fixed mistakes and typos, added a bunch of new one-liners, added an introduction to Perl one-liners and a new chapter on Perl's special variables. Please take a look:

Have Fun!

Have fun with these one-liners for now. The next part is going to be about selective printing and deleting of certain lines.

Can you think of other text conversion and substitution procedures that I did not include here?

Hey everyone, just wanted to do a quick post on how to keep track of who's talking about you on the net. Nothing really unique, just a list of tools that I use often. Why is it important? Well, it's always interesting to know what people are saying about you and sometimes you want to engage in a conversation or just thank them for linking to your article.

Alright, here are the tools that I use:

Twitter Search

Twitter search is definitely the #1 source for keeping track of who's talking about you right now. But you already knew that.

Twitter search example for the term "catonmat."

Perhaps what you didn't know is that they have an RSS feed for search queries.

Location of RSS feed link for Twitter search results.

Now combined with a service like feedblitz.com you can email the RSS updates to yourself or just read them from your favorite RSS reader.

I am monitoring terms "Peteris Krumins", "pkrumins" and "catonmat".

Google Alerts

Google Alerts automatically notifies you when the Google search engine locates new results for your search terms. You can choose to have your alerts delivered via email or RSS feed.

Google Alerts email for the term "catonmat."

You can even customize the type of alerts you wish to receive. Google Alerts lets you choose to get notified when a new result appears on web pages, usenet (google groups), blogs, news or videos.

Backtype Comment Alerts

Backtype is Google for comments. Want to find out when someone's mentioned you on Reddit, FriendFeed, Digg or Hacker News? Backtype will alert you.

Backtype Alerts email for the term "peteris."

Backtype also recently launched a service called BackTweets that allows you to find who's linking back to you via shortened URLs.

Have Fun!

Have fun keeping track of yourself!

Btw, let me know in the comments if I missed any other cool tools.

I recently watched an interesting video lecture on stealing botnets. A group of researchers at UCSB recently managed to take control of a part of the Torpig botnet for 10 days. During this time, they observed 180 thousand infections and recorded almost 70GB of data that the bots collected. This data included submitted form information from all the websites the infected person had visited, SMTP, FTP, POP3 and Windows passwords, credit card numbers, and passwords from various password managers.

Here are the most interesting facts from the lecture:

Torpig uses a technique called "domain fluxing" to avoid being shut down by simply blocking the IP or the domain name of control center servers. The idea is simple - depending on date and time the algorithm generates a domain name to connect to. If the domain gets shut down, the bots will simply use a different domain after some time.

The researchers were able to take control over a part of the botnet by cracking the domain name generating algorithm and registering some of the domain names to be used for communication in the future.
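To make the idea concrete, here is a toy Python sketch of a date-driven name generator. It is NOT Torpig's algorithm; the seed, the hashing scheme and the .example suffix are all invented for illustration. It only shows why anyone who knows a deterministic algorithm can compute the names for future dates in advance.

```python
import hashlib
from datetime import date, timedelta


def domain_for(day, seed="example-seed", tld=".example"):
    # Hypothetical scheme: hash the seed and the date, keep 12 hex chars.
    digest = hashlib.sha256(f"{seed}:{day.isoformat()}".encode()).hexdigest()
    return digest[:12] + tld


def upcoming_domains(start, days):
    # Knowing the algorithm is enough to list the names for future dates.
    return [domain_for(start + timedelta(days=i)) for i in range(days)]


if __name__ == "__main__":
    for name in upcoming_domains(date(2009, 1, 1), 3):
        print(name)
```

A generator that also depends on input that cannot be known in advance, such as the day's popular topics, breaks this kind of prediction.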

The bad guys noticed that a part of the botnet had been taken over and issued a software update to all bots to use a new domain flux algorithm, which used Twitter's popular topics for the day to generate domain names. It was no longer possible to predict the domain that would be used tomorrow.

When communicating with the command & control server, the bots included a unique id field that was generated from the machine's hardware. This allowed the researchers to estimate the real number of unique computers infected. They saw 1.2 million unique IP addresses but only 180k unique machines.
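The difference between the two counts can be sketched in a few lines of Python. The connection log below is entirely made up (the addresses and bot ids are hypothetical); only the final ratio uses the figures quoted above.

```python
# Hypothetical connection log: (source IP, bot id) pairs.
connections = [
    ("10.0.0.1", "bot-a"),
    ("10.0.0.2", "bot-a"),  # same machine seen from a different address
    ("10.0.0.3", "bot-a"),
    ("10.0.0.4", "bot-b"),
    ("10.0.0.5", "bot-b"),
    ("10.0.0.6", "bot-c"),
]

unique_ips = {ip for ip, _ in connections}
unique_bots = {bot_id for _, bot_id in connections}

# Figures from the lecture: 1.2 million IPs versus 180k machines.
ips_per_machine = 1_200_000 / 180_000

print(len(unique_ips), len(unique_bots))
print(round(ips_per_machine, 1))
```

Counting by IP address alone would overestimate the size of the botnet several times over.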

The bots would steal financial data from 410 financial institutions (top 5: PayPal, Poste Italiane, Capital One, E*Trade, Chase), they would log credit card information (top 5 cards: Visa, Mastercard, American Express, Maestro, Discover), and they would also steal all the passwords from browser's password manager.

In a 2008 study Symantec estimated that credit card information is valued at $0.10 to $25 per card in the underground market. Bank account information is valued at $10 to $1,000 per account. Using this study, the researchers estimated that the financial data the bots collected during the 10-day period was worth $83k to $8.3 million.

Using various estimations researchers calculated that if the bots are used for denial of service the total bandwidth would be 17Gbps.

Researchers observed that there was a fraction of people who'd fill out the phishing page and then immediately email the company's security group saying that they may have been victims of identity theft.

Since Torpig was sending all the HTTP POST data and emails to command & control servers, researchers did statistics on emails and found out that 14% of all captured emails were about jobs and resumes, 10% discussed computer security/malware, 7% discussed money, 6% were sports fans, 5% were worried about exams and their grades, 4% were seeking partners online.

Researchers collected 300,000 unique credentials on 370,000 websites. 28% of people reused their password on multiple domains. There were 173,686 unique passwords.

Researchers converted the passwords in Unix format and tried to crack them with John the Ripper. 56,000 were cracked in less than 65 minutes using brute-force. Using a wordlist 14,000 passwords were cracked in the next 10 minutes. And another 30,000 passwords were cracked in the next 24 hours. That's 58% of all passwords cracked in 24 hours.
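The 58% figure follows from the numbers quoted above. Here is the arithmetic as a short Python check; the variable names are just labels for the three cracking stages.

```python
cracked_first_65_min = 56_000
cracked_wordlist = 14_000
cracked_next_24_h = 30_000
unique_passwords = 173_686

total_cracked = cracked_first_65_min + cracked_wordlist + cracked_next_24_h
share = total_cracked / unique_passwords

print(f"{total_cracked} of {unique_passwords} = {share:.1%}")
```

That is 100,000 cracked passwords out of 173,686, which rounds to 58%.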

You're welcome to watch the video lecture. It's 1h 15m long. It's presented by Richard A. Kemmerer.

Here are all the topics in the lecture:

  • [02:00] Botnet terminology - bot, botnet, command & control server, control channel, botmaster.
  • [03:00] Introduction to the Torpig trojan and Mebroot malware platform.
  • [05:00] How Torpig works.
  • [11:30] Torpig HTML injection.
  • [15:00] Domain fluxing.
  • [19:15] Taking over Torpig's c&c server.
  • [24:10] Data collection principles.
  • [26:00] C&c server protocol.
  • [31:10] Botnet's size estimation.
  • [37:00] Botnet's threats: theft of financial information, denial of service, proxy servers, privacy thefts.
  • [37:30] Threat: Theft of financial information.
  • [42:00] Threat: Denial of service.
  • [43:30] Threat: Proxy servers.
  • [44:20] Threat: Privacy theft.
  • [47:00] Password analysis.
  • [50:40] Criminal retribution.
  • [53:00] Law enforcement.
  • [58:00] Repatriating the data.
  • [01:00:00] Ethics.
  • [01:02:00] Conclusions.
  • [01:06:00] Questions and answers.

For more information see the publication "Your Botnet is My Botnet: Analysis of a Botnet Takeover."

This article is part of the article series "MIT Linear Algebra."

This is the fifth post in an article series about MIT's course "Linear Algebra". In this post I will review lecture five, which finally introduces real linear algebra topics such as vector spaces, their subspaces, and spaces from matrices. But before it does that, it closes the topics that were started in the previous lecture on permutations, transposes and symmetric matrices.

Here is a list of the previous posts in this article series:

Lecture 5: Vector Spaces and Subspaces

The lecture starts by recalling some facts about permutation matrices. Remember from the previous lecture that permutation matrices P execute row exchanges and that they are identity matrices with reordered rows.

Let's count how many permutation matrices there are for an nxn matrix.

For a matrix of size 1x1, there is just one permutation matrix - the identity matrix.

For a matrix of size 2x2 there are two permutation matrices - the identity matrix and the identity matrix with rows exchanged.

For a matrix of size 3x3 we may have the rows of the identity matrix rearranged in 6 ways - {1,2,3}, {1,3,2}, {2,1,3}, {2,3,1}, {3,1,2}, {3,2,1}.

For a matrix of size 4x4 the number of ways to reorder the rows is the same as the number of ways to rearrange numbers {1,2,3,4}. This is the simplest possible combinatorics problem. The answer is 4! = 24 ways.

In general, for an nxn matrix, there are n! permutation matrices.

Another key fact to remember about permutation matrices is that their inverse P^-1 is their transpose P^T. Or algebraically, P^T·P = I.
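Both facts are easy to check by brute force for small n. The following Python sketch uses plain nested lists instead of a matrix library, and the helper names are made up for the example.

```python
from itertools import permutations
from math import factorial


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def permutation_matrices(n):
    # Every reordering of the rows of the n-by-n identity matrix.
    eye = identity(n)
    return [[eye[r] for r in order] for order in permutations(range(n))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


if __name__ == "__main__":
    for n in range(1, 5):
        mats = permutation_matrices(n)
        ok = all(matmul(transpose(p), p) == identity(n) for p in mats)
        print(n, len(mats), factorial(n), ok)
```

For n = 1 to 4 the counts match n!, and P^T·P gives the identity for every permutation matrix P.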

The lecture proceeds to transpose matrices. The transpose of a matrix exchanges its columns with its rows. Another way to think about it is that it flips the matrix over its main diagonal. The transpose of matrix A is denoted by A^T.

Here is an example of transpose of a 3-by-3 matrix. I color coded the columns to better see how they get exchanged:

Transpose A^T of a 3x3 matrix A

A matrix does not have to be square for its transpose to exist. Here is another example of transpose of a 3-by-2 matrix:

Transpose A^T of a 3x2 matrix A

In algebraic notation the transpose is expressed as (A^T)_ij = A_ji, which says that the element a_ij at position ij gets transposed into position ji.

Here are the rules for matrix transposition:

  • The transpose of A + B is (A + B)^T = A^T + B^T.
  • The transpose of A·B is (A·B)^T = B^T·A^T.
  • The transpose of A·B·C is (A·B·C)^T = C^T·B^T·A^T.
  • The transpose of A^-1 is (A^-1)^T = (A^T)^-1.
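The sum and product rules can be spot-checked numerically. This Python sketch uses small example matrices picked for the illustration (they are not from the lecture) and does not cover the inverse rule.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Example matrices: A and C are 3-by-2, B is 2-by-3.
A = [[1, 2], [3, 4], [5, 6]]
C = [[0, 1], [1, 0], [2, 2]]
B = [[1, 0, 2], [0, 1, 3]]

sum_rule = transpose(matadd(A, C)) == matadd(transpose(A), transpose(C))
product_rule = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))

print(sum_rule, product_rule)
```

Note the reversed order in the product rule: A^T·B^T is a 2-by-2 matrix here, so it could not equal the 3-by-3 matrix (A·B)^T.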

Next the lecture continues with symmetric matrices. A symmetric matrix has its transpose equal to itself, i.e., A^T = A. It means that we can flip the matrix along the diagonal (transpose it) and it won't change.

Here is an example of a symmetric matrix. Notice that the elements on opposite sides of the diagonal are equal:

Symmetric matrix

Now check this out. If you have a matrix R that is not symmetric and you multiply it with its transpose R^T as R·R^T, you get a symmetric matrix! Here is an example:

Matrix times its transpose is symmetric matrix

Are you wondering why it's true? The proof is really simple. Remember that a matrix is symmetric if its transpose is equal to itself. Now what's the transpose of the product R·R^T? It's (R·R^T)^T = (R^T)^T·R^T = R·R^T - it's the same product, which means that R·R^T is always symmetric.
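Here is a quick numeric check in Python. The matrix R below is an arbitrary 3-by-2 example chosen for this sketch, not necessarily the one used in the lecture.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


R = [[1, 2], [3, 4], [5, 6]]  # example matrix, not symmetric (not even square)

S = matmul(R, transpose(R))   # R·R^T

for row in S:
    print(row)
print(S == transpose(S))
```

Entry (i, j) of R·R^T is the dot product of row i and row j of R, and a dot product does not care about the order of its two arguments.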

Here is another cool fact - the inverse of a symmetric matrix (if it exists) is also symmetric. Here is the proof. Suppose A is symmetric, then the transpose of A^-1 is (A^-1)^T = (A^T)^-1. But A^T = A, therefore (A^T)^-1 = A^-1, so (A^-1)^T = A^-1.

At this point the lecture finally reaches the fundamental topic of linear algebra - vector spaces. As usual, it introduces the topic by examples.

Example 1: Vector space R^2 - all 2-dimensional vectors. Some of the vectors in this space are (3, 2), (0, 0), (π, e) and infinitely many others. These are all the vectors with two components and they represent the xy plane.

Example 2: Vector space R^3 - all vectors with 3 components (all 3-dimensional vectors).

Example 3: Vector space R^n - all vectors with n components (all n-dimensional vectors).

What makes these sets of vectors vector spaces is that they are closed under multiplication by a scalar and under addition, i.e., a vector space must be closed under linear combinations of its vectors. What I mean by that is that if you take two vectors and add them together, or multiply one of them by a scalar, the result is still in the same space.

For example, take a vector (1,2,3) in R^3. If we multiply it by any number α, it's still in R^3 because α·(1,2,3) = (α, 2α, 3α). Similarly, if we take any two vectors (a, b, c) and (d, e, f) and add them together, the result is (a+d, b+e, c+f) and it's still in R^3.

There are actually 8 axioms that the vectors must satisfy for them to make a space, but they are not listed in this lecture.

Here is an example of not-a-vector-space. It's 1/4 of R^2 (the 1st quadrant). The green vectors are in the 1st quadrant but the red one is not:

Not a vector space
An example of not-a-vector-space.

This is not a vector space because the vectors in the 1st quadrant are not closed under multiplication by a scalar. If we take the vector (3,1) and multiply it by -1 we get the red vector (-3, -1), which is not in the 1st quadrant. Therefore the 1st quadrant is not a vector space.
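The closure test can be written down directly. In this Python sketch the quadrant is taken to include its boundary (x ≥ 0 and y ≥ 0), and the helper names are made up.

```python
def in_first_quadrant(v):
    x, y = v
    return x >= 0 and y >= 0


def scale(alpha, v):
    return tuple(alpha * c for c in v)


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


v = (3, 1)
print(in_first_quadrant(v))                 # the vector itself
print(in_first_quadrant(add(v, (1, 2))))    # sum of two quadrant vectors
print(in_first_quadrant(scale(-1, v)))      # multiplied by -1
```

Addition never leaves the quadrant, but a single negative scalar does, and one failed requirement is enough to rule out a vector space.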

Next, Gilbert Strang introduces subspaces of vector spaces.

For example, any line in R^2 that goes through the origin (0, 0) is a subspace of R^2. Why? Because if we take any vector on the line and multiply it by a scalar, it's still on the line. And if we take any two vectors on the line and add them together, the sum is also still on the line. The requirement for a subspace is that the vectors in it do not go outside when added together or multiplied by a number.

Here is a visualization. The blue line is a subspace of R^2 because the red vectors on it can't go outside of the line:

An example of a subspace of R^2.

An example of not-a-subspace of R^2 is any line that does not go through the origin. If we take any vector on the line and multiply it by 0, we get the zero vector, but it's not on the line. Also, if we take two vectors on the line and add them together, their sum is not on the line. Here is a visualization:

An example of not-a-subspace of R^2.
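Both cases can be compared side by side in Python. The two lines used here, y = 2x and y = 2x + 1, are arbitrary examples picked for this sketch.

```python
def on_line_through_origin(v):
    x, y = v
    return y == 2 * x          # example line y = 2x


def on_shifted_line(v):
    x, y = v
    return y == 2 * x + 1      # example line y = 2x + 1, misses the origin


def scale(alpha, v):
    return tuple(alpha * c for c in v)


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


u, w = (1, 2), (3, 6)          # both on y = 2x
p, q = (0, 1), (1, 3)          # both on y = 2x + 1

print(on_line_through_origin(add(u, w)), on_line_through_origin(scale(0, u)))
print(on_shifted_line(add(p, q)), on_shifted_line(scale(0, p)))
```

The line through the origin passes both checks, while the shifted line fails both: the sum (1, 4) and the zero vector are off the line.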

Let's list all the subspaces of R^2. They are:

  • R^2 itself,
  • any line through the origin (0, 0),
  • the zero vector (0, 0) alone.

And all the subspaces of R^3 are:

  • R^3 itself,
  • any line through the origin (0, 0, 0),
  • any plane through the origin (0, 0, 0),
  • the zero vector (0, 0, 0) alone.

The last 10 minutes of the lecture are spent on column spaces of matrices.

The column space of a matrix is made out of all the linear combinations of its columns. For example, given this matrix:

A = [1 3; 2 3; 4 1] (a 3-by-2 matrix with columns (1,2,4) and (3,3,1))

The column space C(A) is the set of all vectors {α·(1,2,4) + β·(3,3,1)}. In fact, this column space is a subspace of R^3 and it forms a plane through the origin.
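One way to test whether a given vector lies in this plane is to compare it against the plane's normal vector, which is the cross product of the two columns. This is a sketch using that approach (the lecture itself does not use cross products here), with integer vectors so that the equality test is exact.

```python
a = (1, 2, 4)
b = (3, 3, 1)


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def combo(alpha, beta):
    # The linear combination alpha·a + beta·b.
    return tuple(alpha * x + beta * y for x, y in zip(a, b))


normal = cross(a, b)


def in_column_space(v):
    # v is in the plane exactly when it is perpendicular to the normal.
    return dot(normal, v) == 0


if __name__ == "__main__":
    print(normal)
    print(in_column_space(combo(2, -5)), in_column_space((1, 0, 0)))
```

Every combination of the two columns passes the test, the zero vector included, while a vector such as (1, 0, 0) sticks out of the plane.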

More about column spaces in the next lecture.

You're welcome to watch the video lecture five:

Topics covered in lecture five:

  • [01:30] Permutations.
  • [03:00] A=LU elimination without row exchanges.
  • [03:50] How Matlab does A=LU elimination.
  • [04:50] PA=LU elimination with row exchanges
  • [06:40] Permutation matrices.
  • [07:25] How many permutation matrices are there?
  • [08:30] Permutation matrix properties.
  • [10:30] Transpose matrices.
  • [11:50] General formula for transposes: (A^T)_ij = A_ji.
  • [13:06] Symmetric matrices.
  • [13:30] Example of a symmetric matrix.
  • [15:15] R·R^T is always symmetric.
  • [18:23] Why is R·R^T symmetric?
  • [20:50] Vector spaces.
  • [22:05] Examples of vector spaces.
  • [22:55] Real vector space R^2.
  • [23:20] Picture of R^2 - xy plane.
  • [26:50] Vector space R^3.
  • [28:00] Vector space R^n.
  • [30:00] Example of not a vector space.
  • [32:00] Subspaces of vector spaces.
  • [33:00] A vector space inside R^2.
  • [34:35] A line in R^2 that is a subspace.
  • [34:50] A line in R^2 that is not a subspace.
  • [36:30] All subspaces of R^2.
  • [39:30] All subspaces of R^3.
  • [40:20] Subspaces of matrices.
  • [41:00] Column spaces of matrices C(A).
  • [44:10] Example of column space of matrix with columns in R^3.

Here are my notes of lecture five:

MIT Linear Algebra, Lecture 5: Vector Spaces and Subspaces
My notes of linear algebra lecture 5 on vector spaces and subspaces.

Have fun with this lecture! The next post is going to be more about column spaces and null spaces of matrices.

PS. This course is taught from Introduction to Linear Algebra textbook. Get it here:

This article is part of the article series "Vim Plugins You Should Know About."

This is the sixth post in the article series "Vim Plugins You Should Know About". This time I am going to introduce you to a vim plugin called "nerd_tree.vim". It's so useful that I can't imagine working without it in vim.

Nerd Tree is a nifty plugin that allows you to explore the file system and open files and directories directly from vim. It opens the file system tree in a new vim window and you may use keyboard shortcuts and mouse to open files in new tabs, in new horizontal and vertical splits, quickly navigate between directories and create bookmarks for your most important projects.

This plugin was written by Marty Grenfell (also known as scrooloose).

Previous articles in the series:

How to use nerd_tree.vim?

Nerd Tree plugin can be activated by the :NERDTree vim command. It will open in vim as a new vertical split on the left:

Vim Nerd Tree
A screenshot of Nerd Tree plugin in action.

Here are the basics of how to use the plugin:

  • Use the natural vim navigation keys hjkl to navigate the files.
  • Press o to open the file in a new buffer or open/close directory.
  • Press t to open the file in a new tab.
  • Press i to open the file in a new horizontal split.
  • Press s to open the file in a new vertical split.
  • Press p to go to parent directory.
  • Press r to refresh the current directory.

All other keyboard shortcuts can be found by pressing ?. It will open a special help screen with the shortcut listings. Press ? again to get back to file tree.

To close the plugin execute the :NERDTreeClose command.

Typing :NERDTree and :NERDTreeClose all the time is really inconvenient. Therefore I have mapped the toggle command :NERDTreeToggle to the F2 key. This way I can quickly open and close Nerd Tree whenever I wish. You can also map it to F2 by putting map <F2> :NERDTreeToggle<CR> in your .vimrc file.

How to install nerd_tree.vim?

To get the latest version:

  1. Download NERD_tree.zip.
  2. Extract NERD_tree.zip to ~/.vim (on Unix/Linux) or ~\vimfiles (on Windows).
  3. Run :helptags ~/.vim/doc (on Unix/Linux) or :helptags ~/vimfiles/doc (on Windows) to rebuild the tags file (so that you can read :help NERD_tree).
  4. Restart Vim.

Have Fun!

Have fun exploring your files with this awesome plugin and until next time!