perl -lne '(1x$_) =~ /^1?$|^(11+?)\1+$/ || print "$_ is prime"'

Can you figure out how it works? I give an explanation below, but try to figure it out yourself. Here is what happens when you run it:

$ perl -lne '(1x$_) =~ /^1?$|^(11+?)\1+$/ || print "$_ is prime"'
1
2
2 is prime
3
3 is prime
4
5
5 is prime
6
7
7 is prime
8
9
10
11
11 is prime

Here is how it works.

First, the number is converted in its unary representation by (1x$_). For example, the number 5 gets converted into 1x5, which is 11111 (1 repeated 5 times.)

Next, the unary string gets tested against the regular expression. If it matches, the number is a composite, otherwise it's a prime.

The regular expression works this way. It consists of two parts ^1?$ and ^(11+?)\1+$.

The first part matches number 1 and the empty string. Clearly, the empty string and number 1 are not prime numbers, therefore this regular expression matches, which indicates that they are not prime numbers.

The second part determines if two or more 1s repeatedly make up the whole number. If two or more 1s repeatedly make up the whole number, the regex matches, which means that the number is composite. Otherwise it's a prime.

Let's look at the second regex part on numbers 5 and 4.

The number 5 in unary representation is 11111. The (11+?) matches the first two ones 11. The back-reference \1 becomes 11 and the whole regex now becomes ^11(11)+$. It can't match five ones, therefore it fails. But since it used +?, it backtracks and matches the first three ones 111. The back-reference becomes 111 and the whole regex becomes ^111(111)+$. It doesn't match again. This repeats for 1111 and 11111, which also don't match, therefore the whole regex doesn't match and the number is a prime.

The number 4 in unary representation is 1111. The (11+?) matches the first two ones 11. The back-reference \1 becomes 11 and the regex becomes ^11(11)+$. It matches the original string, therefore the number is not a prime.

PS. I didn't invent this regular expression, it was invented in 1998 by Abigail.

Don't take this regular expression too seriously, it's actually neither a regular expression (as defined in automata theory), nor a way to check if a number is a prime. It's just an awesome thing that Perl can do. See this cool article called The Prime That Wasn't by Andrei Zmievski for a discussion about how this regex fails for larger numbers because of backtracking.

Also if you wish to learn more about Perl one-liners, check out my Perl One-Liners Explained article series and download the perl1line.txt file.

Comments

Gaal Yahas Permalink
December 25, 2011, 21:30

I believe credit for this is due to Abigail, who posted it (spelled a little differently) over a decade ago.

December 26, 2011, 14:08

That is correct!

August 23, 2014, 08:46

Drogba is one of the best strikers in the history of Chelsea. He himself has contributed to the success of large M88 Blues in one recent decades with the M88 peak Champions League 2011/12. But then "elephants" have gone in to save the lost glory in China and Turkey.
It was not until the past summer, in an effort to strengthen the attack, Mourinho decided to invite former students to collaborate. This year already 36 years old, Drogba is no longer in the peak of m88
, but with what is shown in the colors of Galatasaray shirt last season (14 goals in 36 games), the Ivorian striker will be the reserve cards strategies for striker Diego Costa expensive.

Jonas Permalink
December 25, 2011, 22:18

Source: http://www.catb.org/jargon/html/O/one-liner-wars.html

December 26, 2011, 14:10

Good find.

December 26, 2011, 01:47

I just wrote a coprimality test with a regular expression tonight, based on the same technique.

https://gist.github.com/1520171

December 26, 2011, 14:08

Very cool!

December 26, 2011, 07:09

This perl program gives a segmentation fault on x86 Linux (RHEL 5) when served an input of 1234321

December 26, 2011, 14:08

Yup, doesn't work for large numbers. There is so much backtracking going on for large numbers that Perl runs out of memory.

epiphany47 Permalink
December 26, 2011, 08:47

This isn't a "regular expression", it's a "regexp" =] - see comments on HN:
http://news.ycombinator.com/item?id=3391547

This script uses backreferences which give regexp's more "computing power" than regular expressions actually have, but remove the guaranteed runtime of an O(n) regular expression. For more, read the excellent article at http://swtch.com/~rsc/regexp/regexp1.html

You can actually prove that "regular expressions" don't have the computing power to enumerate the prime numbers. For more, read the wiki article, http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages

On an unrelated note, I've been a longtime fan of your NodeJS articles, and look forward to more of your writing. =]

December 26, 2011, 14:10

I have studied computer science intensively and you're right that this regular expression isn't a real regular expression as in automata theory.

I'm publishing the next nodejs post in a few days!

newestbie Permalink
December 27, 2011, 09:16

my $string = "-" x 1000;
$string =~ s/(?!(-{2,})\1{1,}$)(?=(-{2,}))/print length($2), "\n"/eg;

January 12, 2012, 12:05

Links to other explanations and to a usenet post by Abigail that dates it back to 1997 can be found at my blog.
Abigail's looked like this
perl -wle 'print "Prime" if (1 x shift) !~ /^1?$|^(11+?)\1+$/'

December 17, 2014, 08:14

well, this was a kind of new info for me. Thanks for sharing.
happy new year 2015
happy new year 2015 wishes
happy new year 2015 images
digg.

Leave a new comment

(why do I need your e-mail?)

(Your twitter name, if you have one. (I'm @pkrumins, btw.))

Type the word "linux_298": (just to make sure you're a human)

Please preview the comment before submitting to make sure it's OK.

Advertisements