Fibonacci of Pisa

In this article I'd like to show how the theory does not always match the practice. I am sure you all know the linear time algorithm for finding Fibonacci numbers. The analysis says that the running time of this algorithm is O(n). But is it still O(n) if we actually run it? If not, what is wrong?

Let's start with the simplest linear time implementation of the Fibonacci number generating algorithm in Python:

def LinearFibonacci(n):
  """ Finds the n-th Fibonacci number. n >= 1 """
  fn = f1 = f2 = 1
  for x in xrange(2, n):
    fn = f1 + f2
    f2, f1 = f1, fn
  return fn
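
As a quick sanity check before measuring anything, here are the first ten values it produces:

print [LinearFibonacci(i) for i in xrange(1, 11)]
# prints [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]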

The theory says that this algorithm should run in O(n) time: given the n-th Fibonacci number to find, the algorithm does a single loop up to n.

Now let's verify whether this algorithm is really linear in practice. If it is, then the plot of n vs. the running time of LinearFibonacci(n) should be a line. I plotted these values for n up to 200,000 and here is the plot that I got:

[Plot: running time of LinearFibonacci(n) against n, fitted with a quadratic function]
Note: Each data point was averaged over 10 calculations.

Oh no! This does not look linear at all! It looks quadratic! I fitted the data with a quadratic function and it fit nearly perfectly. Do you know why the seemingly linear algorithm went quadratic?

The answer is that the theoretical analysis assumed that all the operations in the algorithm execute in constant time. But this is not the case when we run the algorithm on a real machine! As the Fibonacci numbers get larger, each addition for calculating the next Fibonacci number, "fn = f1 + f2", runs in time proportional to the length (number of digits) of the previous Fibonacci number. That's because these huge numbers no longer fit in the basic units of computation in the CPU, so a big integer library is required. And in a big integer library, adding two numbers of length O(n) takes O(n) time.
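
Here is a minimal sketch that makes this hidden cost visible: it times a batch of big-integer additions at several digit counts (the digit counts and the repeat count are arbitrary choices for illustration):

from time import time

def time_addition(digits, repeats=1000):
  # Build two big integers with the given number of decimal digits
  # and time `repeats` additions of them.
  a = 10**digits - 1
  b = 10**digits - 2
  start = time()
  for i in xrange(repeats):
    c = a + b
  return time() - start

for d in [1000, 10000, 100000, 1000000]:
  print "%7d digits: %f seconds" % (d, time_addition(d))

If big integer addition is linear in the number of digits, the reported times should grow roughly tenfold from row to row.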

I'll show you that the running time of the real-life linear Fibonacci algorithm really is O(n^2) by taking this hidden cost of the bigint library into account.

So at each iteration i we have a hidden cost of O(number of digits of f_i) = O(digits(f_i)). Let's sum these hidden costs for the whole loop up to n:

$$\sum_{i=1}^{n} O(\text{digits}(f_i))$$

Now let's find the number of digits in the n-th Fibonacci number. To do that, let's use the well-known Binet's formula, which tells us that the n-th Fibonacci number f_n can be expressed as:

$$f_n = \frac{\varphi^n - (1-\varphi)^n}{\sqrt{5}}, \qquad \varphi = \frac{1+\sqrt{5}}{2}$$

It is also well known that the number of digits in a positive integer x is the integer part of log10(x), plus one. Thus the number of digits in the n-th Fibonacci number is:

$$\text{digits}(f_n) = \lfloor \log_{10} f_n \rfloor + 1 \approx n\log_{10}\varphi - \log_{10}\sqrt{5} + 1 = O(n)$$

(the $(1-\varphi)^n$ term vanishes as n grows, because $|1-\varphi| < 1$)

Thus if we now sum all the hidden costs for finding the n-th Fibonacci number we get:

$$\sum_{i=1}^{n} O(\text{digits}(f_i)) = \sum_{i=1}^{n} O(i) = O\left(\frac{n(n+1)}{2}\right) = O(n^2)$$

There we have it. The running time of this "linear" algorithm is actually quadratic once we take into consideration that each addition runs in time proportional to the length of the addends.

Next time I'll show you that if the addition operation runs in constant time, then the algorithm is truly linear; and later I will do a similar analysis of the logarithmic time algorithm for finding Fibonacci numbers that uses this awesome matrix identity:

$$\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n = \begin{pmatrix} f_{n+1} & f_n \\ f_n & f_{n-1} \end{pmatrix}$$
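
To give a taste of it, here is a minimal sketch of that algorithm: raise the matrix to the n-th power by repeated squaring, so only O(log n) matrix multiplications are needed (mat_pow and LogFibonacci are just illustrative names; the follow-up article will analyze the real cost):

def mat_mult(A, B):
  # Multiply two 2x2 matrices given as nested tuples.
  return ((A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]),
          (A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]))

def mat_pow(A, n):
  # Raise a 2x2 matrix to the n-th power by repeated squaring.
  result = ((1, 0), (0, 1))  # identity matrix
  while n > 0:
    if n & 1:
      result = mat_mult(result, A)
    A = mat_mult(A, A)
    n >>= 1
  return result

def LogFibonacci(n):
  # By the identity above, [[1,1],[1,0]]^n has f(n) in its off-diagonal.
  return mat_pow(((1, 1), (1, 0)), n)[0][1]

For example, LogFibonacci(10) returns 55. Note that each matrix multiplication still operates on ever larger big integers, so the same kind of hidden cost will appear here too.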

Don't forget to subscribe if you are interested! It's well worth every byte!

Comments

July 07, 2009, 12:25

Hi all!

Just a note about the Python code:

def LinearFibonacci(n):
  fn = f1 = f2 = 1
  for x in xrange(2, n):
    fn = f1 + f2
    f2, f1 = f1, fn
  return fn

fn is not needed here, and if one opts to use it anyway, it need not be initialised before the loop. Furthermore, the variable names "f1" and "f2" are confusing here, because I expect f2 to be bigger than f1. I would write this code as:

def LinearFibonacci(n):
  curr, next = 1, 1
  for x in xrange(2, n):
    next, curr = ((next+curr),next)
  return next

Tested and works.

Richie
July 07, 2009, 12:34

Without reading your analysis it's clear that LinearFibonacci computes all the fib numbers up to "n", and then throws all but the last number away. Next time it recomputes the same numbers again, so of course you get complexity O(n^2).

If the function filled an array then computation would be linear and only one call would be required.

alexandru
July 07, 2009, 12:50

You can always use the O(log N) algorithm to compute the nth fibonacci number (or O(N log N) using your analysis). Your article's point is that big numbers (i.e. not implemented in the hardware) don't have constant time operations. :|

ramsandesh
April 22, 2010, 02:52

Excuse me sir,
I am Ram Sandesh. I am doing my B.Tech degree. I am interested in knowing this algorithm with time complexity of O(log n). Can you please mail me back the graph & algorithm? I guess you have done it with the matrix multiplication method...

William
May 26, 2011, 04:08

Actually, the «O(log n)» algorithm is not O(n log n) but O(n^a) where a single call is also O(n^a). If you use naïve long multiplication, a is 2; if you use Karatsuba multiplication, a is log(3)/log(2), which is better.

July 07, 2009, 12:56

Yes, LinearFibonacci() is O(n), but you run it m times, which means the total computation is O(mn) time. And on each run of LinearFibonacci() m = n, so that's O(n^2). Why should that be a surprise?

Also, Richie is correct: if you cached your previous results, it would be O(n). You've only come up with an O(n^2) algorithm for a problem that could be solved in O(n) time.

Madars
July 07, 2009, 12:58

Richie: your point about throwing the results away is valid when multiple calls to LinearFibonacci are made. However, it seems that Peteris considers only the performance of a single call to LinearFib; even with LinearFib modified according to your suggestion, the running time of a single call will still be quadratic.

However, your point is noted when the application requires multiple Fibonacci numbers - dropping previously computed results is a bit foolish then.

Troll
July 07, 2009, 13:06

It seems beside the point to bring up caching the results. This is a very good analysis of the given algorithm.

David
July 07, 2009, 13:06

Corey, Richie: I believe you are incorrect. The following is a naive O(n^2) algorithm for calculating the nth fibonacci number:

def fib(n):
  if n == 0 or n == 1:
    return n
  else:
    return fib(n-1) + fib(n-2)

However, the OP's algorithm simply performs n additions to find the nth fibonacci number -- there is no recursion. To work out [LinearFibonacci(x) for x in xrange(1,n)] would be O(n^2), but LinearFibonacci(n) itself is just O(n) (counting additions as constant time).

July 07, 2009, 13:12

As Madars says, I consider only the performance of a single call to LinearFibonacci.

For example, if I did a call to LinearFib(1000) and LinearFib(2000), from the analysis of the algorithm I would expect that LinearFib(2000) would take twice as much time as LinearFib(1000). But in practice it takes four times as much time. Caching would not help here: even with the first 1000 numbers cached, you would still have to iterate from 1000 to 2000!

Bob
July 07, 2009, 13:38

Corey & Richie - you are wrong. It's only the test harness that iterates the parameter from 1 up to 200,000. The points on the graph are the time taken to complete the operation for a single value of N.

Alexandru hits the nail on the head - "big numbers don't have constant time operations"

huaghaguah
July 07, 2009, 13:39

Common sense: theory does not (necessarily) match practice when the theoretical assumptions don't hold.

http://gmplib.org/manual/Fibonacci-Numbers-Algorithm.html

Not sure if this is still quadratic or not, but either way it's going to be an effective speed-up.

constant time
July 07, 2009, 13:52

How about a closed form evaluation?

F(n) = (Phi^n - (1-Phi)^n) / sqrt(5)

Where Phi is the Golden Ratio.
Phi = (1 + sqrt(5)) / 2

EntropySpark
February 02, 2014, 04:26

In that case, you'll need phi and sqrt(5) to have enough precision to round correctly. For the 1000th Fibonacci number, you'd need both to have over 200 digits of accuracy. You then have to raise your large (data-wise) decimal number to a large power. Computationally, it would be much worse than repeated addition.
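
A quick sketch that shows where plain double precision gives out (binet_float is just an illustrative helper; the loop walks the exact big-integer sequence and reports the first disagreement):

from math import sqrt

def binet_float(n):
  # Binet's formula evaluated with double-precision floats.
  phi = (1 + sqrt(5)) / 2
  return int(round((phi**n - (1 - phi)**n) / sqrt(5)))

# Walk the exact sequence (a = f(n-1), b = f(n)) and find the first n
# where the floating-point closed form disagrees.
a, b, n = 1, 1, 2
while binet_float(n) == b:
  a, b = b, a + b
  n += 1
print "floating-point Binet first disagrees at n =", n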

July 07, 2009, 14:37

To David: It's not a naive O(N^2) algorithm, it's actually O(Fib(n)) (assuming constant time addition).

To Peteris: Nice analysis :) It was well worth every byte :)

Dror Levin
July 07, 2009, 14:53

Thanks for the analysis, sure surprised me.

If possible, I'd like to hear how you generated that graph and those formulas. I generally use gnuplot and latex/lyx, but I'm sure there are numerous tips I could learn from you :)

July 07, 2009, 15:11

Why have you compared the running time of Fib(n) to Fib(n)? There's no reason why this should give you any information about the complexity of the function. You must have meant to compare the running time of Fib(n) to n, which should be linear. It's no surprise that you found a quadratic relationship since the running time is linear in n and Fib(n) is quadratic in n, so Fib(n) is quadratic with respect to the running time.

Albert
July 07, 2009, 15:20

All you've done is point out that you have to know what you're doing and not blindly follow your textbook and it's assumptions.

Albert
July 07, 2009, 15:20

its*

July 07, 2009, 15:40

John Tantalo, no, I compared n to running_time(Fib(n)).

July 07, 2009, 15:45

Dror Levin, here is how I did it:

First I created a Python program to time execution for an individual n:

#!/usr/bin/python
#

import sys
from time import time

def LinearFibonacci(n):
  """ Finds n-th Fibonacci number. n >= 1 """
  fn = f1 = f2 = 1
  for x in xrange(2, n):
    fn = f1 + f2
    f2, f1 = f1, fn
  return fn

if __name__ == '__main__':
  args = sys.argv[1:]
  if len(args)<1:
    print "Usage: prog.py <fib>"
    sys.exit(1)
  fib = int(args[0])
  start = time()
  LinearFibonacci(fib)
  end = time()
  print "%d, %f" % (fib, end-start)

Next I created a loop in shell to run the program 10 times for each individual value of n:

for ((i=1;i<=200000;i+=1000)); do
  for ((j=1;j<=10;j++)); do
    ./linear.py $i >> linear.txt;
    sleep 0.1
  done;
done;

Next, I used an Awk program to calculate the average of 10 runs:

awk -F, '
  NR==1{this=$1;prev=this}
  {this=$1}
  this!=prev{printf "%d, %f\n", prev, a/10; prev=this; a=0}
  {a+=$2}
  END {printf "%d, %f\n", prev, a/10}'
  linear.txt > linear-averaged.txt

Finally I used the following script for GnuPlot to create the graph and fit the data with a quadratic function:

filename = "linear-joined.txt"
f(x) = a*x**2 + b*x
fit f(x) filename via a, b
set title "Running time of Linear Fibonacci algorithm"      
set xlabel "n-th fibonacci number" tc lt 3
set ylabel "running time in seconds" tc lt 3 
set tics font "arial, 7"
set terminal gif size 480,360 font arial 9
set output './linear-fibonacci.gif'
plot filename with points pointtype 1 pointsize 0.6 title "LinearFibonacci(n)", f(x) title "O(x^2) approximation"

For the formulas I used an online LaTeX generator.

Have fun!

aahmad
July 07, 2009, 15:48

Since you've introduced the matrix form of the Fibonacci recurrence, I think it would interest your readers to decompose it into PDP^-1 form and thereby derive the closed form formula one reader mentions.

July 07, 2009, 15:55

> I compared n to running_time(Fib(n))

Then why does the plot have "n-th fibonacci number" on the x axis? This should be "n".

And why do you say "the plot of LinearFibonacci(n) vs. running time of LinearFibonacci(n) should be a line"? This should be "the plot of n vs. running time of LinearFibonacci(n) should be a line".

July 07, 2009, 15:57

John, that's a mistake right there. Will fix it now! Thanks for spotting.

July 07, 2009, 15:58

I see by your sample code that you were indeed comparing n to the running time of fib(n). Please correct your graphs and analysis!

July 07, 2009, 16:07

Learn all about memoization =]
http://en.wikipedia.org/wiki/Memoization

July 07, 2009, 16:25

John Tantalo, I have now corrected the graph and analysis.

July 07, 2009, 16:25

Brandon Pelfrey, sorry, but this time it has nothing to do with memoization! See one of my previous comments above!

ZachPruckowski
July 07, 2009, 17:08

But if we used a 128-bit processor (or created instructions to do a single-step 128-bit add) and updated our language to support these 128-bit integers, we'd return to O(n) time, since adding is once again an O(1) operation. I think that most modern processors support special case 128-bit-adds through SSE - so it's likely your Python Big-Number library that's the bottleneck?

July 07, 2009, 17:21

Zach, you're missing the point, which is that in the real world, you need to take the finite capabilities of your computational unit into consideration when considering the order of your algorithm. Great analysis Peteris!

Ray
July 07, 2009, 18:05

Zach, you're also just plain wrong. 128-bit integers only get you up to around f(200) or so (too lazy to calculate the exact number of bits, but it's close). Fibonacci numbers are *big*.

Anyway, any decent implementation will have a short linear part at the beginning, and then an n^2 part after that.

Unless, of course, you can play tricks to get your computation time down to constant again. No existing processor can do this, but a processor that had a configurable number of bits could stay linear for as long as it had computation units. Again, difference between theory and practice.
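
For the lazy, a few lines of the same big-integer arithmetic compute the exact crossover (a small sketch; a and b walk the sequence as f(n-1) and f(n)):

a, b, n = 1, 1, 2
while b < 2**128:
  a, b = b, a + b
  n += 1
print "f(n) no longer fits in 128 bits at n =", n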

steve
July 07, 2009, 19:33

No, technically it is exponential. This is a common problem used to trip up CS grad students during their qualifying exam. When finding the complexity of an algorithm, you must consider the number of steps with respect to the size of the input. The input size is log(n) bits and the number of steps in the for loop is n, hence the algorithm is exponential with respect to the input size.

This is why the naive primality test:

bool isPrime(int n) {
  for (int i = 2; i <= n/2; i++) {
    if (n % i == 0) return false;
  }
  return true;
}

fails to be O(n). Although long suspected to be in P, primality was not classified as such until the recent discovery of the AKS primality test.

See: http://en.wikipedia.org/wiki/Pseudo-polynomial_time for more info

Richie
July 07, 2009, 19:58

Yes, in my comment I assumed addition time is constant. With long precision numbers it is not. So my comment was not entirely correct. :)

knweiss
July 07, 2009, 20:05

Another practical issue theoretical computer scientists often ignore (apart from machine word sizes) is the memory hierarchy. That is, the evaluation of an algorithm is ignorant of the locality of memory accesses. But in practice this makes a HUGE difference.

Calyth
July 07, 2009, 22:33

@ZachPruckowski: the 187th fibonacci number cannot be expressed in one 128-bit register, which renders the point rather moot.

The point is that physical limitations of the machine can and will increase computation time, and can make things inaccurate. Not realizing that the Fibonacci sequence can quickly exceed your register size can lead to an unexpected increase in computation time, and not taking care with your floating point calculations can cause erroneous results.

July 07, 2009, 23:17

Just like the good old days, when int multiplications ran in time proportional to the number of bits set in the second operand ... and divisions ran *eventually*.

Interesting analysis & shows the (occasional) perils of abstraction. Thanks.

-----N

Macloud
July 08, 2009, 00:31

>In this article I’d like to show how the theory does not always match the practice

Just be careful to pick the right theory. A theory saying that f1+f2 takes constant time is useful in some situations, just not in this one. You picked the right theory and it matches the practice.

July 08, 2009, 05:36

Hi catonmat,

I looked at this a while ago because I was annoyed about stupid implementations of Fibonacci. I quickly noticed that the value of Fn overflows fixed precision datatypes, so as you point out you can't use fixed precision arithmetic and have to take into account the non-trivial complexity of your operations.

I looked at the real complexity of Binet's formula and implemented the better identity used by GMP:

F(2k) = F(k+1)^2 - F(k-1)^2

Glimpses of Lisp: Fibonacci and the Netflix prize

July 08, 2009, 07:42

As ZachPruckowski and others pointed out, the analysis is misleading. The skew is due to the big integer computation and has no bearing on the theory of asymptotic analysis. The syntactic sugar of Python disguising the underlying big number implementation has misled your analysis.
It is like expecting length(string) to be an O(1) operation because you could express it as, say, a # operator.

MAK
August 19, 2009, 19:03

The fact that several commentators are still confused shows how deep-rooted this misconception is.

Possibly one of the most "simple yet insightful" posts I've read. Great work!
