Skip to content

Latest commit

 

History

History
191 lines (130 loc) · 11.4 KB

File metadata and controls

191 lines (130 loc) · 11.4 KB

Fundamentals of Algorithms Analysis

You are probably reading this book because you want to write better and faster code. How can you do that? Can you time how long it takes to run a program? Of course, you can! However, if you run the same program on a computer, cellphone, or even a smartwatch, it will take different times.

image

Wouldn’t it be great if we can compare algorithms regardless of the hardware where we run them? That’s what time complexity is for! But, why stop with the running time? We could also compare the memory "used" by different algorithms, and we call that space complexity.

In this chapter, you will learn:
  • What’s the best way to measure your code’s performance regardless of what hardware you use.

  • Learn how to use Big O notation to compare algorithms.

  • How to use algorithms analysis to improve your programs speed.

Before going deeper into space and time complexity, let’s cover the basics real quick.

What are Algorithms?

Algorithms (as you might know) are steps of how to do some tasks. When you cook, you follow a recipe (or an algorithm) to prepare a dish. Let’s say you want to make a pizza.

Example of an algorithm to make pizza
import { rollOut, applyToppings, Oven } from '../pizza-utils';

function makePizza(dough, toppings = ['cheese']) {
    const oven = new Oven(450);
    const rolledDough = rollOut(dough);
    const rawPizza = applyToppings(rolledDough, toppings);
    const pizzaPromise = oven.bake(rawPizza, { minutes: 20 });
    return pizzaPromise;
}

If you play a game, you’ll devise strategies (or algorithms) to win. Likewise, algorithms in computers are a set of instructions used to solve a problem.

Tip
Algorithms are the steps on how to perform a task.

Comparing Algorithms

Not all algorithms are created equal. There are “good” and “bad” algorithms. The good ones are fast; the bad ones are slow. Slow algorithms cost more money to run. Inefficient algorithms could make some calculations impossible in our lifespan!

Let’s say you want to compute the shortest path from Boston to San Francisco. Slow algorithms can take hours or crash before finishing. On the other hand, a "good" algorithm might compute in a few seconds.

Usually, algorithms time grows as the size of the input increases. For instance, calculating the shortest distance from your house to the local supermarket will take less time than other destination thousands of miles away.

Another example is sorting an array. A good sorting algorithm is part04-algorithmic-toolbox.asc, and an inefficient algorithm for large inputs is part04-algorithmic-toolbox.asc. Organizing 1 million elements with merge sort could take 20 seconds, for instance, while selection sort takes 12 days, ouch! The fantastic thing is that both programs solve the same problem with comparable data and hardware; yet, there’s a big difference in time! Bad algorithms would perform poorly, even on a supercomputer.

To give you a clearer picture of how different algorithms perform as the input size grows, look at the following problems and how their relative execution time changes as the input size increases.

Table 1. Relationship between algorithm input size and time taken to complete
Input size → 10 100 10k 100k 1M

Finding if a number is odd

< 1 sec.

< 1 sec.

< 1 sec.

< 1 sec.

< 1 sec.

Sorting array with merge sort

< 1 sec.

< 1 sec.

< 1 sec.

few sec.

20 sec.

Sorting array with Selection Sort

< 1 sec.

< 1 sec.

2 minutes

3 hours

12 days

Finding all subsets

< 1 sec.

40,170 trillion years

> centillion years

Finding string permutations

4 sec.

> vigintillion years

> centillion years

As you can see in the table, most algorithms on the table are affected by the input size. But not all and not at the same rate. Finding out if a number is odd will take the same if it is 1 or 1 million. We say then that the growth rate is constant. Others grow very fast. Finding all the permutations on a string of length 10 takes a few seconds, while if the string has a size of 100, it won’t even finish!

After completing this book, you are going to think algorithmically. You will be able to tell the growth rate of your programs and scale them. You’ll find bottlenecks of existing software and have an part04-algorithmic-toolbox.asc.

Increasing your code performance

The first step to improve your code performance is to learn how to measure it. As somebody said:

Measurement is the first step that leads to control and eventually to improvement. If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.
— H. J. Harrington

This section will learn the basics of measuring our current code performance and compare it with other algorithms.

Calculating Time Complexity

In computer science, time complexity describes the number of operations a program will execute given the size of the input n.

How do you get a function that gives you the rough number of operations that the CPU will execute?

One idea is to analyze your code line by line and mind code inside loops. Let’s do an example to explain this point. For instance, we have a function to find the minimum value on an array called getMin.

Operations per line
Figure 1. Translating lines of code to an approximate number of operations

Assuming that each line of code is an operation, we get the following:

3n + 3

n = input size.

That means that if you have an array of 3 elements, e.g. getMin([3, 2, 9]), then it will execute around 3(3)+3 = 12 operations. Of course, this is not for every case. For instance, Line 12 is only executed if the condition on line 11 is met. As you might learn in the next section, we want to get the big picture and get rid of smaller terms to compare algorithms easier.

Space Complexity

Space complexity is similar to time complexity. However, instead of the count of operations executed, it will account for the amount of memory used additionally to the input.

For calculating the space complexity, we keep track of the “variables” and memory used. In the getMin example, we create a variable called min, which only holds one value at a time. So, the space complexity is 1. On other algorithms, If we have to use an auxiliary array that holds the same number of elements as the input, then the space complexity would be n.

Simplifying Complexity with Asymptotic Analysis

When we compare algorithms, we about the growth rate when the input gets huge (towards infinity). Then you have a function like 20*n^3 + 100. If n is one million. The term + 100 makes a tiny contribution to the result (less than 0.000001%). Here is when the asymptotic analysis comes to the rescue.

Tip
Asymptotic analysis describes the behavior of functions as their inputs approach to infinity.

In the previous example, we analyzed getMin with an array of size 3; what happens if the size is 10, 10k, or 10 million?

Table 2. Operations performed by an algorithm with a time complexity of 3n + 3
n (size) Operations total

10

3(10) + 3

33

10k

3(10k)+3

30,003

1M

3(1M)+3

3,000,003

As the input size n grows bigger and bigger, then the expression 3n + 3 is closer and closer to 3n. Dropping terms might look like a stretch at first, but you will see that what matters the most is the higher-order terms of the function rather than lesser terms and constants.

What is Big O Notation?

There’s a notation called Big O, where O refers to the order of a function in the worst-case scenario.

Tip
Big O = Big Order (rate of growth) of a function.

If you have a program that has a runtime of:

7n3 + 3n2 + 5

You can express it in Big O notation as O(n3). The other terms (3n2 + 5) will become less significant as the input grows bigger.

Big O notation only cares about the “biggest” terms in the time/space complexity. It combines what we learn about time and space complexity, asymptotic analysis and adds a worst-case scenario.

All algorithms have three scenarios:
  • Best-case scenario: the most favorable input arrangement where the program will take the least amount of operations to complete. E.g., a sorted array is beneficial for some sorting algorithms.

  • Average-case scenario: this is the most common case. E.g., array items in random order for a sorting algorithm.

  • Worst-case scenario: the inputs are arranged in such a way that causes the program to take the longest to complete. E.g., array items in reversed order for some sorting algorithm will take the longest to run.

To sum up:

Tip
Big O only cares about the run time function’s highest order in the worst-case scenario.
Warning
Don’t drop terms that are multiplying other terms. O(n log n) is not equivalent to O(n). However, O(n + log n) is.

There are many common notations like polynomial, O(n2) as we saw in the getMin example, constant O(1), and many more that we are going to explore in the next chapter.

Again,

Tip
the time complexity is not a direct measure of how long a program takes to execute, but rather how many operations it performs given the input size.

Nevertheless, there’s a relationship between time complexity and clock time, as shown in the following table.

Table 3. How long an algorithm takes to run based on their time complexity and input size
Input Size O(1) O(n) O(n log n) O(n2) O(2n) O(n!)

1

< 1 sec.

< 1 sec.

< 1 sec.

< 1 sec.

< 1 sec.

< 1 sec.

10

< 1 sec.

< 1 sec.

< 1 sec.

< 1 sec.

< 1 sec.

4 seconds

10k

< 1 sec.

< 1 sec.

< 1 sec.

2 minutes

100k

< 1 sec.

< 1 sec.

1 second

3 hours

1M

< 1 sec.

1 second

20 seconds

12 days

Note
This is just an illustration since, in different hardware, the times will be distinct. These times are under the assumption of running on 1 GHz CPU, and it can execute on average one instruction in 1 nanosecond (usually takes more time). Also, keep in mind that each line might be translated into dozens of CPU instructions depending on the programming language.

Summary

In this chapter, we learned how you could measure your algorithm performance using time complexity. Rather than timing how long your program takes to run, you can approximate the number of operations it will perform based on the input size.

We learned about time and space complexity and how they can be translated to Big O notation. Big O refers to the order of the function.

In the next section, we will go deeper into how to analyze time complexity and provide examples!