1. Efficiency of Algorithms
All algorithms have a cost, and we want to minimize that cost as much as possible by saving operations. An algorithm's complexity is its time and space requirements: time complexity is the time it takes to execute, and space complexity is the memory it needs to execute. There is often an inverse relationship between the two, but sometimes you can save both. The process of measuring complexity in algorithms is called analysis of algorithms. The problem size is the number of items or inputs that an algorithm processes. A function that measures how an algorithm's time requirement grows as the problem size grows is called a growth-rate function. An algorithm's basic operation is the most significant contributor to its total time requirement.
The process of counting computer operations is greatly simplified if we accept the RAM model of computation:
- Access to each memory element takes a constant time (1 step).
- Assignments take 1 step.
- Each "simple" operation (+, -, =, /, if, call) takes 1 step.
- Loops and function/method calls are not simple operations: they depend upon the size of the data and the contents of the subroutine. For example:
  - "sort()" is not a single-step operation.
  - "max(list)" is not a single-step operation.
  - "if x in list" is not a single-step operation.
The RAM model isn't always the most accurate, but it's still very useful as a tool for comparing algorithms.
Now that we have some terms down, let's step back and look at an example.
Let's compute the sum $1 + 2 + 3 + \dots + n$. Consider the three algorithms below.
Algorithm A
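The original code isn't reproduced here, so below is a minimal Python sketch of what Algorithm A presumably looks like (the straightforward loop; the function name is just for illustration):

```python
def algorithm_a(n):
    # Add each value 1..n to a running total: n additions.
    total = 0
    for i in range(1, n + 1):
        total = total + i
    return total
```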
Algorithm B
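A sketch of Algorithm B, assuming it is the usual nested-loop version that counts up by 1:

```python
def algorithm_b(n):
    # The inner loop runs i times for each i, so the total number of
    # additions is 1 + 2 + ... + n = n(n + 1)/2.
    total = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            total = total + 1
    return total
```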
Algorithm C
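And a sketch of Algorithm C, assuming it uses the closed-form formula:

```python
def algorithm_c(n):
    # One addition, one multiplication, and one division, regardless of n.
    return n * (n + 1) // 2
```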
Now, let's look at the basic operations of each algorithm. With for loops, remember that a single for loop from 1 to $n$ performs $n$ iterations of its body, two nested loops that each run from 1 to $n$ perform $n^2$, and an inner loop from 1 to $i$ inside an outer loop from 1 to $n$ performs $1 + 2 + \dots + n = n(n + 1)/2$ iterations (which is what happens in Algorithm B).
| | Algorithm A | Algorithm B | Algorithm C |
|---|---|---|---|
| Additions | $n$ | $n(n + 1) / 2$ | $1$ |
| Multiplications | | | $1$ |
| Divisions | | | $1$ |
| Total basic operations | $n$ | $n(n + 1) / 2$ | $3$ |
From the table above, algorithm C is the best and algorithm B is the worst as it requires more and more operations as the problem size increases, while C has the same number of operations no matter what.
Some basic growth-rate functions are ordered below; assume $log()$ is in base 2.
$$ 1 < log(log(n)) < log(n) < log^2(n) < \sqrt{n} < n < nlog(n) < n^2 < n^3 < 2^n < n! $$
The best case is when the algorithm takes the least amount of time, the worst case is when it takes the most, and the average case is the execution time for a typical dataset. Always assume the worst case!
Big Oh Notation
We express these complexities and cases through big oh notation, or $O(g(n))$. Think of big oh as a way to communicate an upper bound on an algorithm's growth; it is most often used to describe the worst case. To get something into big oh notation, figure out the basic operations, then drop the constants and keep the highest-order term. So, for example, $4n+3=O(n)$, $26n^4+n^3+n=O(n^4)$, and $n!+n=O(n!)$. $\Omega(g(n))$ gives a lower bound (often associated with the best case), and $\Theta(g(n))$ gives a tight bound, meaning the function is bounded both above and below by constant multiples of $g(n)$.
Formal Definition
$f(n)=O(g(n))$ if there are positive constants $n_0$ and $c$ such that to the right of $n_0$ the value of $f(n)$ always lies on or below $c\cdot g(n)$.
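As a quick check of the definition against the earlier example $4n+3=O(n)$, one valid choice of constants is $c=5$ and $n_0=3$:

$$4n + 3 \le 5n \quad \text{for all } n \ge 3, \quad \text{so } 4n + 3 = O(n).$$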
Identities
$O(kg(n))=O(g(n))$
$O(g_1(n))+O(g_2(n))=O(g_1(n)+g_2(n))$
$O(g_1(n))\cdot O(g_2(n))=O(g_1(n)\cdot g_2(n))$
$O(g_1(n)+g_2(n)+...+g_m(n))=O(max(g_1(n),g_2(n),...,g_m(n)))$
$O(max(g_1(n),g_2(n),...,g_m(n)))=max(O(g_1(n)),O(g_2(n)),...,O(g_m(n)))$
Graph of Some Big Oh's
Algorithms
Below are some examples of the worst-case efficiencies of sorting and searching algorithms that just about everyone who takes a basic programming class will learn. For greater analysis of each algorithm, I would recommend going to the links provided and figuring out the big oh yourself.
Sorting
- Bubble: $O(n^2)$
- Selection: $O(n^2)$
- Insertion: $O(n^2)$ (see the sketch after this list)
- Binary Insertion: $O(n^2)$ (binary search reduces the comparisons to $O(nlog(n))$, but shifting elements is still $O(n^2)$)
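As a concrete example of one of these, here is a minimal insertion sort sketch (not taken from any particular link above); in the worst case the inner loop shifts about $i$ elements for each position $i$, which is where the $O(n^2)$ comes from:

```python
def insertion_sort(items):
    # Insert each element into its place among the already-sorted prefix.
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Worst case (reverse-sorted input): this loop runs i times.
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key
    return items
```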
Searching
- Linear: $O(n)$
- Binary (when the data is sorted): $O(log(n))$ (see the sketch below)
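A sketch of binary search, assuming the input list is already sorted; the search range is halved on every iteration, which gives the $O(log(n))$ bound:

```python
def binary_search(items, target):
    # Returns the index of target in a sorted list, or -1 if absent.
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # discard the lower half
        else:
            high = mid - 1  # discard the upper half
    return -1
```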
Examples of Common Big Oh's
$O(1)$
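A small illustrative sketch (assuming a non-empty list): indexing takes the same time no matter how long the list is.

```python
def get_first(items):
    # One index access: constant time, independent of len(items).
    return items[0]
```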
$O(n)$
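A sketch of a linear-time operation: one pass over the input, so the work grows in step with its length.

```python
def linear_sum(items):
    # One addition per element: n additions for n elements.
    total = 0
    for x in items:
        total = total + x
    return total
```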
$O(n^2)$
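A sketch of a quadratic-time operation: comparing every pair of elements.

```python
def has_duplicate(items):
    # Roughly n * n / 2 comparisons in the worst case (no duplicate found).
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False
```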
$O(log(n))$
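A sketch of a logarithmic-time operation: the problem size is cut in half on every step, so the loop runs about $log(n)$ times.

```python
def count_halvings(n):
    # How many times can n be halved before reaching 1?
    steps = 0
    while n > 1:
        n = n // 2
        steps += 1
    return steps
```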
$O(\sqrt{n})$
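A sketch of an $O(\sqrt{n})$ operation: trial-division primality testing only needs to check divisors up to $\sqrt{n}$.

```python
def is_prime(n):
    # Tests divisors d while d * d <= n, i.e. about sqrt(n) of them.
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True
```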
$O(n!)$
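A sketch of an $O(n!)$ operation: generating every ordering of the input, of which there are $n!$.

```python
def all_permutations(items):
    # Each element takes a turn at the front, recursing on the rest.
    if len(items) <= 1:
        return [list(items)]
    perms = []
    for i, x in enumerate(items):
        rest = items[:i] + items[i + 1:]
        for p in all_permutations(rest):
            perms.append([x] + p)
    return perms
```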
$O(2^n)$
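A sketch of an $O(2^n)$ operation: listing every subset of the input. Each element is either in or out of a subset, so there are $2^n$ of them.

```python
def all_subsets(items):
    # The number of subsets doubles with every additional element.
    if not items:
        return [[]]
    rest = all_subsets(items[1:])
    return rest + [[items[0]] + s for s in rest]
```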
Commonly Used Summations
Arithmetic Series
$$\sum^{n}_{k=1}{k}=1+2+3+\dots+n=\frac{n(n+1)}{2}=O(n^2)$$
Geometric Series
$$\sum^{n}_{k=0}{x^k}=1+x+x^2+\dots+x^n=\frac{x^{n+1}-1}{x-1}=O(x^n), x\neq1$$
Harmonic Series
$$\sum^{n}_{k=1}{\frac{1}{k}}=1+\frac{1}{2}+\frac{1}{3}+\dots+\frac{1}{n}=O(log(n))$$
Tradeoffs
Something interesting to note with algorithms is trading time and space complexity. Sometimes you can optimize both; more often, programmers can and will accept more space (increasing space complexity) in exchange for less time (decreasing time complexity).
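A classic instance of this tradeoff (a hypothetical sketch, not something from the text above): memoizing a recursive Fibonacci function spends extra memory on a cache so each subproblem is computed only once, cutting the running time from exponential to linear.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # The cache (extra space) stores every result already computed,
    # so each fib(k) is evaluated at most once (much less time).
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```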