Analysis of Algorithms
This topic is key to literally every other one, and also seems to make up 90% of the exam questions (despite there being only 1 lecture on it) so it's very important.
- Need some way to characterise how good a data structure or algorithm is
- Most algorithms take input and generate output
- The run time of an algorithm typically grows with input size
- Average case is often difficult to determine
- Focus on the worst case
- Runtime analysis and benchmarks can be used to determine the performance of an algorithm, but this is often not possible
- Results will also vary from machine to machine
- Theoretical analysis is preferred as it gives a more high-level analysis
- Characterises runtime as a function of input size
Pseudocode
- Pseudocode is a high level description of an algorithm
- Primitive perations are assumed to take unit time
- For example
- Evaluating an expression
- Assigning to a variable
- Indexing into an array
- Calling a method
Looking at an algorithm, can count the number of operations in each step to analyse its runtime
public static double arrayMax(double[] data){
int n = data.length; //2 ops
double max = data[0]; //2 ops
for (int j=1; j < n;j++) //2n ops
if(data[j] > max) //2n-2 ops
max = data[j]; //0 to 2n-2 ops
return max; //1 op
}
- In the best case, there are primitive operations
- In the worst case,
- The runtime is therefore
- is the time to execute a primitive operation
Functions
There are 7 important functions that appear often when analysing algorithms
- Constant -
- A fixed constant
- Could be any number but 1 is the most fundamental constant
- Sometimes denoted where
- Logarithmic -
- For some constant ,
- Logarithm is the inverse of the power function
- Usually, because we are computer scientists and everything is base 2
- Linear -
-
- is a fixed constant
-
- n-log-n -
- Commonly appears with sorting algorithms
- Quadratic -
- Commonly appears where there are nested loops
- Cubic -
- Less common, also appears where there are 3 nested loops
- Can be generalised to other polynomial functions
- Exponential -
-
- is some arbitrary base, is the exponent
-
The growth rate of these functions is not affected by changing the hardware/software environment. Growth rate is also not affected by lower-order terms.
- Insertion sort takes time
- Characterised as taking time
- Merge sort takes
- Characterised as
- The
arrayMax
example from earlier took time- Characterised as
- A polynomial of degree , is of order
Big-O Notation
- Big-O notation is used to formalise the growth rate of functions, and hence describe the runtime of algorithms.
- Gives an upper bound on the growth rate of a function as
- The statement " is " means that the growth rate of is no more than the growth rate of
- If is a polynomial of degree , then is
- Drop lower order terms
- Drop constant factors
- Always use the smallest possible class of functions
- is , not
- Always use the simplest expression
- is , not
Formally, given functions and , we say that is if there is a positive constant and a positive integer constant , such that
where , and
Examples
is :
The function is not The inequality does not hold, since must be constant.
Big-O of :
Big-O of :
is
Asymptotic Analysis
- The asymptotic analysis of an algorithm determines the running time big-O notation
- To perform asymptotic analysis:
- Find the worst-case number of primitive operations in the function
- Express the function with big-O notation
- Since constant factors and lower-order terms are dropped, can disregard them when counting primitive operations
Example
The th prefix average of an array is the average of the first elements of . Two algorithms shown below are used to calculate the prefix average of an array.
Quadratic time
//returns an array where a[i] is the average of x[0]...x[i]
public static double[] prefixAverage(double[] x){
int n = x.length;
double[] a = new double[n];
for(int j = 0; j < n; j++){
double total = 0;
for(int i = 0; i <= j; i++)
total += x[i];
a[j] = total / (j+1);
}
return a;
}
The runtime of this function is . The sum of the first integers is , so this algorithm runs in quadratic time. This can easily be seen by the nested loops in the function too.
Linear time
//returns an array where a[i] is the average of x[0]...x[i]
public static double[] prefixAverage(double[] x){
int n = x.length;
double[] a = new double[n];
double total = 0;
for(int i = 0; i <= n; i++){
total += x[i];
a[i] = total / (i+1);
}
return a;
}
This algorithm uses a running average to compute the same array in linear time, by calculating a running sum.
Big-Omega and Big-Theta
Big-Omega is used to describe the best case runtime for an algorithm. Formally, is if there is a constant and an integer constant such that
Big-Theta describes the average case of the runtime. is if there are constants and , and an integer constant such that
The three notations compare as follows:
- Big-O
- is if is asymptotically less than or equal to
- Big-
- is if is asymptotically greater than or equal to
- Big-
- is if is asymptotically equal to