Analysis of Algorithms

This topic is key to literally every other one, and also seems to make up 90% of the exam questions (despite there being only 1 lecture on it) so it's very important.

  • Need some way to characterise how good a data structure or algorithm is
  • Most algorithms take input and generate output
  • The run time of an algorithm typically grows with input size
  • Average case is often difficult to determine
    • Focus on the worst case
  • Runtime analysis and benchmarks can be used to determine the performance of an algorithm, but this is often not possible
    • Results will also vary from machine to machine
  • Theoretical analysis is preferred as it gives a more high-level analysis
    • Characterises runtime as a function of input size

Pseudocode

  • Pseudocode is a high level description of an algorithm
  • Primitive perations are assumed to take unit time
  • For example
    • Evaluating an expression
    • Assigning to a variable
    • Indexing into an array
    • Calling a method

Looking at an algorithm, can count the number of operations in each step to analyse its runtime

public static double arrayMax(double[] data){
    int n = data.length; //2 ops
    double max = data[0]; //2 ops
    for (int j=1; j < n;j++) //2n ops
        if(data[j] > max) //2n-2 ops
            max = data[j]; //0 to 2n-2 ops
    return max; //1 op
}
  • In the best case, there are primitive operations
  • In the worst case,
  • The runtime is therefore
    • is the time to execute a primitive operation

Functions

There are 7 important functions that appear often when analysing algorithms

  • Constant -
    • A fixed constant
    • Could be any number but 1 is the most fundamental constant
    • Sometimes denoted where
  • Logarithmic -
    • For some constant ,
    • Logarithm is the inverse of the power function
    • Usually, because we are computer scientists and everything is base 2
  • Linear -
      • is a fixed constant
  • n-log-n -
    • Commonly appears with sorting algorithms
  • Quadratic -
    • Commonly appears where there are nested loops
  • Cubic -
    • Less common, also appears where there are 3 nested loops
    • Can be generalised to other polynomial functions
  • Exponential -
      • is some arbitrary base, is the exponent

The growth rate of these functions is not affected by changing the hardware/software environment. Growth rate is also not affected by lower-order terms.

  • Insertion sort takes time
    • Characterised as taking time
  • Merge sort takes
    • Characterised as
  • The arrayMax example from earlier took time
    • Characterised as
  • A polynomial of degree , is of order

Big-O Notation

  • Big-O notation is used to formalise the growth rate of functions, and hence describe the runtime of algorithms.
  • Gives an upper bound on the growth rate of a function as
  • The statement " is " means that the growth rate of is no more than the growth rate of
  • If is a polynomial of degree , then is
    • Drop lower order terms
    • Drop constant factors
  • Always use the smallest possible class of functions
    • is , not
  • Always use the simplest expression
    • is , not

Formally, given functions and , we say that is if there is a positive constant and a positive integer constant , such that

where , and

Examples

is :

The function is not The inequality does not hold, since must be constant.

Big-O of :

Big-O of :

is

Asymptotic Analysis

  • The asymptotic analysis of an algorithm determines the running time big-O notation
  • To perform asymptotic analysis:
    • Find the worst-case number of primitive operations in the function
    • Express the function with big-O notation
  • Since constant factors and lower-order terms are dropped, can disregard them when counting primitive operations

Example

The th prefix average of an array is the average of the first elements of . Two algorithms shown below are used to calculate the prefix average of an array.

Quadratic time

//returns an array where a[i] is the average of x[0]...x[i]
public static double[] prefixAverage(double[] x){
    int n = x.length;
    double[] a = new double[n];
    for(int j = 0; j < n; j++){
        double total = 0;
        for(int i = 0; i <= j; i++)
            total += x[i];
        a[j] = total / (j+1);
    }
    return a;
}

The runtime of this function is . The sum of the first integers is , so this algorithm runs in quadratic time. This can easily be seen by the nested loops in the function too.

Linear time

//returns an array where a[i] is the average of x[0]...x[i]
public static double[] prefixAverage(double[] x){
    int n = x.length;
    double[] a = new double[n];
    double total = 0;
    for(int i = 0; i <= n; i++){
        total += x[i];
        a[i] = total / (i+1);
    }
    return a;
}

This algorithm uses a running average to compute the same array in linear time, by calculating a running sum.

Big-Omega and Big-Theta

Big-Omega is used to describe the best case runtime for an algorithm. Formally, is if there is a constant and an integer constant such that

Big-Theta describes the average case of the runtime. is if there are constants and , and an integer constant such that

The three notations compare as follows:

  • Big-O
    • is if is asymptotically less than or equal to
  • Big-
    • is if is asymptotically greater than or equal to
  • Big-
    • is if is asymptotically equal to