Introduction

This book is a collection of all my notes from my degree (computer systems engineering).

It exists for a few purposes:

To consolidate knowledge
To aid revision
To act as a reference during exams

Contributing

If you wish to contribute to this, either to make any additions or just to fix any mistakes I've made, feel free.

The sources are all available on my Github.

CS118

This section is mainly a reference for some of the more detailed parts of the module. It assumes fairly strong prior knowledge of object-oriented programming, so it doesn't aim to be comprehensive; it just records some details to remember for the exam.

The version of Java on DCS systems at the time of writing is 11. This is also the version these notes refer to.

Useful Resources

https://en.wikipedia.org/wiki/Single-precision_floating-point_format
The Oracle documentation for specifics on how Java implements stuff

IEEE 754

IEEE 754 is a standardised way of storing floating point numbers with three components

A sign bit
A biased exponent
A normalised mantissa

Type	Sign bits	Exponent bits	Mantissa bits	Bias
Single Precision (32 bit)	1 (bit 31)	8 (bit 30 - 23)	23 (bit 22 - 0)	127
Double Precision (64 bit)	1 (bit 63)	11 (bit 62 - 52)	52 (bit 51 - 0)	1023

The examples below all refer to 32 bit numbers, but the principles apply to 64 bit.

The exponent is an 8 bit unsigned number in biased form
- To get the true exponent, subtract 127 from the binary value
The mantissa is a binary fraction, with the first bit representing $1/2$ , second bit $1/4$ , etc.
- The mantissa has an implicit $1.$ , so 1 must always be added to the mantissa

Formula

$(-1)^{sign} \times 2^{E - 127} \times (1 + \sum_{i = 1}^{23} b_{23 - i} 2^{- i})$

Decimal to Float

The number is converted to a binary fractional format, then adjusted to fit into the form we need. Take 12.375 for example:

Integer part $(12)_{10} = (1100)_{2}$
Fraction part $(0.375)_{10} = (0.011)_{2}$

Combining the two parts yields $(1100.011)_{2}$ . However, the standard requires that the mantissa have an implicit leading 1, so the binary point must be shifted left until the number is normalised (ie has only 1 as its integer part). This yields $(1.100011)_{2}$ . As the point has been shifted three places, the value is actually $(1.100011)_{2} \times 2^{3}$ . The exponent is therefore $(3)_{10}$ , but this has to be biased (+127) to yield 130 $(10000010)_{2}$ . The number is positive (sign bit zero) so this yields:

Sign	Biased Exponent	Normalised Mantissa
0	1000 0010	100011

$01000001010001100000000000000000$

Float to Decimal

Starting with the value 0x41C80000 = 01000001110010000000000000000000:

Sign	Biased Exponent	Normalised Mantissa
0	1000 0011	1001

The exponent is 131, removing the bias (-127) gives 4
The mantissa is 0.5625, adding the implicit 1 gives 1.5625
$2^{4} \times 1.5625$ gives 25
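
Both worked examples above can be checked against Java's own float representation. The sketch below is only an illustration: the class name FloatBits and its helper methods are made up for this example, while Float.floatToIntBits and Float.intBitsToFloat are standard library methods. The decode method assumes a normalised number and does not handle special values.

```java
class FloatBits {
    // Split a 32-bit pattern into its three IEEE 754 fields.
    static int sign(int bits)     { return bits >>> 31; }
    static int exponent(int bits) { return (bits >>> 23) & 0xFF; }
    static int mantissa(int bits) { return bits & 0x7FFFFF; }

    // Rebuild the value of a normalised number by hand from its fields.
    static double decode(int bits) {
        double fraction = 1.0 + mantissa(bits) / (double) (1 << 23); // implicit leading 1
        double magnitude = fraction * Math.pow(2, exponent(bits) - 127); // remove the bias
        return sign(bits) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        int bits = Float.floatToIntBits(12.375f);
        System.out.println(Integer.toHexString(bits));         // 41460000
        System.out.println(exponent(bits));                    // 130
        System.out.println(decode(0x41C80000));                // 25.0
        System.out.println(Float.intBitsToFloat(0x41C80000));  // 25.0
    }
}
```

The hex value 41460000 is the same bit pattern as the 32-bit string derived by hand in the decimal to float example.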

Special Values

Zero
- When both exponent and mantissa are zero, the number is zero
- Can have both positive and negative zero
Infinity
- Exponent is all 1s, mantissa is zero
- Can be either positive or negative
Denormalised
- If the exponent is all zeros but the mantissa is non-zero, then the value is a denormalised number
- The mantissa does not have an assumed leading one
NaN (Not a Number)
- Exponent is all 1s, mantissa is non-zero
- Represents error values

Exponent	Mantissa	Value
0	0	$\pm 0$
255	0	$\pm \infty$
0	not 0	denormalised
255	not 0	NaN
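
The table above translates fairly directly into code. This is a minimal sketch, assuming 32-bit floats; the class SpecialFloats and the method classify are hypothetical names used only for illustration.

```java
class SpecialFloats {
    // Classify a 32-bit pattern using only its exponent and mantissa fields.
    static String classify(int bits) {
        int exponent = (bits >>> 23) & 0xFF;
        int mantissa = bits & 0x7FFFFF;
        if (exponent == 0)   return mantissa == 0 ? "zero" : "denormalised";
        if (exponent == 255) return mantissa == 0 ? "infinity" : "NaN";
        return "normalised";
    }

    public static void main(String[] args) {
        System.out.println(classify(Float.floatToIntBits(-0.0f)));                   // zero
        System.out.println(classify(Float.floatToIntBits(Float.NEGATIVE_INFINITY))); // infinity
        System.out.println(classify(Float.floatToIntBits(Float.MIN_VALUE)));         // denormalised
        System.out.println(classify(Float.floatToIntBits(Float.NaN)));               // NaN
        System.out.println(classify(Float.floatToIntBits(1.5f)));                    // normalised
    }
}
```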

OOP Principles

Constructors

All Java classes have a constructor, which is the method called upon object instantiation.

An object can have multiple overloaded constructors
A constructor can have any access modifier
Constructors can call other constructors of the same class using this(...).
If no constructor is specified, a default constructor is generated which takes no arguments and does nothing beyond calling the superclass constructor.
The first call in any constructor (unless it delegates with this(...)) is to the superclass constructor.
- This can be elided, in which case the superclass's no-argument constructor is called implicitly
  - If the superclass has no no-argument constructor, one of its constructors must be called explicitly
- Can call explicitly with super(...)
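
A short sketch of constructor chaining, using made-up classes (ConstructorDemo, Vehicle and Car here are illustrative only). It shows one constructor delegating to another in the same class, and an explicit superclass call that is required because the superclass has no no-argument constructor.

```java
class ConstructorDemo {
    public static void main(String[] args) {
        System.out.println(new Car().describe());       // red car with 4 wheels
        System.out.println(new Car("blue").describe()); // blue car with 4 wheels
    }
}

class Vehicle {
    protected final int wheels;

    // The only constructor takes an argument, so no default constructor is generated.
    Vehicle(int wheels) { this.wheels = wheels; }
}

class Car extends Vehicle {
    private final String colour;

    Car() { this("red"); }   // delegates to the other Car constructor

    Car(String colour) {
        super(4);            // must be explicit: Vehicle has no no-argument constructor
        this.colour = colour;
    }

    String describe() { return colour + " car with " + wheels + " wheels"; }
}
```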

Access Modifiers

Access modifiers apply to classes, constructors, methods and member variables.

private: only members of the same class can see
public: anyone can see
protected: only the class, its subclasses, and other classes in the same package can see
Default: package-private, only members of the same package can see

Inheritance

To avoid the diamond/multiple inheritance problem, Java only allows for single inheritance
This is done using the extends keyword in the class definition
Inherits all public and protected methods and members
Can, however, implement multiple interfaces

Example:

public class Car extends Vehicle implements Drivable, Crashable{
    // insert class body here
}

The Car class extends the Vehicle base class (can be abstract or concrete) and implements the behaviours defined by the interfaces Drivable and Crashable.

`static`

The static keyword defines a method, a field, or a block of code that belongs to the class instead of the object.

Static fields share a mutable state across all instances of the class
Static methods are called from the class instead of from the object
Static blocks are executed once, the first time the class is loaded into memory
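
The three uses of static can be seen together in a small sketch. The Counter class below is a made-up example, not something from the module.

```java
class Counter {
    static int created = 0;      // static field: one copy shared by every Counter
    static final String LABEL;   // initialised by the static block below
    final int id;                // instance field: one copy per object

    static {
        LABEL = "counter";       // static block: runs once, when the class is initialised
    }

    Counter() {
        created++;
        id = created;
    }

    // Static method: called on the class, not on an instance.
    static int howMany() { return created; }

    public static void main(String[] args) {
        new Counter();
        new Counter();
        System.out.println(Counter.howMany()); // 2
        System.out.println(Counter.LABEL);     // counter
    }
}
```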

Polymorphism

Polymorphism: "of many forms". A broad term describing a few things in Java.

Dynamic Polymorphism

An object is polymorphic if it passes more than one instanceof check. An object can be referred to by the type of any one of its superclasses. Say, for example, there is a Tiger class, which subclasses Cat, which subclasses Animal, giving an inheritance chain of Animal <- Cat <- Tiger. Then the following is valid:

Animal a = new Tiger();
Cat c = new Tiger();
Tiger t = new Tiger();

When referencing an object through one of its superclass types, you can only call methods that the reference type declares. For example, if there were two methods, Cat::meow and Tiger::roar, then:

c.meow(); // valid
t.meow(); // valid
a.meow(); // not valid - Animal has no method meow
t.roar(); // valid
c.roar(); // not valid - Cat has no method roar

Even though all these variables are of the same runtime type, they are being called from a reference of another type.

When calling a method of an object, the actual method run is the one that is furthest down the inheritance chain. This is dynamic/runtime dispatch.

public class Animal{
    public String speak(){return "...";}
}

public class Dog extends Animal{
    @Override
    public String speak(){return "woof";}
}

public class Cat extends Animal{
    @Override
    public String speak(){return "meow";}
}

Animal a = new Animal();
Animal d = new Dog();
Animal c = new Cat();

a.speak() // "..."
d.speak() // "woof"
c.speak() // "meow"

Even though the reference was of type Animal, the actual method called was the overridden subclass method.

Static Polymorphism (Method Overloading)

Note: different to overriding

Multiple methods with the same name can be written, as long as they have different parameter lists
The method that is called depends upon the number of and type of the arguments passed

Example:

public class Addition{
    private static int add(int x, int y){return x+y;}
    private static float add(float x, float y){return x+y;}
    public static void main(String[] args){
        add(1,2); //calls first method
        add(3.14f,2.72f); //calls second method (the f suffix matters: 3.14 alone is a double)
        add(15,1.5f); //calls second method, the int 15 is widened to float
    }
}

Abstraction

Abstraction is the process of hiding irrelevant details from the user, while exposing the relevant ones. For example, you don't need to know how a function works: its inner workings are abstracted away, leaving only the function's interface and details of what it does.

In the example below, the workings of the sine function are abstracted away, but we still know what it does and how to use it.

float sin(float x){
    //don't care really
}
sin(90); // 1.0 (this sin takes degrees, unlike Math.sin which takes radians)

Encapsulation

Encapsulation is wrapping the data and the code that acts on it into a single unit. The process is also known as data hiding, because the data is often hidden (declared private) behind the methods that retrieve them (getters/setters).

Reference Variables

There is no such thing as an object variable in Java, only primitives (int, char, float...) and references. All objects are heap-allocated (new), and a reference to them is stored. Methods are all pass by value: either the value of the primitive, or the value of the reference. Java is not pass by reference. Objects are never copied/cloned/duplicated implicitly.

If a reference type is required (ie Integer), but a primitive is given ((int) 1), then the primitive will be autoboxed into its equivalent object type.
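
The difference between passing a reference by value and true pass by reference is easiest to see in code. In this sketch (PassByValue, Box, reassign and mutate are made-up names), reassigning a parameter has no effect on the caller, but mutating the object it points to does.

```java
class PassByValue {
    static class Box {
        int value;
        Box(int value) { this.value = value; }
    }

    // Only the method's local copy of the reference is changed.
    static void reassign(Box b) { b = new Box(99); }

    // The object itself is changed, and the caller's reference points to the same object.
    static void mutate(Box b) { b.value = 99; }

    public static void main(String[] args) {
        Box box = new Box(1);
        reassign(box);
        System.out.println(box.value); // 1
        mutate(box);
        System.out.println(box.value); // 99
    }
}
```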

Abstract Classes and Interfaces

Abstract classes are classes declared abstract, usually containing one or more abstract methods.
- A class with any abstract methods must be declared abstract
- Abstract methods have no body, ie are unimplemented.
- The idea of them is to generalise behaviour, and leave it up to subclasses to implement
- Abstract classes cannot be instantiated directly, though can still have constructors for subclasses to call
Interfaces are a special kind of type that contain only abstract methods (and fields, which are implicitly public static final)
- Used to define behaviour
- Technically can contain method bodies, but only as default (or static) implementations
  - This raises all sorts of issues so is best avoided
- Don't have to declare methods abstract, it's implicit

The diagram shows the inheritance hierarchy of the java collections framework, containing interfaces, abstract classes, and concrete classes.

Exceptions

Exceptions are events that occur within the normal flow of program execution that disrupt the normal flow of control.

Throwing Exceptions

Exceptions can occur when raised by other code we call, but an exception can also be raised manually using a throw statement. Any object that inherits, either directly or indirectly, from the Throwable class, can be raised as an exception.

//pop from a stack
public E pop(){
    if(this.size == 0)
        throw new EmptyStackException();
    //pop the item
}

Exception Handling

Exceptions can be caught using a try-catch block
If any code within the try block raises an exception, the catch block will be executed
- catch blocks must specify the type of exception to catch
- Can have multiple catch blocks for different exceptions
  - Only 1 catch block will be executed
A finally block can be included to add any code to execute after the try-catch, regardless of if an exception is raised or not.
The exception object can be queried through the variable e

try{
    //try to do something
} catch (ExceptionA e){
    //if an exception of type ExceptionA is thrown, this is executed
} catch (ExceptionB e){
    //if an exception of type ExceptionB is thrown, this is executed
} finally{
    //this is always executed
}

Exception Hierarchy

The Throwable class is the parent class of all errors and exceptions in Java
There are two subclasses of Throwable
- Error, which defines hard errors within the JVM that aren't really recoverable
- Exception, which defines errors that may occur within the code
  - There are two kinds of exception, checked and unchecked

Checked and Unchecked Exceptions

Checked exceptions must be either caught, or declared with throws so that they propagate to the caller
- IOException is a good example
When a method that may throw a checked exception is required, there are two options
- Wrap the possibly exception-raising code in a try-catch
- Use the throws keyword in the method definition to indicate that the method may throw a checked exception

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;

public static void ReadFile() throws FileNotFoundException{
    File f = new File("non-existent-file.txt");
    FileInputStream stream = new FileInputStream(f);
}
// OR
public static void ReadFile(){
    File f = new File("non-existent-file.txt");
    try{
        FileInputStream stream = new FileInputStream(f);
    } catch (FileNotFoundException e){
        e.printStackTrace();
        return;
    }
}

Unchecked exceptions all subclass RuntimeException
- eg NullPointerException and ArrayIndexOutOfBoundsException
Can be thrown at any point and will cause the program to exit if not caught

Custom Exceptions

Custom exception classes can be created
Should subclass Throwable
- Ideally the most specific subclass possible
- Subclassing Exception gives a new checked exception
- Subclassing RuntimeException gives a new unchecked exception
All methods such as printStackTrace and getMessage are inherited from the superclass
Should provide at least one constructor that calls a superclass constructor

public class IncorrectFileExtensionException
  extends RuntimeException {
    public IncorrectFileExtensionException(String errorMessage, Throwable err) {
        super(errorMessage, err);
    }
}

Generics

Generics allow for classes to be parametrised over some type or types, to provide additional compile time static type checking. A simple box class parametrised over some type E, for example:

public class Box<E>{
    E item;

    public Box(E item){
        this.item = item;
    }
    public E get(){
        return item;
    }
    public void set(E item){
        this.item = item;
    }
}

Generic Methods

Methods can be generic too, introducing their own type parameters. The parameters introduced in methods are local to that method, not the whole class. As an example, the static method below compares two Pair<K,V> objects to see if they are equal.

public static <K, V> boolean compare(Pair<K, V> p1, Pair<K, V> p2) {
    return p1.getKey().equals(p2.getKey()) &&
           p1.getValue().equals(p2.getValue());
}

Type erasure

Type information in generic classes and methods is erased at compile time, with the compiler replacing all instances of the type variable with Object (or with its bound, if it has one). Object is also what appears in the compiled bytecode. This means that at runtime, casts to generic types are unchecked, and can cause runtime exceptions (ClassCastException) later on, when an element is used as the wrong type.
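
Erasure can be observed directly. The sketch below is illustrative only (ErasureDemo and its method names are made up). It shows that two differently parametrised lists share one runtime class, and that an unchecked cast only fails later, when an element is actually used as the wrong type.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists are plain ArrayList at runtime: the type arguments are gone.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    @SuppressWarnings("unchecked")
    static boolean uncheckedCastFailsLater() {
        List<Integer> ints = new ArrayList<>();
        ints.add(1);
        List<String> strings = (List<String>) (List<?>) ints; // unchecked: nothing is verified here
        try {
            String s = strings.get(0); // the compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());        // true
        System.out.println(uncheckedCastFailsLater()); // true
    }
}
```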

CS126

The book Data Structures and Algorithms in Java by Goodrich, Tamassia and Goldwasser is a good resource as it aligns closely with the material. It can be found online fairly easily.

Arrays & Linked Lists

Arrays

Arrays are the most common data structure and are very versatile

A sequenced collection of variables of the same type (homogeneous)
Each cell in the array has an index $0... (n - 1)$
Arrays are of fixed length and so have a max capacity
Can store primitives, or references to objects
When inserting an element into the array, all to the right must be shifted up by one
The same applies in reverse for removal to prevent null/0 gaps being left

Sorting Arrays

The sorting problem:
- Consider an array of unordered elements
- We want to put them in a defined order
- For example [3, 6, 2, 7, 8, 10, 22, 9] needs to become [2, 3, 6, 7, 8, 9, 10, 22]
One possible solution: insertion sort:
- Go over the entire array, inserting each element at its proper location by shifting elements along

public static void insertionSort(int[] data){
    int n = data.length;
    for(int k = 1; k < n; k++){             //start with second element
        int cur = data[k];                  //insert data[k]
        int j = k;                          //get correct index j for cur
        while(j > 0 && data[j-1] > cur){    //data[j-1] must go after cur
            data[j] = data[j-1];            // slide data[j-1] to the right
            j--;                            //consider previous j for cur
        }
        data[j] = cur; //cur is in the right place
    }
}

Insertion sort sucks
Has worst case quadratic complexity, as up to k comparisons are required on the kth iteration.
When the list is in reverse order (worst case), $\frac{n ( n - 1 )}{2}$ comparisons are made
Can do much better with alternative algorithms

Singly Linked Lists

A linked list is a concrete data structure consisting of a chain of nodes which point to each other
Each node stores the element, and the location of the next node
The data structure stores the head element and traverses the list by following the chain
Operations on the head of the list (ie, prepending) are efficient, as the head node can be accessed via its pointer
Operations on the tail require first traversing the entire list, so are slow
Useful when data needs to always be accessed sequentially
Generally, linked lists suck for literally every other reason
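
A minimal sketch of a singly linked list, showing why head operations are cheap and tail operations need a full traversal. The class and method names are illustrative, and this version deliberately keeps no tail pointer so that the cost of reaching the tail is visible.

```java
class SinglyLinkedList<E> {
    // Each node stores its element and the location of the next node.
    private static class Node<E> {
        final E element;
        Node<E> next;
        Node(E element, Node<E> next) {
            this.element = element;
            this.next = next;
        }
    }

    private Node<E> head = null;
    private int size = 0;

    int size() { return size; }

    // O(1): the new node simply points at the old head.
    void addFirst(E e) {
        head = new Node<>(e, head);
        size++;
    }

    // O(n): the chain must be followed to find the last node.
    void addLast(E e) {
        Node<E> node = new Node<>(e, null);
        if (head == null) {
            head = node;
        } else {
            Node<E> cur = head;
            while (cur.next != null) cur = cur.next;
            cur.next = node;
        }
        size++;
    }

    // O(1): returns null if the list is empty.
    E removeFirst() {
        if (head == null) return null;
        E e = head.element;
        head = head.next;
        size--;
        return e;
    }

    public static void main(String[] args) {
        SinglyLinkedList<String> list = new SinglyLinkedList<>();
        list.addLast("b");
        list.addFirst("a");
        list.addLast("c");
        System.out.println(list.removeFirst()); // a
        System.out.println(list.size());        // 2
    }
}
```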

Doubly Linked Lists

In a doubly linked list, each node stores a pointer to the node in front of and behind it
This allows the list to be traversed in both directions, and for nodes to be easily inserted mid-sequence
Sometimes, special header and trailer "sentinel" nodes are added to maintain a reference to the head and tail of the list
- Also removes edge cases when inserting/deleting nodes, as there are always nodes before the head and after the tail

Analysis of Algorithms

This topic is key to literally every other one, and also seems to make up 90% of the exam questions (despite there being only 1 lecture on it) so it's very important.

Need some way to characterise how good a data structure or algorithm is
Most algorithms take input and generate output
The run time of an algorithm typically grows with input size
Average case is often difficult to determine
- Focus on the worst case
Runtime analysis and benchmarks can be used to determine the performance of an algorithm, but this is often not possible
- Results will also vary from machine to machine
Theoretical analysis is preferred as it gives a more high-level analysis
- Characterises runtime as a function of input size $n$

Pseudocode

Pseudocode is a high level description of an algorithm
Primitive operations are assumed to take unit time
For example
- Evaluating an expression
- Assigning to a variable
- Indexing into an array
- Calling a method

Looking at an algorithm, can count the number of operations in each step to analyse its runtime

public static double arrayMax(double[] data){
    int n = data.length; //2 ops
    double max = data[0]; //2 ops
    for (int j=1; j < n;j++) //2n ops
        if(data[j] > max) //2n-2 ops
            max = data[j]; //0 to 2n-2 ops
    return max; //1 op
}

In the best case, there are $4 n + 3$ primitive operations
In the worst case, $6 n + 1$
The runtime $T (n)$ is therefore $a (4 n + 3) \leq T (n) \leq a (6 n + 1)$
- $a$ is the time to execute a primitive operation

Functions

There are 7 important functions $f (n)$ that appear often when analysing algorithms

Constant - $1$
- $f (n) = c$
- A fixed constant
- Could be any number but 1 is the most fundamental constant
- Sometimes denoted $f (n) = c \times g (n)$ where $g (n) = 1$
Logarithmic - $\log n$
- For some constant $b > 1$ , $f (n) = \log_{b} (n)$
- Logarithm is the inverse of the power function
  - $x = \log_{b} n \Leftrightarrow b^{x} = n$
- Usually, $b = 2$ because we are computer scientists and everything is base 2
Linear - $n$
- $f (n) = c n$
  - $c$ is a fixed constant
n-log-n - $n \log n$
- $f (n) = n \times \log n$
- Commonly appears with sorting algorithms
Quadratic - $n^{2}$
- $f (n) = n^{2}$
- Commonly appears where there are nested loops
Cubic - $n^{3}$
- $f (n) = n^{3}$
- Less common, also appears where there are 3 nested loops
- Can be generalised to other polynomial functions
Exponential - $2^{n}$
- $f (n) = b^{n}$
  - $b$ is some arbitrary base, $n$ is the exponent

The growth rate of these functions is not affected by changing the hardware/software environment. Growth rate is also not affected by lower-order terms.

Insertion sort takes time $\frac{1}{2} n^{2}$
- Characterised as taking $n^{2}$ time
Merge sort takes $2 n \log n$
- Characterised as $n \log n$
The arrayMax example from earlier took $a (4 n + 3) \leq T (n) \leq a (6 n + 1)$ time
- Characterised as $n$
A polynomial $f (n)$ of degree $d$ , is of order $n^{d}$

Big-O Notation

Big-O notation is used to formalise the growth rate of functions, and hence describe the runtime of algorithms.
Gives an upper bound on the growth rate of a function as $n \to \infty$
The statement " $f (n)$ is $O (g (n))$ " means that the growth rate of $f (n)$ is no more than the growth rate of $g (n)$
If $f (n)$ is a polynomial of degree $d$ , then $f (n)$ is $O (n^{d})$
- Drop lower order terms
- Drop constant factors
Always use the smallest possible class of functions
- $2 n$ is $O (n)$ , not $O (n^{2})$
Always use the simplest expression
- $3 n + 5$ is $O (n)$ , not $O (3 n)$

Formally, given functions $f (n)$ and $g (n)$ , we say that $f (n)$ is $O (g (n))$ if there is a positive constant $c$ and a positive integer constant $n_{0}$ , such that

$f (n) \leq c g (n) \text{ for } n \geq n_{0}$

where $c > 0$ , and $n_{0} \geq 1$

Examples

$2 n + 10$ is $O (n)$ :
- $f (n) = 2 n + 10, g (n) = n$
- $2 n + 10 \leq c n$
- $(c - 2) n \geq 10$
- $n \geq \frac{10}{c - 2}$
- Pick $c = 3, n_{0} = 10$

The function $n^{2}$ is not $O (n)$ :
- $f (n) = n^{2}, g (n) = n$
- $n^{2} \leq c n$
- $n \leq c$
- The inequality cannot hold for all $n \geq n_{0}$ , since $c$ must be constant

Big-O of $7 n - 2$ :
- $f (n) = 7 n - 2, g (n) = n$
- $7 n - 2 \leq c n$
- $(c - 7) n \geq - 2$
- With $c = 7$ this holds for every $n \geq 1$
- Pick $c = 7, n_{0} = 1$

Big-O of $3 n^{3} + 20 n^{2} + 5$ :
- $f (n) = 3 n^{3} + 20 n^{2} + 5, g (n) = n^{3}$
- $3 n^{3} + 20 n^{2} + 5 \leq c n^{3} \text{ for } n \geq n_{0}$
- Pick $c = 4, n_{0} = 21$

$3 \log n + 5$ is $O (\log n)$ :
- $f (n) = 3 \log n + 5, g (n) = \log n$
- $3 \log n + 5 \leq c \log n \text{ for } n \geq n_{0}$
- $\log n \geq \frac{5}{c - 3}$
- Pick $c = 8, n_{0} = 2$

Asymptotic Analysis

The asymptotic analysis of an algorithm determines its running time in big-O notation
To perform asymptotic analysis:
- Find the worst-case number of primitive operations in the function
- Express the function with big-O notation
Since constant factors and lower-order terms are dropped, can disregard them when counting primitive operations

Example

The $i$ th prefix average of an array $X$ is the average of the first $i + 1$ elements of $X$ . Two algorithms shown below are used to calculate the prefix average of an array.

Quadratic time

//returns an array where a[i] is the average of x[0]...x[i]
public static double[] prefixAverage(double[] x){
    int n = x.length;
    double[] a = new double[n];
    for(int j = 0; j < n; j++){
        double total = 0;
        for(int i = 0; i <= j; i++)
            total += x[i];
        a[j] = total / (j+1);
    }
    return a;
}

The runtime of this function is $O (1 + 2 + ... + n) + O (n)$ . The sum of the first $n$ integers is $\frac{n ^{2} + n}{2}$ , so this algorithm runs in quadratic $O (n^{2})$ time. This can easily be seen by the nested loops in the function too.

Linear time

//returns an array where a[i] is the average of x[0]...x[i]
public static double[] prefixAverage(double[] x){
    int n = x.length;
    double[] a = new double[n];
    double total = 0;
    for(int i = 0; i < n; i++){
        total += x[i];
        a[i] = total / (i+1);
    }
    return a;
}

This algorithm keeps a running sum to compute the same array in linear time.

Big-Omega and Big-Theta

Big-Omega gives a lower bound on the growth rate of a function (where Big-O gives an upper bound). Formally, $f (n)$ is $Ω (g (n))$ if there is a constant $c > 0$ and an integer constant $n_{0} \geq 1$ such that $f (n) \geq c \cdot g (n) \text{ for } n \geq n_{0}$

Big-Theta gives a tight bound, where the function is bounded both above and below by the same growth rate. $f (n)$ is $Θ (g (n))$ if there are constants $c^{'} > 0$ and $c^{''} > 0$ , and an integer constant $n_{0} \geq 1$ such that $c^{'} g (n) \leq f (n) \leq c^{''} g (n) \text{ for } n \geq n_{0}$

The three notations compare as follows:

Big-O
- $f (n)$ is $O (g (n))$ if $f (n)$ is asymptotically less than or equal to $g (n)$
Big- $Ω$
- $f (n)$ is $Ω (g (n))$ if $f (n)$ is asymptotically greater than or equal to $g (n)$
Big- $Θ$
- $f (n)$ is $Θ (g (n))$ if $f (n)$ is asymptotically equal to $g (n)$

Recursive Algorithms

Recursion allows a problem to be broken down into sub-problems, defining a problem in terms of itself. Recursive methods work by calling themselves. As an example, take the factorial function:

$n! = \begin{cases} 1 & \text{if } n = 0 \\ n \times (n - 1)! & \text{otherwise} \end{cases}$

In java, this can be written:

public static int factorial(int n){
    if(n == 0) return 1;
    return n * factorial(n-1);
}

Recursive algorithms have:

A base case
- This is the case where the method doesn't call itself, and the stack begins to unwind
- Every possible chain of recursive calls must reach a base case
  - If not the method will recurse infinitely and cause an error
A recursive case
- Calls the current method again
- Should always eventually end up on a base case

Binary Search

Binary search is a recursively defined searching algorithm, which works by splitting an array in half at each step. Note that for binary search, the array must already be ordered.

Three cases:

If the target equals data[midpoint] then the target has been found
- This is the base case
If the target is less than data[midpoint] then we binary search everything to the left of the midpoint
If the target is greater than data[midpoint] then we binary search everything to the right of the midpoint

public static boolean binarySearch(int[] data, int target, int left, int right){
    if (left > right)                       //empty range: target is not present
        return false;
    int mid = left + (right - left) / 2;    //midpoint, written to avoid int overflow
    if(target == data[mid])
        return true;
    else if (target < data[mid])
        return binarySearch(data,target,left,mid-1);
    else
        return binarySearch(data,target,mid+1,right);
}

Binary search has $O (\log n)$ runtime, as the size of the data being processed halves at each recursive call. After the $i^{th}$ call, the size of the data is at most $n / 2^{i}$ .

Linear Recursion

The method only makes one recursive call
There may be multiple possible recursive calls, but only one should ever be made (ie binary search)
For example, a method used in computing powers by repeated squaring:

$pow (x, n) = \begin{cases} 1 & \text{if } n = 0 \\ x \cdot (pow (x, \frac{n - 1}{2}))^{2} & \text{if } n \text{ is odd} \\ (pow (x, \frac{n}{2}))^{2} & \text{if } n \text{ is even} \end{cases}$

public static int pow(int x, int n){
    if (n == 0) return 1;
    if (n % 2 == 0){
        int y = pow(x,n/2);     //n is even
        return y * y;
    }
    int y = pow(x,(n-1)/2);     //n is odd
    return x * y * y;
}

Note how despite multiple cases, pow only ever calls itself once.

Binary Recursion

Binary recursive methods call themselves twice recursively. Fibonacci numbers are defined using binary recursion:

$F_{0} = 0$
$F_{1} = 1$
$F_{i} = F_{i - 1} + F_{i - 2}$

public static int fib(int n){
    if (n == 0) return 0;
    if (n == 1) return 1;
    return fib(n-1) + fib(n-2);
}

This method calls itself twice, which isn't very efficient, as it can end up computing the same result many times over. A better alternative is shown below, which uses linear recursion and is therefore much more efficient.

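//returns the pair (F_n, F_(n-1)); Pair is assumed to be a simple class with fields fst and snd
public static Pair<Integer,Integer> linearFib(int n){
    if(n <= 1) return new Pair<>(n,0);
    Pair<Integer,Integer> result = linearFib(n-1);
    return new Pair<>(result.fst + result.snd, result.fst);
}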

Multiple Recursion

Multiple recursive algorithms call themselves recursively more than twice. These are generally very inefficient and should be avoided.

Stacks & Queues

Abstract Data Types (ADTs)

An ADT is an abstraction of a data structure
Specifies the operations performed on the data
Focus is on what the operation does, not how it does it
Expressed in java with an interface

Stacks

A stack is a last in, first out data structure (LIFO)
Items can be pushed to or popped from the top
Example uses include:
- Undo sequence in a text editor
- Chain of method calls in the JVM (method stack)
- As auxiliary storage in multiple algorithms

The Stack ADT

The main operations are push() and pop(), but others are included for usefulness

public interface Stack<E>{
    int size();
    boolean isEmpty();
    E peek(); //returns the top element without popping it
    void push(E elem); //adds elem to the top of the stack
    E pop(); //removes the top stack item and returns it
}

Example Implementation

The implementation below uses an array to implement the interface above. Only the important methods are included, the rest are omitted for brevity.

public class ArrayStack<E> implements Stack<E>{
    private E[] elems;
    private int top = -1;

    public ArrayStack(int capacity){
        elems = (E[]) new Object[capacity];
    }

    public E pop(){
        if (isEmpty()) return null;
        E t = elems[top];
        top = top-1;
        return t;
    }
    public void push(E elem){
        if (top == elems.length-1) throw new FullStackException(); //can't push to full stack
        top++;
        elems[top] = elem;
    }
}

Advantages
- Performant, uses an array so directly indexes each element
- $O (n)$ space and each operation runs in $O (1)$ time
Disadvantages
- Limited by array max size
- Trying to push to full stack throws an exception

Queues

Queues are a first in, first out (FIFO) data structure
Insertions are to the rear and removals are from the front
- In contrast to stacks which are LIFO
Example uses:
- Waiting list
- Control access to shared resources (printer queue)
- Round Robin Scheduling
  - A CPU has limited resources for running processes simultaneously
  - Allows for sharing of resources
  - Programs wait in the queue to take turns to execute
  - When done, move to the back of the queue again

The Queue ADT

public interface Queue<E>{
    int size();
    boolean isEmpty();
    E peek();
    void enqueue(E elem); //add to rear of queue
    E dequeue(); // pop from front of queue
}
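
The notes give an array-based implementation for the stack but not for the queue, so here is one possible sketch. It assumes a fixed-capacity circular array, which is only one of several ways to implement the ADT. The class name ArrayQueue and the choice to throw IllegalStateException when full are assumptions made for this example, and the class stands alone rather than implementing the interface above so that it compiles by itself.

```java
class ArrayQueue<E> {
    private final E[] elems;
    private int front = 0; // index of the element at the front of the queue
    private int size = 0;  // number of elements currently stored

    @SuppressWarnings("unchecked")
    ArrayQueue(int capacity) {
        elems = (E[]) new Object[capacity];
    }

    int size() { return size; }
    boolean isEmpty() { return size == 0; }

    // Returns the front element without removing it, or null if empty.
    E peek() { return isEmpty() ? null : elems[front]; }

    // Adds to the rear; the index wraps around the end of the array.
    void enqueue(E elem) {
        if (size == elems.length) throw new IllegalStateException("queue is full");
        elems[(front + size) % elems.length] = elem;
        size++;
    }

    // Removes from the front, or returns null if empty.
    E dequeue() {
        if (isEmpty()) return null;
        E e = elems[front];
        elems[front] = null; // drop the reference so it can be garbage collected
        front = (front + 1) % elems.length;
        size--;
        return e;
    }

    public static void main(String[] args) {
        ArrayQueue<Integer> q = new ArrayQueue<>(2);
        q.enqueue(1);
        q.enqueue(2);
        System.out.println(q.dequeue()); // 1
        q.enqueue(3);                    // reuses the slot freed by the dequeue
        System.out.println(q.dequeue()); // 2
        System.out.println(q.dequeue()); // 3
    }
}
```

Both enqueue and dequeue run in $O (1)$ time, since no elements are ever shifted.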

Lists

The list ADT provides general support for adding and removing elements at arbitrary positions

The List ADT

public interface List<E>{
    int size();
    boolean isEmpty();
    E get(int i); //get the item from the index i
    E set(int i, E e); //set the index i to the element e, returning what used to be at that index
    void add(int i, E e); //insert an element in the list at index i
    E remove(int i); //remove the element from index i, returning it
}

Array Based Implementation (`ArrayList`)

Array lists are growable implementations of the List ADT that use arrays as the backing data structure. The idea is that as more elements are added, the array resizes itself to be bigger, as needed. Using an array makes implementing get() and set() easy, as they can both just be thin wrappers around array[] syntax.

When inserting, room must be made for new elements by shifting other elements forward
- Worst case (inserting to the head) $O (n)$ runtime
When removing, need to shift elements backward to fill the hole
- Same worst case as insertion, $O (n)$

When the array is full, we need to replace it with a larger one and copy over all the elements. When growing the array list, there are two possible strategies:

Incremental
- Increase the size by a constant $c$
Doubling
- Double the size each time

These two can be compared by analysing the amortised runtime of the push operation, ie the average time $T (n) / n$ per push, for $n$ pushes taking a total time $T (n)$ .

With incremental growth, over $n$ push operations, the array is replaced $k = n / c$ times, where $c$ is the constant amount the array size is increased by. The total time $T (n)$ of $n$ push operations is proportional to: $c + 2 c + 3 c + ... + k c = c (1 + 2 + .. + k) = \frac{c k ( k + 1 )}{2}$

Since $c$ is a constant, $T (n)$ is $Ω (k^{2})$ , which is $Ω (n^{2})$ as $k = n / c$ , meaning the amortised time of a push operation is $Ω (n)$ .

With doubling growth, the array is replaced $k = \log_{2} n$ times. The total time $T (n)$ of $n$ pushes is proportional to:

$1 + 2 + 4 + 8 + ... + 2^{k} = 2^{k + 1} - 1 = 2 n - 1$

Thus, $T (n)$ is $O (n)$ , meaning the amortised time $T (n) / n$ is $O (1)$
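
The difference between the two strategies can also be seen by simply counting element copies. The sketch below is a simulation under stated assumptions, not an ArrayList implementation: the array starts at capacity 1, a resize copies every existing element, and the names GrowthDemo and copies are made up.

```java
class GrowthDemo {
    // Counts the element copies made while pushing n items into an array that
    // starts at capacity 1 and grows (doubling, or by c slots) whenever it is full.
    static long copies(int n, boolean doubling, int c) {
        int capacity = 1;
        long copied = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) { // full: allocate a bigger array and copy everything
                copied += size;
                capacity = doubling ? capacity * 2 : capacity + c;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        System.out.println(copies(1024, true, 0));   // 1023, fewer than 2n copies
        System.out.println(copies(1024, false, 10)); // 52633, grows quadratically with n
    }
}
```

With doubling, the total number of copies stays below $2 n$ , which is where the $O (1)$ amortised cost per push comes from.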

Positional Lists

Positional lists are a general abstraction of a sequence of elements without indices
A position acts as a token or marker within the broader positional list
A position p is unaffected by changes elsewhere in a list
- It only becomes invalid if explicitly deleted
A position instance is an object (ie there is some Position class)
- ie p.getElement() returns the element stored at position p
A very natural way to implement a positional list is with a doubly linked list, where each node represents a position.
- Where a pointer to a node exists, access to the previous and next node is fast ( $O (1)$ )

ADT


public interface PositionalList<E>{
    int size();
    boolean isEmpty();
    Position<E> first(); //return position of first element
    Position<E> last();  //return position of last element
    Position<E> before(Position<E> p); //return position of element before position p
    Position<E> after(Position<E> p); //return position of element after position p
    void addFirst(E e); //add a new element to the front of the list
    void addLast(E e); // add a new element to the back of the list
    void addBefore(Position<E> p, E e); // add a new element just before position p
    void addAfter(Position<E> p, E e); // add a new element just after position p
    void set(Position<E> p, E e); // replaces the element at position p with element e
    E remove(Position<E> p); //removes and returns the element at position p, invalidating the position
}

Iterators

Iterators are a software design pattern that abstract the process of scanning through a sequence one element at a time. A collection is Iterable if it has an iterator() method, which returns an instance of a class which implements the Iterator interface. Each call to iterator() returns a new object. The Iterator interface is shown below.

public interface Iterator<E>{
    boolean hasNext(); //returns true if there is at least one additional element in the sequence
    E next(); //returns the next element in the sequence, advances the iterator by 1 position.
}
// example usage
public static <E> void iteratorOver(Iterable<E> collection){
    Iterator<E> iter = collection.iterator();
    while(iter.hasNext()){
      E var = iter.next();
      System.out.println(var);
    }
}
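As a rough illustration of the pattern, the sketch below makes a fixed array iterable. `ArraySequence` is a hypothetical class invented for this example, and it implements the standard `java.lang.Iterable` and `java.util.Iterator` interfaces rather than the simplified interface shown above.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class ArraySequence<E> implements Iterable<E> {
    private final E[] data;

    ArraySequence(E[] data) {
        this.data = data;
    }

    // Each call returns a fresh iterator with its own cursor.
    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int next = 0; // index of the element returned by the next call to next()

            @Override
            public boolean hasNext() {
                return next < data.length;
            }

            @Override
            public E next() {
                if (!hasNext()) throw new NoSuchElementException();
                return data[next++];
            }
        };
    }
}
```

Because the class is Iterable, it can also be used directly in a for-each loop, which calls iterator(), hasNext() and next() behind the scenes.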

Maps

Maps are a searchable collection of key-value entries
Lookup the value using the key
Keys are unique

The Map ADT

public interface Map<K,V>{
    int size();
    boolean isEmpty();
    V get(K key); //return the value associated with key in the map, or null if it doesn't exist
    void put(K key, V value); //associate the value with the key in the map
    void remove(K key); //remove the key and its value from the map
    Collection<Entry<K,V>> entrySet(); //return an iterable collection of the key-value entries in the map
    Collection<K> keySet(); //return an iterable collection of the keys in the map
    Iterator<V> values(); //return an iterator over the map's values
}

List-Based Map

A basic map can be implemented using an unsorted list.

get(k)
- Does a simple linear search of the list looking for the key,value pair
- Returns null if search reaches end of list and is unsuccessful
put(k,v)
- Does linear search of the list to see if key already exists
  - If so, replace value
- If not, just add new entry to end
remove(k)
- Does a linear search of the list to find the entry and removes it
All operations take $O (n)$ time so this is not very efficient
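The three operations above can be sketched as follows. This is a minimal illustration under some assumptions: `UnsortedListMap` is a hypothetical name, it is backed by `java.util.ArrayList`, and it does not implement the full Map ADT from the previous section.

```java
import java.util.ArrayList;
import java.util.List;

class UnsortedListMap<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<Entry<K, V>> entries = new ArrayList<>();

    // Linear search for the key: O(n). Returns -1 if the key is absent.
    private int indexOf(K key) {
        for (int i = 0; i < entries.size(); i++) {
            if (entries.get(i).key.equals(key)) return i;
        }
        return -1;
    }

    public int size() { return entries.size(); }

    public V get(K key) {
        int i = indexOf(key);
        return i < 0 ? null : entries.get(i).value;
    }

    public void put(K key, V value) {
        int i = indexOf(key);
        if (i >= 0) entries.get(i).value = value;   // key exists: replace value
        else entries.add(new Entry<>(key, value));  // otherwise append to the end
    }

    public void remove(K key) {
        int i = indexOf(key);
        if (i >= 0) entries.remove(i);
    }
}
```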

Hash Tables

Recall the map ADT
Intuitively, a map M supports the abstraction of using keys as indices such as M[k]
A map with n keys that are known to be integers in a fixed range is just an array
A hash function can map general keys (ie not integers) to corresponding indices in a table/array

Hash Functions

A hash function $h$ maps keys of a given type to integers in a fixed interval $[0, N - 1]$ .

A very simple hash function is the mod function: $h (x) = x \bmod N$
- Works for integer keys
- The integer $h (x)$ is the hash value of the key $x$
The goal of a hash function is to store an entry $(k, v)$ at index $i = h (k)$
Function usually has two components:
- Hash code $h_{1}$
  - keys -> integers
- Compression function $h_{2}$
  - integers -> integers in range $[0, N - 1]$
- Hash code applied first, then compression - $h (x) = h_{2} (h_{1} (x))$

Some example hash functions:

Memory address
- Use the memory address of the object as its hash code
Integer cast
- Interpret the bits of the key as an integer
- Only suitable with $\leq$ 64 bits
Component sum
- Partition the key into bitwise components of fixed length and sum the components
Polynomial accumulation
- Partition the bits of the key into a sequence of components of fixed length $a_{0}$ , $a_{1}$ , ... , $a_{n - 1}$
- Evaluate the polynomial $P (z) = a_{0} + a_{1} z + a_{2} z^{2} + ... + a_{n - 1} z^{n - 1}$ for some fixed value $z$
- Especially suitable for strings
- Polynomial can be evaluated in $O (n)$ time as $p_{i} (z) = a_{n - i - 1} + z p_{i - 1} (z)$
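Polynomial accumulation with the recurrence above (Horner's rule) can be sketched for strings as follows. The class and method names are hypothetical, each character is treated as one component $a_{i}$ , and integer overflow is simply allowed to wrap around.

```java
class PolynomialHash {
    // Evaluates P(z) = a_0 + a_1 z + ... + a_{n-1} z^{n-1}, where a_i is the
    // character at index i, using p_i(z) = a_{n-i-1} + z * p_{i-1}(z).
    static int polynomialHash(String s, int z) {
        int h = 0;
        for (int i = s.length() - 1; i >= 0; i--) {
            h = s.charAt(i) + z * h; // one multiplication and one addition per component
        }
        return h;
    }
}
```

Java's own `String.hashCode()` uses the same idea with $z = 31$ , but with the coefficients in the opposite order, so `polynomialHash(s, 31)` equals the built-in hash code of the reversed string.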

Some example compression functions:

Division
- $h_{2} (y) = y \bmod N$
- The size $N$ is usually chosen to be a prime to increase performance
Multiply, Add, and Divide (MAD)
- $h_{2} (y) = (a y + b) \bmod N$
- $a$ and $b$ are nonnegative integers such that $a \bmod N \neq 0$
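Both compression functions are short enough to sketch directly. The names below are hypothetical, and `Math.floorMod` is used instead of `%` so that negative hash codes still land in $[0, N - 1]$ .

```java
class Compression {
    // Division method: h2(y) = y mod N
    static int division(int hashCode, int n) {
        return Math.floorMod(hashCode, n);
    }

    // MAD method: h2(y) = (a*y + b) mod N, with a mod N != 0.
    // The arithmetic is done in long to avoid int overflow for small a and b.
    static int mad(int hashCode, long a, long b, int n) {
        return (int) Math.floorMod(a * hashCode + b, (long) n);
    }
}
```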

Collision Handling

Collisions occur when different keys hash to the same cell. There are several strategies for resolving collisions.

Separate Chaining

With separate chaining, each cell in the table points to a secondary map (for example, a small list-based map) containing all the entries that hash to that cell.

Linear Probing

Open addressing
- The colliding item is placed in a different cell of the table
Linear probing handles collisions by placing the colliding item at the next available table cell
Each table cell inspected is referred to as a "probe"
Colliding items can lump together, causing future collisions to cause a longer sequence of probes

Consider a hash table $A$ that uses linear probing.

get(k)
- Start at cell $h (k)$
- Probe consecutive locations until either
  - Key is found
  - Empty cell is found
  - All cells have been unsuccessfully probed
To handle insertions and deletions, need to introduce a special marker object defunct which replaces deleted elements
remove(k)
- Search for an entry with key k
- If an entry (k, v) is found, replace it with defunct and return v
- Else, return null
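A minimal sketch of a linear probing table with the defunct marker is shown below. It makes several assumptions: `LinearProbingMap` is a hypothetical name, the capacity is fixed (no resizing), and the hash function is just the key's `hashCode()` compressed with the division method.

```java
class LinearProbingMap<K, V> {
    private static final Object DEFUNCT = new Object(); // marks a deleted cell
    private final Object[] keys;
    private final Object[] values;
    private int size = 0;

    LinearProbingMap(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    private int hash(K key) {
        return Math.floorMod(key.hashCode(), keys.length);
    }

    // Probes from h(k) until the key is found, an empty cell is found,
    // or every cell has been probed. Returns the index or -1.
    private int find(K key) {
        int start = hash(key);
        for (int i = 0; i < keys.length; i++) {
            int j = (start + i) % keys.length;
            if (keys[j] == null) return -1; // empty cell: key cannot be further on
            if (keys[j] != DEFUNCT && keys[j].equals(key)) return j;
        }
        return -1;
    }

    @SuppressWarnings("unchecked")
    public V get(K key) {
        int j = find(key);
        return j < 0 ? null : (V) values[j];
    }

    public void put(K key, V value) {
        int j = find(key);
        if (j >= 0) { values[j] = value; return; }
        int start = hash(key);
        for (int i = 0; i < keys.length; i++) {
            j = (start + i) % keys.length;
            if (keys[j] == null || keys[j] == DEFUNCT) { // first free or reusable cell
                keys[j] = key;
                values[j] = value;
                size++;
                return;
            }
        }
        throw new IllegalStateException("table is full");
    }

    @SuppressWarnings("unchecked")
    public V remove(K key) {
        int j = find(key);
        if (j < 0) return null;
        V old = (V) values[j];
        keys[j] = DEFUNCT; // keep the probe sequence intact for later keys
        values[j] = null;
        size--;
        return old;
    }

    public int size() { return size; }
}
```

The marker matters because get stops at the first empty cell: if a deleted cell were simply emptied, any key that had been placed after it in the same probe sequence would become unreachable.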

Double Hashing

Double hashing uses two hash functions h() and f()
If cell h(k) is already occupied, it tries the cells $(h (k) + i \cdot f (k)) \bmod N$ in sequence, for $i = 1, 2, 3...$
f(k) cannot return zero
Table size $N$ must be a prime to allow probing of all cells
A common choice of second hash function is $f (k) = q - (k \bmod q)$ where $q < N$ is a prime
if $f (k) = 1$ then we have linear probing
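To see the probe sequence concretely, the sketch below lists the cells tried for an integer key. The names are hypothetical, $h (k) = k \bmod N$ is assumed as the primary hash, and the sequence starts at $i = 0$ so that $h (k)$ itself is included.

```java
class DoubleHashing {
    // Returns the N cells probed for key k, in order, using
    // h(k) = k mod N and f(k) = q - (k mod q).
    static int[] probeSequence(int k, int n, int q) {
        int h = Math.floorMod(k, n);
        int f = q - Math.floorMod(k, q); // always in 1..q, so never zero
        int[] cells = new int[n];
        for (int i = 0; i < n; i++) {
            cells[i] = (h + i * f) % n;
        }
        return cells;
    }
}
```

For $N = 13$ , $q = 7$ and $k = 18$ , this gives $h (k) = 5$ and $f (k) = 3$ , so the probes visit cells 5, 8, 11, 1, 4, ... and eventually reach all 13 cells because $N$ is prime.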

Performance

In the worst case, operations on hash tables take $O (n)$ time when the table is full and all keys collide into a single cell
The load factor $α = n / N$ affects the performance of a hash table
- $n$ = number of entries
- $N$ = number of cells
When $α$ is large, collision is likely
Assuming hash values are true random numbers, the "expected number" of probes for an insertion with open addressing is $\frac{1}{1 - α}$
However, in practice, hashing is very fast and operations have $O (1)$ performance, provided $α$ is not close to 1

Sets

A set is an unordered collection of unique elements, typically with support for efficient membership tests

Like keys of a map, but with no associated value

Set ADT

Sets also provide for traditional mathematical set operations: Union, Intersection, and Subtraction/Difference.

public interface Set<E>{
    void add(E e); //add element e to set if not already present
    void remove(E e); //remove element e from set if present
    boolean contains(E e); //test if element e is in set
    Iterator<E> iterator(); //returns an iterator over the elements
    //updates the set to include all elements of set T
    // union
    void addAll(Set<E> T);
    //updates the set to include only the elements of the set that are also in T
    //intersection
    void retainAll(Set<E> T);
    //updates the set to remove any elements that are also in T
    //difference
    void removeAll(Set<E> T);
}

Generic Merging

Generic merge is a generalised merge of two sorted lists A and B to implement set operations. It uses a template method merge and 3 auxiliary methods that describe what happens in each case:

aIsLess
- Called when the element of A is less than the element of B
bIsLess
- Called when the element of B is less than the element of A
bothEqual
- Called when the element of A is equal to the element of B

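//assumes A and B are sorted and that their elements are Comparable
public static <E extends Comparable<E>> Set<E> merge(Set<E> A, Set<E> B){
    Set<E> S = new Set<>(); //stands in for some concrete Set implementation
    while (!A.isEmpty() && !B.isEmpty()){
        E a = A.firstElement();
        E b = B.firstElement();
        if(a.compareTo(b) < 0){
            aIsLess(a,S);
            A.remove(a);
        }
        else if (b.compareTo(a) < 0){
            bIsLess(b,S);
            B.remove(b);
        }
        else{ //b == a
            bothEqual(a,b,S);
            A.remove(a);
            B.remove(b);
        }
    }
    //one of the sets is now empty, so drain whatever is left in the other
    while(!A.isEmpty()){
        E a = A.firstElement();
        aIsLess(a,S);
        A.remove(a);
    }
    while(!B.isEmpty()){
        E b = B.firstElement();
        bIsLess(b,S);
        B.remove(b);
    }
    return S;
}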

Any set operation can be implemented using generic merge
Union
- aIsLess adds a into S
- bIsLess adds b into S
- bothEqual adds a (or b) into S
Intersection
- aIsLess and bIsLess do nothing
- bothEqual adds a (or b) into S
Difference
- aIsLess adds a into S
- bIsLess and bothEqual do nothing
Runs in linear time, $O (N_{A} + N_{B})$ , provided the auxiliary methods are $O (1)$
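The template method pattern can be made concrete with a small runnable sketch. It is not the module's code: it assumes the two inputs are sorted `java.util.List`s with no duplicates, reads them by index instead of removing elements, and the class names `GenericMerge`, `Union`, `Intersection` and `Difference` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

abstract class GenericMerge<E extends Comparable<E>> {
    protected abstract void aIsLess(E a, List<E> s);
    protected abstract void bIsLess(E b, List<E> s);
    protected abstract void bothEqual(E a, E b, List<E> s);

    // Template method: walks both sorted lists once, O(N_A + N_B).
    public List<E> merge(List<E> listA, List<E> listB) {
        List<E> s = new ArrayList<>();
        int i = 0, j = 0;
        while (i < listA.size() && j < listB.size()) {
            E a = listA.get(i), b = listB.get(j);
            int c = a.compareTo(b);
            if (c < 0) { aIsLess(a, s); i++; }
            else if (c > 0) { bIsLess(b, s); j++; }
            else { bothEqual(a, b, s); i++; j++; }
        }
        while (i < listA.size()) aIsLess(listA.get(i++), s); // leftovers of A
        while (j < listB.size()) bIsLess(listB.get(j++), s); // leftovers of B
        return s;
    }
}

class Union<E extends Comparable<E>> extends GenericMerge<E> {
    protected void aIsLess(E a, List<E> s) { s.add(a); }
    protected void bIsLess(E b, List<E> s) { s.add(b); }
    protected void bothEqual(E a, E b, List<E> s) { s.add(a); }
}

class Intersection<E extends Comparable<E>> extends GenericMerge<E> {
    protected void aIsLess(E a, List<E> s) { }
    protected void bIsLess(E b, List<E> s) { }
    protected void bothEqual(E a, E b, List<E> s) { s.add(a); }
}

class Difference<E extends Comparable<E>> extends GenericMerge<E> {
    protected void aIsLess(E a, List<E> s) { s.add(a); }
    protected void bIsLess(E b, List<E> s) { }
    protected void bothEqual(E a, E b, List<E> s) { }
}
```

For example, merging [1, 3, 5, 7] with [3, 4, 7, 9] gives [1, 3, 4, 5, 7, 9] for the union, [3, 7] for the intersection and [1, 5] for the difference.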

Trees

A tree is an abstract model of a hierarchical structure
A tree consists of nodes with a parent-child relationship
- A parent has one or more children
- Each child has only one parent
The root is the top node in the tree, the only node without a parent
An internal node has at least one child
An external node (or leaf) is a node with no children
Nodes have ancestors (ie, the parent, the parent of the parent, and so on up to the root)
The depth of a node is its number of ancestors
The height of a tree is its maximum depth

Tree ADT

Tree ADTs are defined using a similar concept to positional lists, as they don't have a natural ordering/indexing in the same way arrays do.

public interface Tree<E>{
    int size();
    boolean isEmpty();
    Node<E> root(); //returns root node
    Node<E> parent(Node<E> n); //returns parent of Node n
    Iterable<Node<E>> children(Node<E> n); //collection of all the children of Node n
    int numChildren(Node<E> n);
    Iterator<E> iterator(); //an iterator over the tree's elements
    Iterable<Node<E>> nodes(); //collection of all the nodes
    boolean isInternal(Node<E> n); //does the node have at least one child
    boolean isExternal(Node<E> n); //does the node have no children
    boolean isRoot(Node<E> n); //is the node the root

}

Tree Traversal

Trees can be traversed in 3 different orders. As trees are recursive data structures, all 3 traversals are defined recursively. The tree below is used as an example in all 3 cases.

Pre-order

Visit the root
Pre order traverse the left subtree
Pre order traverse the right subtree

Pre-order traversal of the tree gives: F B A D C E G I H

In-order

In order traverse the left subtree
Visit the root
In order traverse the right subtree

In-order traversal of the tree gives: A B C D E F G H I

Post-order

Post order traverse the left subtree
Post order traverse the right subtree
Visit the root

Post-order traversal of the tree gives: A C E D B H I G F
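The three traversals can be sketched recursively as below. The class and method names are hypothetical, and the example tree is reconstructed from the traversal outputs listed above (a binary tree with distinct labels is determined by its pre-order and in-order sequences), so its shape is an assumption about the original figure.

```java
class TraversalDemo {
    static class Node {
        final char label;
        final Node left, right;
        Node(char label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
    }

    static void preOrder(Node n, StringBuilder out) {
        if (n == null) return;
        out.append(n.label);        // visit the root first
        preOrder(n.left, out);
        preOrder(n.right, out);
    }

    static void inOrder(Node n, StringBuilder out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.append(n.label);        // visit the root between the subtrees
        inOrder(n.right, out);
    }

    static void postOrder(Node n, StringBuilder out) {
        if (n == null) return;
        postOrder(n.left, out);
        postOrder(n.right, out);
        out.append(n.label);        // visit the root last
    }

    // F has children B and G; B has children A and D; D has children C and E;
    // G has only a right child I; I has only a left child H.
    static Node exampleTree() {
        Node d = new Node('D', new Node('C', null, null), new Node('E', null, null));
        Node b = new Node('B', new Node('A', null, null), d);
        Node i = new Node('I', new Node('H', null, null), null);
        Node g = new Node('G', null, i);
        return new Node('F', b, g);
    }
}
```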

Binary Trees

A binary tree is a special case of a tree:

Each node has at most two children (either 0, 1 or 2)
The children of the node are an ordered pair (a left child and a right child)

A proper binary tree (one where every internal node has exactly two children) will always fulfil the following properties:

$e = i + 1$
$n = 2 e - 1$
$h \leq i$
$h \leq (n - 1) /2$
$e \leq 2^{h}$
$h \geq \log_{2} e$
$h \geq \log_{2} (n + 1) - 1$

Where:

$n$ is the number of nodes in the tree
$e$ is the number of external nodes
$i$ is the number of internal nodes
$h$ is the height/max depth of the tree

Binary Tree ADT

The binary tree ADT is an extension of the normal tree ADT with extra accessor methods.

public interface BinaryTree<E> extends Tree<E>{
    Node<E> left(Node<E> n); //returns the left child of n
    Node<E> right(Node<E> n); //returns the right child of n
    Node<E> sibling(Node<E> n); //returns the sibling of n
}

Arithmetic Expression Trees

Binary trees can be used to represent arithmetic expressions, with internal nodes as operators and external nodes as operands. The tree below shows the expression $(2 \times (a - 1)) + (3 \times b)$ . Traversing the tree in-order can be used to print the expression infix, and a post-order traversal, evaluating each node with its children as the operands, will return the value of the expression.
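A small sketch of both traversals on an expression tree is given below. `ExprNode` is a hypothetical class, tokens are stored as strings, and variable values are looked up in a map supplied by the caller.

```java
import java.util.Map;

class ExprNode {
    final String token;          // an operator for internal nodes, an operand for leaves
    final ExprNode left, right;

    ExprNode(String token) { this(token, null, null); }

    ExprNode(String token, ExprNode left, ExprNode right) {
        this.token = token;
        this.left = left;
        this.right = right;
    }

    boolean isLeaf() { return left == null && right == null; }

    // In-order traversal: left subtree, operator, right subtree.
    String infix() {
        if (isLeaf()) return token;
        return "(" + left.infix() + " " + token + " " + right.infix() + ")";
    }

    // Post-order traversal: evaluate both children, then apply the operator.
    double evaluate(Map<String, Double> vars) {
        if (isLeaf()) {
            return vars.containsKey(token) ? vars.get(token) : Double.parseDouble(token);
        }
        double l = left.evaluate(vars);
        double r = right.evaluate(vars);
        switch (token) {
            case "+": return l + r;
            case "-": return l - r;
            case "*": return l * r;
            case "/": return l / r;
            default: throw new IllegalArgumentException("unknown operator " + token);
        }
    }
}
```

Building the tree for $(2 \times (a - 1)) + (3 \times b)$ and evaluating it with $a = 5$ and $b = 2$ gives $2 \times 4 + 3 \times 2 = 14$ .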

Implementations

Binary trees can be represented in a linked structure, similar to a linked list
Node objects are positions in a tree, the same as positions in a positional list
Each node is represented by an object that stores
- The element
- A pointer to the parent node
- A pointer to the left child node
- A pointer to the right child node
Alternatively, the tree can be stored in an array A, with each node p stored at index f(p)
The root is stored at index 0
If p is the left child of q, f(p) = 2 * f(q) + 1
If p is the right child of q, f(p) = 2 * f(q) + 2
In the worst case, the array will have size $2^{n} - 1$

Binary Search Trees

Binary trees can be used to implement a sorted map
Items are stored in order by their keys
For a node $p$ with key $K_{p}$ , every key in the left subtree is less than $K_{p}$ , and every key in the right subtree is greater than $K_{p}$
This allows for support of nearest-neighbour queries, so can fetch the key above or below another key
Binary search can perform nearest neighbour queries on an ordered map to find a key in $O (\log n)$ time
A search table is an ordered map implemented using a sorted sequence
- Searches take $O (\log n)$ time
- Insertion and removal take $O (n)$ time
- Only effective for maps of small size

Methods

Binary trees are recursively defined, so all the methods operating on them are easily defined recursively also.

Search
To search for a key $K$
- Compare it with the key at the root, $K_{root}$
- If $K_{root} = K$ , the value has been found
- If $K_{root} < K$ , search the right subtree
- If $K_{root} > K$ , search the left subtree
Insertion
- Search for the key being inserted $K$
- Insert $K$ at the leaf reached by the search
Deletion
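- Search for the node holding the key being deleted
- If that node has at most one child, remove it and link its child (if any) to its parent
- Otherwise, find the node that follows it in an in-order traversal (the in-order successor)
- Copy the successor's key into the node holding the key being deleted
- Remove the node copied out of (the successor)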
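A compact sketch of the three methods for integer keys follows. `IntBST` is a hypothetical class, it stores keys only (no values), ignores duplicate insertions, and its deletion handles a node with two children by copying the in-order successor's key into it and then removing the successor.

```java
class IntBST {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    public boolean contains(int k) {
        Node n = root;
        while (n != null) {
            if (k == n.key) return true;
            n = (k < n.key) ? n.left : n.right; // smaller keys left, larger keys right
        }
        return false;
    }

    public void insert(int k) { root = insert(root, k); }

    private Node insert(Node n, int k) {
        if (n == null) return new Node(k);       // reached a leaf position: insert here
        if (k < n.key) n.left = insert(n.left, k);
        else if (k > n.key) n.right = insert(n.right, k);
        return n;
    }

    public void remove(int k) { root = remove(root, k); }

    private Node remove(Node n, int k) {
        if (n == null) return null;
        if (k < n.key) n.left = remove(n.left, k);
        else if (k > n.key) n.right = remove(n.right, k);
        else {
            if (n.left == null) return n.right;  // at most one child
            if (n.right == null) return n.left;
            Node succ = n.right;                 // in-order successor: leftmost node
            while (succ.left != null) succ = succ.left; // of the right subtree
            n.key = succ.key;
            n.right = remove(n.right, succ.key);
        }
        return n;
    }

    public String inOrder() {
        StringBuilder sb = new StringBuilder();
        inOrder(root, sb);
        return sb.toString().trim();
    }

    private void inOrder(Node n, StringBuilder sb) {
        if (n == null) return;
        inOrder(n.left, sb);
        sb.append(n.key).append(' ');
        inOrder(n.right, sb);
    }
}
```

An in-order traversal of a binary search tree always yields the keys in sorted order, which makes `inOrder()` a convenient check after each insertion or removal.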

Performance

Consider a binary search tree with $n$ items and height $h$
The space used is $O (n)$
The methods get, put, remove take $O (h)$ time
- The height h is $O (\log n)$ in the best case, when the tree is perfectly balanced
- In the worst case, when the tree is basically just a linked list, this decays to $O (n)$

AVL Trees

AVL trees are balanced binary search trees
- For every internal node $v$ of the tree, the heights of the subtrees of $v$ can differ by at most 1
The height of an AVL tree storing $n$ keys is $O (\log n)$
Balance is maintained by rotating nodes every time a new one is inserted/removed

Performance

The runtime of a single rotation is $O (1)$
The tree is assured to always have height $O (\log n)$ , so the runtime of all methods is $O (\log n)$
This makes AVL trees an efficient implementation of binary search trees, as the tree is never allowed to become unbalanced and so performance does not decay

Priority Queues

A priority queue is an implementation of a queue where each item stored has a priority. The items with the highest priority are moved to the front of the queue to leave first. A priority queue takes a key along with a value, where the key is used as the priority of the item.

Priority Queue ADT

public interface PriorityQueue<K,V>{
    int size();
    boolean isEmpty();
    void insert(K key, V value); //inserts a value into the queue with key as its priority
    V removeMin(); //removes the entry with the lowest key (at the front of the queue)
    V min(); //returns but does not remove the smallest key entry (peek)
}

Entry Objects

To store a key-value pair, a tuple/pair-like object is needed
An Entry<K,V> object is used to store each queue item
- Key is what is used to define the priority of the item in the queue
- Value is the queue item
This pattern is similar to what is used in maps

public class Entry<K,V>{
    private K key;
    private V value;

    public Entry(K key, V value){
        this.key = key;
        this.value = value;
    }

    public K getKey(){
        return key;
    }

    public V getValue(){
        return value;
    }

}

Total Order Relations

Keys may be arbitrary values, so they must have some order defined on them
- Two entries may also have the same key
A total order relation is a mathematical concept which formalises ordering on a set of objects where any 2 are comparable.
A total ordering satisfies the following properties $\forall a, b, c \in X$
- $a \leq b$ or $b \leq a$
  - Comparability property
- If $a \leq b$ and $b \leq c$ , then $a \leq c$
  - Transitive property
- If $a \leq b$ and $b \leq a$ , then $a = b$
  - Antisymmetric property
- $a \leq a$
  - Reflexive property

Comparators

A comparator encapsulates the action of comparing two objects with a total order declared on them
A priority queue uses a comparator object given to it to compare two keys to decide their priority

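//compares keys using their natural ordering, so E must be Comparable
public class Comparator<E extends Comparable<E>>{
    public int compare(E a, E b){
        if(a.compareTo(b) < 0)
            return -1;
        if(a.compareTo(b) > 0)
            return 1;
        return 0;
    }
}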

Implementations

Unsorted List-Based Implementation

A simple implementation of a priority queue can use an unsorted list

insert() just appends the Entry(key,value) to the list
- $O (1)$ time
removeMin() and min() linear search the list to find the smallest key (one with highest priority) to return
- Linear search takes $O (n)$ time

Sorted List-Based Implementation

To improve the speed of removing items, a sorted list can instead be used. These two implementations have a tradeoff between which operations are faster, so the best one for the application is usually chosen.

insert() finds the correct place to insert the Entry(key,value) in the list to maintain the ordering
- Has to find place to insert, takes $O (n)$ time
As the list is maintained in order, the entry with the lowest key is always at the front, meaning removeMin() and min() just pop from the front
- Takes $O (1)$ time

Sorting Using a Priority Queue

The idea of using a priority queue for sorting is that all the elements are inserted into the queue, then removed one at a time such that they are in order

Selection sort uses an unsorted queue
- Inserting $n$ items, each in $O (1)$ time, takes $O (n)$ time
- Removing the elements in order
  - $O (n) + O (n - 1) + O (n - 2) + ... + O (1)$
- Overall $O (n^{2})$ time
Insertion sort uses a sorted queue
- Runtimes are the opposite to unsorted
- Adding $n$ elements takes $O (1) + O (2) + O (3) + ... + O (n)$ time
- Removing $n$ elements, each in $O (1)$ time, takes $O (n)$ time
- Overall runtime of $O (n^{2})$ again

Heaps

A heap is a tree-based data structure where the tree is a complete binary tree
Two kinds of heaps, min-heaps and max-heaps
For a min-heap, the heap order specifies that for every node $v$ other than the root, $v \geq parent (v)$
- In other words, the root of the tree/subtree must be the smallest node
- This property is inverted for max heaps
Complete binary tree means that every level of the tree, except possibly the last, is filled, and all nodes are as far left as possible.
- More formally, for a heap of height $h$ , for $i = 0, 1, ..., h - 1$ there are $2^{i}$ nodes of depth $i$
- At depth $h - 1$ , the internal nodes are to the left of the external nodes
- The last node of a heap is the rightmost node of maximum depth
Unlike binary search trees, heaps can contain duplicates
Heaps are also only partially ordered: a parent is ordered relative to its children, but there is no ordering between siblings
Heaps can be used to implement priority queues
- An Entry(Key,Value) is stored at each node

Insertion

To insert a node z into a heap, you insert the node after the last node, making z the new last node
- The last node of a heap is the rightmost node of max depth
The heap property is then restored using the upheap algorithm
The just inserted node is filtered up the heap to restore the ordering
Moving up the branches starting from z
- While z is not the root and parent(z) > z
  - Swap z and parent(z)
Since a heap has height $O (\log n)$ , this runs in $O (\log n)$ time

Removal

To remove the root node z (the minimum) from the heap, replace the root node with the last node w
Remove the last node w
Restore the heap order using downheap
Filter the replacement node back down the tree
- While w is greater than either of its children
  - Swap w with the smallest of its children
Also runs in $O (\log n)$ time

Heap Sort

For a sequence S of n elements with a total order relation on them, they can be ordered using a heap.

Insert all the elements into the heap
Remove them all from the heap again, they should come out in order
$n$ calls of insert take $O (n \log n)$ time
$n$ calls to remove take $O (n \log n)$ time
Overall runtime is $O (n \log n)$
Much faster than quadratic sorting algorithms such as insertion and selection sort

Array-based Implementation

For a heap with n elements, the element at position p is stored at cell f(p) such that

If p is the root, f(p) = 0
If p is the left child of q, f(p) = 2*f(q)+1
If p is the right child of q, f(p) = 2*f(q)+2

Insert corresponds to inserting at the first free cell, and remove corresponds to removing from cell 0

A heap with n keys has length $O (n)$
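The array layout, upheap and downheap can be put together in a short sketch. `MinHeap` is a hypothetical class that stores plain integer keys in a `java.util.ArrayList` rather than Entry objects, and `heapSort` is the heap sort described earlier built on top of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

class MinHeap {
    private final List<Integer> a = new ArrayList<>(); // node at index i has children 2i+1 and 2i+2

    public int size() { return a.size(); }

    public void insert(int key) {
        a.add(key);                      // first free cell: z becomes the new last node
        int z = a.size() - 1;
        while (z > 0) {                  // upheap
            int parent = (z - 1) / 2;
            if (a.get(parent) <= a.get(z)) break;
            swap(z, parent);
            z = parent;
        }
    }

    public int removeMin() {
        if (a.isEmpty()) throw new NoSuchElementException();
        int min = a.get(0);
        int last = a.remove(a.size() - 1); // remove the last node w
        if (!a.isEmpty()) {
            a.set(0, last);              // w replaces the root
            int w = 0;
            while (true) {               // downheap
                int l = 2 * w + 1, r = 2 * w + 2, smallest = w;
                if (l < a.size() && a.get(l) < a.get(smallest)) smallest = l;
                if (r < a.size() && a.get(r) < a.get(smallest)) smallest = r;
                if (smallest == w) break;
                swap(w, smallest);
                w = smallest;
            }
        }
        return min;
    }

    private void swap(int i, int j) {
        int tmp = a.get(i);
        a.set(i, a.get(j));
        a.set(j, tmp);
    }

    // Heap sort: n inserts followed by n removeMin calls.
    static int[] heapSort(int[] input) {
        MinHeap heap = new MinHeap();
        for (int x : input) heap.insert(x);
        int[] out = new int[input.length];
        for (int i = 0; i < out.length; i++) out[i] = heap.removeMin();
        return out;
    }
}
```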

Skip Lists

When implementing sets, the idea is to be able to test for membership and update elements efficiently
A sorted array or list is easy to search, but difficult to maintain in order
A skip list consists of multiple lists/sets
- The skip list $S = \{S_{0}, S_{1}, S_{2}, ..., S_{h}\}$
- $S_{0}$ contains all the elements, plus $\pm \infty$
- $S_{i}$ is a random subset of $S_{i - 1}$ , for $i = 1, 2, ..., h - 1$
  - Each element of $S_{i - 1}$ appears in $S_{i}$ with probability 0.5
- $S_{h}$ contains only $\pm \infty$

Search

To search for an element $x$ in the list:

Start in the first position of the top list
At the current position $p$ , compare $x$ with the next element in the current list $y$
- If $x = y$ , return $y$
- If $x > y$ , move to the next element in the list
  - "Scan forward"
- If $x < y$ , drop down to the element below
  - "Drop down"
If a drop down is attempted from the bottom list $S_{0}$ , the element does not exist

Insertion

To insert an element $k$ into the list:

Repeatedly toss a fair coin until tails comes up
- $i$ is the number of times the coin came up heads
If $i \geq h$ , add to the skip list new lists $S_{h + 1}, ..., S_{i + 1}$
- Each containing only the two end keys $\pm \infty$
Search for $k$ and find the positions $p_{0}, p_{1}, ..., p_{i}$ of the items with the largest element $< k$ in each list $S_{0}, S_{1}, ..., S_{i}$
- Same as the search algorithm
For $j = 0.. i$ , insert k into list $S_{j}$ after position $p_{j}$

Deletion

To remove an entry $x$ from a skip list:

Search for $x$ in the skip list and find the positions of the items $p_{0}, p_{1}, ..., p_{i}$ containing $x$
Remove those positions from the lists $S_{0}, S_{1}, ..., S_{i}$
Remove lists if necessary, so that only one list containing just $\pm \infty$ remains at the top

Implementation

A skip list can be implemented using quad-nodes, where each node stores

Its item/element
A pointer to the node above
A pointer to the node below
A pointer to the next node
A pointer to the previous node

Performance

The space used by a skip list depends on the random coin tosses made on each invocation of the insertion algorithm
- The expected space usage of a skip list with $n$ items is $O (n)$
The run time of the insertion is affected by the height $h$ of the skip list
- A skip list with $n$ items has expected height $O (\log n)$
The search time in a skip list is proportional to the number of steps taken
The drop-down steps are bounded by the height of the list
The scan-forward steps are bounded by the length of the list
- Both are expected to be $O (\log n)$
Insertion and deletion also both take $O (\log n)$ expected time

Graphs

A graph is a collection of edges and vertices, a pair $(V, E)$ , where

$V$ is a set of nodes, called vertices
$E$ is a collection of pairs of vertices, called edges
Vertices and edges are positions and store elements

Examples of graphs include routes between locations, users of a social network and their friendships, and the internet.

There are a number of different types of edge in a graph, depending upon what the edge represents:

Directed edge
- Ordered pair of vertices $(u, v)$
- First vertex $u$ is the origin
- Second vertex $v$ is the destination
- For example, a journey between two points
Undirected edge
- Unordered pair of vertices $(u, v)$
In a directed graph, all edges are directed
In an undirected graph, all edges are undirected

Graph Terminology

Adjacent vertices
- Two vertices $U$ and $V$ are adjacent (ie connected by an edge)
Edges incident on a vertex
- The edges connected to a vertex
- $a$ , $d$ , and $b$ are incident on $V$
End vertices or endpoints of an edge
- The vertices connected to an edge
- $U$ and $V$ are endpoints of $a$
The degree of a vertex
- The number of edges connected to it
- $X$ has degree 5
Parallel edges
- Edges that make the same connection
- $h$ and $i$ are parallel
Self-loop
- An edge that has the same vertex at both ends
- $j$ is a self-loop
Path
- A sequence of alternating vertices and edges
- Begins and ends with a vertex
- Each edge is preceded and followed by its endpoints
- $P_{1} = (V, b, X, h, Z)$ is a simple path
Cycle
- A circular sequence of alternating vertices and edges
  - A circular path
- A simple cycle is one where all edges and vertices are distinct
- A non-simple cycle contains an edge or vertex more than once
- A connected graph without cycles (acyclic) is a tree
Length
- The number of edges in a path
- The number of edges in a cycle

Graph Properties

Notation:

$n$ is the number of vertices
$m$ is the number of edges
$\deg (v)$ is the degree of vertex $v$

The sum of the degrees of the vertices of a graph is always an even number. Each edge is counted twice, as it connects to two vertices, so $\sum_{v} \deg (v) = 2 m$ . For example, the graph shown has $n = 4$ and $m = 6$ . Every vertex has $\deg (v) = 3$ , so $\sum_{v} \deg (v) = 4 \times 3 = 12 = 2 m$

In an undirected graph with no self loops and no multiple edges, $m \leq \frac{n (n - 1)}{2}$ . Each vertex has degree at most $(n - 1)$ and $\sum_{v} \deg (v) = 2 m$ . For the graph shown, $m = 6 \leq \frac{n (n - 1)}{2} = 6$

The Graph ADT

A graph is a collection of vertices and edges, which are modelled as a combination of 3 data types: Vertex, Edge and Graph.

A Vertex is just a box object storing an element provided by the user
An Edge also stores an associated value which can be retrieved

public interface Graph{
    int numVertices();

    Collection<Vertex> vertices(); //returns all the graph's vertices

    int numEdges();

    Collection<Edge> edges(); //returns all the graph's edges

    Edge getEdge(Vertex u, Vertex v); //returns the edge between u and v, if one exists
    // for an undirected graph getEdge(u,v) == getEdge(v,u)

    Pair<Vertex, Vertex> endVertices(Edge e); //returns the endpoint vertices of edge e

    Vertex opposite(Vertex v, Edge e); //returns the vertex adjacent to v along edge e

    int outDegree(Vertex v); //returns the number of edges going out of v

    int inDegree(Vertex v); //returns the number of edges coming into v
    //for an undirected graph, inDegree(v) == outDegree(v)

    Collection<Edge> outgoingEdges(Vertex v); //returns all edges that point out of vertex v

    Collection<Edge> incomingEdges(Vertex v); //returns all edges that point into vertex v
    //for an undirected graph, incomingEdges(v) == outgoingEdges(v)

    Vertex insertVertex(Object x); //creates and returns a new vertex storing element x

    Edge insertEdge(Vertex u, Vertex v, Object x); //creates and returns a new edge from vertices u to v, storing element x in the edge

    void removeVertex(Vertex v); //removes vertex v and all incident edges from the graph

    void removeEdge(Edge e); //removes edge e from the graph
}

Representations

There are many different ways to represent a graph in memory.

Edge List

An edge list is just a list of edges, where each edge knows which two vertices it points to.

The Edge object stores
- Its element
- Its origin Vertex
- Its destination Vertex
The edge list stores a sequence of Edge objects

Adjacency List

In an adjacency list, each vertex stores a collection of the edges incident on it, which gives access to the vertices adjacent to it.

The Vertex object stores
- Its element
- A collection/array of all its incident edges
The adjacency list stores all Vertex Objects

Adjacency Matrix

An adjacency matrix is an $n \times n$ matrix, where $n$ is the number of vertices in the graph. It acts as a lookup table, where each cell corresponds to an edge between two vertices.

If there is an edge between two vertices $u$ and $v$ , the matrix cell $(u, v)$ will contain the edge.
The matrix of an undirected graph is symmetrical along the leading diagonal

Subgraphs

A subgraph $S$ of a graph $G$ is a graph such that:
- The vertices of $S$ are a subset of the vertices of $G$
- The edges of $S$ are a subset of the edges of $G$
A spanning subgraph of $G$ is a subgraph that contains all the vertices of $G$
A graph is connected if there is a path between every pair of vertices
A tree is an undirected graph $T$ such that
- $T$ is connected
- $T$ has no cycles
A forest is an undirected graph without cycles
The connected components of a forest are trees

A spanning tree of a connected graph is a spanning subgraph that is a tree, ie it covers all vertices with the minimum possible number of edges
- A spanning tree is not unique unless the graph is a tree
  - Multiple spanning trees exist
- Spanning trees have applications in the design of communication networks
- A spanning forest of a graph is a spanning subgraph that is a forest

Depth First Search

DFS is a general technique for traversing a graph. A DFS traversal of a graph $G$ will:

Visit all vertices and edges of $G$
Determine whether $G$ is connected
Compute the connected components of $G$
Compute the spanning forest of $G$

DFS on a graph with $n$ vertices and $m$ edges takes $O (n + m)$ time. The algorithm is:

For a graph $G$ and a vertex $u$ of $G$
Mark vertex $u$ as visited
For each of $u$ 's outgoing edges $e = (u, v)$
- If $v$ has not been visited then
  - Record $e$ as the discovery edge for vertex $v$
  - Recursively call DFS on $v$

DFS(G,v) visits all vertices and edges in the connected component of v, and the discovery edges labelled by DFS(G,v) form a spanning tree of the connected component of v.

DFS can also be extended to path finding, to find a path between two given vertices $u$ and $v$ . A stack is used to keep track of the path, and the final state of the stack is the path between the two vertices. As soon as the destination vertex $v$ is encountered, the contents of the stack is returned.

DFS can be used for cycle detection too. A stack is used to keep track of the path between the start vertex and the current vertex. As soon as a back edge $(v, w)$ (an edge, other than a discovery edge, that leads back to an already visited vertex $w$ on the current path) is encountered, we return the cycle as the portion of the stack from the top to the vertex $w$ .

To perform DFS on every connected component of a graph, we can loop over every vertex, doing a new DFS from each unvisited one. This will detect all vertices in graphs with multiple connected components.
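The recursive algorithm and the loop over every vertex can be sketched as below. This is an illustration under assumptions, not the Graph ADT above: vertices are numbered $0$ to $n - 1$ , the graph is stored as adjacency lists of integers, and `DepthFirstSearch` with its method names is hypothetical. The `parent` array records the discovery edge $(parent[v], v)$ for each vertex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DepthFirstSearch {
    // Builds adjacency lists for an undirected graph on vertices 0..n-1.
    static List<List<Integer>> undirected(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Visits every vertex in the connected component of u.
    static void dfs(List<List<Integer>> adj, int u, boolean[] visited, int[] parent) {
        visited[u] = true;
        for (int v : adj.get(u)) {
            if (!visited[v]) {
                parent[v] = u;               // (u, v) is the discovery edge for v
                dfs(adj, v, visited, parent);
            }
        }
    }

    // Starts a new DFS from each unvisited vertex; each start is a new component.
    static int countComponents(List<List<Integer>> adj) {
        boolean[] visited = new boolean[adj.size()];
        int[] parent = new int[adj.size()];
        Arrays.fill(parent, -1);
        int components = 0;
        for (int s = 0; s < adj.size(); s++) {
            if (!visited[s]) {
                components++;
                dfs(adj, s, visited, parent);
            }
        }
        return components;
    }
}
```

A graph is connected exactly when `countComponents` returns 1.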

Breadth First Search

BFS is another algorithm for graph traversal, similar to DFS. It also requires $O (n + m)$ time. The difference between the two is that BFS uses a queue, while DFS uses a stack (often implicitly, through recursion). The algorithm is as follows:

Mark all vertices and edges as unexplored
Create a new queue
Add the starting vertex $s$ to the queue
Mark $s$ as visited
While the queue is not empty
- Pop a vertex $v$ from the queue
- For all neighbours $w$ of $v$
  - If $w$ is not visited
    - Push $w$ into the queue
    - Mark $w$ as visited

For a connected component $G_{s}$ of graph $G$ containing $s$ :

BFS visits all vertices and edges of $G_{s}$
The discovery edges labelled by BFS(G,s) form a spanning tree of $G_{s}$
For each vertex $v$ in $G_{s}$ , the path from $s$ to $v$ in the spanning tree formed by the BFS is a shortest path (fewest edges) between the two vertices

BFS can be specialised to solve the following problems in $O (n + m)$ time:

Compute the connected components of a graph
Compute a spanning forest of a graph
Find a simple cycle in G
Find the shortest path between two vertices
- DFS cannot do this, as the path it finds is not guaranteed to be the shortest

Directed Graphs

A digraph (short for directed graph) is a graph whose edges are all directed.

Each edge goes in only one direction
Edge $(a, b)$ goes from a to b but not from b to a
If the graph is simple and has $n$ vertices and $m$ edges, $m \leq n (n - 1)$ (each ordered pair of distinct vertices can have one edge)
DFS and BFS can be specialised to traversing directed edges
- A directed DFS starting at a vertex $s$ determines the vertices reachable from $s$
- One vertex is reachable from another if there is a directed path to it

Strong Connectivity

A digraph is said to be strongly connected if each vertex can reach all other vertices. This property can be identified in $O (n + m)$ time with the following algorithm:

Pick a vertex $v$ in the graph $G$
Perform a DFS starting from $v$
- If there is a vertex not visited, return false
Let $G^{'}$ be $G$ with all the edge directions reversed
Perform a DFS starting from $v$ in $G^{'}$
- If there is a vertex not visited, return false
- Else, return true
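A possible Java sketch of this check is shown below. The adjacency-array representation (`adj[v]` lists the vertices that `v` has an edge to) and all class and method names are assumptions made for the example, and vertex 0 is arbitrarily used as the starting vertex.

```java
class StrongConnectivity {
    static void dfs(int[][] adj, int v, boolean[] visited) {
        visited[v] = true;
        for (int w : adj[v]) {
            if (!visited[w]) {
                dfs(adj, w, visited);
            }
        }
    }

    // True if a DFS from start reaches every vertex.
    static boolean reachesAll(int[][] adj, int start) {
        boolean[] visited = new boolean[adj.length];
        dfs(adj, start, visited);
        for (boolean seen : visited) {
            if (!seen) return false;
        }
        return true;
    }

    // Builds G' by reversing the direction of every edge.
    static int[][] reverse(int[][] adj) {
        int n = adj.length;
        int[] inDegree = new int[n];
        for (int[] out : adj) for (int w : out) inDegree[w]++;
        int[][] rev = new int[n][];
        for (int v = 0; v < n; v++) rev[v] = new int[inDegree[v]];
        int[] next = new int[n];
        for (int v = 0; v < n; v++) {
            for (int w : adj[v]) rev[w][next[w]++] = v;
        }
        return rev;
    }

    static boolean isStronglyConnected(int[][] adj) {
        if (adj.length == 0) return true;
        // v reaches everything in G, and everything reaches v (checked in G').
        return reachesAll(adj, 0) && reachesAll(reverse(adj), 0);
    }
}
```

Two traversals and one pass to reverse the edges keep the total cost at $O (n + m)$ .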

Transitive Closure

Given a digraph $G$ , the transitive closure of $G$ is the digraph $G *$ such that:

$G *$ has the same vertices as $G$
If $G$ has a directed path from $u$ to $v$ , then $G *$ also has a directed *edge* from $u$ to $v$
In $G *$ , every pair of vertices with a path between them in $G$ is now adjacent
The transitive closure provides reachability information about a digraph

The transitive closure can be computed by doing a DFS starting at each vertex. However, this takes $O (n (n + m))$ time. Alternatively, there is the Floyd-Warshall algorithm:

For the graph $G$ , number the vertices $1, 2, ..., n$
Compute the graphs $G_{0}, ..., G_{n}$
- $G_{0} = G$
- $G_{k}$ has directed edge $(v_{i}, v_{j})$ if $G$ has a directed path from $v_{i}$ to $v_{j}$ whose intermediate vertices all lie in the set $\{v_{1}, ..., v_{k}\}$
Digraph $G_{k}$ is computed from $G_{k - 1}$
- Add $(v_{i}, v_{j})$ if edges $(v_{i}, v_{k})$ and $(v_{k}, v_{j})$ appear in $G_{k - 1}$
$G_{n} = G *$

In pseudocode:

G_0 = G
for k=1 to n
    G_k = G_(k-1)
    for i=1 to n (i != k)
        for j=1 to n (j != i, j != k)
            if G_(k-1).areAdjacent(vi,vk) && G_(k-1).areAdjacent(vk,vj)
                if !G_(k-1).areAdjacent(vi,vj)
                    G_k.insertDirectedEdge(vi,vj,k)
return G_n

This algorithm takes $O (n^{3})$ time. Basically, at each iteration a new vertex $v_{k}$ is allowed as an intermediate vertex, and each pair of vertices is checked to see if a path exists between them through the newly added vertex. If it does, a directed edge is inserted to transitively close the graph.
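As a rough sketch, the same idea can be written over a boolean adjacency matrix. The matrix representation and the names `TransitiveClosure` and `closure` are assumptions for this example. It also updates a single matrix in place rather than keeping separate graphs $G_{0}, ..., G_{n}$ , which is a common simplification.

```java
class TransitiveClosure {
    // g[i][j] is true when the digraph has an edge from vertex i to vertex j.
    // Returns a new matrix where reach[i][j] is true when a directed path
    // (of one or more edges) exists from i to j. The input is not modified.
    static boolean[][] closure(boolean[][] g) {
        int n = g.length;
        boolean[][] reach = new boolean[n][];
        for (int i = 0; i < n; i++) {
            reach[i] = g[i].clone();
        }
        for (int k = 0; k < n; k++) {           // vertex k becomes a permitted intermediate
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    if (reach[i][k] && reach[k][j]) {
                        reach[i][j] = true;     // path i -> k -> j exists
                    }
                }
            }
        }
        return reach;
    }
}
```

The three nested loops over $n$ vertices are where the $O (n^{3})$ running time comes from.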

Topological Ordering

A Directed Acyclic Graph (DAG) is a digraph that has no directed cycles
A topological ordering of a digraph is a numbering $v_{1}, v_{2}, ..., v_{n}$ of the vertices such that for every edge $(v_{i}, v_{j})$ , $i < j$
- Every edge points from a lower-numbered vertex to a higher-numbered one
A digraph has a topological ordering if and only if it is a DAG

A topological ordering can be calculated using a DFS:

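public static void topDFS(Graph G, Vertex v){
    v.visited = true;
    for(Edge e: v.edges){
        Vertex w = opposite(v,e);
        if(!w.visited)      // only recurse into unvisited vertices
            topDFS(G,w);
    }
    // every outgoing edge of v has been explored, so v can be labelled
    v.label = n;            // n is a shared counter, initially the number of vertices
    n = n - 1;
}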

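The first vertex to finish (one with no unvisited successors) is assigned $n$ , the next to finish $n - 1$ , and so on until all vertices are labelled. To label every vertex, the DFS is started from each unvisited vertex in turn.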
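A self-contained version of the same idea, which labels each vertex once all of its outgoing edges have been explored, might look like the sketch below. It assumes an adjacency-array digraph that is already known to be a DAG, and the names `TopologicalOrder` and `labels` are hypothetical.

```java
class TopologicalOrder {
    // next[0] holds the next label to hand out, counting down from n.
    static void topDfs(int[][] adj, int v, boolean[] visited, int[] label, int[] next) {
        visited[v] = true;
        for (int w : adj[v]) {
            if (!visited[w]) {
                topDfs(adj, w, visited, label, next);
            }
        }
        // All successors of v already have larger labels.
        label[v] = next[0];
        next[0]--;
    }

    // Returns label[v] in the range 1..n. Assumes the digraph has no directed cycles.
    static int[] labels(int[][] adj) {
        int n = adj.length;
        boolean[] visited = new boolean[n];
        int[] label = new int[n];
        int[] next = {n};
        for (int v = 0; v < n; v++) {
            if (!visited[v]) {
                topDfs(adj, v, visited, label, next);
            }
        }
        return label;
    }
}
```

For any edge $(u, w)$ in the result, `label[u] < label[w]`, which is exactly the topological ordering property.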

CS132

Note that specific details of architectures such as the 68k, its specific instruction sets, or the PATP are not examinable. They are included just to serve as examples.

The 68008 datasheet can be found here, as a useful resource.

Digital Logic

Digital logic is about reasoning with systems with two states: off and on (0 and 1 in binary).

Basic Logic Functions

Some basic logic functions, along with their truth tables.

NOT

$f = \overset{ˉ}{A}$

A	f
0	1
1	0

AND

$f = A \cdot B$

A	B	f
0	0	0
0	1	0
1	0	0
1	1	1

OR

$f = A + B$

A	B	f
0	0	0
0	1	1
1	0	1
1	1	1

XOR

$f = A \oplus B$

A	B	f
0	0	0
0	1	1
1	0	1
1	1	0

NAND

$f = \overline{A \cdot B}$

A	B	f
0	0	1
0	1	1
1	0	1
1	1	0

NOR

$f = \overline{A + B}$

A	B	f
0	0	1
0	1	0
1	0	0
1	1	0

X-NOR

$f = \overline{A \oplus B}$

A	B	f
0	0	1
0	1	0
1	0	0
1	1	1

Logic Gates

Logic gates represent logic functions in a circuit. Each logic gate below represents one of the functions shown above.

Logic Circuits

Logic circuits can be built from logic gates, where outputs are logical functions of their inputs. Simple functions can be used to build up more complex ones. For example, the circuit below implements the XOR function.

$f = \overset{ˉ}{A} \cdot B + A \cdot \overset{ˉ}{B}$

Another example, using only NAND gates to build XOR. NAND (or NOR) gates can be used to construct any logic function.

Truth tables can be constructed for logic circuits by considering intermediate signals. The circuit below has 3 inputs and considers 3 intermediate signals to construct a truth table.

$P = A \cdot B$ , $Q = B \cdot C$ , $R = A \cdot C$

$f = P + Q + R = A \cdot B + B \cdot C + A \cdot C$

A	B	C	P	Q	R	f
0	0	0	0	0	0	0
0	0	1	0	0	0	0
0	1	0	0	0	0	0
0	1	1	0	1	0	1
1	0	0	0	0	0	0
1	0	1	0	0	1	1
1	1	0	1	0	0	1
1	1	1	1	1	1	1

Truth tables of circuits are important as they enumerate all possible outputs, and help to reason about logic circuits and functions.

Boolean Algebra

Logic expressions, like normal algebraic ones, can be simplified to reduce complexity
- This reduces the number of gates required for their implementation
- The fewer gates, the more efficient the circuit is
  - More gates are also more expensive
Sometimes, only specific gates are available too and equivalent expressions must be found that use only the available gates
Two main ways to simplify expressions
- Boolean algebra
- Karnaugh maps
The truth table for the expression before and after simplifying must be identical, or you've made a mistake

Expressions from Truth Tables

A sum of products form of a function can be obtained from its truth table directly.

A	B	C	f
0	0	0	1
0	0	1	1
0	1	0	0
0	1	1	0
1	0	0	1
1	0	1	0
1	1	0	1
1	1	1	1

Taking only the rows that have an output of 1:

The first row of the table: $\overset{ˉ}{A} \cdot \overset{ˉ}{B} \cdot \overset{ˉ}{C}$
The second row: $\overset{ˉ}{A} \cdot \overset{ˉ}{B} \cdot C$
Fifth: $A \cdot \overset{ˉ}{B} \cdot \overset{ˉ}{C}$
Seventh: $A \cdot B \cdot \overset{ˉ}{C}$
Eighth: $A \cdot B \cdot C$

Summing the products yields:

$f = (\overset{ˉ}{A} \cdot \overset{ˉ}{B} \cdot \overset{ˉ}{C}) + (\overset{ˉ}{A} \cdot \overset{ˉ}{B} \cdot C) + (A \cdot \overset{ˉ}{B} \cdot \overset{ˉ}{C}) + (A \cdot B \cdot \overset{ˉ}{C}) + (A \cdot B \cdot C)$
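One way to check a sum of products against its truth table is to evaluate it for all eight input rows. The sketch below does this for the expression above. The class name `SumOfProducts` and the `TABLE` array (the $f$ column of the table, top to bottom) are introduced only for this example.

```java
class SumOfProducts {
    // The f column of the truth table, for rows ABC = 000 through 111.
    static final int[] TABLE = {1, 1, 0, 0, 1, 0, 1, 1};

    // The sum of products read off the rows whose output is 1.
    static boolean f(boolean a, boolean b, boolean c) {
        return (!a && !b && !c)
            || (!a && !b && c)
            || (a && !b && !c)
            || (a && b && !c)
            || (a && b && c);
    }

    // True if the expression agrees with the table on every row.
    static boolean matchesTable() {
        for (int row = 0; row < 8; row++) {
            boolean a = (row & 4) != 0;   // A is the most significant bit
            boolean b = (row & 2) != 0;
            boolean c = (row & 1) != 0;
            if (f(a, b, c) != (TABLE[row] == 1)) {
                return false;
            }
        }
        return true;
    }
}
```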

Boolean Algebra Laws

There are several laws of boolean algebra which can be used to simplify logic expressions:

Name	AND form	OR form
Identity Law	$1 A = A$	$0 + A = A$
Null Law	$0 A = 0$	$1 + A = 1$
Idempotent Law	$AA = A$	$A + A = A$
Inverse Law	$A \overset{ˉ}{A} = 0$	$A + \overset{ˉ}{A} = 1$
Commutative Law	$A B = B A$	$A + B = B + A$
Associative Law	$(A B) C = A (BC) = A BC$	$(A + B) + C = A + (B + C) = A + B + C$
Distributive Law	$A (B + C) = A B + A C$	$A + BC = (A + B) (A + C)$
Absorption Law	$A (A + B) = A$	$A + A B = A$
De Morgan's Law	$\overline{A \cdot B} = \overset{ˉ}{A} + \overset{ˉ}{B}$	$\overline{A + B} = \overset{ˉ}{A} \cdot \overset{ˉ}{B}$

Each law can be converted from its AND form to its OR form (and vice versa) by swapping AND for OR, and 0 for 1

Most are fairly intuitive, but some less so. The important ones to remember are:

$A + BC = (A + B) (A + C)$
$A (B + C) = A B + A C$
$A (A + B) = A$
$A + A B = A$
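Since each variable can only be 0 or 1, any of these laws can be confirmed by trying every combination of inputs. The sketch below checks the four laws listed above. The class and method names are made up for illustration.

```java
class BooleanLaws {
    // Returns true if both distributive laws and both absorption laws
    // hold for every combination of A, B and C.
    static boolean lawsHold() {
        boolean[] values = {false, true};
        for (boolean a : values) {
            for (boolean b : values) {
                for (boolean c : values) {
                    boolean distOverAnd = (a || (b && c)) == ((a || b) && (a || c));
                    boolean distOverOr  = (a && (b || c)) == ((a && b) || (a && c));
                    boolean absorbAnd   = (a && (a || b)) == a;
                    boolean absorbOr    = (a || (a && b)) == a;
                    if (!(distOverAnd && distOverOr && absorbAnd && absorbOr)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }
}
```

This is the same check as comparing truth tables before and after a simplification.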

De Morgan's Laws

De Morgan's Laws are very important and useful, as they make it easy to convert between AND and OR forms. In simple terms:

Break the negation bar
Swap the operator

Example 1

When doing questions, all working steps should be annotated.

$f = (\overline{X + Y}) \cdot (\overline{\overset{ˉ}{X} + Y})$
$f = (\overset{ˉ}{X} \cdot \overset{ˉ}{Y}) \cdot (\overline{\overset{ˉ}{X} + Y})$ (De Morgan, OR form)
$f = (\overset{ˉ}{X} \cdot \overset{ˉ}{Y}) \cdot (X \cdot \overset{ˉ}{Y})$ (De Morgan, OR form, then double negation)
$f = \overset{ˉ}{X} \cdot \overset{ˉ}{Y} \cdot X \cdot \overset{ˉ}{Y}$ (Remove brackets, associative law)
$f = \overset{ˉ}{X} \cdot X \cdot \overset{ˉ}{Y} \cdot \overset{ˉ}{Y}$ (Re-order, commutative law)
$f = 0 \cdot \overset{ˉ}{Y}$ (Inverse and idempotent laws)
$f = 0$ (Null law)

Example 2

$f = X + \overset{ˉ}{Y} + \overset{ˉ}{X} \cdot Y + (X + \overset{ˉ}{Y}) \cdot \overset{ˉ}{X} \cdot Y$
$f = X + \overset{ˉ}{Y} + \overset{ˉ}{X} \cdot Y + X \cdot \overset{ˉ}{X} \cdot Y + \overset{ˉ}{Y} \cdot \overset{ˉ}{X} \cdot Y$ (Distributive law)
$f = X + \overset{ˉ}{Y} + \overset{ˉ}{X} \cdot Y + 0 + 0$ (Inverse AND law)
$f = X + (\overset{ˉ}{Y} + \overset{ˉ}{X}) (\overset{ˉ}{Y} + Y)$ (Identity law to drop the zeros, then distributive law)
$f = X + (\overset{ˉ}{Y} + \overset{ˉ}{X}) \cdot 1$ (Inverse law)
$f = X + \overset{ˉ}{Y} + \overset{ˉ}{X}$ (Removing 1 and brackets, identity and associative laws)
$f = \overset{ˉ}{Y} + 1$ (Inverse OR law)
$f = 1$ (Null law)

Karnaugh Maps

Karnaugh Maps (k-maps) are sort of like a 2D truth table
Expressions can be seen from the location of 1s in the map

A	B	f
0	0	a
0	1	b
1	0	d
1	1	c

Functions of 3 variables can use a 4x2 or 2x4 map (4 variables use a 4x4 map)

Adjacent squares in a k-map differ by exactly 1 variable
- This makes the map gray coded
Adjacency also wraps around

The function $f = A B \overset{ˉ}{C} D + A \overset{ˉ}{B} \overset{ˉ}{C} D + \overset{ˉ}{A} \overset{ˉ}{B} C D + \overset{ˉ}{A} BC D$ is shown in the map below.

Grouping

Karnaugh maps contain groups, which are rectangular clusters of 1s
To simplify a logic expression from a k-map, identify groups from it, making them as large and as few as possible
The number of elements in the group must be a power of 2
Each group can be described by a single product term
The variables in that term are the ones that are constant within the group (ie, define that group)

Sometimes, groups overlap which allow for more than one expression

The function for the map is therefore either $f = \overset{ˉ}{A} \overset{ˉ}{B} \overset{ˉ}{C} + \overset{ˉ}{A} B D + BC D$ or $f = \overset{ˉ}{A} \overset{ˉ}{B} \overset{ˉ}{C} + \overset{ˉ}{A} \overset{ˉ}{C} D + BC D$ (both are equivalent)

Sometimes it is not possible to minimise an expression. The map below shows an XOR function $f = (A \oplus B) \oplus (C \oplus D)$

Don't Care Conditions

Sometimes, a certain combination of inputs can't happen, or we don't care about the output if it does. An X is used to denote these conditions, which can be assumed as either 1 or 0, whichever is more convenient.

Combinatorial Logic Circuits

Some useful circuits can be constructed using logic gates, examples of which are shown below. Combinatorial logic circuits operate as fast as the gates operate, which is ideally instantaneous (realistically, there is a tiny propagation delay, on the order of nanoseconds).

1-Bit Half Adder

Performs the addition of 2 bits, outputting the result and a carry bit.

A	B	Sum	Carry
0	0	0	0
0	1	1	0
1	0	1	0
1	1	0	1

1-Bit Full Adder

Adds 2 bits plus carry bit, outputting the result and a carry bit.

Carry in	A	B	Sum	Carry out
0	0	0	0	0
0	0	1	1	0
0	1	0	1	0
0	1	1	0	1
1	0	0	1	0
1	0	1	0	1
1	1	0	0	1
1	1	1	1	1

N-Bit Full Adder

Combination of a number of full adders
The carry out from the previous adder feeds into the carry in of the next

N-Bit Adder/Subtractor

To convert an adder to an adder/subtractor, we need a control input $Z$ such that:
- $Z = 0 \Rightarrow S = A + B$
- $Z = 1 \Rightarrow S = A - B$
$- B$ is calculated using two's complement
- Invert the N bit binary number B by doing $Z \oplus B$
- Add 1 (make the starting carry in a 1)
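The adder/subtractor can be modelled bit by bit in software. The following is only a sketch of the idea: the names `AdderSubtractor`, `fullAdder` and `addSub` are invented for the example, bits are represented as `int` values of 0 or 1, and the result is truncated to `n` bits just as an `n`-bit circuit would.

```java
class AdderSubtractor {
    // One full adder: returns {sum, carryOut} for bits a, b and carryIn.
    static int[] fullAdder(int a, int b, int carryIn) {
        int sum = a ^ b ^ carryIn;
        int carryOut = (a & b) | (carryIn & (a ^ b));
        return new int[] {sum, carryOut};
    }

    // N-bit ripple-carry adder/subtractor.
    // z = 0 computes a + b, z = 1 computes a - b (two's complement), both modulo 2^n.
    static int addSub(int a, int b, int z, int n) {
        int carry = z;                        // starting carry in of 1 supplies the "+ 1"
        int result = 0;
        for (int i = 0; i < n; i++) {
            int ai = (a >> i) & 1;
            int bi = ((b >> i) & 1) ^ z;      // XOR with Z inverts each bit of B when Z = 1
            int[] out = fullAdder(ai, bi, carry);
            result |= out[0] << i;
            carry = out[1];                   // carry out feeds the next adder's carry in
        }
        return result;
    }
}
```

For example, with 4 bits, `addSub(5, 3, 1, 4)` gives 2, and `addSub(3, 5, 1, 4)` gives 14, which is the 4-bit two's complement pattern for -2.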

Encoders & Decoders

A decoder has binary input pins, and one output pin per possible input state
- eg 2 inputs have 4 unique states, so there are 4 outputs
- 3 inputs has 8 outputs
Often used for addressing memory
The decoder shown below is active low
- Active low means that 0 = active, and 1 = inactive
  - Converse to what would usually be expected
- Active low pins sometimes labelled with a bar, ie $\overline{enable}$
It is important to be aware of this, as inputs and outputs must conform to the same standard

$X_{0}$	$X_{1}$	$Y_{0}$	$Y_{1}$	$Y_{2}$	$Y_{3}$
0	0	0	1	1	1
0	1	1	0	1	1
1	0	1	1	0	1
1	1	1	1	1	0

Encoders are the opposite of decoders, encoding a set of inputs into outputs
Multiple input pins, only one should be active at a time
Active low encoder shown below

$Y_{0}$	$Y_{1}$	$Y_{2}$	$Y_{3}$	$X_{0}$	$X_{1}$
0	1	1	1	0	0
1	0	1	1	0	1
1	1	0	1	1	0
1	1	1	0	1	1

Multiplexers & De-Multiplexers

Multiplexers have multiple inputs, and then selector inputs which choose which of the inputs to put on the output.

$S_{0}$	$S_{1}$	Y
0	0	$X_{0}$
0	1	$X_{1}$
1	0	$X_{2}$
1	1	$X_{3}$

$Y = X_{0} \overset{ˉ}{S}_{0} \overset{ˉ}{S}_{1} + X_{1} \overset{ˉ}{S}_{0} S_{1} + X_{2} S_{0} \overset{ˉ}{S}_{1} + X_{3} S_{0} S_{1}$
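The multiplexer expression translates directly into boolean code. This is a small sketch with made-up names (`Mux4`, `select`), following the same selector ordering as the table above.

```java
class Mux4 {
    // 4-to-1 multiplexer: the selector inputs s0 and s1 choose which of x0..x3 appears on Y.
    static boolean select(boolean s0, boolean s1,
                          boolean x0, boolean x1, boolean x2, boolean x3) {
        return (x0 && !s0 && !s1)
            || (x1 && !s0 && s1)
            || (x2 && s0 && !s1)
            || (x3 && s0 && s1);
    }
}
```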

De-Multiplexers are the reverse of multiplexers, taking one input and selector inputs choosing which output it appears on. The one shown below is active low

$S_{0}$	$S_{1}$	$Y_{0}$	$Y_{1}$	$Y_{2}$	$Y_{3}$
0	0	A	1	1	1
0	1	1	A	1	1
1	0	1	1	A	1
1	1	1	1	1	A

$Y_{0} = A + \overset{ˉ}{S}_{0} S_{1} + S_{0} \overset{ˉ}{S}_{1} + S_{0} S_{1} = A + S_{1} + S_{0}$

Multiplexers and De-Multiplexers are useful in many applications:

Source selection control
Share one communication line between multiple senders/receivers
Parallel to serial conversion
- Parallel input on X, clock signal on S, serial output on Y

Sequential Logic Circuits

A logic circuit whose outputs are logical functions of its inputs and its current state

Flip-Flops

Flip-flops are the basic elements of sequential logic circuits. They consist of two NAND gates whose outputs are fed back to the inputs to create a bi-stable circuit, meaning its output is only stable in two states.

$\overset{ˉ}{S}$ and $\overset{ˉ}{R}$ are active low set and reset inputs
$Q$ is set high when $\overset{ˉ}{S} = 0$ and $\overset{ˉ}{R} = 1$
$Q$ is reset (to zero) when $\overset{ˉ}{R} = 0$ and $\overset{ˉ}{S} = 1$
If $\overset{ˉ}{S} = \overset{ˉ}{R} = 1$ then $Q$ does not change
If both $\overset{ˉ}{S}$ and $\overset{ˉ}{R}$ are zero, this is a hazard condition and the output is invalid

$\overset{ˉ}{S}$	$\overset{ˉ}{R}$	Q	P
0	0	X	X
0	1	1	0
1	0	0	1
1	1	Q	P

The timing diagram shows the operation of the flip flop

D-Type Latch

A D-type latch is a modified flip-flop circuit that is essentially a 1-bit memory cell.

Output can only change when the enable line is high
$Q = D$ when enabled, otherwise $Q$ holds its previous value
When enabled, data on $D$ goes to $Q$

Enable	$D$	$Q$	$\overset{ˉ}{Q}$
0	0	$Q$	$\overset{ˉ}{Q}$
0	1	$Q$	$\overset{ˉ}{Q}$
1	0	0	1
1	1	1	0

Clocked Flip-Flop

There are other types of clocked flip-flop whose output only changes on the rising edge of the clock input.

$↑$ means rising edge responding

N-bit Register

A multi-bit memory circuit built up from D-type latches
The number on $A_{N - 1} A_{N - 2} ... A_{1} A_{0}$ is stored in the registers when the clock rises
The stored number appears on the outputs $Q$
$Q$ cannot change unless the circuit is clocked
Parallel input, parallel output

N-bit Shift Register

A register that stores and shifts bits taking one bit input at a time
Serial input, parallel output
When a clock transition occurs, each bit in the register will be shifted one place
Useful for serial to parallel conversion

N-bit Counter

The clock inputs of all but the first flip-flop are inverted (shown by the circles)
Each flip-flop is triggered on a high -> low transition of the previous flip-flop
Creates a counter circuit

Output is 0000, 1000, 0100, 1100, 0010, etc...

The first bit swaps every clock
2nd bit swaps every other clock
3rd bit swaps every fourth clock
etc...

Three State Logic

Three state logic introduces a third state to logic - unconnected
A three-state buffer has an enable pin, which when set high, disconnects the output from the input
Used to prevent connecting outputs to outputs, as this can cause issues (short circuits)

This can be used to allow different sources of data onto a common bus. Consider a 4-bit bus, where two 4-bit inputs are connected using 3-state buffers. Only one of the buffers should be enabled at any one time.

When $\overline{E 1} = 0$ , A will be placed on the bus
When $\overline{E 2} = 0$ , B will be placed on the bus

Physical Implementations

Logic gates are physical things with physical properties, and these have to be considered when designing with them. Typical voltage values for TTL (Transistor-Transistor Logic):

5v - max voltage
2.8v - minimum voltage for a logical 1
2.8-0.8v - "forbidden region", ie voltages in this region are undefined
0.8-0v - voltage range for a logical 0

Propagation Delay

Logic gates have a propagation delay, the amount of time it takes for the output to reflect the input
- Typically a few nanoseconds or less
This limits the speed at which logic circuits can operate
Delay can be reduced by increasing density of gates on an IC

Integrated Circuits

Elementary logic gates can be obtained in small ICs
Programmable devices allow large circuits to be created inside a single chip
- PAL - Programmable Array Logic
  - One-time programmable
- PLA - Programmable Logic Array
  - Contains an array of AND and OR gates to implement any logic functions
- FPGA - Field Programmable Gate Array
  - Contains millions of configurable gates
  - More modern

PLA example

A PLA allows for the implementation of any sum-of-products function, as it has an array of AND gates, then OR gates, with fuses that can be broken to implement a specific function.

Assembly

Microprocessor Fundamentals

The CPU

The CPU controls and performs the execution of instructions
Does this by continuously repeating the fetch-decode-execute cycle
Very complex, but two key components
- Control Unit (CU)
  - Decodes the instructions and handles logistics
- Arithmetic Logic Unit (ALU)
  - Does maths

Fetch-Decode-Execute

Three steps to every cycle
- Fetch instructions from memory
- Decode into operations to be performed
- Execute to change state of CPU
Takes place over several clock cycles

The components of the CPU that are involved in the cycle:

ALU
CU
Program Counter (PC)
- Tracks the memory address of the next instruction to be executed
Instruction Register (IR)
- Contains the most recent instruction fetched
Memory Address Register (MAR)
- Contains address of the memory location to be read/written
Memory Data/Buffer Register (MDR/MBR)
- Contains data fetched from memory or to be written to memory

The steps of the cycle:

Fetch
- Instruction fetched from memory location held by PC
- Fetched instruction stored in IR
- PC incremented to point to next instruction
Decode
- Retrieved instruction decoded
- Establish opcode type
Execute
- CU signals the necessary CPU components
- May result in changes to data registers, ALU, I/O, etc

The 68008

The 68008 is an example of a CPU. The "programmer's model" is an abstraction that represents the internals of the architecture. The internal registers as shown below are part of the programmer's model.

Internal registers are 32 bits wide
Internal data buses are 16 bit wide
8 bit external data bus
20 bit external address bus
D0-D7 are 32 bit registers used to store frequently used values
- Can be long (32 bits), word (16 bits), or byte (8 bits)
Status register (CCR) consists of 2 8-bit registers
- Various status bits are set or reset depending upon conditions arising from execution
A0-A6 are pointer registers
A7 is system stack pointer to hold subroutine return addresses
Operations on addresses do not alter status register/ CCR
- Only ALU can incur changes in status
The stack pointer is a pointer to the next free location in the system stack
- Provides temporary storage of state, return address, registers, etc during subroutine calls and interrupts

The diagram shows the internal architecture of the CPU, and how the internal registers are connected via the buses. Note how and in which direction data moves, as indicated by the arrows on the buses.

Register Transfer Language

The fetch-decode-execute cycle is best described using Register Transfer Language (RTL), a notation used to show how data moves around the internals of a processor and between registers.

For example [MAR] <- [PC] denotes the transfer of the contents of the program counter to the memory address register
Computer's main memory is called Main Store (MS), and the contents of memory location N is denoted [MS(N)]
RTL does not account for the pipelining of instructions
Fetching an instruction in RTL:

RTL	Meaning
`[MAR] <- [PC]`	Move contents of PC to MAR
`[PC] <- [PC] + 1`	Increment PC
`[MBR] <- [MS([MAR])]`	Read the memory location addressed by MAR into MBR
`[IR] <- [MBR]`	Load instruction into IR
`CU <- [IR(opcode)]`	Decode the instruction
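The register transfers in the fetch can be mimicked with a few fields. The sketch below is a toy model rather than a description of any real CPU: the class name `FetchStep`, the use of an `int` array as main store, and one instruction per memory location are all assumptions for the example. Decoding is left out.

```java
class FetchStep {
    int pc;            // program counter
    int mar;           // memory address register
    int mbr;           // memory buffer register
    int ir;            // instruction register
    final int[] ms;    // main store, one instruction per location (an assumption)

    FetchStep(int[] mainStore) {
        this.ms = mainStore;
    }

    // Performs the fetch part of the cycle, one register transfer per line.
    void fetch() {
        mar = pc;          // [MAR] <- [PC]
        pc = pc + 1;       // [PC]  <- [PC] + 1
        mbr = ms[mar];     // [MBR] <- [MS([MAR])]
        ir = mbr;          // [IR]  <- [MBR]
    }
}
```

After each call to `fetch()`, `ir` holds the instruction just read and `pc` already points at the next one.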

Assembly Language

Assembly is the lowest-level human-readable form of code
High level code (for example C) is compiled to assembly code
Assembly is then assembled into machine code (binary)
Assembly instructions map 1:1 to processor operations
Uses mnemonics for instructions, ie MOV or ADD
Languages vary, but format tends to be similar: LABEL: OPCODE OPERAND(S) | COMMENT

An example program is shown below

    ORG  $4B0      | this program starts at hex 4B0
    move.b #5, D0  | load D0 with number 5
    add.b  #$A, D0 | add 10 (0x0A) to D0
    move.b D0, ANS | move contents of D0 to ANS
ANS: DS.B 1        | leave 1 byte of memory empty and name it ANS

# indicates a literal
$ means hexadecimal
% means binary
A number without the # prefix is a memory address
ANS is a symbolic name
ORG (Origin) indicates where to load the program in memory
DS (Define Storage) tells the assembler to reserve memory for data

The 68008 Instruction Set

Instructions are commands that tell the processor what to do
5 main kinds of instructions
- Data movement
  - Copy data between registers and memory
  - MOVE, SWAP
- Arithmetic
  - Integer maths performed by the ALU
  - ADD, SUB, MULU
- Logical
  - Bitwise operations
  - AND, LSL (Logical Shift Left)
- Branch
  - Cause the processor to jump execution to a labelled address
  - Condition is specified by testing state of CCR set by previous instruction
  - BRA - branch unconditionally
  - BEQ - branch if equal
- System Control
Instructions are also specified with their data type, .b for byte, .w for word, .l for long
- move.w moves 2 bytes

Data Movement

Similar to RTL

move.b D0,   D1 | [D1(0:7)] <- [D0(0:7)]
move.w D0,   D1 | [D1(0:15)] <- [D0(0:15)]
swap   D2       | swap lower and upper words
move.l $F20, D3  | [D3(24:31)] ← [MS($F20)]
                | [D3(16:23)] ← [MS($F21)]
                | [D3( 8:15)] ← [MS($F22)]
                | [D3( 0:7)] ← [MS($F23)]
                | copied 8 bits (one byte) at a time in big endian order

Arithmetic

Maths performed on the ALU
The 68008, like many older processors, has no FPU, so only integer operations are supported

add.l   Di, Dj  | [Dj] ← [Di] + [Dj]
addx.w  Di, Dj  | also add in x bit from CCR
sub.b   Di, Dj  | [Dj] ← [Dj] - [Di]
subx.b  Di, Dj  | also subtract x bit from CCR
mulu.w  Di, Dj  | [Dj(0:31)] ← [Di(0:15)] * [Dj(0:15)]
                | unsigned multiplication
muls.w  Di, Dj  | signed multiplication

Logical

Perform bitwise operations on data
Also done by ALU
AND, OR, etc but also shifts and rotates
Logical shift (LSL/LSR) adds a 0 when shifting
- Bit shifted out goes into C and X
Arithmetic shift preserves sign bit (ASL/ASR)
Normal rotate (ROL/ROR) moves the top bit round to the bottom bit (or the reverse for ROR) and also copies it into C (X is unaffected)
Rotate through X (ROXL/ROXR) rotates the value through the X bit

AND.B #$7F, D0 | [D0] <- [D0] . [0x7F]
OR.B  D1,  D0 | [D0] <- [D0] + [D1]
LSL   #2,  D0 | [D0] <- [D0] << 2

Branch

Cause the processor to move execution to a new address (jump/GOTO)
Instruction tests the state of the CCR bits against certain condition
Bits set by previous instructions

BRA | branch unconditionally
BCC | branch on carry clear
BEQ | branch on equal

System Control

Certain instructions used to issue other commands to the microprocessor

Subroutines and Stacks

Subroutines are useful for frequently used sections of code for obvious reasons
Can jump and return from subroutines in assembly
- JSR <label> - Jump to Subroutine
- RTS - Return from Subroutine
When returning, need to know where to return to
The stack is used as a LIFO data structure to store return addresses
JSR pushes the contents of the PC on the stack
RTS pops the return address from the stack to the PC
Can nest subroutine calls and stack will keep track

Addressing Modes

Addressing modes are how we tell the computer where to find the data it needs
5 kinds in the 68008, and many other processors have equivalents
- Direct
- Immediate
- Absolute
- Address Register Indirect
  - 5 variations
- Relative

Direct Addressing

Probably the simplest
The address of an operand is specified by either a data or address register

move D3, D2 | [D2] <- [D3]
move D3, A2 | [A2] <- [D3]

Immediate Addressing

The operand forms part of the instruction (is a literal) and remains a constant
Note the prefix # specifying a literal and the prefix $ specifying the base of the number (hexadecimal)

move.b #$42, D5 | [D5] <- $42

Absolute Addressing

Operand specifies the location in memory
Does not allow for position-independent code: will always access the exact address given

move.l D2, $7FFF0 | [MS(7FFF0)] <- [D2]

Address Register Indirect Addressing

Uses offsets/increments/indexing to address memory based upon the address registers
Bad, rarely used
Not examinable

Relative Addressing

Specifies an offset relative to the program counter
Can be used to write position independent code

move 16(PC), D3 | [D3] <- [MS(PC + 16)]

Memory Systems

The Memory Hierarchy

Memory systems must facilitate the reading and writing of data
Many factors influence the choice of memory technology
- Frequency of access
- Access time
- Capacity
- Cost
Memory wants to be low cost, high capacity, and also fast
As a tradeoff, we organise memory into a hierarchy
- Allows for some high speed, some high capacity

Data has to be dragged up the hierarchy
Memory access is somewhat predictable
Temporal locality - when a location is accessed, it is likely the same location will be accessed again in the near future
Spatial locality - when a location is accessed, it is likely that nearby locations will be referenced in the near future
- 90% of memory access is within 2Kb of program counter

Semiconductor Memory Types

Memory Type	Category	Erasure	Write Mechanism	Volatility
Random Access Memory (RAM)	Read-Write	Electronically, at byte-level	Electronically written	Volatile
Read Only Memory (ROM)	Read only	Not possible	Mask Written	Non-volatile
Programmable ROM (PROM)	Read only	Not possible	Electronically written	Non-volatile
Erasable PROM (EPROM)	Read (mostly)	UV light at chip level	Electronically written	Non-volatile
Electrically Erasable PROM (EEPROM)	Read (mostly)	Electronically, at byte-level	Electronically written	Non-volatile
Flash Memory	Read (mostly)	Electronically, at block-level	Electronically written	Non-volatile

Particularly interested in random access
RAM is most common - implements main store
- nb that all types shown here allow random access, name is slightly misleading
RAM is also volatile, meaning its contents are lost when it is powered off

Cache

If 90% of memory access is within 2Kb, store those 2Kb somewhere fast
Cache is small, fast memory right next to CPU
10-200 times faster
If data requested is found in cache, this is a "cache hit" and provides a big speed improvement
We want things to be in cache
Cache speed/size is often a bigger bottleneck to performance than clock speed

Moore's Law

As said by the co-founder of Intel, Gordon Moore, the number of transistors on a chip will double roughly every two years (often quoted as every 18 months)
- Less true in recent years
Cost of computer logic and circuitry has fallen dramatically in the last 30 years
ICs become more densely packed
CPU clock speed is also increasing at a similar rate
Memory access speed is improving much more slowly however

Cache Concepts

Caching read-only data is relatively straightforward
- Don't need to consider the possibility data will change
- Copies everywhere in the memory hierarchy remain consistent
When caching mutable data, copies can become different between cache/memory
Two strategies for maintaining parity
- Write through - updates cache and then writes through to update lower levels of hierarchy
- Write back - only update cache, then copy a modified block back to memory when it is replaced in the cache

Cache Performance

Cache performance is generally measured by its hit rate. If the processor requests some block of memory and it is already in cache, this is a hit. The hit rate is calculated as

$h = \frac{total number of cache hits}{total number of memory accesses}$

Cache misses can be categorised:

Compulsory - misses that would occur regardless of cache size, eg the first time a block is accessed, it will not be in cache
Capacity - misses that occur because cache is not large enough to contain all blocks needed during program execution
Conflict - misses that occur as a result of the placement strategy for blocks not being fully associative, meaning a block may have to be discarded and retrieved
Coherency - misses that occur due to cache flushes in multiprocessor systems

Measuring performance solely by hit rate is not accurate, as it does not take into account the cost of a cache miss. Average memory access time is measured as hit time + (miss rate $\times$ miss penalty), where the miss rate is $1 - h$ .
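The two formulas are simple enough to compute directly. The sketch below uses hypothetical names (`CacheTiming`, `hitRate`, `averageAccessTime`), and any numbers passed in are illustrative rather than measurements.

```java
class CacheTiming {
    // h = total number of cache hits / total number of memory accesses
    static double hitRate(long hits, long accesses) {
        return (double) hits / accesses;
    }

    // Average memory access time = hit time + (miss rate * miss penalty).
    // All times must use the same unit (eg nanoseconds or clock cycles).
    static double averageAccessTime(double hitTime, double missRate, double missPenalty) {
        return hitTime + missRate * missPenalty;
    }
}
```

For instance, a 1 cycle hit time, 5% miss rate and 100 cycle miss penalty give an average of about 6 cycles per access, so even a small miss rate dominates the average.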

Cache Levels

Cache has multiple levels to provide a tradeoff between speed and size.

Level 1 cache is the fastest as it is the closest to the cpu, but is typically smallest
- Sometimes has separate instructions/data cache
Level 2 cache is further but larger
Level 3 cache is slowest (but still very fast) but much larger (a few megabytes)
Some CPUs even have a level 4 cache

Different levels of cache exist as part of the memory hierarchy.

Semiconductors

RAM is used to implement main store
Static RAM (SRAM) uses a flip-flop as the storage element for each bit
- Uses a configuration of flip-flops and logic gates
- Hold data as long as power is supplied
- Provide faster read/write than DRAM
- Typically used for cache
- More expensive
Dynamic RAM (DRAM) uses a capacitor, with the presence or absence of charge denoting a bit
- Typically simpler design
- Can be packed much tighter
- Cheaper to produce
- Capacitor charge decays so needs refreshing by periodically supplying charge
The interface to main memory is a critical performance bottleneck

Memory Organisation

The basic element of memory is a one-bit cell with two states, capable of being read and written. Cells are built up into larger banks with combinatorial logic circuits to select which cell to read/write. The diagram shows an example of a 16x8 memory IC (16 words of 8 bits).

For a 16x8 memory cell:

4 address inputs
- $\log_{2} 16 = 4$
8 data lines
- word size

Consider alternatively a 1Kbit device with 1024 cells

Organised as a 128x8 array
- 7 address pins
- 8 data pins
Or, could organise as 1024x1 array
- 10 address pins
- 1 data pin
Fewer pins but very poorly organised
Best to keep memory cells square to make efficient use of space
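The pin counts above follow from taking $\log_{2}$ of the number of words. A small helper makes the calculation explicit. The names `MemoryPins` and `addressPins` are invented for this example, and the helper rounds up when the word count is not a power of two.

```java
class MemoryPins {
    // Number of address pins needed to select one of `words` words,
    // ie the smallest p such that 2^p >= words. Requires words >= 1.
    static int addressPins(int words) {
        return 32 - Integer.numberOfLeadingZeros(words - 1);
    }
}
```

So a 128x8 organisation needs `addressPins(128)` = 7 address pins plus 8 data pins, while 1024x1 needs `addressPins(1024)` = 10 address pins plus 1 data pin.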

Error Correction

Errors often occur within computer systems in the transmission of data due to noise and interference. This is bad. Digital logic already gives a high degree of immunity to noise, but when noise is at a high enough level, this collapses.

Two common ways in which errors can occur:

Isolated errors
- Occur at random due to noise
- Usually singular incidences
Burst errors
- Errors usually occur in bursts
- A short period of time over which multiple errors occur
- For example, a 1ms dropout of a connection can error many bits

Majority Voting

A simple solution to correcting errors
Just send every bit multiple times (usually 3)
- The one that occurs the most is taken to be the true value
Slow & expensive
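A minimal sketch of majority voting with three copies of each bit is shown below. The class and method names are made up for illustration, and bits are stored one per `int` for readability rather than efficiency.

```java
class MajorityVote {
    // The value that occurs at least twice among three received copies.
    static int vote(int a, int b, int c) {
        return (a & b) | (a & c) | (b & c);
    }

    // Sender: repeat every bit three times.
    static int[] encode(int[] bits) {
        int[] sent = new int[bits.length * 3];
        for (int i = 0; i < bits.length; i++) {
            sent[3 * i] = bits[i];
            sent[3 * i + 1] = bits[i];
            sent[3 * i + 2] = bits[i];
        }
        return sent;
    }

    // Receiver: vote on each group of three.
    static int[] decode(int[] received) {
        int[] bits = new int[received.length / 3];
        for (int i = 0; i < bits.length; i++) {
            bits[i] = vote(received[3 * i], received[3 * i + 1], received[3 * i + 2]);
        }
        return bits;
    }
}
```

A single flipped copy within a group is outvoted, but the transmission carries three times as many bits, which is where the cost comes from.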

Parity

Parity adds an extra parity bit to each byte
Two types of parity system
- Even parity
  - The value of the extra bit is chosen to make the total number of 1s an even number
- Odd parity
  - The value of the extra bit is chosen to make the total number of 1s an odd number
7-bit ASCII for A is 100 0001
- With even parity - 0100 0001
- With odd parity - 1100 0001
Can be easily computed in software
Can also be computed in hardware using a combination of XOR gates
- Usually faster than in software
Allows for easy error detection without the need to significantly change the model for communication
Parity bit is computed and added before data is sent, parity is checked when data is received
Note that if an even number of bits are in error, the parity will still appear correct and the error won't be detected
- Inadequate for detecting bursts of error
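
Computing a parity bit in software amounts to XORing all the data bits together. A minimal sketch, with hypothetical function names and bits modelled as Bool (True is 1):

```haskell
-- Even parity bit: the XOR of all data bits.
-- (/=) on Bool behaves as XOR.
evenParityBit :: [Bool] -> Bool
evenParityBit = foldr (/=) False

-- Odd parity bit: the inverse of the even parity bit.
oddParityBit :: [Bool] -> Bool
oddParityBit = not . evenParityBit

-- A received word (parity bit plus data bits) passes an even parity
-- check when it contains an even number of 1s.
checkEven :: [Bool] -> Bool
checkEven received = not (evenParityBit received)

-- 7-bit ASCII 'A' = 100 0001
letterA :: [Bool]
letterA = [True, False, False, False, False, False, True]

-- evenParityBit letterA == False   (parity bit 0)
-- oddParityBit  letterA == True    (parity bit 1)
```

Flipping any two bits of a word that passes checkEven gives another word that passes, which is the undetected case noted above.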

Error Correcting Codes

ECCs or checksums are values computed from the entire data
If any of the data changes, the checksum will also change
The checksum is calculated and broadcast with the data so it can be checked on reception
Can use row/column parity to compute a checksum
- Calculate parity of each row and of each column
- Diagram shows how parity bits detect an error in the word "Message"

I/O

Memory Mapped I/O

With memory mapped I/O, the address bus is used to address both memory and I/O devices
Memory on I/O devices is mapped to values in the main address space
When a CPU accesses a memory address, the address may be in physical memory (RAM), or the memory of some I/O device
Advantages
- Very simple
- CPU requires less internal logic
- Can use general purpose memory instructions for I/O
Disadvantages
- Have to give up some memory
  - Less of a concern on 64-bit processors
  - Still relevant in smaller 16 bit CPUs

Polled I/O

Polling is a technique for synchronising communication between devices.
Most I/O devices are much slower than the CPU
Busy-wait polling involves constantly checking the state of the device
- Usually the device replies with nothing
- Can interleave polls with something else useful

Advantages
- Still relatively simple
Disadvantages
- Wastes CPU time and power
- Interleaving can lead to delayed responses from CPU

Synchronisation methods also need some way to transfer the data, so are sometimes used in conjunction with memory-mapped I/O. Methods for synchronising devices and methods for reading/writing data are not directly comparable.

Handshaking

Another form of synchronisation

Computer responds to the printer being ready by placing data on the data bus and signalling DATA_VALID
- Can do this either in hardware or in software
Timing diagram shows data exchange
During periods where both signals are at a logical 0, data is exchanged

Handshaking Hardware

Handshaking is usually done using an external chip, such as the 6522 VIA (Versatile Interface Adapter)

Setting bit values in the PCR (Peripheral Control Register) on the VIA controls its function.

Use PORT B as output
CB1 control line as PRINTER_READY
CB2 control line as DATA_VALID
For CB1 and CB2 control, 8 bit register is set to 1000xxxx
- Last 4 bits not used, don't care

Interrupts

Asynchronous I/O
Two kinds of interrupts (in 6502 processor)
- Interrupt Request (IRQ)
  - Code can disable response
  - Sent with a priority
  - If priority lower than that of current task, will be ignored
  - Can become non-maskable if ignored for long enough
- Non-Maskable Interrupt (NMI)
  - Cannot be disabled, must be serviced
An interrupt forces the CPU to jump to an Interrupt Service Routine (ISR)
- Switches context, uses stack to store state of registers
ISRs can be nested
Interrupts usually generated by some external device
- Hard drive can generate an interrupt when data is ready
- A timer can generate an interrupt repeatedly at a fixed interval
- A printer can generate an interrupt when ready to receive data
Advantages
- Fast response
- No wasted CPU time
Disadvantages
- All data transfer still CPU controlled
- More complex hardware/software

Direct Memory Access (DMA)

The CPU is a bottleneck for I/O
All techniques shown so far are limited by this bottleneck
DMA is used where large amounts of data must be transferred quickly
Control of system busses surrendered from CPU to a DMA Controller (DMAC)
- DMAC is a dedicated device optimised for data transfer
Can be up to 10x faster than CPU-driven I/O

DMA Operation

DMA transfer is requested by I/O
DMAC passes request to CPU
CPU initialises DMAC
- Input or Output?
- Start address is put into DMAC address register
- Number of words is put into DMAC count register
- CPU enables DMAC
DMAC requests use of system busses
CPU responds with DMAC ack when ready to surrender busses
DMAC can operate in different modes
- Cycle stealing
  - Uses system busses when they're not being used by CPU
- Burst mode
  - Requires busses for extended period of time, locks the CPU out for a fixed time, until transfer complete, or until CPU receives interrupt from device of higher priority

DMA Organisation

There are multiple ways a DMA can be incorporated into a system:

Single bus, detached DMA
- All modules (DMA, I/O devices, memory, CPU) share system bus
- DMA uses programmed I/O to exchange data between memory and I/O device
- Straightforward, as DMA can just mimic processor
- Inefficient
Separate I/O bus
- Only one interface to DMA module
- The bus the DMA shares with processor and memory is only used to transfer data to and from memory

Summary

Memory-mapped devices are accessed in the same way as RAM, at fixed address locations
Polled I/O is for scheduling input and output, where the CPU repeatedly checks for data
I/O devices are slow, so handshaking techniques coordinate CPU and device for transfer of data
Interrupts avoid polled I/O by diverting the CPU to a special I/O routine when necessary
A DMA controller can be used instead of the CPU to transfer data into and out of memory, faster than the CPU but at additional hardware cost

Microprocessor Architecture

Computer architecture concerns the structure and properties of a computer system, from the perspective of a software engineer
Computer organisation concerns the structure and properties of a computer system, from the perspective of a hardware engineer

The PATP

The Pedagogically Advanced Teaching Processor is a very simple microprocessor. The specifics of it are not examinable, but it is used to build an understanding of microprocessor architecture.

Programmer's model

The PATP has 8 instructions. Each instruction is one 8-bit word, with the first 3 bits as the opcode and the last 5 as the operand, if applicable.

Opcode	Mnemonic	Macro Operation	Description
000	`CLEAR`	`[D0] <- 0`	Set D0 to 0 (and set `Z`)
001	`INC`	`[D0] <- [D0] + 1`	Increment the value in D0 (and set `Z` if result is 0)
010	`ADD #v`	`[D0] <- [D0] + v`	Add the literal v to D0 (and set `Z` if result is 0)
011	`DEC`	`[D0] <- [D0] - 1`	Decrement the value in D0 (and set `Z` if result is 0)
100	`JMP loc`	`[PC] <- loc`	Jump unconditionally to address location `loc`
101	`BNZ loc`	If `Z` is not set then `[PC] <- loc`	Jump to address location `loc` if `Z` is not set
110	`LOAD loc`	`[D0] <- [MS(loc)]`	Load the 8 bit value from address location `loc` to D0
111	`STORE loc`	`[MS(loc)] <- [D0]`	Write the 8 bit value from D0 to address location `loc`

This is not many instructions, but it is technically Turing-complete. The other specs of the PATP are:

An address space of 32 bytes (the maximum address is 11111)
A single 8-bit data register/accumulator D0
A CCR with only 1 bit (Z, set when an arithmetic operation has a result of zero)
A 5-bit program counter (only 5 bits needed to address whole memory)
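
The programmer's model is small enough to simulate in a few lines. The sketch below is not the module's reference implementation. It assumes the opcode sits in the three most significant bits, that LOAD and STORE leave Z unchanged, and that the program counter wraps at 32. All names are hypothetical.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- Machine state: data register D0, Z flag, program counter, 32-byte main store.
data PATP = PATP
  { d0  :: Word8
  , z   :: Bool
  , pc  :: Int
  , mem :: [Word8]
  } deriving (Show)

-- Start with cleared registers and the program loaded at address 0.
initial :: [Word8] -> PATP
initial prog = PATP { d0 = 0, z = False, pc = 0, mem = take 32 (prog ++ repeat 0) }

-- Fetch one instruction, increment the PC, then decode and execute it.
step :: PATP -> PATP
step m =
  let instr   = mem m !! pc m
      opcode  = instr `shiftR` 5                      -- top 3 bits (assumed)
      operand = fromIntegral (instr .&. 0x1F) :: Int  -- bottom 5 bits
      m'      = m { pc = (pc m + 1) `mod` 32 }
      setD0 v = m' { d0 = v, z = v == 0 }             -- arithmetic updates Z
  in case opcode of
       0 -> setD0 0                                   -- CLEAR
       1 -> setD0 (d0 m + 1)                          -- INC
       2 -> setD0 (d0 m + fromIntegral operand)       -- ADD #v
       3 -> setD0 (d0 m - 1)                          -- DEC
       4 -> m' { pc = operand }                       -- JMP loc
       5 -> if z m then m' else m' { pc = operand }   -- BNZ loc
       6 -> m' { d0 = mem m !! operand }              -- LOAD loc
       _ -> m' { mem = take operand (mem m) ++ [d0 m]
                       ++ drop (operand + 1) (mem m) } -- STORE loc

-- Run a fixed number of instructions (the PATP has no halt instruction).
run :: Int -> PATP -> PATP
run n m = iterate step m !! n

-- LOAD 31; INC; STORE 31; JMP 3 (spins in place), with 41 stored at address 31.
demo :: PATP
demo = initial ([0xDF, 0x20, 0xFF, 0x83] ++ replicate 27 0 ++ [41])

-- mem (run 4 demo) !! 31 == 42
```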

Internal Organisation

There are several building blocks that make up the internals of the PATP:

The data register D0
- An 8 bit register constructed from D-type flip-flops
- Has parallel input and output
- Clocked

The ALU
- Built around an 8-bit adder/subtractor
- Has two 8-bit inputs P and Q
- Capable of
  - Increment (+1)
  - Decrement (-1)
  - Addition (+n)
- Two function select inputs F1 and F2 which choose the operation to perform
  - 00: Zero output
  - 01: Q + 1
  - 10: Q + P
  - 11: Q - 1
- An output F(P, Q) which outputs the result of the operation
- A Z output for the CCR

The main system bus
- Uses 3-state buffers to enable communication

The control unit
- Controls:
  - The busses (enables)
  - When registers are clocked
  - ALU operation
  - Memory access
- Responsible for decoding instructions and issuing micro-instructions
- Inputs
  - Opcode
  - Clock
  - Z register
- Outputs
  - Enables
    - Main store
    - Instruction register IR
    - Program counter
    - Data register D0
    - ALU register
  - Clocks
    - Memory address register MAR
    - Instruction register IR
    - Program counter
    - Data register D0
    - ALU register
  - F1 and F2 on the ALU
  - R/W control bit for main store

All the components come together like so:

Micro and Macro Instructions

There are several steps internally that are required to execute a single instruction. For example, to execute an INC operation:

D0 needs to be put on the system bus
- CU enables the three-state buffer for D0
- [ALU(Q)] <- D0
The correct ALU function must be selected
- F1 = 0, F2 = 1
- Signals asserted by CU
- [ALU(F)] <- 01
The output from the ALU must be read into the ALU register
- ALUreg clocked by CU
- [ALUreg] <- [ALU]
D0 reads in the ALU output from the ALU register
- CU enables the three-state buffer for ALUreg
- D0 is clocked by CU

Macro instructions are the assembly instructions issued to the processor (to the CU, specifically), whereas micro instructions describe at a low level how data is moved between the internals of the CPU and which signals are asserted internally. The PATP can execute all instructions in 2 cycles. The table below gives an overview of the micro operations required for each macro instruction, along with the micro operations for fetching from main store.

Control Signals

The control unit asserts control signals at each step of execution, and the assertion of these control signals determine how data moves internally. For the PATP:

Enable signals are level-triggered
Clock signals are falling edge-triggered
An output can be enabled onto the main bus and then clocked elsewhere in a single time step
ALU timings assume that, if values are enabled at P and Q at the start of a cycle, then the ALU register can be clocked on the falling edge of that cycle
MS timings assume that if MAR is loaded during one cycle, then R, W and EMS can be used in the next cycle

The diagram below shows the timing for a fetch taking 4 cycles, and which components are signalled when. Notice which things happen in the same cycle, and which must happen sequentially.

Cycle	Micro-Op	Control Signals
1	`[MAR] <- [PC]`	Enable PC, Clock MAR
2	`[IR] <- [MS(MAR)]`	Set read for MAR, Enable MS, Clock IR
3	`[ALU(Q)] <- [PC]`	Enable PC
3	`[ALU(F)] <- 01`	F1 = 0, F2 = 1
3	`[ALUreg] <- [ALU]`	Clock ALUreg
4	`[PC] <- [ALUreg]`	Enable ALUreg, Clock PC

Control Unit Design

The task of the control unit is to coordinate the actions of the CPU, namely the Fetch-Decode-Execute cycle. It generates the fetch control sequence, takes opcode input, and generates the right control sequence based on this. It can be designed to do this in one of two ways:

Hardwired design (sometimes called "random logic")
- The CU is a combinatorial logic circuit, transforming input directly to output
Microprogrammed
- Each opcode is turned into a sequence of microinstructions, which form a microprogram
- Microprograms stored in ROM called microprogram memory

Hardwired

A sequencer is used to sequence the clock cycles
- Has clock input and n outputs T1 ... Tn
- First clock pulse is output from T1
- Second is output from T2
- Clock pulse n output from Tn
- Pulse n+1 output from T1
This aligns the operation of the circuit with the control steps
Advantages
- Fast
Disadvantages
- Complex, difficult to design and test
- Inflexible, can't change design to add new instructions
- Takes a long time to design
This technique is most commonly used in RISC processors and has been since the 80s

The control signal generator maps each instruction to outputs
The sequencer sequences the outputs appropriately
The flip-flop is used to regulate control rounds

Microprogrammed

The microprogram memory stores the required control actions for each opcode
The CU basically acts as a mini CPU within the CPU
- Microaddress is a location within microprogram memory
- MicroPC is the CU's internal program counter
- MicroIR is the CU's internal microinstruction register
The microPC can be used in different ways depending upon implementation
- Holds the next microaddress
- Holds the microaddress of microroutine for next opcode
When powered initially holds microaddress 0
- The fetch microprogram
Each microinstruction sets the CU outputs to the values dictated by the instruction
- As the microprogram executes, the CU generates control signals
After each microinstruction, the microPC is typically incremented, so microinstructions are stepped through in sequence
After a fetch, the microPC is not incremented, but is set to the output from the opcode decoding circuit (labelled OTOA in the diagram)
After a normal opcode microprogram, the microPC is set back to 0 (fetch)
When executing the microprogram for a conditional branch instruction, the microPC value is generated based upon whether the CU's Z input is set

Advantages
- Easy to design and implement
- Flexible design
- Simple hardware compared to alternative
- Can be reprogrammed for new instructions
Disadvantages
- Slower than hardwired
Most commonly used for CISC processors

RISC and CISC

In the late 70s-early 80s, it was shown that certain instructions are used far more than others:

45% data movement (move, store, load)
29% control flow (branch, call, return)
11% arithmetic (add, sub)

The overhead from using a microprogram memory also became more significant as the rest of the processor became faster. This caused a shift towards RISC computing. Right now, ARM is the largest RISC computing platform. Intel retains a CISC instruction set largely for backwards compatibility. In a modern Intel processor, the simplest instructions are executed by a RISC core, and more complex ones are microprogrammed.

RISC has simple, standard instructions whereas CISC has lots of more complex instructions
- x86 is often criticised as bloated
RISC allows for simpler, faster, more streamlined design
RISC instructions aim to be executed in a single cycle
CISC puts the focus on the hardware doing as much as possible, whereas RISC makes the software do the work

Multicore Systems

The performance of a processor can be considered as the rate at which it executes instructions: clock speed x IPC (instructions per clock).
To increase performance, increase clock speed and/or IPC
An alternative way of increasing performance is parallel execution
Multithreading separates the instruction stream into threads that can execute in parallel
A process is an instance of a program running on a computer
- A process has ownership of resources: the program's virtual address space, i/o devices, other data that defines the process
- The process is scheduled by the OS to divide the execution time of the processor between threads
- The processor switches between processes using the stack

CS141

#notacult

Types & Typeclasses

Haskell is a strongly, statically typed programming language, which helps prevent us from writing bad programs.

Java, C, Rust - statically typed
Python, Ruby - dynamically typed

Types have many benefits:

Describe the value of an expression
Prevent us from doing silly things
- not 7 gives Type Error
Good for documentation
Type errors occur at compile time

GHC checks types and infers the type of expressions for us. Types are discarded after type checking, and are not available at runtime.

Type notation

We say an expression has a type by writing expression :: type, read as "expression has type".

If we can assign a type to an expression, it is "well typed"
A type approximates and describes the value of an expression.

42 :: Int
True :: Bool
'c' :: Char
"Cake" :: String
0.5 :: Double
4 + 8 :: Int
2 * 9 + 3 :: Int
True && False :: Bool
"AB" ++ "CD" :: String
even 9 :: Bool

Before writing a definition, it is good practice to write its type.

daysPerWeek :: Int
daysPerWeek = 7

Function Types

The types of functions are denoted using arrows ->. The not function is defined as not :: Bool -> Bool, read "not has type bool to bool". It means if you give me a Bool, I will give you back another Bool.

The definition of the not function is shown below.

not :: Bool -> Bool
not True = False
not False = True
not True :: Bool

The last line shows how function application eliminates function types, as by applying a function to a value, one of the types from the function definition is removed as it has already been applied.

The xor function takes two boolean arguments and is defined:

xor :: Bool -> Bool -> Bool
xor False True = True
xor False False = False
xor True True = False
xor True False = True

Applying one argument to a function that takes two is called partial function application, as it partially applies arguments to a function to return another function. This is because all functions in haskell are curried, meaning all functions actually only take one argument, and functions taking more than one argument are constructed from applying multiple functions with one argument.

xor :: Bool -> Bool -> Bool
xor True :: Bool -> Bool -- partially applied function
xor True False :: Bool

Polymorphic Types

What is the type of \x -> x ? Could be:

f :: Int -> Int
f :: Bool -> Bool
f :: Char -> Char

These are all permissible types. To save redefining a function, we can use type variables. Any name in a type that starts with a lowercase letter is a type variable (a in this case).

\x -> x :: a -> a

\x -> x is the identity function, as it returns its argument unchanged. We can also have functions with more than one type variable, to specify that arguments have different types:

const :: a -> b -> a
const x y = x

Tuples

Tuples are a useful data structure

(4, 7) :: (Int, Int)
(4, 7.0) :: (Int, Double)
('a', 9, "Hello") :: (Char, Int, String)

--can nest tuples
((4, 'g'), False) :: ((Int, Char), Bool)

--can also contain functions
(\x -> x, 8.15) :: (a->a, Double)

Functions on pairs. These are all in the standard library

fst :: (a,b) -> a
snd :: (a,b) -> b
swap :: (a,b) -> (b,a)

-- these functions can also be defined by pattern matching
fst (x,y) = x
snd (x,y) = y
swap (x,y) = (y,x)

Type Classes

Type classes are used for restricting polymorphism and overloading functions.

The (+) operator probably has type (+) :: Int -> Int -> Int,
- This is correct, as this typing is permissible
What about 1.2 + 3.4?
- Will raise an error with this definition of (+)
Can polymorphism help?
(+) :: a -> a -> a
- This is stupid
- Allows any types
- Won't work
A type class constraint is needed
The actual type is (+) :: Num a => a -> a -> a
- The Num a => part is the constraint part
- Tells the compiler that a has to belong to the typeclass Num
Type class constraints are used to constrain type variables to only types which support the functions or operators specified by the type class
Type class names start with an uppercase character
Num is a type class that represents all types which support arithmetic operations

Defining Type Classes

A type class is defined as follows:

class Num a where
    (+) :: a -> a -> a
    (-) :: a -> a -> a
    abs :: a -> a

Num is the name of the type class
a is the type variable representing it in the method typings
The type class contains method signatures for all functions that members of the type class must implement

The type class contains type definitions, but no implementations for the functions. To implement them, we need to tell the compiler which types implement the type class and how they implement the functions in the type class. The Show typeclass tells the compiler that a type can be converted to a string.

-- typeclass definition
class Show a where
    show :: a -> String

-- instance of typeclass for bool type
instance Show Bool where
    show True = "True"
    show False = "False"

The instance definition tells the compiler that Bool is a member of Show, and how it implements the functions that Show defines.
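
The same pattern works for a user-defined class. The sketch below uses a made-up class Pretty and a made-up type Colour so that nothing clashes with the Prelude. It also shows a function constrained to types that have an instance.

```haskell
-- A hypothetical type class with a single method.
class Pretty a where
  pretty :: a -> String

-- Bool is made a member of Pretty by implementing its method.
instance Pretty Bool where
  pretty True  = "yes"
  pretty False = "no"

-- A hypothetical user-defined type, also made a member of Pretty.
data Colour = Red | Green

instance Pretty Colour where
  pretty Red   = "red"
  pretty Green = "green"

-- Works for any type a that has a Pretty instance.
announce :: Pretty a => a -> String
announce x = "value: " ++ pretty x

-- announce True  == "value: yes"
-- announce Green == "value: green"
```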

Prelude Type Classes

Num for numbers
Eq for equality operators == /=
Ord for inequality/comparison operators > <= etc
Show for converting things to string
Many More

The REPL makes extensive use of Show to print things. There are no show instances for function types, so you get an error if you try to Show functions. Typing :i in the REPL gets info on a type class. :i Num gives:

class Num a where
  (+) :: a -> a -> a
  (-) :: a -> a -> a
  (*) :: a -> a -> a
  negate :: a -> a
  abs :: a -> a
  signum :: a -> a
  fromInteger :: Integer -> a
  {-# MINIMAL (+), (*), abs, signum, fromInteger, (negate | (-)) #-}
        -- Defined in ‘GHC.Num’
instance Num Word -- Defined in ‘GHC.Num’
instance Num Integer -- Defined in ‘GHC.Num’
instance Num Int -- Defined in ‘GHC.Num’
instance Num Float -- Defined in ‘GHC.Float’
instance Num Double -- Defined in ‘GHC.Float’

Types of Polymorphism

In Java, there are two kinds of polymorphism:

Parametric polymorphism
- (Generics/Templates)
- A class is generic over certain types
- Can put whatever type you like in there to make a concrete class of that type
Subtype polymorphism
- Can do class Duck extends Bird
- Can put Ducks wherever Birds are expected

Haskell has two kinds of polymorphism also:

Parametric polymorphism
- Type variables
- id :: a -> a
- Can accept any type where a is
Ad-hoc polymorphism
- Uses type classes
- double :: Num a => a -> a
- double x = x * 2

Further Uses of Constraints

An example Show instance for pairs:

instance (Show a, Show b) => Show (a,b) where
    show (x,y) = "(" ++ show x ++ ", " ++ show y ++ ")"

The (Show a, Show b) => defines a constraint on a and b that they must both be instances of Show for them to be used with this instance. The instance is actually defined on the type (a,b).

Can also define that a typeclass has a superclass, meaning that for a type to be an instance of a typeclass, it must be an instance of some other typeclass first. The Ord typeclass has a superclass constraint of the Eq typeclass, meaning something can't be Ord without it first being Eq. This makes sense, as you can't have an ordering without first some notion of equality.

class Eq a => Ord a where
    (<) :: a -> a -> Bool
    (<=) :: a -> a -> Bool

Default Implementations

Type classes can provide default method implementations. For example, (<=) can be defined in terms of (<) and (==), so a default implementation can be provided.

class Eq a => Ord a where
    (<) :: a -> a -> Bool
    (<=) :: a -> a -> Bool
    (<=) x y = x < y || x == y
    -- or, equivalently, defined infix:
    -- x <= y = x < y || x == y

Derivable Type Classes

Writing type class instances can be tedious. Can use the deriving keyword to automatically generate them, which does the same as manually defining type class instances.

data Bool = False | True
    deriving Eq
data Module = CS141 | CS118 | CS126
    deriving (Eq, Ord, Show)
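
A short sketch of what the derived instances provide for the Module type. Derived Ord compares constructors by the order in which they are declared, and derived Show prints the constructor name.

```haskell
data Module = CS141 | CS118 | CS126
  deriving (Eq, Ord, Show)

-- Derived Show prints the constructor name.
shown :: String
shown = show CS141                      -- "CS141"

-- Derived Eq compares constructors.
sameModule :: Bool
sameModule = CS141 == CS118             -- False

-- Derived Ord follows declaration order: CS141 < CS118 < CS126.
latest :: Module
latest = maximum [CS118, CS141, CS126]  -- CS126
```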

Certain other typeclasses can be derived too, by enabling language extensions within GHC. The DeriveFunctor extension (enabled with -XDeriveFunctor) allows types to include a deriving Functor clause.

Data Types

How do we make our own data types in haskell? Algebraic data types.

Bool is a type
There are two values of type Bool
- True
- False

data Bool = True | False

A type definition consists of the type name Bool and its data constructors, or values, True | False. A type definition introduces data constructors into scope, which are just functions.

True :: Bool
False :: Bool

We can pattern match on data constructors, and also use them as values. This is true for all types.

not :: Bool -> Bool
not True = False
not False = True

More examples:

data Module = CS141 | CS256 | CS263

data Language = PHP | Java | Haskell | CPP

--for this one, the type name and constructor name are separate names in the namespace
data Unit = Unit

-- this one has no values
data Void

Parametrised Data Constructors

Parameters can be added to a data constructor by adding their types after the constructor's name. The example below defines a type to represent shapes. Remember that data constructors are just functions, and can be partially applied just like other functions.

data Shape = Rect Double Double | Circle Double
Rect :: Double -> Double -> Shape
Circle :: Double -> Shape

-- functions utilising the Shape type

-- constructs a square
square :: Double -> Shape
square x = Rect x x

-- calculates area of a shape using pattern matching on constructors
area :: Shape -> Double
area (Rect w h) = w * h
area (Circle r) = pi * r * r

isLine :: Shape -> Bool
isLine (Rect 1 h) = True
isLine (Rect w 1) = True
isLine _ = False

-- examples
area (square 4.0)
=> area (Rect 4.0 4.0)
=> 4.0 * 4.0
=> 16.0

area (Circle 5.0)
=> pi * 5.0 * 5.0
=> pi * 25.0
=> 78.53981...

Parametrised Data Types

The Maybe type is an example of a data type parametrised over some type variable a. It exists within the standard library, defined as data Maybe a = Nothing | Just a. This type is used to show that either there is no result, or some type a.

A function can use the Maybe type to perform division safely, returning Nothing if the divisor is 0, and the result wrapped in a Just if the division can be done.

data Maybe a = Nothing | Just a

safediv :: Int -> Int -> Maybe Int
safediv x 0 = Nothing
safediv x y = Just (x `div` y)
-- safediv 8 0 => Nothing
-- safediv 8 4 = Just (8 `div` 4) = Just 2

-- this is included in stdlib for extracting the value using pattern matching
fromMaybe :: a -> Maybe a -> a
fromMaybe x Nothing = x
fromMaybe _ (Just x) = x
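
A caller of safediv has to deal with the Nothing case, either by supplying a default or by pattern matching. A minimal sketch, where divOrZero and describe are hypothetical names and Maybe and fromMaybe are taken from the standard library rather than redefined:

```haskell
import Data.Maybe (fromMaybe)

safediv :: Int -> Int -> Maybe Int
safediv _ 0 = Nothing
safediv x y = Just (x `div` y)

-- Fall back to 0 when the division cannot be done.
divOrZero :: Int -> Int -> Int
divOrZero x y = fromMaybe 0 (safediv x y)

-- Handle both cases explicitly by pattern matching.
describe :: Maybe Int -> String
describe Nothing  = "no result"
describe (Just n) = "result: " ++ show n

-- divOrZero 8 4          == 2
-- divOrZero 8 0          == 0
-- describe (safediv 8 0) == "no result"
```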

Null references were invented in the 1960s by Tony Hoare, who later called them his "billion dollar mistake". The Maybe type is a good alternative, which makes it clear that a value may be absent. Similar concepts exist in other languages (Swift, Rust).

Recursive Data Types

In Haskell, data types can be defined in terms of themselves. An example definition of the natural numbers is shown below, where a number is either zero, or one plus another number.

data Nat = Zero | Succ Nat

Zero :: Nat
Succ :: Nat -> Nat

one = Succ Zero
two = Succ one
three = Succ two

add :: Nat -> Nat -> Nat
add Zero     m = m
add (Succ n) m = Succ (add n m)

mul :: Nat -> Nat -> Nat
mul Zero     m = Zero
mul (Succ n) m = add m (mul n m)

Another example defining binary trees in terms of themselves. A binary tree consists of subtrees (smaller binary trees). This type is parametrised over some type variable a also.

data BinTree a = Leaf a | Node (BinTree a) (BinTree a)

--converts a binary tree to a list
flatten :: BinTree a -> [a]
flatten (Leaf x)   = [x]
flatten (Node l r) = flatten l ++ flatten r

-- computes the max depth of the tree
depth :: BinTree a -> Int
depth (Leaf _)   = 1
depth (Node l r) = 1 + max (depth l) (depth r)
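
To see these functions at work, the sketch below builds a small example tree and adds one further recursive function, mapTree, which is a hypothetical addition and not from the notes.

```haskell
data BinTree a = Leaf a | Node (BinTree a) (BinTree a)

flatten :: BinTree a -> [a]
flatten (Leaf x)   = [x]
flatten (Node l r) = flatten l ++ flatten r

depth :: BinTree a -> Int
depth (Leaf _)   = 1
depth (Node l r) = 1 + max (depth l) (depth r)

-- Hypothetical extra: apply a function to every leaf, keeping the shape.
mapTree :: (a -> b) -> BinTree a -> BinTree b
mapTree f (Leaf x)   = Leaf (f x)
mapTree f (Node l r) = Node (mapTree f l) (mapTree f r)

-- A tree with leaves 1, 2 and 3, where the right subtree is one level deeper.
example :: BinTree Int
example = Node (Leaf 1) (Node (Leaf 2) (Leaf 3))

-- flatten example                 == [1,2,3]
-- depth example                   == 3
-- flatten (mapTree (* 2) example) == [2,4,6]
```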

Type Aliases

Types can be aliased. For example, String has been an alias of [Char] all along.

type String = [Char]

Another example, defining a Predicate type

type Predicate a = a -> Bool

isEven :: Predicate Int
isEven n = n `mod` 2 == 0

isEven' :: (Eq a, Integral a) => Predicate a
isEven' n = n `mod` 2 == 0

Recursion

Recursion is a way of expressing loops with no mutable state, by defining a function in terms of itself. The classic example, the factorial function. Defined mathematically:

$n! = \begin{cases} 1 & \text{if } n = 0 \\ n \times (n - 1)! & \text{otherwise} \end{cases}$

In haskell:

factorial :: Int -> Int
factorial 0 = 1
factorial n = n * factorial (n-1)

It can be seen how this function reduces when applied to a value:

factorial 2
=> 2 * factorial (2-1)
=> 2 * factorial 1
=> 2 * 1 * factorial (1-1)
=> 2 * 1 * factorial 0
=> 2 * 1 * 1
=> 2

Another classic example, the fibonacci function:

fib :: Int -> Int
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

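In imperative languages, a frame is pushed onto the call stack every time a function is called. Haskell has no mutable state and evaluates by reducing expressions, so recursion is the natural way to express loops and can be made efficient.

A function is tail recursive when the recursive call is the last thing it does, and such functions can be optimised to execute more efficiently. The factorial function can be written this way by carrying an accumulator:

fac :: Int -> Int
fac n = fac' n 1

fac' :: Int -> Int -> Int
fac' 0 m = m
fac' n m = fac' (n-1) (n*m)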

This version of the function prevents haskell from building up large expressions:

fac 500
=> fac' 500 1
=> fac' (500-1) (500*1)
=> fac' 499 500
=> fac' (499-1) (499 * 500)
=> fac' 498 249500

Notice the pattern for all recursive functions, where there is a recursive case, defining the function in terms of itself, and a base case. Without a base case, the function would recurse infinitely. The cases are usually defined as pattern matches.
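
Because Haskell is lazy, an accumulator can itself build up as an unevaluated expression unless it is forced. One way to force it at each step is seq. The sketch below uses a hypothetical factorial, with Integer rather than Int so that large results do not overflow. It assumes a non-negative argument.

```haskell
-- Tail-recursive factorial with a strictly evaluated accumulator.
-- Assumes n >= 0; a negative argument would never reach the base case.
factorial :: Integer -> Integer
factorial n = go n 1
  where
    -- Base case: return the accumulated product.
    go 0 acc = acc
    -- Recursive case: force the new accumulator before recursing,
    -- so no chain of pending multiplications is built up.
    go k acc =
      let acc' = k * acc
      in  acc' `seq` go (k - 1) acc'

-- factorial 0 == 1
-- factorial 5 == 120
```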

Recursion on Lists

Recursion is the natural way to operate on lists in haskell. Defining the product function, which returns the product of all the items in the list:

product :: [Int] -> Int
product [] = 1
product (n:ns) = n * product ns

Here, the base case is the empty list [] and pattern match is used to "de-cons" the head off the list and operate on it (n:ns). The function reduces as follows:

product [1,2,3,4]
=> 1 * product [2,3,4]
=> 1 * 2 * product [3,4]
=> 1 * 2 * 3 * product [4]
=> 1 * 2 * 3 * 4 * product []
=> 1 * 2 * 3 * 4 * 1
=> 24

`let` and `where`

let and where clauses can be used to introduce local bindings within a function, which are useful in defining recursive functions. An example is the splitAt function, which splits a list into two at a certain index.

splitAt :: Int -> [a] -> ([a],[a])
splitAt 0 xs = ([],xs)
splitAt n [] = ([],[])
splitAt n (x:xs) = (x:ys, zs)
    where (ys,zs) = splitAt (n-1) xs
-- alternatively
splitAt n xs =
  let
    ys = take n xs
    zs = drop n xs
  in (ys,zs)
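
Local bindings can name helper functions as well as values. A small sketch with hypothetical function names, showing a where clause that defines a local function and a let expression that names an intermediate value:

```haskell
-- where: squares is a local value, square is a local function.
sumOfSquares :: [Int] -> Int
sumOfSquares xs = sum squares
  where
    squares  = [square x | x <- xs]
    square x = x * x

-- let: baseArea is only in scope inside the expression after "in".
cylinderVolume :: Double -> Double -> Double
cylinderVolume r h =
  let baseArea = pi * r * r
  in  baseArea * h

-- sumOfSquares [1,2,3] == 14
```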

let and where can also define functions locally, as functions in Haskell are values like any other.

Higher Order Functions

Higher order functions are functions which operate on functions.

Associativity of functions

Function expressions associate to the right (one argument is applied at a time)

xor a b = (a || b ) && not (a && b)
-- equivalent to
xor = \a -> \b -> (a || b) && not (a && b)
-- equivalent to
xor = \a -> (\b -> (a || b) && not (a && b))

All functions in haskell are technically nameless, single-parameter functions
Currying allows for functions which return other functions
Functions are expressions
- The body of a function is an expression
When a function is applied to an argument it reduces to its body.

Function application associates to the left:

xor True True
=> (xor True) True
=> ((\a -> (\b -> (a || b) && not (a && b))) True) True
=> (\b -> (True || b) && not (True && b)) True
=> (True || True) && not (True && True)

Function types, however, associate to the right:

xor :: Bool -> Bool -> Bool
xor = \a -> \b -> (a || b) && not (a && b)
--equivalent to
xor :: Bool -> (Bool -> Bool)
xor = \a -> (\b -> (a || b) && not (a && b))

The table below shows how function application and types associate:

Without Parentheses	With Parentheses
`f x y`	`(f x) y`
`\x -> \y -> ...`	`\x -> (\y -> ...)`
`Int -> Int -> Int`	`Int -> (Int -> Int)`

Functions as Arguments (`map`)

Haskell functions can be taken as arguments to other functions. Functions that take/return functions are called higher order functions. An example, increasing every element of a list by one:

incByOne :: [Int] -> [Int]
incByOne xs = [x+1 | x <- xs]
-- or using recursion
incByOne [] = []
incByOne (x:xs) = x+1 : incByOne xs

All this function does is applies the function (+ 1) to every element. This pattern can be generalised using the map function: a function that applies a function given as an argument to every element of a list:

map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (x:xs) = f x : map f xs

Note the type signature of the map function is map :: (a -> b) -> [a] -> [b], meaning the first argument is a function of type (a -> b). Using this to implement incByOne:

incByOne = map (+1)
-- tracing its evaluation:
incByOne [1,2,3]
=> map (+1) [1,2,3]
=> (1+1) : map (+1) [2,3]
=> (1+1) : (2+1) : map (+1) [3]
=> (1+1) : (2+1) : (3+1) : map (+1) []
=> (1+1) : (2+1) : (3+1) : []
=> [2,3,4]

Effectively, map f [x, y, z] evaluates to [f x, f y, f z]

Sections

Sections are partially applied operators. Operators are functions like any other, and as such can be partially applied, passed as arguments, etc. The addition operator is shown as an example, but the same applies to any binary operator.

(+) :: Num a => a -> a -> a
(+ 4) :: Num a => a -> a
(4 +) :: Num a => a -> a
(+) 4 8 = 4 + 8
(+ 4) 8 = 8 + 4
(4 +) 8 = 4 + 8
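
Which side of the operator is filled in only matters when the operator is not commutative. A minimal sketch using division (the names `halve`, `twoOver` and `decrement` are illustrative only):

```haskell
-- right section: the missing argument goes on the left of (/)
halve :: Double -> Double
halve = (/ 2)        -- \x -> x / 2

-- left section: the missing argument goes on the right of (/)
twoOver :: Double -> Double
twoOver = (2 /)      -- \x -> 2 / x

-- (- 1) is parsed as the number negative one, not as a section,
-- so the Prelude function subtract is used instead
decrement :: Int -> Int
decrement = subtract 1
```

Subtraction is the one awkward case, because the minus sign doubles as negation.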

`Filter`

Filter is an example of another higher order function, which given a list, returns a new list which contains only the elements satisfying a given predicate.

filter :: (a -> Bool) -> [a] -> [a]
filter p [] = []
filter p (x:xs)
    | p x       = x : filter p xs
    | otherwise =     filter p xs

Some examples:

-- remove all numbers less than or equal to 42
greaterThan42 :: [Int] -> [Int]
greaterThan42 xs = filter (>42) xs
-- only keep uppercase letters (isUpper is exported by Data.Char)
uppers :: String -> String
uppers xs = filter isUpper xs
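
map and filter are often used together: filter first to pick the elements of interest, then map to transform them. A short sketch, where `initialsLower` and `evenSquares` are hypothetical names chosen for this example:

```haskell
import Data.Char (isUpper, toLower)

-- keep the uppercase letters, then lowercase each of them
initialsLower :: String -> String
initialsLower s = map toLower (filter isUpper s)

-- squares of the even numbers in a list
evenSquares :: [Int] -> [Int]
evenSquares xs = map (^ 2) (filter even xs)
```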

Curried vs Uncurried

Tuples can be used to define uncurried functions. A function that takes two arguments can be converted to a function that takes a single tuple containing both arguments, and returns a single result.

uncurriedAdd :: (Int, Int) -> Int
uncurriedAdd (x, y) = x + y

There are higher-order functions, curry and uncurry, which will do this for us:

curry :: ((a,b) -> c) -> a -> b -> c
curry f x y = f (x,y)

uncurry :: (a -> b -> c) -> (a,b) -> c
uncurry f (x,y) = f x y

-- examples
uncurriedAdd :: (Int, Int) -> Int
uncurriedAdd = uncurry (+)

curriedAdd :: Int -> Int -> Int
curriedAdd = curry uncurriedAdd

addPairs :: [Int]
addPairs = map (uncurry (+)) [(1, 2), (3, 4)]
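
uncurry is most useful when the data already arrives as pairs, for example from zip. A small sketch (the names `addLists` and `firstOf` are assumptions made for illustration):

```haskell
-- add two lists element-wise by pairing them up first
addLists :: [Int] -> [Int] -> [Int]
addLists xs ys = map (uncurry (+)) (zip xs ys)

-- curry turns a function on pairs, such as fst, into a two-argument function
firstOf :: a -> b -> a
firstOf = curry fst
```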

Folds

foldr and foldl "collapse" a list into a single value by applying a function f to each element in turn, combining that element with a value accumulated from the rest of the list. A starting value z is also passed, which is the result for the empty list. There are several functions which follow this pattern, all reducing a list to a single value using recursion:

-- and together all bools in the list
and :: [Bool] -> Bool
and [] = True
and (b:bs) = ((&&) b) (and bs)

-- product of everything in the list
product :: Num a => [a] -> a
product [] = 1
product (n:ns) = ((*) n) (product ns)

-- length of list
length :: [a] -> Int
length [] = 0
length (x:xs) = ((+) 1) (length xs)

All of these functions have a similar structure, and can be redefined using foldr:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)

-- examples
and :: [Bool] -> Bool
and = foldr (&&) True

product :: Num a => [a] -> a
product = foldr (*) 1

length :: [a] -> Int
length = foldr (\x n -> n + 1) 0

In essence, foldr f z [1, 2, 3] is equal to f 1 (f 2 (f 3 z)). foldr folds from right (r) to left, starting by applying the function to the last element of the list first. foldl, however, works in the opposite direction:

foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs

foldl f z [1, 2, 3] is equal to f (f (f z 1) 2) 3. When f is associative and z is an identity for it (such as (+) with 0, or (&&) with True), both folds give the same result, but for other functions (such as (-)) the choice of which to use is important.
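
Subtraction is a simple case where the direction of the fold changes the answer. The sketch below uses the Prelude's foldr and foldl; the names `rightFold`, `leftFold` and `reverse'` are illustrative only.

```haskell
rightFold :: Int
rightFold = foldr (-) 0 [1, 2, 3]   -- 1 - (2 - (3 - 0))

leftFold :: Int
leftFold = foldl (-) 0 [1, 2, 3]    -- ((0 - 1) - 2) - 3

-- reversing a list falls out naturally from foldl:
-- each element is consed onto the front of the accumulator
reverse' :: [a] -> [a]
reverse' = foldl (\acc x -> x : acc) []
```

Expanding the brackets by hand gives 2 for the right fold and -6 for the left fold.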

Function Composition

In Haskell, functions are composed with the (.) operator, a higher order function defined as:

(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g x = f (g x)

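Function composition is used to chain functions, so instead of f (g (h x)), you can write (f . g . h) x. The parentheses are needed because function application binds tighter than (.), so f . g . h x would be parsed as f . g . (h x). An example, defining a function count to count the number of occurrences of an element in a list: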

count :: Eq a => a -> [a] -> Int
count _ [] = 0
count y (x:xs)
    | y == x    = 1 + count y xs
    | otherwise =     count y xs

--alternatively, using a fold
count y = foldr (\x l -> if y==x then 1+l else l) 0

-- the stdlib can do this
count y xs = length (filter (==y) xs)
count y = length . filter (==y) -- using composition
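
To make the relationship between nested application and composition concrete, here is the same pipeline written three ways. This is only a sketch; `shout1`, `shout2` and `shout3` are made-up names.

```haskell
import Data.Char (toUpper)

-- three equivalent ways of writing the same pipeline:
-- drop spaces, uppercase every character, then reverse
shout1, shout2, shout3 :: String -> String
shout1 s = reverse (map toUpper (filter (/= ' ') s))   -- nested application
shout2 s = (reverse . map toUpper . filter (/= ' ')) s -- composed, then applied
shout3   = reverse . map toUpper . filter (/= ' ')     -- argument left implicit
```

The last form drops the argument entirely, in the same way as the final definition of count.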

Lazy Evaluation

Evaluation Strategies

How are programs evaluated? There are a number of strategies for evaluating a program. For example, the expression (4+8) * (15 + 16) can be evaluated in different ways:

(4+8) * (15 + 16)
=> 12 * (15+16)
=> 12 * 31
=> 372

-- or

(4+8) * (15 + 16)
=> (4 + 8) * 31
=> 12 * 31
=> 372

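The final value when reducing an expression (it cannot be reduced further) is the normal form, 372 in this case. No matter how the expression is reduced, if a normal form is reached it is always the same one. Haskell's type system prevents us from writing expressions that would get stuck part way through reduction (such as adding a Bool to an Int), although a well-typed expression can still fail to terminate.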

A sub-expression (anything partially reduced that can still be reduced further) is called a redex, short for reducible expression. Evaluation strategies only matter when there are multiple redexes, otherwise there is only one route we can take to evaluate an expression.

Strict Evaluation

A programming language is strict if the arguments of the function are evaluated before the function is called.

Evaluating fac 500 using a strict method:

fac :: Int -> Int
fac n = fac' n 1

fac' :: Int -> Int -> Int
fac' n m = case n of
  0 -> m
  _ -> fac' (n-1) (n*m)

fac 500      -- a redex, function application
=> fac' 500 1   -- another redex
=> fac' (500-1) (500*1)     -- 3 redexes: a subtraction, a multiplication and function application
=> fac' 499 (500*1)     -- two redexes now as 500-1=499 is now in normal form
=> fac' 499 500         -- now only one redex
=> fac' (499-1) (499*500) -- back to 3 redexes
... -- this goes on for a while

Call-by-value means that all function arguments are reduced to their normal forms (values), and then passed as such to the function. The call-by-value strategy is an example of strict evaluation. This is the evaluation strategy used by most programming languages: Java, JS, PHP, C/C++, OCaml, F#, Python, Scala, Swift. Note that some of these are also functional languages.

Haskell, on the other hand, is far superior. It is non-strict: aka lazy.

Call-by-name

A non-strict evaluation strategy by which expressions given to functions as arguments are not reduced before the function call is made.
Expressions are only reduced when their value is needed. Same example as before:

fac 2
=> fac' 2 1  -- still a redex here
=> case 2 of
     0 -> 1
     _ -> fac' (2-1) (2*1)   -- the function call is expanded to its expression
=> fac' (2-1) (2*1) -- left with 3 redexes now
=> case 2-1 of
     0 -> 2*1
     _ -> fac' ((2-1)-1) ((2-1) * (2*1)) -- a lot of redexes, but we don't need to know the value of any except the one in the case expression. this one is evaluated but not the others
=> case 1 of
     0 -> 2*1
     _ -> fac' ((2-1)-1) ((2-1) * (2*1)) -- something actually got evaluated, as we needed its value. we still have a lot of redexes though

Note that the same argument, (2-1), appears 3 times, but it is only evaluated when it is needed. This means that it may be evaluated more than once, as it may be needed at several different points. With call-by-value (strict), an argument is always reduced, but only ever once; with call-by-name (non-strict), an expression is only reduced if it is needed, but may end up being reduced more than once.

Sharing avoids duplicate evaluation. Arguments to functions are turned into local definitions, so that when an expression is evaluated, every other use of that same definition sees the result instead of evaluating it again. The same example again, using both call-by-name and sharing:

fac' :: Int -> Int -> Int
fac' n m = case n of
  0 -> m
  _ -> let x = n-1
           y = n*m
       in fac' x y

-- the compiler has replaced the expression arguments with let-bound definitions

fac 2
=> fac' 2 1
=> case 2 of
     0 -> 1
     _ -> let x0 = 2-1
              y0 = 2*1
          in fac' x0 y0 --expressions bound to variables

=> let x0 = 2-1
       y0 = 2*1 -- two redexes
   in fac' x0 y0
=> let x0 = 2-1
       y0 = 2*1
   in case x0 of
        0 -> y0
        _ -> let x1 = x0-1
                 y1 = x0 * y0
             in fac' x1 y1 -- even more redexes and bindings
    -- x0 can be replaced by 1, which evaluates the expression in all places where x0 is used

We can think of let or where bindings as storing expressions in memory in such a way that we can refer to them from elsewhere using their names.

The combination of call-by-name and sharing is known as lazy evaluation, which is the strategy Haskell uses. Nothing is evaluated until it is needed, and work is only ever done once. (Strict evaluation is done sometimes if the compiler decides to, so it is technically non-strict instead of lazy.)
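
A minimal sketch of both halves of lazy evaluation, assuming GHC's usual behaviour: an argument that is never needed is never evaluated, and a let-bound expression is evaluated at most once however many times its name is used. The names `firstArg`, `safe` and `shared` are illustrative only.

```haskell
-- ignores its second argument entirely
firstArg :: Int -> Int -> Int
firstArg x _ = x

-- under strict evaluation the second argument would be evaluated
-- first and the program would crash; here it is never demanded
safe :: Int
safe = firstArg 42 (error "never evaluated")

-- sharing: 'big' names a single expression, so both uses
-- refer to the same result rather than recomputing the sum
shared :: Int
shared = let big = sum [1 .. 1000] in big + big
```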

Evaluation in Haskell

An example, using Haskell's lazy evaluation strategy:

length (take 2 (map even [1,2,3,4]))
=> length (take 2 (even 1 : map even [2,3,4])) -- check argument is non-empty list
=> length (even 1 : take (2-1) (map even [2,3,4])) -- even 1 cons'd to take 1 of map
=> 1 + length (take (2-1) (map even [2,3,4])) --know length is at least 1, take out
=> 1 + length (take 1 (map even [2,3,4]))
=> 1 + length (take 1 (even 2 : map even [3,4])) --another map call
=> 1 + (1 + length (take (1-1) (map even [3,4]))) -- length again
=> 1 + (1 + length []) --take 0 so empty list
=> 1 + (1 + 0) -- length [] is 0
=> 2 -- done

Note how half the map wasn't evaluated, because only the first 2 elements were ever demanded. (The calls to even were never evaluated at all, as length does not need the values of the elements.) However this trace doesn't show any of the internal bindings Haskell makes for sharing expressions. The compiler does this by transforming the expression:

length (take 2 (map even [1,2,3,4]))
-- becomes
let
  xs = take 2 (map even [1,2,3,4])
in length xs
-- becomes
let
  ys = map even [1,2,3,4]
  xs = take 2 ys
in length xs
-- becomes
let
  ys = map even (1:(2:(3:(4:[]))))
  xs = take 2 ys
in length xs
-- finally
let
  zs4 = 4:[]
  zs3 = 3:zs4
  zs2 = 2:zs3
  zs  = 1:zs2
  ys  = map even zs
  xs  = take 2 ys
in length xs

In this representation, everything is let bound in its own definition, and nothing is applied except to some literal or to another let bound variable. The representation in memory looks something like this:

These things in memory are called closures. A closure is an object in memory that contains:

A pointer to some code that implements the function it represents (not shown)
Pointers to all the free variables that are in scope for that definition
- A free variable is any variable in scope that is not a parameter

The closures form a graph, where the closures all point to each other.

Another example, using map:

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs

-- removing all syntactic sugar, done by compiler

map = \f -> \arg ->
  case arg of
    []      -> []
    (x: xs) -> let
                 y  = f x
                 ys = map f xs
                in (y:ys)

Using this definition of map to evaluate the expression from before (length (take 2 (map even [1,2,3,4]))):

let
  zs4 = 4:[]
  zs3 = 3:zs4
  zs2 = 2:zs3
  zs  = 1:zs2
  xs  = map even zs
  ys  = take 2 xs
in length ys
-- new closures allocated by map, using 2nd case of map function
let
  zs4 = 4:[]
  zs3 = 3:zs4
  zs2 = 2:zs3
  zs  = 1:zs2
  y0 = even 1
  ys0 = map even zs2 -- new closures
  xs  = y0 : ys0 -- updated to be a cons cell
  ys  = take 2 xs
in length ys

The graph of closures representing this:

Strictness in Haskell

Things can be evaluated strictly in Haskell, if you want. This is preferable in some cases for performance reasons. The $! operator forces strict function application: f $! x evaluates x before applying f to it. The version of the function below forces the argument n-1 to be evaluated before the recursive call is made.

fac' :: Int -> Int -> Int
fac' 0 m = m
fac' n m = (fac' $! (n-1)) (n*m)
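
The same operator can be pointed at the accumulator instead, which is where unevaluated work tends to pile up in a function like this. The following is a sketch under that assumption, not the module's own definition; `facStrict` and `go` are hypothetical names, and the input is assumed to be non-negative.

```haskell
-- forces the accumulator at every step, so no chain of
-- unevaluated (k * acc) expressions builds up
facStrict :: Integer -> Integer
facStrict n = go n 1
  where
    go 0 acc = acc
    go k acc = go (k - 1) $! (k * acc)
```

Because $! has the lowest precedence, `go (k - 1) $! (k * acc)` is read as `(go (k - 1)) $! (k * acc)`.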

Infinite Data Structures

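Laziness means data structures can be infinite in Haskell, as only the parts of a structure that are actually demanded are ever built. This is also helped by the way recursion is evaluated: there is no small fixed "max recursion depth" of the kind found in many strict languages.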

from :: Int -> [Int]
from n = n : from (n+1)

This function builds an infinite list of a sequence of Ints, starting with the Int passed. An example usage, showing how lazy evaluation works with it:

take 3 (from 4)
=> take 3 (4 : from 5)
=> 4 : take 2 (from 5)
=> 4 : take 2 (5 : from 6)
=> 4 : 5 : take 1 (from 6)
=> 4 : 5 : take 1 (6 : from 7)
=> 4 : 5 : 6 : take 0 (from 7)
=> 4 : 5 : 6 : []
=> [4,5,6]

The infinite evaluation is short-circuited, as take only ever demands the first 3 elements of the list.
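
Other list functions can consume infinite lists in the same way, as long as they only demand a finite prefix. A short sketch reusing `from` from above; `smallSquares` and `fibs` are illustrative names.

```haskell
from :: Int -> [Int]
from n = n : from (n + 1)

-- the squares below 50, taken from an infinite list of squares
smallSquares :: [Int]
smallSquares = takeWhile (< 50) (map (^ 2) (from 1))

-- an infinite list defined in terms of itself
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
```

Asking for the whole of `fibs` (for example with `length fibs`) would never finish, so an infinite list should always sit behind something like take or takeWhile.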

Reasoning About Programs

Haskell programs can be checked with normal software testing methods, but because Haskell is a pure language, we can do better and formally prove properties of our functions and types.

Natural Numbers

Natural numbers can be defined as data Nat = Z | S Nat in Haskell. Alternatively, using mathematical notation, this can be written $0 \in \mathbb{N}$ and $\forall n \in \mathbb{N}, n + 1 \in \mathbb{N}$ . Addition can then be defined recursively:

add :: Nat -> Nat -> Nat
add Z     m = m
add (S n) m = S (add n m)

Addition has certain properties which must hold true:

Left identity: ∀m :: Nat, add Z m == m
- $0 + m = m$
Right identity: ∀m :: Nat, add m Z == m
- $m + 0 = m$
Associativity: ∀x y z :: Nat, add x (add y z) == add (add x y) z
- $x + (y + z) = (x + y) + z$

These can be proven using equational reasoning, which proves that an equality holds in all cases. Generally, a property can be proved by applying and un-applying definitions on either side of an equation, by induction, or by a combination of both.

To prove the left identity is easy, as it is an exact match of one of our equations for add:

add Z m
-- applying add
= m

The right identity is a little harder, as we can't just directly apply one of our equations. We can instead induct on m. First, the base case:

add Z Z
-- applying add
= Z

Using the induction hypothesis add m Z = m, we need to show the inductive step holds for S m (m+1):

add (S m) Z
-- applying add
= S (add m Z)
-- applying induction hypothesis
= S m

This proves the right identity. To prove associativity we will again use induction, this time on x. The base case is add Z (add y z):

add Z (add y z)
-- applying add
= add y z
-- un-applying add
= add (add Z y) z

The proof holds for x = Z. Here, the proof was approached from either end to meet in the middle, but written as a single list of operations for clarity. Sometimes it is easier to do this and work from either direction, especially when un-applying functions as it is more natural.

The induction hypothesis is add x (add y z) == add (add x y) z, and can be assumed. We need to prove the inductive step add (S x) (add y z) == add (add (S x) y) z:

add (S x) (add y z)
-- applying add
= S (add x (add y z))
-- applying induction hypothesis
= S (add (add x y ) z)
-- un-applying add
= add (S (add x y)) z
-- un-applying add
= add (add (S x) y) z

This proves associativity.
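
The proofs cover every Nat, which testing never can, but it is still reassuring to spot-check the three properties on a few small values. A sketch of that check, where `fromInt`, `samples` and the three `...OK` names are hypothetical helpers added for this example:

```haskell
data Nat = Z | S Nat
  deriving (Eq, Show)

add :: Nat -> Nat -> Nat
add Z     m = m
add (S n) m = S (add n m)

-- hypothetical helper: build the Nat for a non-negative Int
fromInt :: Int -> Nat
fromInt 0 = Z
fromInt n = S (fromInt (n - 1))

-- the naturals 0 to 5
samples :: [Nat]
samples = map fromInt [0 .. 5]

-- each property checked for every combination of sample values
leftIdentityOK, rightIdentityOK, associativityOK :: Bool
leftIdentityOK  = and [add Z m == m | m <- samples]
rightIdentityOK = and [add m Z == m | m <- samples]
associativityOK =
  and [ add x (add y z) == add (add x y) z
      | x <- samples, y <- samples, z <- samples ]
```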

Induction on Lists

We can induct on any recursive type, including lists: data List a = Empty | Cons a (List a). Using this definition, we can prove map fusion. Map fusion states that we can turn multiple consecutive map operations into a single one with composed functions:

map f (map g xs) = map (f.g) xs
- ∀f :: b -> c
- ∀g :: a -> b
- ∀xs :: [a]

The definitions of map and (.) may be useful:

map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (x:xs) = f x : map f xs

(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g x = f (g x)

Map fusion can be proved by induction on xs. The base case is map f (map g []) = map (f.g) []:

map f (map g [])
-- applying map
= map f []
-- applying map
= []
-- un-applying map
= map (f.g) []

Using the induction hypothesis map f (map g xs) = map (f.g) xs, we can prove the inductive case map f (map g (x : xs)) = map (f.g) (x : xs):

map f (map g (x : xs))
-- applying map
= map f (g x : map g xs)
-- applying map
= f (g x) : map f (map g xs)
-- induction hypothesis
= f (g x) : map (f.g) xs
-- un-applying (.)
= (f.g) x : map (f.g) xs
-- un-applying map
= map (f.g) (x : xs)
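
As a concrete instance of the law just proved, take f = (* 2) and g = (+ 1). The names `unfused` and `fused` are illustrative only.

```haskell
-- two passes over the list
unfused :: [Int] -> [Int]
unfused xs = map (* 2) (map (+ 1) xs)

-- one pass, applying the composed function to each element
fused :: [Int] -> [Int]
fused xs = map ((* 2) . (+ 1)) xs
```

Both produce the same list for any input, but the fused version only builds one new list.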

Proving a Compiler

Given a simple expression language:

data Expr = Val Int | Plus Expr Expr

And a simple instruction set:

data Instr = Push Int | Add
type Program = [Instr]
type Stack = [Int]

We can write an exec function as an interpreter for our instruction set:

exec :: Program -> Stack -> Stack
exec []                    s  = s
exec (Push n : p)          s  = exec p (n : s)
exec (Add    : p) (y : x : s) = exec p (x + y : s)

An eval function to evaluate our expressions:

eval :: Expr -> Int
eval (Val n)    = n
eval (Plus l r) = eval l + eval r

And a comp function as a compiler for our Expr language to our Instr instruction set:

comp :: Expr -> Program
comp (Val n) = [Push n]
comp (Plus l r) = comp l ++ comp r ++ [Add]

Our compiler will be considered correct if for any expression, evaluating it yields the same result as compiling and then executing it:

∀ e :: Expr, s :: Stack . eval e : s == exec (comp e) s

This can be proved by induction on e. The base case for Expr is for Vals, and we want to show that eval (Val n) : s == exec (comp (Val n)) s. This time, we start with the RHS:

exec (comp (Val n)) s
-- applying comp
= exec [Push n] s
-- applying exec
= exec [] (n : s)
-- applying exec
= (n : s)
-- un-applying eval
= eval (Val n) : s

Our inductive case to be proved is eval (Plus l r) : s == exec (comp (Plus l r)) s. Since the Plus constructor has two values of type Expr, there are two induction hypotheses:

for l: eval l : s == exec (comp l) s
for r: eval r : s == exec (comp r) s

exec (comp (Plus l r)) s
-- applying comp
= exec (comp l ++ comp r ++ [Add]) s
-- associativity of (++)
= exec (comp l ++ (comp r ++ [Add])) s
-- distributivity lemma
= exec (comp r ++ [Add]) (exec (comp l) s)
-- distributivity lemma
= exec [Add] (exec (comp r) (exec (comp l) s))
-- induction hypothesis
= exec [Add] (exec (comp r) (eval l : s))
-- induction hypothesis
= exec [Add] (eval r : (eval l : s))
-- applying exec
= exec [] ((eval l + eval r) : s)
-- applying exec
= (eval l + eval r) : s
-- un-applying eval
= eval (Plus l r) : s

The proof holds, but relies on a lemma proving the distributivity of the exec function, which states that executing a program where a list of instructions xs is followed by a list of instructions ys is the same as first executing xs and then executing ys with the stack that results from executing xs: ∀ xs ys::Program, s::Stack . exec (xs++ys) s == exec ys (exec xs s).

This can be proved by induction on xs. The base case is the empty list []: exec ([] ++ ys) s == exec ys (exec [] s):

exec ys (exec [] s)
-- applying exec
= exec ys s
-- un-applying (++)
= exec ([] ++ ys) s

The induction hypothesis is exec (xs++ys) s == exec ys (exec xs s). The inductive step is exec ((x : xs) ++ ys) s == exec ys (exec (x : xs) s). As x could be either Push n or Add, we perform case analysis on x, first with the case where x = Push n:

exec ys (exec (Push n : xs) s)
-- applying exec
= exec ys (exec xs (n : s))
-- induction hypothesis
= exec (xs ++ ys) (n : s)
-- un-applying exec
= exec (Push n : (xs ++ ys)) s
-- un-applying (++)
= exec ((Push n : xs) ++ ys) s

The inductive step holds for the Push n case. The Add case:

exec ys (exec (Add : xs) s)
-- assuming stack has at least 2 elements, so s = b : a : s'
= exec ys (exec (Add : xs) (b : a : s'))
-- applying exec
= exec ys (exec xs (a + b : s'))
-- induction hypothesis
= exec (xs ++ ys) (a + b : s')
-- un-applying exec
= exec (Add : (xs ++ ys)) (b : a : s')
-- un-applying (++)
= exec ((Add : xs) ++ ys) (b : a : s')
-- assumption
= exec ((Add : xs) ++ ys) s

This proves the inductive case for the Add instruction, and therefore the proof for the distributivity of exec lemma, which supported our initial proof of the correctness of our compiler.
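
The definitions used in the proof can be put together into a runnable sketch, with the correctness property written as a Boolean test. Note that the final `exec` equation (the explicit error for a stack that is too short) and the names `correctFor` and `example` are additions made for this example; the proof above simply assumes the stack is large enough.

```haskell
data Expr = Val Int | Plus Expr Expr

data Instr = Push Int | Add
type Program = [Instr]
type Stack = [Int]

exec :: Program -> Stack -> Stack
exec []           s           = s
exec (Push n : p) s           = exec p (n : s)
exec (Add : p)    (y : x : s) = exec p (x + y : s)
-- added for this sketch: fail loudly instead of with a pattern match error
exec (Add : _)    _           = error "exec: Add needs two values on the stack"

eval :: Expr -> Int
eval (Val n)    = n
eval (Plus l r) = eval l + eval r

comp :: Expr -> Program
comp (Val n)    = [Push n]
comp (Plus l r) = comp l ++ comp r ++ [Add]

-- the correctness property for one expression and one starting stack
correctFor :: Expr -> Stack -> Bool
correctFor e s = eval e : s == exec (comp e) s

-- 1 + (2 + 3)
example :: Expr
example = Plus (Val 1) (Plus (Val 2) (Val 3))
```

Compiled programs always push both operands before each Add, so the error case is never reached when running the output of `comp`.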

Functors & Foldables

The `$` Operator

The $ operator is an operator for function application. It has signature:

($) :: (a -> b) -> a -> b
f $ x = f x

At first it doesn't look like it does much, but it is actually defined as infixr 0, meaning it is:

An infix operator with right associativity
An operator with the lowest precedence possible

In contrast, normal function application is left associative and has the highest precedence possible. Practically, this means it can be used where you would otherwise have to use parentheses, to make code a lot cleaner. Some examples:

-- elem finds if an item x is contained in the list xs
elem :: Eq a => a -> [a] -> Bool
elem x xs = not (null (filter (==x) xs))
-- rewritten, without parentheses
elem x xs = not $ null $ filter (==x) xs
-- or using function composition (.)
elem x = not . null . filter (==x)

Another example, shown along with a trace of its reduction:

map ($ 4) [even, odd]
=> (even $ 4) : map ($ 4) [odd]
=> (even $ 4) : (odd $ 4) : []
=> True : (odd $ 4) : []
=> True : False : []
=> [True, False]

Foldables

It has already been shown how many examples of recursive functions can be rewritten with a fold. Folding is an example of a useful design pattern in functional programming.

A Trip to Michael's Tree Nursery

Binary trees are recursive data structures, that can be recursively operated on (much like lists). The example below shows a simple definition of a binary tree along with some functions to operate on it.

-- our binary tree type
data BinTree a = Leaf | Node (BinTree a) a (BinTree a)
 deriving Show

-- simple recursive functions
-- how big is the tree?
size :: BinTree a -> Int
size Leaf = 0
size (Node l _ r) = 1 + size l + size r

-- is x contained within the tree?
member:: Eq a => a -> BinTree a -> Bool
member _ Leaf = False
member x (Node l y r) = x == y || member x l || member x r

-- what is the sum of all the Nums in the tree
tsum :: Num a => BinTree a -> a
tsum Leaf =0
tsum (Node l n r) = n + tsum l + tsum r

These are all recursive functions operating on a tree, and can be generalised by defining our own version of a fold for trees, dubbed toldr. Note the similarities between foldr and toldr.

toldr :: (a -> b -> b) -> b -> BinTree a -> b
toldr f z Leaf = z
toldr f z (Node l x r) = f x (toldr f (toldr f z r) l)

tsum :: Num a => BinTree a -> a
tsum = toldr (+) 0

member :: Eq a => a -> BinTree a -> Bool
member x = toldr (\y r -> x==y || r) False

size :: BinTree a -> Int
size = toldr(\_ r -> 1 + r) 0

The `Foldable` Typeclass

This abstraction does actually exist in the standard library, as a typeclass. A type can be an instance of Foldable (like lists), which then allows foldr to be used on it.

class Foldable t where
  foldr :: (a -> b -> b) -> b -> t a -> b

-- for lists
-- exists in prelude
instance Foldable [] where
  foldr f z [] = z
  foldr f z (x:xs) = f x (foldr f z xs)

-- for our bintree
instance Foldable BinTree where
  foldr _ z Leaf         = z
  foldr f z (Node l x r) = f x (foldr f (foldr f z r) l)

This instance of Foldable for BinTree can now be used to generalise our functions that operate on it:

sum :: (Foldable t, Num a) => t a -> a
sum = foldr (+) 0

elem :: (Foldable t, Eq a) => a -> t a -> Bool
elem x = foldr (\y r -> x==y || r) False

length :: Foldable t => t a -> Int
length = foldr (\_ r -> 1 + r) 0

These methods are actually part of the Foldable typeclass, so when defining an instance of Foldable on some type, you get them for free, and they are polymorphic over all foldable types.

Foldable is also a derivable typeclass using the language extension -XDeriveFoldable, so all of this can be derived automatically.
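
A runnable sketch of the "for free" claim: with only foldr defined for BinTree (using the same equations as the instance above), the standard sum, length, elem, maximum and toList all work on trees. The value `tree` is a made-up example.

```haskell
import Data.Foldable (toList)

data BinTree a = Leaf | Node (BinTree a) a (BinTree a)

-- only foldr is defined; every other Foldable method uses its default
instance Foldable BinTree where
  foldr _ z Leaf         = z
  foldr f z (Node l x r) = f x (foldr f (foldr f z r) l)

--     2
--    / \
--   1   3
tree :: BinTree Int
tree = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)
```

With this particular instance `toList tree` visits the node value first, then the left subtree, then the right, giving `[2,1,3]`. A different but equally lawful instance could visit the elements in another order.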

Functors

Bringing back our safediv function from previously:

data Maybe a = Nothing | Just a

safediv :: Int -> Int -> Maybe Int
safediv _ 0 = Nothing
safediv x y = Just (x `div` y)

divAndAdd :: Int -> Int -> Maybe Int
divAndAdd x y = 5 + safediv x y -- doesn't work, type error

-- using a case statement
divAndAdd x y = case safediv x y of
  Nothing -> Nothing
  Just r -> Just (5+r)
-- bit messy

The pattern of applying a function to a value within a Maybe can be generalised. Defining a function pam to do this for us:

pam :: (a -> b) -> Maybe a -> Maybe b
pam _ Nothing = Nothing
pam f (Just x) = Just (f x)

-- much nicer!
divAndAdd :: Int -> Int -> Maybe Int
divAndAdd x y = pam (5+) (safediv x y)

It would be nice if there was some way to generalise the pattern of applying a function to element(s) in a container. The Functor typeclass does this for us. A type is a functor if we can map a function over the value(s) it contains. Lists are functors, as that is what the map function does. Maybe and BinTree are also functors.

class Functor f where
  fmap :: (a -> b) -> f a -> f b

instance Functor [] where
  fmap = map

instance Functor Maybe where
  fmap f Nothing = Nothing
  fmap f (Just x) = Just (f x)

instance Functor BinTree where
  fmap _ Leaf         = Leaf
  fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

Functors can be thought of as "boxes", and when given a function, will apply it to the value in the box, and return the result in the same box. Some examples of definitions using functors:

-- increases all Ints in the "box" by 5
incByFive :: Functor f => f Int -> f Int
incByFive = fmap (+5)

-- applies the odd function to all Ints in the box
odds :: Functor f => f Int -> f Bool
odds = fmap odd

-- redefining using fmap
divAndAdd :: Int -> Int -> Maybe Int
divAndAdd x y = fmap (5+) (safediv x y)

Functor is also another typeclass that can be derived by GHC, using the -XDeriveFunctor extension.
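
A minimal sketch of deriving the instance rather than writing it by hand, assuming GHC with the DeriveFunctor extension enabled through a pragma. The BinTree type is the one defined earlier; Eq is derived as well only so that results can be compared.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data BinTree a = Leaf | Node (BinTree a) a (BinTree a)
  deriving (Show, Eq, Functor)

-- works for any functor: lists, Maybe, BinTree, ...
incByFive :: Functor f => f Int -> f Int
incByFive = fmap (+ 5)
```

The same `incByFive` can then be applied to a list, a Maybe, or a tree without any changes.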

The `<$>` Operator

An operator that is essentially just an infix version of the fmap function.

infixl 4 <$>
(<$>) :: Functor f => (a -> b) -> f a -> f b
(<$>) = fmap

fmap (replicate 6) (safediv 8 4)
== replicate 6 <$> safediv 8 4
=> Just [2,2,2,2,2,2]

-- redefining using <$>
divAndAdd :: Int -> Int -> Maybe Int
divAndAdd x y = (5+) <$> (safediv x y)

Functor Laws

There are certain laws that functors must obey for their properties to hold. A type f is a functor if there exists a function fmap :: (a-> b) -> f a -> f b , and the following laws hold for it:

fmap id = id
- If the values in the functor are mapped to themselves, the result will be an unmodified functor
fmap (f.g) = (fmap f) . (fmap g)
- The fusion law
- If two fmaps are applied one after the other, the result must be the same as a single fmap which applies the two functions in turn
These laws imply that a data structure's "shape" does not change when fmapped
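
The compiler does not check these laws, so they have to be proved or at least tested by hand. A small sketch that spot-checks both laws for the Maybe and list instances on sample values (`identityLaw` and `fusionLaw` are illustrative names, and a passing test is of course not a proof):

```haskell
-- fmap id should leave the structure unchanged
identityLaw :: Bool
identityLaw =
  fmap id (Just 3) == id (Just (3 :: Int))
    && fmap id [1, 2, 3] == id [1, 2, 3 :: Int]

-- one fmap with a composed function equals two fmaps in sequence
fusionLaw :: Bool
fusionLaw =
  fmap ((+ 1) . (* 2)) xs == (fmap (+ 1) . fmap (* 2)) xs
  where
    xs = [1, 2, 3] :: [Int]
```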

Applicative Functors

Kinds

For the compiler to accept a program, it must be well typed
Kinds are the "types of types"
Types are denoted with expression :: type
- eg True :: Bool
Kinds are denoted the same: type :: kind
- Bool :: *
The compiler infers kinds of types the same way it infers types of expressions
* is the kind of types
Bool :: * because Bool has no type parameters
- data Bool = True | False
Maybe is parametrised over some type a, so the kind signature Maybe :: * -> * means that if given a type of kind * as an argument, the type constructor Maybe will give back some other type of kind * (for example, Maybe Int :: *)
[] :: * -> *
- [] is the type constructor for lists

Kinds are important when defining typeclasses. Take Functor, for example:

class Functor f where
  fmap :: (a -> b) -> f a-> f b

This definition shows that the type f is applied to one argument (f a), so f :: * -> *

-- Maybe :: * -> *
instance Functor Maybe where
  fmap f Nothing = Nothing
  fmap f (Just x) = Just (f x)

-- invalid
-- Maybe a :: *
-- As the type is already applied to a
instance Functor (Maybe a) where
  fmap f Nothing = Nothing
  fmap f (Just x) = Just (f x)

The `Either` Type

Either is usually used to represent the result of a computation when it could give one of two results. Right is used to represent success, and a is the wanted value. Left is used to represent error, with e as some error code/message.

data Either e a = Left e | Right a
Left :: e -> Either e a
Right :: a -> Either e a

Either has kind * -> * -> *, as it must be applied to two types e and a before we get some other type.

Only types of kind * -> * can be functors, so we need to apply Either to one argument first. The functor instance for Either applies the function to the Right value.

instance Functor (Either e) where
  fmap :: (a -> b) -> Either e a -> Either e b
  fmap f (Left x)  = Left x
  fmap f (Right y) = Right (f y)
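
A short usage sketch of this instance, using the Either type from the Prelude. The function `safeDivE` is a hypothetical variant of safediv that reports why it failed; it is not part of the module.

```haskell
-- hypothetical helper: division that explains its failure
safeDivE :: Int -> Int -> Either String Int
safeDivE _ 0 = Left "divide by zero"
safeDivE x y = Right (x `div` y)

-- fmap only touches a Right; a Left passes through unchanged
okResult, errResult :: Either String Int
okResult  = fmap (+ 5) (safeDivE 8 4)
errResult = fmap (+ 5) (safeDivE 8 0)
```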

The Unit Type `()`

() is called the unit type
() :: ()
- (), the unit value, has type ()
- () is the only value of type ()
Can be thought of as defined data () = ()
Or an empty tuple

Semigroups and Monoids

A type is a semigroup if it has some associative binary operation defined on it. This operator (<>) is the "combine" operator.

class Semigroup a where
  (<>) :: a -> a -> a

instance Semigroup [a] where
  -- (<>) :: [a] -> [a] -> [a]
  (<>) = (++)

instance Semigroup Int where
  -- (<>) :: Int -> Int -> Int
  (<>) = (+)

A type is a monoid if it is a semigroup that also has some identity value, called mempty:

class Semigroup a => Monoid a where
  mempty ::a

instance Monoid [a] where
  -- mempty :: [a]
  mempty = []

instance Monoid Int where
  -- mempty :: Int
  mempty = 0
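
One caveat when trying this out: the Int instances above are for illustration, and the standard library does not define them, because both (+) with 0 and (*) with 1 would be valid choices. Instead, base provides the wrapper types Sum and Product in Data.Monoid to pick one. A sketch using those (the names `joined`, `total` and `factorial5` are illustrative):

```haskell
import Data.Monoid (Sum (..), Product (..))

-- lists combine by concatenation, with [] as the identity
joined :: [Int]
joined = [1, 2] <> [3] <> mempty

-- mconcat combines a whole list of monoid values using <> and mempty
total :: Int
total = getSum (mconcat (map Sum [1, 2, 3, 4]))

-- the same numbers type, combined with the other monoid
factorial5 :: Int
factorial5 = getProduct (mconcat (map Product [1 .. 5]))
```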

Applicatives

Applicative Functors are similar to normal functors, except with a slightly different type definition:

class Functor f => Applicative f where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

The typeclass defines two functions:

pure just lifts the value a into the "box"
<*> (the apply operator) takes some function (a -> b) in a box f, and applies it to a value a in a box, returning the result in the same box.
- "box" is a rather loose analogy. It is more accurate to say "computational context".

Different contexts for function application:

-- vanilla function application
($) :: (a -> b) -> a -> b
-- Functor's fmap
(<$>) :: Functor f => (a -> b) -> f a -> f b
-- Applicative's apply
(<*>) :: Applicative f => f (a -> b) -> f a -> f b

Maybe and Either e are both applicative functors:

instance Applicative Maybe where
  pure x = Just x
  Nothing <*> _ = Nothing
  (Just f) <*> x = f <$> x

instance Applicative (Either e) where
  pure = Right
  Left err <*> _ = Left err
  Right f  <*> x = f <$> x

The "context" of both of these types is that they represent error. All data flow in haskell has to be explicit due to its purity, so these types allow for the propagation of error.

Another example of an applicative functor is a list:

instance Applicative [] where
  pure x = [x]
  fs <*> xs = [f x | f <- fs, x <- xs]

Every function in the left list is applied to every value in the right list:

[f, g] <*> [x, y, z]
=> [f x, f y, f z, g x, g y, g z]

g <$> [x,y] <*> [a,b,c]
=> [g x, g y] <*> [a,b,c]
=> [g x a, g x b, g x c, g y a, g y b, g y c]

The context represented by lists is nondeterminism, ie a function f given one of the arguments [x, y, z] could have result [f x, f y, f z].
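
The usual way these operators are used together is to apply an ordinary multi-argument function to arguments that each live in a context. A sketch using the standard Maybe and list instances (the names `bothJust`, `oneMissing` and `allPairs` are illustrative):

```haskell
-- both arguments present: the function is applied
bothJust :: Maybe Int
bothJust = (+) <$> Just 3 <*> Just 4

-- any Nothing makes the whole result Nothing
oneMissing :: Maybe Int
oneMissing = (+) <$> Just 3 <*> Nothing

-- for lists, every combination of arguments is produced
allPairs :: [(Int, Char)]
allPairs = (,) <$> [1, 2] <*> "ab"
```

The first `<$>` puts the partially applied function into the context, and each `<*>` then feeds it one more argument.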

Applicative Laws

Applicative functors, like normal functors, also have to obey certain laws:

pure id <*> x = x
- The identity law
- applying pure id does nothing
pure f <*> pure x = pure (f x)
- Homomorphism
- pure preserves function application
u <*> pure y = pure ($ y) <*> u
- Interchange
- Applying something to a pure value is the same as applying pure ($ y) to that thing
pure (.) <*> u <*> v <*> w = u <*> (v <*> w)
- Composition
- Function composition with (.) works within a pure context.

Left and Right Apply

<* and *> are two more operators, both defined automatically when <*> is defined.

const :: a -> b -> a
const x y = x

flip :: (a -> b -> c) -> b -> a -> c
flip f x y = f y x

(<*) :: Applicative f => f a -> f b -> f a
a0 <* a1 = const <$> a0 <*> a1

(*>) :: Applicative f => f a -> f b -> f b
a0 *> a1 = flip const <$> a0 <*> a1

In simple terms *> is used for sequencing actions, discarding the result of the first argument. <* is the same, except discarding the result of the second.

Just 4 <* Just 8
=> const <$> Just 4 <*> Just 8
=> Just (const 4) <*> Just 8
=> Just (const 4 8)
=> Just 4

Just 4 <* Nothing
=> const <$> Just 4 <*> Nothing
=> Just (const 4) <*> Nothing
=> Nothing

Just 4 *> Just 8
=> flip const <$> Just 4 <*> Just 8
=> Just (flip const 4) <*> Just 8
=> Just (flip const 4 8)
=> Just (const 8 4)
=> Just 8

Nothing *> Just 8
=> Nothing

These operators are perhaps easier to understand in terms of monadic actions:

as *> bs = do as
              bs
as *> bs = as >> bs

as <* bs = do a <- as
              bs
              pure a

Example: Logging

A good example to illustrate the uses of applicative functors is logging the output of a compiler. Suppose we have a function comp that takes some Expr type, representing compiler input, and returns some Program type, representing output:

comp :: Expr -> Program
comp (Val n) = [PUSH n]
comp (Plus l r) = comp l ++ comp r ++ [ADD]
-- extending to return a String for a log
comp :: Expr -> (Program, [String])
comp (Val n) = ([PUSH n],["compiling a value"])
comp (Plus l r) = (pl ++ pr ++ [ADD], "compiling a plus" : (ml ++ mr))
  where (pl, ml) = comp l
        (pr, mr) = comp r

This is messy, and it is not very clear what is going on. There is a much nicer way to do this, using the Writer type:

-- w is the "log"
-- a is the containing type (the type in the "box")
data Writer w a = MkWriter (a,w)
--type of MkWriter
MkWriter :: (a,w) -> Writer w a
-- kind of Writer type
Writer :: * -> * -> *

instance Functor (Writer w) where
  -- fmap :: (a -> b) -> Writer w a -> Writer w b
  fmap f (MkWriter (x,o)) = MkWriter (f x, o) -- applies the function to the x value

-- a function to write a log
-- generates a new writer with a msg and unit type in its box
writeLog :: String -> Writer [String] ()
writeLog msg = MkWriter ((), [msg])

Using this to redefine comp:

comp :: Expr -> Writer [String] Program
comp (Val n) = MkWriter ([PUSH n], m)
  where (MkWriter (_, m)) = writeLog "compiling a value"
comp (Plus l r) = MkWriter (pl ++ pr ++ [ADD], m ++ ml ++ mr)
  where (MkWriter (pl, ml)) = comp l
        (MkWriter (pr, mr)) = comp r
        (MkWriter (_, m))   = writeLog "compiling a plus"

This definition of comp combines the output using Writer, but is messy as it uses pattern matching to deconstruct the results of the recursive calls and then rebuild them into the result. It would be nice if there was some way to implicitly keep track of the log messages.

We can define an instance of the Applicative typeclass for Writer to do this. There is the additional constraint that w must be an instance of Monoid, because we need some way to combine the output of the log.

instance Monoid w => Applicative (Writer w) where
  --pure :: a -> Writer w a
  pure x = MkWriter (x, mempty)
  -- (<*>) :: Monoid w => Writer w (a -> b) -> Writer w a -> Writer w b
  MkWriter (f,o1) <*> MkWriter (x,o2) = MkWriter (f x, o1 <> o2)
  -- f is applied to x, and o1 and o2 are combined using their monoid instance

Using this definition, the comp function can be tidied up nicely using <*> and *>:

comp :: Expr -> Writer [String] Program
comp (Val n) = writeLog "compiling a value" *> pure [PUSH n]
comp (Plus l r) = writeLog "compiling a plus" *>
    ((\p p' -> p ++ p' ++ [ADD]) <$> comp l <*> comp r)

The first pattern uses *>. Recall that *> discards the left result, which in this case is the unit value, so only the result of the right Writer is kept: the [PUSH n] put into a Writer by pure, with mempty (here []) as its log. The logs of both sides are still combined, so the message from writeLog is preserved.

The second pattern applies the anonymous function (\p p' -> p ++ p' ++ [ADD]) to the result of the recursive calls. The lambda defines how the results of the recursive calls are combined together, and the log messages are automatically combined by the definition of <*>. *> is used again to add a log message to the program.
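Putting the pieces together, the sketch below is one way to make the applicative version runnable end to end. The Instr type, the exact shape of Expr and the runWriter helper are assumptions made for this example, as the notes do not define them.

```haskell
-- Assumed instruction and expression types (not given in the notes).
data Instr = PUSH Int | ADD deriving (Eq, Show)
type Program = [Instr]
data Expr = Val Int | Plus Expr Expr

data Writer w a = MkWriter (a, w)

instance Functor (Writer w) where
  fmap f (MkWriter (x, o)) = MkWriter (f x, o)

instance Monoid w => Applicative (Writer w) where
  pure x = MkWriter (x, mempty)
  MkWriter (f, o1) <*> MkWriter (x, o2) = MkWriter (f x, o1 <> o2)

writeLog :: String -> Writer [String] ()
writeLog msg = MkWriter ((), [msg])

comp :: Expr -> Writer [String] Program
comp (Val n)    = writeLog "compiling a value" *> pure [PUSH n]
comp (Plus l r) = writeLog "compiling a plus" *>
  ((\p p' -> p ++ p' ++ [ADD]) <$> comp l <*> comp r)

-- Hypothetical helper to get at the result and the log.
runWriter :: Writer w a -> (a, w)
runWriter (MkWriter p) = p

example :: (Program, [String])
example = runWriter (comp (Plus (Val 1) (Val 2)))
-- example == ( [PUSH 1, PUSH 2, ADD]
--            , ["compiling a plus", "compiling a value", "compiling a value"] )
```

The log comes out in the order the Writers appear from left to right, because <*> always combines the left log before the right one.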

Monads

ṱ̴̹͙̗̣̙ͮ͆͑̊̅h̸̢͔͍̘̭͍̞̹̀ͣ̅͢e̖̠ͫ̒ͦ̅̉̓̓́͟͞ ͑ͥ̌̀̉̐̂͏͚̤͜f͚͔͖̠̣͚ͤ͆ͦ͂͆̄ͥ͌o̶̡̡̝͎͎̥͖̰̭̠̊r̗̯͈̀̚b̢͙̺͚̅͝i̸̡̱̯͔̠̲̿dͧ̈ͭ̑҉͎̮d̆̓̂̏̉̏͌͆̚͝͏̺͓̜̪͓e̎ͯͨ͢҉͙̠͕͍͉n͇̼̞̙͕̮̣͈͓ͨ͐͛̽ͣ̏͆́̓ ̵ͧ̏ͤ͋̌̒͘҉̞̞̱̲͓k͔̂ͪͦ́̀͗͘n͇̰͖̓ͦ͂̇̂͌̐ȯ̸̥͔̩͒̋͂̿͌w̞̟͔̙͇̾͋̅̅̔ͅlͧ͏͎̣̲̖̥ẻ̴̢̢͎̻̹̑͂̆̽ͮ̓͋d̴̪͉̜͓̗̈ͭ̓ͥͥ͞g͊̾̋̊͊̓͑҉͏̭͇̝̰̲̤̫̥e͈̝̖̖̾ͬ̍͢͞

Monads are another level of abstraction on top of applicatives, and allow for much more flexible and expressive computation. Functors => Applicatives => Monads form a hierarchy of abstractions.

The `Monad` typeclass

class Applicative m => Monad m where
  (>>=) :: m a -> (a -> m b) -> m b

  return :: a -> m a
  return = pure

The >>= operator is called bind, and applies a function that returns a wrapped value, to another wrapped value.

The left operand is some monad containing a value of type a
The right operand is a function of type a -> m b, ie it takes some a and returns a monad containing something of type b
The result is a monad containing something of type b, ie a value of type m b

The operator can essentially be thought of as feeding the wrapped value into the function, to get a new wrapped value. x >>= f unwraps the value in x, and applies the function f to it. Understanding bind is key to understanding monads.

return is just the same as pure for applicatives, lifting the value a into some monadic context.

Some example monad instances:

instance Monad Maybe where
  Nothing >>= _ = Nothing
  Just x  >>= f = f x

instance Monad (Either e) where
  Left l >>= _ = Left l
  Right r >>= f = f r

  return = Right

instance Monad [] where
  xs >>= f = concat (map f xs)

Monads give effects: composing computations sequentially using >>= has an effect. With the State Monad this effect is "mutation". With Maybe and Either the effect is that we may raise a failure at any step. Effects only happen when we want them, implemented by pure functions.

Monad Laws

For a type to be a monad, it must satisfy the following laws:

return a >>= h = h a
- Left identity
m >>= return = m
- Right identity
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
- Associativity
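As with the applicative laws, these can be spot-checked for a given instance. The sketch below uses Maybe; the functions half and decrement are invented for the example, and a few passing checks do not amount to a proof.

```haskell
-- Two example functions of type a -> m b, both of which may fail.
half :: Int -> Maybe Int
half n = if even n then Just (n `div` 2) else Nothing

decrement :: Int -> Maybe Int
decrement n = if n > 0 then Just (n - 1) else Nothing

leftIdentity :: Bool
leftIdentity = (return 8 >>= half) == half 8

rightIdentity :: Bool
rightIdentity = (Just 8 >>= return) == Just (8 :: Int)

associativity :: Bool
associativity =
  ((m >>= half) >>= decrement) == (m >>= (\x -> half x >>= decrement))
  where m = Just 8
```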

Example: Evaluating an Expression

A type Expr is shown below that represents a mathematical expression, along with an eval function to evaluate it. Note that eval is actually unsafe and could crash at runtime due to a division by zero error. The safediv function avoids this by returning a Maybe.

data Expr = Val Int | Add Expr Expr | Div Expr Expr

eval :: Expr -> Int
eval (Val n)   = n
eval (Add l r) = eval l + eval r
eval (Div l r) = eval l `div` eval r

safediv :: Int -> Int -> Maybe Int
safediv x 0 = Nothing
safediv x y = Just (x `div` y)

If we want to use safediv with eval, we need to change its type signature. The updated eval is shown below using applicatives to write the function cleanly and propagate any errors:

eval :: Expr -> Maybe Int
eval (Val n) = Just n
eval (Add l r) = (+) <$> eval l <*> eval r
eval (Div l r) = safediv <$> eval l <*> eval r

If any recursive calls return a Nothing, the entire expression will evaluate to Nothing. Otherwise, the <$> and <*> will evaluate the expression within the Maybe context. However, this is still wrong, as the last equation now has type Maybe (Maybe Int): safediv already returns a Maybe, which is then wrapped in another. This can be fixed using >>=. Note the use of lambdas.

eval (Div l r) = eval l >>= \x ->
                 eval r >>= \y ->
                 x `safediv` y

The Expr type can be extended to include a conditional expression of the form If condition trueBranch falseBranch.

data Expr = Val Int
          | Add Expr Expr
          | Div Expr Expr
          | If Expr Expr Expr

eval :: Expr -> Maybe Int
eval (Val n)    = Just n
eval (Add l r)  = eval l >>= \x ->
                  eval r >>= \y ->
                  Just (x+y)
eval (Div l r)  = eval l >>= \x ->
                  eval r >>= \y ->
                  x `safediv` y
eval (If c t f) = ifA <$> eval c <*> eval t <*> eval f
  where ifA b x y = if b /= 0 then x else y

With this definition using applicatives, both branches of the conditional are always evaluated, so an error in the false branch makes the whole expression fail even when the condition selects the true branch. Using bind instead gives the correct semantics, as only the selected branch is evaluated:

eval' (If c t f) = eval' c >>= \b ->
    if b /= 0 then eval' t else eval' f
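The difference between the two versions of If can be seen by running both on an expression whose unused branch divides by zero. This is a sketch: the names evalA, evalM and sample are made up here so that both evaluators can live side by side.

```haskell
data Expr = Val Int | Add Expr Expr | Div Expr Expr | If Expr Expr Expr

safediv :: Int -> Int -> Maybe Int
safediv _ 0 = Nothing
safediv x y = Just (x `div` y)

-- Applicative If: all three sub-expressions are always evaluated.
evalA :: Expr -> Maybe Int
evalA (Val n)    = Just n
evalA (Add l r)  = (+) <$> evalA l <*> evalA r
evalA (Div l r)  = evalA l >>= \x -> evalA r >>= \y -> x `safediv` y
evalA (If c t f) = ifA <$> evalA c <*> evalA t <*> evalA f
  where ifA b x y = if b /= 0 then x else y

-- Monadic If: only the selected branch is evaluated.
evalM :: Expr -> Maybe Int
evalM (Val n)    = Just n
evalM (Add l r)  = (+) <$> evalM l <*> evalM r
evalM (Div l r)  = evalM l >>= \x -> evalM r >>= \y -> x `safediv` y
evalM (If c t f) = evalM c >>= \b -> if b /= 0 then evalM t else evalM f

-- The false branch divides by zero, but the condition selects the true branch.
sample :: Expr
sample = If (Val 1) (Val 2) (Div (Val 1) (Val 0))
-- evalA sample == Nothing
-- evalM sample == Just 2
```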

`<*>` vs `>>=`

Bind is a much more powerful abstraction than apply:

(<*>) :: m (a -> b) -> m a -> m b
(>>=) :: m a -> (a -> m b) -> m b

Apply operates on functions already inside a context
- This function can't determine anything to do with the context
- With a Maybe, it can't determine if the overall expression returns Nothing or not
Bind takes a function that returns a context, and can therefore determine more about the result of the overall expression
- It knows if it's going to return Nothing

`do` Notation

Notice the pattern of >>= being used with lambdas a fair amount. This can be tidied up with some nice syntactic sugar, called do notation. Rewriting the earlier example:

eval :: Expr -> Maybe Int
eval (Val n)   = return n
eval (Add l r) = do
    x <- eval l
    y <- eval r
    return (x+y)
eval (Div l r) = do
    x <- eval l
    y <- eval r
    x `safediv` y

This looks like imperative code, but is actually using monads behind the scenes. The arrows bind the results of the evaluation to some local definition, which can then be referred to further down the block.

A block must always end with an expression that returns a monadic value
- usually return, but safediv is used too
If any of the calls within the do block shown returns Nothing, the entire block will short-circuit to a Nothing.

Example: The `Writer` Monad

The example of Writer as an applicative instance can be extended to make it a Monad instance.

data Writer w a = MkWriter (a,w)

instance Functor (Writer w) where
  -- fmap :: (a -> b) -> Writer w a -> Writer w b
  fmap f (MkWriter (x,o)) = MkWriter(f x, o)

instance Monoid w => Applicative (Writer w) where
  -- pure :: Monoid w => a -> Writer w a
  pure x = MkWriter (x, mempty)
  -- <*> :: Monoid w => Writer w (a -> b) -> Writer w a -> Writer w b
  MkWriter (f,o1) <*> MkWriter (x,o2) = MkWriter (f x, o1 <> o2)

instance Monoid w => Monad (Writer w) where
  -- return :: Monoid w => a -> Writer w a
  return x = MkWriter (x, mempty) -- same as pure
  (MkWriter (x, o1)) >>= f = MkWriter (y, o1 <> o2)
                            where (MkWriter (y,o2)) = f x

Bind for Writer applies the function to the x value in the writer, then combines the two attached written values, and returns the new value from the result of f x along with the combined values.

Now we have a monad instance for the Writer monad, we can rewrite our comp function with do notation:

comp' :: Expr -> Writer [String] Program
comp' (Val n)    = do
    writeLog "compiling a value"
    pure [PUSH n]
comp' (Plus l r) = do
    writeLog "compiling a plus"
    pl <- comp' l
    pr <- comp' r
    pure (pl ++ pr ++ [ADD])

Type Level Programming

Type level programming is about encoding more information in our types, to make them more descriptive. The more descriptive types are, the easier it is to avoid runtime errors, as the type checker can do more at compile time.

The GHC language extensions used here are:

-XDataKinds
-XGADTs
-XKindSignatures
-XScopedTypeVariables
-XTypeFamilies

Type Promotion

As we already know, types have kinds:

Bool :: *
Maybe :: * -> *
[] :: * -> *
State :: * -> * -> *

Also recall that we have to partially apply type constructors with kinds greater than * -> * to use them as monads:

-- Maybe :: * -> *
instance Monad Maybe where
    ...

-- State :: * -> * -> *
instance Monad (State s) where
    ...

-- Either :: * -> * -> *
instance Monad Either where
    ... -- type error

instance Monad (Either e) where
    ... -- works

Type Promotion is used to define our own kinds. The DataKinds extension allows for this. Without DataKinds, data Bool = True | False gives us one type, Bool, with two value constructors, True and False. At the three levels in Haskell:

At the kind-level: *
At the type-level: Bool
At the value-level: True or False

With DataKinds, we also get the following two new types, both of kind Bool:

'True :: Bool
'False :: Bool

The value constructors True and False have been promoted to the type level as 'True and 'False. A new kind is introduced too, Bool instead of just *. We now have booleans at the type level.

DataKinds promotes all value constructors to type constructors, and all type constructors to kinds.

Another example is recursively defined natural numbers. Zero is 0, and Succ n is n + 1.

data Nat = Zero | Succ Nat

-- values :: types
Zero :: Nat
Succ :: Nat -> Nat

-- types :: kinds
'Zero :: Nat
'Succ :: Nat -> Nat

Generalised Algebraic Data Types

GADTs allow for more expressive type definitions. Normal ADT syntax:

data Bool = True | False
-- gives two values
True :: Bool
False :: Bool

Usually, we define the type and its values, which yields two value constructors. With a GADT, we explicitly specify the type of each data constructor:

data Bool where
  True :: Bool
  False :: Bool

data Nat where
  Zero :: Nat
  Succ :: Nat -> Nat

The example below defines a recursively defined Vector type.

-- Normally
data Vector a = Nil | Cons a (Vector a)

-- GADT
data Vector a where
  Nil  :: Vector a
  Cons :: a -> Vector a -> Vector a

Example: A Safe Vector

The vector definition above can use another feature, called KindSignatures, to put more detail into the type of the GADT definition:

data Vector (n :: Nat) a where
  Nil :: Vector n a
  Cons :: a -> Vector n a -> Vector n a

This definition includes an n to encode the size of the vector in the type. n is a type of kind Nat, as defined above. The values and types were promoted using DataKinds. The type variable n can also be replaced with concrete types:

data Vector (n :: Nat) a where
  Nil :: Vector 'Zero a
  Cons :: a -> Vector n a -> Vector ('Succ n) a

-- example
cakemix :: Vector ('Succ ('Succ 'Zero)) String
cakemix = Cons "Fish-Shaped rhubarb" (Cons "4 large eggs" Nil)

This further constrains the types to make the types more expressive. Now we have the length of the list expressed at type level, we can define a safer version of the head function that rejects zero-length lists at compile time.

vhead :: Vector ('Succ n) a -> a
-- this case doesn't make sense, so it is caught at compile time
-- (depending on the GHC version, rejected as an error or flagged as inaccessible)
vhead Nil = undefined
vhead (Cons x xs) = x

We can also define a zip function for the vector type that forces inputs to be of the same length. The type variable n tells the compiler in the type signature that both vectors should have the same length.

vzip :: Vector n a -> Vector n b -> Vector n (a,b)
vzip Nil Nil = Nil
vzip (Cons x xs) (Cons y ys) = Cons (x,y) (vzip xs ys)

Singleton types

Singletons are types with a 1:1 correspondence between types and values: every type has only a single value. The following GADT is a singleton type for natural numbers. The (n :: Nat) in the type definition annotates the type with its corresponding value at type level. The type is parametrised over n, where n is the value of the type, at type level.

data SNat (n :: Nat) where
    SZero :: SNat 'Zero
    SSucc :: SNat n -> SNat ('Succ n)

-- there is only one value of type SNat 'Zero
szero :: SNat 'Zero
szero = SZero

-- singleton value for one and its type
sone :: SNat ('Succ 'Zero)
sone = SSucc SZero

stwo :: SNat ('Succ ('Succ 'Zero))
stwo = SSucc sone

There is only one value of each type. The data is stored at both the value and type level.

This can be used to define a replicate function for the vector:

vreplicate :: SNat n -> a -> Vector n a
vreplicate SZero x = Nil
vreplicate (SSucc n) x = Cons x (vreplicate n x)

The length of the vector we want is SNat n at type level, which is a singleton type. This allows us to be sure that the vector we are outputting is the same size as what we told it, making sure this type checks.
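The definitions from this section can be collected into one module to see them type check together. This is a sketch: the toList helper and the names three and pairs are additions for the example, and vhead is written without the impossible Nil case.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Zero | Succ Nat

data Vector (n :: Nat) a where
  Nil  :: Vector 'Zero a
  Cons :: a -> Vector n a -> Vector ('Succ n) a

data SNat (n :: Nat) where
  SZero :: SNat 'Zero
  SSucc :: SNat n -> SNat ('Succ n)

-- No Nil case is needed: a Vector ('Succ n) can only be built with Cons.
vhead :: Vector ('Succ n) a -> a
vhead (Cons x _) = x

vzip :: Vector n a -> Vector n b -> Vector n (a, b)
vzip Nil         Nil         = Nil
vzip (Cons x xs) (Cons y ys) = Cons (x, y) (vzip xs ys)

vreplicate :: SNat n -> a -> Vector n a
vreplicate SZero     _ = Nil
vreplicate (SSucc n) x = Cons x (vreplicate n x)

-- Hypothetical helper: forget the length so the contents can be inspected.
toList :: Vector n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

three :: SNat ('Succ ('Succ ('Succ 'Zero)))
three = SSucc (SSucc (SSucc SZero))

-- Both arguments to vzip must have length three, or this is a type error.
pairs :: [(Char, Int)]
pairs = toList (vzip (vreplicate three 'x') (Cons 1 (Cons 2 (Cons 3 Nil))))
-- pairs == [('x',1),('x',2),('x',3)]
```

Changing three to a singleton of a different length, or dropping one Cons from the second vector, should make the module fail to compile rather than fail at runtime.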

Proxy Types & Reification

We are storing data at the type level, which allows us to access the data at compile time and statically check it. If we want to access that data at runtime, for example to find the length of a vector, we need a proxy type. Proxy types allow for turning type level data to values, ie turning a type level natural number (Nat) into an Int. Haskell has no types at runtime (due to type erasure), so proxies are a hack around this.

-- a type NatProxy parametrised over some type a of kind Nat
data NatProxy (a :: Nat) = MkProxy
-- NatProxy :: Nat -> *
-- MkProxy :: NatProxy a

This proxy type is parametrised over some type a of kind Nat, but there are never actually any values of type a involved; the information is at the type level. a is a phantom type.

zeroProxy :: NatProxy 'Zero
zeroProxy = MkProxy

oneProxy :: NatProxy ('Succ 'Zero)
oneProxy = MkProxy

These two proxies have the same value, but different types. The Nat type is in the phantom type a at type level.

We can then define a type class, called FromNat, that is parametrised over some type n of kind Nat:

class FromNat (n :: Nat) where
  fromNat :: NatProxy n -> Int

The function fromNat takes a NatProxy, our proxy type, and converts it to an Int. Instances can be defined for the two constructors of Nat to allow us to convert the type level Nats to Ints.

-- instance for 'Zero
instance FromNat 'Zero where
  -- fromNat :: NatProxy 'Zero -> Int
  fromNat _ = 0

instance FromNat n => FromNat ('Succ n) where
    fromNat _ = 1 + fromNat (MkProxy :: NatProxy n)

The arguments to these functions are irrelevant, as the info is in the types. The variable n refers to the same type variable as in the instance head, using scoped type variables. This hack allows for passing types to functions using proxies, and then converting them to values using reification.
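One use mentioned earlier is finding the length of a vector at runtime. The sketch below combines the proxy class with the Vector type; vlength and cakemixLength are names invented for this example, not part of the notes.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, ScopedTypeVariables #-}

data Nat = Zero | Succ Nat

data Vector (n :: Nat) a where
  Nil  :: Vector 'Zero a
  Cons :: a -> Vector n a -> Vector ('Succ n) a

data NatProxy (a :: Nat) = MkProxy

class FromNat (n :: Nat) where
  fromNat :: NatProxy n -> Int

instance FromNat 'Zero where
  fromNat _ = 0

instance FromNat n => FromNat ('Succ n) where
  fromNat _ = 1 + fromNat (MkProxy :: NatProxy n)

-- Hypothetical helper: the length is read from the type, not by walking
-- the vector. The explicit forall brings n into scope for the annotation.
vlength :: forall n a. FromNat n => Vector n a -> Int
vlength _ = fromNat (MkProxy :: NatProxy n)

cakemix :: Vector ('Succ ('Succ 'Zero)) String
cakemix = Cons "Fish-Shaped rhubarb" (Cons "4 large eggs" Nil)

cakemixLength :: Int
cakemixLength = vlength cakemix
-- cakemixLength == 2
```

The vector argument is ignored entirely; the FromNat constraint is what carries the length to runtime.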

Type Families

Type families allow for performing computation at the type level. A type family can be defined to allow addition of two type-level natural numbers:

type family Add (n :: Nat) (m :: Nat) :: Nat where
  Add 'Zero m = m
  Add ('Succ n) m = 'Succ (Add n m)

-- alternatively
type family (n :: Nat) + (m :: Nat) :: Nat where
  'Zero   + m = m
  'Succ n + m = 'Succ (n + m)

The type family for (+) is what's known as a closed type family: once it's defined it cannot be redefined or added to. This type family can be used to define an append function for our vector:

vappend :: Vector n a -> Vector m a -> Vector (n+m) a
vappend Nil         ys = ys
vappend (Cons x xs) ys = Cons x (vappend xs ys)
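To see the type-level addition being checked, the sketch below uses the prefix Add family (which avoids needing the TypeOperators extension) and annotates the result of an append with its expected length. The toList helper and the name joined are assumptions for this example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

data Nat = Zero | Succ Nat

data Vector (n :: Nat) a where
  Nil  :: Vector 'Zero a
  Cons :: a -> Vector n a -> Vector ('Succ n) a

type family Add (n :: Nat) (m :: Nat) :: Nat where
  Add 'Zero     m = m
  Add ('Succ n) m = 'Succ (Add n m)

vappend :: Vector n a -> Vector m a -> Vector (Add n m) a
vappend Nil         ys = ys
vappend (Cons x xs) ys = Cons x (vappend xs ys)

-- Hypothetical helper to inspect the contents.
toList :: Vector n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- The compiler reduces Add 2 1 to 3, so this signature is accepted.
joined :: Vector ('Succ ('Succ ('Succ 'Zero))) Int
joined = vappend (Cons 1 (Cons 2 Nil)) (Cons 3 Nil)
-- toList joined == [1, 2, 3]
```

Giving joined a signature with any other length should be rejected at compile time, since the family would reduce to a different type.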

Importing GHC.TypeLits allows for the use of integer literals at type level instead of writing out long recursive type definitions for Nat. This means we can now do:

data Vector (n :: Nat) a where
  Nil :: Vector 0 a
  Cons :: a -> Vector n a -> Vector (n+1) a

vappend Nil          Nil          :: Vector 0 a
vappend (Cons 4 Nil) Nil          :: Vector 1 Int
vappend (Cons 4 Nil) (Cons 8 Nil) :: Vector 2 Int

Associated (Open) Type Families

The definition below defines a typeclass for a general collection of items:

class Collection c where
  empty :: c a
  insert :: a -> c a -> c a
  member :: a -> c a -> Bool

instance Collection [] where
  empty = []
  insert x xs = x : xs
  member x xs = x `elem` xs

However, the list instance will throw an error, as elem has an Eq constraint on it, while the member type from the typeclass doesn't. Another example, defining the red-black tree as an instance of Collection (the tree is defined in one of the lab sheets):

instance Collection Tree where
  empty = empty
  insert x t = insert t x
  member x t = member x t

This will raise two type errors, as both insert and member for the tree need Ord constraints, which Collection doesn't have.

To fix this, we can attach an associated type family to a type class.

class Collection c where
  type family Elem c :: *

  empty :: c
  insert :: Elem c -> c -> c
  member :: Elem c -> c -> Bool

For an instance of Collection for some type c, we must also define a case for c for the type level function Elem, thus establishing a relation between c and some type of kind *.

We can now define instances for list and tree, where the Eq and Ord constraints are placed in the instance definitions.

instance Eq a => Collection [a] where
    type Elem [a] = a

    empty = []
    insert x xs = x : xs
    member x xs = x `elem` xs


instance Ord a => Collection (L.Tree a) where
    type Elem (L.Tree a) = a

    empty      = L.Leaf
    insert x t = L.insert t x
    member x t = L.member x t
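A compilable sketch of the list instance is shown below, with the element type of each method written as Elem c so that it is tied to the collection type. The tree instance is left out because the lab's Tree module is not available here; hasThree and hasTwo are names made up for the usage example.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

class Collection c where
  type Elem c :: Type
  empty  :: c
  insert :: Elem c -> c -> c
  member :: Elem c -> c -> Bool

instance Eq a => Collection [a] where
  type Elem [a] = a
  empty       = []
  insert x xs = x : xs
  member x xs = x `elem` xs

-- Elem [Int] reduces to Int, so the literals are checked as Ints.
hasThree :: Bool
hasThree = member 3 (insert 3 (insert 1 (empty :: [Int])))
-- hasThree == True

hasTwo :: Bool
hasTwo = member 2 (insert 3 (insert 1 (empty :: [Int])))
-- hasTwo == False
```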

ES191

A (yet incomplete) collection of notes for ES191 Electrical and Electronic Circuits.
This one aims to be fairly comprehensive, so let me know if you think anything is missing. If you're looking for notes on digital logic, see CS132

Other Useful Resources

Circuit Symbols and Conventions

Circuits model electrical systems

Voltage is work done per unit charge
Potential difference: the difference in electrical potential between two points in an electric field
A force used to move charge between two points in space

$V = \frac{W}{q} = \frac{d W}{d q}$

Moving charges produce an electric current
Moving charges can do electrical work the same way moving objects do mechanical work

$I = \frac{d q}{d t}$

Electrical energy is the capacity to do electrical work
Electrical power is the rate at which work is done

$P = I V = \frac{d q}{d t} \cdot \frac{d W}{d q} = \frac{d W}{d t}$

Resistance

Resistance is the opposition to the flow of current
Ohm's Law:

$R = \frac{V}{I}$

Resistance is also proportional to the Resistivity of the material
- $l$ and $A$ are the length and area of the conductor, respectively.

$R = \frac{ρ \cdot l}{A}$

Sources and Nodes

Everything in a circuit can be modelled as either a source, or a node.

Voltage Sources

DC and AC voltage sources
DC source has positive and negative terminals
Ideal voltage source has 0 internal resistance (infinite conductance)
Supplies constant voltage regardless of load
- This is an assumption, is not the case in reality

Current Sources

Ideal current source has infinite resistance (0 conductance)
Supplies constant current regardless of load
- Also an assumption
- In reality, will have some internal resistance and therefore a maximum power limit

Dependent sources

Diamond-shaped
Sources depend on values in other parts of the circuit
Model real sources more accurately

Nodes

All passive elements: generate no electrical power.

Resistors provide resistance/impedance in Ohms ( $Ω$ )
Inductors provide inductance in Henries ( $H$ )
Capacitors provide capacitance in Farads ( $F$ )

The voltage rise across an impedance conducting current is in opposition to the flow of current in the impedance.

Basic Conventions

Electrical current always flows from high to low potential.

If the direction of the current in a circuit is such that it leaves the positive terminal of a voltage source and enters the negative terminal, then the voltage is designated as negative
If the direction of the current is such that it leaves the negative and enters the positive, then the voltage is positive
- The sign of the loop current is the terminal that it flows into

The power absorbed/produced by a source is $P = I V$ .

A voltage source is absorbing power if it is supplying a negative current
A voltage source is producing power if it is supplying a positive current

The power dissipated in a resistor is $P = I^{2} R$ .

Resistors in series and parallel

Resistors in series: $R_{t} = R_{1} + R_{2}$

Resistors in parallel: $\frac{1}{R _{t}} = \frac{1}{R _{1}} + \frac{1}{R _{2}}$

Resistors dissipate electrical power, so there is a drop in voltage across them, in the direction of current flow. Therefore, the voltage rise is in opposition to the direction of current.

Voltage dividers

Using two resistors to divide a voltage

In the general case:

$V_{o u t} = V_{in} \times \frac{Z _{2}}{Z _{1} + Z _{2}}$

Current Dividers

Similar deal to voltage divider

$I_{R 1} = I_{T} \times \frac{R _{2}}{R _{1} + R _{2}}$

$I_{R 2} = I_{T} \times \frac{R _{1}}{R _{1} + R _{2}}$
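Both expressions follow from Ohm's law applied to the parallel pair. As a short derivation, assuming $R_{1}$ and $R_{2}$ are in parallel and share a voltage $V$ (a symbol introduced here for the working):

```latex
\begin{align*}
V      &= I_{T} \cdot \frac{R_{1} R_{2}}{R_{1} + R_{2}} \\
I_{R1} &= \frac{V}{R_{1}} = I_{T} \cdot \frac{R_{2}}{R_{1} + R_{2}} \\
I_{R2} &= \frac{V}{R_{2}} = I_{T} \cdot \frac{R_{1}}{R_{1} + R_{2}}
\end{align*}
```

Note that each branch current is scaled by the other resistor, so the larger share of the current flows through the smaller resistance.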

Nodal Analysis

Kirchhoff's Current Law

The sum of currents entering a node is equal to the sum of currents leaving a node.

$- I_{1} - I_{2} + I_{3} + I_{4} + I_{5} = 0$

$I_{1} + I_{2} = I_{3} + I_{4} + I_{5}$

Currents flowing into a node are denoted as negative
Currents flowing out of a node are denoted positive
The sum of currents around a node must always be 0

Nodal Analysis

A technique used to analyse circuits to calculate unknown quantities. Allows the voltage at each circuit node to be calculated, using KCL.

An important point to remember is that the bottom of any circuit diagram is ground (0V), by convention.

Steps

Choose 1 node as the reference node
Label any remaining voltage nodes $V_{1}$, $V_{2}$, etc.
Substitute any known voltages
Apply KCL at each unknown node to form a set of simultaneous equations
Solve simultaneous equations for unknowns
Calculate any required values (usually currents)

Generally speaking, there will be a nodal equation for each node, formed using KCL, and these equations are then solved simultaneously.

Example

Calculate the voltages at nodes $V_{1}$ and $V_{2}$ .

There are 4 currents at $V_{1}$

Flowing from 15V source to $V_{1}$ across 2 $Ω$ resistor
Flowing from $V_{1}$ to ground across 16 $Ω$ resistor
Flowing between $V_{1}$ and $V_{2}$ across 7 $Ω$ resistor
5A, from current source

Each current is calculated using Ohm's law, which gives the following nodal equation:

$\frac{V _{1} - 15}{2} + \frac{V _{1}}{16} + 5 + \frac{V _{1} - V _{2}}{7} = 0$

When the direction of a current is not known, it is assumed to be positive (flowing out of the node): the voltage at the node being considered is taken as positive, and the voltage at the other end of each branch is subtracted from it. The same can be done for node $V_{2}$ :

$\frac{V _{2} - V _{1}}{7} - 8 + \frac{V _{2} + 30}{9} = 0$

We now have two equations with two unknowns, which can easily be solved.

$11.29 V_{1} - 2.29 V_{2} = 40$

$- 9 V_{1} + 16 V_{2} = 294$

$V_{1} = 8.2 V, V_{2} = 23.0 V$

Admittance Matrices

The system of equations above can also be represented in matrix form

$\begin{pmatrix} 11.29 & -2.29 \\ -9 & 16 \end{pmatrix} \begin{pmatrix} V_{1} \\ V_{2} \end{pmatrix} = \begin{pmatrix} 40 \\ 294 \end{pmatrix}$

This matrix equation always takes the form $Y \times V = I$ .

$Y$ is known as the Admittance Matrix.
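As a rough numerical check of the node voltages from the example, a 2x2 system of this form can be solved with Cramer's rule. This is only a sketch: solve2 and nodeVoltages are names made up here, and the coefficients are the rounded ones from the two nodal equations.

```haskell
-- Solve [[a, b], [c, d]] * [x, y] = [p, q] by Cramer's rule.
-- Returns Nothing when the matrix is singular.
solve2 :: (Double, Double) -> (Double, Double) -> (Double, Double)
       -> Maybe (Double, Double)
solve2 (a, b) (c, d) (p, q)
  | det == 0  = Nothing
  | otherwise = Just ((p * d - b * q) / det, (a * q - c * p) / det)
  where det = a * d - b * c

-- Rows of Y, followed by the right-hand side I.
nodeVoltages :: Maybe (Double, Double)
nodeVoltages = solve2 (11.29, -2.29) (-9, 16) (40, 294)
-- approximately Just (8.2, 23.0)
```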

Calculating Power Dissipated

Sometimes, it is required that the power dissipated by voltage/current sources is calculated. For example, calculate the power supplied by the current sources in the following:

KCL at node $V_{1}$ : $2 + \frac{V _{1} - V _{2}}{1} + \frac{V _{1} - V _{3}}{2} = 0$

KCL at node $V_{2}$ : $- 3 + \frac{V _{2}}{4} + \frac{V _{2} - V _{1}}{1} = 0$

KCL at node $V_{3}$ : $3 + \frac{V _{3}}{3} + \frac{V _{3} - V _{1}}{2} = 0$

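$\begin{pmatrix} 3 & -2 & -1 \\ 4 & -5 & 0 \\ 3 & 0 & -5 \end{pmatrix} \begin{pmatrix} V_{1} \\ V_{2} \\ V_{3} \end{pmatrix} = \begin{pmatrix} -4 \\ -12 \\ 18 \end{pmatrix} \implies \begin{pmatrix} V_{1} \\ V_{2} \\ V_{3} \end{pmatrix} = \begin{pmatrix} -3.5 \\ -0.4 \\ -5.7 \end{pmatrix}$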

From the node voltages, the power dissipated in the sources can be calculated. In the 2A source:

$P = I V = 2 \times (0 - V_{1}) = 2 \times (0 - (-3.5)) = 7 W$

And in the 3A source:

$P = 3 \times (V_{2} - V_{3}) = 3 \times (- 0.4 + 5.7) = 15.9 W$

Note that the voltage across the current source is always calculated as the node the current is flowing to, minus the node the current is flowing from, ie (to - from). This makes the sign correct so it is known whether the source is delivering or absorbing power. If the direction of the current source opposes the direction of the voltage rise, it will be absorbing power.

If correct, the total power delivered to the circuit will equal the total dissipated. This calculation can be done to check, if you're bothered.
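For this example the check works out as follows. The resistor positions are read off the KCL equations rather than the circuit diagram, so treat them as an assumption: 1 $Ω$ between $V_{1}$ and $V_{2}$, 2 $Ω$ between $V_{1}$ and $V_{3}$, 4 $Ω$ from $V_{2}$ to ground, and 3 $Ω$ from $V_{3}$ to ground.

```latex
\begin{align*}
P_{1\Omega} &= \frac{(V_{1} - V_{2})^{2}}{1} = (-3.1)^{2} = 9.61\,\mathrm{W} \\
P_{2\Omega} &= \frac{(V_{1} - V_{3})^{2}}{2} = \frac{2.2^{2}}{2} = 2.42\,\mathrm{W} \\
P_{4\Omega} &= \frac{V_{2}^{2}}{4} = \frac{(-0.4)^{2}}{4} = 0.04\,\mathrm{W} \\
P_{3\Omega} &= \frac{V_{3}^{2}}{3} = \frac{(-5.7)^{2}}{3} = 10.83\,\mathrm{W} \\
P_{\text{dissipated}} &= 9.61 + 2.42 + 0.04 + 10.83 = 22.9\,\mathrm{W} \\
P_{\text{delivered}} &= 7 + 15.9 = 22.9\,\mathrm{W}
\end{align*}
```

The two totals agree, which suggests the node voltages and source powers are consistent.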

Dependent Sources

Some circuits contain current/voltage sources which are dependent upon other values in the circuit. In the example below, a current $I$ is assumed between the two nodes where the dependent voltage source is.

Calculate the power dissipated by the 50 $Ω$ resistor, and the power delivered by the current source.

At Node $V_{1}$ : $\frac{V _{1} - 50}{5} + \frac{V _{1}}{50} + I = 0$

At Node $V_{2}$ : $- I + \frac{V _{2}}{100} - 4 = 0$

We have two equations in 3 unknowns, so another equation is needed. Using $I_{a}$ :

$I_{a} = \frac{V _{1} - 50}{5}$ and $10 I_{a} = V_{2} - V_{1}$ . Eliminating $I_{a}$ between these gives $V_{2} - V_{1} = 2 V_{1} - 100$

This system of equations solves to give $V_{1} = 60 V$ , and $V_{2} = 80 V$ .

Therefore,

The power delivered by the current source $P = I V = 4 \times 80 = 320 W$
The power dissipated by the 50 $Ω$ resistor is $P = \frac{V ^{2}}{R} = \frac{6 0 ^{2}}{50} = 72 W$

Mesh Analysis

Achieves a similar thing to nodal analysis, using Kirchhoff's voltage law, and meshes instead of nodes.

Kirchhoff's Voltage Law

The sum of voltages around a closed loop always equals zero

$- V_{1} + V_{2} - V_{3} - V_{4} = 0$

Sign convention

If voltage rise and current in a voltage source are in the same direction, the voltage is denoted as negative
If voltage rise and current are in opposite direction, voltage is positive
In a resistor, current opposes voltage rise

Steps

Identify meshes (loops) (always clockwise) and assign currents $I_{1}, I_{2},$ etc to those loops
Apply KVL to each mesh to generate system of equations
Solve equations

Where there are elements that are part of multiple meshes, subtract the currents of the other meshes from the mesh currently being considered to consider the total current through that circuit element.

Example

There are three meshes in this circuit, labelled $I_{1}$ , $I_{2}$ , $I_{3}$ .

For $I_{1}$ : $- 50 + 70 I_{1} + 20 (I_{1} - I_{2}) + 30 (I_{1} - I_{3}) + 40 I_{1} = 0$

For $I_{2}$ : $20 (I_{2} - I_{1}) + 100 I_{2} + 80 I_{2} + 10 (I_{2} - I_{3}) = 0$

For $I_{3}$ : $30 (I_{3} - I_{1}) + 10 (I_{3} - I_{2}) + 60 I_{3} + 90 I_{3} = 0$

This forms a system of equations:

$160 I_{1} - 20 I_{2} - 30 I_{3} = 50$

$- 20 I_{1} + 210 I_{2} - 10 I_{3} = 0$

$- 30 I_{1} - 10 I_{2} + 190 I_{3} = 0$

Solving yields $I_{1} \approx 327 mA$ , $I_{2} \approx 34 mA$ , and $I_{3} \approx 53 mA$ .

Impedance Matrices

Similar to how systems of equations from nodal analysis form admittance matrices, mesh analysis forms impedance matrices which describe the circuit being analysed. The matrix equation takes the form $Z \cdot I = V$ . As an example, the matrix equation for the system above is:

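$\begin{pmatrix} 160 & -20 & -30 \\ -20 & 210 & -10 \\ -30 & -10 & 190 \end{pmatrix} \begin{pmatrix} I_{1} \\ I_{2} \\ I_{3} \end{pmatrix} = \begin{pmatrix} 50 \\ 0 \\ 0 \end{pmatrix}$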

Therefore, the impedance matrix for the system is:

$Z = \begin{pmatrix} 160 & -20 & -30 \\ -20 & 210 & -10 \\ -30 & -10 & 190 \end{pmatrix}$
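The mesh currents can be checked numerically by solving $Z \cdot I = V$ with Cramer's rule. The sketch below hard-codes the 3x3 case; det3, solve3 and meshCurrents are names invented for this example.

```haskell
type Vec3 = (Double, Double, Double)

-- Determinant of a 3x3 matrix given as three rows.
det3 :: Vec3 -> Vec3 -> Vec3 -> Double
det3 (a, b, c) (d, e, f) (g, h, i) =
  a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

-- Cramer's rule: replace one column of Z with V at a time.
-- Returns Nothing when Z is singular.
solve3 :: Vec3 -> Vec3 -> Vec3 -> Vec3 -> Maybe Vec3
solve3 r1@(a, b, c) r2@(d, e, f) r3@(g, h, i) (p, q, r)
  | dz == 0   = Nothing
  | otherwise = Just (d1 / dz, d2 / dz, d3 / dz)
  where
    dz = det3 r1 r2 r3
    d1 = det3 (p, b, c) (q, e, f) (r, h, i)
    d2 = det3 (a, p, c) (d, q, f) (g, r, i)
    d3 = det3 (a, b, p) (d, e, q) (g, h, r)

-- Rows of the impedance matrix, followed by the voltage vector.
meshCurrents :: Maybe Vec3
meshCurrents =
  solve3 (160, -20, -30) (-20, 210, -10) (-30, -10, 190) (50, 0, 0)
-- approximately Just (0.327, 0.034, 0.053), in amps
```

Cramer's rule is fine for small hand-checkable systems like this one; for larger circuits a general elimination routine would be the more usual choice.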

Another Example

Determine the currents in the circuit shown below:

Loop 1: $- 10 + 10 I_{1} + 5 (I_{1} - I_{2}) = 0$

Loop 2: $5 (I_{2} - I_{1}) + 20 (I_{2} - I_{3}) + V + 15 I_{2} = 0$

Where there is a current source, a voltage $V$ is assumed across it.

Loop 3: $2 I_{3} - 20 + 20 (I_{3} - I_{2}) = 0$

There are now 3 equations with 4 unknowns. However, it can be seen from the diagram that $I_{2} = - 4 A$ (the direction of the current source opposes our clockwise current), so the system can be solved as follows:

$I_{2} = - 4 A$

$I_{1} = \frac{10 + 5 I _{2}}{15} = - 0.67 A$

$I_{3} = \frac{20 + 20 I _{2}}{22} = - 2.73 A$

Example with dependent sources

Calculate the power dissipated in the 4 $Ω$ resistor and the power delivered/absorbed by the current dependent voltage source.

KVL round $I_{1}$ :

$I_{1} + 4 (I_{1} - I_{3}) + 5 (I_{1} - I_{2}) = 0$

KVL round $I_{2}$ :

$5 (I_{2} - I_{1}) + 20 (I_{2} - I_{3}) - 50 = 0$

KVL round $I_{3}$ :

$15 I_{a} + 20 (I_{3} - I_{2}) + 4 (I_{3} - I_{1}) = 0$

$I_{a} = I_{2} - I_{3}$ , so this can be substituted into equation 3 to obtain a fourth equation:

$15 (I_{2} - I_{3}) + 20 (I_{3} - I_{2}) + 4 (I_{3} - I_{1}) = 0$

The system of equations then solves:

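$\begin{pmatrix} 10 & -5 & -4 \\ -5 & 25 & -20 \\ -4 & -5 & 9 \end{pmatrix} \begin{pmatrix} I_{1} \\ I_{2} \\ I_{3} \end{pmatrix} = \begin{pmatrix} 0 \\ 50 \\ 0 \end{pmatrix} \implies \begin{pmatrix} I_{1} \\ I_{2} \\ I_{3} \end{pmatrix} = \begin{pmatrix} 26 \\ 29.6 \\ 28 \end{pmatrix}$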

The power dissipated in the 4 $Ω$ resistor, which carries $I_{3} - I_{1} = 2 A$ : $P = I^{2} R = 2^{2} \times 4 = 16 W$

The power delivered/absorbed by the dependent voltage source: $P = I V = 15 I_{a} \times I_{3} = 15 (29.6 - 28) \times 28 = 672 W$ (absorbing). The source is absorbing power as the current $I_{3}$ opposes the direction of voltage rise in the source.

Thevenin and Norton Equivalent Circuits

Thevenin's Theorem states that as far as its appearance from outside is concerned, any two terminal network of resistors and energy sources can be replaced by a series combination of an ideal voltage source V and a resistor R, where V is the open-circuit voltage of the network and R is the resistance that would be measured between the output terminals if the energy sources were removed and replaced by their internal resistance.

In practice, this can be used for reducing complex circuits to a more simple model: taking networks of resistors/impedances and reducing them to a simple circuit of one source and one resistance.

Thevenin circuits contain a single voltage source and resistor in series
Norton circuits contain a single current source and a resistor in parallel

Calculating Equivalent Circuits

Any linear network viewed through 2 terminals can be replaced with an equivalent single voltage source and series resistor.

The equivalent voltage is equal to the open circuit voltage between the two terminals ( $V_{oc}$ / $V_{t h}$ )
The equivalent resistance ( $R_{t h}$ ) is found by replacing all sources with their internal impedances and then calculating the impedance of the network, as seen by the two terminals.
- This can be done alternatively by calculating the short circuit current ( $I_{sc}$ / $I_{N}$ ) between the two terminals, and then using Ohm's law: $R_{t h} = \frac{V _{t h}}{I _{sc}}$ .
The value of the voltage source in a Thevenin circuit is $V_{t h}$
The value of the current source in a Norton circuit is $I_{N}$
The value of the resistor in either circuit is $R_{t h}$

Often, nodal/mesh analysis is needed to determine the open circuit voltage and/or short circuit current.

Maximum Power Transfer

For the maximum power transfer between a source and a load resistance in a Thevenin circuit, the load resistance must be equal to the Thevenin resistance $R_{t h}$ . This can be trivially proved, and is left as an exercise to the reader.
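For anyone who would rather not do the exercise: the load power is $P = \frac{V_{t h}^{2} R_{L}}{(R_{t h} + R_{L})^{2}}$ , and setting $\frac{d P}{d R_{L}} = 0$ gives $R_{L} = R_{t h}$ . The sketch below checks this numerically by sweeping the load resistance. The source values are illustrative only and the function name `load_power` is hypothetical.

```python
def load_power(v_th, r_th, r_load):
    """Power delivered to r_load by a Thevenin source (v_th in series with r_th)."""
    current = v_th / (r_th + r_load)
    return current ** 2 * r_load


V_TH = 15.0      # volts, illustrative value
R_TH = 15_000.0  # ohms, illustrative value

# Sweep the load from 1 kOhm to 60 kOhm in 1 kOhm steps.
loads = [1_000.0 * k for k in range(1, 61)]
best_load = max(loads, key=lambda r: load_power(V_TH, R_TH, r))
print(best_load, load_power(V_TH, R_TH, best_load))
```

The sweep picks out the load equal to `R_TH`, with the power falling away on either side of it.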

Example 1

Determine the Thevenin equivalent of the following:

The open circuit voltage across the two terminals can be calculated using the voltage divider rule. With the terminals open, no current flows through $R_{2}$ , so the two resistors $R_{1}$ and $R_{3}$ split the voltage.

$V_{oc} = V_{t h} = 30 \times \frac{10 k}{10 k + 10 k} = 15 V$

The short circuit current can be calculated by nodal analysis. When calculating the short circuit current, it is assumed that the two terminals are connected (shorted), so current can flow between them.

KCL at the node labelled V:

$\frac{V - 30}{10 k} + \frac{V}{10 k} + \frac{V}{10 k} = 0$ $V = 10 V$

The voltage when the terminals are shorted is 10 V, so the short circuit current can be calculated using Ohm's law:

$I_{sc} = \frac{V}{R _{2}} = 1 m A$

Which gives

$R_{t h} = \frac{V _{t h}}{I _{sc}} = \frac{15 V}{1 m A} = 15 k Ω$

The resistance can alternatively be calculated by replacing the voltage source with its internal resistance (0), and then determining the overall resistance of the network:

$R_{t h} = R_{2} + (R_{1} ∣∣ R_{3}) = R_{2} + \frac{R _{1} \cdot R _{3}}{R _{1} + R _{3}} = 15 k Ω$
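Both routes to $R_{t h}$ can be compared in a few lines. This sketch assumes the topology implied by the equations above: $R_{1}$ from the 30 V source to the node, $R_{3}$ from the node to ground, and $R_{2}$ in series with the output terminal, all 10 kΩ. The helper `parallel` is a made-up name.

```python
def parallel(a, b):
    """Equivalent resistance of two resistors in parallel."""
    return a * b / (a + b)


V_S = 30.0            # source voltage in volts
R1 = R2 = R3 = 10e3   # all three resistors are 10 kOhm

# Open circuit: no current in R2, so R1 and R3 form a voltage divider.
v_th = V_S * R3 / (R1 + R3)

# Short circuit: R2 now sits in parallel with R3 at the node.
v_node = V_S * parallel(R2, R3) / (R1 + parallel(R2, R3))
i_sc = v_node / R2

r_th_from_ratio = v_th / i_sc                 # V_th / I_sc
r_th_from_network = R2 + parallel(R1, R3)     # source replaced by a short
print(v_th, i_sc, r_th_from_ratio, r_th_from_network)
```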

The resulting Thevenin circuit is therefore:

Example 2

Find the Thevenin equivalent circuit of the network as seen by the two terminals A & B, and therefore the power dissipated/absorbed by the 12V source.

Open Circuit

Doing nodal analysis to determine voltages:

$V_{1}$ : $- 4.8 + \frac{V _{1}}{7.5} + \frac{V _{1} - V _{2}}{2.5} = 0$ , which simplifies to $4 V_{1} - 3 V_{2} = 36$

$V_{2}$ : $\frac{V _{2} - V _{1}}{2.5} + \frac{V _{2}}{10} + I = 0$

$V_{3}$ : $- I + \frac{V _{3}}{2.5} = 0$

Combining 2 & 3 by cancelling the assumed current $I$ :

$\frac{V _{2} - V _{1}}{2.5} + \frac{V _{2}}{10} + \frac{V _{3}}{2.5} = 0$ $- 4 V_{1} + 5 V_{2} + 4 V_{3} = 0$

Using $I_{x}$ to generate another equation:

$I_{x} = \frac{V _{1}}{7.5}$ $V_{3} - V_{2} = I_{x}$ $V_{1} + 7.5 V_{2} - 7.5 V_{3} = 0$

This gives a system of 3 equations in 3 unknowns which can be solved to determine the node voltages:

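$\begin{bmatrix} 4 & -3 & 0 \\ -4 & 5 & 4 \\ 1 & 7.5 & -7.5 \end{bmatrix} \begin{bmatrix} V_{1} \\ V_{2} \\ V_{3} \end{bmatrix} = \begin{bmatrix} 36 \\ 0 \\ 0 \end{bmatrix}$

$\begin{bmatrix} V_{1} \\ V_{2} \\ V_{3} \end{bmatrix} = \begin{bmatrix} 12.64 \\ 4.86 \\ 6.55 \end{bmatrix}$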

$V_{3}$ is equal to $V_{oc}$ , so $V_{t h} = 6.55 V$

Short Circuit

The same nodal analysis is needed, except this time the terminals are shorted. The steps are pretty much identical.

The equation for $V_{1}$ is exactly the same, $4 V_{1} - 3 V_{2} = 36$

$V_{2}$ : $\frac{V _{2} - V _{1}}{2.5} + \frac{V _{2}}{10} + I = 0$ $V_{3}$ : $- I + \frac{V _{3}}{2.5} + \frac{V _{3}}{1} = 0$

Equations 2 & 3 are combined in the same way, but yield a slightly different equation, as this time current can flow to ground from $V_{3}$ through the 1 $Ω$ resistor.

$- 4 V_{1} + 5 V_{2} + 14 V_{3} = 0$

The third equation generated using $I_{x}$ is also the same, $V_{1} + 7.5 V_{2} - 7.5 V_{3} = 0$

The solution to this system is very similar to above:

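$\begin{bmatrix} 4 & -3 & 0 \\ -4 & 5 & 14 \\ 1 & 7.5 & -7.5 \end{bmatrix} \begin{bmatrix} V_{1} \\ V_{2} \\ V_{3} \end{bmatrix} = \begin{bmatrix} 36 \\ 0 \\ 0 \end{bmatrix}$

$\begin{bmatrix} V_{1} \\ V_{2} \\ V_{3} \end{bmatrix} = \begin{bmatrix} 9.82 \\ 1.1 \\ 2.42 \end{bmatrix}$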

The short circuit current is then calculated as:

$I_{sc} = \frac{V _{3}}{1} = 2.42 A$

Solution

The Thevenin resistance is calculated as:

$R_{t h} = \frac{V _{oc}}{I _{sc}} = \frac{6.55}{2.42} \approx 2.7 Ω$

The power delivered to the 12V source is therefore:

$P = I V = \frac{12 - 6.55}{2.7} \times 12 = 24 W$
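The two nodal systems differ in a single coefficient, so both can be fed through one solver. The sketch below uses a small hand-rolled Gaussian elimination (the name `solve` is hypothetical) and then forms $R_{t h}$ and the power at the 12 V source. The unrounded answers differ slightly from the two-decimal values quoted above.

```python
def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest pivot into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


rhs = [36.0, 0.0, 0.0]
open_circuit = [[4.0, -3.0, 0.0], [-4.0, 5.0, 4.0], [1.0, 7.5, -7.5]]
short_circuit = [[4.0, -3.0, 0.0], [-4.0, 5.0, 14.0], [1.0, 7.5, -7.5]]

v_oc = solve(open_circuit, rhs)[2]        # V3 with the terminals open
i_sc = solve(short_circuit, rhs)[2] / 1   # V3 across the 1 ohm resistor
r_th = v_oc / i_sc
p_12v = (12.0 - v_oc) / r_th * 12.0       # current into the 12 V source times 12 V
print(v_oc, i_sc, r_th, p_12v)
```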

First Order RC Circuits

RC circuits are those containing resistors and capacitors. First order means they can be modelled by first order differential equations

Capacitors

Capacitors are reactive elements in circuits that store charge. They work by creating an electric field between two parallel plates separated by a dielectric insulator.

When charging, the electrons between the plates separate. At full charge, all electrons will be on opposite plates.
When discharging, the plates discharge and the charges recombine, forming a current

Equations

Capacitance of a specific capacitor, where

$A$ = the area of the two plates
$d$ = the separation of the two plates
$ϵ_{r}$ = the relative electric permittivity of the insulator
$ϵ_{0}$ = the permittivity of free space

$C = \frac{A ϵ _{r} ϵ _{0}}{d}$

The charge on a capacitor is equal to the product of the capacitance and the voltage across it: $Q = C \cdot V$

This can be used to derive the i-v equation for a capacitor:

$C = \frac{Q}{V} = \frac{\int I d t}{V}$ $\int I d t = C V$ $I = C \frac{d V}{d t}$

This equation is important as it shows how current leads voltage in a capacitor by a phase of $\frac{π}{2}$ rads.

Energy

The energy stored in a capacitor:

$W = \frac{1}{2} C V^{2} = \frac{1}{2} Q V = \frac{Q ^{2}}{2 C}$

Series and Parallel Combinations

Capacitance combines in series and parallel in the opposite way to resistors.

For capacitors in series: $\frac{1}{C _{t}} = \frac{1}{C _{1}} + \frac{1}{C _{2}}$

In parallel: $C_{t} = C_{1} + C_{2}$

Charging and Discharging

When a voltage is applied to a capacitor, an electric field is formed between the two plates, and the dielectric becomes polarised.
As the capacitor charges, the charges in the dielectric separate which forms a displacement current. At time $t = 0$ , the capacitor behaves as a short circuit
Capacitors charge exponentially, so the time at which one is fully charged is described as time $t = \infty$ . At this time, the capacitor can take no more charge, so it behaves as an open circuit
When discharging, the displaced charges flow round the circuit back to the other side of the capacitor.
The charge decays exponentially over time.

Step Response

Capacitors charge and discharge at exponential rates, and there are equations which describe this response to a step input.

The step response of a charging capacitor at time $t$ , assuming the switch is closed at time $t = 0$ :

$V_{c} (t) = V_{in} + (V_{0} - V_{in}) e^{- \frac{t}{RC}}$

Equations for current can be derived from this by differentiation:

$I_{c} (t) = C \frac{d}{d t} V_{c} (t)$

Assuming $V_{0} = 0$ , the equations for current and voltage when charging at time $t$ are:

$I_{c} (t) = I_{in} e^{- \frac{t}{τ}}$ $V_{c} (t) = V_{in} (1 - e^{- \frac{t}{τ}})$

Where $I_{in}$ and $V_{in}$ are the input current and voltage, respectively. Similar equations exist for discharging. Voltage at time $t$ when discharging:

$V (t) = V_{0} e^{- \frac{t}{τ}}$

Time constant

$τ = RC$ is the time constant of the circuit, which describes the rate at which it charges/discharges. One time constant is the time in seconds it takes a charging capacitor to reach a fraction $1 - e^{- 1}$ (approx 63%) of its final voltage. As charging and discharging are exponential, a capacitor will only be fully charged when $t = \infty$ . However, in practical terms, a capacitor can be considered charged at $t = 5 τ$ .
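The 63% figure and the $5 τ$ rule of thumb both drop out of $1 - e^{- n}$ , where $n$ is the number of time constants elapsed. A quick sketch, with a hypothetical function name:

```python
import math


def charged_fraction(n_tau):
    """Fraction of the final voltage reached after n_tau time constants."""
    return 1.0 - math.exp(-n_tau)


# Print the percentage reached after 1 to 5 time constants.
for n in range(1, 6):
    print(n, round(charged_fraction(n) * 100, 1))
```

After five time constants the capacitor is within 1% of its final value, which is why it is treated as fully charged.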

Example

In the circuit below, determine equations for the response of the capacitor when the switch is moved to position 2.

$V_{0}$ is equal to the voltage across the capacitor at time $t = 0$ , which is the same as the voltage across the 5 $k Ω$ resistor. A fully charged capacitor behaves as an open circuit, so it conducts no current, making the two voltages equal.

$V_{0} = 24 \times \frac{5}{3 + 5} = 15 V$

$V_{in}$ is equal to the voltage of the charging circuit as seen by the capacitor. This can be calculated as the thevenin equivalent of the circuit when the switch is in the right position.

$V_{t h} = - 75 \times \frac{160 k}{160 k + 40 k} = - 60 V$ $R_{t h} = 8 k + \frac{160 k \times 40 k}{160 k + 40 k} = 40 k Ω$

The time constant of the circuit:

$τ = R_{t h} \times C = 40 k \times 0.5 m = 20 s$ , so $\frac{1}{τ} = 0.05 s^{- 1}$

Therefore:

$V_{c} (t) = V_{t h} + (V_{0} - V_{t h}) e^{- \frac{t}{τ}} = - 60 + 75 e^{- 0.05 t} V$

The current can be calculated using $I = C \frac{d}{d t} V$ : $I_{c} (t) = C \frac{d}{d t} V_{c} (t) = 0.5 m \times \frac{d}{d t} (- 60 + 75 e^{- 0.05 t}) = - 1.87 e^{- 0.05 t} m A$
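The example can be replayed numerically. Taking $C = 0.5 m F$ as written, the product $R_{t h} C$ comes to 20 s, which is a decay rate of 0.05 per second and matches the exponent used above. The function names below are hypothetical.

```python
import math


def step_response(v_in, v_0, tau, t):
    """Capacitor voltage at time t for a step from v_0 towards v_in."""
    return v_in + (v_0 - v_in) * math.exp(-t / tau)


V_0 = 24 * 5 / (3 + 5)                              # initial capacitor voltage
V_TH = -75 * 160e3 / (160e3 + 40e3)                 # Thevenin voltage seen by the capacitor
R_TH = 8e3 + (160e3 * 40e3) / (160e3 + 40e3)        # Thevenin resistance
C = 0.5e-3                                          # capacitance in farads, as written above
TAU = R_TH * C                                      # time constant in seconds


def capacitor_current(t):
    """I = C dV/dt, differentiating the step response analytically."""
    return -C * (V_0 - V_TH) / TAU * math.exp(-t / TAU)


print(V_0, V_TH, R_TH, TAU)
print(step_response(V_TH, V_0, TAU, 0.0), capacitor_current(0.0))
```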

Another Example

For the circuit shown below:

Determine Thevenin circuit as seen by capacitor in position 1
Calculate the time constant of the circuit for time $t > 0$
Derive an equation for $V_{c} (t)$ for $t > 0$
Calculate the time taken for the capacitor voltage to fall to zero
Derive an equation for $I_{c} (t)$ for $t > 0$

t < 0

The Thevenin voltage of the left hand bit of the circuit can be calculated by KCL:

$\frac{V _{t h} - 40}{20 k} + \frac{V _{t h}}{60 k} - 8 m = 0$ $V_{t h} = 150 V$

Calculating Thevenin resistance by summing resistances:

$R_{t h} = 40 k + \frac{20 k \times 60 k}{20 k + 60 k} = 55 k Ω$

t > 0

The Thevenin voltage of the right hand side as seen by the capacitor, using the voltage divider rule:

$V_{t h} = - 100 \times \frac{150 k}{150 k + 50 k} = - 75 V$

Thevenin Resistance:

$R_{t h} = 12.5 k + \frac{150 k \times 50 k}{150 k + 50 k} = 50 k Ω$

This gives the time constant $τ = 50 k \times 0.25 μ = 12.5 m s$

Deriving transient equations:

$V_{c} (t) = V_{t h} + (V_{0} - V_{t h}) e^{- \frac{t}{τ}} = - 75 + (150 - (- 75)) e^{- \frac{t}{12.5 m}}$

$V_{c} (t) = - 75 + 225 e^{- 80 t}$

$I_{c} (t) = C \frac{d}{d t} V_{c} (t) = 0.25 μ \times \frac{d}{d t} (- 75 + 225 e^{- 80 t}) = - 4.5 e^{- 80 t} m A$

For $V_{c}$ to fall to zero:

$- 75 + 225 e^{- 80 t} = 0$

$e^{- 80 t} = \frac{1}{3}$

$t = - \frac{1}{80} \ln (\frac{1}{3}) = 13.7 m s$
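Rearranging the step response for $t$ gives a general "time to reach a voltage" formula, sketched below with the numbers from this example. The function name `time_to_reach` is hypothetical, and the initial current is taken from $I = C \frac{d V}{d t}$ evaluated at $t = 0$ .

```python
import math


def time_to_reach(v_target, v_in, v_0, tau):
    """Time at which v_in + (v_0 - v_in) * exp(-t / tau) equals v_target."""
    return -tau * math.log((v_target - v_in) / (v_0 - v_in))


V_0 = 150.0             # capacitor voltage at t = 0 (Thevenin voltage for t < 0)
V_TH = -75.0            # Thevenin voltage for t > 0
C = 0.25e-6             # capacitance in farads
TAU = 50e3 * C          # time constant, 12.5 ms

t_zero = time_to_reach(0.0, V_TH, V_0, TAU)
i_initial = -C * (V_0 - V_TH) / TAU   # capacitor current at t = 0
print(TAU, t_zero, i_initial)
```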

First Order RL Circuits

Basically the same as RC circuits, but with inductors instead.

Inductors

Inductors are reactive components, similar to capacitors. The difference is that while capacitors store energy in electric fields, inductors store it in magnetic fields. They do this with coils of wire wrapped around ferromagnetic cores. Inductance is measured in Henries (H) and has symbol $L$ .

Inductance can be calculated as $L = \frac{μ _{0} μ _{r} A N ^{2}}{l}$ where

$N$ is the number of turns in the coil
$l$ is the circumference of the core
$A$ is the cross-sectional area of the core
$μ_{0}$ is the permeability of free space
$μ_{r}$ is the relative permeability of the core

Inductance

Current passing through a conductor (the coil of wire) causes a change in magnetic flux which magnetises the coil.
This change in flux induces an EMF (Electro-Motive Force) in any conductor within it.
Faraday's Law states that the magnitude of the EMF induced in a circuit is proportional to the rate of change of flux linking the circuit
Lenz's Law states that the direction of the EMF is such that it tends to produce a current that opposes the change of flux responsible for inducing the EMF in the first place
Therefore, as we attempt to magnetise an inductor with a current, it induces a back EMF while its field builds up
Once the inductor is fully charged, the back EMF disappears and the inductor becomes a short circuit (it is just a coil of wire, after all).
When a circuit forms a single coil, the EMF induced is given by the rate of change of the flux
When a circuit contains many coils of wire, the resulting EMF is the sum of those produced by each loop
If a coil contains N loops, the induced voltage $V$ is given by the following equation, where $Φ$ is the flux of the circuit. $V = - N \frac{d Φ}{d t}$
This property, where an EMF is induced by a changing flux, is known as inductance.

Self - Inductance

A changing current causes a changing field
which then induces an EMF in any conductors in that field
When any current in a coil changes, it induces an EMF in the coil

$V = L \frac{d I}{d t}$

This equation describes the I-V relationship for an inductor. It can be derived from the equations for Faraday's law and inductance.

Energy Stored

The energy stored in an inductor is given by $W = \frac{1}{2} L I^{2}$

Series & Parallel Combinations

Inductors sum exactly the same way as resistors do. In series:

$L_{t} = L_{1} + L_{2}$

And in parallel:

$\frac{1}{L _{t}} = \frac{1}{L _{1}} + \frac{1}{L _{2}}$

DC Conditions

The final constant values of a circuit, where current and voltage are both in a "steady-state", are known as DC conditions. Under DC conditions:

Capacitor acts as open circuit
Inductor acts as short circuit

Response of RL Circuits

Inductors exhibit the same exponential behaviour as capacitors. In a simple first order RL circuit:

Inductor is initially uncharged with a current at 0
When the circuit is switched on at time t=0, $I$ is initially 0 as the inductor behaves as an open circuit.
- $V_{R}$ is initially 0
- $V_{L}$ is initially V
As the inductor energises, $I$ increases, $V_{R}$ increases, so $V_{L}$ decreases
- This is where the exponential behaviour comes from

Equations for Step Response

Consider the circuit above, where the switch is closed at time t=0. KVL can be used to derive an equation for the current in the circuit over time, which is shown below:

$I (t) = \frac{V _{in}}{R} + (I_{0} - \frac{V _{in}}{R}) e^{- \frac{t}{τ}}$

Where the time constant $τ = \frac{L}{R}$ . The inductor voltage at time $t$ is equal to: $V_{L} (t) = (V_{in} - I_{0} R) e^{- \frac{t}{τ}}$

When discharging, the current at time $t$ is equal to: $I (t) = I_{0} e^{- \frac{t}{τ}}$

Note that $\frac{V _{in}}{R}$ is equal to current $I_{in}$ / $I_{\infty}$ , by Ohm's law.

RC vs RL Circuits

RC circuits and RL circuits are similar in some respects, but different in others.

RC Equations

$I = C \frac{d V}{d t}$

$V_{in} = I R + V_{C} = RC \frac{d V}{d t} + V_{C}$

$V_{C} (t) = V_{in} + (V_{0} - V_{in}) e^{- \frac{t}{τ}}$

$τ = RC$

RL Equations

$V = L \frac{d I}{d t}$ $V_{in} = I R + V_{L} = I R + L \frac{d}{d t} I_{L}$ $I_{L} (t) = \frac{V _{in}}{R} + (I_{0} - \frac{V _{in}}{R}) e^{- \frac{t}{τ}}$ $τ = \frac{L}{R}$

Examples

In the circuit below, the switch is opened at time $t = 0$ . Find:

$I (t)$ for $t > 0$
$I_{0} (t)$ for $t > 0$
$V_{0} (t)$ for $t > 0$

$I (t)$

Looking for something of the form $I_{L} (t) = \frac{V _{in}}{R} + (I_{0} - \frac{V _{in}}{R}) e^{- \frac{t}{τ}}$

In steady state, before the switch is opened, all of the current flows through the inductor as it is short circuit, meaning $I_{0} = 20 A$ .

When the switch is opened there is no energy supplied to the circuit, so the inductor discharges through the right hand half of the circuit. The inductor can see a resistance of $R_{e q} = 2 + 10∣∣40$ :

$R = 2 + \frac{1}{\frac{1}{10} + \frac{1}{40}} = 10 Ω$

There is no input voltage, so:

$I_{L} (t) = 0 + (I_{0} - 0) e^{- \frac{t}{τ}}$

$τ = \frac{L}{R} = \frac{2}{10} = 0.2 s$

$I (t) = 20 e^{- 5 t}$

$I_{0} (t)$

This can simply be calculated using the current divider rule:

$I_{0} (t) = - 20 e^{- 5 t} \times \frac{10}{10 + 40} = - 4 e^{- 5 t}$

$V_{0} (t)$

Using Ohm's law:

$V_{0} (t) = I_{0} (t) R = 40 \times - 4 e^{- 5 t} = - 160 e^{- 5 t}$
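The three results can be reproduced from the general RL step response. The sketch below assumes $L = 2 H$ (inferred from the $\frac{2}{10}$ in the time constant) and uses hypothetical function names; the minus sign in `i_o` follows the current direction used in the worked answer.

```python
import math


def rl_current(v_in, r, l, i_0, t):
    """Inductor current at time t: I(t) = V/R + (I0 - V/R) * exp(-t / tau)."""
    tau = l / r
    i_final = v_in / r
    return i_final + (i_0 - i_final) * math.exp(-t / tau)


L = 2.0                              # henries, assumed from tau = 2/10
R_EQ = 2 + 1 / (1 / 10 + 1 / 40)     # 2 ohm in series with 10 || 40
I_0 = 20.0                           # inductor current just before the switch opens


def i_o(t):
    """Current in the 40 ohm branch, by the current divider rule."""
    return -rl_current(0.0, R_EQ, L, I_0, t) * 10 / (10 + 40)


def v_o(t):
    """Voltage across the 40 ohm resistor, by Ohm's law."""
    return 40 * i_o(t)


print(R_EQ, L / R_EQ, i_o(0.0), v_o(0.0))
```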

AC Circuits

AC current is the dominant form of electricity
Current changes direction at a fixed frequency (usually 50~60Hz)
AC voltage is generated by a rotating electromagnetic field
- The angular velocity of this rotation determines the frequency of the current

An instantaneous voltage $V$ in a sine wave is described by:

$V = V_{p} sin (ω t + ϕ)$

Where:

$V_{p}$ is the peak voltage
$ω$ is the angular frequency (rad/s)
$ϕ$ is the phase shift (radians)
The period of the wave is given by $T = \frac{1}{f} = \frac{2 π}{ω}$

$V_{p}$ , $ω$ and $ϕ$ define a waveform

As current and voltage are proportional, AC current is defined in a similar way:

$I = I_{p} sin (ω t + ϕ)$

Euler's Identity and Phasors

A phasor is a vector that describes a point in a waveform. A vector has a magnitude and a direction, which describe the amplitude $V_{p}$ and the phase $ϕ$ of the signal, respectively. The rate at which the phasor "rotates" is the frequency of the signal.

An AC phasor can be represented as a complex number.

$A \sin (ω t + ϕ) = A \cos ϕ + j A \sin ϕ = A e^{j ϕ}$

This formula can be used to go from anywhere on a waveform to a phasor, for example:

$V = 5 \sin (ω t + 30^{\circ}) = 5 e^{j 30^{\circ}} = 5 ∠ 30^{\circ}$
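Python's built-in complex numbers and the standard `cmath` module map directly onto phasors. The sketch below converts the 5 V, 30 degree example between polar and rectangular form; `phasor` is a hypothetical helper.

```python
import cmath
import math


def phasor(amplitude, phase_deg):
    """Build a complex phasor from an amplitude and a phase in degrees."""
    return cmath.rect(amplitude, math.radians(phase_deg))


v = phasor(5, 30)                       # 5 at an angle of 30 degrees
magnitude, angle_rad = cmath.polar(v)   # back to polar form
print(v, magnitude, math.degrees(angle_rad))
```

The real and imaginary parts of `v` are $A \cos ϕ$ and $A \sin ϕ$ , as in the formula above.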

Reactance and Impedance

The ratio of voltage to current is a measure of how a component opposes the flow of electricity
In a resistor, this is resistance
In inductors and capacitors, this property is reactance, $X$ , measured in ohms $Ω$
Can still be used in a similar way to resistance
Ohm's law still applies, $V = I X$
Capacitive reactance $X_{C} = \frac{1}{ω C}$
Inductive reactance $X_{L} = ω L$
- $ω$ is the angular frequency of the AC current
Both reactance and resistance are impedances
Impedance $Z$ is also measured in ohms
The impedance of a component is how hard it is for current to flow through it
- Impedance affects not only the magnitude of the current, but also its phase

Inductance

The voltage across an inductor is: $V_{L} = L \frac{d}{d t} I_{L}$

In an AC circuit:

$I_{L} = I_{p} \sin (ω t + ϕ) = I_{p} ∠ ϕ = I_{p} e^{j ϕ}$

$V_{L} = L \frac{d}{d t} I_{L} = ω L I_{p} \cos (ω t + ϕ) = ω L I_{p} \sin (ω t + ϕ + 90^{\circ}) = ω L I_{p} ∠ (ϕ + 90^{\circ})$

When an AC current flows through an inductor, an impedance applies

$Z_{L} = \frac{V _{L}}{I _{L}} = \frac{ω L I _{p} ∠ (ϕ + 90^{\circ})}{I _{p} ∠ ϕ} = ω L ∠ 90^{\circ} = jω L$

The impedance of an inductor is $j$ times its reactance: $Z_{L} = j X_{L} = jω L$

Capacitance

Capacitors have a similar property:

$I_{C} = C \frac{d}{d t} V_{C}$

$V_{C} = V_{p} \sin (ω t + ϕ) = V_{p} e^{j ϕ}$

$I_{C} = C \frac{d}{d t} V_{p} \sin (ω t + ϕ) = ω C V_{p} \sin (ω t + ϕ + 90^{\circ})$

$Z_{C} = \frac{V _{C}}{I _{C}} = \frac{V _{p} e^{j ϕ}}{ω C V _{p} e^{j (ϕ + 90^{\circ})}} = \frac{1}{jω C}$

Capacitive Impedance:

$Z_{C} = - j X_{c} = \frac{1}{jω C}$

Complex Impedance

Impedance not only changes the magnitude of an AC current, it also changes its phase.

In a capacitor, current leads voltage by a phase of 90 degrees
In an inductor, voltage leads current by a phase of 90 degrees
- CIVIL: in a Capacitor, I leads V; V leads I in an inductor (L)

The diagram below shows the effect of reactance on phase shift.

Consider the circuit below, containing an inductor and resistor in series. The phasor diagram shows the effect of the impedances on the voltage. The inductor introduces a phase shift of 90 degrees into the voltage.

The magnitude of the voltage across both components is: $V = \sqrt{(V_{R})^{2} + (V_{L})^{2}} = \sqrt{(I R)^{2} + (I X_{L})^{2}} = I \sqrt{R^{2} + (X_{L})^{2}} = I Z$ where $Z = \sqrt{R^{2} + (X_{L})^{2}}$ is the magnitude of the impedance.

From the phasor diagram, the phase shift of the impedance is: $ϕ = \tan^{- 1} \frac{V _{L}}{V _{R}} = \tan^{- 1} \frac{I X _{L}}{I R} = \tan^{- 1} \frac{X _{L}}{R}$

Complex impedances sum in series and parallel in the exact same way as normal resistance.

Example 1

Determine the complex impedance of the following combination at 50 Hz

$Z_{T} = Z_{C} + Z_{R} + Z_{L} = R + j X_{L} - j X_{C} = R + j (ω L - \frac{1}{ω C})$

At 50Hz, the angular frequency $ω = 2 π f = 314$ rad/s, so:

$Z_{T} = 200 + j (314 \times 400 m - \frac{1}{314 \times 50 μ}) = 200 + 62 j Ω$

Example 2

Determine the complex impedance and therefore the current in the following combination

Since $V = 100 sin (250 t)$ , $ω = 250$

$Z_{T} = R - j X_{C} = 100 - \frac{j}{ω C} = 100 - j \frac{1}{250 \times 10^{- 4}} = 100 - 40 j$

The current can be calculated from the impedance using Ohm's law: $I = \frac{V}{Z} = \frac{100}{100 - 40 j} = 0.86 + 0.34 j = 0.93 ∠ 21.8^{\circ}$
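Both impedance examples can be checked with complex arithmetic. The component values below are read off the worked numbers (200 Ω, 400 mH and 50 μF in Example 1; 100 Ω and $10^{- 4}$ F in Example 2), and the helper names are hypothetical.

```python
import cmath
import math


def z_inductor(l, omega):
    """Impedance of an inductor, j * omega * L."""
    return 1j * omega * l


def z_capacitor(c, omega):
    """Impedance of a capacitor, 1 / (j * omega * C)."""
    return 1 / (1j * omega * c)


# Example 1: R, L and C in series at 50 Hz.
omega_1 = 2 * math.pi * 50
z_1 = 200 + z_inductor(400e-3, omega_1) + z_capacitor(50e-6, omega_1)

# Example 2: R and C in series, driven by 100 sin(250 t).
z_2 = 100 + z_capacitor(1e-4, 250)
i_2 = 100 / z_2
i_2_magnitude, i_2_phase = cmath.polar(i_2)

print(z_1)
print(z_2, i_2, i_2_magnitude, math.degrees(i_2_phase))
```

The positive phase of `i_2` shows the current leading the voltage, as expected for a capacitive circuit.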

Diodes

Diodes are semiconductor devices that allow current to flow only in one direction. Diodes look like this:

The diagram is labelled with an anode and a cathode. The voltage drop across the diode is from anode to cathode, and the current is conducted in the direction pointed by the really big black arrow.

The types of diodes we're concerned with are silicon diodes, which have a forward voltage of about 0.7V. This is only an approximation, but is the value to use in calculations.

IV characteristics

Diodes are non-linear components:

When current is flowing from anode to cathode, the diode is forward-biased, and will conduct current
When the current is flowing backwards (the wrong way), the diode is reverse-biased.
At a large negative voltage, the diode will break down, and start to conduct current again
- Don't let the voltage get this high, you won't like what happens.

Forward Voltage

For the diode to conduct, it must have a minimum voltage across it, known as the forward voltage. This is also always the total voltage drop across the diode. For a silicon diode, this is 0.7V, which is why the I-V graph does not go up from zero. The diode can be said to "open" or "switch on" at about this voltage.

If there is a voltage of 0.2V across a diode, no current will flow
If there is a voltage of 0.6V across a diode, a tiny amount of current may flow
At >0.7V, the full current will flow with no resistance.

Example 1

Find the current and the voltages across each component in the circuit below.

By Ohm's law, the current is:

$I = \frac{V _{in} - V _{D}}{R _{t}} = \frac{10 - 0.7}{100 + 300} = 23.25 m A$

Therefore, the voltages are:

$V_{300 R} = 300 \times 23.25 m = 6.98 V$

$V_{100 R} = 100 \times 23.25 m = 2.32 V$

$V_{D} = 0.7 V$

Example 2

Find the current through each resistor in the circuit below.

Doing KCL around node $V_{x}$ :

$\frac{V _{x} - 9}{500} + \frac{V _{x}}{100} + \frac{V _{x} - 0.7}{50} = 0$ $V_{x} = 1 V$

The three currents are then:

$I_{500 R} = \frac{9 - 1}{500} = 16 m A$ $I_{100 R} = \frac{1}{100} = 10 m A$ $I_{50 R} = \frac{1 - 0.7}{50} = 6 m A$
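The node equation is linear in $V_{x}$ once the diode is replaced by a fixed 0.7 V drop, so it can be solved directly. The sketch below assumes the topology implied by the KCL equation (500 Ω from the 9 V supply to the node, 100 Ω to ground, and 50 Ω in series with the diode to ground) and also checks that the diode really is forward-biased, which the constant-drop model takes for granted.

```python
V_SUPPLY = 9.0   # volts
V_DIODE = 0.7    # volts, constant-drop model for a silicon diode
R_SUPPLY, R_SHUNT, R_DIODE = 500.0, 100.0, 50.0

# KCL at Vx: (Vx - 9)/500 + Vx/100 + (Vx - 0.7)/50 = 0, rearranged for Vx.
v_x = (V_SUPPLY / R_SUPPLY + V_DIODE / R_DIODE) / (
    1 / R_SUPPLY + 1 / R_SHUNT + 1 / R_DIODE
)

i_500 = (V_SUPPLY - v_x) / R_SUPPLY   # current from the supply into the node
i_100 = v_x / R_SHUNT                 # current through the 100 ohm resistor
i_50 = (v_x - V_DIODE) / R_DIODE      # current through the 50 ohm resistor and diode

# The model is only valid if the diode branch actually conducts.
diode_conducting = v_x > V_DIODE
print(v_x, i_500, i_100, i_50, diode_conducting)
```

If `diode_conducting` came out false, the diode branch would have to be removed and the node voltage recalculated.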

Transistors

Transistors are semiconductor devices based on P-N junctions. They have three terminals, the arrangement of which depends on the kind of transistor:

Base
Emitter
Collector

KCL applies, meaning the currents in the transistor sum to zero: $I_{E} = I_{C} + I_{B}$

Transistors, like diodes, are semiconductor devices, meaning there is a voltage drop of 0.7 volts between the base and the emitter. When there is no collector current, transistors behave like a diode.

Transistors also have a current gain, meaning the current flowing into the collector is related to the current flowing into the base: $I_{C} = β I_{B}$

NPN Transistors

The base-emitter junction behaves like a diode
A base current $I_{B}$ only flows when the voltage $V_{BE}$ is sufficiently positive, ie $\geq 0.7 V$ .
The small base current controls the larger collector current, flowing from collector to emitter
$I_{C} = β I_{B}$ - the current gain, showing how base current controls collector current

Functionally, transistors are switches that emit a current from collector to emitter dependent upon the base current.

Example

For the circuit below, find the base and collector currents using a gain of $β = 200$ .

The base current can be calculated using Ohm's law, taking into account the 0.7V drop between base and emitter: $I_{B} = \frac{10 - 0.7}{185 k} = 50.3 μ A$

As there is sufficient voltage for the transistor to be on, the collector current is therefore: $I_{C} = β I_{B} = 200 \times 50.3 μ A \approx 10 m A$

PNP Transistors

The diagram at the top of the page shows the circuit symbols for both kinds of transistor. The difference between the two is the way the emitter points, which is the direction of current flow in the transistor, and also the direction of voltage drop. An NPN transistor has a forward-biased junction, whereas PNP is reverse biased. Functionally, the difference between the two is that for a PNP transistor to be "on", the emitter should be at $0.7 V$ higher than the base.

Example

Note that this circuit uses a PNP transistor, so the base is at a lower voltage than the emitter. Also note that one of the resistors is not labelled. This is because the value of it is irrelevant, as the collector current is dependent upon the bias of the transistor.

$I_{B} = \frac{20 - 0.7}{185 k} = 104 μ A$ $I_{C} = 200 \times I_{B} = 20.8 m A$

Emitter Current

Notice that in the two examples, the collector current is much larger than the base, due to the large gain on the transistor. When there is a large gain $β$ : $I_{E} = I_{C} + I_{B}$ $I_{E} = β I_{B} + I_{B} \approx β I_{B}$ $I_{E} \approx β I_{B} = I_{C}$

From the example above: $I_{E} = I_{B} + I_{C} = 104 μ A + 20.8 m A = 20.9 m A \approx I_{C}$
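The same three relations cover both transistor examples, so they fit in one small function. This is a sketch under the notes' assumptions (fixed 0.7 V base-emitter drop, transistor in its active region); `bjt_currents` is a hypothetical name. The unrounded values differ slightly from the truncated ones quoted above.

```python
def bjt_currents(v_supply, r_base, beta, v_be=0.7):
    """Return (I_B, I_C, I_E) for a transistor biased through a base resistor."""
    i_b = (v_supply - v_be) / r_base   # Ohm's law across the base resistor
    i_c = beta * i_b                   # current gain
    i_e = i_b + i_c                    # KCL at the transistor
    return i_b, i_c, i_e


# NPN example: 10 V supply, 185 kOhm base resistor, gain of 200.
i_b_npn, i_c_npn, i_e_npn = bjt_currents(10.0, 185e3, 200)

# PNP example: 20 V across the emitter-base path, same resistor and gain.
i_b_pnp, i_c_pnp, i_e_pnp = bjt_currents(20.0, 185e3, 200)

print(i_b_npn, i_c_npn, i_e_npn)
print(i_b_pnp, i_c_pnp, i_e_pnp)
```

With a gain of 200 the emitter current is only 0.5% larger than the collector current, which is why $I_{E} \approx I_{C}$ is a safe approximation.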

Op Amps

Operational Amplifiers (Op-Amps) are high-gain electronic voltage amplifiers. They have two inputs, an output, and two power supply inputs. Op amps require external power, but this is implicit so is often omitted in circuit diagrams.

Op amps are differential amplifiers, meaning they output an amplified signal that is proportional to the difference of the two inputs. They have a very high gain, in the range of $10^{4}$ to $10^{6}$ , but this is assumed to be infinite in ideal amplifiers. The output voltage is calculated by:

$V_{0} = A (V_{2} - V_{1})$

Ideal Model

An ideal model of an op amp is shown below

Open loop gain is infinite
- The gain of the op amp when there is no positive or negative feedback
Input impedance ( $Z_{in}$ ) is infinite
- Ideally, no current flows into the amplifier
Output impedance ( $Z_{o u t}$ ) is zero
- The output is assumed to act like a perfect voltage source to supply as much current as possible
Bandwidth is infinite
- An ideal op amp can amplify any input frequency signal
Offset Voltage is zero
- The output will be zero when the two input voltages are the same

Ideal Circuits

Op amps can be used to design inverting and non-inverting circuits.

Inverting

Negative feedback is used to create an amplifier that is stable, ie doesn't produce a massive voltage output.
This creates closed loop gain, which controls the output of the amplifier
The non-inverting input is grounded
The negative feedback reverses the polarity of the output voltage
As the output of the op amp is only a few volts, and the gain of the op amp is very high, it can be assumed that the voltage at both inputs is equal to zero volts
- This creates a "virtual earth" at the node shown on the diagram

Using KCL at this node, it can be shown that:

$\frac{V _{o u t}}{V _{in}} = \frac{- R _{F}}{R _{in}}$

The gain of the amplifier is set by the ratio of the two resistors.

Non-Inverting

Non-inverting amplifiers don't invert the voltage output, and use input at the non-inverting terminal of the op amp instead.

The output of the amplifier is calculated by:

$\frac{V _{o u t}}{V _{in}} = 1 + \frac{R _{F}}{R _{2}}$
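The two closed-loop gain formulas are simple enough to wrap in functions. The sketch also includes the inverting gain for a finite open-loop gain $A$ , which is not derived in these notes: it comes from repeating the KCL at the inverting input without the virtual earth assumption, still assuming infinite input impedance. It shows how little the result changes for realistic $A$ . All function names are hypothetical.

```python
def inverting_gain(r_f, r_in):
    """Ideal closed-loop gain of the inverting amplifier."""
    return -r_f / r_in


def non_inverting_gain(r_f, r_2):
    """Ideal closed-loop gain of the non-inverting amplifier."""
    return 1 + r_f / r_2


def inverting_gain_finite(r_f, r_in, open_loop_gain):
    """Inverting gain when the open-loop gain A is large but finite."""
    ideal = r_f / r_in
    return -ideal / (1 + (1 + ideal) / open_loop_gain)


print(inverting_gain(10e3, 1e3))
print(non_inverting_gain(10e3, 1e3))
print(inverting_gain_finite(10e3, 1e3, 1e5))
```

With $A = 10^{5}$ the finite-gain result is within about 0.01% of the ideal value, so the ideal model is a reasonable simplification.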

Op Amps as Filters

Filters take AC signals as input, and amplify/attenuate them based upon their frequency.

Low Pass Filter

Take a simple inverting amplifier circuit, and add a capacitor in parallel with the feedback resistor $R_{2}$ .

The gain is now a function of the input frequency, which makes the circuit a filter. The capacitor has reactance $X_{C} = \frac{1}{ω C}$ , so its impedance is $Z_{C} = - j X_{C} = \frac{1}{jω C}$ . The impedance of the capacitor and resistor in parallel:

$Z = \frac{R _{2} Z _{C}}{R _{2} + Z _{C}} = \frac{R _{2}}{1 + jω C R _{2}}$

The gain as a function of $jω$ is therefore:

$A (jω) = \frac{V _{o u t} ( jω )}{V _{in} ( jω )} = \frac{- Z}{R _{1}} = - \frac{R _{2}}{1 + jω C R _{2}} \times \frac{1}{R _{1}}$

Gain is measured in decibels
As the input frequency increases, gain decreases
At very low frequencies, the gain is constant at $\frac{R _{2}}{R _{1}}$ (0dB when $R_{1} = R_{2}$ )
- The capacitor has high reactance at low frequencies, and is open circuit at very low frequencies
At very high frequencies, the gain tends towards $- \infty$ dB
- The capacitor has a very low reactance at high frequencies (short circuit)

Cutoff Frequency

The cutoff frequency of a filter is the point at which the gain is equal to -3 dB, which corresponds to a fall in output voltage by a factor of $\frac{1}{\sqrt{2}}$ (a halving of output power). For the filter shown above, this is:

$f_{c} = \frac{1}{2 π R _{2} C}$
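The -3 dB point can be confirmed by evaluating the gain expression at $f_{c}$ . The component values below are illustrative only (equal resistors, so the low-frequency gain is 0 dB), and the function names are hypothetical.

```python
import math


def low_pass_gain(r1, r2, c, f):
    """Complex gain of the inverting low pass filter at frequency f (Hz)."""
    omega = 2 * math.pi * f
    z_feedback = r2 / (1 + 1j * omega * c * r2)   # R2 in parallel with C
    return -z_feedback / r1


def to_db(gain):
    """Convert a (complex) voltage gain to decibels."""
    return 20 * math.log10(abs(gain))


R1 = R2 = 10e3   # ohms, illustrative values
C = 100e-9       # farads, illustrative value
f_c = 1 / (2 * math.pi * R2 * C)

print(f_c)
print(to_db(low_pass_gain(R1, R2, C, 0.0)))         # gain at DC
print(to_db(low_pass_gain(R1, R2, C, f_c)))         # gain at the cutoff frequency
print(to_db(low_pass_gain(R1, R2, C, 100 * f_c)))   # two decades above cutoff
```

Two decades above cutoff the gain is down by roughly 40 dB, the familiar 20 dB per decade roll-off of a first order filter.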

High Pass Filter

A high pass filter is designed in a similar way

This time, the impedance of the capacitor-resistor combination is:

$Z = R_{1} + \frac{1}{jω C} = \frac{1 + jω C R _{1}}{jω C}$

Which makes the gain:

$A (jω) = \frac{V _{o u t} ( jω )}{V _{in} ( jω )} = \frac{- R _{2}}{Z} = - R_{2} \times \frac{jω C}{1 + jω C R _{1}}$

The cutoff frequency for this filter is:

$f_{c} = \frac{1}{2 π R _{1} C}$

Which is similar to the other one, just with the other resistor.

Voltage Transfer Characteristics

The voltage transfer characteristic of an amplifier shows the output voltage as a function of the input voltage
The output range is equal to the range of the power supplies
Where the slope = 0, the amplifier is saturated
Where the slope > 0, the gain is positive
Where the slope < 0, the gain is negative
When the amplifier is saturated the signal becomes distorted

Passive Filters

Op amp filters are active filters because they require power. Passive filters use passive components (Resistors, Inductors, Capacitors) to achieve a similar effect. They are constructed using a potential divider with reactive components. The diagram below shows a potential divider with two impedances, $Z_{1}$ and $Z_{2}$ :

$\frac{V _{o u t}}{V _{in}} = \frac{Z _{2}}{Z _{1} + Z _{2}}$

Transfer Functions

The transfer function is the ratio of output to input (see ES197 - Transfer Functions for more details). For a passive filter, this is the ratio of output voltage to input voltage, as shown above. For a filter, this will be a function of the input waveform, $H (jω)$ . When $Z_{1}$ and $Z_{2}$ are both identical resistors $R$ :

$H (jω) = \frac{R}{R + R} = \frac{1}{2}$

However, if $Z_{2}$ was a capacitor $C$ , $Z_{2} = \frac{1}{jω C}$ :

$H (jω) = \frac{Z _{2}}{Z _{1} + Z _{2}} = \frac{1}{1 + jω RC}$

The gain and phase of the output are then the magnitude and argument of the transfer function, respectively:

$∣ H (jω) ∣ = \frac{1}{\sqrt{1 + (ω RC)^{2}}}$

$∠ H (jω) = ∠ 0^{\circ} - \tan^{- 1} (ω RC) = - \tan^{- 1} (ω RC)$

Cutoff Frequency

Similar to active filters, passive filters also have a cutoff frequency $f_{c}$ . This is the point at which the output power of the circuit falls by a factor of $\frac{1}{2}$ , which is a fall in gain of 3dB, or a factor of $\frac{1}{\sqrt{2}}$ in output voltage. Using the above example again (a low pass RC filter):

$∣ H (jω) ∣ = \frac{1}{\sqrt{1 + (ω RC)^{2}}} = \frac{1}{\sqrt{2}}$

$2 = 1 + (ω RC)^{2}$

$ω^{2} = \frac{1}{R^{2} C^{2}}$

$ω = \frac{1}{RC}$

$f_{c} = \frac{1}{2 π RC}$

This is also the point at which $H (jω) = \frac{1}{1 + j}$

The filter bandwidth is the range of frequencies that get through the filter. This bandwidth is 0 to $f_{c}$ for low pass filters, or $f_{c}$ and upwards for high pass.

RC High Pass

$H (jω) = \frac{jω RC}{1 + jω RC}$ $∣ H (jω) ∣ = \frac{ω RC}{\sqrt{1 + (ω RC)^{2}}}$ $∠ H (jω) = 90^{\circ} - \tan^{- 1} (ω RC)$ $f_{c} = \frac{1}{2 π RC}$

RC Low Pass

$H (jω) = \frac{1}{1 + jω RC}$ $∣ H (jω) ∣ = \frac{1}{\sqrt{1 + (ω RC)^{2}}}$ $∠ H (jω) = - \tan^{- 1} (ω RC)$ $f_{c} = \frac{1}{2 π RC}$

RL High Pass

$H (jω) = \frac{jω L}{jω L + R}$ $∣ H (jω) ∣ = \frac{ω L}{\sqrt{R^{2} + (ω L)^{2}}}$ $∠ H (jω) = 90^{\circ} - \tan^{- 1} (\frac{ω L}{R})$ $f_{c} = \frac{R}{2 π L}$

RL Low Pass

$H (jω) = \frac{R}{jω L + R}$ $∣ H (jω) ∣ = \frac{R}{\sqrt{R^{2} + (ω L)^{2}}}$ $∠ H (jω) = - \tan^{- 1} (\frac{ω L}{R})$ $f_{c} = \frac{R}{2 π L}$
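All four first order filters can be evaluated with complex arithmetic, which avoids deriving magnitude and phase by hand. The sketch below evaluates each transfer function at its own cutoff frequency. Component values are illustrative and the function names are hypothetical.

```python
import cmath
import math


def rc_low_pass(omega, r, c):
    return 1 / (1 + 1j * omega * r * c)


def rc_high_pass(omega, r, c):
    return (1j * omega * r * c) / (1 + 1j * omega * r * c)


def rl_low_pass(omega, r, l):
    return r / (r + 1j * omega * l)


def rl_high_pass(omega, r, l):
    return (1j * omega * l) / (r + 1j * omega * l)


def gain_and_phase(h):
    """Return (magnitude, phase in degrees) of a transfer function value."""
    return abs(h), math.degrees(cmath.phase(h))


R, C, L = 1e3, 1e-6, 0.1   # illustrative component values
omega_rc = 1 / (R * C)     # cutoff angular frequency of the RC filters
omega_rl = R / L           # cutoff angular frequency of the RL filters

rc_low = gain_and_phase(rc_low_pass(omega_rc, R, C))
rc_high = gain_and_phase(rc_high_pass(omega_rc, R, C))
rl_low = gain_and_phase(rl_low_pass(omega_rl, R, L))
rl_high = gain_and_phase(rl_high_pass(omega_rl, R, L))
print(rc_low, rc_high, rl_low, rl_high)
```

At cutoff every filter has a gain of $\frac{1}{\sqrt{2}}$ , with a phase of -45 degrees for the low pass filters and +45 degrees for the high pass ones.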

2nd Order Circuits

For circuits more complex than those above, to find the transfer function, either:

Find a Thevenin equivalent circuit, as seen from the element
Combine multiple elements into single impedances

Note that any of the above techniques only work for simple first order circuits.

Example

Using $H (jω) = \frac{Z _{2}}{Z _{1} + Z _{2}}$ , where $Z_{1} = R_{1}$ , and $Z_{2} = R_{2} ∣∣ j X_{C}$ :

$Z_{2} = \frac{\frac{R_{2}}{jω C}}{R_{2} + \frac{1}{jω C}} = \frac{R_{2}}{1 + jω R_{2} C}$ $H (jω) = \frac{Z_{2}}{Z_{2} + Z_{1}} = \frac{\frac{R_{2}}{1 + jω R_{2} C}}{R_{1} + \frac{R_{2}}{1 + jω R_{2} C}} = \frac{R_{2}}{R_{1} + R_{2} + jω R_{1} R_{2} C}$ $∣ H (jω) ∣ = \frac{R_{2}}{\sqrt{(R_{1} + R_{2})^{2} + (ω R_{1} R_{2} C)^{2}}}$ $∠ H (jω) = - \tan^{- 1} \frac{ω R_{1} R_{2} C}{R_{1} + R_{2}}$

Equations

Below are some of the main equations that I have found useful to have on hand.

Capacitors
Energy Stored	$E = \frac{1}{2} C V^{2} = \frac{1}{2} Q V = \frac{Q ^{2}}{2 C}$
Capacitor Equation	$C = \frac{Q}{V}$
Capacitance equation	$C = \frac{A ϵ _{r} ϵ _{0}}{d}$
Series Capacitors	$\frac{1}{C _{T}} = \frac{1}{C _{1}} + \frac{1}{C _{2}}$
Parallel Capacitors	$C_{T} = C_{1} + C_{2}$
Current-Voltage	$C = \frac{Q}{V} = \frac{\int I \, d t}{V}$
Step Response	$V_{c} (t) = V_{in} + (V_{0} - V_{in}) e^{- \frac{t}{RC}}$
Electric Field Strength	$E = \frac{F}{Q} = \frac{1}{4 π ϵ _{0}} \frac{Q}{r ^{2}} = \frac{V}{r}$
Capacitor Reactance	$X_{c} = \frac{1}{2 π f C} = \frac{1}{ω C}$ , with impedance $Z_{c} = \frac{1}{jω C}$
Flux Density	$D = \frac{\text{flux}}{\text{area}} = \frac{\text{charge}}{\text{area}}$
Magnetic Field Strength of Straight Current Carrying Wire	$B = \frac{μ _{0} I}{2 π d}$

Resistors
Resistors in Series	$R_{t} = R_{1} + R_{2}$
Resistors in Parallel	$\frac{1}{R _{t}} = \frac{1}{R _{1}} + \frac{1}{R _{2}}$
Voltage Divider	$V_{out} = V_{in} \times \frac{Z_{2}}{Z_{1} + Z_{2}}$
Current Divider	$I_{R 1} = I_{T} \times \frac{R _{2}}{R _{1} + R _{2}}$

Inductors
Inductors in Series	$L_{t} = L_{1} + L_{2}$
Inductors in Parallel	$\frac{1}{L _{t}} = \frac{1}{L _{1}} + \frac{1}{L _{2}}$
Induced Voltage	$V = - N \frac{d Φ}{d t}$
Self Inductance	$V = L \frac{d I}{d t}$
Energy Stored	$W = \frac{1}{2} L I^{2}$
Step Response of RL Circuit (Current)	$I_{L} (t) = \frac{V _{in}}{R} + (I_{0} - \frac{V _{in}}{R}) e^{- \frac{t}{τ}}$
Step Response of RL Circuit (Voltage)	$V_{L} (t) = (V_{in} - I_{0} R) e^{- \frac{t}{τ}}$

Thevenin and Norton Equivalent Circuits
Equivalent Resistance	$R_{t h} = \frac{V _{t h}}{I _{sc}}$
Thevenin - Norton Conversion	$I_{N} = \frac{V _{t h}}{R _{t h}}$

AC Circuits
Instantaneous Voltage	$V = V_{p} \sin (ω t + ϕ)$
Instantaneous Current	$I = I_{p} \sin (ω t + ϕ)$
AC Phasor - As complex number	$A sin (ω t + ϕ) = A cos ϕ + j A sin ϕ = A e^{j ϕ}$

Operational Amplifiers
Output of Inverting Amplifier	$\frac{V_{out}}{V_{in}} = - \frac{R_{F}}{R_{in}}$
Output of Non-Inverting Amplifier	$\frac{V_{out}}{V_{in}} = 1 + \frac{R_{F}}{R_{in}}$

Filters
Cutoff Frequency	$f_{c} = \frac{1}{2 π R_{2} C}$
Gain (dB)	$20 \log_{10} \frac{V_{out}}{V_{in}}$

Capacitors

Energy Stored

The energy stored by a capacitor of capacitance $C$ charged to a voltage $V$ : $E = \frac{1}{2} C V^{2} = \frac{1}{2} Q V = \frac{Q^{2}}{2 C}$

$C$ = Capacitance, Farads, F
$V$ = Voltage, Volts, V
$Q$ = Charge, Coulombs, C

Capacitor Equation

The ratio of charge to voltage. $C = \frac{Q}{V}$

$C$ = Capacitance, Farads, F
$V$ = Voltage, Volts, V
$Q$ = Charge, Coulombs, C

Capacitance equation

$C = \frac{A ϵ _{r} ϵ _{0}}{d}$

$A$ = the area of the two plates
$d$ = the separation of the two plates
$ϵ_{r}$ = the relative electric permittivity of the insulator
$ϵ_{0}$ = the permittivity of free space

Series Capacitors

$\frac{1}{C _{T}} = \frac{1}{C _{1}} + \frac{1}{C _{2}}$ $C_{T} = \frac{1}{\frac{1}{C _{1}} + \frac{1}{C _{2}}}$

Parallel Capacitors

$C_{T} = C_{1} + C_{2}$

Current-Voltage

$C = \frac{Q}{V} = \frac{\int I d t}{V}$

Step Response

$V_{c} (t) = V_{in} + (V_{0} - V_{in}) e^{- \frac{t}{RC}}$

$V_{c} (t)$ = Voltage of the capacitor at time t, Volts
$V_{in} (t)$ = Voltage in, Volts
$V_{0}$ = Starting Voltage, Volts
$C$ = Capacitance, Farads, F

Derived from: $I_{c} (t) = C \frac{d}{d t} V_{c} (t)$
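The step response formula can be evaluated directly. Below is a small Python sketch; the function name and the component values are illustrative assumptions, not taken from the module.

```python
import math

def capacitor_step_response(t, v_in, v_0, r, c):
    """V_c(t) = V_in + (V_0 - V_in) * exp(-t / RC)."""
    return v_in + (v_0 - v_in) * math.exp(-t / (r * c))

# Hypothetical values: 5 V step into an uncharged 1 uF capacitor via 1 kOhm.
V_IN, V_0, R, C = 5.0, 0.0, 1e3, 1e-6
tau = R * C  # time constant, seconds

for multiple in (0, 1, 3, 5):
    v = capacitor_step_response(multiple * tau, V_IN, V_0, R, C)
    print(f"t = {multiple} tau: V_c = {v:.3f} V")
```

After one time constant the capacitor has reached about 63% of the input voltage, and after five it is within 1% of it.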

Electric Field Strength

$E = \frac{F}{Q} = \frac{1}{4 π ϵ _{0}} \frac{Q}{r ^{2}} = \frac{V}{r}$

$F$ = Force
$Q$ = Charge
$ϵ_{0}$ = Permittivity of free space = $8.85 \times 1 0^{- 12} F m^{- 1}$
$\frac{1}{4 π ϵ _{0}}$ = Constant
$V$ = Voltage Potential, Volts
$r$ = Separation

Capacitor Reactance

As the capacitor charges or discharges, a current flows through it which is restricted by the internal impedance of the capacitor. The magnitude of this impedance is commonly known as capacitive reactance: $X_{c} = \frac{1}{2 π f C} = \frac{1}{ω C}$ . As a complex impedance, $Z_{c} = \frac{1}{jω C}$

$X_{c}$ = Reactance of the capacitor, Ohms
$j$ = $i$ = $\sqrt{- 1}$
$ω$ = $2 π f$ = angular frequency, radians per second

Flux Density

The amount of flux passing through a defined area that is perpendicular to the direction of the flux. $D = \frac{\text{flux}}{\text{area}} = \frac{\text{charge}}{\text{area}}$

Magnetic Field Strength of Straight Current Carrying Wire

Amperes Law: For any closed loop path, the sum of the products of the length elements and the magnetic field in the direction of the length elements is proportional to the electric current enclosed in the loop. $B = \frac{μ _{0} I}{2 π d}$

$B$ = Magnetic field strength at distance d
$I$ = Current
$μ_{0}$ = Permeability of free space = $4 π \times 1 0^{- 7} T m / A$
$d$ = distance from the wire.

Resistors

Resistors in Series

$R_{t} = R_{1} + R_{2}$

Resistors in Parallel

$\frac{1}{R _{t}} = \frac{1}{R _{1}} + \frac{1}{R _{2}}$

Voltage Divider

$V_{out} = V_{in} \times \frac{Z_{2}}{Z_{1} + Z_{2}}$ , where the output is taken across $Z_{2}$

Current Divider

$I_{R 1} = I_{T} \times \frac{R _{2}}{R _{1} + R _{2}}$

Inductors

Inductors in Series

Inductors act in the same way as resistors in terms of their behaviour in series and parallel. $L_{t} = L_{1} + L_{2}$

Inductors in Parallel

$\frac{1}{L _{t}} = \frac{1}{L _{1}} + \frac{1}{L _{2}}$

Induced Voltage

If a coil contains N loops, the induced voltage V is given by the following equation, where Φ is the flux of the circuit. $V = - N \frac{d Φ}{d t}$

Self Inductance

A changing current causes a changing magnetic field, which then induces an EMF in any conductors in that field. When the current in a coil changes, it induces an EMF in the coil itself: $V = L \frac{d I}{d t}$

Energy Stored

The energy stored by an inductor is given by: $W = \frac{1}{2} L I^{2}$

Step Response of RL Circuit (Current)

$I_{L} (t) = \frac{V _{in}}{R} + (I_{0} - \frac{V _{in}}{R}) e^{- \frac{t}{τ}}$

$V_{in}$ - Voltage source
$R$ - Resistance of the resistor
$I_{0}$ - The initial current (if the inductor is already charged, this will be the short circuit current)
$τ = \frac{L}{R}$

Step Response of RL Circuit (Voltage)

Inductor voltage at time t, $V_{L} (t) = (V_{in} - I_{0} R) e^{- \frac{t}{τ}}$

$V_{L} (t)$ - Voltage across inductor at time t
$V_{in}$ - Voltage source
$R$ - Resistance of the resistor
$I_{0}$ - The initial current
$τ = \frac{L}{R}$

Thevenin and Norton Equivalent Circuits

Thevenin circuits contain a single voltage source and resistor in series. Norton circuits contain a single current source and a resistor in parallel

Equivalent Resistance

$R_{t h} = \frac{V_{t h}}{I_{sc}}$ Any linear network viewed through 2 terminals can be replaced with an equivalent single voltage source and series resistor.

The equivalent voltage is equal to the open circuit voltage between the two terminals ( $V_{oc}$ / $V_{t h}$ )
The equivalent resistance ( $R_{t h}$ ) is found by replacing all sources with their internal impedances and then calculating the impedance of the network, as seen by the two terminals.
- Alternatively, this can be done by calculating the short circuit current ( $I_{sc}$ / $I_{N}$ ) between the two terminals, and then using Ohm's law: $R_{t h} = \frac{V_{t h}}{I_{sc}}$ .
The value of the voltage source in a Thevenin circuit is $V_{t h}$
The value of the current source in a Norton circuit is $I_{N}$
The value of the resistor in either circuit is $R_{t h}$

Thevenin - Norton Conversion

Thevenin and Norton are essentially the same, but in a different form. The $R_{t h}$ is the same for both. $I_{N} = \frac{V_{t h}}{R_{t h}}$

$I_{N}$ - Norton Current
$V_{t h}$ - Thevenin Voltage
$R_{t h}$ - Thevenin Resistance

AC Circuits

AC is the dominant form of electricity
Current changes direction at a fixed frequency (usually 50 or 60Hz)
AC voltage is generated by a rotating electromagnetic field
- The angular velocity of this rotation determines the frequency of the current

Instantaneous Voltage

An instantaneous voltage V in a sine wave is described by $V = V_{p} sin (ω t + ϕ)$

Where:

$V_{p}$ is the peak voltage
$ω$ is the angular frequency (rad/s)
$ϕ$ is the phase shift (radians)
The period of the wave is given by $T = \frac{1}{f} = \frac{2 π}{ω}$

Instantaneous Current

As current and voltage are proportional, AC current is defined in a similar way: $I = I_{p} sin (ω t + ϕ)$

AC Phasor - As complex number

An AC phasor can be represented as a complex number. $A sin (ω t + ϕ) = A cos ϕ + j A sin ϕ = A e^{j ϕ}$

Operational Amplifiers

Output of Inverting Amplifier

The gain of the amplifier is set by the ratio of the two resistors. The negative feedback reverses the polarity of the output voltage (hence the negative sign). $\frac{V_{out}}{V_{in}} = - \frac{R_{F}}{R_{in}}$

Output of Non-Inverting Amplifier

Non-inverting amplifiers don't invert the voltage output, and use input at the non-inverting terminal of the op amp instead. $\frac{V_{out}}{V_{in}} = 1 + \frac{R_{F}}{R_{in}}$

Filters

Cutoff Frequency

The cutoff frequency of a filter is the point at which the gain is equal to -3 dB, which corresponds to a fall in output voltage by a factor of $\frac{1}{\sqrt{2}}$ . For the filter shown above, this is: $f_{c} = \frac{1}{2 π R_{2} C}$

Gain (dB)

Gain is measured in decibels $20 \log_{10} \frac{V_{out}}{V_{in}}$

At very low frequencies, the gain is constant (0dB)
- The capacitor has high reactance at low frequencies, and is open circuit at very low frequencies
At very high frequencies, the gain tends towards $- \infty$ dB
- The capacitor has a very low reactance at high frequencies (short circuit)

ES193

Functions, Conics & Asymptotes

Domain & Range

The domain of a function is the set of all valid/possible input values
- The x axis
The range of a function is the set of all possible output values
- The y axis

Odd & Even Functions

$f (x) = f (- x) \Rightarrow f$ is even
$f (- x) = - f (x) \Rightarrow f$ is odd
$f (x) = - f (- x) \Rightarrow f$ is odd

Conics

Equation of a circle with radius $r$ and centre $(x_{0}, y_{0})$ $(x - x_{0})^{2} + (y - y_{0})^{2} = r^{2}$

Equation of an ellipse with centre $(x_{0}, y_{0})$ , major axis length $2 a$ and minor axis length $2 b$ : $\frac{( x - x _{0} ) ^{2}}{a ^{2}} + \frac{( y - y _{0} ) ^{2}}{b ^{2}} = 1$

Equation of a hyperbola with centre $(x_{0}, y_{0})$ :

$\pm \frac{( x - x _{0} ) ^{2}}{a ^{2}} \mp \frac{( y - y _{0} ) ^{2}}{b ^{2}} = 1$ The asymptotes of this hyperbola are at: $(y - y_{0}) = \pm \frac{b}{a} (x - x_{0})$

Asymptotes

There are 3 kinds of asymptotes:

Vertical
Horizontal
Oblique (have slope)

For a function $y = \frac{P ( x )}{Q ( x )}$ :

Vertical asymptotes lie where $Q (x) = 0$ and $P (x) \neq 0$
Horizontal asymptotes
- If the degree of the denominator is bigger than the degree of the numerator, the horizontal asymptote is the x-axis
- If the degree of the numerator is bigger than the degree of the denominator, there is no horizontal asymptote.
- If the degrees of the numerator and denominator are the same, the horizontal asymptote equals the leading coefficient of the numerator divided by the leading coefficient of the denominator
Oblique asymptotes
- A rational function will approach an oblique asymptote if the degree of the numerator is one order higher than the order of the denominator
- To find
  - Divide $P (x)$ by $Q (x)$
  - Take the limit as $x \to \infty$

Example: find the asymptotes of $y = \frac{- 3 x ^{2} + 2}{x - 1}$ :

Vertical asymptotes:
- Where the denominator is 0

$x - 1 = 0 \Rightarrow x = 1$

Horizontal asymptotes:
- There are none, as degree of the numerator is bigger than the degree of the denominator
Oblique asymptotes:
- Divide the top by the bottom using polynomial long division
- Find the limit

$y = \frac{- 3 x^{2} + 2}{x - 1} = - 3 x - 3 + \frac{- 1}{x - 1}$

As $x \to \infty$ , $y \to - 3 x - 3$ , giving $y = - 3 x - 3$ as an asymptote.
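The long division step can be checked with a short Python sketch. The helper `poly_divide` is a hypothetical name, and polynomials are assumed to be given as coefficient lists with the highest power first.

```python
def poly_divide(numerator, denominator):
    """Polynomial long division on coefficient lists (highest power first).

    Returns (quotient, remainder) as lists of coefficients.
    """
    remainder = list(numerator)
    quotient = []
    while len(remainder) >= len(denominator):
        # Divide the leading terms to get the next quotient coefficient.
        factor = remainder[0] / denominator[0]
        quotient.append(factor)
        # Subtract factor * denominator from the leading part of the remainder.
        for i, coeff in enumerate(denominator):
            remainder[i] -= factor * coeff
        remainder.pop(0)  # the leading term is now zero
    return quotient, remainder

# y = (-3x^2 + 2) / (x - 1)
quotient, remainder = poly_divide([-3, 0, 2], [1, -1])
print(quotient, remainder)
```

The quotient coefficients give the oblique asymptote $y = - 3 x - 3$ , and the remainder is the numerator of the term that vanishes as $x \to \infty$ .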

Complex Numbers

De Moivre's Theorem

$(r (cos θ + i sin θ))^{n} = r^{n} (cos n θ + i sin n θ)$

Complex Roots

For a complex number

$z = (r (cos θ + i sin θ))$

The $n^{t h}$ roots can be found using the formula

$z^{\frac{1}{n}} = r^{\frac{1}{n}} (cos \frac{θ + 2 kπ}{n} + i sin \frac{θ + 2 kπ}{n}), k = 0, 1, 2, ..., n - 1$
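The root formula translates almost directly into code. This is a minimal Python sketch using the standard `cmath` module; the function name `nth_roots` is an illustrative choice.

```python
import cmath
import math

def nth_roots(z, n):
    """All n complex n-th roots of z, using the polar form of z."""
    r, theta = cmath.polar(z)  # modulus and argument of z
    return [
        cmath.rect(r ** (1 / n), (theta + 2 * k * math.pi) / n)
        for k in range(n)
    ]

# Cube roots of 8: one real root and a complex conjugate pair.
roots = nth_roots(8, 3)
for w in roots:
    print(f"{w:.4f}")
```

Raising each returned root to the power $n$ should recover the original number, up to floating point rounding.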

Finding Trig Identities

Trig identities can be found by equating complex numbers and using de moivre's theorem. The examples below are shown for n=2 but the process is the same for any n.

Identities for $f (n θ)$

Using de moivre's theorem to equate $cos 2 θ + i sin 2 θ = (cos θ + i sin θ)^{2}$

Expanding $(cos θ + i sin θ)^{2} = cos^{2} θ + 2 i sin θ cos θ - sin^{2} θ$

Equating real and imaginary parts $cos 2 θ = cos^{2} θ - sin^{2} θ$ $sin 2 θ = 2 sin θ cos θ$

Identities for $f^{n} (θ)$

$z = cos θ + i sin θ$
$z^{n} + z^{- n} = 2 cos n θ$
$z^{n} - z^{- n} = 2 i sin n θ$

To find the identity for $cos^{2} θ$ , start with $z + z^{- 1}$ , and raise to the power of 2

$z + z^{- 1} = 2 cos θ$ $(z + z^{- 1})^{2} = (2 cos θ)^{2}$ $z^{2} + 2 + z^{- 2} = 4 cos^{2} θ$

Substituting in for the pairs of $z^{n} + z^{- n}$

$(z^{2} + z^{- 2}) + 2 = 2 cos 2 θ + 2 = 4 cos^{2} θ$ $cos^{2} θ = \frac{1}{2} cos 2 θ + \frac{1}{2}$

Vectors

Vector Equation of a Straight Line

The vector $r$ is the vector of any point along the line.

$r = a + λ b$

$a$ is any point on the line, and $b$ is the direction of the line. $λ$ is a parameter that represents the position of $r$ relative to $a$ along the line. The cartesian form of this can be derived: $x = a_{1} + λ b_{1}$ $y = a_{2} + λ b_{2}$ $z = a_{3} + λ b_{3}$

Equating about lambda: $\frac{x - a _{1}}{b _{1}} = \frac{y - a _{2}}{b _{2}} = \frac{z - a _{3}}{b _{3}}$

Scalar/Dot Product

The dot product of two vectors: $a \cdot b = ∣ a ∣∣ b ∣ cos θ = \sum a_{n} b_{n}$

If $a \cdot b = 0$ , then $θ = 90^{\circ}$ and $\cos θ = 0$
- The two vectors are perpendicular
$a \cdot a = ∣ a ∣^{2}$

The angle between two vectors can be calculated using the dot product $cos θ = \frac{a \cdot b}{∣ b ∣∣ a ∣}$

Projections

The projection of vector $a$ in the direction of $b$ is given by the scalar product:

$\frac{b}{∣ b ∣} \cdot a = \hat{b} \cdot a$

This gives a scalar: the length of the component of $a$ that lies along $b$ . Multiplying it by $\hat{b}$ gives the projection as a vector in the direction of $b$ .

Equation of a Plane

The vector equation of a plane is given by $r \cdot n = a \cdot n$

Where $n$ is the normal to the plane, and $a$ is any point in the plane. This expands to the cartesian form:

$n_{1} x + n_{2} y + n_{3} z = a \cdot n$

Angle Between Planes

The angle between two planes is given by the angle between their normals. $cos θ = \frac{n _{1} \cdot n _{2}}{∣ n _{1} ∣∣ n _{2} ∣}$

Intersection of 2 Planes

Two planes will only intersect if their normal vectors are not parallel.

First, check the two normals are non parallel
- $n_{1} \times n_{2} \neq 0$
Equate all 3 variables about either a parameter $λ$ or one of $x$ , $y$ , or $z$ to get an equation for the line along which the planes intersect in cartesian form

Example

Find the intersection of the planes $3 x + y - 4 z = 4$ (1) and $- x + y + 2 z = 2$ (2).

(1) - (2): $4 x - 6 z = 2 \Rightarrow z = \frac{2 x - 1}{3}$

(1) + 3(2): $4 y + 2 z = 10 \Rightarrow z = \frac{2 y - 5}{- 1}$

Equating the two with z:

$\frac{2 x - 1}{3} = \frac{2 y - 5}{- 1} = z$

Using Cross Product

For two planes with normals $n_{1}$ and $n_{2}$ , the vector $b = n_{1} \times n_{2}$ is perpendicular to both normals, so it is parallel to both planes. If $a$ is any point common to both planes, the line

$r = a + λ (n_{1} \times n_{2})$

lies in both planes.
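A small Python sketch of the cross product method is given below. The two planes and the common point `a` are made-up illustrative values (not the planes from the example above), and the helper names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative planes in the form r . n = d:
#   x + y + z = 6   and   x - y = 0
n1, d1 = (1, 1, 1), 6
n2, d2 = (1, -1, 0), 0

direction = cross(n1, n2)  # direction of the line of intersection
a = (3, 3, 0)              # a point found by setting z = 0 in both equations

def point_on_line(lam):
    """Point r = a + lam * (n1 x n2) on the line of intersection."""
    return tuple(ai + lam * bi for ai, bi in zip(a, direction))

for lam in (-2, 0, 1.5):
    p = point_on_line(lam)
    print(p, dot(p, n1), dot(p, n2))  # each point should satisfy both planes
```

If the cross product came out as the zero vector, the normals would be parallel and there would be no single line of intersection.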

Distance from Point to Plane

The shortest distance from the point $(x_{0}, y_{0}, z_{0})$ to the plane $A x + B y + C z + D = 0$ is given by:

$\frac{∣ A x_{0} + B y_{0} + C z_{0} + D ∣}{\sqrt{A^{2} + B^{2} + C^{2}}}$
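As a sketch, the distance can be computed in Python by dividing the absolute value of the plane equation, evaluated at the point, by the length of the normal $(A, B, C)$ . The function name and the sample point and plane are illustrative assumptions.

```python
import math

def distance_point_to_plane(point, plane):
    """Shortest distance from (x0, y0, z0) to the plane Ax + By + Cz + D = 0.

    `plane` is the tuple (A, B, C, D).
    """
    x0, y0, z0 = point
    a, b, c, d = plane
    return abs(a * x0 + b * y0 + c * z0 + d) / math.sqrt(a * a + b * b + c * c)

# Illustrative values: the point (1, 2, 3) and the plane 2x - 2y + z - 5 = 0.
print(distance_point_to_plane((1, 2, 3), (2, -2, 1, -5)))
```

A point that lies in the plane gives a distance of zero, which is a handy sanity check.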

Vector/Cross Product

The cross product of two vectors produces another vector, and is defined as follows

$a \times b = ∣ a ∣∣ b ∣ \sin θ \, \hat{n} = \begin{vmatrix} i & j & k \\ a_{x} & a_{y} & a_{z} \\ b_{x} & b_{y} & b_{z} \end{vmatrix}$

$θ$ is the angle between the two vectors, and $\hat{n}$ is a unit vector perpendicular to both $a$ and $b$ . The right-hand rule convention dictates that $\hat{n}$ should always point up (ie, if $a$ and $b$ are your fingers, then $\hat{n}$ is your thumb). The cross product is not commutative, as $a \times b$ = $- (b \times a)$ .

The magnitude of the cross product $∣ a \times b ∣$ is equal to the area of the parallelogram formed by the two vectors.
Can be used to find a normal given 2 vectors/2 points in a plane

Angular Velocity

A spheroid rotates with angular velocity $ω$ . A point $A$ on the spheroid has velocity $v = ω \times A$

Matrices

Determinant & Inverse of a 2x2 Matrix

The determinant of a 2x2 matrix:

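$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = a d - b c$

The inverse:

$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{- 1} = \frac{1}{a d - b c} \begin{pmatrix} d & - b \\ - c & a \end{pmatrix}$

The inverse of a matrix $M$ only exists where $\det M \neq 0$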

Minors & Cofactors

There is a matrix minor corresponding to each element of a matrix
The minor is calculated by
- ignoring the values on the current row and column
- calculate the determinant of the remaining 2x2 matrix

Example:

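$M = \begin{pmatrix} 3 & 0 & 2 \\ 2 & 0 & - 2 \\ 0 & 1 & 1 \end{pmatrix}$

The minor of the top left corner is:

$\begin{vmatrix} 0 & - 2 \\ 1 & 1 \end{vmatrix} = 2$

The cofactor is the minor multiplied by its correct sign. The signs form a checkerboard pattern:

$\begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix}$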

The matrix of cofactors is denoted $C$ .

Determinant of a 3x3 Matrix

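The determinant of a 3x3 matrix is calculated by multiplying each element in one row/column by its cofactor, then summing them. For the matrix:

$M = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$

$\det M = a \cdot \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \cdot \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \cdot \begin{vmatrix} d & e \\ g & h \end{vmatrix}$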

This shows the expansion of the top row, but any column or row will produce the same result.

Inverse of a 3x3 Matrix

Calculate matrix of minors
Calculate matrix of cofactors $C$
Transpose $C^{T}$
Multiply by 1 over determinant

$M^{- 1} = \frac{1}{det M} C^{T}$

Example

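$M = \begin{pmatrix} 3 & 0 & 2 \\ 2 & 0 & - 2 \\ 0 & 1 & 1 \end{pmatrix}$

The cofactors (minors with their signs applied) are:

$C_{11} = + \begin{vmatrix} 0 & - 2 \\ 1 & 1 \end{vmatrix} = 2 \quad C_{12} = - \begin{vmatrix} 2 & - 2 \\ 0 & 1 \end{vmatrix} = - 2 \quad C_{13} = + \begin{vmatrix} 2 & 0 \\ 0 & 1 \end{vmatrix} = 2$

$C_{21} = - \begin{vmatrix} 0 & 2 \\ 1 & 1 \end{vmatrix} = 2 \quad C_{22} = + \begin{vmatrix} 3 & 2 \\ 0 & 1 \end{vmatrix} = 3 \quad C_{23} = - \begin{vmatrix} 3 & 0 \\ 0 & 1 \end{vmatrix} = - 3$

$C_{31} = + \begin{vmatrix} 0 & 2 \\ 0 & - 2 \end{vmatrix} = 0 \quad C_{32} = - \begin{vmatrix} 3 & 2 \\ 2 & - 2 \end{vmatrix} = 10 \quad C_{33} = + \begin{vmatrix} 3 & 0 \\ 2 & 0 \end{vmatrix} = 0$

The transposed matrix of cofactors $C^{T}$ is therefore:

$C^{T} = \begin{pmatrix} 2 & 2 & 0 \\ - 2 & 3 & 10 \\ 2 & - 3 & 0 \end{pmatrix}$

Expanding by the bottom row to calculate the determinant (two of its cofactors are zero, so it's an easy calculation): $\det M = 0 \times 0 + 1 \times 10 + 1 \times 0 = 10$

Calculating inverse:

$M^{- 1} = \frac{1}{\det M} C^{T} = \frac{1}{10} \begin{pmatrix} 2 & 2 & 0 \\ - 2 & 3 & 10 \\ 2 & - 3 & 0 \end{pmatrix} = \begin{pmatrix} 0.2 & 0.2 & 0 \\ - 0.2 & 0.3 & 1 \\ 0.2 & - 0.3 & 0 \end{pmatrix}$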
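The four steps (minors, cofactors, transpose, divide by the determinant) can be sketched in Python as below. This assumes the example matrix is read row by row as $(3, 0, 2)$ , $(2, 0, - 2)$ , $(0, 1, 1)$ ; the function names are illustrative, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

def minor(m, i, j):
    """Determinant of the 2x2 matrix left after deleting row i and column j."""
    rows = [row for k, row in enumerate(m) if k != i]
    sub = [[v for col, v in enumerate(row) if col != j] for row in rows]
    return sub[0][0] * sub[1][1] - sub[0][1] * sub[1][0]

def cofactors(m):
    """Matrix of cofactors: each minor with its checkerboard sign applied."""
    return [[(-1) ** (i + j) * minor(m, i, j) for j in range(3)] for i in range(3)]

def inverse_3x3(m):
    """Inverse as (1 / det M) * C^T, raising an error if det M is zero."""
    c = cofactors(m)
    det = sum(m[0][j] * c[0][j] for j in range(3))  # expand along the top row
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    # Entry (i, j) of the inverse is cofactor (j, i) divided by det (transpose).
    return [[Fraction(c[j][i], det) for j in range(3)] for i in range(3)]

# Assumed reading of the example matrix, row by row.
M = [[3, 0, 2], [2, 0, -2], [0, 1, 1]]
for row in inverse_3x3(M):
    print([str(v) for v in row])
```

Multiplying the result by the original matrix should give the identity, which is an easy way to check an inverse found by hand.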

Simultaneous Linear Equations

There are several methods for solving systems of simultaneous linear equations. All the examples shown are for 3 variables, but can easily be expanded to $n$ variables.

Cramer's Rule

For a system of 3 equations:

Calculate the determinant $Δ$ of the matrix of coefficients
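Calculate determinants $Δ_{1}, Δ_{2}, ..., Δ_{n}$ by replacing 1 column of the matrix with the constants from the right hand side
Use determinants to calculate unknowns

$\begin{aligned} a_{1} x + b_{1} y + c_{1} z &= d_{1} \\ a_{2} x + b_{2} y + c_{2} z &= d_{2} \\ a_{3} x + b_{3} y + c_{3} z &= d_{3} \end{aligned}$

$\begin{pmatrix} a_{1} & b_{1} & c_{1} \\ a_{2} & b_{2} & c_{2} \\ a_{3} & b_{3} & c_{3} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} d_{1} \\ d_{2} \\ d_{3} \end{pmatrix}$

$Δ = \begin{vmatrix} a_{1} & b_{1} & c_{1} \\ a_{2} & b_{2} & c_{2} \\ a_{3} & b_{3} & c_{3} \end{vmatrix} \quad Δ_{1} = \begin{vmatrix} d_{1} & b_{1} & c_{1} \\ d_{2} & b_{2} & c_{2} \\ d_{3} & b_{3} & c_{3} \end{vmatrix}$

$Δ_{2} = \begin{vmatrix} a_{1} & d_{1} & c_{1} \\ a_{2} & d_{2} & c_{2} \\ a_{3} & d_{3} & c_{3} \end{vmatrix} \quad Δ_{3} = \begin{vmatrix} a_{1} & b_{1} & d_{1} \\ a_{2} & b_{2} & d_{2} \\ a_{3} & b_{3} & d_{3} \end{vmatrix}$

$x = \frac{Δ_{1}}{Δ} \quad y = \frac{Δ_{2}}{Δ} \quad z = \frac{Δ_{3}}{Δ}$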
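Cramer's rule for a 3x3 system can be sketched in Python as follows. The function names are illustrative, and the system solved here ( $4 x + y + z = 9$ , $x + 5 y + z = 14$ , $x + y + 3 z = 12$ ) is just a convenient test case.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix, expanding along the top row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer(coeffs, rhs):
    """Solve a 3x3 system using Cramer's rule; returns [x, y, z]."""
    delta = det3(coeffs)
    if delta == 0:
        raise ValueError("determinant is zero, no unique solution")
    solution = []
    for col in range(3):
        # Replace one column of the coefficient matrix with the constants.
        replaced = [
            row[:col] + [rhs[r]] + row[col + 1:]
            for r, row in enumerate(coeffs)
        ]
        solution.append(Fraction(det3(replaced), delta))
    return solution

coeffs = [[4, 1, 1], [1, 5, 1], [1, 1, 3]]
rhs = [9, 14, 12]
print([str(v) for v in cramer(coeffs, rhs)])
```

This is fine for small systems, but it needs a separate determinant per unknown, so it gets expensive quickly as the system grows.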

Matrix Inversion

For a system of equations in matrix form $M x = a$ , the solution $x$ is given by

$x = M^{- 1} a$

The system has no unique solution if $\det M = 0$

Gaussian Elimination

Eliminating variables from equations one at a time to give a solution. Generally speaking, for a system of 3 equations

$\begin{aligned} a_{1} x + b_{1} y + c_{1} z &= d_{1} \quad (1) \\ a_{2} x + b_{2} y + c_{2} z &= d_{2} \quad (2) \\ a_{3} x + b_{3} y + c_{3} z &= d_{3} \quad (3) \end{aligned}$

First, eliminate x from $(2)$ and $(3)$ $a_{1} (2) - a_{2} (1) = (2) (new)$ $a_{1} (3) - a_{3} (1) = (3) (new)$

This gives (the coefficients in the new equations take new values, but the same symbols are reused for brevity)

$\begin{aligned} a_{1} x + b_{1} y + c_{1} z &= d_{1} \quad (1) \\ b_{2} y + c_{2} z &= d_{2} \quad (2) \\ b_{3} y + c_{3} z &= d_{3} \quad (3) \end{aligned}$

Then, eliminate y from $(3)$ $b_{2} (3) - b_{3} (2) = (3) (new)$

Giving

$\begin{aligned} a_{1} x + b_{1} y + c_{1} z &= d_{1} \quad (1) \\ b_{2} y + c_{2} z &= d_{2} \quad (2) \\ c_{3} z &= d_{3} \quad (3) \end{aligned}$

This gives a solution for $z$ , which can then be back-substituted to find the solutions for $x$ and $y$ .

$z = \frac{d_{3}}{c_{3}}$ $y = \frac{d_{2}}{b_{2}} - \frac{c_{2}}{b_{2}} z$ $x = \frac{d_{1}}{a_{1}} - \frac{b_{1}}{a_{1}} y - \frac{c_{1}}{a_{1}} z$

The advantages of this method are:

No need for matrices (yay)
Works for homogeneous and inhomogeneous systems
The matrix need not be square
Works for any size of system if a solution exists

Sometimes, the solution can end up being in a parametric form, for example:

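$\begin{aligned} 4 x + y + z &= 9 \quad (1) \\ x + y + z &= 6 \quad (2) \\ x + 2 y + 2 z &= 11 \quad (3) \end{aligned}$

$\begin{aligned} 4 x + y + z &= 9 \quad (1) \\ 3 y + 3 z &= 15 \quad 4 (2) - 1 (1) \\ 7 y + 7 z &= 35 \quad 4 (3) - 1 (1) \end{aligned}$

$\begin{aligned} 4 x + y + z &= 9 \quad (1) \\ 3 y + 3 z &= 15 \quad 4 (2) - 1 (1) \\ 0 z &= 0 \quad 3 (3) - 7 (2) \end{aligned}$

The final equation is satisfied for any value of $z$ , so there is no unique solution. Substituting a parameter $λ$ for $z$ gives:

$z = λ \quad y = 5 - λ \quad x = 1$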
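The elimination and back-substitution steps described in this section can be sketched in Python. This is only one possible arrangement: it subtracts a multiple of the pivot row rather than cross-multiplying as above (the result is the same), swaps rows if a pivot is zero, and simply raises an error when there is no unique solution instead of returning a parametric form. All names are illustrative.

```python
from fractions import Fraction

def gaussian_elimination(coeffs, rhs):
    """Solve a square linear system by elimination and back-substitution."""
    n = len(rhs)
    # Build the augmented matrix [coeffs | rhs] with exact arithmetic.
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(coeffs, rhs)]

    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this variable from every equation below the pivot row.
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]

    # Back-substitute, starting from the last equation.
    solution = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution

print(gaussian_elimination([[4, 1, 1], [1, 5, 1], [1, 1, 3]], [9, 14, 12]))
```

Feeding in the parametric example above ( $4 x + y + z = 9$ , $x + y + z = 6$ , $x + 2 y + 2 z = 11$ ) makes this sketch raise `ValueError`, because the last row reduces to $0 z = 0$ .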

Gauss-Seidel Iteration

Iterative methods involve starting with a guess, then making closer and closer approximations to the solution. If the iterations tend towards a limit, then the scheme converges and the limit will be a solution. If the iterations diverge, this scheme does not give a solution. For the Gauss-Seidel scheme:

$\begin{aligned} a_{1} x + b_{1} y + c_{1} z &= d_{1} \quad (1) \\ a_{2} x + b_{2} y + c_{2} z &= d_{2} \quad (2) \\ a_{3} x + b_{3} y + c_{3} z &= d_{3} \quad (3) \end{aligned}$

Rearrange to get iterative formulae:

$\begin{aligned} x^{r + 1} &= (d_{1} - b_{1} y^{r} - c_{1} z^{r}) / a_{1} \quad (1) \\ y^{r + 1} &= (d_{2} - a_{2} x^{r + 1} - c_{2} z^{r}) / b_{2} \quad (2) \\ z^{r + 1} &= (d_{3} - a_{3} x^{r + 1} - b_{3} y^{r + 1}) / c_{3} \quad (3) \end{aligned}$

Using these formulae, make a guess at a starting value and then continue to iterate. For example:

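$\begin{aligned} 4 x + 1 y + 1 z &= 9 \quad (1) \\ 1 x + 5 y + 1 z &= 14 \quad (2) \\ 1 x + 1 y + 3 z &= 12 \quad (3) \end{aligned}$

Rearranging:

$\begin{aligned} x^{r + 1} &= (9 - y^{r} - z^{r}) / 4 \quad (1) \\ y^{r + 1} &= (14 - x^{r + 1} - z^{r}) / 5 \quad (2) \\ z^{r + 1} &= (12 - x^{r + 1} - y^{r + 1}) / 3 \quad (3) \end{aligned}$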

The solutions are $x = 1$ , $y = 2$ , $z = 3$ , as can be seen from the table below containing the iterations:

r	x	y	z
0	0	0	0
1	2.25	2.35	2.467
2	1.046	2.098	2.952
3	0.988	2.012	3.000
4	0.997	2.001	3.001

Note that this is only guaranteed to converge if the system is diagonally dominant. For a system to be diagonally dominant, the magnitude of the divisor in each iterative equation (the diagonal coefficient) must be greater than the sum of the magnitudes of the other coefficients in that equation.

$\begin{aligned} 4 x + 1 y + 1 z &= 9 \\ 1 x + 5 y + 1 z &= 14 \\ 1 x + 1 y + 3 z &= 12 \end{aligned}$

Systems can be rearranged to have this property:

$\begin{aligned} 2 x + 7 y + 1 z &= 5 \\ - 1 x + 3 y + 8 z &= 2 \\ - 6 x + 2 y + 1 z &= - 3 \end{aligned}$

Rearranges to:

$\begin{aligned} - 6 x + 2 y + 1 z &= - 3 \\ 2 x + 7 y + 1 z &= 5 \\ - 1 x + 3 y + 8 z &= 2 \end{aligned}$

Differentiation

Implicit Differentiation

When differentiating a function of one variable with respect to another (ie $\frac{d}{d x} f (y)$ ), simply differentiate with respect to $y$ , then multiply by $\frac{d y}{d x}$ .

For example, find $\frac{d y}{d x}$ where $x^{2} y^{3} - x^{2} + 3 y - 3 = 0$ . First, using the product rule to differentiate the first term: $u = x^{2}, \; u^{'} = 2 x$ $v = y^{3}, \; v^{'} = 3 y^{2} \frac{d y}{d x}$

The equation with all terms differentiated: $2 x y^{3} + 3 x^{2} y^{2} \frac{d y}{d x} - 2 x + 3 \frac{d y}{d x} = 0$

Rearranging to get in terms of $\frac{d y}{d x}$ : $3 x^{2} y^{2} \frac{d y}{d x} + 3 \frac{d y}{d x} = 2 x - 2 x y^{3}$ $\frac{d y}{d x} = \frac{2 x - 2 x y^{3}}{3 x^{2} y^{2} + 3}$

Inverse Trig Functions

All the derivatives of the inverse trig functions are given in the data book. They can be derived as follows ( $sin$ is used as an example).

$y = sin^{- 1} x$ $sin y = x$

Differentiating both sides with respect to x

$\frac{d y}{d x} cos y = 1$ $\frac{d y}{d x} = \frac{1}{cos y}$

Using the Pythagorean identity $\cos y = \sqrt{1 - \sin^{2} y}$

$\frac{d y}{d x} = \frac{1}{\sqrt{1 - \sin^{2} y}} = \frac{1}{\sqrt{1 - x^{2}}}$

Differentials

Differentials describe small changes to values/functions $y = f (x)$ $\frac{d y}{d x} = f^{'} (x)$ $d y = f^{'} (x) d x$

Recall that $\frac{d y}{d x} \approx \frac{δy}{δ x}$ . This means this can be rewritten: $δy \approx f^{'} (x) δ x$

Dividing both sides by $y = f (x)$ : $\frac{δy}{y} = \frac{x f ^{'} ( x )}{f ( x )} \frac{δ x}{x}$

$\frac{δy}{y}$ represents a relative change in y, and $\frac{δ x}{x}$ represents a relative change in x. This can be used to give approximations of how one quantity changes based upon another.

For example, given the mass of a sphere $M = \frac{4}{3} ρ π r^{3}$ , where $ρ$ is the material density, estimate the change in mass when the radius is increased by 2%. $M = \frac{4}{3} ρ π r^{3}$ $\frac{d M}{d r} = 4 ρ π r^{2}$ $δ M \approx 4 ρ π r^{2} δr$

Dividing both sides by the original formula: $\frac{δ M}{M} = \frac{4 ρ π r ^{2} δr}{\frac{4}{3} ρ π r ^{3}} = 3 \frac{δr}{r}$

$\frac{δr}{r}$ represents a relative change in radius, so when $r$ increases by 2%, $\frac{δr}{r} = 0.02$ $\frac{δ M}{M} = 3 \times 0.02 = 0.06$

Meaning the mass increases by 6%.
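It is worth seeing how good this approximation is. The short Python sketch below compares the estimate with the exact relative change, using the fact that $M \propto r^{3}$ so the density and constants cancel. The variable names are illustrative.

```python
relative_change_in_r = 0.02  # radius increases by 2%

# Estimate from differentials: dM/M is approximately 3 * dr/r.
approx = 3 * relative_change_in_r

# Exact relative change, since M is proportional to r cubed.
exact = (1 + relative_change_in_r) ** 3 - 1

print(f"approximate: {approx:.4%}, exact: {exact:.4%}")
```

The two agree closely for small changes, but the approximation gets worse as the relative change in $r$ grows.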

Hyperbolic Functions

Hyperbolic functions have similar identities to circular trig functions. They're the same, except anywhere there is a product of two $sinh$ s, the term should be negated. Hyperbolic functions can also be defined in terms of exponential functions, making them easy to differentiate.

$y = cosh x = \frac{1}{2} (e^{x} + e^{- x})$ $\frac{d y}{d x} = \frac{1}{2} (e^{x} - e^{- x}) = sinh x$

All the derivatives of hyperbolic functions are given in the formula book.

Parametric Differentiation

For a function given in parametric form $y = f (t)$ , $x = g (t)$ : $\frac{d y}{d x} = \frac{d y}{d t} \times \frac{d t}{d x}$ $\frac{d}{d x} = \frac{d t}{d x} \times \frac{d}{d t}$

$\frac{d ^{2} y}{d x ^{2}} = \frac{d}{d x} (\frac{d y}{d x}) = \frac{d t}{d x} \times \frac{d}{d t} (\frac{d y}{d x})$

Partial Differentiation

For a function of two variables $z = f (x, y)$ there are two gradients at the point $z$ , one in $x$ and one in $y$ . To find the gradient in the x direction, differentiate $f (x, y)$ treating y as a constant. To find the gradient in the y direction, differentiate $f (x, y)$ treating x as a constant. These are the two partial derivatives of the function, $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ .

For example, for a function $z = 4 x^{3} + 5 y^{7} x$ : $\frac{\partial z}{\partial x} = 12 x^{2} + 5 y^{7}$ $\frac{\partial z}{\partial y} = 35 y^{6} x$

Implicit Partial Differentiation

When a function of several variables is given and a partial derivative is required, differentiate the numerator of the partial derivative implicitly with respect to the denominator, and treat the third variable as constant. For example, find $\frac{\partial z}{\partial y}$ given $z^{2} = x^{2} + y^{2}$ :

$\frac{\partial}{\partial y} z^{2} = \frac{\partial}{\partial y} (x^{2} + y^{2})$

$2 z \frac{\partial z}{\partial y} = 2 y$ $\frac{\partial z}{\partial y} = \frac{y}{z}$

Another example, find $\frac{\partial z}{\partial x}$ given $z cos z = x^{2} y^{3} + z$ $\frac{\partial}{\partial x} (z cos z) = \frac{\partial}{\partial x} (x^{2} y^{3} + z)$ $\frac{\partial z}{\partial x} cos z - z sin z \frac{\partial z}{\partial x} = 2 x y^{3} + \frac{\partial z}{\partial x}$ $\frac{\partial z}{\partial x} (cos z - z sin z - 1) = 2 x y^{3}$ $\frac{\partial z}{\partial x} = \frac{2 x y ^{3}}{cos z - z sin z - 1}$

Higher Order Partial Derivatives

Three 2nd order derivatives for functions of 2 variables. For $z = f (x, y)$ : $\frac{\partial ^{2} z}{\partial x ^{2}} = \frac{\partial}{\partial x} (\frac{\partial z}{\partial x})$ $\frac{\partial ^{2} z}{\partial y ^{2}} = \frac{\partial}{\partial y} (\frac{\partial z}{\partial y})$ $\frac{\partial ^{2} z}{\partial x \partial y} = \frac{\partial}{\partial x} (\frac{\partial z}{\partial y}) = \frac{\partial}{\partial y} (\frac{\partial z}{\partial x}) = \frac{\partial ^{2} z}{\partial y \partial x}$

Note how for the last one, the order is interchangable as it yields the same result.

Chain Rule

The chain rule for a function $w (x, y)$ , where x and y are functions of a parameter $t$ : $\frac{d w}{d t} = \frac{\partial w}{\partial x} \frac{d x}{d t} + \frac{\partial w}{\partial y} \frac{d y}{d t}$

Total Differential

The total differential represents the total height gain or lost when moving along the function described by $z = f (x, y)$ $d z = \frac{\partial f}{\partial x} d x + \frac{\partial f}{\partial y} d y$

Contour Plots

Along a line of a contour plot, the total differential is zero: the height doesn't change. This allows $\frac{d y}{d x}$ to be found $d h = \frac{\partial h}{\partial x} d x + \frac{\partial h}{\partial y} d y = 0$ $\frac{d y}{d x} = - \frac{\partial h / \partial x}{\partial h / \partial y}$

Integration

Integration by Parts

When an integral is a product of two functions (ie $\int e^{x} \sin x \, d x$ ), it can be integrated by parts:

$\int u \frac{d v}{d x} d x = uv - \int v \frac{d u}{d x} d x$

(see also the DI method)

Improper Integrals

An integral is improper if either

One of its limits is infinity
The function is not defined at some point within the interval (bounds inclusive)

To evaluate these integrals, replace the dodgy bound with a variable $t$ , evaluate the integral in terms of the variable, and then take the limit as the variable tends towards the bound.

$\int_{a}^{\infty} f (x) d x = \lim_{t \to \infty} \int_{a}^{t} f (x) d x$

Where functions are not continuous over the interval, may need to split the function into two integrals. For example, if $f (x)$ is not continuous at $x = c$ where $a < c < b$ , then: $\int_{a}^{b} f (x) d x = \int_{a}^{c} f (x) d x + \int_{c}^{b} f (x) d x$

Reduction Formulae

Reduction formulae involve rewriting an integral in terms of itself to get a recurrence relation. They usually involve some variable $n$ as well as other variables in the integral ( $x$ ). For example, integrating $I_{n} = \int_{0}^{\infty} x^{n} e^{- x} d x$ :

By parts:

$u = x^{n} \frac{d v}{d x} = e^{- x}$ $\frac{d u}{d x} = n x^{n - 1} v = - e^{- x}$

$\int u \frac{d v}{d x} d x = uv - \int v \frac{d u}{d x} d x$ $\int_{0}^{\infty} x^{n} e^{- x} d x = [- x^{n} e^{- x}]_{0}^{\infty} + n \int_{0}^{\infty} x^{n - 1} e^{- x} d x$ The bracketed term is zero at both limits (for $n \geq 1$ ), so $I_{n} = n \cdot I_{n - 1}$

Note how the integral is now in terms of itself, but with $n - 1$ . This creates a recursive definition that can be expanded to evaluate $I_{5}$

$I_{5} = 5 I_{4} = 5 \times 4 I_{3} = 5 \times 4 \times 3 I_{2} = 5 \times 4 \times 3 \times 2 \times 1 \times I_{0} = 120 I_{0}$ $I_{0} = \int_{0}^{\infty} x^{0} e^{- x} d x = \int_{0}^{\infty} e^{- x} d x = (- e^{- \infty} + e^{0}) = 1$ $I_{5} = 5! = 120$
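The recurrence maps directly onto a recursive function. The Python sketch below also includes a crude numerical estimate of the integral as a cross-check; the function names, the upper limit of 60 and the step count are arbitrary illustrative choices.

```python
import math

def reduction_integral(n):
    """I_n from the recurrence I_n = n * I_(n-1), with I_0 = 1."""
    return 1 if n == 0 else n * reduction_integral(n - 1)

def numeric_integral(n, upper=60.0, steps=60000):
    """Midpoint-rule estimate of the integral of x^n e^-x from 0 to `upper`.

    The integrand is negligible beyond `upper` for small n, so the
    infinite upper limit is simply truncated.
    """
    h = upper / steps
    return h * sum(
        ((k + 0.5) * h) ** n * math.exp(-(k + 0.5) * h) for k in range(steps)
    )

print(reduction_integral(5), round(numeric_integral(5), 4))
```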

Integration by Substitution

Substitution is often useful in solving integrals.

Choose a new function $u (x)$
Find $\frac{d u}{d x}$
Substitute $u$ in
Swap $d x$ for $d u$
Put limits in terms of $u$ (if appropriate)
Solve with respect to u

Choosing a function $u$ to substitute depends on the integral, and there are certain patterns to spot which make it easier.

Example

$\int_{0}^{1} \frac{y ^{2}}{1 + y ^{6}} d y$

Substituting $u = y^{3}$ : $d u = 3 y^{2} d y$ $\frac{d u}{3} = y^{2} d y$

$\frac{1}{3} \int_{y = 0}^{y = 1} \frac{d u}{1 + u^{2}}$ Substituting the limits: $y = 1 \Rightarrow u = 1^{3} = 1$ $y = 0 \Rightarrow u = 0^{3} = 0$ The integral becomes: $\frac{1}{3} \int_{0}^{1} \frac{d u}{1 + u^{2}} = \frac{1}{3} [\arctan u]_{0}^{1} = \frac{1}{3} [\arctan 1 - \arctan 0] = \frac{π}{12}$
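A result like this is easy to sanity check numerically. Below is a minimal Python sketch using Simpson's rule on the original integrand; the `simpson` helper and its step count are illustrative choices, not part of the module.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule estimate of the integral of f from a to b."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of strips
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

value = simpson(lambda y: y ** 2 / (1 + y ** 6), 0.0, 1.0)
print(value, math.pi / 12)
```

The numerical estimate should agree with $\frac{π}{12}$ to many decimal places.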

$tan$ Substitutions

There are two standard $tan$ substitutions that can be really useful when integrating trig functions.

$t = tan \frac{x}{2}$ Subs

The first one: $t = \tan \frac{x}{2}, \quad dx = \frac{2\,dt}{1 + t^{2}}$ $\cos x = \frac{1 - t^{2}}{1 + t^{2}}, \quad \sin x = \frac{2t}{1 + t^{2}}, \quad \tan x = \frac{2t}{1 - t^{2}}$

For example:

$\int cosec x d x = \int \frac{1}{sin x} d x$

Letting $t = \tan \frac{x}{2}$ : $dx = \frac{2\,dt}{1 + t^{2}}, \quad \sin x = \frac{2t}{1 + t^{2}}$ $\int \frac{1 + t^{2}}{2t} \cdot \frac{2\,dt}{1 + t^{2}} = \int \frac{dt}{t} = \ln |t| + c = \ln |\tan \frac{x}{2}| + c$

$T = tan x$ Subs

$T = \tan x, \quad dx = \frac{dT}{1 + T^{2}}$ $\sin x = \frac{T}{\sqrt{1 + T^{2}}}, \quad \cos x = \frac{1}{\sqrt{1 + T^{2}}}$

For example: $\int \frac{d x}{4 cos ^{2} x - sin ^{2} x}$

Letting $T = \tan x$ : $\sin^{2} x = \frac{T^{2}}{1 + T^{2}}, \quad \cos^{2} x = \frac{1}{1 + T^{2}}, \quad dx = \frac{dT}{1 + T^{2}}$

$\int \frac{d x}{4 cos ^{2} x - sin ^{2} x} = \int \frac{\frac{d T}{1 + T ^{2}}}{\frac{4}{1 + T ^{2}} - \frac{T ^{2}}{1 + T ^{2}}} = \int \frac{\frac{d T}{1 + T ^{2}}}{\frac{4 - T ^{2}}{1 + T ^{2}}} = \int \frac{d T}{4 - T ^{2}}$ $= \frac{1}{2} tanh^{- 1} \frac{T}{2} + C = \frac{1}{2} tanh^{- 1} \frac{tan x}{2} + C = \frac{1}{4} ln \frac{2 + tan x}{2 - tan x} + C$

Standard Forms

Integrals will sometimes be (or can be put into) standard forms which then evaluate directly to inverse trig or inverse hyperbolic functions. The full list is given in the data book, but two common ones are:

$\int \frac{du}{\sqrt{a^{2} - u^{2}}} = \arcsin \frac{u}{a} + c$ $\int \frac{du}{a^{2} + u^{2}} = \frac{1}{a} \arctan \frac{u}{a} + c$

Example

$\int_{2}^{5} \frac{dx}{\sqrt{x^{2} + 2x - 8}} = \int_{2}^{5} \frac{dx}{\sqrt{(x + 1)^{2} - 9}}$

Substituting $u = x + 1$ and using the standard form $\int \frac{du}{\sqrt{u^{2} - a^{2}}} = \cosh^{-1} \frac{u}{a} + c$ :

$\int_{3}^{6} \frac{du}{\sqrt{u^{2} - 9}} = [\cosh^{-1} \frac{u}{3}]_{3}^{6} = \cosh^{-1} 2 - \cosh^{-1} 1 = \cosh^{-1} 2$

Trigonometric Identities

Trig identities are often useful in evaluating integrals, for example:

$\int sin 4 x cos 3 x d x$

Using $2 \sin A \cos B = \sin (A + B) + \sin (A - B)$ :

$\int \sin 4x \cos 3x\,dx = \frac{1}{2} \int (\sin 7x + \sin x)\,dx = \frac{1}{2} (-\frac{1}{7} \cos 7x - \cos x) + c$

$\int \sin 4x \cos 3x\,dx = -\frac{1}{14} \cos 7x - \frac{1}{2} \cos x + c$

Integration as a Limit

The area under a curve $f (x)$ from $a \leq x \leq b$ is given by: $\int_{a}^{b} f (x) d x$ This can be approximated by dividing the area under the curve into a number of rectangles:

For $n$ rectangles over the width $b - a$ , the width of each rectangle is $\delta x = \frac{b - a}{n}$ . The area of the $k$-th rectangle is therefore given by $y(x_{k}) \cdot \delta x$ . The sum of all the rectangles, and therefore the total area, is: $\sum_{k=1}^{n} y(x_{k})\,\delta x$

As $n \to \infty$ , $\delta x \to 0$ , so: $\int_{a}^{b} y(x)\,dx = \lim_{\delta x \to 0} \sum_{x=a}^{x=b} y(x)\,\delta x$
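The limit can be seen numerically. The sketch below is an illustration only: it sums $n$ right-endpoint rectangles for the assumed test function $y = x^{2}$ on $[0, 3]$, whose exact integral is 9. The function names are hypothetical.

```python
def riemann_sum(f, a, b, n):
    """Sum of n right-endpoint rectangles approximating the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + k * dx) * dx for k in range(1, n + 1))


def square(x):
    return x * x


# The approximation approaches the exact area (9) as the rectangles get narrower.
for n in (10, 100, 1000, 10000):
    print(n, riemann_sum(square, 0.0, 3.0, n))
```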

Volumes of Revolution

For a function $y(x)$ rotated 360 degrees about the x axis, consider a disc of width $\delta x$ and radius $y$. The volume of the disc is given by $\pi y^{2} \delta x$ . The total volume of all the slices is $\sum_{x=a}^{x=b} \pi y^{2}\,\delta x$ , and taking the limit as $\delta x \to 0$ turns the sum into an integral.

Therefore the volume of revolution for a function $y (x)$ about the x axis is $V = π \int_{a}^{b} y^{2} d x$

Volume of revolution about y axis:

$V = π \int_{a}^{b} x^{2} d y$

Centres of Mass for Planar Objects

The centre of mass is the point through which gravity acts. In 1 dimension:

The sum of the moments about 0 is $m_{1} x_{1} + m_{2} x_{2}$ . The moment of the total mass is $\bar{x} (m_{1} + m_{2})$ . Equating these: $\bar{x} = \frac{m_{1} x_{1} + m_{2} x_{2}}{m_{1} + m_{2}} = \frac{\sum m_{i} x_{i}}{\sum m_{i}} = \frac{\text{sum of moments}}{\text{total mass}}$

This can be expanded into 2 dimensions: $\bar{x} = \frac{\sum m_{i} x_{i}}{\sum m_{i}}, \quad \bar{y} = \frac{\sum m_{i} y_{i}}{\sum m_{i}}$

For the centre of mass of an infinitely thin sheet with uniformly distributed mass, let $m$ be the mass per unit area and $M$ the total mass. To find $\bar{x}$ , consider thin vertical slices of width $\delta x$ .

Area of slice = $y \cdot \delta x$
Mass of slice = $m \cdot y \cdot \delta x$
Moment of slice about y-axis = $x \cdot m \cdot y \cdot \delta x$
Sum of all moments as $\delta x \to 0$ = $\int_{0}^{a} (m \cdot x \cdot y)\,dx$

$\bar{x} = \frac{\int_{0}^{a} m x y\,dx}{M}$

For the sum of the moments about the x-axis, take a horizontal slice of width $\delta y$ and length $(a - x)$

Area of slice = $(a - x) \cdot \delta y$
Mass of slice = $m \cdot (a - x) \cdot \delta y$
Moment of slice about x-axis = $y \cdot m \cdot (a - x) \cdot \delta y$
Sum of all moments as $\delta y \to 0$ = $\int_{0}^{a} y m (a - x)\,dy$

$\bar{y} = \frac{\int_{0}^{a} m y (a - x)\,dy}{M}$

Note that $m$ is usually the mass per unit area, while the denominator $M$ is the total mass of the sheet.

Example

Find centre of mass of plane lamina shown

By symmetry, clearly $\bar{x} = 0$ . For $\bar{y}$ , let $m$ be the mass per unit area, and consider a horizontal strip of width $\delta y$ .

Area of strip is $2 x\,\delta y$
Mass of strip is $2 m x\,\delta y$
Moment of one strip about x axis is $2 m x y\,\delta y$

Total moment as $\delta y \to 0$ , using $x = \sqrt{4 a^{2} - y}$ on the boundary $y = 4 a^{2} - x^{2}$ :

$\int_{0}^{4 a^{2}} 2 m x y\,dy = 2 m \int_{0}^{4 a^{2}} y \sqrt{4 a^{2} - y}\,dy = \frac{256 m a^{5}}{15}$

For the total mass $M$ , first find the total area of the shape: $\int_{-2a}^{2a} y\,dx = \int_{-2a}^{2a} (4 a^{2} - x^{2})\,dx = \frac{32 a^{3}}{3}$

So total mass $M = m \times \text{area} = \frac{32 m a^{3}}{3}$

$\bar{y} = \frac{\text{total moment}}{M} = \frac{256 m a^{5}}{15} \div \frac{32 m a^{3}}{3} = \frac{8 a^{2}}{5}$

Moments of Inertia for Laminae

The moment of inertia $I$ is a measure of how difficult it is to rotate an object. Suppose a lamina is divided into a large number of small elements, each with mass $\delta m$ at distance $r$ from the origin $O$ . The moment of inertia of one element is defined to be $r^{2} \delta m$ . Taking the sum of all moments as $\delta m \to 0$ :

$I = \int r^{2} d m$

The bounds of the integral should be chosen appropriately such as to include the entire lamina.

For a lamina lying in the x-y plane, the moment of inertia about z-axis is the sum of the moments about x and y axes.
- $I_{oz} = I_{o x} + I_{oy}$
For an axis $L^{'}$ parallel to $L$ at a distance $d$ , both lying in the same plane as the lamina with mass $M$ , where $L$ passes through the centre of mass of the lamina:
- $I_{L^{'}} = I_{L} + M d^{2}$

Example

Find the moment of inertia of a thin rectangular plate of mass $M$ , length $2 a$ and width $2 b$ about an axis through its centre of gravity which is normal to its plane.

Assuming the plate lies in the x-y plane, the question is asking for the moment about the z-axis. To find this, the moments about both x and y axes are required as $I_{oz} = I_{ox} + I_{oy}$ . To find $I_{oy}$ :

Let the mass per unit area $m = \frac{M}{4 ab}$
A strip of width $δ x$ at distance $x$ from $O y$ has mass $m \cdot 2 b \cdot δ x$
The moment of inertia of the strip is $x^{2} \cdot m \cdot 2 b \cdot δ x$

Taking the limit of the sum of all the strips: $I_{oy} = \int_{-a}^{a} 2 b m x^{2}\,dx = \frac{4 b m a^{3}}{3}$

As $m = \frac{M}{4ab}$ , $I_{oy} = \frac{4 b a^{3}}{3} \cdot \frac{M}{4ab} = \frac{M a^{2}}{3}$

$I_{ox}$ is derived identically and equals $\frac{M b^{2}}{3}$ . Summing the two moments gives:

$I_{oz} = \frac{M a ^{2}}{3} + \frac{M b ^{2}}{3} = \frac{M}{3} (a^{2} + b^{2})$

Lengths of Curves

The length of the arc of a curve $y(x)$ between $x = a$ and $x = b$ is given by $\int_{a}^{b} \sqrt{1 + (\frac{dy}{dx})^{2}}\,dx$

Alternatively, for parametrised curves: $\int_{t = t_{1}}^{t = t_{2}} \sqrt{(\frac{dx}{dt})^{2} + (\frac{dy}{dt})^{2}}\,dt$

Surface Areas of Revolution

Similar to volumes of revolution, the surface area of a function when rotated about the x axis is given by:

$A = 2 \pi \int_{a}^{b} y \sqrt{1 + (\frac{dy}{dx})^{2}}\,dx$

Example

The surface area of the parabola $y^{2} = 8x$ between $x = 0$ and $x = 2$ , when rotated about the x axis: $y = 2 \sqrt{2x}, \quad \frac{dy}{dx} = \sqrt{\frac{2}{x}}$

$A = 2 \pi \int_{0}^{2} 2 \sqrt{2x} \sqrt{1 + \frac{2}{x}}\,dx = 4 \sqrt{2} \pi \int_{0}^{2} \sqrt{x + 2}\,dx$ $A = \frac{64 \sqrt{2} \pi - 32 \pi}{3}$

Mean Values of a Function

For a function $f (x)$ over the interval $[a, b]$

Mean value: $\frac{1}{b - a} \int_{a}^{b} f (x) d x$

Root mean square value: $\sqrt{\frac{1}{b - a} \int_{a}^{b} (f(x))^{2}\,dx}$

Differential Equations

First Order

A first order differential equation has $\frac{dy}{dx}$ as its highest derivative. For the two methods below, it is important that the equation is first put into the form specified.

Separating Variables

For an equation of the form $\frac{d y}{d x} = f (x) g (y)$

The solution is $\int \frac{d y}{g ( y )} = \int f (x) d x$

Integrating Factors

For an equation of the form $\frac{d y}{d x} + P (x) y = Q (x)$

An integrating factor $μ$ can be found such that: $μ (x) = e^{\int P (x) d x}$

Multiplying through by $μ$ gives

$μ (x) \frac{d y}{d x} + μ (x) P (x) y = μ (x) Q (x)$

Then, applying the product rule in reverse to the left hand side gives a solution:

$\frac{d}{d x} (μ (x) y) = μ (x) Q (x)$ $μ (x) y = \int μ (x) Q (x) d x$
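As a short worked illustration of the method (the equation below is an assumed example, not one taken from the module), take $P(x) = 2$ and $Q(x) = e^{-x}$:

```latex
\begin{align*}
\frac{dy}{dx} + 2y &= e^{-x} \\
\mu(x) &= e^{\int 2\,dx} = e^{2x} \\
\frac{d}{dx}\left(e^{2x} y\right) &= e^{2x} e^{-x} = e^{x} \\
e^{2x} y &= e^{x} + C \\
y &= e^{-x} + C e^{-2x}
\end{align*}
```

Substituting back gives $\frac{dy}{dx} + 2y = -e^{-x} - 2Ce^{-2x} + 2e^{-x} + 2Ce^{-2x} = e^{-x}$, as required.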

Second Order

A second order ODE has the form: $a \frac{d ^{2} y}{d x ^{2}} + b \frac{d y}{d x} + cy = f (x)$

The equation is homogeneous if $f (x) = 0$ .

The auxiliary equation is $a k^{2} + bk + c = 0$

This gives two roots $k_{1}$ and $k_{2}$ , which determine the complementary function:

Roots	Complementary Function
$k_{1}$ and $k_{2}$ real and distinct	$y = A e^{k_{1} x} + B e^{k_{2} x}$
$k_{1} = k_{2} = k$ , real and repeated	$y = (A + B x) e^{k x}$
$k_{1} = \alpha + \beta i$ and $k_{2} = \alpha - \beta i$	$y = e^{\alpha x} (A \cos \beta x + B \sin \beta x)$

For a homogeneous equation, the complementary function is the general solution. Sometimes, initial conditions will be given which allow the constants $A$ and $B$ to be found.

Non-Homogeneous Systems

If the system is non-homogeneous, i.e. $f(x) \neq 0$ , then a particular integral is needed too, and the solution will have the form $y = \text{c.f.} + \text{p.i.}$ The particular integral is found by choosing a trial solution, then substituting it into the equation to find the coefficients. Note that if the trial solution takes the same form as a term in the complementary function, it must be multiplied by an extra $x$ for it to work, i.e. $a e^{k x}$ would become $a x e^{k x}$

$f (x)$	Trial Solution
const $k$	const $α$
polynomial $a x^{r} + ... + b x + c$	$α x^{r} + ... + β x + γ$
$a cos k x$ or $a sin k x$	$α cos k x + β sin k x$
$a e^{k x}$	$α e^{k x}$

Example

$\frac{d^{2} y}{dx^{2}} + 4y = \sin x, \quad y(0) = 1, \quad \frac{dy}{dx}(0) = 1$

Auxiliary equation: $k^{2} + 4 = 0 \Rightarrow k = \pm 2i$

Complementary function is therefore: $y = A cos 2 x + B sin 2 x$

The system is non-homogeneous, so a particular integral must be found. For this equation $f(x) = \sin x$ , so the trial solution is: $y = a \cos x + b \sin x$ $\frac{dy}{dx} = -a \sin x + b \cos x$ $\frac{d^{2} y}{dx^{2}} = -a \cos x - b \sin x$

Substituting this into the original equation: $\frac{d ^{2} y}{d x ^{2}} + 4 y = - a cos x - b sin x + 4 a cos x + 4 b sin x = sin x$

Comparing coefficients: $cos x : 3 a = 0 \Rightarrow a = 0$ $sin x : 3 b = 1 \Rightarrow b = 1/3$

The general solution is therefore: $y = c . f . + p . i . = A cos 2 x + B sin 2 x + \frac{1}{3} sin x$

Using initial conditions to find constants, for $y (0) = 1$ $1 = A cos 0 + B sin 0 + \frac{1}{3} sin 0 \Rightarrow A = 1$

For $\frac{d y}{d x} (0) = 1$ $\frac{d y}{d x} = - 2 A sin 2 x + 2 B cos 2 x + \frac{1}{3} cos x$ $1 = - 2 A sin 0 + 2 B cos 0 + \frac{1}{3} cos 0$ $1 = 2 B + \frac{1}{3} \Rightarrow B = \frac{1}{3}$

Particular solution for given initial conditions is therefore: $y = \cos 2x + \frac{1}{3} \sin 2x + \frac{1}{3} \sin x$
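A result like this is worth checking by substituting it back into the equation. The sketch below is an illustration only: it encodes the solution and its derivatives by hand (function names are hypothetical) and confirms that the residual of $\frac{d^{2}y}{dx^{2}} + 4y - \sin x$ is zero and that both initial conditions hold.

```python
import math


def y(x):
    """Solution found above."""
    return math.cos(2 * x) + math.sin(2 * x) / 3 + math.sin(x) / 3


def dy(x):
    """First derivative of y, differentiated by hand."""
    return -2 * math.sin(2 * x) + 2 * math.cos(2 * x) / 3 + math.cos(x) / 3


def d2y(x):
    """Second derivative of y, differentiated by hand."""
    return -4 * math.cos(2 * x) - 4 * math.sin(2 * x) / 3 - math.sin(x) / 3


def residual(x):
    """Left hand side minus right hand side of the ODE; zero for a valid solution."""
    return d2y(x) + 4 * y(x) - math.sin(x)


print(y(0.0), dy(0.0), max(abs(residual(k / 10)) for k in range(100)))
```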

Laplace Transforms

The Laplace transform transforms a function from the time domain to the Laplace domain. For a continuous function $f(t)$ with $t \geq 0$ , the Laplace transform is defined as

$\mathcal{L}\{f(t)\}(s) = \int_{0}^{\infty} e^{-st} f(t)\,dt$

The notation used is

$\mathcal{L}\{f(t)\} = F(s)$

Where $F(s)$ is the function in the Laplace domain. Tables of Laplace transforms for common functions are given in the formula book, so there is no need to work out most transforms manually.

The transform is linear, in the same way integration is:

$\mathcal{L}\{4x + 2y\} = 4 \mathcal{L}\{x\} + 2 \mathcal{L}\{y\}$

For example, find the Laplace transform of $f(t) = \frac{1}{2} e^{-2t} + 3 \sin 4t$ :

$\mathcal{L}\{f(t)\} = \frac{1}{2} \mathcal{L}\{e^{-2t}\} + 3 \mathcal{L}\{\sin 4t\}$

$= \frac{1}{2} \cdot \frac{1}{s + 2} + 3 \times \frac{4}{s^{2} + 16} = \frac{1}{2s + 4} + \frac{12}{s^{2} + 16}$

Inverse Transforms

Transforms also have an inverse:

$f(t) = \mathcal{L}^{-1}\{F(s)\}$

For example, find $f(t)$ from $F(s) = \frac{3}{s^{2} + 2s + 5}$

$f(t) = \mathcal{L}^{-1}\{\frac{3}{s^{2} + 2s + 5}\} = \mathcal{L}^{-1}\{\frac{3}{(s + 1)^{2} + 4}\}$

$= \frac{3}{2} \mathcal{L}^{-1}\{\frac{2}{(s + 1)^{2} + 4}\} = \frac{3}{2} e^{-t} \sin 2t$

Sometimes, partial fractions and/or completing the square is required to get the expression into a form recognisable from the table.

First Shift Theorem

$\mathcal{L}^{-1}\{F(s + a)\} = e^{-at} f(t)$

Differential Equations

The Laplace transforms of derivatives are:

$\mathcal{L}\{\frac{dx}{dt}\} = s X(s) - x(0)$

$\mathcal{L}\{\frac{d^{2} x}{dt^{2}}\} = s^{2} X(s) - s x(0) - \frac{dx}{dt}(0)$

This can be used to solve differential equations, by Laplace transforming the differential equation to make an algebraic one, solving for the transformed variable, then inverse Laplace transforming the result back.

Example

Solve:

$\ddot{y} + 3 \dot{y} + 2y = e^{t}, \quad y(0) = 1, \quad \dot{y}(0) = 1$

$\mathcal{L}\{\ddot{y} + 3 \dot{y} + 2y\} = \mathcal{L}\{e^{t}\}$

$s^{2} Y(s) - s y(0) - \dot{y}(0) + 3 (s Y(s) - y(0)) + 2 Y(s) = \frac{1}{s - 1}$

$s^{2} Y(s) - s - 1 + 3 s Y(s) - 3 + 2 Y(s) = \frac{1}{s - 1}$

$(s^{2} + 3s + 2) Y(s) - s - 4 = \frac{1}{s - 1}$

$Y (s) = \frac{1}{( s - 1 ) ( s ^{2} + 3 s + 2 )} + \frac{s + 4}{( s ^{2} + 3 s + 2 )} = \frac{s ^{2} + 3 s - 3}{( s - 1 ) ( s + 1 ) ( s + 2 )}$

Need to use partial fractions to inverse transform

$\frac{s ^{2} + 3 s - 3}{( s - 1 ) ( s + 1 ) ( s + 2 )} = \frac{A}{s - 1} + \frac{B}{s + 1} + \frac{C}{s + 2}$

$s^{2} + 3 s - 3 = A (s + 1) (s + 2) + B (s - 1) (s + 2) + C (s + 1) (s - 1)$

$s = 1 \Rightarrow A = \frac{1}{6}$

$s = - 1 \Rightarrow B = \frac{5}{2}$

$s = - 2 \Rightarrow C = \frac{- 5}{3}$

$Y (s) = \frac{1}{6} \cdot \frac{1}{( s - 1 )} + \frac{5}{2} \cdot \frac{1}{( s + 1 )} - \frac{5}{3} \cdot \frac{1}{( s + 2 )}$

Taking inverse Laplace transforms using the table:

$y (t) = \frac{1}{6} e^{t} + \frac{5}{2} e^{- t} - \frac{5}{3} e^{- 2 t}$
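The partial fraction coefficients can be double-checked with the cover-up rule, which is what substituting $s = 1, -1, -2$ above amounts to. The sketch below is an illustration under that assumption (distinct linear factors only); the helper names are hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def numerator(s):
    """Numerator of Y(s) from the example above."""
    return s * s + 3 * s - 3


# Roots of the denominator (s - 1)(s + 1)(s + 2)
poles = [Fraction(1), Fraction(-1), Fraction(-2)]


def cover_up(pole):
    """Coefficient of 1/(s - pole): evaluate Y(s) at the pole with that factor covered up."""
    denominator = Fraction(1)
    for other in poles:
        if other != pole:
            denominator *= pole - other
    return numerator(pole) / denominator


A, B, C = (cover_up(p) for p in poles)
print(A, B, C)

# y(0) is the sum of the coefficients, which should match the initial condition y(0) = 1
print(A + B + C)
```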

Probability & Statistics

Probability

Set Theory

A set is a collection of elements
- Elements are members of a set
$s \in S$ means "the element $s$ is a member of the set $S$"
The empty set $\emptyset$ contains no elements
- It is empty
$S = \{1, 3, 5, 7, 9\}$
- $S$ is a set consisting of those integers
$S = \{n : n \text{ is a prime number and } n \leq 12\}$
- $S = \{2, 3, 5, 7, 11\}$
$S = \{x : x^{2} = 4 \text{ and } x \text{ is odd}\}$
- $S = \emptyset$
$A \subset S$
- $A$ is a subset of $S$
- $a \in A$ implies $a \in S$
$\emptyset \subset S$ for all sets $S$
$A = B$ if and only if $A \subset B$ and $B \subset A$
$A \cup B$ is the union of $A$ and $B$
- Set of elements belonging to $A$ or $B$
$A \cap B$ is the intersection of $A$ and $B$
- Set of elements belonging to $A$ and $B$
Disjoint sets have no common elements
- $A \cap B = \emptyset$
$A ∖ B$ is the difference of $A$ and $B$
- Set of elements belonging to $A$ but not $B$
$A^{c}$ is the complement of $A$
- Set of elements not belonging to $A$

Random Processes & Probability

The probability of event $A$ occurring is denoted $P(A)$ . This is the relative frequency of event $A \subset S$ occurring in a random process with sample space $S$.

$S$
- Certain or sure event, guaranteed 100% to happen
$\emptyset$
- Impossible event, won't happen
$a \in S$
- Elementary event, a single possible outcome of the process
$A \cup B$
- Event that occurs if $A$ or $B$ occurs
$A \cap B$
- Event that occurs if $A$ and $B$ occur
$A^{c} = S ∖ A$
- Event that occurs if $A$ does not occur
$A \cap B = \emptyset$
- Events $A$ and $B$ are mutually exclusive

Example

Toss a coin 3 times and observe the sequence of heads and tails.

Sample space $S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$
Event that $\geq 2$ heads occur in succession $A = \{HHH, HHT, THH\}$
Event that 3 heads or 3 tails occur $B = \{HHH, TTT\}$
$A \cup B = \{HHH, HHT, THH, TTT\}$
$A \cap B = \{HHH\}$
$A^{c} = \{HTH, HTT, THT, TTH, TTT\}$
$A^{c} \cap B = \{TTT\}$

Another Example

Sample space $S = \{17, 18, 19, 20, 21, 22\}$ . Each number is an individual event.

Events	Frequency	Relative Frequency
17	3	3/35
18	4	4/35
19	9	9/35
20	11	11/35
21	6	6/35
22	2	2/35

Axioms & Laws of Probability

$0 \leq P (A) \leq 1$ for all $A \subset S$
- Probabilities are always between 0 and 1 inclusive
$P (S) = 1$
- Probability of the certain event is 1
If $A \cap B = \emptyset$ then $P (A \cup B) = P (A) + P (B)$
- If two events are disjoint, then the probability of either occurring is equal to the sum of their two probabilities
$P (\emptyset) = 0$
- The probability of the impossible event is zero
$P (A^{c}) = 1 - P (A)$
- The probability of $A$ not occurring is one minus the probability of $A$ occurring
If $A \subset B$ , then $P(A) \leq P(B)$
- The probability of A will always be less than or equal to the probability of B when A is a subset of B
$P(A ∖ B) = P(A) - P(A \cap B)$
- The probability of A but not B is equal to the probability of A minus the probability of A and B
$P (A \cup B) = P (A) + P (B) - P (A \cap B)$
- Probability of A or B is equal to probability of A plus the probability of B minus the probability of A and B
- This is important

Example

In a batch of 50 ball bearings:

15 have surface damage ( $A$ )
- $P (A) = 0.3$
12 have dents ( $B$ )
- $P (B) = 0.24$
6 have both defects ( $A \cap B$ )
- $P (A \cap B) = 0.12$

The probability a single ball bearing has surface damage or dents: $P (A \cup B) = P (A) + P (B) - P (A \cap B) = 0.3 + 0.24 - 0.12 = 0.42$

The probability a single ball bearing has surface damage but no dents: $P (A \cap B^{c}) = P (A ∖ B) = P (A) - P (A \cap B) = 0.3 - 0.12 = 0.18$

Conditional Probability & Bayes' Theorem

A conditional probability $P (A ∣ B)$ is the probability of event $A$ occurring, given that the event $B$ has occurred.

$P (A ∣ B) = \frac{P ( A \cap B )}{P ( B )}$

Bayes' theorem:

$P (A ∣ B) = \frac{P ( B ∣ A ) P ( A )}{P ( B )}$

Useful results for conditional probability (the first is the law of total probability):

$P (B) = P (B ∣ A) P (A) + P (B ∣ A^{c}) P (A^{c})$
$P (A \cup B ∣ C) = P (A ∣ C) + P (B ∣ C) - P (A \cap B ∣ C)$

Example

In a semiconductor manufacturing process:

$A$ is the event that chips are contaminated
- $P (A) = 0.2$
$F$ is the event that the product containing the chip fails
- $P (F ∣ A) = 0.1$ and $P (F ∣ A^{c}) = 0.005$

Determining the rate of failure: $P (F) = P (F ∣ A) P (A) + P (F ∣ A^{c}) P (A^{c}) = 0.1 \times 0.2 + 0.005 \times 0.8 = 0.024$
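The same numbers can be pushed one step further with Bayes' theorem to find the probability that a failed product contained a contaminated chip, $P(A ∣ F)$. This extra quantity is not computed in the example above; the sketch below is an illustration and the function names are hypothetical.

```python
def total_probability(p_b_given_a, p_a, p_b_given_not_a):
    """P(B) = P(B|A) P(A) + P(B|A^c) P(A^c)."""
    return p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)


def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


p_fail = total_probability(0.1, 0.2, 0.005)
p_contaminated_given_fail = bayes(0.1, 0.2, p_fail)
print(p_fail, p_contaminated_given_fail)
```

With these figures $P(A ∣ F) = 0.02 / 0.024 = 5/6$, so most failures trace back to contaminated chips even though only 20% of chips are contaminated.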

Independent Events

Two events are independent when the probability of one occurring does not depend on the occurrence of the other. Events $A$ and $B$ are independent if and only if $P(A \cap B) = P(A) P(B)$

Example

Using the coin flip example again with a sample space $S$ and 3 events $A, B, C$

$S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$
- $P(S) = 1$
$A = \{HHH, HHT, HTH, HTT\}$
- $P(A) = 0.5$
$B = \{HHH, HHT, THH, THT\}$
- $P(B) = 0.5$
$C = \{HHT, THH\}$
- $P(C) = 0.25$

A and C are independent events:

$A \cap C = \{HHT\}$
$P(A \cap C) = 0.125 = 0.5 \times 0.25 = P(A) P(C)$

B and C are not independent events:

$B \cap C = \{HHT, THH\}$
$P(B \cap C) = 0.25 \neq 0.125 = 0.5 \times 0.25 = P(B) P(C)$
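The independence test can be automated by enumerating the sample space. The sketch below is an illustration only: it assumes a fair coin so every outcome is equally likely, uses exact fractions, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely sequences of three tosses
S = {"".join(tosses) for tosses in product("HT", repeat=3)}


def prob(event):
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    return Fraction(len(event & S), len(S))


def independent(e, f):
    """True when P(E and F) equals P(E) P(F)."""
    return prob(e & f) == prob(e) * prob(f)


A = {"HHH", "HHT", "HTH", "HTT"}
B = {"HHH", "HHT", "THH", "THT"}
C = {"HHT", "THH"}

print(independent(A, C), independent(B, C))
```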

Discrete Random Variables

For a random process with a discrete sample space $S$ , a discrete random variable $X$ is a function that assigns a real number to each outcome $s \in S$ .

$X$ is a numerical measure of the outcome of the random process.
The probability that $X$ takes the value $a$ is denoted $P(X = a)$

Consider a weighted coin where $P(H) = 0.75$ and $P(T) = 0.25$ . Tossing the coin twice gives a sample space $S = \{TT, TH, HT, HH\}$ , which makes the number of heads a random variable $X(s) \in \{0, 1, 2\}$ . Since successive coin tosses are independent events:

$P (TT) = 0.0625$
$P (T H) = 0.1875$
$P (H T) = 0.1875$
$P (HH) = 0.5625$

Events are also mutually exclusive, so:

$f (0) = P (TT) = 0.0625$
$f (1) = P (T H) + P (H T) = 0.375$
$f (2) = P (HH) = 0.5625$

This gives a probability distribution function $f (x) = P (X = x)$ of:

$x$	$f (x)$
$0$	$0.0625$
$1$	$0.375$
$2$	$0.5625$

Cumulative Distribution Functions

The cumulative distribution function gives a "running probability": $F_{X}(x_{i}) = P(X \leq x_{i}) = \sum_{j=1}^{i} f(x_{j})$

if $x_{i} \leq x_{j}$ then $F_{X} (x_{i}) \leq F_{X} (x_{j})$
$F_{X} (x_{1}) = f (x_{1})$
$F_{X} (x_{n}) = 1$

Using coin example again:

$x$	$F_{X} (x)$
$0$	$0.0625$
$1$	$0.4375$
$2$	$1$

Expectation & Variance

Expectation is the long-run average value of $X$ , weighted by probability (this is not necessarily the most likely value)
- The mean of $X$

$E(X) = \sum_{i=1}^{n} x_{i} f(x_{i}) = \mu_{X}$

Variance is a measure of the spread of the data

$Var(X) = \sum_{i=1}^{n} (x_{i} - \mu_{X})^{2} f(x_{i}) = E(X^{2}) - (E(X))^{2} = \sigma_{X}^{2}$

Standard deviation $\sigma_{X} = \sqrt{Var(X)}$

Using the weighted coin example once more:

$E(X) = 0 \times 0.0625 + 1 \times 0.375 + 2 \times 0.5625 = 1.5$ $E(X^{2}) = 0^{2} \times 0.0625 + 1^{2} \times 0.375 + 2^{2} \times 0.5625 = 2.625$ $Var(X) = E(X^{2}) - (E(X))^{2} = 2.625 - 1.5^{2} = 0.375$
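These sums are easy to reproduce in code. The sketch below is an illustration: it stores the weighted coin distribution in a dictionary (the variable names are hypothetical) and computes the mean, variance and running cumulative probabilities.

```python
import math

# Probability distribution of the number of heads in two tosses of the weighted coin
pmf = {0: 0.0625, 1: 0.375, 2: 0.5625}

mean = sum(x * p for x, p in pmf.items())
mean_of_square = sum(x * x * p for x, p in pmf.items())
variance = mean_of_square - mean**2
std_dev = math.sqrt(variance)

# Cumulative distribution: running total of the probabilities in increasing order of x
cdf = {}
running = 0.0
for x in sorted(pmf):
    running += pmf[x]
    cdf[x] = running

print(mean, variance, std_dev, cdf)
```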

Standardised Random Variable

The standardised random variable is a normalised version of the discrete random variable, obtained by the following transformation: $X^{*} = \frac{X - μ _{X}}{σ _{X}}$

$E (X^{*}) = 0$
$Var(X^{*}) = 1$

Binomial Distribution

The binomial distribution models random processes consisting of repeated independent events
Each event has only 2 outcomes, success or failure
- $P(\text{success}) = p$
- $P(\text{failure}) = q = 1 - p$

The probability of $k$ successes in $n$ events:

$b(k; n; p) = \binom{n}{k} p^{k} q^{n - k}, \quad k = 0, 1, 2, ..., n$

Probability of no success $= q^{n}$
Probability of $\geq 1$ successes is $1 - q^{n}$

Expectation & Variance

$μ = n p$ $σ^{2} = n pq$

Example

A fair coin is tossed 6 times. $p = q = 0.5$

Probability of exactly 2 heads out of 6 $b(2; 6; 0.5) = \binom{6}{2} \times 0.5^{2} \times 0.5^{4} = \frac{15}{64}$

Probability of $\geq 1$ heads $1 - q^{6} = 1 - 0.5^{6} = \frac{63}{64}$

Probability of $\geq 4$ heads

$b(4; 6; 0.5) + b(5; 6; 0.5) + b(6; 6; 0.5) = \binom{6}{4} (\frac{1}{2})^{4} (\frac{1}{2})^{2} + \binom{6}{5} (\frac{1}{2})^{5} (\frac{1}{2})^{1} + \binom{6}{6} (\frac{1}{2})^{6} = \frac{11}{32}$

Expected value $E (X)$ $μ = n p = 6 \times 0.5 = 3$

Variance $σ^{2} = n pq = 6 \times 0.5 \times 0.5 = 1.5$
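The coin example can be reproduced directly from the formula for $b(k; n; p)$. The sketch below is an illustration with a hypothetical function name; it relies on `math.comb`, which is available from Python 3.8.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 6, 0.5
exactly_two = binomial_pmf(2, n, p)
at_least_one = 1 - binomial_pmf(0, n, p)
at_least_four = sum(binomial_pmf(k, n, p) for k in range(4, n + 1))
mean = n * p
variance = n * p * (1 - p)

print(exactly_two, at_least_one, at_least_four, mean, variance)
```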

Poisson Distribution

Models a random process consisting of repeated occurrence of a single event within a fixed interval. The probability of $k$ occurrences is given by $p (k; λ) = \frac{λ ^{k}}{k !} e^{- λ}, k = 0, 1, 2, ...$

The Poisson distribution can be used to approximate the binomial distribution with $\lambda = np$ . This is only valid for large $n$ and small $p$

Expectation & Variance

$μ = σ^{2} = λ$

Example

The occurrence of typos on a page is modelled by a Poisson distribution with $\lambda = 0.5$ .

The probability of 2 errors: $p (2; 0.5) = \frac{0. 5 ^{2}}{2 !} e^{- 0.5} = 0.076$
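The typo example, and the claim that the Poisson distribution approximates the binomial for large $n$ and small $p$, can both be checked numerically. The sketch below is an illustration: the function names are hypothetical, and the values $n = 1000$, $p = 0.0005$ are assumed purely so that $np = 0.5$ matches the example.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k occurrences for a Poisson distribution with mean lam."""
    return lam**k / math.factorial(k) * math.exp(-lam)


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


two_typos = poisson_pmf(2, 0.5)

# Assumed values chosen so that n * p = 0.5
n, p = 1000, 0.0005
binomial_value = binomial_pmf(2, n, p)
poisson_value = poisson_pmf(2, n * p)

print(two_typos, binomial_value, poisson_value)
```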

Continuous Random Variables

Continuous random variables map events from a sample space to an interval. Probabilities are written $P (a \leq X \leq b)$ , where $X$ is the random variable. $X$ is defined with a continuous function, the probability density function.

The function must be positive
- $f (x) \geq 0$
The total area under the curve of the function must be 1
- $\int_{- \infty}^{\infty} f (x) d x = 1$
$P (a \leq X \leq b) = \int_{a}^{b} f (x) d x$

Example

$f(x) = \begin{cases} a (x - x^{2}) & 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases}$

Require that $\int_{- \infty}^{\infty} f (x) d x = 1$ , so have to find $a$ : $\int_{- \infty}^{\infty} f (x) d x = \int_{0}^{1} a (x - x^{2}) d x = a [\frac{x ^{2}}{2} - \frac{x ^{3}}{3}]_{0}^{1} = \frac{a}{6} \Rightarrow a = 6$

Calculating some probabilities: $P (0 \leq X \leq 0.5) = \int_{0}^{0.5} f (x) d x = \int_{0}^{0.5} 6 (x - x^{2}) d x = 6 [\frac{x ^{2}}{2} - \frac{x ^{3}}{3}]_{0}^{0.5} = 0.5$ $P (0.25 \leq X \leq 0.75) = \int_{0.25}^{0.75} f (x) d x = \int_{0.25}^{0.75} 6 (x - x^{2}) d x = 6 [\frac{x ^{2}}{2} - \frac{x ^{3}}{3}]_{0.25}^{0.75} = \frac{11}{16}$

Cumulative Distribution Function

The cumulative distribution function $F_{X}$ up to the point $a$ is given as $F_{X} (a) = \int_{- \infty}^{a} f (x) d x$

if $a \leq b$ , then $F_{X} (a) \leq F_{X} (b)$
$lim_{x \to - \infty} F_{X} (x) = 0$
$lim_{x \to \infty} F_{X} (x) = 1$
$\frac{d}{d x} F_{X} (x) = f (x)$
- Derivative of the cumulative distribution function is the probability density function

Using previous example, let $F_{X} (x) = \int_{- \infty}^{x} f (t) d t$ . For $x < 0$ $F_{X} (x) = 0$

For $0 \leq x \leq 1$ $F_{X} (x) = \int_{0}^{x} 6 (t - t^{2}) d t = 6 [\frac{t ^{2}}{2} - \frac{t ^{3}}{3}]_{0}^{x} = 3 x^{2} - 2 x^{3}$

For $x > 1$ $F_{X} (x) = \int_{0}^{1} 6 (t - t^{2}) d t = 6 [\frac{t ^{2}}{2} - \frac{t ^{3}}{3}]_{0}^{1} = 1$
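Once the cumulative distribution function is known, interval probabilities are just differences of $F_{X}$. The sketch below is an illustration for the example density $f(x) = 6(x - x^{2})$; the function names are hypothetical.

```python
def pdf(x):
    """Probability density function from the example, with a = 6."""
    return 6 * (x - x * x) if 0 <= x <= 1 else 0.0


def cdf(x):
    """Cumulative distribution function, defined piecewise as derived above."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return 3 * x**2 - 2 * x**3


def interval_probability(a, b):
    """P(a <= X <= b) = F(b) - F(a)."""
    return cdf(b) - cdf(a)


print(interval_probability(0.0, 0.5), interval_probability(0.25, 0.75))

# A central difference of the CDF should recover the density
h = 1e-6
print((cdf(0.3 + h) - cdf(0.3 - h)) / (2 * h), pdf(0.3))
```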

Expectation & Variance

Where $X$ is a continuous random variable:

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \mu$ $Var(X) = \int_{-\infty}^{\infty} (x - \mu)^{2} f(x)\,dx = \sigma_{X}^{2} = E(X^{2}) - \mu^{2}$

Uniform Distribution

A continuous distribution with p.d.f:

$f(x) = \begin{cases} \frac{1}{b - a} & a \leq x \leq b \\ 0 & \text{otherwise} \end{cases}$

Expectation and variance:

$μ = \frac{a + b}{2}$ $σ^{2} = \frac{( b - a ) ^{2}}{12}$

Cumulative distribution function:

$F_{X}(x) = \begin{cases} 0 & -\infty < x < a \\ \frac{x - a}{b - a} & a \leq x \leq b \\ 1 & b < x < \infty \end{cases}$

Exponential Distribution

A continuous distribution with p.d.f:

$f(x) = \begin{cases} 0 & -\infty < x < 0 \\ v e^{-vx} & 0 \leq x < \infty \end{cases}$

Expectation and variance:

$μ = \frac{1}{v}$ $σ^{2} = \frac{1}{v ^{2}}$

Cumulative distribution function:

$F_{X}(x) = \begin{cases} 0 & -\infty < x < 0 \\ 1 - e^{-vx} & 0 \leq x < \infty \end{cases}$

Recall that a discrete random process $X$ where a single event occurs $k$ times in a fixed interval is modelled by a Poisson distribution $p(k; \lambda)$
- $E (X) = λ$
Consider a situation where the event occurs at a constant mean rate $v$ per unit time
Let $λ = v t$ , then $P (0) = e^{- v t}$ and probability of $\geq 1$ events occurring is $1 - e^{- v t}$
Suppose the continuous random variable $Y$ is the time between occurrences of successive events
If there is a period of time $t$ with no events, then $Y > t$ and $P (Y > t) = e^{- v t}$
If $\geq 1$ events occur then $Y \leq t$ and $P (Y \leq t) = 1 - e^{- v t}$

If the number of events per interval of time is Poisson distributed, then the length of time between events is exponentially distributed

Example

Calls arrive randomly at the telephone exchange at a mean rate of 2 calls per minute. The number of calls per minute $X$ is a d.r.v. which can be modelled by a Poisson distribution with $λ = 2$ . The probability of 1 call in any given minute is:

$P(X = 1) = \frac{\lambda e^{-\lambda}}{1!} = 2 e^{-2} = 0.27$

The time between consecutive calls $Y$ is a c.r.v. modelled by an exponential distribution with $v = \frac{\lambda}{t} = \frac{2}{1} = 2$ . The probability of at least 1 ( $\geq 1$ ) minute between calls is: $P(1 \leq Y < \infty) = \int_{1}^{\infty} v e^{-vt}\,dt = \int_{1}^{\infty} 2 e^{-2t}\,dt = [-e^{-2t}]_{1}^{\infty} = 0.135$

Normal Distribution

A distribution with probability density function:

$f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}}}$

Expectation $E(X) = \mu$ and variance $Var(X) = \sigma^{2}$ . Normal distribution is denoted $N(\mu, \sigma^{2})$ and is defined by its mean and variance.

Standardised Normal Distribution

$X$ is a random variable with distribution $N(\mu, \sigma^{2})$ . The standardised random variable $U$ is distributed $N(0, 1)$ and can be obtained with the transform: $U = \frac{X - \mu}{\sigma}$ and has p.d.f. $f(u) = \frac{1}{\sqrt{2 \pi}} e^{-\frac{u^{2}}{2}}$

$P (X \leq b) = P (U \leq β)$ where $β = \frac{b - μ}{σ}$ . Values for the standard normal distribution are tabulated in the data book.

Example

The length of bolts $x$ from a production process are distributed normally with $μ = 2.5$ and $σ^{2} = 0.01$ .

$u = \frac{x - μ}{σ} = \frac{x - 2.5}{0.1}$ The probability the length of a bolt is between 2.6 and 2.7 cm (values obtained from table lookups): $P (2.6 \leq X \leq 2.7) = P (\frac{2.6 - 2.5}{0.1} \leq U \leq \frac{2.7 - 2.5}{0.1}) = P (1 \leq U \leq 2)$ $= P (0 \leq U \leq 2) - P (0 \leq U \leq 1) = 0.4772 - 0.3413 = 0.1359$
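Instead of table lookups, the standard normal cumulative distribution can be evaluated with the error function, using $\Phi(u) = \frac{1}{2}(1 + \text{erf}(u / \sqrt{2}))$. The sketch below is an illustration of the bolt example; the function names are hypothetical.

```python
import math


def standard_normal_cdf(u):
    """P(U <= u) for U distributed N(0, 1), via the error function."""
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))


def normal_interval_probability(low, high, mu, sigma):
    """P(low <= X <= high) for X distributed N(mu, sigma^2), by standardising both limits."""
    return standard_normal_cdf((high - mu) / sigma) - standard_normal_cdf((low - mu) / sigma)


# Bolt lengths: mu = 2.5, sigma^2 = 0.01 so sigma = 0.1
p_bolt = normal_interval_probability(2.6, 2.7, 2.5, 0.1)
print(p_bolt)
```

The result should agree with the tabulated value of 0.1359 to about four decimal places.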

Confidence Intervals

A confidence interval is the interval in which we would expect to find an estimate of a parameter, at a specified probability level. For example, the interval covering 95% of the population of $N (μ, σ^{2})$ is $μ \pm 1.96 σ$ .

For a random variable $X$ with distribution $N (67.5, 2. 5^{2})$ , the standard variate $u = \frac{x - 67.5}{2.5}$ . For confidence interval at 95% probability:

$Q (u) = \frac{0.95}{2} = 0.475$

Using table lookups, $u = \pm 1.96$ , and: $x = μ \pm 1.96 σ = 67.5 \pm 1.96 \times 2.5 = 67.5 \pm 4.9$

For confidence interval at 99.9% probability:

$Q (u) = \frac{0.999}{2} = 0.4995$

Table lookups again, $u = \pm 3.3$ , and: $x = μ \pm 3.3 σ = 67.5 \pm 3.3 \times 2.5 = 67.5 \pm 8.25$
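The multipliers 1.96 and 3.3 come from inverting the standard normal distribution, which can be done in code rather than from tables. The sketch below is an illustration using `statistics.NormalDist` from the Python standard library (3.8 onwards); the helper name is hypothetical.

```python
from statistics import NormalDist


def confidence_multiplier(level):
    """Number of standard deviations either side of the mean covering the given probability."""
    # A two-sided interval leaves (1 - level) / 2 in each tail
    return NormalDist().inv_cdf(0.5 + level / 2)


mu, sigma = 67.5, 2.5
z95 = confidence_multiplier(0.95)
z999 = confidence_multiplier(0.999)

print(z95, mu - z95 * sigma, mu + z95 * sigma)
print(z999, mu - z999 * sigma, mu + z999 * sigma)
```

The second multiplier comes out slightly below 3.3 (about 3.29), so the table value used above is a rounded figure.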

Normal Approximation to Binomial Distribution

The normal distribution gives a close approximation to the binomial distribution, provided:

$n$ is large
neither $p$ nor $q$ are close to zero
$μ = n p$ and $σ^{2} = n pq$

For example, take a random process consisting of 64 spins of a fair coin, so $n = 64$ and $p = q = 0.5$ . The probability of 40 heads is: $P(40) = \binom{64}{40} \times 0.5^{64} = 0.01359$ $\mu = np = 32, \quad \sigma = \sqrt{npq} = 4$

For a normal approximation, must use the interval around 40 (normal is continuous, binomial is discrete) $[39.5, 40.5]$ :

$P(39.5 \leq X \leq 40.5) = P(\frac{39.5 - 32}{4} \leq U \leq \frac{40.5 - 32}{4}) = 0.4832 - 0.4696 = 0.0136$
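The quality of the approximation can be checked by computing both values directly. The sketch below is an illustration: the exact binomial probability uses `math.comb`, the normal approximation uses the error function with the continuity correction described above, and the function names are hypothetical.

```python
import math


def standard_normal_cdf(u):
    """P(U <= u) for U distributed N(0, 1)."""
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))


n, p = 64, 0.5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Exact binomial probability of 40 heads
exact = math.comb(n, 40) * p**40 * (1 - p) ** (n - 40)

# Normal approximation over [39.5, 40.5] (continuity correction)
approx = standard_normal_cdf((40.5 - mu) / sigma) - standard_normal_cdf((39.5 - mu) / sigma)

print(exact, approx)
```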

Normal Approximation to Poisson Distribution

The normal distribution gives a close approximation to the Poisson distribution, provided:

$\lambda$ is large
$\mu = \sigma^{2} = \lambda$

For example, say a radioactive decay emits a mean of 69 particles per second. A standard normal approximation to this is:

$u = \frac{x - \mu}{\sigma} = \frac{x - 69}{\sqrt{69}}$

The probability of emitting $\leq 60$ particles in a second is therefore: $P(0 \leq X \leq 60) = P(\frac{0 - 69}{\sqrt{69}} \leq U \leq \frac{60.5 - 69}{\sqrt{69}}) = 0.5 - 0.3473 = 0.1527$

Equations

Below are some of the main equations that I have found useful to have on hand.

Integration and Differentiation

Cheatsheet

ES197

This section, similar to ES191, also aims to be fairly comprehensive as a reference. I probably won't cover much of the MATLAB/Simulink stuff.

Translational Mechanical Systems

Translational systems involve movement in 1 dimension
For example, the suspension in a car moving up and down as it goes over bumps
System diagrams can be used to represent systems

Diagrams include:
- Masses
- Springs
- Dampers

Elements

There are element laws to model each of the three elements involved in mechanical systems. They are modelled using two key variables:

Force $F (t)$ in newtons ( $N$ )
Displacement $x (t)$ in meters ( $m$ )
- Also sometimes velocity $v(t) = \dot{x}$ in meters per second ( $m s^{-1}$ )

When modelling systems, some assumptions are made:

Masses are all perfectly rigid
Springs and dampers have zero mass
All behaviour is assumed to be linear

Mass

Stores kinetic/potential energy
Energy storage is reversible
- Can put energy in OR take it out

Elemental equation (Newton's second law):

$m \frac{d^{2} x}{dt^{2}} = m \ddot{x} = ma = f(t)$

Kinetic energy stored:

$W = \frac{1}{2} m v^{2}$

Spring

Stores potential energy
Also reversible energy store
- Can be stretched/compressed

Elemental equation (Hooke's law):

$f (t) = k (x_{1} (t) - x_{2} (t))$

The spring constant k has units $N m^{- 1}$ . Energy Stored:

$W = \frac{1}{2} k (x_{1} - x_{2})^{2}$

In reality, springs are not perfectly linear as per Hooke's law, so approximations are made. Any mechanical element that undergoes a change in shape can be described as a stiffness element, and therefore modelled as a spring.

Damper

Dampers are used to reduce oscillation and introduce friction into a system.

Dissipates energy as heat
Non reversible energy transfer
Takes energy out of the system

Elemental equation:

$f (t) = B (\dot{x}_{1} - \dot{x}_{2})$

B is the damper constant and has units $N s m^{- 1}$

Interconnection Laws

Compatibility Law

Elemental velocities are identical at points of connection

Equilibrium Law

The sum of the external forces acting on a body equals mass times acceleration
The forces acting on a body in equilibrium sum to zero

Fictitious/D'Alembert Forces

D'Alembert's principle is an alternative form of Newton's second law. Rather than equating the net force on a body to mass times acceleration, the $ma$ term is moved across to give $F - ma = 0$ . $- ma$ is the inertial, or fictitious, force. When modelling systems, the inertial force always opposes the assumed positive direction of motion.

Example:

Form a differential equation describing the system shown below.

4 forces acting on the mass:

Spring: $F = k x$
Damper: $F = B \dot{x}$
Inertial/Fictitious force: $F = m \ddot{x}$
The force being applied, $f (t)$

The forces all sum to zero:

$f (t) - k x - B \dot{x} - m \ddot{x} = 0$

$f (t) = m \ddot{x} + B \dot{x} + k x$
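One way to check a model like this is to integrate it numerically. The sketch below is in Python rather than Matlab, and the function name `simulate_msd`, the parameter values and the constant input force are all illustrative assumptions rather than anything from the module. It rearranges the equation for $\ddot{x}$ and steps forward in time from rest.

```python
def simulate_msd(m, B, k, force, t_end, dt=1e-3):
    """Integrate m*x'' + B*x' + k*x = f(t) from rest using semi-implicit Euler."""
    x, v, t = 0.0, 0.0, 0.0
    xs = [x]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        a = (force(t) - B * v - k * x) / m  # equation of motion solved for acceleration
        v += a * dt                          # update velocity first
        x += v * dt                          # then displacement, using the new velocity
        t += dt
        xs.append(x)
    return xs


# Illustrative values only: constant 8 N force on a 1 kg mass
xs = simulate_msd(m=1.0, B=2.0, k=4.0, force=lambda t: 8.0, t_end=20.0)
final_displacement = xs[-1]
```

Once the motion has died away, $\dot{x} = \ddot{x} = 0$ and the equation reduces to $f = k x$, so the displacement should settle at $8 / 4 = 2$ for these values.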

Rotational Mechanical Systems

Dynamic Systems

A system is a set of interconnected elements which transfer energy between them
In a dynamic system, energy between elements varies with time
Systems interact with their environments through:
- Input
  - System depends on
  - Do not affect environment
- Output
  - System does not depend on
  - Affects Environment
Mathematical models of dynamic systems are used to describe and predict behaviour
All models are approximations

Lumped vs Distributed Systems

In a lumped system, properties are concentrated at 1 or 2 points in an element
- For example
  - Inelastic mass, force acts at centre of gravity
  - Massless spring, forces act at either end
- Modelled as an ODE
- Time is only independent variable
In a distributed system, properties vary throughout an element
- For example, non-uniform mass
- Time and position are both independent variables
- Can be broken down into multiple lumped systems

Linear vs Non-Linear Systems

For non-linear systems, model is a non-linear differential equation
For linear systems, equation is linear
In a linear system, the resultant response of the system caused by two or more input signals is the sum of the responses which would have been caused by each input individually
- This is not true in non-linear systems

Discrete vs Continuous Models

In discrete time systems, model is a difference equation
- output happens at discrete time steps
In continuous systems, model is a differential equation
- output is a continuous function of the input

Rotational Systems

Rotational systems are modelled using two basic variables:

Torque $τ$ measured in $N m$
- A twisting force
- Analogous to force in Newtons
Angular displacement $θ$ measured in radians
- Angular velocity $ω = \dot{θ}$
- Analogous to displacement in meters

Element Laws

Moment of Inertia

Rotational mass about an axis
Stores kinetic energy in a reversible form
Shown as a rotating disc with inertia $J$ , units $kg m^{2}$

Elemental equation: $τ (t) = J \frac{d ^{2}}{d t ^{2}} θ (t) = J \ddot{θ} (t)$

Energy Stored: $W = \frac{1}{2} J ω^{2}$

The inertial torque $J \ddot{θ}$ acts in the opposite direction to the direction the mass is spinning

Rotational Spring

Stores potential energy by twisting
Reversible energy store
Produces a torque proportional to the difference in angular displacement between the two ends of the spring

Elemental Equation:

$τ (t) = k (θ_{1} (t) - θ_{2} (t))$

Stored Energy:

$W = \frac{1}{2} k (θ_{1} (t) - θ_{2} (t))^{2}$

Rotational Damper

Dissipates energy as heat
Non-reversible
Opposing torque $\propto$ relative angular velocity

Elemental Equation:

$τ (t) = B (ω_{1} (t) - ω_{2} (t))$

Interconnection Laws

Compatibility Law

Connected elements have the same rotational displacement and velocity

Equilibrium Law

D'Alembert's law for rotational systems:

$\sum_{i} (τ_{ext})_{i} - J \dot{ω} = 0$

$J \dot{ω}$ is considered an inertial/fictitious torque, so for a body in equilibrium, $\sum_{i} τ_{i} = 0$ .

Example

Form an equation to model the system shown below.

4 torques acting upon the disk:

Stiffness element, $τ = k θ$
Friction element, $τ = B \dot{θ}$
Input torque $τ (t)$
Inertial torque $τ = J \ddot{θ}$

The torques sum to zero, so:

$τ (t) - k θ - B \dot{θ} - J \ddot{θ} = 0$

$τ (t) = J \ddot{θ} (t) + B \dot{θ} (t) + k θ (t)$

Electrical Systems

Similar to mechanical systems, models of electrical systems can be constructed. Similar deal to ES191.

Variables

Current $I (t)$ in amps (A)
Voltage $e (t)$ in volts (V) -- not v for voltage, e is used in systems
Power in watts $P = I (t) \cdot e (t)$

Elements

Capacitors

Store electrical energy in a reversible form
Capacitance $C$ measured in Farads (F)

Elemental equation:

$I (t) = C \frac{d}{d t} e_{12} (t)$

Energy stored: $W = \frac{1}{2} C e^{2}$

Inductors

Store magnetic energy in a reversible form
Inductance $L$ measured in Henries (H)

Elemental equation: $e_{12} (t) = L \frac{d}{d t} I (t)$

Energy Stored: $W = \frac{1}{2} L I^{2}$

Resistors

Dissipates energy
- Non-reversible
Resistance $R$ measured in Ohms ( $Ω$ )

Elemental Equation (Ohm's law): $e_{12} (t) = I (t) \cdot R$

Voltage Source

Provides an input of energy to the system.
Input voltage $e_{i} (t)$

Kirchhoff's Laws

Describe how elements interconnect and transfer energy between them
KVL - voltages around a closed loop sum to zero
KCL - currents about a node sum to zero

Example

Form a differential equation to model the following electrical system/circuit:

Elements:

Resistor: $e_{r} = I R$
Capacitor: $I = C \frac{d}{d t} e_{c}$
Inductor: $e_{L} = L \frac{d}{d t} I$

KVL - the voltages round the loop sum to zero:

$e_{i} - e_{r} - e_{L} - e_{o} = 0$

$e_{i} - I R - L \frac{d I}{d t} - e_{o} = 0$

Using the capacitor equation, and the fact that $e_{o} = e_{c}$ :

$e_{i} - RC \frac{d}{d t} e_{o} - L C \frac{d ^{2}}{d t ^{2}} e_{o} - e_{o} = 0$

$L C \frac{d ^{2}}{d t ^{2}} e_{o} (t) + RC \frac{d}{d t} e_{o} (t) + e_{o} (t) = e_{i} (t)$
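The substitution step can be written out explicitly. This assumes the single series loop implied by the KVL equation, so the same current $I$ flows through all three elements and the output is taken across the capacitor:

```latex
\begin{aligned}
I &= C \frac{d e_{o}}{dt} \\
e_{r} &= I R = RC \frac{d e_{o}}{dt} \\
e_{L} &= L \frac{dI}{dt} = LC \frac{d^{2} e_{o}}{dt^{2}}
\end{aligned}
```

Substituting these expressions for $e_{r}$ and $e_{L}$ into the loop equation leaves an equation in $e_{o}$ and $e_{i}$ only.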

Thermal Systems

Used to model heat transfer
- For example in a house
- Or in electronic components
Determine efficiency of elements
Determine thermal operating ranges for components

Variables

Rate of heat flow $q (t)$ in watts ( $J s^{- 1}$ )
Temperature, $θ (t)$ in Kelvins (K)
Analogous to current and voltage in electrical systems

Elements

Thermal Capacitor

Stores heat energy in a reversible way

Elemental equation: $q_{c} (t) = C \frac{d θ}{d t}$

Where $q_{c} (t)$ is the net heat flowing in, ie $q_{in} (t) - q_{o u t} (t)$ .

Thermal Resistor

Dissipates heat
- Non-reversible

Any object that restricts heat flow when heat flows from one medium to another can be modelled as a resistor. Elemental equation: $q (t) = \frac{1}{R} θ_{12} (t)$

Where $q (t)$ is the flow of heat from the temperature $θ_{1}$ on one side of the resistor to the temperature $θ_{2}$ on the other, and $θ_{12} = θ_{1} - θ_{2}$ .

Interconnection Laws

Compatibility Law:

Temperatures are identical where elements touch,
$θ_{1} = θ_{2} = ... = θ_{n}$

Equilibrium Law:

Elemental heat flow rates sum to zero at connection points
$q_{1} + q_{2} + ... + q_{n} = 0$

Examples

Develop a thermal model for someone doing winter sports. Assume:

Ambient temperature $θ_{a}$
Body temperature $θ$
Thermal resistance between body and ambient (the person is wearing a coat) $R$
Heat generated by body $q_{in}$

The rate of heat flow out is the difference between body and ambient temperature across the resistor, divided by the resistance: $q_{o u t} = \frac{θ - θ _{a}}{R}$

In the thermal capacitor, the net input heat is proportional to the rate of change of temperature: $q_{in} - q_{o u t} = C \frac{d θ}{d t}$

Combining the two equations gives: $RC \frac{d}{d t} θ + θ = R q_{in} + θ_{a}$

Data Driven Models

A system model can be developed from data describing the system
Computational techniques can be used to fit data to a model

Modelling Approaches

White Box

A white box model is a physical modelling approach, used where all the information about a system and its components is known.
For example: "What is the voltage across a 10 $Ω$ resistor?"
- The value of the resistor is known, so a mathematical model can be developed using knowledge of physics (Ohm's law in this case)
- The model is then tested against data gathered from the system

Grey Box

A grey box model is similar to white box, except where some physical parameters are unknown
A model is developed using known physical properties, except some parameters are left unknown
Data is then collected from testing and used to find the unknown parameters
For example: "What is the force required to stretch this spring by $x$ mm, when the stiffness is unknown"
- Using knowledge, $F = K x$
- Test spring to collect data
- Find value of $K$ that best fits the data to create a model
- Final model is then tested
Physical modelling used to get the form of the model, testing used to find unknown parameters
This, and white box, is mostly what's been done so far

Black Box

"Here is a new battery. We know nothing about it. How does its performance respond to changes in temperature?"

Used to build models of a system where the internal operation of it is completely unknown: a "black box"
Data is collected from testing the system
An appropriate mathematical model is selected to fit the data
The model is fit to the data to test how good it is
The model is tested on new data to see how closely it models system behaviour

Modelling in Matlab

Regression

Regression is predicting a continuous response from a set of predictor values
- eg, predict extension of a spring given force, temperature, age
Learn a function that maps a set of predictor variables to a set of response variables

For a linear model of some data $y = p_{1} x + p_{0}$ :

$x$ is the predictor variable and $y$ is the response variable from the data set
$p_{1}$ and $p_{0}$ are the unknowns to be estimated from the data
Polynomial models can be used for more complex data

In Matlab

% data points
x = 0:0.1:1.0;
y = 2 * x + 3;
%introduce some noise into the data
y_noise = y + 0.1*randn(11,1)';

%see the data
figure;
plot(x,y_noise);
axis([0 1 0 5])

In matlab, the polyfit function (matlab docs) is used to fit a polynomial model of a given degree to the data.

Inputs: x data, y data, polynomial degree
Output: coefficients of model

P = polyfit(x,y_noise,1) % linear model
hold on;
plot(x,polyval(P,x),'r');

In the example shown, the model ended up as $y = 1.7456 x + 3.0976$ , which is close, but not exact due to noise introduced into the data.
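For a degree 1 polynomial, the least-squares fit that polyfit performs has a simple closed form. The sketch below is in Python rather than Matlab, uses a hypothetical helper name `fit_line`, and fits the noise-free version of the data so that the expected answer is known exactly.

```python
def fit_line(xs, ys):
    """Return (p1, p0) minimising the sum of squared errors of y = p1*x + p0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p1 = sxy / sxx                 # slope
    p0 = mean_y - p1 * mean_x      # intercept
    return p1, p0


xs = [i / 10 for i in range(11)]   # 0, 0.1, ..., 1.0
ys = [2 * x + 3 for x in xs]       # same underlying line, without noise
p1, p0 = fit_line(xs, ys)
```

With no noise the fit recovers the slope 2 and intercept 3. Adding noise to `ys` moves the estimates away from these values, as in the Matlab result.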

Limitations

Too complex a model can lead to overfitting, where the model fits the noise in the data rather than the underlying behaviour
To overcome this:
- Use simpler model
- Collect more data

First Order Step Response

Modelling is about predicting the behaviour of a system. Often, need to know

What is the output for a given input?
Is the system stable?
If the input changes quickly, how will the output change?

First Order Systems

First order systems are those with only one energy store, and can be modelled by a first order differential equation.

Type	Equation
Electrical	$RC \frac{d e _{o}}{d t} + e_{o} = e_{i}$
Thermal	$RC \frac{d θ}{d t} + θ = θ_{a}$
Mechanical	$M \dot{v} + B v = f (t)$
General	$T \frac{d y}{d t} + y = x$

For the general form of the equation $T \frac{d y}{d t} + y = x$ , the solution for a step input $x = H$ at time $t = 0$ , with $y (0) = 0$ , is:

$y = H (1 - e^{- \frac{t}{T}})$

$T$ is the time constant of the system.

Free and Forced Response

Free response:
- The response of a system to its stored energy when there is no input
- Zero Input
- Non-zero initial Conditions
- Homogeneous differential equation
Forced response:
- The response of a system to an input when there is no energy initially in the system
- Non-zero input
- Zero initial Conditions
- Non-homogeneous differential equation
Total system response is a linear combination of the two

System Inputs

Different inputs can be used to determine characteristics of the system.

Step Input

$u (t) = \begin{cases} 0 & t < 0 \\ H & t \geq 0 \end{cases}$

A sudden increase of a constant amplitude input
Can see how quickly the system responds
Is there any delay/oscillation?
Is it stable?

Sine Wave

Can vary frequency and amplitude
Shows frequency response of a system

Impulse

$u (t) = \begin{cases} 0 & t \neq 0 \\ \infty & t = 0 \end{cases}$

A spike of infinite magnitude at an infinitely small time step

Ramp

$u (t) = \begin{cases} 0 & t < 0 \\ k t & t \geq 0 \end{cases}$

An input that starts increasing at a constant rate, starting at $t = 0$ .

Step Response

The step response of the system is the output when given a step input
- System must have zero initial conditions
Characteristics of a response:
- Final/resting value
- Rise time
- Delay
- Overshoot
- Oscillation (frequency & damping factor)
- Stability

For a system with time constant $T = 10$ , the response looks something like this:

The time constant $T$ of a system determines how long the system takes to respond to a step input. After 1 time constant, the system is at about $1 - \frac{1}{e}$ (63%) of its final value.

Time (s)	% of final value
$0.5 T$	39.3%
$T$	63.2%
$2 T$	86.5%
$3 T$	95.0%
$4 T$	98.2%
$5 T$	99.3%
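The percentages in the table follow directly from $y = H (1 - e^{- t / T})$ . A small Python sketch (the function name `step_response` is just for illustration) reproduces them, using $T = 10$ as in the plot:

```python
import math


def step_response(t, T, H=1.0):
    """First order step response y(t) = H * (1 - exp(-t / T))."""
    return H * (1.0 - math.exp(-t / T))


T = 10.0
# Percentage of the final value reached after n time constants
table = {n: round(100 * step_response(n * T, T), 1) for n in (0.5, 1, 2, 3, 4, 5)}
```

The percentages depend only on the ratio $t / T$ , so the same table applies to any first order system.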

Second Order Step Response

How 2nd order systems (those with 2 energy storing elements) respond to step inputs.

Standard form

$\frac{1}{ω _{n}^{2}} \frac{d ^{2}}{d t ^{2}} y (t) + \frac{2 ζ}{ω _{n}} \frac{d}{d t} y (t) + y (t) = u (t)$

$ω_{n}$ is the undamped natural frequency of the system response
- Indicates the speed of the response
$ζ$ is the damping factor
- Indicates the shape of the response

Forced Response

Forced response is the response to a non-zero input, namely
- Step
- Sinusoidal
Initial conditions are zero, ie $y (0) = 0$ , $\frac{d}{d t} y (0) = 0$
The response is the solution to a non-homogeneous second order differential equation

Damped Response

There are 4 different cases for system response:

Damping Factor	Response
$ζ = 0$	No Damping
$0 < ζ < 1$	Underdamped
$ζ = 1$	Critically Damped
$ζ > 1$	Overdamped

The response of a system to the same input with varying damping factors is shown in the graph below, from the data book. The equations are also given in the data book.

Undamped

The system is not damped at all and is just a normal sinusoidal wave.

$y (t) = H (1 - cos ω_{n} t)$

Underdamping

The amplitude of the sinusoidal output decreases slowly over time to a final "steady state" value.

$y (t) = H [1 - \frac{e ^{- ζ ω_{n} t}}{\sqrt{1 - ζ ^{2}}} \sin (\sqrt{1 - ζ^{2}} ω_{n} t + ϕ)]$

$\tan ϕ = \frac{\sqrt{1 - ζ ^{2}}}{ζ}$

Critical Damping

This gives the fastest response without overshoot, where the output rises directly to its final steady state value.

$y (t) = H (1 - (1 + ω_{n} t) e^{- ω_{n} t})$

Overdamping

The output rises slowly to its steady state value.

$y (t) = H [1 - \frac{e ^{- ζ ω_{n} t}}{\sqrt{ζ ^{2} - 1}} \sinh (\sqrt{ζ^{2} - 1} ω_{n} t + ϕ)]$

$\tanh ϕ = \frac{\sqrt{ζ ^{2} - 1}}{ζ}$
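The four cases can be compared by evaluating the standard closed-form step responses. This is a sketch in Python, with a hypothetical function name `second_order_step`, for a unit-gain system starting from zero initial conditions. It is not a substitute for the data book equations.

```python
import math


def second_order_step(t, wn, zeta, H=1.0):
    """Step response of a unit-gain second order system, zero initial conditions."""
    if zeta == 0:                                   # undamped
        return H * (1 - math.cos(wn * t))
    if zeta < 1:                                    # underdamped
        root = math.sqrt(1 - zeta ** 2)
        phi = math.atan2(root, zeta)                # tan(phi) = sqrt(1 - zeta^2) / zeta
        return H * (1 - math.exp(-zeta * wn * t) / root * math.sin(root * wn * t + phi))
    if zeta == 1:                                   # critically damped
        return H * (1 - (1 + wn * t) * math.exp(-wn * t))
    root = math.sqrt(zeta ** 2 - 1)                 # overdamped
    phi = math.atanh(root / zeta)                   # tanh(phi) = sqrt(zeta^2 - 1) / zeta
    return H * (1 - math.exp(-zeta * wn * t) / root * math.sinh(root * wn * t + phi))


# Largest value reached by an underdamped response (zeta = 0.2, wn = 1) in 10 s
peak_underdamped = max(second_order_step(i / 100, 1.0, 0.2) for i in range(1000))
```

Every case starts at $y (0) = 0$ , and the damped cases all settle at $H$ . Only the underdamped response overshoots on the way there.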

Transfer Functions

A transfer function is a representation of the system which maps from input to output
- Useful for system analysis
- Carried out in the Laplace Domain

The Laplace Domain

Problems can be easier to solve in the Laplace domain, so the equation is Laplace transformed to make it easier to work with
Given a problem such as "what is the output $y (t)$ given a differential equation in $y$ and the step input $u (t)$ ?"
- Express step input in Laplace domain $U (s)$
- Express differential equation in Laplace domain and find transfer function $G (s)$
- Find output $Y (s) = U (s) G (s)$ in Laplace domain
- Transfer back to time domain to get $y (t)$

Function	Time domain	Laplace domain
Input	$u (t)$	$U (s)$
Output	$y (t)$	$Y (s)$
Transfer	$g (t)$	$G (s)$

The Laplace domain is particularly useful in this case, as a differential equation in the time domain becomes an algebraic one in the Laplace domain.

$\mathcal{L} (\frac{d y}{d t}) = s Y (s) - y (0)$

Transfer Function Definition

The transfer function is the ratio of output to input, given zero initial conditions. $G (s) = \frac{Y ( s )}{U ( s )}$

For a general first order system of the form $T \frac{d}{d t} y (t) + y (t) = u (t)$

The transfer function in the Laplace domain can be derived as:

$T \cdot \mathcal{L} (\frac{d}{d t} y (t)) + \mathcal{L} (y (t)) = \mathcal{L} (u (t))$

$T (s Y (s)) + Y (s) = U (s)$

$Y (s) (T s + 1) = U (s)$

$G (s) = \frac{Y ( s )}{U ( s )} = \frac{1}{T s + 1}$

Step Input in the Laplace Domain

A step input has a constant value $H$ for $t > 0$ :

$\mathcal{L} (H) = \frac{H}{s} = U (s)$

For a first order system, the output will therefore be:

$Y (s) = U (s) G (s) = \frac{H}{s} \cdot \frac{1}{T s + 1}$

$y (t) = \mathcal{L}^{- 1} (\frac{H}{s} \cdot \frac{1}{T s + 1}) = H (1 - e^{- \frac{t}{T}})$
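The inverse transform can be found by splitting $Y (s)$ into partial fractions and using the standard pairs $\frac{1}{s} \leftrightarrow 1$ and $\frac{1}{s + a} \leftrightarrow e^{- a t}$ . A sketch of the working:

```latex
\begin{aligned}
Y(s) &= \frac{H}{s (T s + 1)} = \frac{H}{s} - \frac{H T}{T s + 1} = \frac{H}{s} - \frac{H}{s + \frac{1}{T}} \\
y(t) &= H - H e^{-\frac{t}{T}} = H \left( 1 - e^{-\frac{t}{T}} \right)
\end{aligned}
```

This matches the time-domain solution for a first order step response.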

Example

Find the transfer function for the system shown:

The system has input-output equation (in standard form): $\frac{J}{B} \dot{ω} (t) + ω (t) = \frac{1}{B} τ (t)$

Taking the Laplace transform of both sides: $\frac{J}{B} s Ω (s) + Ω (s) = \frac{1}{B} T (s)$

Rearranging to obtain the transfer function: $G (s) = \frac{Ω ( s )}{T ( s )} = \frac{1}{B} \cdot \frac{1}{\frac{J}{B} s + 1} = \frac{1}{J s + B}$

Using Matlab

In Matlab the tf function (Matlab docs) can be used to generate a system model using its transfer function. For example, the code below generates a transfer function $G (s) = \frac{1}{2 s + 3}$ , and then plots its response to a step input of amplitude 1.

G = tf([1],[2 3]);
step(G);

Example

For the system shown below, where $M = 100$ , $B = 40$ , $K = 100$ , plot the step response and obtain the undamped natural frequency $ω_{n}$ and damping factor $ζ$ .

$G (s) = \frac{1}{s ^{2} M + s B + K} = \frac{1}{100 s ^{2} + 40 s + 100}$
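The two parameters can also be found by hand. Dividing the denominator by $K$ and comparing it with the standard form gives $\frac{1}{ω_{n}^{2}} = \frac{M}{K}$ and $\frac{2 ζ}{ω_{n}} = \frac{B}{K}$ . A small Python sketch of that comparison (the helper name `second_order_params` is an assumption for illustration):

```python
import math


def second_order_params(M, B, K):
    """Compare M*s^2 + B*s + K (divided by K) with s^2/wn^2 + (2*zeta/wn)*s + 1."""
    wn = math.sqrt(K / M)               # from 1/wn^2 = M/K
    zeta = B / (2 * math.sqrt(K * M))   # from 2*zeta/wn = B/K
    return wn, zeta


wn, zeta = second_order_params(M=100, B=40, K=100)
```

For these values $ω_{n} = 1$ and $ζ = 0.2$ , so the system is underdamped.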

system = tf([1],[100 40 100]);
step(system, 15); % plot 15 seconds of the response

%function to obtain system parameters
[wn,z] = damp(system)

The script will output wn=1, and z = 0.2. The plotted step response will look like:

First Order Frequency Response

Frequency response is the response of a system to a sinusoidal/oscillating input.

Response to Sinusoidal input

For a standard first order system $T \frac{d}{d t} y (t) + y (t) = u (t)$ , with a sinusoidal input $u (t) = A \sin (ω t)$ :

$U (s) = \mathcal{L} (u (t)) = \frac{A ω}{s ^{2} + ω ^{2}}$

$Y (s) = U (s) G (s) = \frac{A ω}{s ^{2} + ω ^{2}} \cdot \frac{1}{T s + 1}$

$y (t) = \mathcal{L}^{- 1} (Y (s)) = \frac{A}{\sqrt{1 + ω ^{2} T ^{2}}} \sin (ω t - \tan^{- 1} ω T) + \frac{A ω T}{1 + ω ^{2} T ^{2}} e^{- \frac{t}{T}}$

The sinusoidal part of the equation is the steady-state that the response tends to, and the exponential part is the transient part that represents the rate of decay of the offset of the oscillation.

The frequency of input and output is always the same
- It is the amplitude and phase shift $ϕ$ that change
- These depend on the input frequency $ω$
  - This dependence is the frequency response

Example

The example below shows an input $\sin (4 π t)$ , and its output with $G (s) = \frac{1}{s + 1}$

$y (t) = 0.0793 \sin (4 π t + ϕ) + 0.079 e^{- t}, \quad ϕ = - \tan^{- 1} 4 π = - 85^{\circ}$

The steady state sinusoidal and transient exponential part of this response can be seen in the equation.

Matlab Example

The following code generates the plot below

sys = tf(1,[1 1]);
t = 0:0.01:3; % time value vector
u = (t>=1).*sin(4 * pi * t); % input signal, switched on at t = 1
y = lsim(sys,u,t); % simulate system with input u

figure;
subplot(2,1,1); plot(t,u); title("input");
subplot(2,1,2); plot(t,y,'r'); title("output");

Gain and Phase

Gain is the ratio of output to input amplitude, ie how much bigger or smaller the output is compared to input.

$G (ω) = \frac{E}{A}$ , where $A$ is the input amplitude and $E$ is the output amplitude

Phase difference $ϕ (ω)$ is how much the output signal is delayed compared to the input signal. Both are functions of input frequency $ω$ .

The frequency response can be obtained by substituting $jω$ for $s$ in the transfer function. This gives a complex function as shown

$G (s) = \frac{1}{T s + 1} ⟹ G (jω) = \frac{1}{jω T + 1}$

Magnitude $∣ G (jω) ∣$ gives the gain of the response, and the argument of the complex number $∠ G (jω)$ gives the phase shift $ϕ$ . The substitution $s = jω$ is used because, in the Laplace domain, both signals and systems are represented by functions of $s$ .

The $s$ -plane is the complex plane on which Laplace transforms are graphed.
Generally, $s = σ + jω$
$σ$ is the Neper frequency, the rate at which the function decays
$ω$ is the angular frequency, the rate at which the function oscillates
Periodic sinusoidal inputs are non decaying, so $σ = 0$ , giving $s = jω$

To find the frequency response parameters:

$G (jω) = \frac{1}{1 + jω T} \times \frac{1 - jω T}{1 - jω T} = \frac{1 - jω T}{1 + ω ^{2} T ^{2}}$

$= \frac{1}{1 + ω ^{2} T ^{2}} - j \frac{ω T}{1 + ω ^{2} T ^{2}}$

$= Re (G) + j Im (G)$

$∣ G (jω) ∣ = \sqrt{(Re (G))^{2} + (Im (G))^{2}} = \frac{1}{\sqrt{1 + ω ^{2} T ^{2}}}$

$∠ G (jω) = \tan^{- 1} \frac{Im ( G )}{Re ( G )} = - \tan^{- 1} ω T$
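The same gain and phase can be obtained numerically by evaluating $G (jω)$ as a complex number. The Python sketch below uses a hypothetical helper `first_order_response` and, as illustrative values, $T = 1$ and $ω = 4 π$ .

```python
import cmath
import math


def first_order_response(omega, T):
    """Return (gain, phase in radians) of G(jw) = 1 / (1 + jwT)."""
    G = 1 / (1 + 1j * omega * T)
    return abs(G), cmath.phase(G)


gain, phase = first_order_response(omega=4 * math.pi, T=1.0)
phase_degrees = math.degrees(phase)
```

For these values the gain is about 0.0793 and the phase is about $- 85^{\circ}$ , meaning the output lags the input.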

The graphs below show the frequency response in terms of $T$ for varying frequency $ω$ :

Example

Given a transfer function $G (s) = \frac{1}{s}$ , what are the magnitude and phase of the frequency response?

$G (jω) = \frac{1}{jω} = \frac{- j}{ω} = 0 - \frac{1}{ω} j$

$∣ G (jω) ∣ = \sqrt{\frac{1}{ω ^{2}}} = \frac{1}{ω}$

$∠ G (jω) = \tan^{- 1} \frac{- 1 / ω}{0} = - \frac{π}{2}$

The real part is zero and the imaginary part is negative, so the argument is $- \frac{π}{2}$ .

Bode Plots

Bode plots show the gain and phase of the frequency response against frequency on a $\log_{10}$ scale. Information is not spread linearly across the frequency range, so it makes more sense to use a logarithmic scale. An important feature of Bode plots is the corner frequency: the frequency at which the two asymptotes of the magnitude-frequency graph meet. This point is where $ω = \frac{1}{T}$ .

The plot above is for the function $G (s) = \frac{1}{s + 1}$ . The gain is measured in decibels ( $d B$ ), calculated as $20 \log_{10} ∣ G (jω) ∣$ .

Second Order Frequency Response

How second order systems respond to sinusoidal/oscillating input. Similar to first order.

Gain and Phase for Second Order Systems

For a 2nd order system in standard input-output form:

$\frac{1}{ω _{n}^{2}} \frac{d ^{2}}{d t ^{2}} y (t) + \frac{2 ζ}{ω _{n}} \frac{d}{d t} y (t) + y (t) = u (t), \quad y (0) = 0, \quad \frac{d}{d t} y (0) = 0$

$G (s) = \frac{ω _{n}^{2}}{s ^{2} + 2 ζ ω _{n} s + ω _{n}^{2}}$

$G (jω) = \frac{ω _{n}^{2}}{( ω _{n}^{2} - ω ^{2} ) + 2 j ζ ω _{n} ω}$

The gain and phase of the frequency response are therefore:

$∣ G (jω) ∣ = \frac{ω _{n}^{2}}{[( ω _{n}^{2} - ω ^{2} ) ^{2} + 4 ζ ^{2} ω _{n}^{2} ω ^{2} ] ^{1/2}}$

$ϕ (ω) = ∠ G (jω) = - tan^{- 1} \frac{2 ζ ω _{n} ω}{ω _{n}^{2} - ω ^{2}}$

Bode Plots, from Data Book

The plots show gain and phase shift for varying values of $ζ$

Example

For the electrical system shown below with the values $R = 1 k Ω$ , $C = 0.1 μ F$ , $L = 0.1 H$ find:

The undamped natural frequency $ω_{n}$
The damping factor $ζ$
Sketch the magnitude of the frequency response $∣ G (jω) ∣$
- At what frequency is this at its maximum?
Sketch a bode plot using matlab

The system equation is: $L C \frac{d ^{2}}{d t ^{2}} e_{o} (t) + RC \frac{d}{d t} e_{o} (t) + e_{o} (t) = e_{i} (t)$

Undamped natural frequency: $\frac{1}{ω _{n}^{2}} = L C = 10^{- 8} ⟹ ω_{n} = 10^{4} rad s^{- 1}$

Damping factor: $\frac{2 ζ}{ω _{n}} = RC = 10^{- 4} ⟹ ζ = 0.5$

Using the graph from the data book

The graph peaks at approx $∣ G (jω) ∣ = 1.15$ , which occurs at:

$ω \approx 0.71 ω_{n} = 0.71 \times 10^{4} rad s^{- 1}$

$f = \frac{0.71 \times 10^{4}}{2 π} \approx 1130 Hz$
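The graph reading can be cross-checked against the gain formula. The Python sketch below uses the component values from the example. The peak location $ω = ω_{n} \sqrt{1 - 2 ζ^{2}}$ is a standard result for $ζ < \frac{1}{\sqrt{2}}$ that is not stated in these notes, so treat it as an outside assumption.

```python
import math

R, C, L = 1e3, 0.1e-6, 0.1           # 1 kOhm, 0.1 uF, 0.1 H

wn = 1 / math.sqrt(L * C)            # from 1/wn^2 = LC
zeta = R * C * wn / 2                # from 2*zeta/wn = RC


def gain(w):
    """|G(jw)| for the standard second order form."""
    return wn ** 2 / math.sqrt((wn ** 2 - w ** 2) ** 2 + 4 * zeta ** 2 * wn ** 2 * w ** 2)


w_peak = wn * math.sqrt(1 - 2 * zeta ** 2)   # assumed standard result, see above
peak_gain = gain(w_peak)
f_peak = w_peak / (2 * math.pi)              # convert rad/s to Hz
```

This gives a peak gain of about 1.155 at roughly 7071 rad/s (about 1125 Hz), which is slightly below the estimate read from the graph because the factor 0.71 is a rounded value of $\frac{1}{\sqrt{2}}$ .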

Matlab plot:

R = 1000;
C = 0.1e-6; % 0.1 uF
L = 0.1;
sys = tf([1],[L*C R*C 1]);
figure; step(sys); % step response
figure; bode(sys); % bode plot in a separate figure

CS241

Operating Systems

Processes

Process is a program in execution
A process in memory has
- Text: process instructions
- Data: global variables
- Stack & Heap
  - Can shrink/grow at runtime
Process can be in several states
- New: being created
- Ready: waiting to be assigned to processor
- Waiting: waiting on something or other
- Running
- Terminated: finished execution
Process control blocks
- Stores:
  - State
  - Program counter
  - CPU registers
  - Scheduling info
  - Memory management info
  - Accounting information
    - CPU usage, time since start, etc
  - I/O Status
    - Open files and I/O devices
- Stored in kernel memory
- Used when saving processes for context switches
- The simpler the PCB, the faster the context switch
Process scheduling
- Scheduler selects which of the available processes gets the CPU next
- Three queues:
  - Job queue for new processes (long term)
  - Ready queue for ready processes (short term)
  - Device queues for processes waiting for I/O access
- Short term scheduler selects next process from ready queue
  - Invoked frequently and must be fast
- Long term moves from new state to ready queue
  - Invoked when processes are created
  - Moves processes into memory
  - Not used in modern OS
Process creation
- Child processes can be created by parent processes
  - Forms a tree
  - Root process is init
- Options are specified when creating process
  - Resource sharing options
  - Execution options (concurrently or parent waits)
  - Address space options (duplicate or child loads new)
- fork() creates new process as duplicate
- exec() used after fork to replace address space with new program
Process termination
- Processes terminate after executing last statement
- Can be terminated with exit() syscall, returning status code
- wait() tells parent to wait for child to exit
- If a parent exits without waiting for child, children become orphans and are adopted by init
- When a process terminates but exit code has not yet been collected it is a zombie process
  - All resources released but entry in process table remains
  - Once parent gets exit status it is released
Inter-process communication
- Either shared memory or message passing
- Shared memory
  - Shared region resides in the address space of one process, and the other process attaches to it
  - Special permission required for one process to access another's address space
  - mmap() syscall creates shared block of memory
- Message passing
  - Send and receive syscalls provided
  - One process typically acts as producer and the other as consumer
  - Message buffer exists in kernel space
    - Circular queue can be used as a shared buffer
    - Can have zero-capacity, or bounded/unbounded
  - Can communicate directly by naming processes
  - Can also communicate indirectly using mailboxes
    - Mailboxes have unique IDs
    - Process can only communicate if they share a mailbox
  - Can do blocking sends/receives, or non-blocking
- Pipes
  - A mechanism for message passing in UNIX
  - Pipes in bash are written with |, connecting the output of one process to the input of the next
  - Named pipes, or FIFOs, appear in the file system and can be manipulated using file operations
    - Much more powerful, persist beyond processes exiting
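The fork, pipe and wait calls described above can be seen working together in a short sketch. It is written in Python, whose `os` module exposes these UNIX syscalls fairly directly, so it only runs on UNIX-like systems. The function name `child_says` and the exit status 7 are arbitrary choices for illustration.

```python
import os


def child_says(message):
    """Fork a child that writes `message` into a pipe; return (data, exit_code)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: a duplicate of the parent. Write to the pipe, then exit.
        os.close(read_fd)
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(7)                      # arbitrary status code for illustration
    # Parent: close the unused write end so that read() eventually sees EOF.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:                    # empty read means the child closed its end
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)       # collect the exit status so no zombie is left
    return b"".join(chunks), os.WEXITSTATUS(status)


data, code = child_says(b"hello from child")
```

The parent receives the child's message through the pipe and its status code through `waitpid`. Skipping the `waitpid` call would leave the child as a zombie until the parent exits.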

Threads

What are threads
- A unit of CPU execution
- Can multi-thread processes to achieve concurrency
- Threads lighter than processes and share more with parent
  - Share code, data, files
- Threads have own id, program counter, register set, stack (share heap)
Concurrency and parallelism
- Concurrency implies more than one task making progress
- Parallelism implies that a system can perform more than one task simultaneously
  - Data parallelism splits data up and performs same processing on each subset of data
  - Task parallelism distributes different tasks across threads
- Can have concurrency without parallelism by interleaving tasks on one core
- Amdahl's law is a rough estimate of speedup
- Speedup $\leq \frac{1}{S + ( 1 - S ) / N}$
  - Numerator (1) is the normalised time taken before parallelising
  - $S$ is the fraction of that time spent in the serial part
  - $(1 - S) / N$ is the time taken to run the parallelisable part on $N$ cores
Pthreads is common API for working with threads
- pthread_create() creates new thread to execute a function
- pthread_join() waits for thread to exit
- Provides mutexes and condvars
- Can set thread IDs and work with attributes, etc
Synchronising threads
- Sharing memory between threads requires synchronisation
- Race conditions occur when two threads try to write to a variable at the same time
  - High level code broken down into atomic steps which become interleaved and cause registers and intermediate operations to become mixed up, causing undefined behaviour
- Can use mutexes for synchronisation
User vs Kernel threads
- User level threads are implemented by user code in userspace
  - No kernel involvement
  - Cannot be scheduled in parallel but can run concurrently
- Kernel threads are implemented by the kernel and created by syscalls
  - Scheduling is handled by kernel so can be scheduled on different CPUs
  - Management has kernel overhead
- Many-to-one model maps many user level threads to a single kernel thread
  - Less overhead
  - User threads are all sharing kernel thread so no parallelism and one blocking causes all to block
- One-to-one gives each user thread a kernel thread
  - Used in windows and linux
  - More kernel threads = more overhead
  - Users can cause creation of kernel threads which slows system
- Many-to-many multiplexes user threads across a set of kernel threads
  - Number of kernel thread can be set and can run in parallel
  - More complex than one-to-one
Condition variables
- Used to synchronise threads
- Threads can wait() on condition variables
- Other threads signal the variable using signal() or broadcast()
Signals are used in UNIX systems to notify processes
- Synchronous signals generated internally by process
- Asynchronous signals generated external to process by other processes
  - ctrl+c sends SIGINT asynchronously
- Signals are delivered to process and handled by signal handlers
  - Only signal-safe functions can be called within signal handlers
- Signals can be delivered to all threads, just the main thread, or specific threads
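As a worked example of Amdahl's law from earlier in this section, the bound can be evaluated for a program whose serial fraction is $S = 0.25$ . This is a Python sketch, and the function name `amdahl_speedup` and the numbers are illustrative only.

```python
def amdahl_speedup(serial_fraction, cores):
    """Upper bound on speedup: 1 / (S + (1 - S) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


speedup_4 = amdahl_speedup(0.25, 4)        # 1 / (0.25 + 0.75 / 4) = 16 / 7
speedup_many = amdahl_speedup(0.25, 10 ** 9)
```

Four cores give a speedup of at most about 2.29, and no number of cores can do better than $1 / S = 4$ , since the serial part always has to run.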

Scheduling

Different schedule queues contain processes in different states
- Queues contain process control blocks
Scheduler wants to be as efficient as possible in scheduling jobs
- Maximise CPU utilisation and process throughput
- Minimise turnaround, waiting, response times
Four events that can trigger scheduler
- Process switches from running to waiting state
- Process terminates
- Process switches from running to ready
- Process switches from waiting to ready
- First two cases are non pre-emptive, where the process gives up the CPU
- Last two are pre-emptive, where the scheduler takes the running task off the CPU and gives the CPU to another task
First-come first-serve scheduling is where processes are assigned to CPU in order of arrival
- Average wait time varies massively with the order in which processes arrive
- Non pre-emptive
- Running shorter jobs first improves average wait time
Shortest first scheduling
- Provably optimal in minimising average wait time
- Relies on knowing how long each job will take
- Can estimate job length by exponential moving average
  - $t_{n}$ is the length of the nth CPU burst
  - $τ_{n + 1}$ is the predicted length of the next CPU burst
  - $0 \leq α \leq 1$
  - $τ_{n + 1} = α t_{n} + (1 - α) τ_{n}$
- Can be either pre-emptive or non pre-emptive
  - When a new, shorter process arrives when one is already being executed, can either:
    - Switch to new process
    - Wait for current job to finish
  - Pre-emptive can cause race conditions where processes are switched mid-write
Priority scheduling assigns a priority to each process, and the process with the highest priority (conventionally the lowest priority number) is executed first
- Shortest job first is a special case of priority scheduling, where the priority is execution time
- Can cause starvation for processes with low priority
  - Can overcome with aging, where priority is increased over time
Round robin scheduling is where each process gets a small amount of CPU time (a quantum $q$ ), and after that time has elapsed the process is pre-empted and put back into the ready queue
- Scheduler visits process in arrival order
- No process waits more than $(N - 1) q$ for its next turn
- If $q$ is large, becomes first come first served
- If $q$ is small, too many context switches
  - $q$ usually 10 to 100ms
- Higher wait time than shortest job first in most cases, but better response time
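The effect of job order on waiting time, and the exponential average used to predict burst lengths, can both be checked with a few lines of arithmetic. This is a Python sketch in which all jobs are assumed to arrive at time 0, and the burst lengths and function names are made up for illustration.

```python
def average_wait(bursts):
    """Average waiting time when jobs, all arriving at t = 0, run in the given order."""
    waited, clock = 0, 0
    for burst in bursts:
        waited += clock        # this job waited for everything scheduled before it
        clock += burst
    return waited / len(bursts)


def predict_next_burst(previous_prediction, last_burst, alpha=0.5):
    """Exponential average: tau_next = alpha * t_n + (1 - alpha) * tau_n."""
    return alpha * last_burst + (1 - alpha) * previous_prediction


bursts = [24, 3, 3]                      # illustrative CPU burst lengths in ms
fcfs = average_wait(bursts)              # (0 + 24 + 27) / 3
sjf = average_wait(sorted(bursts))       # (0 + 3 + 6) / 3
next_guess = predict_next_burst(previous_prediction=10, last_burst=6)
```

Running the long job first gives an average wait of 17 ms, while shortest job first brings it down to 3 ms for the same set of jobs.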

Synchronisation

Synchronisation is important to prevent race conditions
Needed for both processes and threads, as both can share memory
The part of code where processes update shared variables is the critical section
- No two processes can concurrently execute their critical section
  - Entry and exit must uphold mutual exclusion
Ideal solution to the critical section problem must satisfy:
- Mutual exclusion
- At least one process must be able to progress into the critical section if no other process is in it
- No process should have to wait indefinitely to enter critical section
Peterson's Algorithm is a solution to the problem
- int turn; shared variable to specify whose turn it is
- boolean flag[2]; flags store which processes wish to enter
- Process runs if both waiting and their turn, or if only one waiting and other not in critical section
- Can fail with modern architectures reordering stuff
Synchronisation primitives are based on the idea of locking
- Two processes cannot hold a lock simultaneously
- Locking and unlocking should be atomic operations
  - Modern hardware provides atomic instructions
  - Used to build sync primitives
Test and set is one type of atomic instruction
- Updates a memory word and returns its original value in one atomic step
- Can be used to implement a lock using a shared boolean variable
  - Does not satisfy bounded waiting as the process can instantly reacquire the lock
- More complex implementations can satisfy criteria (allow the next waiting process to execute and only release lock if no other process waiting)
Mutex locks are lock variables that only one process can hold at a time
- If another process tries to acquire the lock then it blocks until the lock is available
Semaphores have integer values
- 0 means unavailable
- Positive value means available
- wait() on a semaphore makes the process wait until the value is positive
  - Decrements by 1 if/when positive
- signal() increments value by one
- Both commands must be atomic
- Controls the number of processes that can concurrently access resource - more powerful than mutex
Deadlocks may occur when both processes are waiting for an event that can only be caused by the other waiting process
Starvation occurs when a specific process has to wait indefinitely while others make progress
Priority inversion is a scheduling problem when a lower-priority process holds a lock needed by a higher priority process
- Solved via priority inheritance, where the priority of the low priority task is set to highest to prevent it being pre-empted by some medium priority task.
There are a few classic synchronisation problems that can be used to test synchronisation schemes
- The bounded buffer problem has $n$ buffers where each can hold one item. Producers produce items and write to buffers while the consumers consume from buffers
  - Producer should not write when all buffers full
  - Consumer should not consume when all buffers empty
  - Solved with three semaphores
    - mutex = 1; full = 0; empty = n
  - Producers wait on empty when filling a buffer, and signal on full to indicate a buffer has been filled
  - Consumers wait on full when emptying a buffer, and signal on empty to indicate one has been emptied
  - buffer access protected by mutex
- Reader/writer problem has some data shared among processes, where multiple readers are allowed but only one writer
  - Readers are given preference over writers, and writers may starve
  - A shared integer keeps track of the number of readers, and two mutexes are used, one read/write mutex, and another to protect the shared reader count.
  - The writer must acquire the writer mutex
  - Readers increase the read count while reading and decrease when done, both operations synchronised using mutex
  - Read/write mutex is also held for as long as the read count is at least one, to prevent writes while anyone is reading.
- Dining philosophers spend their lives either thinking or eating.
  - They sit in a circle, with a chopstick between each pair. When they wish to eat, they pick up a chopstick from either side of them, and put them back down when done.
    - Two neighbouring philosophers cannot eat at the same time
    - Five mutexes, one for each chopstick
  - If all five decide to eat at once and pick up the left chopstick, then deadlock occurs
  - There are multiple solutions:
    - Allow only $n - 1$ philosophers for $n$ chopsticks
    - Allow a philosopher to only pick up both chopsticks if both are available, which must be done atomically
    - Use an asymmetric solution, where odd-numbered philosophers pick up left first, and even numbers pick up right first
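
The three-semaphore solution to the bounded buffer problem described above can be sketched with Python's `threading.Semaphore`, where `acquire()` plays the role of wait() and `release()` the role of signal(). This is an illustrative sketch with one producer and one consumer; `run_bounded_buffer` and its parameters are hypothetical names.

```python
import threading
from collections import deque


def run_bounded_buffer(n_slots, items):
    """Pass every item through an n_slots buffer and return them as consumed."""
    buffer = deque()
    mutex = threading.Semaphore(1)        # protects the buffer itself
    full = threading.Semaphore(0)         # counts filled slots
    empty = threading.Semaphore(n_slots)  # counts free slots
    consumed = []

    def producer():
        for item in items:
            empty.acquire()               # wait(empty): blocks if no free slot
            mutex.acquire()
            buffer.append(item)
            mutex.release()
            full.release()                # signal(full): one more filled slot

    def consumer():
        for _ in items:
            full.acquire()                # wait(full): blocks if nothing to read
            mutex.acquire()
            item = buffer.popleft()
            mutex.release()
            empty.release()               # signal(empty): one more free slot
            consumed.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed


print(run_bounded_buffer(3, list(range(10))))
```

Note that each side waits on its counting semaphore before taking the mutex. Taking the mutex first could deadlock, since a producer would block on empty while still holding the mutex the consumer needs.
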

Deadlocks

A set of processes is said to be in deadlock when each process is waiting for an event that can only be caused by another process in the set
- All waiting on each other
- Usually acquisition/release of a lock or resource
- An abstract system model for discussing deadlocks
  - System has resources $R_{0}, R_{1}, ..., R_{m}$
    - Resources can have multiple instances
  - A set of processes $P_{0}, P_{1}, ..., P_{n}$
  - To utilise a resource a process must request it, use it, then release it
- Conditions for deadlock:
  - Mutual exclusion, only one process can use a resource
  - Hold and wait, a process must hold some resources and then be waiting to acquire more
  - No pre-emption, a resource can be released only voluntarily
  - Circular wait, there must be a subset of processes waiting for each other in a circular manner
The resource allocation graph is a directed graph where:
- Vertices are processes and resources
  - Resource nodes show the multiple instances of each resource
- Request edge is a directed edge $P_{i} \rightarrow R_{j}$
- Assignment edge is a directed edge $R_{j} \rightarrow P_{i}$
- Cycles in graph show circular wait
- No cycles means no deadlock
- Cycles may mean deadlock, but not sufficient alone to detect deadlock
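
Checking a resource allocation graph for cycles is a standard depth-first search. The sketch below assumes the graph is given as a dictionary mapping each node to the nodes it points at; the node names are made up for the example. A cycle found this way only indicates a possible deadlock when resources have multiple instances.

```python
def has_cycle(edges):
    """Return True if the directed graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:      # edge back onto the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(edges))


# P1 requests R1, R1 is assigned to P2, P2 requests R2, R2 is assigned to P1
circular = {"P1": ["R1"], "R1": ["P2"], "P2": ["R2"], "R2": ["P1"]}
# Same graph, but R2 is not held by anyone
acyclic = {"P1": ["R1"], "R1": ["P2"], "P2": ["R2"]}
print(has_cycle(circular), has_cycle(acyclic))   # True False
```
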
Deadlock detection algorithms are needed to verify if a resource allocation graph contains deadlock
- Resource graph can be represented in a table showing allocated, available, and requested resources
- Flags show if each process has finished executing
- A process may execute and set its flag if it can satisfy its requested resources using the currently available resources, which then frees any allocated resources
- Can then try to execute other processes
- If ever a point where no progress can be made, then the processes are deadlocked
Deadlock prevention ensures that at least one of the necessary conditions for deadlock does not hold
- Impossible to design system without mutual exclusion
- Can prevent hold-and-wait by ensuring a process atomically gets either all or none of its required resources at once, so it never holds some resources while waiting on others
- Can introduce pre-emption into the system to make a process release all its resources if it is ever waiting on any
- Can prevent circular wait by numbering resources, and requiring that each process requests resources in order
  - Process holding resource $n$ cannot request any resources numbered less than $n$
- All of these methods can be restrictive
  - Harmless requests could be blocked
Deadlock avoidance is less restrictive than prevention
- Determines if a request should be granted based upon if the resulting allocation leaves the system in a safe state where no deadlock can ever occur in future
  - Need advanced information on resource requirements
- Each process declares the maximum number of instances of each resource it may need
- On receiving a resource request, the algorithm checks if granting the resource leaves the system in a safe state
- If it can't guarantee a safe state, the requesting process waits until the system changes into a state where the request can be granted safely
- How do we determine if a state is safe?
  - Cycles alone do not guarantee deadlock
  - The banker's algorithm determines if a state is safe
The banker's algorithm:
- Take a system with 5 processes and three resource types, A, B and C, with 10, 5, and 7 instances respectively.
- Table shows the current and maximum usage for each process
  - Available resources is (instances of resource) - (total current used by each process)
  - Future needed resources is (maximum usage) - (current usage)
- At each step, a process is found whose needs can be satisfied with currently available resources
  - Can then execute process and reclaim its resources
  - Keep applying steps to try to reclaim all resources
    - Gives a sequence that processes can be executed in
      - If sequence completes all processes then it's a safe sequence and starting state is safe
      - If some processes cannot be executed and there is no possible safe sequence the starting state is unsafe
Resource request algorithm checks if granting a request is safe
- Check that we can satisfy request
- Pretend request was executed
- Use banker's algorithm to see if resulting state would be safe
  - If not, then keep request pending until state changes into a safe state where we can grant it
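
The safety check and the resource request check can be sketched together. The totals (10, 5, 7) match the system described above, but the allocation and maximum tables here are assumed values for illustration, since the original table is not reproduced in these notes. Function names are hypothetical.

```python
def is_safe(available, allocation, maximum):
    """Banker's safety check. Returns (safe?, order in which processes ran)."""
    work = list(available)
    need = [[m - a for m, a in zip(mx, al)]
            for mx, al in zip(maximum, allocation)]
    finished = [False] * len(allocation)
    sequence = []
    progress = True
    while progress:
        progress = False
        for p in range(len(allocation)):
            if not finished[p] and all(n <= w for n, w in zip(need[p], work)):
                # p can run to completion, then its resources are reclaimed
                work = [w + a for w, a in zip(work, allocation[p])]
                finished[p] = True
                sequence.append(p)
                progress = True
    return all(finished), sequence


def request_is_grantable(pid, request, available, allocation, maximum):
    """Pretend to grant the request and check the resulting state is safe."""
    need = [m - a for m, a in zip(maximum[pid], allocation[pid])]
    if any(r > n for r, n in zip(request, need)):
        return False                      # exceeds the declared maximum
    if any(r > v for r, v in zip(request, available)):
        return False                      # not enough free resources right now
    new_available = [v - r for v, r in zip(available, request)]
    new_allocation = [row[:] for row in allocation]
    new_allocation[pid] = [a + r for a, r in zip(allocation[pid], request)]
    safe, _ = is_safe(new_available, new_allocation, maximum)
    return safe


# Assumed example tables for resource types A, B, C (totals 10, 5, 7)
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
available = [3, 3, 2]                     # totals minus the column sums

print(is_safe(available, allocation, maximum))   # (True, [1, 3, 4, 0, 2])
print(request_is_grantable(1, [1, 0, 2], available, allocation, maximum))  # True
print(request_is_grantable(4, [3, 3, 0], available, allocation, maximum))  # False
```

The last request is refused even though the resources are free, because granting it would leave only (0, 0, 2) available and no process could then finish.
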

Memory

Memory is a flat array of addressable bytes
- CPU fetches data and instructions from memory
Memory protection
- Addresses accessible by a process must be unique to that process such that processes cannot write to each others address spaces
- Base and limit registers define the range of legal addresses
  - OS loads these registers when a process is scheduled
  - Only OS can modify
  - CPU checks addresses are in legal range, OS takes action if not
  - Assumes contiguous memory allocations, but other methods exist
Address binding
- Addresses in source code are usually symbolic (variables)
  - Typically bound by compilers to relocatable addresses
  - Addresses in object code are all mapped relative to some base address, which is then mapped to a physical address when loading into memory
- Different address binding strategies exist, can be done at compile time, load time, or execution time
Addresses generated by a program during its runtime are either logical or physical
- Logical/virtual are generated by the CPU to fetch or read/write, may differ from physical address
  - Must be converted to physical address before being used to access memory
- Physical address is the one seen by the memory unit.
- Under compile and load time binding, logical and physical addresses are the same
- Under execution time binding, the physical addresses may change at runtime
The memory management unit is special hardware that translates logical to physical addresses
- MMU consists of a relocation register and a limit register under contiguous allocation
Three main techniques for memory allocation
Contiguous memory allocation
- Each process has one chunk of memory
- Used in older OSs
- MMU checks each logical address against limit register
  - Registers can only be loaded by OS when a process is scheduled
- Memory divided into fixed partitions which are allocated to processes
  - Fixed number of partitions => fixed number of processes
- OS keeps track of free chunks called holes
- Processes allocated memory based on their size
  - Put into a hole large enough to accommodate it
- Different strategies for hole allocation
  - First-fit, allocate the first hole that is large enough
  - Best-fit, allocate the smallest hole that is large enough
    - Must search entire address space
  - Worst-fit, allocate the largest hole
    - Must also search entire address space
    - Produces largest leftover hole
- Can result in fragmentation of the address space
  - External fragmentation, when there is enough memory space for a process but it is not contiguous
  - Internal fragmentation, where a process is using more memory than it needs
  - Can deal with it by compacting holes into one block
    - Requires processes to be relocated during execution, which has significant overhead
  - Can also allow non-contiguous allocation
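
The three hole allocation strategies differ only in which sufficiently large hole they pick. A minimal sketch, assuming holes are stored as `(start, length)` pairs in address order; the hole list and request size are invented for the example.

```python
def choose_hole(holes, size, strategy):
    """Return the index of the hole chosen for a request, or None if none fit."""
    candidates = [i for i, (_, length) in enumerate(holes) if length >= size]
    if not candidates:
        return None
    if strategy == "first":
        return candidates[0]                               # first that fits
    if strategy == "best":
        return min(candidates, key=lambda i: holes[i][1])  # smallest that fits
    if strategy == "worst":
        return max(candidates, key=lambda i: holes[i][1])  # largest hole
    raise ValueError(f"unknown strategy: {strategy}")


holes = [(0, 100), (200, 500), (800, 200), (1100, 300), (1500, 600)]
for strategy in ("first", "best", "worst"):
    print(strategy, holes[choose_hole(holes, 212, strategy)])
# first (200, 500)
# best (1100, 300)
# worst (1500, 600)
```

First-fit can stop at the first match, while best-fit and worst-fit have to look at every hole, as noted above.
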
Segmented memory allocation
- Program divided into segments, each in its own contiguous block
- Each logical address is a two-tuple of (segment number, offset)
  - Segment number is mapped to a base address, and the offset is added to it to give the physical address
- MMU contains segment table
  - Table indexed by segment numbers
  - Each table entry has
    - Segment base, which is the physical address of the segment in memory
    - Segment limit, the size of the segment
- Still cannot avoid external fragmentation
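
Translation through a segment table amounts to a bounds check followed by an addition. The sketch below uses a made-up segment table and a hypothetical `SegmentationFault` exception to stand in for the trap the hardware would raise.

```python
class SegmentationFault(Exception):
    """Raised when an offset falls outside its segment (illustrative only)."""


def translate_segment(segment_table, segment, offset):
    """Map (segment number, offset) to a physical address."""
    base, limit = segment_table[segment]
    if not 0 <= offset < limit:
        raise SegmentationFault(f"offset {offset} outside segment {segment}")
    return base + offset


# segment number -> (base, limit); values are assumptions for the example
table = {0: (1400, 1000), 1: (6300, 400), 2: (4300, 400)}
print(translate_segment(table, 2, 53))   # 4353
```

An offset equal to the limit is already out of range, since valid offsets run from 0 to limit - 1.
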
Paging is the best technique for memory management
- Avoids external fragmentation
- Divide program into blocks called pages
- Divide physical memory into blocks called frames
- Page size = frame size = 4kB
- Pages are assigned to frames
- Mapping between pages and frames is stored in a page table, one for each process
- Logical addresses have a page number and a page offset
- Still suffers from internal fragmentation
  - Worst case scenario has one byte in a frame
  - Average wastage is half a frame
  - Smaller frames means less wastage but larger page tables
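
With 4kB pages, splitting a logical address into page number and offset is a division by the page size. A minimal sketch with an assumed page table (page number to frame number) and hypothetical function names:

```python
PAGE_SIZE = 4096  # 4kB pages and frames


def translate_page(page_table, logical_address):
    """Map a logical address to a physical address via the page table."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    frame = page_table[page]
    return frame * PAGE_SIZE + offset


def internal_fragmentation(process_size):
    """Bytes wasted in the last frame allocated to a process."""
    return -process_size % PAGE_SIZE


page_table = {0: 5, 1: 6, 2: 1}            # assumed mapping for the example
print(translate_page(page_table, 4100))    # page 1, offset 4 -> 24580
print(internal_fragmentation(10000))       # 3 frames allocated, 2288 wasted
print(internal_fragmentation(4097))        # worst case: 4095 bytes wasted
```
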
Page table implementations are complex
- There is a page table in memory for each process
- MMU consists of registers to hold page table entries
  - Loaded by OS when a process is scheduled
  - Can only store a limited number of entries
- Holding a page table in memory doubles the time it takes to access an address, because you need an access to translate logical -> physical address first
- Translation Lookaside Buffer (TLB) stores frequently used page table entries in a hardware cache
  - Extremely fast
  - On a cache miss, the entry is brought into the TLB
    - Cache miss requires an extra memory access to get page table entry on top of the usual fetch
    - Different cache replacement algorithms are used (LRU is common)
    - Different algorithms have different corresponding hit ratios
  - Effective memory access time depends on hit ratio
  - Stores page table entries of multiple processes
    - Each entry requires an Address Space Identifier (ASID) to uniquely identify the process requesting the TLB entry
    - Cache only hits if ASID matches, which guarantees memory protection
- It can be beneficial to have smaller page tables to reduce memory overhead
  - 32 bit word length and 4kB page size gives $2^{20}$ possible entries
  - If each entry is a 4 byte address, this is 4MB of page table per process
  - Most processes only use a very small number of the logical entries
  - A valid-invalid bit is used for each page table entry to indicate if there is a physical memory frame corresponding to a page number
    - Bit is set high when there is no physical frame corresponding to a page
- Hierarchical page tables divide the page table into pages and store each page in a frame in memory
  - The mapping is stored in an outer page table
  - The OS does not need to store the inner page tables that aren't in use
  - The flat page table requires 4MB of space, which requires 1024 frames of 4kB each
  - The outer page table will have 1024 entries (one for each inner page table), which fits in a single frame.
- Addressing under multi-level paging works by separating the address into chunks
  - Outer and inner page tables are 4kB, so each holds 1024 four-byte entries.
    - 10 upper bits address the outer page table
    - Next 10 bits address the inner page table
    - Lowest 12 bits used to address the 4kB address space of each page
    - This takes 3 memory accesses now, which is slow
      - Less memory used but higher penalty in case of TLB miss
- Hashed page tables use the page numbers as hash keys, which are hashed to the index of the page table
  - Each entry is a pointer to a linked list of page numbers with the same hash value
    - Each list node is the page number, frame number, and pointer to the next node
- Some architectures use inverted page tables, where each index in the table corresponds to a physical frame number
  - Each entry in the table is a PID and a page number
  - When a virtual address is generated, each entry is searched until the entry for the frame with the page number and PID is found
  - Decreases memory needed to store each page table, but increases search time
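
The 10/10/12 bit split used by two-level paging on a 32 bit address can be written with shifts and masks. This is a sketch only: the nested dictionaries stand in for the outer and inner page tables, and the frame number 77 is an arbitrary example value.

```python
def split_address(address):
    """Split a 32 bit logical address into (outer index, inner index, offset)."""
    offset = address & 0xFFF           # lowest 12 bits
    inner = (address >> 12) & 0x3FF    # next 10 bits
    outer = (address >> 22) & 0x3FF    # upper 10 bits
    return outer, inner, offset


def walk(outer_table, address):
    """Translate an address using an outer table of inner tables."""
    outer, inner, offset = split_address(address)
    inner_table = outer_table.get(outer)
    if inner_table is None or inner not in inner_table:
        raise LookupError("page not mapped")   # would be a page fault
    return (inner_table[inner] << 12) | offset


address = (3 << 22) | (5 << 12) | 42
print(split_address(address))          # (3, 5, 42)

outer_table = {3: {5: 77}}             # only one inner table exists
print(walk(outer_table, address) == 77 * 4096 + 42)   # True
```

Only the inner tables that are actually in use need to exist, which is where the memory saving comes from.
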

Networks

Intro

A network is a group of interconnected devices that communicate by sending messages
- End hosts run applications and send/receive messages
  - Generate messages and break them down into packets
  - Add additional info such as IP address and port in packet header
  - Send bits physically
- Access points provide access to the internet
  - End hosts connect to APs
  - Most use ethernet/wifi but also 4G/5G mobile networks
- Intermediate devices such as switches and routers forward and route messages
  - Also known as network core
  - Run routing and forwarding algorithms
  - Info stored in routing tables
  - Move packets to correct output link
The store-and-forward principle states that an entire packet must arrive at a router before the router can re-send it
- It takes $\frac{L}{R}$ seconds to transmit a packet of length $L$ at $R$ bits per second
- The router has to receive and send, so total delay is $2 \frac{L}{R}$ , plus processing time
- Packets queue at the router if the rate of incoming packets is greater than the transmission rate
- Packets either queue in buffer or may be dropped if buffer fills
- There are four main sources of packet delay:
  - Transmission delay
    - $\frac{L}{R}$ time to send packet
  - Queueing delay
    - Time waiting to be transmitted
  - Processing delay
    - Any processing at node
  - Propagation delay
    - Time to physically move bits in link cables
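
The delay components can be combined into a rough end-to-end estimate. This sketch assumes every link has the same rate and the same propagation, processing and queueing delay, which is a simplification; all names and numbers are illustrative.

```python
def transmission_delay(length_bits, rate_bps):
    """L / R: time to push one packet onto a link."""
    return length_bits / rate_bps


def store_and_forward_delay(length_bits, rate_bps, links,
                            propagation_s=0.0, processing_s=0.0,
                            queueing_s=0.0):
    """Total delay over `links` identical links, one packet, store-and-forward."""
    per_link = (transmission_delay(length_bits, rate_bps)
                + propagation_s + processing_s + queueing_s)
    return links * per_link


# One router between source and destination means two links, so 2 * L / R
L, R = 8000, 1_000_000
print(store_and_forward_delay(L, R, links=2))   # about 0.016 seconds
```
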
Throughput is the overall rate at which bits are transferred from a source to a destination in a time window
- Can be instantaneous throughput, the rate at a specific point in time
- Or average throughput, the mean rate over a longer period of time
- Transmission links are bottlenecked by their minimum speeds
Protocols are defined rules for communication between nodes
- Define packet format, order of messages, actions to take on send and receive
- Can be in software or hardware
- Routers run IP protocols and switches and network cards implement ethernet
The internet uses packet switching to allow different routes to share links between nodes
- If one flow of data is not using any shared links then another flow can use it
- Circuit switching was used in old telephone networks, where links were reserved for an entire call duration and flows did not get shared
  - Not ideal for internet traffic due to the bursty nature of packets
There are 5 layers in the network stack, each using the services of the layer below it and providing services to the layer above it
- Application layer generates data
  - HTTP, SMTP, DNS
- Transport layer packetises data, adds port number, sequencing and error correcting info
  - TCP, UDP
- Network layer adds source and destination IP addresses and routes packets
  - IP
- Link layer adds source/destination MAC addresses, passes ethernet frames to network interface hardware drivers
  - Ethernet, WiFi
- Physical layer sends the bits down the wire
  - Different protocol for cables, WiFi, fibre optics, etc

Application Layer

Processes such as web browsers, email, file sharing, communicate over networks
- Developer has to develop both client and server so they know how to communicate
- Alternatively, processes can implement an application-layer protocol such as HTTP
Process send/receive via sockets, which are the API between application and network
- Creating, reading, writing to sockets is done by syscalls
- Messages need to be addressed to the correct process running on the correct end host
  - Host identified by IP address
  - Processes identified by port number
Application processes use transport layer services
- Transport layer is expected to deliver messages to the intended recipients
- All transport layer protocols provide basic services such as packetisation, addressing, sequencing, error correction
- Different protocols provide different services
- TCP is for reliable and ordered data transfer
  - Is connection-oriented
    - TCP handshake is required
  - Client must contact server, establish connection with IP and Port
  - Provides a
- UDP provides no guarantees on data transfer
  - Best-effort service
  - Faster as no handshake is required and headers are smaller
  - Maintains no connection, data may be lost or out of order
HTTP is how web browsers communicate with web servers
- Uses TCP port 80
- Client sends a HTTP request to request a resource
- Server responds with a HTTP response containing the requested resource
- Web pages consist of a HTML file and referenced objects
- HTTPv1.0 is non-persistent and downloads each object over a separate TCP connection
  - New TCP handshake for each object
- HTTPv1.1 is persistent and uses the same connection for multiple objects
  - Server leaves connection open for any referenced objects, which are sent back-to-back as soon as they are encountered
- RTT is the round trip time for a request
  - Needs 1 RTT to establish TCP connection, then another to request and receive first few bytes of data
  - Non-persistent response time is 2RTT + file transmission time for each file
  - Persistent response time only requires 2RTT once, then total data transfer
- HTTP requests and responses are in ASCII, in a human-readable format
  - In request, top line is request line with request verb (GET/POST/PUT)
  - In response, top line is status line with status code and phrase
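
The response time formulas for persistent and non-persistent HTTP can be compared numerically. The sketch follows the simplified model in these notes (objects fetched one after another, persistent connection pays 2RTT once); the RTT and transmission times are assumed values.

```python
def non_persistent_time(rtt, transmit_times):
    """Each object pays 2 RTT (handshake + request) plus its transmission time."""
    return sum(2 * rtt + t for t in transmit_times)


def persistent_time(rtt, transmit_times):
    """2 RTT paid once, then all objects are sent back-to-back."""
    return 2 * rtt + sum(transmit_times)


# A HTML file plus three referenced objects, 0.05 s each, RTT of 0.1 s
objects = [0.05, 0.05, 0.05, 0.05]
print(non_persistent_time(0.1, objects))   # about 1.0 seconds
print(persistent_time(0.1, objects))       # about 0.4 seconds
```
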
Web clients can be configured to access the web via a cache, which caches objects to reduce response time for client requests

Transport Layer

Transport services provide logical communication between application processes running on different hosts
- Break messages into segments, add header, pass to network layer on the send side
- Reassemble segments into messages and pass up to application layer on the receive side
UDP provides bare minimum services
- No effort to recover lost packets or re-order packets
- Connectionless
  - Each segment treated individually
- No congestion control
  - Sender can send as fast as they want, possibly overloading receiver or infrastructure
- Used when fast and low latency is needed
  - UDP header is smaller
  - Can send data as fast as wanted
    - Video games, internet streaming
- It is the programmer's responsibility to make UDP reliable
TCP is connection-oriented and more reliable
- Provides flow and congestion control
- Manages packets out of order to make packets appear in order
- Enhances unreliable network layer services
  - Bits often flipped due to noise and packets re-ordered
- Checksums in headers detect bit errors
- Acknowledgments (ACKs) indicate packets are correctly received
- Sender times out if ACK not received within a timeout interval
- Automatic Repeat Requests (ARQs) are sent to retransmit lost or corrupt packets
- Packets include a sequence number to detect lost or duplicated packets
Stop and wait ARQ is a protocol for ARQs
- Sender sends a packet and waits until it receives an ACK
- If ACK arrives, send the next packet
- If ACK times out, retransmit the same packet
- Duplicate detection is possible because sequence numbers are used
  - Sufficient to use 1 bit sequence number since there can be at most one outstanding packet
    - Known as the alternating bit protocol
- Reliable, but slow as sender has to wait for ACK to send next packet
  - Suppose a 1Gbps ( $R = 1 0^{9}$ ) link with a packet length of $L = 8000$ bits
  - RTT is 30ms
  - Utilisation $U$ is the fraction of time the link is spent transmitting
    - $L / R$ time spent sending, $RTT$ time spent waiting
    - $U = \frac{L}{R \times RTT + L} = 0.00027$
- The sender should be allowed to send more packets without waiting for an ACK
  - There is $R \times RTT$ bits of additional data that could be sent during the RTT interval
  - $R \times RTT$ is the delay-bandwidth product
    - Indicates the length of the pipeline
- Receiving buffer can also be a bottleneck
  - Typically has a finite buffer of $B$ bits
  - May not be reading from buffer all the time
  - Sender should not send more than $B$ bits at a time to prevent overflow
  - The maximum number of bits without waiting for an ACK is $min (B, L + R \times RTT)$
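
The utilisation figure for stop and wait, and the limit on unacknowledged data, can be reproduced directly from the formulas above. The buffer size used in the last line is an assumed value.

```python
def stop_and_wait_utilisation(length_bits, rate_bps, rtt_s):
    """Fraction of time the link spends transmitting: (L/R) / (RTT + L/R)."""
    send_time = length_bits / rate_bps
    return send_time / (rtt_s + send_time)


def max_unacked_bits(buffer_bits, length_bits, rate_bps, rtt_s):
    """min(B, L + R * RTT): limited by the receive buffer or by the pipeline."""
    return min(buffer_bits, length_bits + rate_bps * rtt_s)


u = stop_and_wait_utilisation(8000, 1e9, 0.030)
print(round(u, 5))                                  # 0.00027
print(max_unacked_bits(1_000_000, 8000, 1e9, 0.030))  # 1000000, buffer-limited
```

Dividing top and bottom of $(L/R) / (RTT + L/R)$ by $1/R$ gives the form $L / (R \times RTT + L)$ used above.
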
Pipelined protocols allow multiple unacknowledged packets in the pipeline
- ACKs are sent individually or cumulatively
- Range of sequence number must be increased from alternating bits
- Go-back-n is a common protocol
  - Sender maintains a window of $N$ packets that can be sent without waiting for ACK
    - Depends on delay-bandwidth product, receive buffer size, other factors
  - Receiver maintains expected sequence number variable, keeps track of the next expected packet
  - If the receiver receives the packet with the expected sequence number, then it sends ACK( $n$ ), which acknowledges all packets up to $n$ , making the ACK cumulative
  - If the sequence number is not the expected one, then the receiver discards the incoming packet and sends ACK( $n - 1$ ), acknowledging all up to the last correctly received packet
    - Waits for packet $n$ to be correctly received before acknowledging any further packets
  - The sender moves the send window forward for every ACK received
  - Maintains a timer for the oldest unacknowledged packet, if an ACK times out then the packet is resent
- Selective repeat does not discard out of order packets as long as they fall inside a receive window
  - ACKs are individual and not cumulative
  - Sender selectively retransmits packets whose ACK did not arrive
    - Maintains a timer for each unacknowledged packet in the send window
  - Does not have to retransmit out-of-order packets
  - Packets arriving out of order are buffered, but receive window not moved forward
  - Window size should be less than or equal to half the size of the sequence number space
    - Avoids packets being recognised incorrectly
  - Send window moved forward when ACK received
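
The Go-back-n receiver described above is simple enough to sketch in a few lines. This version assumes sequence numbers start at 1 and never wrap around, so that ACK(0) can mean that nothing has been received in order yet; `gbn_receiver` is a hypothetical name.

```python
def gbn_receiver(arrivals, first_seq=1):
    """Return (packets delivered in order, ACK sent for each arrival)."""
    expected = first_seq
    delivered, acks = [], []
    for seq in arrivals:
        if seq == expected:
            delivered.append(seq)     # in order: deliver and advance
            expected += 1
        # anything else is discarded, even if it is a later valid packet
        acks.append(expected - 1)     # cumulative ACK for last in-order packet
    return delivered, acks


# Packet 3 is delayed, so 4 and 5 are discarded and must be sent again
delivered, acks = gbn_receiver([1, 2, 4, 5, 3, 4, 5])
print(delivered)   # [1, 2, 3, 4, 5]
print(acks)        # [1, 2, 2, 2, 3, 4, 5]
```

A selective repeat receiver would instead buffer packets 4 and 5 and acknowledge them individually, so only packet 3 would need retransmitting.
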
TCP uses a combination of GBN and SR protocols
- Uses cumulative ACKs
- Only retransmits the packet causing timeout
- Each byte of data is numbered in TCP
  - Sequence number of a packet is the byte number of the first byte of the segment
- TCP ACK number is the number of the next byte expected from the other side
  - Cumulative ACKs are used
- TCP is duplex, so ACKs are piggybacked onto data segments. A segment can carry data and serve as an ACK
- The timeout period is often relatively long, so on 3 duplicate ACKs, the sender re-transmits that segment without waiting for timeout
  - Duplicate ACKs are good indicators of high packet loss
- TCP headers contain a few fields
  - Sequence number is the 32 bit number of the segment indicating the number of the first byte in the packet
  - Acknowledgement number is the number of the next byte expected from the other side
  - Receive window is used for flow control
TCP uses flow control to ensure that the data in the pipeline does not exceed the receive buffer size
- Receiver advertises free buffer space in the receive window field - rwnd
- Sender limits amount of unacknowledged data to the receiver's rwnd value
- (last byte sent - last byte ACK'd) $\leq$ rwnd
TCP provides congestion control to control the rate of transmission according to the level of perceived congestion in the network
- Congestion occurs when input rate > output rate
- Results in lost packets, buffer overflows, long delays due to queuing at routers
- As a transmission link approaches maximum capacity queues build up and delay approaches infinity
- There is no benefit in increasing transmission rate beyond network capacity
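- In a circular network where each flow crosses two links of capacity $C$ , let $λ$ be the rate at which each source sends, $λ^{'}$ the rate of the flow after its first link, and $λ^{''}$ its rate after the second link (its throughput)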
  - If $λ \leq C /2$ , then $λ^{''} = λ^{'} = λ$ : the links can all transmit at the same rate and there is no congestion
  - If $λ > C /2$ , then only a portion of the traffic can be carried by each link
    - $λ^{'} = (C λ) / (λ + λ^{'})$
    - $λ^{''} = (C λ^{'}) / (λ^{'} + λ)$
    - $λ^{''} = C - \frac{λ}{2} (\sqrt{1 + 4 C / λ} - 1)$
  - The throughput increases linearly up to a max of $λ = C /2$ , then decreases exponentially towards 0 from there causing congestion collapse
- Throughput control aims to limit send rates such that congestion collapse does not occur, and flows get a fair share of network resources
- TCP detects network congestion through delays and losses
  - Congestion is assumed when timeout occurs or 3 duplicate ACKs are received
- TCP is a window-based pipelined protocol, where the rate of transmission is window size $W / RTT$
  - Controlling $W$ controls the transmission rate
- Maximum size of a TCP segment is the MSS, Maximum Segment Size, which is determined by the maximum frame size specified by the link layer
- Number of segments in a window of $W$ bytes is $W / MSS$
- The sender maintains a congestion window size, denoted cwnd
  - W = LastByteSent - LastByteAcked <= min(cwnd, rwnd)
  - When rwnd is large, sender cwnd determines the transmission rate, which $\approx \frac{c w n d}{RTT}$ bps
- cwnd is a function of perceived network congestion
  - Varied using additive increases and multiplicative decreases (AIMD)
    - Increase cwnd by 1MSS every RTT until loss detected
    - Cut cwnd in half after loss
    - Achieves a fair allocation rate among competing flows
      - Additive increase gives a slope of 1 as throughput increases
      - Multiplicative decrease decreases throughput proportionally
      - $w (t + 1) = w (t) + 1$ if no loss, $0.5 w (t)$ if loss
      - Ideal operating point for two connections sharing $R$ bandwidth is that both are sending at $R /2$ bps
- TCP starts slowly as AIMD convergence rate is slow
  - Window size increased exponentially until predefined threshold hit (ssthresh)
    - "slow start" phase, cwnd doubled each RTT
    - ssthresh remembers previous window size for which a loss occurred
  - Initial aggressive behaviour ensures sender reaches correct speed quickly
- Losses are detected through timeouts and 3 duplicate ACKs
  - Harsher on timeouts than on duplicate ACKs
  - Timeout indicates a packet loss, so drastic action is taken
    - ssthresh= 0.5 * cwnd, cwnd = 1 MSS
    - Sender enters slow start phase again
  - Losses indicated by duplicate ACKs take less drastic action -
    - ssthresh= 0.5 * cwnd, cwnd = 0.5 * cwnd
    - Window grows again linearly (additively)
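
The interaction of slow start, additive increase and the two kinds of loss can be traced with a small simulation. This is a simplified sketch, not real TCP: cwnd is counted in whole MSS, there is one event per RTT, halving rounds down, and slow start is capped at ssthresh. The event names are made up for the example.

```python
def simulate_cwnd(events, ssthresh, cwnd=1):
    """Return cwnd (in MSS) after each RTT, starting with the initial value.

    events: one of "ack" (no loss), "dup" (3 duplicate ACKs) or "timeout"
    per RTT.
    """
    history = [cwnd]
    for event in events:
        if event == "timeout":
            ssthresh = max(cwnd // 2, 1)     # ssthresh = 0.5 * cwnd
            cwnd = 1                         # back to slow start
        elif event == "dup":
            ssthresh = max(cwnd // 2, 1)     # ssthresh = 0.5 * cwnd
            cwnd = ssthresh                  # cwnd = 0.5 * cwnd
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double each RTT
        else:
            cwnd += 1                        # additive increase: +1 MSS per RTT
        history.append(cwnd)
    return history


events = ["ack"] * 5 + ["dup"] + ["ack"] * 2 + ["timeout"] + ["ack"] * 3
print(simulate_cwnd(events, ssthresh=8))
# [1, 2, 4, 8, 9, 10, 5, 6, 7, 1, 2, 3, 4]
```

The trace shows the window doubling up to ssthresh, growing linearly, halving on duplicate ACKs, and collapsing to 1 MSS on a timeout.
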

Network Layer

The main function of the network layer is to move packets from the source to destination node through intermediate nodes (routers)
Main protocol on this layer is IP
- At the source, the IP header is added with source and destination IP addresses
- Routers check destination IP addresses to decide the next hop
- At the destination, the IP header is stripped and the packet is delivered to the transport layer
Routers have two key functions
- Forwarding, moving packets from the input to the appropriate output
- Routing, constructing routing tables and running routing protocols
Routing tables map destination IP ranges to their output links
- Mapping all 4 billion IP addresses would be impractical
- If the IP ranges don't divide up so nicely, longest prefix matching is used
  - When looking for a table entry for a given destination address, use the longest address prefix that matches the destination address
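
Longest prefix matching can be sketched with Python's standard `ipaddress` module. The routing table contents and link names below are invented for the example, and a real router would use a specialised data structure rather than a linear scan.

```python
import ipaddress


def longest_prefix_match(routing_table, destination):
    """Return the output link for the longest prefix containing destination."""
    dest = ipaddress.ip_address(destination)
    best = None   # (network, link) of the best match so far
    for prefix, link in routing_table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None
                                or network.prefixlen > best[0].prefixlen):
            best = (network, link)
    return None if best is None else best[1]


routing_table = {
    "0.0.0.0/0": "link 3",        # default route, matches everything
    "192.168.0.0/16": "link 0",
    "192.168.1.0/24": "link 1",
}
print(longest_prefix_match(routing_table, "192.168.1.7"))   # link 1
print(longest_prefix_match(routing_table, "192.168.9.1"))   # link 0
print(longest_prefix_match(routing_table, "8.8.8.8"))       # link 3
```
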
IPv4 addresses are 32bits to uniquely identify network interfaces
- IP addresses belonging to the same subnet share the same prefix, whose length is given by the subnet mask
- Interfaces on the same subnet are connected by a link layer switch and communicate directly
- IP addresses have their subnet mask specified as the number of bits as a prefix
  - CIDR notation is xxx.xxx.xxx.xxx/xx
- A sender checks if the destination IP has the same subnet prefix as its own
  - If it does then obtain the MAC address of the destination and forward the packet to the link layer switch
- If source and destination belong to different subnets, then the source forwards the packet to its default gateway
  - Gateway routers connect subnets
  - If A wants to communicate with B on a different subnet, it forwards the packets to R, the default gateway
  - R will look up in its routing table to forward A's packets to the correct outgoing interface
  - When the packet reaches the interface, it will be forwarded to B through the switch in B's subnet
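The sender's decision between direct delivery and the default gateway can be sketched as below. This is only an illustration using the standard `ipaddress` module; the function name and the addresses in the usage note are hypothetical.

```python
import ipaddress


def next_hop(source_interface, destination, default_gateway):
    """Decide where the sender hands a packet: the destination or the gateway.

    source_interface is the sender's address in CIDR notation, e.g. "192.168.1.10/24".
    """
    subnet = ipaddress.ip_interface(source_interface).network
    if ipaddress.ip_address(destination) in subnet:
        return destination       # same subnet: deliver directly via the switch
    return default_gateway       # different subnet: send to the default gateway
```

For a sender at `192.168.1.10/24` with gateway `192.168.1.1`, a packet for `192.168.1.77` goes straight to that host, while a packet for `192.168.2.5` goes to the gateway.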
Nodes have two options for acquiring IP addresses
- Network admins can manually configure the IP of each host on the network
- DHCP is an application layer protocol that dynamically assigns IP addresses from the server to clients
- The subnet mask and default gateway must be provided in both cases
Networks are allocated subnets from the ISP's address space
- Global authority ICANN is responsible for allocating IP addresses to ISPs
Network Address Translation (NAT) is used so that each IP address on a subnet does not need a globally unique IP, as ICANN have run out of them (4 billion is not enough)
- Unique IP addresses are provided to public gateway routers
- Private IP addresses that are unique only on the subnet are allocated by the gateway router
- Devices in home or private networks need not be visible to the public internet, they can use private IP addresses to communicate with each other and communication with the internet is done via the gateway router
- Packets with private IP addresses cannot be carried by the public internet
- Private source IP addresses are converted to the public IP address of the router facing the internet
- Incoming packets for different hosts are distinguished by different ports on the router
- Address shortage is solved by IPv6 with 128-bit addresses, but it is not in wide use yet
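A rough sketch of the translation table a NAT gateway might keep, assuming it hands out one public port per private (IP, port) pair. The class name, starting port and addresses are all made up for illustration.

```python
class NatRouter:
    """Toy NAT table mapping private (ip, port) pairs to public ports."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 5001   # arbitrary starting port for this sketch
        self.out_map = {}       # (private ip, private port) -> public port
        self.in_map = {}        # public port -> (private ip, private port)

    def outgoing(self, private_ip, private_port):
        """Rewrite a private source address to the router's public address."""
        key = (private_ip, private_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out_map[key]

    def incoming(self, public_port):
        """Find which private host an incoming packet belongs to, if any."""
        return self.in_map.get(public_port)
```

Two hosts using the same private port get different public ports, which is how incoming packets are told apart.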
At each router, a routing protocol such as RIP or OSPF constructs the routing table
- Each routing protocol implements a routing algorithm
- Networks are abstracted as graphs $G = (N, E)$
  - $N = \{u, v, w, x, y, z\}$ is the set of routers
  - $E = \{(u, v), (u, x), (v, x), (v, w), (x, w), (x, y), (w, y), (w, z), (y, z)\}$ is the set of links
  - Each edge $(x, y)$ has a cost associated with it, $c (x, y)$
    - $c (x, y) = \infty$ if $x$ and $y$ are not direct neighbours
  - The cost of a path $(x_{1}, x_{2}, x_{3}, ..., x_{p}) = c (x_{1}, x_{2}) + c (x_{2}, x_{3}) + ... + c (x_{p - 1}, x_{p})$
  - The idea is that given a source $x$ and destination $y$ , what is the least cost path from $x$ to $y$
    - Need the shortest path from each node to every other node to populate the routing table
- Two types of routing algorithm are used
  - Global requires the knowledge of the complete topology at each router including costs
    - Link state algorithms
  - Local requires only knowledge of the network surrounding the router
Dijkstra’s algorithm is a link-state routing algorithm that computes the least cost path from one node (the source) to all other nodes
- Implemented in Open Shortest Path First (OSPF) protocol
- Each node requires the entire topology, which is obtained through broadcasting link states
- Maintains a set of visited nodes $N^{'}$ , initially only the source $u$
- For all nodes $v$
  - If $v$ adjacent to $u$
    - $D (v) = c (u, v)$ , store current estimates of shortest distance
    - $p (v) = u$ , store predecessor node of $v$ along the current shortest path from $u$ to $v$
  - else, $D (v) = \infty$ , $p (v) = null$ , initialise all other nodes to be infinite distance away with no known predecessor yet
- While there are nodes not in $N^{'}$ (not yet visited)
  - Add the node $w$ not in $N^{'}$ with the smallest $D (w)$ to $N^{'}$
  - For all $v$ adjacent to $w$ and not in $N^{'}$
    - If $D (v) > D (w) + c (w, v)$
      - $D (v) = D (w) + c (w, v)$ , update distance to the unvisited neighbour $v$ of $w$ if it is smaller
      - $p (v) = w$
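The steps above can be sketched in Python as follows, keeping the names $N^{'}$ , $D$ and $p$ from the notes. The link costs in `EXAMPLE_COST` are invented for illustration (only the edge set comes from the graph above), and a linear scan is used to pick the next node rather than a priority queue.

```python
import math

# Link costs are made up for illustration; the edge set is the one listed above
EXAMPLE_COST = {
    frozenset(edge): weight
    for edge, weight in {
        ("u", "v"): 2, ("u", "x"): 1, ("v", "x"): 2, ("v", "w"): 3, ("x", "w"): 3,
        ("x", "y"): 1, ("w", "y"): 1, ("w", "z"): 5, ("y", "z"): 2,
    }.items()
}


def dijkstra(nodes, cost, source):
    """cost maps frozenset({a, b}) -> link cost; missing pairs are not neighbours."""
    def c(a, b):
        return cost.get(frozenset((a, b)), math.inf)

    visited = {source}                                    # N'
    D = {v: c(source, v) for v in nodes if v != source}   # distance estimates
    p = {v: (source if D[v] < math.inf else None) for v in D}

    while len(visited) < len(nodes):
        # Pick the unvisited node with the smallest current estimate
        w = min((v for v in nodes if v not in visited), key=lambda v: D[v])
        visited.add(w)
        for v in nodes:
            if v not in visited and D[w] + c(w, v) < D[v]:
                D[v] = D[w] + c(w, v)   # shorter path to v found via w
                p[v] = w
    return D, p
```

Following the predecessors back from any node gives its least cost path from the source, which is what populates the routing table.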
The Distance Vector (DV) algorithm is used in the Routing Information Protocol (RIP)
- Uses local information from neighbouring nodes to compute shortest paths
- Based on the Bellman-Ford equation
  - $d_{x} (y)$ is the length of the shortest path from $x$ to $y$
  - BF equation relates $d_{x} (y)$ to $d_{v} (y)$ , where $v \in N (x)$ (the set of neighbours of $x$ )
  - $d_{x} (y) = \min_{v} \{ c (x, v) + d_{v} (y) \}$
    - If $v^{*}$ minimises the above sum, then it is the next-hop node in the shortest path
- $D_{x} (y)$ is the current estimate of the minimum distance from $x$ to $y$ (different to actual minimum distance $d_{x} (y)$ )
  - DV algorithm tries to converge estimates to their actual values
  - Each node maintains a distance vector $D_{x} = [D_{x} (y) : y \in N]$
  - Node $x$ performs the update $D_{x} (y) = \min_{v} \{ c (x, v) + D_{v} (y) \}$
    - Node $x$ needs the cost to each neighbour, and the distance vector of each neighbour (obtained via message passing)
    - Whenever any of these is updated, the node recomputes its distance vector and updates all its neighbours
  - Each node:
    - Wait for a change in local link cost or a message from neighbour
    - Recompute estimates using BF equation
    - If DV to any destination has changed, notify neighbours
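A small sketch of the distance vector update, simulated in a single process rather than with real message passing. It assumes link costs never change, so estimates only ever decrease, and the function name and example network in the usage note are made up.

```python
import math


def distance_vectors(nodes, cost):
    """Repeat the Bellman-Ford update at every node until nothing changes.

    cost maps frozenset({a, b}) -> link cost for direct neighbours only.
    """
    def c(a, b):
        return cost.get(frozenset((a, b)), math.inf)

    # D[x][y] is x's current estimate of its distance to y
    D = {x: {y: (0 if x == y else c(x, y)) for y in nodes} for x in nodes}
    changed = True
    while changed:
        changed = False
        for x in nodes:
            neighbours = [v for v in nodes if v != x and c(x, v) < math.inf]
            for y in nodes:
                if y == x:
                    continue
                # D_x(y) = min over neighbours v of c(x, v) + D_v(y)
                best = min((c(x, v) + D[v][y] for v in neighbours), default=math.inf)
                if best < D[x][y]:
                    D[x][y] = best
                    changed = True   # x would now notify its neighbours
    return D
```

With links x-y of cost 2, y-z of cost 1 and x-z of cost 7, node x's estimate for z converges from 7 to 3 (via y).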

Selected Topics

A network interface is how the computer connects to a network
- Node can have multiple interfaces
- Loopback address (localhost) is simulated interface
- Each interface has an IP address
- Each NIC has a MAC address
Internet protocols specify the structure of internet packets
- Packet headers are added at each layer of the network stack
- Ethernet header from link layer describes source and destination MAC addresses
  - Fixed length header
- IP header from network layer describes source and destination IP
  - Variable length header, length stored in IHL field
  - Stores protocol of transport layer too
- TCP/UDP header from transport layer has port numbers, control bits
  - TCP also has a data offset field, as its header length is variable
  - Has sequence number, ACK number and checksum
- Application message is after the three headers
SYN attacks are when a malicious attacker sends a flood of TCP packets with the SYN bit set
- This causes the server to reply with SYN ACK for each packet received, creating a bunch of half open connections waiting for an ACK that never arrives
- The server is then too busy to respond to any other users
- Denial of service attack
MAC addresses are the addresses of the physical network interface hardware
- Address Resolution Protocol (ARP) determines the MAC addresses of hardware from the IP addresses
- The router broadcasts an ARP request packet to all interfaces on the link
- The ARP reply is sent by the node with the requested address
- MAC address is saved in ARP cache for future use
ARP allows unsolicited replies from anyone, so an attacker can send an unsolicited ARP reply pretending to be another address.
- This poisons the ARP cache with an incorrect entry, and the device will then send all messages intended for the spoofed address to the attacker

CS257

Memory Systems

Main Memory

We have a memory hierarchy to balance the tradeoff between cost and speed
Want to exploit temporal and spatial locality
Moore's law is long dead and never really applied to memory
The basic element of main memory is a memory cell capable of being read from or written to
- Need to indicate read/write, data input, and also an enable line
When organising memory cells into a larger chip, it is important to maintain a structured approach and keep the circuit as compact as possible
- For example, a 16 word x 8 bit memory chip requires 128 cells and 4-bit addresses
- A 1024 bit device as a 128x8 array requires 7 address pins and 8 data pins
  - Alternatively, it is possible to organise it as a 1024x1 array, which would be really dumb as it would result in a massive decoder and inefficient space usage
- Dividing the address inputs into 2 parts, column and row address, minimises the decoder space and allows more space for memory
Can use the same principle to build smaller ICs into larger ICs, using decoders/multiplexers to split address spaces
Semiconductor memory is generally what's used for main store, Random Access Memory
Two main technologies:
- Static RAM (SRAM) uses a flip-flop as a storage element for each bit
- Dynamic RAM (DRAM) uses the presence or lack of charge in a capacitor for each bit
  - Charge leaks away over time so needs refreshing, but DRAM is generally cheaper if the overhead of the refresh circuitry is sufficiently amortised
- SRAM typically faster so is used for cache
- DRAM used for main memory
The interface to main memory is always a bottleneck so we can do some fancy DRAM organisation stuff
- Synchronous DRAM exchanges data with the processor according to an external clock
  - Clock runs at the speed of the bus to avoid waiting on memory
  - Processor can perform other tasks while waiting because clock period and wait times are known
- Rambus DRAM was used by Intel for Pentium and Itanium
  - Exchanges data over a 28-wire bus no more than 12cm long
  - Provides address and control information
  - Asynchronous and block-oriented
  - Fast because requests are issued by the processor over the RDRAM bus instead of using explicit R/W and enable signals
  - Bus properties such as impedances must be known to processor
- DDR SDRAM extends SDRAM by sending data to the processor on both rising and falling edge
  - Actually used
- Cache DRAM (CDRAM) combines DRAM with a small SRAM cache
  - Performance very dependent upon domain and load
ROM typically used in microprogramming or systems stuff
- ROM is mask-written read only memory
- PROM is same as above, but electrically written
- EPROM is same as above, but is erasable via UV light at the chip level
- EEPROM is erasable electrically at the byte-level
Flash memory is a high speed semiconductor memory
- Used for persistent storage
- Limited to block-level erasure
- Uses typically 1 transistor per bit

Interleaved Memory

A collection of multiple DRAM chips grouped to form a memory bank
$n$ banks can service $n$ requests simultaneously, increasing memory read/write rates by a factor of $n$
If consecutive words of memory are stored in different banks, the transfer of a block of memory is sped up
Distributing addresses among memory units/banks is called interleaving
- Interleaving addresses among $n$ memory units is known as $n$ -way interleaving
Most effective when the number of memory banks is equal to number of words in a cache line
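A tiny sketch of how a word address could be split across banks, assuming low-order interleaving (consecutive words go to consecutive banks). The function name is made up.

```python
def interleave(address, n_banks):
    """Low-order interleaving: return (bank, offset within bank) for a word address."""
    return address % n_banks, address // n_banks
```

With 4 banks, addresses 0 to 7 land in banks 0, 1, 2, 3, 0, 1, 2, 3, so a block of 4 consecutive words can be fetched from all banks at once.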

Virtual Memory

Virtual memory is a hierarchical system across caches, main memory and swap that is managed by the OS
Locality of reference principle: addresses generated by the CPU should be in the first level of memory as often as possible
- Use temporal, spatial, sequential locality to predict
- The working set of memory addresses usually changes slowly so should maintain it closest to CPU
Performance is measured as hit ratio $H = \frac{N _{1}}{N _{1} + N _{2}}$ (assuming a two-level memory hierarchy with data in $M_{1}$ and $M_{2}$ , where $N_{1}$ and $N_{2}$ are the number of references satisfied by each)
The average access time $t_{A} = H t_{A 1} + (1 - H) t_{A 2}$
- When there is a miss, the block is swapped in from $M_{2}$ to $M_{1}$ then accessed
- $t_{B}$ is the time to transfer a block, so $t_{A 2} \approx t_{B}$
- $r = t_{A 2} / t_{A 1}$ , the access time ratio of the two levels
- $e = t_{A 1} / t_{A}$ , the factor by which average access time differs from minimum, access efficiency
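The measures above are easy to play with numerically. A minimal sketch, assuming $N_{1}$ and $N_{2}$ count the references satisfied by $M_{1}$ and $M_{2}$ respectively; the function names and the numbers in the usage note are made up.

```python
def hit_ratio(n1, n2):
    """H = N1 / (N1 + N2)."""
    return n1 / (n1 + n2)


def average_access_time(h, t_a1, t_a2):
    """t_A = H * t_A1 + (1 - H) * t_A2."""
    return h * t_a1 + (1 - h) * t_a2


def access_efficiency(h, t_a1, t_a2):
    """e = t_A1 / t_A, which equals 1 / (H + (1 - H) * r) with r = t_A2 / t_A1."""
    return t_a1 / average_access_time(h, t_a1, t_a2)
```

With 950 hits out of 1000 references, a 10 ns first level and a 1000 ns second level, the average access time is 59.5 ns, so even a 5% miss rate costs most of the speed of the fast level.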
Memory capacity is limited by cost considerations, so wasting space is bad
- The efficiency with which space is being used can be defined as the ratio of useful stuff in memory over total memory, $u = S_{u} / S$
- Wasted space can be empty due to fragmentation, or inactive data that is never used
- System also takes up some memory space
Virtual memory space is usually much greater than physical
- If a memory address is referenced that is not in main memory, then there is a page fault and the OS fetches the data
- When virtual address space is much greater than physical, most page table entries are empty
  - Fixed by inverted hashed page tables, where page numbers are hashed to smaller values that index a page table where each entry corresponds to physical frames
  - Hash collisions handled by extra chain field in the page table which indicates where colliding entry lives
  - Lookup process is:
    - Hash page number
    - Index the page table using hash. If the tag matches then page found
    - If not then check chain field and go to that index
      - If chain field is null then page fault
  - Average number of probes for an inverted page table with good hashing algorithm is 1.5
    - Practical to have a page frame table with twice as many entries as frames of memory
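The lookup process above can be sketched as below. This is a toy model only: the hash is a plain modulo, overflow entries go in the first free slot, and there is no removal, all of which are assumptions made for the sketch.

```python
class InvertedPageTable:
    """Toy hashed page table with a chain field for collisions."""

    def __init__(self, n_entries):
        self.n = n_entries
        # Each entry is [page number tag, chain index], or None when free
        self.entries = [None] * n_entries

    def _hash(self, page):
        return page % self.n    # toy hash function, an assumption of this sketch

    def insert(self, page):
        index = self._hash(page)
        if self.entries[index] is None:
            self.entries[index] = [page, None]
            return index
        # Collision: walk to the end of the chain, then link in a free entry
        while self.entries[index][1] is not None:
            index = self.entries[index][1]
        free = self.entries.index(None)     # raises ValueError if the table is full
        self.entries[free] = [page, None]
        self.entries[index][1] = free
        return free

    def lookup(self, page):
        """Return the index of the entry holding page, or None for a page fault."""
        index = self._hash(page)
        while index is not None and self.entries[index] is not None:
            tag, chain = self.entries[index]
            if tag == page:
                return index
            index = chain       # follow the chain field
        return None
```

In an 8-entry table, pages 3 and 11 hash to the same slot, so the second one is found by following the chain from slot 3.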
Segmentation allows programmer to view memory as multiple address spaces - segments
- Each segment has its own access and usage rights
- Provides a number of advantages:
  - Simplifies dynamic data structures, as segments can grow/shrink
  - Programs can be altered and recompiled independently without relinking and reloading
  - Can be shared among processes
  - Access privileges give protection
- Programs divided into segments which are logical parts of variable length
- Segments are made up of pages, so the segment table is used to locate the relevant page table
  - Two levels of lookup tables, address split into 3
Translation Lookaside Buffer (TLB) holds the most recently referenced table entries as a cache
- When TLB misses, there is a significant overhead in searching main memory page tables
- Average address translation time $t_{t} = t_{tlb} + (1 - H_{tlb}) t_{mt}$ , where $t_{mt}$ is the time to translate using the page tables in main memory
- TLB miss ratio usually low, less than 0.01
Page size $S_{p}$ has an impact on memory space utilisation factor
- Too large, then excessive internal fragmentation
- Too small, then page tables become large and reduces space utilisation
- $S_{s}$ is the segment size in words, so when $S_{s} > S_{p}$ , the last page assigned to a segment will contain on average $S_{p} /2$ words
- Size of the page table associated with each segment is approx $S_{s} / S_{p}$ words, assuming each table entry is 1 word
- Memory overhead for each segment is $S = \frac{S _{p}}{2} + \frac{S _{s}}{S _{p}}$
- Space utilisation is therefore $u = \frac{S _{s}}{S _{s} + S} = \frac{2 S _{s} S _{p}}{S _{p}^{2} + 2 S _{s} ( 1 + S _{p} )}$
- Optimum page size = $\sqrt{2 S_{s}}$
- Optimum utilisation = $\frac{1}{1 + \sqrt{2/ S _{s}}}$
- Hit ratio increases with page size up to a maximum, then begins to decrease again
  - Value of $S_{p}$ yielding max hit ratios can be greater than the optimum page size for utilisation
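Minimising the overhead $S = \frac{S_{p}}{2} + \frac{S_{s}}{S_{p}}$ with respect to $S_{p}$ gives $S_{p} = \sqrt{2 S_{s}}$ . A quick numerical check of this, as a sketch with made-up function names and an arbitrary segment size in the usage note:

```python
import math


def utilisation(s_s, s_p):
    """u = S_s / (S_s + S), with per-segment overhead S = S_p / 2 + S_s / S_p."""
    overhead = s_p / 2 + s_s / s_p
    return s_s / (s_s + overhead)


def best_page_size(s_s, candidates):
    """Brute-force search for the page size with the highest utilisation."""
    return max(candidates, key=lambda s_p: utilisation(s_s, s_p))
```

For a 5000 word segment, searching page sizes from 1 to 1000 words picks 100, which matches $\sqrt{2 \times 5000}$ , and the utilisation there matches $\frac{1}{1 + \sqrt{2/5000}}$ .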
When a page fault occurs, the memory management software is called to swap in a page from secondary storage
- If memory is full, it is necessary to swap out a page
- Efficient page replacement algorithm required
  - Doing it randomly would be fucking stupid, might evict something being used
  - FIFO is simple and removes oldest page, but still might evict something being used
  - Clock replacement algorithm modifies FIFO, keeping track of unused pages through a use bit
    - Use bit is set when a page is referenced and cleared as the clock hand sweeps past it, so a page with a clear use bit has not been used recently and can be evicted
  - LRU algorithm works well but complex to implement, requires an age counter per entry
    - Usually approximated through use bits set at intervals
  - Working set replacement algorithm keeps track of the set of pages referenced during a time interval
    - Replaces the page which has not been referenced during the preceding time interval
    - As time passes, a moving window captures a working set of pages
    - Implementation is complex
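A minimal sketch of clock replacement, assuming the usual convention that a page's use bit is set when it is referenced and cleared when the clock hand passes over it. The function name and reference string are made up.

```python
def clock_replace(references, n_frames):
    """Simulate clock page replacement; return (page faults, final frame contents)."""
    frames = [None] * n_frames
    use = [False] * n_frames
    hand = 0
    faults = 0
    for page in references:
        if page in frames:
            use[frames.index(page)] = True      # referenced: set the use bit
            continue
        faults += 1
        # Sweep the hand, giving pages with a set use bit a second chance
        while use[hand]:
            use[hand] = False
            hand = (hand + 1) % n_frames
        frames[hand] = page                      # evict the page under the hand
        use[hand] = True
        hand = (hand + 1) % n_frames
    return faults, frames
```

On the reference string 1, 2, 3, 4, 2, 5 with 3 frames, page 2 survives the last fault because it was referenced recently, where plain FIFO would have evicted it.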
Thrashing occurs when there are too many processes in too little memory and the OS spends most of its time swapping pages
- Get a better page replacement algorithm
- Close some chrome tabs
- Download more RAM

Cache

Cache contains copies of sections of main memory and relies on locality of reference
Objective of cache is to have as high a hit ratio as possible
Three techniques used for cache mapping
- Direct, maps each block of memory to only one possible cache line
- Associative, permits each main memory block to be loaded into any line of cache
  - Cache control logic must examine each cache line for a match
- Set associative, each memory block can be loaded into any one of a fixed set of cache lines
In direct mapping, address is divided into three fields: tag, line and word
- Cache is accessed with the same line and word as main memory
- Tag is stored with data in the cache
  - If tag matches that of the address, then that's a cache hit
  - If a miss occurs, the new data and tag is fetched to cache
- Simple and inexpensive
- Fixed cache location for each block means that if two needed blocks map to the same line then cache will thrash
- Victim cache was originally proposed as a solution
  - A fully associative cache of 4-16 lines sat between L1 and L2
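A sketch of how a direct-mapped cache splits an address into tag, line and word, and how two blocks mapping to the same line thrash. The field widths are parameters chosen by the caller, the cache stores tags only (no data), and the names are made up.

```python
def split_address(address, line_bits, word_bits):
    """Split an address into (tag, line, word) fields, word being the low bits."""
    word = address & ((1 << word_bits) - 1)
    line = (address >> word_bits) & ((1 << line_bits) - 1)
    tag = address >> (word_bits + line_bits)
    return tag, line, word


class DirectMappedCache:
    """Tracks only the tag stored in each line, enough to tell hits from misses."""

    def __init__(self, line_bits, word_bits):
        self.line_bits = line_bits
        self.word_bits = word_bits
        self.tags = [None] * (1 << line_bits)

    def access(self, address):
        """Return True on a hit; on a miss the line is refilled with the new tag."""
        tag, line, _ = split_address(address, self.line_bits, self.word_bits)
        hit = self.tags[line] == tag
        self.tags[line] = tag
        return hit
```

With 4 line bits and 2 word bits, address `0b1011011001` splits into tag `0b1011`, line `0b0110` and word `0b01`. Alternating between two addresses that share a line but differ in tag misses every time.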
Fully associative cache schemes divide the CPU address into tag and word
- Cache accessed by same word
- Tag stored with data, have to examine every tag to determine if there's a cache miss
  - Complex because of this
Set associative combines the two, where a given block maps to any line in a given set
- eg, a 4-way cache has 4 lines per set and a block can map to any one of these 4
- Performance increases diminish as set size increases
Performance can be improved with separate instruction and data caches, L1 usually split
Principle of inclusion states that L1 should always be subset of L2, L2 subset of L3, etc
- When L3 is fetched to, data is written to L2 and L1 also
Writing to cache can result in cache and main memory having inconsistent data
- It is necessary to be coherent if
  - I/O operates on main memory
  - Processors share main memory
- There are two common methods for maintaining consistency
  - With write through, every write operation to cache is repeated to main memory in parallel
    - Adds overhead to write to memory, but usually there are several reads between each write
    - Average access time $t_{a} = t_{c} + (1 - h) t_{b} + w (t_{m} - t_{c}) = (1 - w) t_{c} + (1 - h + w) t_{m}$
      - Assumes $t_{b} = t_{m}$ is time to transfer block to cache, and $w$ is the fraction of references that are writes
    - Main memory write operation must complete before any further cache operations
      - If size of block matches datapath width, then whole block can be transferred in one operation, $t_{b} = t_{m}$
      - If not, then $b$ transfers are required and $t_{b} = b t_{m}$
    - Write through often enhanced by buffers for writes to main memory, freeing cache for subsequent accesses
    - In some systems, cache is not fetched to when a miss occurs on a write operation, meaning data is written to main memory but not cache
      - Reduces average access time as read misses incur less overhead
  - With write back, a write operation to main memory is performed only at block replacement time
    - Increases efficiency if variables are changed a number of times
    - Simple write back refers to always writing back a block when a swap is required, even if data is unaltered
    - Average access time becomes $t_{a} = t_{c} + 2 (1 - h) t_{b}$
      - x2 because you write the block back then fetch a new one
    - Tagged write back only writes back a block if the contents have altered
      - 1-bit tag stored with each block, and is set when block altered
      - Tags examined at replacement time
      - Access time $t_{a} = t_{c} + (1 - h) t_{b} + w_{b} (1 - h) t_{b}$
      - $w_{b}$ is the probability a block has been altered
    - Write buffers can also be implemented
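The three access time formulas can be compared side by side. A small sketch with made-up function names; the write through version assumes $t_{b} = t_{m}$ as in the notes, and the numbers in the usage note are arbitrary.

```python
def write_through(t_c, t_m, h, w):
    """t_a = t_c + (1 - h) t_b + w (t_m - t_c), assuming t_b = t_m."""
    return t_c + (1 - h) * t_m + w * (t_m - t_c)


def simple_write_back(t_c, t_b, h):
    """t_a = t_c + 2 (1 - h) t_b: every miss writes a block back and fetches one."""
    return t_c + 2 * (1 - h) * t_b


def tagged_write_back(t_c, t_b, h, w_b):
    """t_a = t_c + (1 - h) t_b + w_b (1 - h) t_b: only altered blocks are written back."""
    return t_c + (1 - h) * t_b + w_b * (1 - h) * t_b
```

With $t_{c} = 1$ , $t_{m} = t_{b} = 10$ , $h = 0.9$ , $w = 0.2$ and $w_{b} = 0.5$ , these give roughly 3.8, 3.0 and 2.5 respectively.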
Most modern processors have at least two cache levels
- Normal memory hierarchy principles apply, though on an L2 miss data is written to L1 and L2
- With two levels, average access time becomes $t_{a} = t_{c 1} + (1 - h_{1}) t_{c 2} + (1 - h_{2}) t_{m}$
A replacement policy is required for evicting cache lines in associative and set-associative mappings
- Most effective policy is LRU, implemented totally in hardware
- Two possible implementations, counter and reference matrix
  - A counter associated with each line is incremented at regular intervals and reset when the line is referenced
    - Reset every time line is accessed
    - On a miss when the cache is full, the line with a counter set at the maximum value is replaced and counter reset, all other counters set to 0
- Reference matrix is based on a matrix of status bits
  - If $B$ lines to consider, then the upper triangular matrix of a $B \times B$ matrix is formed without the diagonal, with $(B \times (B - 1)) /2$ bits
  - When the $i$ th line is referenced, all bits in the $i$ th row are set to one and $i$ th column is zeroed
  - The least recently used one is one that has all 0s in its row and all 1s in its column
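A sketch of the reference matrix scheme. For simplicity it stores the full $B \times B$ matrix rather than only the upper triangle, which is an assumption of this sketch and not how the hardware would save bits.

```python
class ReferenceMatrixLRU:
    """LRU tracking with a B x B matrix of status bits."""

    def __init__(self, n_lines):
        self.n = n_lines
        self.bits = [[0] * n_lines for _ in range(n_lines)]

    def reference(self, i):
        """Set row i to ones, then zero column i (including the diagonal)."""
        for j in range(self.n):
            self.bits[i][j] = 1
        for j in range(self.n):
            self.bits[j][i] = 0

    def least_recently_used(self):
        """The LRU line is the one whose row is all zeros."""
        for i, row in enumerate(self.bits):
            if not any(row):
                return i
        return None
```

After referencing lines 2, 0, 3, 1 in that order, row 2 is the only all-zero row, so line 2 is the one to replace.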
There are three types of cache miss:
- Compulsory, where an access will always miss because it is the first access to the block
- Capacity, where a miss occurs because a cache is not large enough to contain all the blocks needed
- Conflict, misses occurring as a result of blocks not being fully associative
- Sometimes a fourth category, coherency, is used to describe misses occurring due to cache flushes in multiprocessor systems
Performance measures based solely on hit rate don't factor in the actual cost of a cache miss, which is the real performance issue
- Average memory access time = hit time + (miss rate x miss penalty)
- Measuring access time can be a more indicative measure
There are a number of measures that can be taken to optimise cache performance
- Have larger block sizes to exploit spatial locality
  - Likely to reduce number of compulsory misses
  - Will increase cache miss penalty
- Have a larger cache
  - Longer hit times and increased power consumption and more expensive
- Higher levels of associativity
  - Reduces number of conflict misses
  - Can cause longer hit times and increased power consumption
- Multilevel Caches
  - Idea is to reduce miss penalty
  - L1 cache keeps pace with CPU clock, further caches serve to reduce the number of main memory accesses
  - Can redefine average access time for multilevel caches: L1 hit time + (L1 miss rate x (L2 hit time + (L2 miss rate x L2 miss penalty)))
- Prioritising read misses over writes
  - Write buffers can hold updated value for a location needed on a read miss
  - If no conflicts, then sending the read before the write will reduce the miss penalty
  - Optimisation easily implemented in write buffer
  - Most modern processors do this as cost is low
- Avoid address translation during cache indexing
  - Caches must cope with the translation of virtual addresses to physical
  - Using the page offset to index the cache means the cache lookup does not have to wait for the TLB
    - Imposes restrictions in structure and size of cache
- Controlling L1 cache size and complexity
  - Fast clock cycles encourage small and simple L1 caches
  - Lower levels of associativity can reduce hit times as they are less complex
- Way prediction
  - Reduce conflict misses
  - Keep extra bits in cache to predict the way (the block within the set) of the next cache access
  - Requires block predictor bits in each block
    - Determine which block to try on the next cache access
    - If prediction correct then latency is equal to direct mapped, otherwise at least an extra clock cycle required
    - Prediction accuracy commonly 90%+ for 2-way cache
- Pipelined access
  - Effective latency of an L1 cache hit can be multiple cycles
  - Pipelining allows increased clock speeds and bandwidth
  - Can incur slower hit times
- Non-blocking cache
  - Processors in many systems do not need to stall on a data cache miss
    - Instruction fetch could be performed while data fetched from main memory following a miss
  - Allows more than one cache request to be issued at a time
    - Cache can continue to supply hits immediately following a miss
  - Performance hard to measure and model
    - Out-of-order processors can hide impact of L1 misses that hit L2
- Multi-bank caches
  - Increase cache bandwidth by having multiple banks that support simultaneous access
  - Ideal if cache accesses spread themselves across banks
    - Sequential interleaving spreads block addresses sequentially across banks
- Critical word first
  - A processor often only needs one word of a block at a time
  - Request the missing word first and send it to the processor, then fill the remainder of the block
  - Most beneficial for large caches with large blocks
- Merging write buffer
  - Write buffers are used by write-through and write-back caches
  - If write buffer is empty then data and full address are written to buffer
  - If write buffer contains other modified blocks then address can be checked to see if new data and buffer entry match, and the data is combined with the buffer entry
    - Known as write merging
  - Reduces miss penalty
- Hardware prefetching
  - Put the data in cache before it's requested
  - Instruction prefetches usually done in hardware
  - Processor fetches two blocks on a miss, the missed block and then prefetches the next one
  - Prefetched block put in instruction stream buffer
- Compiler driven prefetching
  - Reduces miss rate and penalty
  - Compiler inserts prefetching instructions based on what it can deduce about a program
- Compiler can make other optimisations such as loop interchange and blocking
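The average memory access time formulas from this section, single level and multilevel, can be sketched as below. The function names and the numbers in the usage note are made up for illustration.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + (miss rate x miss penalty)."""
    return hit_time + miss_rate * miss_penalty


def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, l2_miss_penalty):
    """The L1 miss penalty is itself the average access time of L2."""
    return amat(l1_hit, l1_miss_rate, amat(l2_hit, l2_miss_rate, l2_miss_penalty))
```

With a 1 cycle L1 hit, 5% L1 miss rate and 100 cycle memory penalty, the single level figure is 6 cycles. Adding an L2 with a 10 cycle hit time and 20% miss rate brings it down to 2.5 cycles.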

Processor Architecture

CPU Organisation & Control

Processor continuously runs fetch-decode-execute cycle
- Each instruction cycle takes several CPU clock cycles
- Requires interaction of lots of CPU components
  - ALU, CU, PC, IR, MAR, MDR
- Machine instructions may specify
  - Op code
  - Source operand reference
  - Result operand reference
  - Next instruction reference
- Some CPU registers are user-visible, such as data and address registers
- Control and status registers are used by CU and privileged OS processes only
- Executing an instruction may involve one or more operands, each requiring to be fetched
  - Can account for this in instruction cycle model known as indirect cycle
Instruction pipelining makes use of otherwise wasted time, as new inputs can be accepted before previously accepted instructions have been output
Control unit is responsible for generating control signals to drive cycle
- Observe opcode input and choose right control signal - decode
- Assert control signals - execute
- Two approaches to CU design:
  - Hardwired
    - Uses a sequencer and a digital logic circuit that produces outputs
    - Fast but limited by complexity and inflexibility
  - Microprogrammed
    - Uses a microprogram memory
    - Has its own fetch-execute cycle - mini computer in the CPU
      - Microaddress, MicroPC, MicroIR, microinstructions
    - Easy to design, implement, flexible, can be reprogrammed
    - Slower than hardwired
Instruction sequencing is important to be designed to utilise as many memory cycles as possible, possibly by overlapping fetches
- Proper sequence must be followed in sequencing control signals, to avoid conflicts
  - MAR <- PC must precede MDR <- Memory
Micro-ops are enabled by control signals to transfer data between registers/busses and perform arithmetic or logical operations
- Each step in the operation of a larger machine instruction is encoded into a micro-instruction
- Micro-instructions make up the micro-program
- Micro-program word length is based on 3 factors:
  - The max number of simultaneous micro-ops supported
  - How control info is represented/encoded
  - How the next micro-instruction address is specified
- Horizontal/direct control has very wide word length with few micro-instructions per machine instruction
  - Outputs buffered/gated with timing signals
  - Fewer instructions == faster
- Vertical control uses narrower instructions with $n$ control signals encoded into $\log_{2} n$ bits
  - Limited ability to express parallelism
  - Requires external decoder to identify what control lines are being asserted

Performance

M J Flynn in 1966 defined a simple means of classifying machines, SISD is one such classification
- Uses fetch-decode-execute
- Fetch sub-cycle is fairly constant-ish speed
- Execute sub-cycle may vary in speed greatly
A simple measure of performance is MIPS, millions of instructions per second
- Not actually that useful as it measures how fast a processor can do nothing
Parallel performance is very difficult to measure due to system architecture and degree of parallelism varying
Instruction bandwidth measures the instruction execution rate, similar to MIPS
Data bandwidth measured in FLOPS measures the throughput
It is nigh-on impossible to get full theoretical throughput in any system, especially parallel
Speedup is a useful measure that factors in the degree of parallelism
- $S (n) =$ (Execution time on sequential machine, $T (1)$ ) / (Execution time on parallel machine, $T (n)$ )
- A closely related measure is efficiency, $E_{n} = S (n) / n$
- Both measures depend on parallelism of algorithm
An algorithm may be characterised by its degree of parallelism $n_{i}$ , which is the degree of parallelism that exists at time $i$
Assume all computations are of two types, vector operations of length $N$ and scalar operations where $N = 1$
- $f$ is the total proportion of scalar ops, so $1 - f$ is the measure of parallelism in the program
- $b_{v}$ is the throughput of vector ops in MFLOPS and $b_{s}$ is the scalar throughput
  - Average throughput $\frac{1}{b} = \frac{f}{b _{s}} + \frac{1 - f}{b _{v}}$
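A quick sketch of the speedup, efficiency and average throughput measures. Function names and the example numbers are made up.

```python
def speedup(t_1, t_n):
    """S(n) = T(1) / T(n)."""
    return t_1 / t_n


def efficiency(t_1, t_n, n):
    """E_n = S(n) / n."""
    return speedup(t_1, t_n) / n


def average_throughput(f, b_s, b_v):
    """Solve 1/b = f / b_s + (1 - f) / b_v for b."""
    return 1 / (f / b_s + (1 - f) / b_v)
```

With only 10% scalar operations, a scalar rate of 10 MFLOPS and a vector rate of 100 MFLOPS, the average throughput is about 52.6 MFLOPS, barely half the vector rate, which shows how much the scalar fraction dominates.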

Pipelining

The problem with an instruction/execute pipeline is contention over memory access
- Overcome with interleaved memory
Two possible methods of controlling the transfer of information between pipeline stages
- Asynchronously using handshake signals
  - Most flexible, max speed determined by slowest stage
- Synchronously, where there are latches between each stage all synced to a clock
Example 5-stage I/E pipeline: fetch instruction, decode instruction, fetch operands, execute instruction, store results
Pipelining assumes the only interaction between stages is the passage of information, but there are 3 major things that can cause hazards and stall the pipeline
- Structural hazards, resource conflicts where two stages wish to use the same resource, ie a memory port
  - Interleave memory or prefetch data into cache
- Control hazards occur when there is a change in order of execution of instructions, eg when there is a branch or jump
  - Cause the pipeline to stall and have to refill it
  - Strategies exist to reduce pipeline failures due to conditional branches
    - Instruction pre-fetch buffers, which fetches both branches
      - Complex and rarely used
    - Pipeline freeze strategy, which freezes the pipeline when it receives a branch instruction
      - Simple, but poor performance
    - Static prediction leverages known facts about branches to guess which one is taken
      - 60% of all branches are taken, so may be better to predict this
      - However, predicting not taken wastes fewer pipeline cycles so average performance may be better
    - Dynamic prediction predicts on the fly for each instruction
      - Based on branch instruction characteristics, target address characteristics, and branch history
- Data hazards, where an instruction depends on the result of a previous instruction that has not yet completed
Pipeline clock period is determined by the slowest stage, usually execution
- Pipeline the execution unit separately or have multiple execution units
Sometimes useful to add feedback between stages (recursion), where the output of one stage becomes the input to a previous one
- Used in accumulation
Alternative designs are always possible, which come with their own performance tradeoffs
Space-time diagrams show pipeline usage
- Efficiency $E_{n}$ = (busy area)/(total area)
  - Speedup $S (n) = n E_{n}$
- More generally, $S (n) = \frac{n N}{N + ( n - 1 )}$
  - $n$ is number of stages, $N$ is instructions executed
  - As $N \to \infty$ , $S (n) \to n$
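The speedup formula follows from counting clock periods: without pipelining $N$ instructions take $n N$ periods, while the pipeline needs $n$ periods to fill and then finishes one instruction per period, so $N + (n - 1)$ in total. A small sketch with a made-up function name:

```python
def pipeline_speedup(n_stages, n_instructions):
    """S(n) = n N / (N + (n - 1))."""
    return n_stages * n_instructions / (n_instructions + n_stages - 1)
```

For a 5-stage pipeline, a single instruction gets no speedup at all, 100 instructions get about 4.8, and the value creeps towards 5 as the instruction count grows.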
Complex pipelines with feedback and differently clocked stages can be difficult to design and optimise
- Reservation tables are space-time diagrams that show where data can be admitted to the pipeline
  - Xs in adjacent columns of the same row show that stages operate for more than one clock period
  - More than one Xs in a row not next to each other show feedback
  - Pipelines may not accept initiations at the start of every clock period, or collisions may occur
    - Potential collisions shown by the distance in time slots between Xs in each row
- Collision vector is derived from the distance between Xs
  - $C = C_{n - 1} C_{n - 2} ... C_{2} C_{1} C_{0}$
    - $C_{0} = 1$ , always
  - $C_{i} = 1$ if a collision would occur with an initiation $i$ cycles after a previous initiation
  - The initial collision vector is the state of the pipeline after the first initiation
    - Distances between all pairs of Xs in each row, if distance is $i$ then set bit
- Need a control mechanism to determine if new initiations can happen without a collision occurring
  - Latency is the number of clock periods between initiations
  - Average latency is the number of clock periods between initiations over some repeating cycle
  - Minimum average latency is the smallest possible considering all possible sequences of initiations
    - The goal for optimum design
  - A pipeline changes state as a result of initiations, so represent activity as a state diagram
    - A diagram of all pipeline states and changes starting with the initial collision vector
    - Shifting the collision vector to the right gives the next state
      - If shifted vector has $C_{0} = 1$ , cannot initiate
      - If $C_{0} = 0$ , then can do new initiation, new vector is bitwise OR of shifted vector and initial vector
    - State diagram can be reduced to show only changes where initiations are taken
      - Numbers on edges indicate number of clock periods to reach the next state shown
      - Can identify cycles in graph
- Always taking initiations as soon as $C_{0} = 0$ , to give minimum latency at each step, is the greedy strategy
  - Will not always give minimum average latency but is close
  - Often more than one greedy cycle
  - Average latency for a greedy cycle is less than or equal to the number of 1s in the initial collision vector
    - Gives an upper bound on latency
  - Minimum average latency is greater than or equal to the max number of Xs in any reservation table row
    - Gives a lower bound on latency
  - Max Xs in row $<=$ min avg latency $<=$ greedy cycles avg latency $<=$ number of 1s in the initial collision vector
- A given pipeline may not give the required latency, so insert delays into the pipeline to expand the number of time slots and reduce collisions
- Can identify where to place delays to give a latency of $n$ cycles:
  - Start with the first X, enter an X in a revised table and mark as forbidden every $n$ cycles, to indicate the positions are reserved for initiations
  - Repeat for all Xs until X falls on a forbidden mark, then delay the X by one or more
  - Mark all delayed positions and delay all subsequent Xs by the same amount
- Delays can be added using a latch to delay by a cycle
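
The collision vector and greedy strategy described above can be sketched as below. The representation is an assumption: a reservation table is a list of rows, each row listing the column numbers that hold an X, and vectors are stored as lists where index $i$ holds $C_{i}$ (so "shifting right" means dropping entries from the front). Function names are made up.

```python
def collision_vector(table):
    """Return [C_0, C_1, ..., C_{n-1}] for a reservation table.

    `table` is a list of rows; each row lists the columns containing an X.
    """
    n_cols = max(max(row) for row in table) + 1
    bits = [0] * n_cols
    bits[0] = 1  # C_0 = 1 always
    for row in table:
        for a in row:
            for b in row:
                if b > a:
                    bits[b - a] = 1  # two Xs a distance b - a apart collide
    return bits


def greedy_cycle(initial):
    """Follow the greedy strategy until a state repeats.

    Returns (latencies in the repeating cycle, average latency).
    """
    state = list(initial)
    seen = {}        # state -> position in `latencies` where it was first reached
    latencies = []
    while tuple(state) not in seen:
        seen[tuple(state)] = len(latencies)
        # Smallest number of shifts after which C_0 would be 0
        k = next((i for i in range(1, len(state)) if state[i] == 0), len(state))
        latencies.append(k)
        shifted = state[k:] + [0] * k
        # New initiation: OR the shifted vector with the initial vector
        state = [s | c for s, c in zip(shifted, initial)]
    cycle = latencies[seen[tuple(state)]:]
    return cycle, sum(cycle) / len(cycle)


# Stage 1 reused after 3 cycles (feedback), stage 2 busy for 2 cycles
table = [[0, 3], [1, 2], [4]]
cv = collision_vector(table)
print(cv, greedy_cycle(cv))
```

For this table the greedy cycle has latency 2, which sits between the max Xs in a row (2) and the number of 1s in the initial vector (3), as the bounds above say it should.
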

Honestly just check the slides and examples for this one it makes zero sense lol

Superscalar Processors

A single, linear instruction pipeline provides at very best a steady-state Clocks per Instruction (CPI) of 1
Fetching/decoding more than one instruction per clock cycle can reduce the CPI below 1
An easy way to do this is to duplicate the pipeline
For example:
- Two fetch/decode stages
- Execution staging window register
- Multiple execution pipelines for different instructions
- A non-uniform superscalar is one where the pipeline is not simply duplicated
Number of replications before window is the degree of the superscalar processor
Some pipeline stages need less than half a clock cycle, so doubling the internal clock speed gets two tasks done per external clock cycle
- Known as superpipelining
A pipeline takes $s + N - 1$ clock cycles to execute $N$ instructions
- A superscalar pipeline takes $s + \frac{N - 1}{σ}$ to do the same
An example pipeline has 4 stages, fetch, decode, execute, write-back
- Each stage is duplicated
  - $σ = 2$ , the number of replications
  - $s = 4$ , the number of stages
- If instructions are aligned, the number of clocks required is $s + (N / σ) - 1$
- If instructions are unaligned, then $s + (N / σ)$
The CPI of a superscalar processor is $\frac{1}{σ} + \frac{1}{N} \left( s - \frac{1}{σ} \right)$
For large values of $σ$ , the speedup is limited by delays set by $N$ and pipeline length $s$
As $σ$ increases, speedup increases linearly too until the point where instruction level parallelism limits further increases
- For many problems, ILP gives parallelism in the range 2-4x
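
A small sketch to sanity-check the cycle count and CPI formulas above against each other. Function names are made up, and `sigma` stands for $σ$.

```python
def pipeline_cycles(s: int, N: int) -> int:
    """Cycles for N instructions on a single s-stage pipeline."""
    return s + N - 1


def superscalar_cycles(s: int, N: int, sigma: int) -> float:
    """Cycles for N instructions on a degree-sigma superscalar pipeline."""
    return s + (N - 1) / sigma


def superscalar_cpi(s: int, N: int, sigma: int) -> float:
    """CPI = 1/sigma + (1/N)(s - 1/sigma), i.e. cycles divided by N."""
    return 1 / sigma + (s - 1 / sigma) / N


s, N = 4, 100
for sigma in (1, 2, 4, 8):
    speedup = pipeline_cycles(s, N) / superscalar_cycles(s, N, sigma)
    print(sigma, round(superscalar_cpi(s, N, sigma), 3), round(speedup, 2))
```

The printed speedup falls further behind $σ$ as $σ$ grows, because the $s$ cycles needed to fill the pipeline do not shrink. This is before ILP limits are even considered.
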
No reason to have a huge number of duplicated pipelines, as most programs have a limited degree of inherent parallelism
- Can be maximised by compiler and hardware techniques
- Limited by dependencies
The program to be executed is a linear stream of instructions
- Instruction fetch stage includes branch prediction to form a dynamic stream which may include dependencies
- Processor dispatches instructions to be executed according to their dependencies
- Instructions are conceptually put back into sequential order and results recorded - known as committing or retiring the instruction
  - Needed as instructions are executed out of order
  - Instructions may also be executed speculatively and never need to be retired

Instruction Level Parallelism

Common instructions can be initiated simultaneously and executed independently
Superscalar processors rely on this ability to execute instructions in separate pipelines, possibly out-of-order
- Multiple functional units for multiple tasks
ILP refers to the degree to which instructions can be executed in parallel
Common techniques to exploit it include instruction pipelining and superscalar execution, but also:
- Out-of-order execution
- Register renaming
  - Values conflict for use of the registers, processor has to stall to resolve conflicts
  - Can treat the problem as a resource conflict, and dynamically rename registers in hardware to reduce dependencies
  - Use different registers to the ones that the instructions say
- Branch prediction
  - Prefetch both sides of the branch, reduces delay
  - Can be static or dynamic
  - Speculative execution aims to do the work before it is known if results will be needed
    - Relies on resource abundance to provide performance improvements
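
A minimal sketch of register renaming, assuming instructions are written as `(dest, src1, src2)` tuples and that there is an unlimited supply of physical registers `p0, p1, ...` (real hardware has a fixed pool and frees registers at retirement). The `rename` function is made up for illustration.

```python
def rename(instructions):
    """Rewrite (dest, src1, src2) tuples to use fresh physical registers."""
    mapping = {}   # architectural register -> physical register holding its latest value
    counter = 0
    renamed = []
    for dest, *srcs in instructions:
        new_srcs = []
        for reg in srcs:
            if reg not in mapping:
                # First use of a register: give it a physical home
                mapping[reg] = f"p{counter}"
                counter += 1
            new_srcs.append(mapping[reg])
        # Every write gets a brand new physical register
        mapping[dest] = f"p{counter}"
        counter += 1
        renamed.append((mapping[dest], *new_srcs))
    return renamed


program = [
    ("r1", "r2", "r3"),  # r1 = r2 op r3
    ("r4", "r1", "r5"),  # r4 = r1 op r5, genuinely needs the r1 above
    ("r1", "r6", "r7"),  # r1 = r6 op r7, only reuses the name r1
]
for before, after in zip(program, rename(program)):
    print(before, "->", after)
```

After renaming, the third instruction writes to a different physical register from the first, so it no longer has to wait for the first two. The real dependency of the second instruction on the first is kept.
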
Five factors fundamentally constrain ILP:
- True data dependency
  - An instruction cannot execute because it requires data that will be produced by a preceding instruction
  - Usually causes pipeline delays
- Procedural dependency
  - Inherent to the sequential nature of execution
  - Instructions following a branch have a dependency on the result of the branch
  - Variable length instructions can prevent simultaneous fetching
- Resource conflicts
  - Two or more instructions require a system resource at the same time
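  - Memories, caches, functional units, etc
- Output dependency
  - Two instructions write to the same location, so their results must be written in program order
- Antidependency
  - An instruction writes to a location that an earlier instruction still needs to read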
A program may not always have enough inherent ILP to take advantage of the machine parallelism
- Limited machine parallelism will always inhibit performance
- Processor must be able to identify ILP
Instruction issue refers to the process of initiating execution in the processor's functional units
- Instruction has been issued once it finishes decoding and hits first execute stage
- The instruction issue policy can have a large performance impact
- Three types of instruction order are significant:
  - Fetch order
  - Execute order
  - Order in which instructions update the contents of memory
- Issue policy can fuck with these orders to whatever extent it pleases provided the results are correct
Three general categories for instruction issue policies:
- In-order issue with in-order completion
  - Do the same as what would be done by a sequential processor
    - Issuing stalls when there is a conflict on a functional unit or when an instruction takes more than one cycle
- In-order issue with out-of-order completion
  - A number of instructions may be being executed at any time
  - Limited by machine parallelism in functional units
  - Still stalled by resource conflicts and dependencies
  - Introduces output dependencies
- Out-of-order issue with out-of-order completion
  - In-order issue will only decode up to a dependency or conflict
  - Further decouple decode and execute stages
  - A buffer - the instruction window - holds instructions after decode
  - Processor can continually fetch/decode as long as window not full and execution is separate
  - Increases instructions that are available to execution unit

Parallelism

Parallel Organisation

Flynn's Taxonomy:
- SISD
  - Standard uniprocessor stuff
- SIMD
  - Vector/Array Processors
  - Single machine instruction executes on a number of processing elements in lockstep
- MISD
  - Not really used
- MIMD
  - Distributed memory systems (cluster-based)
    - Communicate via message passing, very scalable
  - Shared memory systems
    - Communicate via memory and are easy to program but memory contention can happen
    - Symmetric multiprocessors
    - NUMA
Vector computers employ lots of arithmetic pipelines for SIMD processing
- Instructions operate on vectors of numbers (one or two dimensional)
- One operation specified for all elements of the vector
- 2 main types of architecture:
  - memory-to-memory
  - register-to-register (specific vector registers)
- Chaining often used - chain pipelines together for operations such as FMA
  - Connect inputs/outputs via crossbar switches
- SIMD array computers had good performance for specific applications, but they're old and no-one makes them anymore
  - Special set of instructions broadcast to processing elements for execution
- Array computers are dead but MMX, SSE, AVX are big in x86
- ARM has NEON coprocessor, a 10-stage SIMD pipeline
Interconnection structures are important in allowing data or memory to be shared
- In distributed memory systems, communication is in software via Ethernet or InfiniBand
- More efficient interconnects are needed to share memory
  - A shared bus allows processor and memory to share a communication network
    - Need to resolve bus contention issues
    - Poor reliability
    - Only good for small systems
  - A cross-bar switch matrix uses a matrix of interconnects
    - Functional units require minimal logic
    - Switch is complex, large and costly
    - Potentially high bandwidth, but still struggles with contention
  - Static links between each processor enable dedicated communication
    - More links -> better communication rate
    - Different patterns have different performance properties
    - Chosen architecture of links usually is a tradeoff between cost and performance
      - Hypercube is a good balance
      - Number of connections and links per node are a good indication of cost
      - Maximum inter-node distance is an indicator of worst-case communication delay
    - Can have a dedicated link for each pair but that's expensive and rarely necessary
- Multistage switching networks can be either cross-bar or cell-based
  - Requirement is to connect each processor to any other processor
    - Known as the full access property
  - Another useful property is that connections are non-blocking
  - Clos networks (multi-stage cross-bar switches) showed that a network with 3 or more stages can be non-blocking
  - A Clos network with 2x2 cross-bar elements is known as a Benes network, classified as cell-based
    - Most cell-based networks are highly blocking but require few switches
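
To put numbers on the cost and delay indicators for static links mentioned above, here is a sketch for a hypercube. It assumes nodes are numbered so that neighbours differ in exactly one bit of their address, which is the usual construction. Function names are made up.

```python
def hypercube_neighbours(node: int, dim: int):
    """Nodes directly linked to `node` in a dim-dimensional hypercube."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_distance(a: int, b: int) -> int:
    """Hops between two nodes: the number of address bits that differ."""
    return bin(a ^ b).count("1")


def hypercube_links(dim: int) -> int:
    """Total links: 2^dim nodes with dim links each, each link counted twice."""
    return dim * 2 ** dim // 2


def fully_connected_links(n_nodes: int) -> int:
    """Total links when every pair of nodes gets a dedicated link."""
    return n_nodes * (n_nodes - 1) // 2


dim = 3  # 8 nodes
print(hypercube_neighbours(0b000, dim))      # links per node = dim
print(hypercube_distance(0b000, 0b111))      # worst case distance = dim
print(hypercube_links(dim), fully_connected_links(2 ** dim))
```

Links per node and worst-case distance both grow as $\log_{2}$ of the node count, which is why it is considered a good balance.
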

Cache Coherence

Shared memory MIMD systems are easy to program, and can overcome memory contention via cache
Copies of the same data may now be in different places
- Cache coherence must be maintained
- A write-through policy is not sufficient as that only updates main memory
- It is necessary to update other caches too
Possible solutions include:
- Shared caches
  - Poor performance for more than a few processors
- Non-cacheable items
  - Can only write to main memory, causes problems
- Broadcast write
  - Every cache write request is broadcast to all other caches
  - Copies either updated or invalidated, preferably the latter as it is faster
  - Increases memory transactions and wastes bus bandwidth
- Snoop bus
  - Suitable for single-bus architectures
  - Cache write-through is used
  - A bus watcher (cache controller) is used and snoops on the system bus
    - Detects memory write operations, and invalidates local cached copies if main memory updated
- Directory methods
  - A directory is a list of entries identifying cached copies
    - Used when a processor writes to a cached location to invalidate or update other copies
  - Various methods exist
  - Suitable for shared memory systems with multistage or hierarchical interconnects where broadcast systems are hard to implement
  - Full directory has a directory in main memory
    - A set of pointers per cache and a dirty bit is used with each shared data item
    - Bit set high if cache has a copy
    - Each word/block/line in cache has two state bits:
      - Valid bit, set if cache data is valid
      - Private bit, set if processor is allowed to write to the block
  - Limited directories only store pointers for the caches that have the data
    - Saves the memory otherwise used to store pointers for caches that don't have the data
    - Only $n$ pointers required, but each pointer must uniquely identify one of the $N$ caches
      - $\log_{2} N$ bits required for each pointer instead of 1 bit
    - Requires $n \log_{2} N$ bits instead of $N$ bits
    - Scales much better as entries grow less than linearly
  - Chained directories also attempt to reduce the size of the directory
    - Use a linked list to hold directory items
    - Shared memory directory entry points to one copy in a cache, from there a pointer points to the next copy, and so on
    - $N$ copies may be maintained
    - Whenever a new copy called for, list broken and pointers altered
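
A quick comparison of full and limited directory sizes per shared data item, as a sketch. It ignores the dirty bit and rounds $\log_{2} N$ up to a whole number of bits. Function names are made up.

```python
def pointer_bits(n_caches: int) -> int:
    """Bits needed to uniquely identify one of n_caches caches."""
    return max(1, (n_caches - 1).bit_length())


def full_directory_bits(n_caches: int) -> int:
    """Full directory: one presence bit per cache."""
    return n_caches


def limited_directory_bits(n_pointers: int, n_caches: int) -> int:
    """Limited directory: n pointers of log2(N) bits each."""
    return n_pointers * pointer_bits(n_caches)


for n_caches in (8, 64, 1024):
    print(n_caches, full_directory_bits(n_caches), limited_directory_bits(4, n_caches))
```

With 4 pointers the limited directory is actually bigger for 8 caches (12 bits against 8), and only wins once $N$ is large.
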
MESI is the good protocol
- Snoop bus arrangement used with a write-back policy
- Two status bits per cache line tag so it can be in one of four states
  - Modified: entry valid, main memory invalid, no other copies exist
  - Exclusive: no other cache holds line, memory up to date
  - Shared: multiple caches hold line, memory is up to date
  - Invalid: cache entry is garbage
- When machine booted, all entries are invalid
- First time memory is read, block referenced is fetched by CPU 1 and marked exclusive
  - Subsequent reads by same processor use cache
- CPU 2 fetches same block
  - CPU 1 sees by snooping it is no longer alone and announces it has a copy
  - Both copies marked shared
- CPU 2 wants to write to the block
  - Puts invalidate signal on bus
  - Cached copy goes into modified state
  - If block was exclusive, no need to signal on bus
- CPU 3 wants to read block from memory
  - CPU 2 has the modified block, so tells 3 to wait while it writes it back
- CPU 1 wants to write a word in the block (cache)
  - Assuming fetch on write, block must be read before writing
  - CPU 1 generates a Read With Intent To Modify (RWITM) sequence
    - CPU 2 has a modified copy so interrupts the sequence and writes to memory, invalidating its own copy
    - CPU 1 reads block from memory, updates it and marks it modified
- All read hits do not alter block state
- Read misses cause a change to shared state if another cache holds the block, otherwise to exclusive
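
The MESI transitions can be sketched as a tiny state tracker for a single block. This is a simplification: it only tracks the state letter in each cache, applies the operations strictly one after another, and does not model the bus signals or write-backs themselves (they are noted in comments). `MesiBlock` is a made-up name.

```python
class MesiBlock:
    """State of one memory block in each cache (simplified, no bus timing)."""

    def __init__(self, cpus):
        self.state = {cpu: "I" for cpu in cpus}  # all entries invalid at boot

    def read(self, cpu):
        if self.state[cpu] != "I":
            return  # read hit: state unchanged
        holders = [c for c, s in self.state.items() if c != cpu and s != "I"]
        for c in holders:
            # A Modified holder would write back to memory here before sharing
            self.state[c] = "S"
        self.state[cpu] = "S" if holders else "E"

    def write(self, cpu):
        for c in self.state:
            if c != cpu:
                # Other copies are invalidated (a Modified one is written back first)
                self.state[c] = "I"
        self.state[cpu] = "M"


block = MesiBlock(cpus=[1, 2, 3])
block.read(1)   # CPU 1 alone: Exclusive
block.read(2)   # CPU 1 and 2: Shared
block.write(2)  # CPU 2: Modified, CPU 1: Invalid
block.read(3)   # CPU 2 writes back, CPU 2 and 3: Shared
block.write(1)  # CPU 1: Modified, everyone else: Invalid
print(block.state)
```
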
Intel and AMD took different approaches to extending MESI
- Intel uses MESIF
  - Forward state is a specialised shared state
  - Serving multiple caches in shared state is inefficient, so only the cache with the special forward state responds to requests
    - Allows cache-to-cache speeds
- AMD uses MOESI
  - Owned state is when a cache has exclusive write rights, but other caches may read from it
    - Changes to line are broadcast to other caches
  - Avoids writing dirty line back to main memory
    - Modified line provided from the owning cache

Data Level Parallelism

The utilisation of SIMD depends on applications having a degree of data-level parallelism
- Matrix oriented computation
- Image and sound processing
Sequential thinking but parallel processing makes it easy to reason about
Vector-specific architectures make SIMD easy but practicality is limited
- Reduced fetch/decode bandwidth as fewer instructions
- Programmer's view is:
  - Transfer data elements to register files
    - Essentially compiler-managed buffers for data
    - Fixed length buffer to store a single vector
      - Eg, each register holds 64 words
      - Needs enough ports to service all functional units
      - Ports connect to functional units over crossbar switch
  - Operate on register files
    - Functional units heavily pipelined
    - Integrated control units detect structural or data hazards
    - Also provide scalar units to compute addresses
      - Can be chained with vector units
  - Place results back in memory
- Loads and stores are pipelined
  - Program pays memory latency cost just once, instead of once per data element
- Three contributing performance factors are:
  - Length of vector ops
  - Structural hazards
  - Data dependencies
- Performance can be considered in terms of vector length or initiation rate
- Modern vector computers employ parallel pipelines known as lanes
  - Superscalar architecture
- Convoys are sets of vector instructions that can execute together
  - Performance of code sections can be estimated by counting number of convoys
  - Need to ensure no structural hazards exist
  - A chime refers to the unit of time to execute a single convoy
    - A vector sequence of $n$ convoys executes in $n$ chimes
    - Approximation ignores processor specific overhead and allows reasoning about inherent data-level parallelism
- Chaining can be used to achieve performance, as it allows operations to be initiated as soon as individual elements of the vector source are available
  - Earliest implementations work in a similar way to forwarding in scalar pipelines
  - Flexible chaining allows a vector instruction to chain to almost any other active vector instruction
    - Have to take care not to introduce hazards
    - Supported by modern architectures
- A number of techniques can be applied to optimise vector architectures
  - Can have multiple lanes, a single vector instruction can be split up to execute across the lanes
    - Doubling lanes but halving clock rate does not change speed
    - Increases size and energy consumption
  - Vector length registers vary the size of the vector operations
    - Value cannot be greater than the max vector length, the physical register size
    - Strip mining is a technique that generates code such that each vector operation is done for a size less than or equal to the max vector length
  - Vector mask registers allow for conditional execution of each element operation, when usually conditionals would be needed that hinder performance
  - Memory banking spreads memory accesses across multiple memory banks to improve the start up time for a vector load
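
Strip mining, as described above, can be sketched in plain Python. The maximum vector length of 64 is an assumption (matching the 64-word register example), and the list comprehension stands in for a single vector instruction. Function names are made up. Compilers often do the odd-sized strip first rather than last; either order works.

```python
MVL = 64  # assumed maximum vector length (physical register size)


def strips(n: int, mvl: int = MVL):
    """Split n elements into (start, length) strips with length <= mvl."""
    result = []
    start = 0
    while start < n:
        vl = min(mvl, n - start)  # value that would go in the vector length register
        result.append((start, vl))
        start += vl
    return result


def strip_mined_add(a, b, mvl: int = MVL):
    """Add two equal-length lists one strip at a time."""
    out = [0] * len(a)
    for start, vl in strips(len(a), mvl):
        end = start + vl
        # Stand-in for one vector add operating on vl elements
        out[start:end] = [x + y for x, y in zip(a[start:end], b[start:end])]
    return out


print(strips(150))
```
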
MMX/SSE/AVX provide SIMD in x86
- Many media applications operate on a narrower range of data types than 32-bit processors are designed for
  - 8-bit colour components
  - 16-bit audio samples
- A 256-bit adder can operate on 32 8-bit values at once
- MMX was introduced by Intel in 1996
  - Used 64-bit FP registers to provide 8 and 16-bit operations
- SSE was introduced as the successor, adding 128-bit wide registers
- AVX introduced in 2010 adds 256 bit registers with a focus on double precision FP
  - AVX-512 doubles register size again
- Focus of SIMD extensions is to accelerate carefully implemented code
  - Low cost to use
  - Require little extra state compared to vector architectures
  - No virtual memory problems
GPUs are powerful vector units that are similar to vector architectures
- Hardware designed for graphics but usually supplemented to improve the performance of a wider range of applications
- Heterogeneous execution model
  - CPU is host, GPU is device
- NVIDIA have CUDA for programming, OpenCL is vendor-independent
- GPUs provide high levels of every form of parallelism, but it is hard to achieve performance as the programmer must also manage
  - Scheduling of computation
  - Transfer of data to GPU memory
- CUDA threads are the lowest form of parallelism, one associated with each data element
  - Can group thousands of threads to yield other forms of parallelism
  - Threads organised into blocks, multithreaded SIMD processor executes a whole thread block
  - Blocks organised into grids, executed independently and in any order
  - GPU hardware handles thread management
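
The thread/block/grid hierarchy can be emulated in plain Python to show how each thread finds its data element. This is not CUDA code: `launch` and `add_kernel` are made-up stand-ins, and the blocks run one after another here rather than in parallel. The index calculation (block index times block size plus thread index) is the usual one for a one-dimensional grid.

```python
def add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    """Work done by a single thread: handle one data element."""
    i = block_idx * block_dim + thread_idx  # global element index
    if i < len(out):                        # threads past the end do nothing
        out[i] = a[i] + b[i]


def launch(kernel, n_blocks, block_dim, *args):
    """Run every thread of every block in the grid.

    Blocks are independent, so any order would give the same result.
    """
    for block_idx in range(n_blocks):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


n = 10
block_dim = 4
n_blocks = (n + block_dim - 1) // block_dim  # round up so every element is covered
a = list(range(n))
b = [10 * x for x in a]
out = [0] * n
launch(add_kernel, n_blocks, block_dim, a, b, out)
print(out)
```
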

Multicore Systems

Can consider the performance of a processor in terms of the rate at which it executes instructions
- MIPS = freq * IPC
- Leads to a focus on increasing clock frequency and processor efficiency
  - We've kinda hit a ceiling with this
Alternative approach is multithreading
- Divide instruction stream into smaller streams to execute threads in parallel
- Various designs and implementations
  - Threads may or may not be the same as software threads in multiprogrammed OS
A process is an instance of a running program
- Processes own resources in their virtual address space
- Processes are scheduled by the OS
- Process switch is an operation that switches the processor from one process to another
A thread is a unit of work within a process
- Thread switch switches processor control from one to another within the same process
- Far less costly than processes & process switches
Implicit multithreading is the concurrent execution of multiple threads from a single sequential program
- Statically defined by compiler or dynamically in hardware
- Rarely done as it is hard
Most processors have adopted explicit multithreading, which concurrently executes instructions from different threads:
- Uses a separate program counter for each thread
- Instruction fetching happens per thread
- Each thread treated and optimised separately
- Multiple approaches:
  - Interleaved, where processor deals with more than one at a time, switching at each clock cycle
    - Thread skipped when blocking
  - Blocking or coarse grained, where threads execute successively until an event occurs that may cause a delay
    - Delay prompts a switch to another thread
  - SMT, where instructions are issued from multiple threads to the execution units of a superscalar processor
    - Performance comes from superscalar capability combined with multiple thread contexts
  - Chip multiprocessing replicates entire processor on same chip
    - Multicore
- Interleaved and blocked do not provide true concurrency, whereas SMT and multicore are actual simultaneous execution
- Multicore systems combine multiple cores on a single die
  - Each core has its own components (ALU, registers, PC) and caches
  - Pollack's rule: performance increase is roughly proportional to square root of increase in complexity
    - If we double the logic, will deliver 40% perf boost
    - Multicore has potential for near-linear improvement but is hard to achieve
  - Main variables are number of cores, and levels and amount of shared cache
    - Can have dedicated L1/L2
    - Can share L2 or have dedicated L2 and share L3
    - Shared L2 cache has advantages over reliance on dedicated cache
      - Constructive interference can reduce miss rates
      - Data shared is not replicated in shared cache
      - Amount of shared cache for each core is dynamic
      - Interprocessor communication can happen through cache
      - Confines cache coherence problem to L1 cache
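
Pollack's rule from the list above as arithmetic, compared against the ideal multicore case. A sketch only: `pollack_speedup` is a made-up name, and the "ideal" figure assumes perfectly parallel work, which is rarely achieved.

```python
import math


def pollack_speedup(complexity_ratio: float) -> float:
    """Performance gain from making one core `complexity_ratio` times more complex."""
    return math.sqrt(complexity_ratio)


for ratio in (2, 4, 8):
    # Spending the same logic on `ratio` simple cores gives at best `ratio` times the throughput
    print(ratio, round(pollack_speedup(ratio), 2), "vs ideal multicore", ratio)
```
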
Clusters
- A group of interconnected whole computers working together as a unified computing resource, that creates the illusion of a single machine
- Alternative to multiprocessing for high performance and availability
- Attractive for servers
- Absolute and incremental scalability, high reliability, superior price/performance ratio
- High-speed interconnects needed
With uniform memory access, all processors have access to all the memory in uniform time
- NUMA, Non Uniform Memory Access, gives different access times to different processors for different regions of memory
  - All processors can still access all memory, just slower
  - Cache Coherent NUMA (CC-NUMA) extends NUMA with cache coherence between the processors
- Used because SMP approaches don't scale, and allows for transparent system-wide memory
- Could motivate clusters, but clusters are hard to program effectively

Thread Level Parallelism

Synchronisation primitives exist in hardware that allow high-level synchronisation constructs to be built
- Establish building blocks to build actual constructs used by programmers
Most important hardware provision is the atomic instruction
- Uninterruptible and capable of incurring value change
- May actually be an atomic instruction sequence
In high-contention situations, synchronisation can become a performance bottleneck
Atomic exchange is a primitive that swaps a value in a register for a value in memory
- Can be used to build locks for synchronisation
  - Assume a value of 0 indicates the lock is free, 1 indicates it is unavailable
- Simplest possible situation where two processors both wish to perform an atomic exchange
  - One processor will enter the exchange first
  - This processor will ensure that a value of 1 is returned to any other processor that next attempts an exchange
  - The two simultaneous exchange operations will be ordered by write serialisation mechanisms
Older microprocessors feature a test-and-set atomic instruction in hardware
- Allows a test to be defined against which a value in memory is checked
- Value modified if defined test succeeded
Some current gen microprocessors have fetch-and-increment atomic
- Return the value at a pointer and increment it
Atomic instructions usually consist of some read and write
Requiring an uninterruptible read-write fucks with a good number of things
- Cache coherence
- Instruction pipelining
- Cache performance
Possible to have a pair of atomic instructions where the second instruction returns a value that indicates if the pair executed atomically
- Pair includes a special load known as load linked, followed by a special write, store conditional
  - If the contents of the memory location specified by load linked are changed prior to the store conditional then the store fails
  - Also fails if there is a context switch
- Can implement atomic exchange using this
  - If the store conditional returns a value indicating failure, then a branch jumps back and retries
- Can also implement fetch-and-increment
  - Maintain a record of the address specified by load linked in a link register
  - If an interrupt occurs or cache block containing address is invalidated, register is cleared
  - Store conditional checks the register for a matching address to determine success
  - To avoid deadlock, only register to register operations are permitted between the load linked and store conditional instructions
Spin locks are locks that processors repeatedly attempt to acquire
- Effective when low latency required and lock held for short periods
- Processors with cache coherence provide a convenient mechanism for spin locks
  - Testing the status of a lock requires local cache access rather than main memory access
  - Temporal locality decreases lock acquisition times
- Load linked/store conditional can avoid needless bus access when multiple processors attempt to acquire a lock
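
A single-threaded simulation of load linked/store conditional, and an atomic exchange and lock built on top of it. This is a sketch of the idea, not real hardware behaviour: the `Memory` class, its per-processor link table and the function names are all made up, and interleavings are produced by calling the methods by hand.

```python
class Memory:
    def __init__(self):
        self.cells = {}   # address -> value, unset addresses read as 0
        self.links = {}   # processor id -> address held in its link register

    def load_linked(self, cpu, addr):
        self.links[cpu] = addr
        return self.cells.get(addr, 0)

    def store_conditional(self, cpu, addr, value):
        if self.links.get(cpu) != addr:
            return False  # link register was cleared, so the pair was not atomic
        self.cells[addr] = value
        # A successful write clears every link to this address, including our own
        self.links = {c: a for c, a in self.links.items() if a != addr}
        return True


def atomic_exchange(mem, cpu, addr, new_value):
    """Swap new_value into memory and return the old value, retrying on failure."""
    while True:
        old = mem.load_linked(cpu, addr)
        if mem.store_conditional(cpu, addr, new_value):
            return old


def try_lock(mem, cpu, addr):
    """0 means free, 1 means unavailable: we hold the lock if we swapped out a 0."""
    return atomic_exchange(mem, cpu, addr, 1) == 0


mem = Memory()
print(try_lock(mem, cpu=0, addr=100))  # first processor gets the lock
print(try_lock(mem, cpu=1, addr=100))  # second processor sees 1 and must spin
```

A spin lock would call `try_lock` in a loop until it succeeds, and release by storing 0 to the lock address.
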
Cache coherence ensures multiple processors have a consistent view of memory, so allows communication through shared memory
- Shared memory communications means we only need consider the rules enforced on reads and writes of different processors
  - Don't need to sync everything
Different models of memory consistency exist
- Simplest is sequential consistency
  - Requires the result of execution to be the same as if each processor's memory accesses were kept in order and interleaved with those of the other processors
  - Ensured by having all processors delay memory accesses until all cache invalidations are complete
  - Simple but slow
- Synchronised consistency orders all accesses to shared data using synchronisation operations
  - A data reference is ordered by a synchronisation operation if, in every possible execution, a write by one processor and an access by another are separated by a pair of synchronisation operations
  - A case where a variable might be updated without ordering by synchronisation is a data race
- There are relaxed consistency models that allow reads and writes to complete out-of-order but use synchronisation to enforce ordering
  - Three general models
  - A -> B denotes that A must complete before B
  - Total store ordering relaxes W -> R
    - Retains ordering among writes
  - Partial store order model relaxes W -> W
    - Impractical for most programs
  - Relaxing R -> R and R -> W happens in a variety of models, including weak ordering and release consistency

High Performance Systems

A Symmetric Multiprocessor (SMP) is an organisation of two or more processors sharing memory
- Processors connected by bus
- Uniform memory access
- All processors are the same and share I/O
- System controlled by integrated OS
- Performant for parallel problems
- All processors are the same so if one processor goes down another is still available
- Can scale incrementally
- Most PCs use a time-shared bus but can also use multi-port memory in more complex organisations
Clusters are an alternative to SMP
- A cluster computer is defined as a group of interconnected computers (nodes) working together as a unified resource
- High performance and availability
- Attractive for server applications
- Absolute and incremental scalability
- Superior price/performance
- High speed message links required to coordinate activity
- Machines in a cluster may or may not share disks
- Cluster middleware provides a unified system image to the user
  - Responsible for load balancing, fault tolerance, etc
  - Desirable to have:
    - A single entry and control point/workstation
    - Single file hierarchy
    - Single virtual networking
    - Single memory space
    - Single job-management system
    - Single UI
    - Single I/O space
    - Single Process space
    - Check pointing, to save the process state and intermediate results
    - Process migration, to enable load balancing
Both clusters and SMP provide multiple processors for high-demand applications
- SMP easier to manage and configure, take up less space and power
  - Bus architecture limits processors to around 16~64
- Clusters dominate high-performance server market
  - Scalable to 1000s of nodes
Uniform memory access used in SMP organisations
Memory access time varies in NUMA systems
- NUMA with no cache coherence is more or less a cluster system
CC-NUMA is NUMA with cache coherence
- Objective is to maintain a transparent system memory while permitting multiple nodes
- Nodes each have own SMP organisations and internal busses/interconnects
- Each processor sees a single addressable memory
- Cache coherence usually done via a directory method
- Can deliver effective performance at higher levels of parallelism than SMP
- Bus traffic on any individual node is limited by bus capacity
- If many memory accesses are to remote nodes, performance degrades
- Software changes required to go from SMP to CC-NUMA systems

I/O

I/O Mechanisms

Programmed I/O is a mapping between I/O-related instructions that the processor fetches from memory and commands that the processor issues to I/O modules
- Instruction forms depend on addressing policies for external devices
  - Devices given a unique address
- When a processor, main memory and I/O share a bus, two addressing modes are possible
  - Memory-mapped
    - Same address bus used for both memory and I/O
    - Memory on I/O device mapped into the single address space
    - Simple, and can use general-purpose memory instructions
    - Portions of address space must be reserved
  - Isolated
    - Bus may have input and output command lines, as well as usual read/write
    - Command lines specify if address is a memory location or I/O device
    - Leaves full range of memory address space for processor
    - Requires extra hardware
Most I/O devices are much slower than CPU, so need some way to synchronise
Busy-wait polling is when CPU constantly polls I/O device for status
- Can interleave polling with other tasks
- Polling is simple but wastes CPU time and power
  - When interleaved can lead to delayed response
Interrupt-driven I/O is when devices send interrupts to CPU
- IRQs (interrupt requests) and NMIs (non-maskable interrupts)
- Interrupt forces CPU to jump to interrupt service routine
- Fast response, and does not waste CPU time/power
- Complex, and data transfer still controlled by CPU
DMA avoids CPU bottleneck by speeding up transfer of data to memory
- Used where large amounts of data needed at high speed
- Control of system busses surrendered to DMA controller
  - DMAC can use cycle stealing or force processor to suspend operation in burst mode
- DMA can be more than 10x faster than CPU-driven I/O
- Involves addition of dedicated hardware on the system bus
- Can have single Bus with a detached DMA, where all modules share the bus
- Can connect I/O devices directly to DMA, which reduces bus cycles by integrating I/O and DMA functions
- Can have separate I/O bus, DMA connected to system and I/O bus, devices connected to I/O bus
Thunderbolt is a general purpose I/O channel developed by Apple and Intel
- Combines data, audio, video, power into single high speed connection (up to 10Gbps)
- Based on Thunderbolt controller, a high speed crossbar switch
Infiniband is an I/O spec aimed at high-end servers
- Intended to replace PCI in servers
- Provides remote storage, networking, connection
- Scalable and can add nodes as required
PCIe is a serial interconnect between two devices
- Expansion bus standard
- Based on a number of signal lanes
- Packet based with a high bandwidth

RAID

RAID: Redundant Array of Independent Disks
As performance increased there was a need for larger and faster secondary storage, and one solution is to use disk arrays
Two general ways to utilise a disk array
- Data striping transparently distributes data over multiple disks to make them appear as a single large disk
  - Improves I/O performance by allowing multiple requests to be serviced in parallel
    - Multiple independent requests can be serviced in parallel by separate disks
    - Single, multi-block requests can be serviced by disks acting in coordination
  - More disks = more performance
- Redundancy duplicates data across disks
  - Allows continuous operation without data loss in case of a disk failure in an array
RAID 0 - non-redundant striping
- Lowest cost as there is no redundancy
- Data is striped across all disks
- Best write performance as no need to duplicate data
- Any 1 disk failure will result in data loss
- Used where performance is more important than reliability
RAID 1 - mirrored
- 2 copies of all info is kept, on separate disks
- Uses twice as many disks as a non-redundant array, hence is expensive
- On read, data can be retrieved from either disk, hence gives good read performance
- If a disk fails, another copy is used
- Data can also be striped as well as mirrored, which is RAID 10
RAID 2 - redundancy through Hamming codes
- Very small stripes are used, often single byte or word
- Employs fewer disks than mirroring by using Hamming codes, error correction codes that can correct single-bit errors and detect double-bit errors
- Number of redundant disks is proportional to the log of the total number of data disks in the system
- On a single write, all data and parity disks must be accessed
- Read access not slowed as controller can detect and correct single-bit errors
- Overkill and not really used, only effective when lots of disk errors
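As a rough illustration of the kind of code RAID 2 relies on, here is a sketch of the textbook Hamming(7,4) code: 4 data bits protected by 3 parity bits. In RAID 2 terms each of the 7 bits would live on its own disk. This basic form only corrects single-bit errors; detecting double-bit errors needs one more overall parity bit, which is left out here. The function names are illustrative.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data bits, error position); position 0 means no error was found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # reads out the 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1         # flip it back
    return [c[2], c[4], c[5], c[6]], syndrome
```

3 parity bits protect 4 data bits here, 4 protect 11, 5 protect 26, which is where the logarithmic growth in redundant disks comes from.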
RAID 3 - bit-interleaved parity
- Parallel access, with data in small strips
- Bit parity is computed for the set of bits in the same position on all data disks
- If drive fails, parity accessed and data reconstructed from remaining devices
- Only one redundant disk required
- Can achieve high data rates
- Simple to implement, but only one I/O request can be executed at a time
RAID 4 - block-interleaved parity
- Data striping used, with relatively large strips
- Bit-by-bit parity calculated across corresponding strips on each data disk, parity bits stored in the corresponding strip on parity disk
- Involves a write penalty for small I/O requests
  - Parity computed by noting differences between old and new data
  - Management software must read old data and parity, then write new data and updated parity
- For large writes that touch all blocks on all disks, parity computed by XORing the new data for each disk
- Parity disk can become bottleneck
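The parity operations used by RAID 3 and 4 (and 5) can be sketched with byte strings standing in for strips. The function names are illustrative and equal-length strips are assumed.

```python
from functools import reduce


def parity(strips):
    """XOR corresponding bytes of every strip to produce the parity strip."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def reconstruct(surviving_strips, parity_strip):
    """Rebuild the strip of a failed disk from the remaining strips and parity."""
    return parity(list(surviving_strips) + [parity_strip])


def small_write_parity(old_parity, old_data, new_data):
    """Read-modify-write update: only the changed strip and parity are touched."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))
```

The small-write path costs two reads (old data, old parity) and two writes (new data, new parity) for one logical write, which is the write penalty described above.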
RAID 5 - block-interleaved distributed parity
- Eliminates parity disk bottleneck by distributing parity across all disks
- One of the best small read, large read, and large write performances
- Small write requests are still inefficient compared to mirroring due to need to perform read-modify-write operations to update parity
- Best parity distribution is left-symmetric
  - When traversing striping units sequentially, you access each disk once before accessing any disk twice, which reduces disk conflicts when servicing a large request
- Commonly used in file servers, most versatile RAID level
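A sketch of the left-symmetric placement, assuming disks and data striping units are both numbered from 0 and that parity starts on the last disk and moves one disk to the left per stripe. The function name is illustrative.

```python
def left_symmetric(unit, disks):
    """Map a data striping unit to (stripe, data disk, parity disk)."""
    stripe, index = divmod(unit, disks - 1)      # disks - 1 data units per stripe
    parity_disk = (disks - 1) - (stripe % disks)
    # Data units start on the disk after the parity disk and wrap around
    data_disk = (parity_disk + 1 + index) % disks
    return stripe, data_disk, parity_disk
```

With this placement, unit `u` always lands on disk `u % disks`, so a sequential run of units visits every disk once before revisiting any of them.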
RAID 6 - dual redundancy
- Multiple disk failures require a stronger code than parity
- When disk fails, requires
- One scheme, called P + Q redundancy, uses Reed-Solomon codes to protect against up to two disk failures using a bare minimum of two redundant disks
- Three disks need to fail for data loss
- Significant write penalty, but good for mission-critical applications
SSDs use NAND flash.
- Becoming more popular as cost drops and performance increases
- High performance I/O
- More durable than HDDs
- Longer lifespan, lower power consumption, quieter, cooler
- Lower access times and latency
- Still have some issues
  - Performance tends to slow over the device's lifetime
  - Flash becomes unusable after a certain number of writes
  - Techniques exist for prolonging life, such as front-ending the drive with a cache and using drives in RAID arrays
Storage area networks are for sharing copies of data between many users on a network so anyone can access
- Must protect against:
  - Drive failures - use RAID
  - Power failures - have redundant power supplies (UPS)
  - Storage controller failures - have dual active controllers
  - System unit failures - controllers connect to multiple hosts
  - Interface failures - have redundant links
  - Site failures - keep backups offsite
- Flash copies produce an instantaneous copy while an application is running, eg for online backups
  - Use a copy-on-write algorithm
- Remote copies are maintained at secondary sites for disaster recovery
  - Can use synchronous copy, where data is copied before each command executed on host, keeping secondary copy always in sync
  - Asynchronous copy is done after host executes command, which means data lags but is much more scalable and does not impact host performance
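The copy-on-write idea behind flash copies can be sketched as follows. `Volume` and `Snapshot` are hypothetical names, and a Python list stands in for disk blocks. Taking the copy is instantaneous because nothing is copied up front; a block's old contents are only saved the first time it is overwritten.

```python
class Volume:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def flash_copy(self):
        """Create a snapshot without copying any data."""
        snapshot = Snapshot(self)
        self.snapshots.append(snapshot)
        return snapshot

    def write(self, index, data):
        # Copy-on-write: preserve the old block for every snapshot, then overwrite
        for snapshot in self.snapshots:
            snapshot.preserve(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> contents at snapshot time

    def preserve(self, index, old_data):
        # Only the first overwrite after the snapshot matters
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Unchanged blocks are still read from the live volume
        return self.saved.get(index, self.volume.blocks[index])
```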

Request Level Parallelism

Request level parallelism is an emphasis on independence of user requests for computational service
- Emphasis is on use of commodity hardware to provide parallelism at scale and capacity
Applicable when provisioning resources at large scale
- Internet services
- Corporate infrastructure
- The Cloud
Exploited in data centres and warehouse-scale computer systems
Internet services are sustained by such systems
- Cloud computing founded on this premise
- Presents system design challenges
  - Designing for scale and reliability
  - Implementation and operation at scale
  - Cost/performance balance
  - Power consumption
    - Environmental responsibility
- Common measure of data centre efficiency is power usage effectiveness (PUE)
  - PUE = (total facility power usage) / (IT equipment power usage)
  - Must be at least 1
- Dependability is key - services typically are designed to run indefinitely
  - Typical to pursue 99.99% uptime, less than 1 hour down per year
  - Can be realised through redundancy in temporal and spatial domains
  - Usually achieved through replication of affordable hardware
- Network I/O is key, servers and warehouse systems must provide consistent network interface
- Must be able to support interactive and varying/unpredictable work loads
- Support must be provided for batch processing (likely highly data-parallel)
- Magnitude of parallelism must be considered to ensure that parallelism provided by hardware is justified
  - Can support both data and request level parallelism
- Operational cost must be considered
  - High performance servers often designed with best performance in mind
  - Warehouses must be designed with longevity and efficiency in mind
- Exploiting economies of scale allows cloud providers to provide software and infrastructure as services
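The PUE and uptime figures above can be checked with a few lines of arithmetic. The function names are illustrative, and the replication formula assumes replicas fail independently, which real systems only approximate.

```python
HOURS_PER_YEAR = 365 * 24


def pue(total_facility_kw, it_equipment_kw):
    """Power usage effectiveness; 1.0 would mean no cooling or distribution overhead."""
    return total_facility_kw / it_equipment_kw


def downtime_minutes_per_year(availability):
    return (1 - availability) * HOURS_PER_YEAR * 60


def replicated_availability(single_availability, replicas):
    """Service is up while at least one of n independent replicas is up."""
    return 1 - (1 - single_availability) ** replicas
```

99.99% availability allows roughly 52.6 minutes of downtime per year, and under the independence assumption two replicas that are each 99% available already reach 99.99% between them.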
Infrastructure as a service is the most basic cloud service model
- Cloud provider rents out machine and other resources
Platform as a service makes a computing platform available to users
- Used by clients whose focus is software
- Underlying resources adapt to demand
Software as a service provides access to application software in the cloud
- Uses "dumb" clients with all the power in the cloud
- Load balancing done in software
- Office 365 is prominent example
Network as a service refers to cloud providers allowing infrastructure to be used as a network/transport layer
Batch processing workloads for warehouse-scale systems typically involve things like video transcoding or search engine indexing
- MapReduce is a prominent example of how warehouse systems can necessitate alternative programming models
  - Maps a function over each item of the input
  - Exploits data-level parallelism
  - Then collects outputs (reduces) using another function as an aggregation
  - Generalisation of SIMD followed by a reduction
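A single-machine sketch of the MapReduce model, using word counting as the example. The `map_reduce` helper is made up for illustration and runs sequentially; a real framework would run the mapper over the inputs in parallel across many nodes.

```python
from collections import defaultdict
from functools import reduce


def map_reduce(inputs, mapper, reducer):
    # Map phase: each input item independently yields (key, value) pairs
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    # Reduce phase: aggregate the values collected for each key
    return {key: reduce(reducer, values) for key, values in groups.items()}


def count_words(document):
    return [(word, 1) for word in document.lower().split()]


word_counts = map_reduce(
    ["the cat", "the dog sat", "The end"],
    count_words,
    lambda a, b: a + b,
)
```

Because the mapper only ever looks at one item, the map phase is the data-parallel part; the reducer needs to be associative for the aggregation to be split up as well.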
Servers often fitted with local storage, and rely on ethernet-based exchange of data
- Potential latency penalties when crossing the local rack switch
- Alternative is network attached storage
  - Can employ high-speed interconnect

Embedded Systems & Security

Embedded Systems

Embedded software is software integrated with physical processes. The technical problem is managing time and concurrency in computational systems.
Embedded processing is in everything, and will be in more things as computing becomes more ubiquitous
Application areas include:
- Automotive
  - ABS brakes
  - ESP - electronic stability control
  - Airbags
  - Automatic gearboxes
  - Smart keys
- Avionics
  - Flight control
  - Anti-collision systems
  - Flap control
  - Entertainment systems
- Consumer electronics
  - TVs
  - Smart Home
Dependability is key
- Reliability $R (t)$ is the probability of a system working correctly, provided it was working at $t = 0$
- Maintainability $M (d)$ is the probability of a system working correctly $d$ time units after an error occurred
- Availability $A (t)$ is the probability of a system working at time $t$
- Safety - no harm must be caused
- Security - data and communication must be confidential and authenticated
Embedded systems must be efficient:
- Code-size efficient (especially for SoCs)
- Runtime efficient
- Weight and size efficient (small)
- Cost and energy efficient
  - Power is the most important constraint in embedded systems
General purpose processors are CPUs like we're used to
- Application specific have all the same components but are more optimised with custom hardware
- Single-purpose processors have very limited resources and are constrained to run a single program
Different types of hardware:
- ASICs - Application Specific Integrated Circuits
  - Custom designed circuits on chips
  - Necessary if ultimate speed or efficiency is the goal
  - Can only be produced in volume
    - Masks to produce are hugely expensive
  - Suffers from lack of flexibility, long design times and high costs
  - Power consumption scales with voltage quadratically
  - Can do dynamic power management
  - Varying clock speed can save energy
- FPGAs - Field Programmable Gate Arrays
- DSPs - Digital Signal Processors
- MPUs - Microprocessor Units
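The quadratic voltage scaling noted under ASICs follows from the usual first-order model of CMOS dynamic power, $P \approx \alpha C V^{2} f$. This model is an assumption brought in here, not something stated in the notes, and it ignores static (leakage) power. A small sketch:

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """First-order CMOS dynamic power: activity * C * V^2 * f (ignores leakage)."""
    return activity * capacitance * voltage ** 2 * frequency


def task_energy(capacitance, voltage, frequency, cycles, activity=1.0):
    """Energy to run a fixed number of clock cycles at a given operating point."""
    runtime = cycles / frequency
    return dynamic_power(capacitance, voltage, frequency, activity) * runtime
```

Under this model, halving the voltage quarters the power. Lowering the clock alone reduces power but not the energy for a fixed task, since the task takes proportionally longer; the saving comes from the lower voltage that a slower clock permits.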
Minimising power consumption is important for
- Design of power supply
- Design of voltage regulators
- Dimensioning of interconnect
- Cooling - high cost and limited space
- Energy availability often restricted (battery powered)
- Lower temperatures lead to longer lifetimes
Efficiency also a concern in memory
- Speed, must have predictable timing
- Energy efficiency
- Size
- Cost
- Energy usage and access time increases with size
Scratch pad memory is a small separate memory mapped into the address space
- Selection done through a simple address decoder
- Used as it is far more energy efficient than a cache

Security

Hardware typically has ports, which can be a security risk
- USB killer is a thumb drive that charges and then discharges capacitors over the data pins
DMA provides access to memory over the system bus
- High speed expansion ports often connected to DMA
- System may be vulnerable if ports connect directly to physical address space
- Mitigated by signing drivers to verify the operation of a device
  - Use IOMMU to implement virtual addressing for I/O devices
  - Modify kernel to disable DMA
Intel has a history of security concerns
- 1995 paper warned against a timing channel relating to CPU cache and the TLB
- 2012 - Apple XNU kernel adopts Kernel Address Space Layout Randomisation (KASLR)
  - Linux adopted in 2014
  - Primary goal to mitigate address leaks
- 2016 conference demonstrated "Using Undocumented CPU Behaviour to See into Kernel Mode and Break KASLR"
  - Demonstrated techniques for locating kernel modules
  - Defeated the point of KASLR
  - KASLR was found to have lots of vulnerabilities, but has been updated and replaced with Kernel Page Table Isolation (KPTI)
- Work was done looking at side effects of instructions, leaking info from hardware
  - Measure memory access timings
    - Attacker primes cache
    - Victim evicts cache
    - Attacker probes data to see if it has been accessed
- Lots of CVEs in 2017 related to speculative execution
Meltdown is a CVE related to rogue data cache load
- Melts security boundaries normally enforced by hardware
- Speculative out-of-order execution may execute code that is never intended to be run
- Separate side-channel attack called flush and reload can highlight what was brought into cache by speculative execution
- 3 steps:
  - Attacker-chosen memory location is loaded into register
  - Transient instruction accesses cache line based on register contents
  - Attacker uses flush and reload to determine accessed cache line and hence the secret stored at memory location
- Accesses memory-mapped pages
  - Mitigation prevents probes from revealing anything useful
  - Performance impact can be very high in some workloads
- Every Intel processor from 1995-2018 vulnerable
  - Some ARM and IBM PowerPC too
- AMD thought to be immune, but a variant discovered in 2021 exploits the branch predictor
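The flush and reload step can be illustrated with a toy simulation. Nothing here touches real hardware: `ToyCache`, the made-up cycle counts and `transient_access` (standing in for the transient instruction) are all hypothetical, and the secret is simply passed in. The point is only to show how a cache hit reveals which line was touched.

```python
class ToyCache:
    """Tracks which cache lines are present; hits are 'fast', misses are 'slow'."""

    def __init__(self):
        self.lines = set()

    def flush(self, line):
        self.lines.discard(line)

    def access(self, line):
        hit = line in self.lines
        self.lines.add(line)
        return 10 if hit else 200  # made-up access times in cycles


def transient_access(cache, secret_byte):
    # Stand-in for the transient instruction: touches a line chosen by the secret
    cache.access(secret_byte)


def flush_and_reload(secret_byte):
    cache = ToyCache()
    for line in range(256):        # flush: make sure no probe line is cached
        cache.flush(line)
    transient_access(cache, secret_byte)
    # reload: time every probe line, the single fast one encodes the secret
    timings = [cache.access(line) for line in range(256)]
    return timings.index(min(timings))
```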

CS261

Requirements & Software Methodologies

Formal guidelines on how software should be engineered
Software process model is a sequence of activities that leads to the production of a software product
- Specification - what software should do
- Design and implementation - how should be organised and implemented
- Validation and testing - does it do what it should
- Software evolution - changing software over time
Plan driven
- All activities planned in advance
- Progress measured against plan
- Fixed, detailed spec before development commences
Agile
- Incremental planning
- More adaptable to change

Plan-Based Methodologies

Waterfall model has a strict linear ordering of processes
- Each stage must be completed before moving on
- If anything changes in the plan, go back to the start again
- Stages:
  - Requirements analysis
    - System's services, constraints, goals are established and defined
  - System design
    - Identification of software components and their relationships
  - Implementation and unit testing
    - Software programmed in units, each unit tested against specification
  - Integration and system testing
    - Software components integrated and tested together as a complete system
  - Operation and maintenance
    - System installed, any errors that appear are fixed
    - System services enhanced as new requirements added
- Works if requirements are fixed and understood
- Fewer team constraints
- Each component can be tested against spec
- Easy to churn team because everything well-documented
- Customers can wait a long time for results
- Difficult to accommodate change
- Difficult to respond to changing requirements
- Can be a problem if project is long running
  - Longer time = more likelihood of things changing
Plan driven too rigid - introduce flexibility with incremental development
- Develop in stages with customer feedback incorporated between iterations
- Specification - development - validation is iterative
- New functionality can be added in each iteration
- Each stage planned in full and validated against plan
- Cost of accommodating change is reduced
- Software available to use quicker so feedback can be gathered easily
- Customers can see development in progress
- Easier to include user acceptance testing
- Difficult to estimate cost of development
- Difficult to maintain consistency
- As progress continues, becomes harder to include new features or make changes
- Not cost effective to document each version
- Increased cost of repeated deployment
Re-writing software from scratch is expensive
- Rely instead on off the shelf components (libraries, frameworks)
- Include component analysis in development flow, identify library/framework
- Requirements may have to accommodate available components

Agile Methodologies

Agile development is a principle that defines a set of methodologies
- Interleaves specification, design and implementation
- System developed as a series of versions
- Feedback provided at each stage
- Process driven approaches have become too cumbersome as businesses need to be able to evolve more rapidly
  - Increased focus on code over design
- Principles include
  - Customer involvement
  - Incremental delivery
  - People, not process
  - Embrace change
  - Maintain simplicity
- Focuses on development over documentation - can make it hard to pick up a system later on
- Works well as long as original team continues the evolution - problems can arise if team changes
Possible to use techniques from both plan-based and agile, depending on what is applicable
Extreme programming is an agile methodology involving incremental delivery with fast iteration
- Build several times a day
- Deliver to customers often
- Automate tests to verify builds
- Strong customer involvement
- Incremental planning with requirements on story cards, stories selected based on priority
- Small releases with initially minimal functionality, then building with more
- Simple design, only enough to meet current requirements
- Write tests before the software - test driven
- Developers expected to continually refactor code
- Pair programming provides support
- Collective code ownership allows for anyone to work on anything
- Continuous Integration integrates components as soon as they are ready
- Working at a sustainable pace is important for developers
- Having customer on-site is useful to incorporate frequent feedback
Scrum is a general agile method that focuses on managing iterative development
- 3 primary stages:
  - Outline planning phase to establish general goals
  - Sprint cycles, each cycle developing an increment of the system
  - Project closure, wrap up project, document, deliver
- Uses quick development cycles of typically 2-4 weeks
  - Daily team meetings to discuss current work
  - Each sprint completes item on backlog
  - Features selected with customer
  - Scrum master interface between team and customer
  - End of each sprint, work reviewed and presented

Requirements Analysis

Requirements are descriptions of what the system should and should not do, the service it provides, and the constraints on its operation
Enable developers to make software that will correctly fulfil customers' needs
Provides a basis for tests, validation and verification
Enable (semi-)accurate cost specification
Important to distinguish what is built from how it is built
Requirements act as a bridge between customers and developers
First stage in any process is software specification - requirements engineering
- Requires that we define the services required from the system
- Identify constraints on operation and development
- Produce a requirements document
  - End user facing and system developer facing - possibly two documents
- Feasibility study determines that task is feasible and cost effective
- Requirements elicitation and analysis derives the system requirements
  - Look at existing docs
  - Talk to customer
  - Discuss features
  - Possibly prototype
- Requirements specification translates information gathered in elicitation into formal documents
- Requirements validation ensures requirements are achievable and valid
- Need to ensure customer signs off requirements
- Notion of C- and D-requirements for customer and development facing
  - Technical requirements vs idiot speak
  - C-requirements describe operation and constraints from user's point of view
  - D-requirements give detailed description of system functions, acting as basis for contract with developer
Good requirements are
- Prioritised: features have an implementation priority
- Consistent: requirements do not conflict with each other
- Modifiable: able to revise set of requirements when necessary and maintain history of changes
- Traceable: able to link each requirement to source, which could be higher-level requirement, use case, or customer statement
- Correct: accurately describes functionality to be delivered
- Feasible: must be possible to implement each requirement within the known capabilities and limitations of environment
- Necessary: should document something that customers actually need, or is required for conformance to external standard or interface
- Unambiguous: someone reading requirement should interpret it only one way
- Verifiable: can tests or other approaches be used to verify if requirement has been implemented properly
MoSCoW requirements group requirements into 4 groups:
- Must have
- Should have
- Could have
- Won't have
Requirements document will be read by:
- Customers
- Managers
- Engineers
- Testers
- Maintainers
Sections include:
- Preface - history and purpose of document
- Intro - justify and outline system
- Glossary
- User requirements design - describe services provided for users
- System architecture - high-level overview of system
- Requirements spec - describe functional and non-functional requirements
- System models - show relationships between system components
- System evolution - anticipated changes due to changing future needs
Functional requirements describe what system should do, state system services and how it should behave in different scenarios
Non-functional requirements are constraints on services or functions offered by system, describe qualities of the system such as availability, performance, etc
Requirements engineering is not a linear sequence, its processes are often interleaved and iterated upon
- Requirements discovery
  - Gather info from stakeholders
  - Domain research
  - Consider use cases
- Classification and organisation - group similar requirements and organise into categories
- Prioritisation and negotiation - assign priorities and sort conflicts between requirements from different stakeholders
- Specification - write the document and give to stakeholders, then iterate
Requirements validation is key once document has been written
- Validity - will system support customer's needs?
- Consistency - are there any conflicts?
- Realism - can system be produced with available resources and technology?
- Verifiability - can system be shown/proved to satisfy requirements?
- Review by both customers and engineers
- Prototyping and test-case generation
Requirements must be managed to see if they should be accepted
- Problem analysis - is new requirement valid and unambiguous?
- Change analysis - what are the effects on the rest of the system?
- Change implementation - do the change
Must take into account legal, social, ethical, professional issues
- Copyright
- Patents
- Developers given fair recognition of work
- Software not produced to do anything illegal or evil
- Work completed in best interest of customer

System Modelling

UML was developed in the 90s as a general purpose modelling language, providing a formal scheme for describing system models.
- Static/structural view of system (objects and their attributes)
- Dynamic/behavioural view - dynamic behaviour of system (collaboration between objects and changing state)
Different perspectives for system modelling include:
- External - the context of the system
- Interaction
  - Between system and its environment
  - Between components of system
- Structural
  - Organisation of system
  - Structure of data being processed
- Behavioural - dynamic behaviour of the system

Structural UML

Creating a static view of a system requires identifying entities, which can be done in one of four ways
- Grammatical approach based on natural language of the system
  - Identify key items from the description of the problem
- Identification of tangible things in the application domain
- Behavioural approach to identify objects
- Scenario-based, where objects, attributes and methods in each scenario are identified
Class diagram shows system classes and their relationships
- Show structure of design and organisation of components
- UML formal notation moves requirements closer to a mathematical description
- Forces us to think about the language used in D-requirements
- Class name is shown in diagram
- Attributes shown with types
- Methods shown with return and argument types
- Use the line with the crow's foot for showing one-to-many or many-to-many
- Use ranges to indicate how many objects there are
Writing correct class diagrams:
- Class name should be at the top
  - Abstract classes go in italic
  - Interfaces are represented <<interface>>
- Attributes represent internal datatypes and are optional
- Methods that make up public interface should be included
  - Don't show inherited
  - Don't show getters/setters
- Symbols indicate access modifier
  - + - public
  - - - private
  - # - protected
  - ~ - package private
  - / - derived
  - Static attributes/methods should be underlined
- Comments can be associated with classes, use a folded note notation
- Class inheritance hierarchies, drawn top down with arrows pointing to parent
  - Solid line with black arrow for class
  - Solid line with white arrow for abstract class
  - Dashed line with white arrow for interface
- Multiplicity shown next to arrow/line ends
  - * is zero or more
  - 1 is exactly one
  - 2..4 is between two and four
  - 3..* is 3 or more
- Include name and navigability on arrows
- Association (no arrow) shows classes are associated in some way
- White diamond shows aggregation
- Black diamond shows composition
- Dotted line shows a temporary use/dependency
Context models illustrate the operational context of the system and other systems
- Show links between different systems

Behavioural UML

Activity diagrams are flowcharts to represent workflows of stepwise activities within the system
- Involves actions, decision boxes, bars to introduce parallel actions
Use case diagram represents users' interactions with the system, and how they interact with the components
- Shows events occurring within system and how users trigger them
Sequence diagram shows temporal interaction between processes and user
- Time progresses downward
- example in slides
State machine diagram shows how the state of the system changes
- Similar to activity diagram but some fundamental differences
- State diagram performs actions in response to specific events
- Flowchart transitions from node to node on completion of activities
- Executing a program graph (flowchart) results in a state graph
- Instructions vs states
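The difference from a flowchart is easiest to see in code: a state machine is just a table from (state, event) to the next state, and nothing happens until an event arrives. A sketch using a hypothetical turnstile, with made-up state and event names:

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("locked", "coin"): "unlocked",
    ("locked", "push"): "locked",      # pushing a locked turnstile does nothing
    ("unlocked", "push"): "locked",
    ("unlocked", "coin"): "unlocked",  # extra coins are ignored
}


def run(events, state="locked"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state
```

For example, `run(["coin", "push"])` ends back in `"locked"`, while `run(["push", "coin"])` ends in `"unlocked"`.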

Architectural Patterns

Writing correct sequence diagrams:
- Participants are objects/entities
- Messages (arrows) are communications between objects
- Time moves from top to bottom
- Various ways of representing an object
  - Name:Type, can omit either name or type
- Dashed vertical line is the lifetime of the object, terminated with a cross
- When an object is active, represented with a box
  - Nest boxes for recursion
- Frame boxes allow for conditionals and loops
Architectural design is concerned with understanding how a system should be organised
- Often represented with box and line diagrams
- Two main uses:
  - Facilitating discussion about system design - high level view useful for stakeholders
  - Documenting that an architecture has been designed with a complete system model
- Non-functional requirements refer to system as a whole, so architectural design is closely related. Considers:
  - Performance
  - Security
  - Safety
  - Availability
  - Maintainability
- First need to break system down into subsystems
- Box/arrow diagrams show general interactions
  - Arrows show direction of data/control
  - May break down larger systems into subsystems

System Design

Design Patterns

There are common design patterns in software that we can identify and exploit
- Standard solution to common programming problem
- Technique for making code more flexible by making it meet certain criteria
- Design or implementation structure that achieves a particular purpose
- High-level programming idiom
- Shorthand for describing certain aspects of program organisation
- Connections among program components
- The shape of an object model
Four essential elements to a design pattern
- A name that meaningfully refers to the pattern
- Description of the problem to which the pattern applies
- Solution describing the parts of the design
- A statement of the consequences, results and tradeoffs, of applying the pattern
Goal of patterns is to have general solution that can be widely applied, utilising others' experience in design
SOLID principles are five principles that improve OOP design
- Single responsibility
  - Class should be responsible for single piece of functionality
- Open/closed
  - Open for extension, closed for modification
  - Once classes are complete, should add functionality by extending instead of editing
- Liskov substitution
  - An object that uses a parent class can use its child classes without knowing
  - Behavioural sub-typing
- Interface segregation
  - Many specific interfaces are better than a general one
  - No code should depend on methods it does not use
- Dependency inversion
  - Ensure high level classes do not rely on functions from low level classes
  - Interactions should rely on well-defined interfaces and go from low level to high level
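Dependency inversion is probably the least obvious of the five, so here is a minimal sketch. All class names are hypothetical. The high-level `AlertService` depends only on the `MessageSender` interface, and the low-level `EmailSender` implements it, so neither depends directly on the other.

```python
from abc import ABC, abstractmethod


class MessageSender(ABC):
    """The well-defined interface both sides depend on."""

    @abstractmethod
    def send(self, text):
        ...


class EmailSender(MessageSender):
    """Low-level detail; here it just records messages instead of sending email."""

    def __init__(self):
        self.outbox = []

    def send(self, text):
        self.outbox.append(text)


class AlertService:
    """High-level policy; knows nothing about how messages are delivered."""

    def __init__(self, sender: MessageSender):
        self.sender = sender

    def alert(self, problem):
        self.sender.send(f"ALERT: {problem}")
```

Swapping in a different sender (SMS, a test double) needs no change to `AlertService`, which also keeps it closed for modification.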

Creational

Factories
- A factory method is a method that manufactures objects of a particular type
- Constructors are limited, as they only allow objects of particular type
- Factories can bypass this problem to generate objects of different types
- Can be used anywhere a constructor can
- Example - the bike factory that creates bike and all its dependencies, instead of creating all manually and passing as constructor arguments
- Cuts down on repeated code
- Easy to add new variations and scenarios
- Have to make additional classes
- Factory linked to class it produces
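A sketch of the bike factory example. The part classes, materials and wheel sizes are invented for illustration. Each factory hides the construction of a bike's dependencies, so client code never builds frames or wheels itself.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    material: str


@dataclass
class Wheel:
    diameter_inches: int


@dataclass
class Bike:
    frame: Frame
    wheels: list


class BikeFactory:
    """Base factory: subclasses decide which parts a bike is built from."""

    def create_bike(self):
        raise NotImplementedError


class RoadBikeFactory(BikeFactory):
    def create_bike(self):
        return Bike(Frame("carbon"), [Wheel(28), Wheel(28)])


class MountainBikeFactory(BikeFactory):
    def create_bike(self):
        return Bike(Frame("aluminium"), [Wheel(29), Wheel(29)])


def build_fleet(factory, count):
    # Client code works with any factory and never names a concrete part
    return [factory.create_bike() for _ in range(count)]
```

Adding a new kind of bike means adding one more factory class, which shows both the benefit (easy new variations) and the cost (extra classes) listed above.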
Builders
- Help create complex objects
- Extract construction into a set of methods, builders
- Object creation happens in a series of steps, only calling the builders that we need
- Each sub-step is a different method that could be called by any builder
- Sub-steps in abstract class builder, then make concrete classes for each type of object we want to make
- Builders are not factories, they're more flexible versions for complex classes with optional parameters
- Give more control over construction
- Can re-use code for different instances
- Similar to factories, require lots of new classes
- Code becomes longer, construction still complex but more modular
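
A minimal sketch of the abstract-builder arrangement described above, assuming a hypothetical `House` product with wooden and stone variants (none of these names come from the module):

```python
from abc import ABC, abstractmethod


class House:
    def __init__(self):
        self.parts = []


class HouseBuilder(ABC):
    """Abstract builder: declares the sub-steps of construction."""

    def __init__(self):
        self.house = House()

    @abstractmethod
    def build_walls(self):
        """Each concrete builder decides what kind of walls to add."""

    def build_garage(self):
        # Optional step shared by every concrete builder
        self.house.parts.append("garage")
        return self

    def result(self) -> House:
        return self.house


class WoodenHouseBuilder(HouseBuilder):
    def build_walls(self):
        self.house.parts.append("wooden walls")
        return self


class StoneHouseBuilder(HouseBuilder):
    def build_walls(self):
        self.house.parts.append("stone walls")
        return self


# Only the steps that are needed get called
cabin = WoodenHouseBuilder().build_walls().result()
manor = StoneHouseBuilder().build_walls().build_garage().result()
```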
Prototypes
- Make one object, the prototype, then clone it, making copies of itself
- Putting the responsibility of duplication on the object itself helps us bypass issues around private/public variables
- Guarantees copy is identical
- Create a bunch of template objects, then can just clone the ones we want in each situation
- Don't need more classes just for creating objects
- Remove heavy initialisation in favour of cloning
- Circular references can be tricky
- Might have to perform heavy changes and updates on the cloned object
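
A small sketch of cloning from a template object. It assumes Python's standard `copy.deepcopy` as the cloning mechanism; the `Enemy` class and its fields are hypothetical.

```python
import copy


class Enemy:
    def __init__(self, name, health, inventory):
        self.name = name
        self._health = health          # "private" by convention
        self.inventory = inventory

    def clone(self):
        # The object duplicates itself, so outside code never touches its internals
        return copy.deepcopy(self)


# One template object, cloned and then adjusted per situation
goblin_template = Enemy("goblin", 30, ["dagger"])
goblin_a = goblin_template.clone()
goblin_b = goblin_template.clone()
goblin_b.inventory.append("shield")
```

Because the copy is deep, changing one clone's inventory leaves the template and the other clones untouched.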

Structural

Proxy patterns
- May wish to reference an entity without instantiating it
- Create placeholders for other objects, often by adding another level of indirection
- Allows us to load on demand
- Example, image proxy only loads actual image when draw() called
- Uses include
  - Virtual proxy, delay loading of resource until needed (lazy evaluation)
  - Remote proxy, offers client functionality of an object on another server by handling networking
  - Protection proxy, provides access control
  - Logging proxy, keeps track of accesses and requests
  - Caching proxy, saves results of object
  - Smart referencing, if no client is using object it can be removed and then retrieved later (garbage collection)
- Can hide away parts of the service object so it can be changed or controlled
- Allows to manage object life cycle
- Provides availability if service object isn't ready or available
- New proxies can be added without changing services or clients
- More classes so more complexity
- Adds another step so may result in slowdown
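
The image proxy example might look something like this. It is a sketch of a virtual proxy only; `RealImage`, `ImageProxy` and the file name are hypothetical, and the "expensive load" is simulated with a string.

```python
class RealImage:
    def __init__(self, path):
        self.path = path
        self.pixels = f"<pixel data from {path}>"  # stands in for an expensive load

    def draw(self):
        return f"drawing {self.path}"


class ImageProxy:
    """Virtual proxy: same draw() interface, but loads the image on first use."""

    def __init__(self, path):
        self.path = path
        self._image = None

    def draw(self):
        if self._image is None:
            self._image = RealImage(self.path)
        return self._image.draw()


proxy = ImageProxy("cat.png")
loaded_before_draw = proxy._image is not None
output = proxy.draw()
loaded_after_draw = proxy._image is not None
```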
Decorator pattern
- Allows to add new behaviour to objects at runtime
- Wrap original object and add new functionality
- Alternative to subclassing
- Inheritance is static, decorator can be done at runtime
  - Pass classes to decorator classes dependent upon what the requirements are
- Can extend behaviour without adding new subclasses
- Can combine wrappers and make functionality dynamic
- Removing wrappers is difficult
- Hard to implement in order-independent way
- Code can look messy
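
As a rough illustration of wrapping at runtime, the sketch below stacks two decorators around a base object. The notifier classes are hypothetical examples, not from the module.

```python
class Notifier:
    def send(self, message):
        return [f"email: {message}"]


class NotifierDecorator:
    """Wraps another notifier and forwards calls to it."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def send(self, message):
        return self._wrapped.send(message)


class SMSDecorator(NotifierDecorator):
    def send(self, message):
        return super().send(message) + [f"sms: {message}"]


class SlackDecorator(NotifierDecorator):
    def send(self, message):
        return super().send(message) + [f"slack: {message}"]


# Wrappers are chosen and combined at runtime
notifier = SlackDecorator(SMSDecorator(Notifier()))
sent = notifier.send("build failed")
```

The order of the results depends on the order of wrapping, which is one reason order-independent decorators are hard to write.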
Adaptor pattern
- Adaptors convert data formats we're working with to allow to use other services
- Instead of rewriting entire code to change data type, just adapt it
  - Add new class that inherits original, but converts types
- Can do a slightly more complex version
- Promote single-responsibility principle
- New adaptors can be introduced without refactoring
- Depending on code size, converting original object may be cheaper
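
A sketch of the simple inheriting adaptor described above, assuming a hypothetical sensor that reports Celsius and a client that needs Fahrenheit:

```python
class CelsiusSensor:
    def read(self):
        return 25.0  # degrees Celsius


class FahrenheitSensorAdaptor(CelsiusSensor):
    """Inherits the original class but converts the data it returns."""

    def read(self):
        celsius = super().read()
        return celsius * 9 / 5 + 32


def report(sensor):
    # Client code only relies on read(), so it does not change
    return f"{sensor.read():.1f}"


celsius_report = report(CelsiusSensor())
fahrenheit_report = report(FahrenheitSensorAdaptor())
```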
Flyweight pattern
- May have lots of objects that share properties, resulting in duplication of resources and wasting memory
- Hold one copy of all the properties that objects can then reference
- Identify resources or data that each object is referencing, then abstract it out to a static class
- Saves memory when lots of objects are in memory
- Lots of complexity
- May introduce additional overhead in compute time - tradeoff
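
A minimal flyweight sketch, assuming a hypothetical forest of trees that all share one texture. The shared data lives in a class-level cache and each tree only holds a reference to it.

```python
class TreeType:
    """Shared state: one copy per kind of tree."""

    _cache = {}

    def __init__(self, name, texture):
        self.name = name
        self.texture = texture

    @classmethod
    def get(cls, name, texture):
        key = (name, texture)
        if key not in cls._cache:
            cls._cache[key] = cls(name, texture)
        return cls._cache[key]


class Tree:
    """Per-object state plus a reference to the shared part."""

    def __init__(self, x, y, tree_type):
        self.x = x
        self.y = y
        self.type = tree_type


forest = [Tree(i, i * 2, TreeType.get("oak", "oak.png")) for i in range(1000)]
shared_types = len(TreeType._cache)
```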

Behavioural

Iterator pattern
- Traverse a container to access elements in order
- Does not expose container's data structure
- Allows to abstract traversal algorithms into own class
- New iterators can be introduced without re-designing existing code
- Can iterate multiple ways in parallel
- Not always necessary - do you really need one for a list
- Can be less effective for highly specialised objects
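
The sketch below shows two traversals over a hypothetical binary tree using Python generators. Client code iterates without ever touching the `left`/`right` fields.

```python
class BinaryTree:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def __iter__(self):
        # In-order traversal; callers never see the tree structure
        if self.left is not None:
            yield from self.left
        yield self.value
        if self.right is not None:
            yield from self.right

    def breadth_first(self):
        # A second traversal, added without changing code that uses __iter__
        queue = [self]
        while queue:
            node = queue.pop(0)
            yield node.value
            for child in (node.left, node.right):
                if child is not None:
                    queue.append(child)


tree = BinaryTree(2, BinaryTree(1), BinaryTree(3))
in_order = list(tree)
level_order = list(tree.breadth_first())
```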
Observer pattern
- Allows an object's dependents to be notified automatically if state changes are made
- Can work in a push model or pull model
- Highly customisable, subscribers can be added/removed from what they want to be involved with
- Observer interface has notify method
- Class holds list of observing objects, calls their notify method when there is an update
- Key to many real-time systems and cornerstone of MVC architecture
- New subscribers can be added without redesigning the publisher
- Relationships can change at runtime
- Subscribers notified in random order
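
A minimal push-model sketch of the publisher holding a list of observers and calling their `notify` method. The class names are hypothetical.

```python
class Publisher:
    def __init__(self):
        self._observers = []
        self.state = None

    def subscribe(self, observer):
        self._observers.append(observer)

    def unsubscribe(self, observer):
        self._observers.remove(observer)

    def set_state(self, state):
        self.state = state
        for observer in self._observers:
            observer.notify(state)  # push model: the new state is sent along


class LoggingObserver:
    def __init__(self):
        self.seen = []

    def notify(self, state):
        self.seen.append(state)


publisher = Publisher()
first, second = LoggingObserver(), LoggingObserver()
publisher.subscribe(first)
publisher.subscribe(second)
publisher.set_state("ready")
publisher.unsubscribe(second)   # relationships can change at runtime
publisher.set_state("done")
```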
Memento pattern
- Save and restore objects without revealing details of implementation
- Make an object responsible for saving its own internal state
- Can be used to implement undo functionality for restoring state
- Snapshot implements a limited interface so it can be stored externally (in a caretaker object) without exposing internal details
- Snapshot/memento stores the internal data of object and pointer to original object
- Caretaker handles restore
- Can make backups without violating encapsulation
- Extract out maintenance and restoration, keep original object interface simple
- Heavy memory cost
- Need caretakers to track original object life cycles to erase unneeded mementos
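
A sketch of undo built from a memento and a caretaker. The `Editor`, `Memento` and `History` names are hypothetical; the memento keeps the saved state and a reference to its originator, as described above.

```python
class Memento:
    """Opaque snapshot: the caretaker stores it but never reads inside."""

    def __init__(self, originator, state):
        self._originator = originator
        self._state = state

    def restore(self):
        self._originator._text = self._state


class Editor:
    def __init__(self):
        self._text = ""

    def write(self, words):
        self._text += words

    def save(self):
        # The editor itself decides what its saved state contains
        return Memento(self, self._text)

    def text(self):
        return self._text


class History:
    """Caretaker: keeps a stack of snapshots to implement undo."""

    def __init__(self):
        self._stack = []

    def push(self, memento):
        self._stack.append(memento)

    def undo(self):
        if self._stack:
            self._stack.pop().restore()


editor, history = Editor(), History()
editor.write("hello")
history.push(editor.save())
editor.write(" world")
before_undo = editor.text()
history.undo()
after_undo = editor.text()
```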
Strategy pattern
- Select the method to complete a task at runtime
- Want new object to be responsible for choosing the approach to a particular problem
- Have a number of classes, multiple strategies, that we can select between
- Original class becomes a context, doesn't know details of each strategy
- Route finder has many different strategies for finding routes, by car, by foot, by bus
  - Swap out travel method in route finding class
- Can swap implementations at runtime
- Separate details of algorithm from code that uses it
- Composition replaces inheritance
- If only a few choices, no need to increase complexity
- Requires clients to understand key differences between strategies to select an appropriate one
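
The route finder example could be sketched as below. The speeds are made-up figures used only to give each strategy different behaviour.

```python
def by_car(distance_km):
    return distance_km / 50  # hours, assuming 50 km/h


def by_foot(distance_km):
    return distance_km / 5   # hours, assuming 5 km/h


class RouteFinder:
    """Context: holds a strategy but knows nothing about how it works."""

    def __init__(self, strategy):
        self.strategy = strategy

    def travel_time(self, distance_km):
        return self.strategy(distance_km)


finder = RouteFinder(by_car)
car_hours = finder.travel_time(10)
finder.strategy = by_foot       # swapped at runtime
foot_hours = finder.travel_time(10)
```

Plain functions stand in for strategy classes here, which is a common simplification in Python; a class per strategy works the same way.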

Architectural Patterns

Layered architecture structures the system into layers, each providing services to the layer above it
- More separate a system is, more independent each module is, more can localise changes
- Each layer relies on layer below and provides services
- Facilitates incremental design
- Layers can be replaced to improve or allow multiplatform support
- Can be developed layer-by-layer
- Separation of functionality can be hard
- Can have performance implications
- Layers depend on all layers below, can have reliability implications
- Useful when
  - Building on top of existing systems
  - When development is spread across teams
  - When need to add security at each layer
Repository architecture has a central repository storing all data in the system
- Concerned with data sharing rather than structure
- Have large store of data used by many components
  - Database often passive, access and control done by components
- All interaction done through repo - subsystems do not interact
- Components can be independent
- All data can be managed consistently
- Efficient means of sharing large amounts of data
- Single point of failure is bad
- Can be inefficient to have all requests going through the repository
- Distributing repository to scale may be difficult as need to maintain consistency in data
- Useful when
  - System generates large volumes of data needed in persistent storage
  - Data-driven systems where the inclusion of data in the repository triggers an action
Pipe and filter has discrete processing components that filter data as it flows down a linear pathway (the pipe)
- Focuses on runtime organisation of the system
- Each component transforms input data to produce output
- Flexible - can introduce parallelism and change between batch and item-by-item execution
- Easy to understand and evolve
- Matches structure of many apps
- Supports reuse
- Flexible
- Requires standardised data format
  - Modifying standard difficult
- Useful when data processing
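
A small pipe-and-filter sketch using Python generators, where each filter transforms a stream of lines item by item. The filter names are hypothetical; the standardised data format here is simply "an iterable of strings".

```python
def strip_blank(lines):
    for line in lines:
        if line.strip():
            yield line


def to_upper(lines):
    for line in lines:
        yield line.upper()


def pipeline(source, *filters):
    # Chain the filters so each consumes the previous one's output
    stream = source
    for f in filters:
        stream = f(stream)
    return stream


result = list(pipeline(["a", "", "b"], strip_blank, to_upper))
```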
Model-View-Controller (MVC) focuses on how to interpret user interactions, update data, then present it to user
- Controller manages user interactions, passes them to view and model to update
- Model manages data, updates according to operations it is asked to perform
- View manages how data from model is presented to user
- Basis of interaction management in many web systems
- Each logical component deals with different aspect: presentation, interaction, data
- Data can be changed independently of how it is displayed
- Allows user to have control over how they see data without changing model
- Adds additional complexity to design
- Simple interactions require considering three different system aspects
- Can be hard to distribute development
- Portability is low due to heavy interaction
- Useful when:
  - System offers multiple ways to view and interact with data
    - Good for many types of web and mobile apps
  - Used when future requirements for interaction and presentation of data are unknown
    - Allows for flexibility in view without changing model

Testing

Dependability

Dependability is the trustworthiness of a computer system such that reliance can justifiably be placed in the service it delivers
It's important that we trust systems as they become more crucial to society and everyday life
- System failures affect people
- Users reject unreliable systems
- System failures are costly
- Undependable systems cause information loss
Reliability is a measure of how likely a system is to provide its service for a specified period of time
Perceived reliability is how reliable the system actually appears to users
- The two differ because systems may be unreliable in ways users do not see
There are a number of ways to measure reliability
- Probability of failure on demand - how likely is it that a request will fail
- Rate of occurrence of failures - how many failures will we expect to see in a fixed time period
- Mean time to failure - how long can system run without failing
- Availability - if a request is made to a system, what is the probability it will be operational
Attributes of dependability:
- Availability - likeliness a service is ready for use when invoked
- Reliability - a measure of how likely the system is to provide its designated service for a specified period of time
- Safety - extent to which system can operate without causing damage or danger to its environment
- Confidentiality - don't disclose undue information to unauthorised entities
- Integrity - capacity of a system to ensure absence of improper alterations with regard to the modification or deletion of information
- Maintainability - a function of time representing the probability that a failed computer system will be repaired in $t$ time or less
Some system properties are directly related to dependability:
- Repairability - how easy is the system to fix when it breaks?
- Future maintainability - is it economical to add new requirements and keep the system relevant?
- Error tolerance - system must be able to avoid errors when the user inputs data
A fault is the cause of an error
An error is the manifestation of a fault
Failure is the result of an error propagating beyond a system boundary
- Systems can fail due to hardware/software failure, or operational failure
- Types of failure include:
  - Hardware failure: Components do not function
  - Software failure: Errors in specification, design or implementation
  - Operational failure: Error between the chair and the keyboard
Provide dependability by:
- Fault avoidance - write software to be robust
- Fault detection and correction - verification and validation processes
- Fault tolerance - design the system to manage faults
Dependable processes are designed to produce dependable software
- Documentable - should have a well-defined model
- Standardised - should be applicable for many different systems
- Auditable - should be understandable by other people
- Diverse - should include redundant and diverse verification techniques
- Robust - should be able to recover from failures of process activities
System architectures should also be designed to be dependable
- Diversity should be created by giving the same problem to different teams
- Protection systems
  - Specialised system monitors control system, equipment, hardware, environment
  - Takes action if a fault is detected
  - Moves system to safe state once problem detected
- Self-monitoring architectures
  - Designed to monitor own operation and take action if problem detected
  - Computations carried out in duplicate on separate channels, outputs compared
  - If any difference then failure detected
  - Hardware and software on channels should be diverse
- N-version programming
  - Multiple software units each made by different teams under same specification
  - Each version executed on separate computers
  - Outputs are compared using a voting system
  - High software cost so used where other dependable systems are impractical
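
The voting step of N-version programming can be sketched as below. The three "versions" are toy functions written for this example, with one deliberately faulty, and the majority rule is one possible voting scheme rather than a prescribed one.

```python
from collections import Counter


def version_a(x):
    return x * x


def version_b(x):
    return x ** 2


def version_c(x):
    # Deliberately faulty for x == 3 to show the vote masking a fault
    return x * x + 1 if x == 3 else x * x


def vote(outputs):
    # Majority vote; fail if no value has more than half of the votes
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority between versions")
    return value


def n_version_square(x):
    return vote([v(x) for v in (version_a, version_b, version_c)])


voted = n_version_square(3)
```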

System Testing

Testing shows that a program does what it was intended to do
Highlights defects before software is in use
Forms a part of verification and validation
Demonstrates software meets requirements
Can only show the presence of errors, not their absence
Verification - does a product meet spec?
Validation - does it meet customer's needs?
Error - human action that produces incorrect result
Failure - deviation of software from expectations
Defects/bugs - manifestation of a software error
Testing - exercise software to assess if it meets requirements
Test case - a set of inputs, preconditions and expected outcomes developed to exercise compliance against a specific requirement
Reliability - probability software will not cause failure for a specified time
Test plan - record of the application of test cases and rationale
System testing - covers both functional and non-functional requirements
Static testing is testing without execution
- Code review, inspection
- Works well with pair programming
- Static testing is verification - does code meet spec?
- Static code analysis tools are becoming more common
- Not limited to code, can also consider documents
- Should use inspection:
  - Errors interact and hide other errors, inspection can uncover all errors
  - Code does not need to be complete to inspect it
  - Allows to consider code quality too
  - 90% of errors can be found through inspection
Dynamic testing executes code with given test cases
- Inspections bad at discovering timing and performance based issues
- Execute code with given test case
- Structural/white box testing is test cases derived from control/data flow of system
- Involves validation - does product meet needs of customer?
- Functional/black box testing is test cases derived from formal component specification
- Control flow graph shows all possible cases for program flow
  - Used to reason about test coverage
Unit tests involve initialising system with inputs and expected output, calling method, then checking the result
- May use mock objects to make testing faster if objects have heavy dependencies
- Testing is expensive, should aim to be effective with test cases
- May miss errors that occur in interactions between objects - integration tests
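
A unit test with a mock object might look like the sketch below, using Python's standard `unittest` and `unittest.mock`. The function under test and the `tax_for` method are hypothetical.

```python
import unittest
from unittest.mock import Mock


def total_price(basket, tax_service):
    # Unit under test: depends on an external tax service
    subtotal = sum(basket)
    return subtotal + tax_service.tax_for(subtotal)


class TotalPriceTest(unittest.TestCase):
    def test_adds_tax_to_subtotal(self):
        # The mock replaces the heavy dependency with a canned answer
        tax_service = Mock()
        tax_service.tax_for.return_value = 20
        self.assertEqual(total_price([40, 60], tax_service), 120)
        tax_service.tax_for.assert_called_once_with(100)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalPriceTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

The test follows the initialise, call, check shape: set up inputs and the expected output, call the method, then compare the result.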
Interface errors are the most common in complex systems
- Interface misuse
- Interface misunderstanding
- Timing errors
- Guidelines for component testing:
  - Check extremes of ranges
  - Test interface calls with null pointers
  - Design tests that cause failure and see how failure handled
  - Stress test
  - Vary the order in which memory is accessed
Goal of system testing is to check that components are compatible and interact as expected
- Similar to integration testing but different
- Check full system including off-the-shelf components and components built by other teams
- Looking for emergent behaviour
  - The characteristics we only see when components interact
  - Both expected and unexpected
Test-driven development was originally part of XP but has become more mainstream
- Tests are developed for a bit of code, write the code so the test passes, move on
- Writing test first helps clarify and understand functionality
- Simplifies regression testing, debugging, improves documentation
- Can be bad if you don't know enough to write the tests, or forget important test cases
- Most effective when developing new system
- Does not replace system testing
- Bad when concurrency involved
User testing is important, as it tests the system in the actual case it will be used
- Alpha testing - early version, small group
  - During development
  - Requirements do not reflect all factors
  - Reduces risk of unanticipated changes to software
  - Requires heavy user involvement
- Beta testing - less early version, larger group
  - Test on version nearly complete
  - Large group of users find potential issues
  - Discovers issues in interaction between system and operating environment
  - Can be a form of marketing
- Acceptance testing - test release candidate with real people
  - Crucial for custom systems
  - Customers test system with their own data, decide if acceptable
  - Define acceptance criteria
  - Plan the testing
  - Derive the acceptance test cases, covering all requirements (functional and non-functional)
  - Do the tests with the users in a deployment
  - Negotiate tests results with customer, unlikely all will pass
  - Customer either accepts or rejects system
    - Can be accepted conditionally
  - In XP, there are no separate acceptance tests as the customer is involved throughout
  - Best testers are typical users but can be difficult

Human-Computer Interaction

The success of software is determined by the people who use it
Attention is important, as we have to make use of it to make good UIs
- Can force or divide attention, or make use of involuntary attention
- Selective attention is when we focus on a particular stimulus
- Sustained attention is our ability to focus on a single task for a long period of time
- Divided attention is our ability to focus on multiple things at once, can depend on how complex tasks are
- Executive attention is a more organised version of sustained attention, when have a clear goal/plan and keep track of steps
Memory is important, have to make UIs intuitive and easy to remember
- Consider the context of the task - how much attention can we afford to give?
- Three components to memory:
  - Sensory stores - visual and auditory stores hold info before it enters working memory
  - Working memory - short term memory that holds transitory info and makes it available for further processing
    - Decays rapidly and has limited capacity
    - Most key in UI design
  - Long-term memory - holds info for long term storage
    - Episodic memory is knowledge of events and experiences
    - Semantic memory is a record of facts, concepts and skills
- Decrease cognitive load by keeping the UI sparse, so that as few things as possible are held in short term memory
Cognition is the process by which we gain knowledge
Norman's human action cycle describes the actions people take when interacting with computer systems
- Steps:
  - Form a goal - user decides what they want to accomplish
  - Intention to act - user makes their intent explicit, considers options they could choose to achieve their goal
  - Planning to act - user chooses an action
  - Execution - user executes the action
  - Feedback - user receives feedback on their action
  - Interpret feedback - user makes their own interpretation of feedback compared to their expectations
  - Evaluate outcome - user determines if they have achieved their goal
- Gulf of evaluation - the gap which must be crossed to interpret a UI
  - Important to minimise cognitive load so UI is easy to evaluate
- Gulf of execution - the gap between the user's goals and the means to execute the goals
  - Number of steps it takes to complete an action
  - Should minimise for common tasks
- Can extract four goals from the cycle:
  - Provide visibility
  - Provide good mappings
  - Provide a good conceptual model
  - Provide feedback
The Gestalt laws of perceptual organisation are a set of principles around human visual perception
- Figure ground principle - people tend to segment their vision into the figure and the ground, the figure being the focus
- Similarity principle - if two things look similar we assume they behave the same way, form informs function
- Proximity principle - if two objects are close together they must be related, often overrides other visual attributes
- Common region principle - similar to proximity, if we have objects in a bordered region we assume they are related
- Continuity principle - objects on a line or curve are perceived as related
- Closure principle - complex arrangements can be seen as single patterns (eg, the blanks in the shapes showing a tiger)
- Focal point principle - will be drawn to the most obvious bit of an image first
Affordances are what an object allows us to do
- Important to make them as clear as possible to the user
- Signifiers are cues/hints about an object's affordances
  - eg, a save icon means you can save a file
- Can be perceptible or invisible
- Many exist by convention
Several usability concepts impact system design
- Feedback - give user visual/auditory feedback on actions performed
- Constraint - restrain users' actions (gaussian blur)
- Mapping - relationship between controls and their effects (a trash can icon)
- Consistency - similar operations should use similar elements for similar tasks
Nielsen's usability principles:
- Visibility of system status
- Match system and real world - use familiar language to user
- User control and freedom - give escape routes such as an undo button
- Consistency and standards (especially consistency in the use of language)
- Help user recognise and recover from error
- Error prevention - "Are you sure?" dialogue
- Recognition over recall of action flows
- Flexibility and efficiency of use - eg, macros for advanced users
- Aesthetic and minimalist design
- Provide help and documentation

ES2C0

Diodes

Transistors were originally designed to replace mechanical switches/relays, but also provide amplification. Transistors and diodes are made by adding impurities to silicon to make it either p-type (hole carriers, positive charge moves) or n-type (electron carriers, negative charge moves). Putting the two together makes a PN-junction, or diode. Diodes only allow current to flow in one direction, and only once the forward bias voltage is high enough (usually around 0.7 V).

When the PN-junction is forward biased, current flows from P to N.

The PN-junction can be reverse biased too, and at a certain point ("the knee"), the junction will break down and current will flow in reverse.

The graph shows a typical small-signal silicon diode at a temperature of 300 K. Zener diodes are diodes where the reverse breakdown voltage is controlled during manufacture to create diodes that act as voltage regulators when reverse biased.

The Shockley equation for a PN-junction relates diode current $i_{D}$ and voltage $v_{D}$ :

$i_{D} = I_{s} [\exp (\frac{v_{D}}{V_{T}}) - 1]$

Where $I_{s}$ is the reverse saturation current, and $V_{T} \approx 25 mV$ is the thermal voltage. When $v_{D}$ is large, typically $v_{D} > 0.1 V$ :

$i_{D} \approx I_{s} \exp (\frac{v_{D}}{V_{T}})$
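
As a quick numerical check of the approximation, the sketch below evaluates both forms of the Shockley equation at $v_{D} = 0.1 V$. The saturation current of $10^{-14}$ A is an assumed, illustrative value; $V_{T}$ is the 25 mV quoted above.

```python
import math

V_T = 0.025      # thermal voltage in volts, as in the text
I_S = 1e-14      # reverse saturation current in amps (assumed value)


def diode_current(v_d):
    """Full Shockley equation."""
    return I_S * (math.exp(v_d / V_T) - 1)


def diode_current_approx(v_d):
    """Forward-bias approximation that drops the -1 term."""
    return I_S * math.exp(v_d / V_T)


exact = diode_current(0.1)
relative_error = abs(diode_current_approx(0.1) - exact) / exact
```

At 0.1 V the two forms differ by roughly 2%, and the gap shrinks rapidly as $v_{D}$ increases, since the exponential term dominates.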

Load Line Analysis

For the circuit below, KVL gives $V_{SS} = R i_{D} + v_{D}$ .

Substituting the Shockley equation for $i_{D}$ gives $V_{SS} = R I_{s} e^{\frac{v_{D}}{V_{T}}} + v_{D}$ . This is a transcendental equation with no closed-form solution.

Instead, if an I-V curve is given, can perform load line analysis.

The load line is the straight line from one axis to the other, overlaid with the diode's I-V characteristic curve.

Point B is a perfect short circuit, $v_{D} = 0$ , $i_{D} = \frac{V_{SS}}{R}$
Point A is an open circuit, $v_{D} = V_{SS}$ , $i_{D} = 0$

The operating point, or Q (Quiescent)-point, is the point at which the two lines intersect, giving an operating point of $(V_{D Q}, I_{D Q})$ .

If the diode is not conducting, then little to no current flows. Otherwise, it will conduct almost perfectly at about 0.7 volts, so $V_{DQ} \approx 0.7 V$ usually.
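
When no I-V curve is available, the Q-point can also be found numerically as the intersection of the load line and the Shockley equation. The sketch below uses bisection; the supply, resistor and diode parameters are all assumed values chosen for illustration.

```python
import math

V_SS, R = 2.0, 1000.0          # assumed supply voltage and resistance
I_S, V_T = 1e-14, 0.025        # assumed diode parameters


def mismatch(v_d):
    # Load line current minus diode current; zero at the Q-point
    return (V_SS - v_d) / R - I_S * (math.exp(v_d / V_T) - 1)


lo, hi = 0.0, V_SS             # mismatch is positive at lo and negative at hi
for _ in range(60):
    mid = (lo + hi) / 2
    if mismatch(mid) > 0:
        lo = mid
    else:
        hi = mid

V_DQ = (lo + hi) / 2
I_DQ = (V_SS - V_DQ) / R
```

With these values the diode voltage settles between 0.6 V and 0.7 V, in line with the rule of thumb above.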

The Zener Diode

Zener diodes are designed to operate in the reverse breakdown region. The breakdown voltage is controlled by the doping level during manufacture, which allows a fixed voltage to appear between cathode and anode (that isn't just 0.7 V). The ideal Zener diode behaves something like this:

The circuit below shows a diode being used to regulate the voltage of a variable supply, to keep the voltage supplied to a load $R_{L}$ constant.

As an example, given a Zener diode's I-V curve, find the output voltage for $V_{SS} = 15 V$ and $V_{SS} = 20 V$ , with $R = 1 k Ω$ . KVL gives a load line of $V_{SS} + R i_{D} + v_{D} = 0$ :

The graph shows the two load lines plotted with the diode I-V curve, giving $V_{o}$ of 10V and 10.5V, respectively.

When modelling Zener diodes, an internal resistance $r_{Z}$ is sometimes used, which is what gives the slope of the I-V curve as $1/ r_{Z}$ :

Oscillators

Oscillators employ feedback through amplifiers and frequency selective networks (capacitors/resistors) to create sinusoidal oscillation.

$A_{c l} (jω) = \frac{v _{o}}{v _{i}} = \frac{A}{1 - A β ( jω )}$

$A_{c l}$ is the closed loop gain of the system.
$A$ is the open loop gain (with no feedback)
$β$ is the feedback fraction, that feeds back a portion of the output voltage back to the input
- Negative feedback reduces the gain of the system, which is desirable because $A$ is often very large
  - Stabilises circuits
  - Reduces noise and distortion
  - Increases bandwidth
- Positive feedback is employed in the circuit above, which is how oscillators are built
  - $A β = 1$ leads to oscillation
Both positive and negative feedback are used in oscillators
Loop gain $L (jω) = A β (jω)$ is the gain just before the summing junction in the feedback

If at a specific frequency $f_{0}$ , the loop gain $A β$ is unity, $A_{c l}$ will tend to infinity. This is an oscillator. The condition for sinusoidal oscillations of frequency $ω_{0}$ is:

$L (jω) = A β (jω) = 1$

At $ω_{0}$ the phase of the loop gain must be zero and the magnitude of the loop gain must be unity. This is known as the Barkhausen criterion.

The loop must produce and sustain an output with no input applied ( $v_{i} = 0$ )
The frequency $ω_{0}$ is determined by the phase characteristics of the feedback loop
If loop gain $A β > 1$ , output grows $\to \infty$
If loop gain $A β < 1$ , output decays $\to 0$

It is difficult to get exactly unity loop gain. In terms of sinusoidal functions in the Laplace domain, we are trying to place both the poles of the function on the imaginary axis in the s-plane. Poles in the right hand side of the plane will initiate oscillation, but bringing them back to the imaginary axis will reduce loop gain to unity and sustain oscillation. Poles in the left hand side of the plane will give a decaying sinusoid.

Wien-Bridge Oscillator

A Wien Bridge employs frequency selective positive feedback through the capacitor/resistor connected to the non-inverting op-amp terminal, and frequency independent negative feedback connected to the inverting op-amp terminal.

$A$ is the open loop gain
$A β (jω)$ is the loop gain

For oscillation, we require $∣ A β ∣ = 1$ , as this gives closed loop gain $A_{c l} = \frac{A}{1 - 1} \to \infty$ , which causes oscillation.

First analysing the positive feedback network in the Laplace domain (a capacitor has impedance $1/ s C$ in the s-domain). $Z_{1}$ is the series combination of $R$ and $C$ , and $Z_{2}$ is the parallel combination:

$Z_{1} = R + \frac{1}{s C}$

$Z_{2} = \frac{R}{1 + s CR}$

This is a frequency-dependent potential divider, so:

$\frac{V _{2}}{V _{1}} (s) = \frac{Z _{2}}{Z _{1} + Z _{2}} = \frac{s CR}{1 + 3 s CR + s ^{2} C ^{2} R ^{2}}$

This is the transfer function of the frequency-selective positive feedback, as a function of $s = σ + jω$ . We require $σ = 0$ for a sinusoid, so:

$\frac{s CR}{1 + 3 s CR + s ^{2} C ^{2} R ^{2}} = \frac{jω CR}{1 - ω ^{2} C ^{2} R ^{2} + 3 jω CR}$

The fraction above is real when $1 - ω^{2} C^{2} R^{2} = 0$ , so:

$ω_{0} = \frac{1}{RC} \Rightarrow f_{0} = \frac{1}{2 π RC}$

Which gives:

$\frac{V _{2}}{V _{1}} = β = \frac{jω RC}{j 3 ω RC} = \frac{1}{3}$

The gain loss of the positive feedback network is $β = \frac{1}{3}$ when $ω_{0} = \frac{1}{RC}$

As the feedback fraction $β = \frac{1}{3}$ , we require that $A = 3$ for unity gain. The negative feedback circuit with the two resistors forms a non-inverting amp, so:

$A = 3 = 1 + \frac{R _{2}}{R _{1}} \Rightarrow R_{2} = 2 R_{1}$

Verifying this using the overall loop gain:

$L (jω) = A β (j ω_{0}) = [1 + \frac{R _{2}}{R _{1}}] [\frac{jω CR}{( 1 - ω ^{2} C ^{2} R ^{2} ) + jω 3 CR}] = [1 + \frac{R _{2}}{R _{1}}] [\frac{j ω _{0} CR}{j ω _{0} 3 CR}] = [1 + \frac{R _{2}}{R _{1}}] \frac{1}{3}$
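
The result can be checked numerically by evaluating the feedback transfer function at $ω_{0} = 1/RC$ with complex arithmetic. The component values below are assumed for illustration only.

```python
import cmath
import math

R, C = 10e3, 16e-9             # assumed feedback network values
R_1, R_2 = 10e3, 20e3          # assumed gain resistors, with R2 = 2 * R1


def beta(omega):
    """Positive feedback network transfer function at s = j*omega."""
    s = 1j * omega
    return (s * C * R) / (1 + 3 * s * C * R + (s * C * R) ** 2)


omega_0 = 1 / (R * C)
f_0 = omega_0 / (2 * math.pi)          # oscillation frequency in Hz
beta_0 = beta(omega_0)

A = 1 + R_2 / R_1                      # non-inverting amplifier gain
loop_gain = A * beta_0
phase_deg = math.degrees(cmath.phase(loop_gain))
```

At $ω_{0}$ the feedback fraction comes out as $1/3$ with zero phase, so a gain of 3 gives a loop gain of $1∠0$ , satisfying the Barkhausen criterion.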

Phase Shift Oscillator

A phase shift oscillator relies on 180 degrees of phase shift from inverting op-amp A3, and then 3 lots of 60 degrees of additional phase shift from 3 voltage-buffered RC networks. With 360 degrees of phase shift around the loop, the final stage gain is set such that $L (jω) = A β (jω) = 1∠0$

The unity gain buffers provide voltage isolation between RC stages
- Buffers have high input impedance and low output impedance, isolating stages to simplify analysis
The maximum phase shift an RC network can provide is 90 degrees, but it is hard to achieve this so three 60 degree networks are used instead
The final op-amp A3 forms an inverting amplifier using $R$ and $R_{2}$ to give 180 degrees of phase shift

For 60 degrees of phase shift in an RC network, we require:

$60° = 90° - \tan^{- 1} (ω RC)$

$ω_{0} = \frac{\tan 30°}{RC} = \frac{1}{\sqrt{3} RC}$

As each RC network also acts as a high pass filter, there will be a gain loss through them.

$\frac{v_{o}}{v_{i}} = \frac{jω RC}{1 + jω RC}$

Using $ω = ω_{0} = 1/ (\sqrt{3} RC)$ and taking the magnitude:

$∣ \frac{v_{o}}{v_{i}} ∣ = \frac{1/ \sqrt{3}}{\sqrt{1 + 1/3}} = 0.5$

The gain loss for one RC stage is 0.5, so the 3 stages have a combined gain loss of $0. 5^{3} = 1/8$ . The inverting op-amp therefore must have a gain of -8 to give overall unity gain.

$A_{3} = \frac{- R _{2}}{R _{1}} = - 8$

So the value of $R_{2}$ must be set accordingly.
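
A short numerical check of the stage gain and phase, using $\tan 30° = 1/\sqrt{3}$ for $ω_{0}$ . The component values are assumed; only the ratios matter for the result.

```python
import cmath
import math

R, C = 10e3, 10e-9                                  # assumed component values
omega_0 = math.tan(math.radians(30)) / (R * C)      # = 1 / (sqrt(3) * R * C)


def rc_stage(omega):
    """High-pass RC stage transfer function."""
    x = 1j * omega * R * C
    return x / (1 + x)


stage = rc_stage(omega_0)
stage_gain = abs(stage)                             # magnitude of one stage
stage_phase = math.degrees(cmath.phase(stage))      # phase lead of one stage
A_3 = -8                                            # inverting amplifier gain
loop_gain = A_3 * stage ** 3                        # three buffered stages
```

Each stage contributes a gain of 0.5 and 60 degrees of phase, so three stages give $-1/8$ and the inverting gain of -8 brings the loop gain to $1∠0$ .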

Bipolar Junction Transistors

There are two kinds of BJTs, NPN and PNP. Both have a base, collector, and emitter and consist of two PN-junctions.

The operating mode of a BJT depends on how the junctions are biased.

Forward active mode is used for amplification
Cutoff and saturation modes are used for switching in digital circuits
- Cutoff is when both junctions are fully off
- Saturation is when both junctions are fully on

Mode	Base-Emitter Bias	Collector-Base Bias
Cutoff	Reverse	Reverse
Forward active	Forward	Reverse
Saturation	Forward	Forward

Transistors obey KCL, so all currents entering a transistor must leave:

$i_{E} = i_{B} + i_{C}$

Transistors also have common-emitter current gain, $β$

$i_{C} = β i_{B}$

There is also the parameter $α$ , the common-base current gain:

$α = \frac{β}{1 + β}$

$i_{E} = \frac{i_{C}}{α}$

$β$ is usually large, so $α \approx 1$

Large-Signal Model

BJTs operating in forward-active mode can be modelled as shown:

The current source shown is dependent upon the base current, and a diode is included to model the 0.7 V drop across the PN-junction. This model assumes the transistor is biased correctly, as shown:

Biasing

Biasing is used to set up the quiescent collector current to achieve optimum AC and DC conditions at the same time. $β$ is not well specified and can vary per device, so the bias circuit should be designed to produce the correct operating conditions independent of device parameters and temperature. The Q-point is defined by $V_{CEQ}$ and $I_{CQ}$ .

The circuit shows a single resistor base-biased circuit. Doing KVL around the base-emitter loop gives $I_{CQ}$ :

$V_{CC} = I_{B} R_{B} + V_{BE}$

$I_{B} = \frac{V _{CC} - V _{BE}}{R _{B}}$

$I_{CQ} = β I_{B} = β \frac{V _{CC} - V _{BE}}{R _{B}}$

KVL between $V_{CC}$ and ground gives $V_{CEQ}$ :

$V_{CC} = I_{C} R_{C} + V_{CE}$

$V_{CE} = V_{CC} - I_{C} R_{C}$

$V_{CEQ} = V_{CC} - β \frac{V _{CC} - V _{BE}}{R _{B}} R_{C}$

Equations depending on $β$ are bad though, because $β$ can vary too much to rely on it as a parameter. The graph below shows the same circuit with the same resistors, as $β$ varies:

The Q-point shown is for $β = 175$ , which gives $V_{CEQ} = 3.5 V$ and $I_{CQ} = 1 m A$
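
To see how strongly this circuit depends on $β$ , the sketch below evaluates the Q-point equations for two values of $β$ . The supply and resistor values are assumed for illustration and are not those of the circuit in the figure.

```python
V_CC, V_BE = 10.0, 0.7         # volts (assumed)
R_B, R_C = 930e3, 2e3          # ohms (assumed)


def q_point(beta):
    """Return (V_CEQ, I_CQ) for the single-resistor base bias circuit."""
    i_b = (V_CC - V_BE) / R_B
    i_c = beta * i_b
    v_ce = V_CC - i_c * R_C
    return v_ce, i_c


q_low = q_point(100)
q_high = q_point(300)
```

With these values, tripling $β$ triples the collector current (1 mA to 3 mA) and moves $V_{CEQ}$ from 8 V to 4 V, so the Q-point is far from stable.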

Four-Resistor Voltage Divider Bias

This is the most widely used method to bias a BJT

Thevenin's theorem is used to simplify the bias circuit

$R_{T H} = R_{1} ∣∣ R_{2}$
$V_{T H} = V_{CC} (R_{2} / (R_{1} + R_{2}))$

Applying KVL around the base-emitter loop:

$V_{T H} = I_{B} R_{T H} + V_{BE} + I_{E} R_{E}$

As $I_{E} = (1 + β) I_{B}$ :

$V_{T H} = I_{B} R_{T H} + V_{BE} + (1 + β) I_{B} R_{E}$

$I_{B} = \frac{V _{T H} - V _{BE}}{R _{T H} + ( 1 + β ) R _{E}}$

Therefore, the collector current $I_{C}$ :

$I_{C} = β I_{B} = β \frac{V _{T H} - V _{BE}}{R _{T H} + ( 1 + β ) R _{E}}$

However, we need to stabilise $I_{C}$ so that it does not depend upon $β$ . If we choose $R_{T H}$ to be small, ie $R_{T H} = β R_{E} /10$ , then $R_{T H} \ll (1 + β) R_{E}$ and it can be disregarded, and for large $β$ the factor $β / (1 + β) \approx 1$ :

$I_{C} \approx \frac{β ( V _{T H} - V _{BE} )}{( 1 + β ) R _{E}} \approx \frac{V _{T H} - V _{BE}}{R _{E}} \approx I_{E}$

The equation (approximately) no longer depends upon $β$ . Applying KVL around the collector-emitter loop for the voltage gives:

$V_{CE} = V_{CC} - I_{C} (R_{C} + R_{E})$

So, if $I_{C}$ is stable, then the Q-point $(V_{CEQ}, I_{CQ})$ is bias-stable. Stability is achieved through the choice of a small enough $R_{T H}$ , and also the inclusion of an emitter resistor which provides negative feedback stabilisation.

Compare the graph below with the same one further up for the single-resistor bias circuit. The voltage/current are much more stable and less dependent upon $β$ .
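The same comparison can be made numerically. The sketch below evaluates the exact four-resistor bias equations for several values of $β$. As an assumption, it borrows the component values from the worked example later in this section ($V_{CC} = 10 V$, $R_{1} = 12.2 kΩ$, $R_{2} = 3.3 kΩ$, $R_{C} = 800 Ω$, $R_{E} = 260 Ω$) and a fixed $V_{BE} = 0.7 V$.

```python
V_BE = 0.7  # assumed base-emitter drop (V)

def divider_bias_q_point(beta, v_cc, r1, r2, r_c, r_e):
    """Return (I_C, V_CE) for the four-resistor voltage divider bias circuit."""
    r_th = r1 * r2 / (r1 + r2)            # Thevenin resistance of the divider
    v_th = v_cc * r2 / (r1 + r2)          # Thevenin voltage of the divider
    i_b = (v_th - V_BE) / (r_th + (1 + beta) * r_e)
    i_c = beta * i_b
    i_e = (1 + beta) * i_b
    v_ce = v_cc - i_c * r_c - i_e * r_e   # KVL around the collector-emitter loop
    return i_c, v_ce

for beta in (50, 100, 200):
    i_c, v_ce = divider_bias_q_point(beta, 10.0, 12.2e3, 3.3e3, 800.0, 260.0)
    print(f"beta={beta}: I_C={i_c * 1e3:.2f} mA, V_CE={v_ce:.2f} V")
```

With these values a fourfold change in $β$ moves $I_{C}$ by less than 20%, whereas in the single-resistor circuit $I_{C}$ is directly proportional to $β$.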

Transistors in Saturation

$I_{B} = \frac{10 - 0.7}{10 k} = 0.93 m A$

$I_{C} = \frac{50 ( 10 - 0.7 )}{10 k} = 46.5 m A$

$V_{C} = 10 - 46.5 m \times 1 k = - 36.5 V$

This can't be correct. The model breaks down because the transistor is saturated, so it is no longer operating in the forward-active region.

The voltage across the collector/emitter settles at about 0.2 V
The transistor then turns on like a switch
Collector to emitter is (roughly) a short circuit

When operating in saturation, $β$ becomes $β_{forced}$ :

$β_{forced} = \frac{I _{C}}{I _{B}} = \frac{9.8}{0.93} = 10.5$

This is much lower than a typical $β$ would be. As $V_{BC}$ is increased further, $β$ decreases further and further.
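The saturation check above can be written as a short routine: compute the forward-active prediction, and if it would push the collector below $V_{CE(sat)}$, clamp it instead. This is a sketch only. It assumes the circuit implied by the numbers above (a 10 V supply, a 10 kΩ base resistor, a 1 kΩ collector resistor, $β = 50$ and a grounded emitter), since the circuit diagram itself is not reproduced here.

```python
V_BE = 0.7       # assumed base-emitter drop (V)
V_CE_SAT = 0.2   # approximate collector-emitter voltage in saturation (V)

def collector_state(v_cc, r_b, r_c, beta):
    """Return (I_B, I_C, V_C, I_C / I_B) for a base-biased NPN with grounded emitter."""
    i_b = (v_cc - V_BE) / r_b
    i_c = beta * i_b                   # forward-active prediction
    v_c = v_cc - i_c * r_c
    if v_c < V_CE_SAT:                 # impossible result: the transistor is saturated
        v_c = V_CE_SAT
        i_c = (v_cc - V_CE_SAT) / r_c  # current is now set by the collector resistor
    return i_b, i_c, v_c, i_c / i_b

i_b, i_c, v_c, beta_forced = collector_state(10.0, 10e3, 1e3, 50)
print(f"I_B={i_b * 1e3:.2f} mA, I_C={i_c * 1e3:.1f} mA, beta_forced={beta_forced:.1f}")
```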

BJT Amplifiers

BJTs make excellent amplifiers when biased in the forward-active region
Transistors can provide high voltage, current and power gain
DC biasing stabilises the operating point
DC Q-point determines
- Small-signal parameters
- Voltage gain
- Input & output impedances
- Power consumption
DC analysis finds the Q-point
AC analysis with the small-signal model is used to analyse the amplifier

Hybrid-Pi Model

The hybrid-pi small signal model is what is used for hand analysis of BJTs:

Intrinsic low-frequency representation of a BJT
- Does not work for RF stuff
Ignoring output impedance assumes $V_{A}$ is large
Parameters are controlled by the Q-point
Transconductance $g_{m} = \frac{I _{CQ}}{V _{T}} ≊ 40 I_{CQ}$
- Thermal voltage $V_{T} = 25 mV$
Input resistance $r_{π} = \frac{β V _{T}}{I _{CQ}} = \frac{β}{g _{m}}$

For AC analysis, coupling capacitors are replaced by short circuits, and DC voltages replaced by short circuits to ground. The circuit below shows a 4-resistor bias amplifier replaced by its small signal model.

AC Analysis

The impedance at the base input $R_{ib}$ :

$R_{ib} = \frac{V _{in}}{i _{b}} = \frac{i _{b} r _{π} + R _{E} ( 1 + β ) i _{b}}{i _{b}} = r_{π} + R_{E} (1 + β)$

The impedance at the emitter is reflected back to the base, multiplied by $(1 + β)$ . This makes the overall input impedance of the amplifier:

$R_{i} = R_{ib} ∣∣ R_{1} ∣∣ R_{2}$

$R_{i} = (r_{π} + R_{E} (1 + β)) ∣∣ R_{T H}$

The output impedance is easy, as looking into the collector, we can see $R_{C}$ in parallel with a current source which has infinite impedance, so:

$R_{O} = R_{C}$

The voltage across $R_{C}$ is the output voltage:

$V_{o} = - β i_{b} R_{C}$

The voltage across $r_{π} + R_{E}$ is the input voltage:

$V_{in} = i_{b} (r_{π} + R_{E} (1 + β))$

The overall voltage gain is therefore:

$A_{v} = \frac{V _{o}}{V _{in}} = \frac{- β R _{C}}{r _{π} + R _{E} ( 1 + β )}$

Note that the gain is negative meaning this is an inverting amplifier. If we make the assumption that $r_{π} << R_{E} (1 + β)$ , and that $β$ is large, then:

$A_{v} = \frac{V _{o}}{V _{in}} \approx \frac{- β R _{C}}{R _{E} ( 1 + β )} \approx \frac{- R _{C}}{R _{E}}$
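To get a feel for how good the $- R_{C} / R_{E}$ approximation is, the exact and approximate gains can be compared directly. The values used below ($β = 150$, $I_{CQ} = 1 mA$, $R_{C} = 4.7 kΩ$, $R_{E} = 1 kΩ$) are hypothetical and chosen purely for illustration.

```python
V_T = 0.025  # thermal voltage (V), as used in these notes

def common_emitter_gain(beta, i_cq, r_c, r_e):
    """Return (exact, approximate) voltage gain of the unbypassed common emitter amplifier."""
    r_pi = beta * V_T / i_cq
    exact = -beta * r_c / (r_pi + (1 + beta) * r_e)
    approx = -r_c / r_e
    return exact, approx

exact, approx = common_emitter_gain(150, 1e-3, 4.7e3, 1e3)
print(f"exact A_v = {exact:.2f}, approximate A_v = {approx:.2f}")
```

For these assumed values the two agree to within a few percent, because $r_{π}$ (3.75 kΩ) is small compared with $R_{E} (1 + β)$ (151 kΩ).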

Common Collector Amplifier

The common collector (or emitter-follower) amplifier is another amplifier circuit used with BJTs (as opposed to the common emitter shown above).

The hybrid-pi model of this circuit is shown below. As there is no collector resistor, the circuit can be re-arranged to:

The output voltage is the voltage across the emitter resistor, and as $i_{E} = (1 + β) i_{b}$ :

$V_{o} = i_{b} (1 + β) R_{E}$

The input voltage is the voltage across both the emitter resistor and $r_{π}$ :

$V_{in} = r_{π} i_{b} + i_{b} (1 + β) R_{E}$

Therefore the voltage gain for this amplifier is:

$A_{v} = \frac{V _{o}}{V _{in}} = \frac{i _{b} ( 1 + β ) R _{E}}{r _{π} i _{b} + i _{b} ( 1 + β ) R _{E}}$

$A_{v} = \frac{R _{E} ( 1 + β )}{r _{π} + R _{E} ( 1 + β )}$

As usually $r_{π} << R_{E} (1 + β)$ , $A_{v} \approx 1$ . This amplifier has very low voltage gain, and instead acts as a current amplifier:

$A_{i} = \frac{i _{E}}{i _{b}} = \frac{( 1 + β ) i _{b}}{i _{b}} = 1 + β$

The input impedance is large, as it is the reflected impedance from the emitter resistor again:

$R_{in} = \frac{V _{i}}{i _{b}} = \frac{i _{b} r _{π} + i _{b} ( 1 + β ) R _{E}}{i _{b}} = r_{π} + R_{E} (1 + β)$

The output impedance can be calculated by shorting $V_{in}$ , and by applying a test current source $i_{x}$ across the output terminals. I'm not going to type out all the analysis but:

$R_{o} = \frac{r _{π}}{1 + β} \approx \frac{1}{g _{m}}$

The emitter follower has high input and low output impedance with a high current gain, so acts as an impedance transformer and a buffer.
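The emitter follower results above can be collected into one small function. The numbers used here ($β = 100$, $I_{CQ} = 1 mA$, $R_{E} = 1 kΩ$) are hypothetical, and the output impedance uses the simplified form quoted above rather than the full test-source analysis.

```python
V_T = 0.025  # thermal voltage (V), as used in these notes

def emitter_follower(beta, i_cq, r_e):
    """Return (A_v, A_i, R_in, R_out) for the common collector amplifier."""
    r_pi = beta * V_T / i_cq
    a_v = r_e * (1 + beta) / (r_pi + r_e * (1 + beta))  # voltage gain, just under 1
    a_i = 1 + beta                                      # current gain
    r_in = r_pi + (1 + beta) * r_e                      # reflected emitter impedance
    r_out = r_pi / (1 + beta)                           # simplified output impedance
    return a_v, a_i, r_in, r_out

a_v, a_i, r_in, r_out = emitter_follower(100, 1e-3, 1e3)
print(f"A_v={a_v:.3f}, A_i={a_i}, R_in={r_in / 1e3:.1f} kOhm, R_out={r_out:.1f} Ohm")
```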

Example

A circuit for a common-emitter amplifier is shown below.

Work out the values of the DC biasing components $R_{B 1}$ , $R_{B 2}$ , and $R_{E}$ for the following conditions:

$V_{CC} = 10 V$
$I_{E} = 5 m A$
$V_{E} = 1.3 V$
Voltage across $R_{C} = 4 V$
$β = 100$

Assuming $I_{E} = I_{C}$ and $α = \frac{β}{1 + β} = 0.99$ , we have:

$R_{C} = \frac{4}{5 m A} = 800Ω$

$R_{E} = \frac{1.3}{5 m A} = 260Ω$

Calculating the Thevenin equivalent of the biasing resistors:

$R_{T H} = \frac{β R _{E}}{10} = 2.6 k Ω$

$V_{T H} = V_{E} + V_{BE} + I_{B} R_{T H} = 1.3 + 0.7 + 50 μ A \times 2.6 k Ω = 2.13 V$

Then calculating the bias resistors from the Thevenin values:

$R_{1} = \frac{V _{CC} R _{T H}}{V _{T H}} = \frac{10 \times 2.6 k}{2.13} = 12.2 k Ω$

$R_{2} = \frac{R _{T H} R _{1}}{R _{1} - R _{T H}} = \frac{2.6 k \times 12.2 k}{12.2 k - 2.6 k} = 3.3 k Ω$

To derive an expression for the voltage gain, the BJT needs to be replaced by its small signal model:

$V_{o u t} = - β i_{b} R_{c}$

$V_{s} = i_{b} r_{π} + i_{b} R_{E} + β i_{b} R_{E} = i_{b} (r_{π} + (1 + β) R_{E})$

$A_{v} = \frac{V _{o u t}}{V _{s}} = \frac{- β R _{c}}{r _{π} + ( 1 + β ) R _{E}} = \frac{- 100 \times 800}{500 + 101 \times 260} = - 2.99$
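The arithmetic of this example can be checked with a few lines of Python. The sketch follows the same steps and the same approximations as the working above ($I_{C} = I_{E}$, $I_{B} = I_{E} / β$, $V_{BE} = 0.7 V$, $V_{T} = 25 mV$). The variable names are only illustrative.

```python
V_BE, V_T = 0.7, 0.025
v_cc, i_e, v_e, v_rc, beta = 10.0, 5e-3, 1.3, 4.0, 100

r_c = v_rc / i_e                   # collector resistor, taking I_C = I_E
r_e = v_e / i_e                    # emitter resistor
i_b = i_e / beta                   # base current (1 + beta approximated by beta)
r_th = beta * r_e / 10             # rule of thumb for the Thevenin resistance
v_th = v_e + V_BE + i_b * r_th     # KVL around the base-emitter loop
r_1 = v_cc * r_th / v_th
r_2 = r_th * r_1 / (r_1 - r_th)

r_pi = beta * V_T / i_e            # small-signal input resistance
a_v = -beta * r_c / (r_pi + (1 + beta) * r_e)

print(f"R_C={r_c:.0f}, R_E={r_e:.0f}, R_1={r_1:.0f}, R_2={r_2:.0f}, A_v={a_v:.2f}")
```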

MOSFETs

Metal Oxide Semiconductor Field Effect Transistors are the dominant type of transistor nowadays, due to their simplicity to fabricate in VLSI applications. They are voltage controlled current sources, unlike BJTs, which are current-controlled.

By convention, the source terminal is at lower voltage than drain, so $V_{D S} > 0$
MOSFETs have three regions of operation
- Cutoff
- Linear
- Saturation
  - Different to BJT saturation

A MOS transistor is characterised by its transconductance:

$g_{m} = \frac{2 I _{D Q}}{V _{GS} - V _{TN}}$

Operating Regions

In the linear region:

$i_{D} = K_{n} [2 (V_{GS} - V_{TN}) - V_{D S}] V_{D S}$ for $0 < V_{D S} < (V_{GS} - V_{TN})$

$K_{n}$ is the transconductance constant, a function of the semiconductor physics and geometry, and will be given.
$V_{TN}$ is the N-channel threshold voltage for the MOSFET

In this region, the relationship between $i_{D}$ and $V_{D S}$ is (mostly) linear. The graph below shows the current set by different voltages for different values of $V_{GS}$ . The current begins to saturate at higher voltages, but is linear at lower values.

The drain current saturates, and the device operates in saturation, when:

$V_{D S} \geq V_{D S (s a t)} = V_{GS} - V_{TN}$

$i_{D}$ in saturation:

$i_{D} = K_{n} (V_{GS} - V_{TN})^{2}$

In the cutoff region, no current flows, as $V_{GS} < V_{TN}$ .

The saturation voltage $V_{D S (s a t)} = V_{GS} - V_{TN}$
Device is in saturation when $V_{D S} > V_{D S (s a t)}$
Device is in linear region when $V_{D S} < V_{D S (s a t)}$
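The three operating regions and their drain current equations can be combined into a single function. This is a sketch of the model exactly as stated above, with no channel-length modulation or other second-order effects. The test values $K_{n} = 25 mA/V^{2}$ and $V_{TN} = 1.5 V$ are borrowed from the bias example later in this section.

```python
def drain_current(v_gs, v_ds, k_n, v_tn):
    """Return (region, i_D) for an N-channel MOSFET."""
    if v_gs < v_tn:
        return "cutoff", 0.0                   # below threshold, no current flows
    v_ds_sat = v_gs - v_tn
    if v_ds < v_ds_sat:
        return "linear", k_n * (2 * (v_gs - v_tn) - v_ds) * v_ds
    return "saturation", k_n * v_ds_sat ** 2   # current no longer depends on V_DS

for v_gs, v_ds in ((1.0, 5.0), (3.0, 0.5), (3.0, 5.0)):
    region, i_d = drain_current(v_gs, v_ds, 25e-3, 1.5)
    print(f"V_GS={v_gs} V, V_DS={v_ds} V: {region}, i_D={i_d * 1e3:.2f} mA")
```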

MOSFET Bias Networks

MOSFETs are useful in amplifiers when operating in saturation, when drain current is a function of gate-source voltage. As there is no gate current in a MOSFET:

$i_{G} = 0 ⟹ i_{D} = i_{S}$

KVL around the gate-source loop:

$V_{t h} = i_{G} R_{t h} + V_{GS} + i_{D} R_{S} = V_{GS} + i_{D} R_{S}$

Combining this equation with $i_{D} = K_{n} (V_{GS} - V_{TN})^{2}$ , ie $V_{GS} = \sqrt{i_{D} / K_{n}} + V_{TN}$ , gives the following quadratic equation in $\sqrt{i_{D}}$ :

$R_{S} i_{D} + \frac{1}{\sqrt{K_{n}}} \sqrt{i_{D}} + (V_{TN} - V_{T H}) = 0$

As the equation is quadratic, there are two possible solutions:

$\sqrt{i_{D}} = \frac{- K_{n}^{- 1/2} \pm \sqrt{K_{n}^{- 1} - 4 R_{S} ( V_{TN} - V_{T H} )}}{2 R_{S}}$

Only one of the solutions will be valid, so both must be calculated and checked. Using the following values:

$K_{n} = 25 mA/V^{2}$
$V_{TN} = 1.5 V$
$R_{S} = 330Ω$
$R_{D} = 560Ω$
$V_{CC} = 15 V$
$V_{T H} = 4.78 V$

Gives $i_{D} = 8.2 m A$ or $i_{D} = 12.0 m A$ . Checking the first one:

$V_{D} = 15 - 8.2 m \times 560 = 10.408 V$

$V_{S} = 8.2 m \times 330 = 2.706 V$

$V_{G} = V_{T H} = 4.78 V$

$V_{GS} = V_{G} - V_{S} = 2.074 V > V_{TN} = 1.5 V$

$V_{D S} = V_{D} - V_{S} = 7.7 V$

$V_{D S (s a t)} = V_{GS} - V_{TN} = 2.074 - 1.5 = 0.574 V$

$i_{D} = 8.2 m A$ is a valid solution as $V_{GS} > V_{TN}$ , and $V_{D S} > V_{D S (s a t)}$ .

Doing the same calculations for the other value yields a gate-source voltage that is below the threshold voltage, so the transistor is not operating in saturation and not conducting, meaning it can be disregarded.
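Both roots and their validity checks can be computed in one go. The sketch below treats $x = \sqrt{i_{D}}$ as the unknown of the quadratic $R_{S} x^{2} + x / \sqrt{K_{n}} + (V_{TN} - V_{T H}) = 0$ , which is an assumption consistent with the bias network formula in the equations summary, and then applies the two saturation conditions. The function names are illustrative only.

```python
import math

def bias_solutions(k_n, v_tn, v_th, r_s):
    """Return both candidate drain currents from the quadratic in sqrt(i_D)."""
    a, b, c = r_s, 1 / math.sqrt(k_n), v_tn - v_th
    root = math.sqrt(b * b - 4 * a * c)
    return [((-b + sign * root) / (2 * a)) ** 2 for sign in (1, -1)]

def in_saturation(i_d, v_tn, v_th, r_s, r_d, v_supply):
    """Check V_GS > V_TN and V_DS > V_DS(sat) for a candidate drain current."""
    v_gs = v_th - i_d * r_s
    v_ds = v_supply - i_d * (r_d + r_s)
    return v_gs > v_tn and v_ds > v_gs - v_tn

K_N, V_TN, V_TH, R_S, R_D, V_SUPPLY = 25e-3, 1.5, 4.78, 330.0, 560.0, 15.0
for i_d in bias_solutions(K_N, V_TN, V_TH, R_S):
    valid = in_saturation(i_d, V_TN, V_TH, R_S, R_D, V_SUPPLY)
    print(f"i_D = {i_d * 1e3:.2f} mA, valid: {valid}")
```

The second candidate comes from a negative value of $\sqrt{i_{D}}$ , which is another way of seeing that it cannot be a physical solution.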

MOSFET Amplifiers

Small-Signal Model

As MOSFETs have no gate current, their small signal model is much simpler than that of a BJT.

Between the gate and the source is an open circuit, but the voltage between the two sets the dependent current source $i_{D} = g_{m} \times V_{GS}$ . The MOSFET also has infinite input impedance.

Common-Source Amplifier

Similar to a BJT common emitter amplifier, we can construct a MOSFET common source amp:

Using the small signal model of the MOSFET, this amplifier looks like this:

Drain current $i_{D} = g_{m} V_{GS}$ , so:

$V_{o u t} = - g_{m} V_{GS} R_{D}$

$V_{in} = V_{GS} + g_{m} V_{GS} R_{S}$

$A_{v} = \frac{- g _{m} V _{GS} R _{D}}{V _{GS} + g _{m} V _{GS} R _{S}} = \frac{- g _{m} R _{D}}{1 + g _{m} R _{S}} \approx \frac{- R _{D}}{R _{S}}$

Note that transconductance in a MOSFET is:

$g_{m} = \frac{2 i _{D Q}}{V _{GS} - V _{TN}}$

This is much lower than transconductance in a BJT, hence the gain is much lower.

Bypass Capacitors

Adding a bypass capacitor to the amplifier increases the gain, while keeping the DC Q-point stable. Remember that capacitors act as short circuits in AC, and open circuits in DC.

$A_{v} = \frac{- g _{m} R _{D} V _{GS}}{V _{GS}} = - g_{m} R_{D}$

The gain of the amplifier with a bypass capacitor is much higher.
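The effect of the bypass capacitor is easy to quantify. As an assumption, the sketch below reuses the Q-point from the bias network example ($i_{D} = 8.2 mA$, $V_{GS} = 2.074 V$, $V_{TN} = 1.5 V$, $R_{D} = 560 Ω$, $R_{S} = 330 Ω$) and treats the capacitor as a perfect AC short.

```python
def mosfet_gm(i_dq, v_gs, v_tn):
    """Transconductance at the Q-point."""
    return 2 * i_dq / (v_gs - v_tn)

def common_source_gain(g_m, r_d, r_s, bypassed=False):
    """Voltage gain of the common source amplifier, with or without R_S bypassed."""
    if bypassed:
        return -g_m * r_d
    return -g_m * r_d / (1 + g_m * r_s)

g_m = mosfet_gm(8.2e-3, 2.074, 1.5)
print(f"g_m = {g_m * 1e3:.1f} mS")
print(f"unbypassed A_v = {common_source_gain(g_m, 560.0, 330.0):.2f}")
print(f"bypassed A_v = {common_source_gain(g_m, 560.0, 330.0, bypassed=True):.2f}")
```

For these values the bypassed gain is roughly ten times the unbypassed one.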

Input and Output Impedance

The input impedance of a MOSFET is infinite, as no current flows between gate and source.
The overall input impedance of a common source MOSFET amp is $R_{in} = R_{T H} = R_{1} ∣∣ R_{2}$ , as the two gate bias resistors will act as impedances to input signals
The output impedance of the bypassed amplifier above is just $R_{D}$ , as that's the only impedance in the model.
- If $R_{D} = 0$ , like in a common drain/source follower, then this becomes $1/ g_{m}$
MOSFETs have higher input impedances for this reason, so MOSFET amplifiers are used over BJTs where high impedance is required.

Differential Amplifiers

Op-amps are differential amplifiers, designed to amplify the difference between two inputs. In an op-amp, the gain $A$ is usually very large, approximately $10^{6}$ in practice, and is modelled as infinite.

$V_{o u t} = A (V_{1} - V_{2})$

Op-amps are based on a circuit known as a long-tailed pair:

$Q_{1}$ and $Q_{2}$ are a matched pair of transistors, meaning they have the exact same electrical properties
- $β_{Q_{1}} = β_{Q_{2}}$
The quiescent current $I_{Q}$ is the current through the shared emitter resistor, $R_{EE}$
$I_{Q} = I_{E_{1}} + I_{E_{2}}$
If $β$ is large, then $I_{C} = I_{E}$

Biasing

When doing bias calculations, the two inputs $V_{1}$ and $V_{2}$ are assumed to be grounded, $V_{B} = 0$ . As $V_{BE} = 0.7$ , $V_{E} = - 0.7$ . $V_{EE}$ , the tail voltage, is always taken as negative. Using this, we can calculate $I_{Q}$ :

$I_{Q} = \frac{- 0.7 - ( - V _{EE} )}{R _{EE}} = \frac{V _{EE} - 0.7}{R _{EE}} = 2 I_{E}$

$R_{EE}$ sets the quiescent collector current.

When the two inputs are grounded, the outputs at the collectors $V_{C_{1}}$ and $V_{C_{2}}$ are the same.

$V_{C_{1}} = V_{C_{2}} = V_{CC} - I_{C} R_{C}$

For matched transistors, $V_{C_{1}} - V_{C_{2}} = 0$ .

AC Analysis

The long-tailed pair can operate in two modes, depending upon how input is applied

Differential mode amplifies the difference between the two input signals
Common mode works similar to a regular BJT amplifier
Better amplifiers have a high ratio of differential to common gain, called the Common Mode Rejection Ratio (CMRR)

Differential Mode

The circuit below shows two AC sources connected, $V_{d} /2$ and $- V_{d} /2$ , to give a differential input signal of $V_{d}$ .

The differential output is the difference between the two outputs:

$V_{o 1} - V_{o 2} = V_{o d}$

And the differential mode gain:

$A_{i d} = \frac{V _{o d}}{V _{d}} = g_{m} R_{C}$

The way this circuit is usually used, however, is with one output referenced to ground:

This gives a single-ended output, with a gain of:

$\frac{V _{o}}{V _{d}} = \frac{g _{m} R _{C}}{2}$

The input and output resistances for differential mode inputs are:

$R_{i d} = 2 r_{π}$ $R_{o d} = 2 R_{C}$

Common Mode

Common mode input is when the same signal is connected to both input terminals, $V_{1} = V_{2} = V_{c m}$ . An ideal differential amplifier would reject common mode input, but this is often not the case. The performance of a differential amplifier is defined by its CMRR, which would ideally be infinite, but is usually just very large in practice.

$A_{c m} = \frac{V _{o 1}}{V _{c m}} = \frac{- β R _{C}}{r _{π} + 2 ( 1 + β ) R _{EE}} \approx \frac{- R _{C}}{2 R _{EE}}$

The common mode input resistance:

$R_{i c} = \frac{r _{π}}{2} + R_{EE} (1 + β)$

The generalised output of a differential amplifier, factoring in both common mode and differential mode input signals is:

$V_{o u t} = A_{i d} V_{i d} + A_{c m} V_{c m}$

Example

Find $R_{EE}$ to give $I_{Q} = 1 m A$ , and $R_{C}$ for max AC swing, when $V_{CC} = 15 V$ .

$R_{EE} = \frac{15.7 - 0.7}{1 m} = 15 k Ω$

For max AC swing, the voltage across $R_{C}$ should be $\approx 7.5 V$ , with each collector carrying $I_{Q} /2 = 0.5 m A$ :

$R_{C} = \frac{7.5}{0.5 m A} = 15 k Ω$

Using $β = 100$ , calculate the differential and common mode gains, and the CMRR of the circuit.

$r_{π} = \frac{β V _{T}}{I _{C}} = \frac{100 \times 25}{0.5} = 5 k Ω$

$A_{i d} = \frac{g _{m} R _{C}}{2} = \frac{β R _{C}}{2 r _{π}} = \frac{100 \times 15 k}{2 \times 5 k} = 150$

$A_{c m} = \frac{- β R _{c}}{r _{π} + 2 ( 1 + β ) R _{EE}} = \frac{- 100 \times 15 k}{5 k + 202 \times 15 k} = - 0.49$

$CMRR = \frac{A _{i d}}{A _{c m}} = \frac{150}{0.49} = 303.5 = 49.6 d B$

The common mode rejection ratio for this circuit is fairly low, because $R_{EE}$ is low. $A_{c m} \to 0$ , and so $CMRR \to \infty$ , as $R_{EE} \to \infty$ , so ideally $R_{EE}$ is as large as possible. Replacing it with an ideal current source with infinite resistance can achieve this.
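The gain and CMRR figures of this example can be reproduced numerically, which also makes it easy to see the effect of a larger $R_{EE}$ . The sketch below uses the single-ended differential gain $g_{m} R_{C} /2$ and takes the CMRR as a magnitude. The function name is illustrative only.

```python
import math

V_T = 0.025  # thermal voltage (V), as used in these notes

def long_tailed_pair(beta, i_q, r_c, r_ee):
    """Return (A_id, A_cm, CMRR, CMRR in dB) for a long-tailed pair."""
    i_c = i_q / 2                        # each transistor carries half the tail current
    r_pi = beta * V_T / i_c
    a_id = beta * r_c / (2 * r_pi)       # single-ended differential gain, g_m R_C / 2
    a_cm = -beta * r_c / (r_pi + 2 * (1 + beta) * r_ee)
    cmrr = abs(a_id / a_cm)
    return a_id, a_cm, cmrr, 20 * math.log10(cmrr)

a_id, a_cm, cmrr, cmrr_db = long_tailed_pair(100, 1e-3, 15e3, 15e3)
print(f"A_id={a_id:.0f}, A_cm={a_cm:.2f}, CMRR={cmrr:.1f} ({cmrr_db:.1f} dB)")
```

Calling the same function with a much larger `r_ee` shows the common mode gain shrinking and the CMRR growing accordingly.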

Op-Amps

Operational Amplifiers are fundamental to modern electronics

$V_{o u t} = A (V_{2} - V_{1})$

Properties of an ideal Op-Amp:

$A \to \infty$
$R_{in} \to \infty$
- $I_{in} \to 0$
$R_{o u t} \to 0$
Slew rate $\to \infty$
- Output can change as fast as we want
Common Mode Rejection Ratio (CMRR) $\to \infty$
- Signals where $V_{2} = V_{1}$ are rejected and not amplified
Power supply rejection ratio $\to \infty$
Bandwidth $\to \infty$

A large gain $A$ drives the differential input to zero $(V_{2} - V_{1}) \to 0$ , as the op-amp always tries to keep the two inputs the same.

Buffers

A buffer provides unity gain while acting as a signal buffer.

$V_{o u t} = V_{in}$

$A_{v} = 1$

As $R_{in}$ is high and $R_{o u t}$ is low, no current flows in and there is no impedance to current flowing out, meaning the buffer acts to isolate stages of a circuit.

Active Filters

(they aren't really active, according to Ryan.)

$\frac{V _{o u t}}{V _{in}} = A (jω) = - \frac{Z _{2}}{Z _{1}}$

$Z_{1}$ and $Z_{2}$ are generalised impedances and can take any value. If $Z_{2} = R_{2} ∣∣ C$ , for example, then:

$A (jω) = - \frac{R _{2} / R _{1}}{1 + jω C R _{2}}$

A limits test shows that this would make a low pass filter:

As $f \to 0$ , $A (jω) \to - R_{2} / R_{1}$
As $f \to \infty$ , $A (jω) \to 0$
The mid-band is where $|A (jω)| = R_{2} / R_{1}$

Cutoff frequency is where the denominator equals $1 + j$ (ie $ω C R_{2} = 1$ ), which is $\frac{1}{2 π R _{2} C}$ Hz

The other way round, where $Z_{1}$ is $C$ in series with $R_{1}$ and $Z_{2} = R_{2}$ , is a high pass filter:

$A (jω) = - \frac{jω C R _{2}}{1 + jω C R _{1}}$

This gives a cutoff frequency of $f_{c} = \frac{1}{2 π R _{1} C}$ , where the max gain as $f \to \infty$ is $R_{2} / R_{1}$
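The limits test for the low pass case can be checked numerically using Python's built-in complex numbers. The component values below ($R_{1} = 1 kΩ$, $R_{2} = 10 kΩ$, $C = 10 nF$) are hypothetical and only serve to illustrate the shape of the response.

```python
import math

def low_pass_gain(f, r1, r2, c):
    """A(jw) = -(R2 / R1) / (1 + jw C R2) for the low pass active filter."""
    w = 2 * math.pi * f
    return -(r2 / r1) / (1 + 1j * w * c * r2)

def cutoff_frequency(r, c):
    """f_c = 1 / (2 pi R C) in Hz."""
    return 1 / (2 * math.pi * r * c)

R1, R2, C = 1e3, 10e3, 10e-9   # hypothetical component values
f_c = cutoff_frequency(R2, C)
for f in (1.0, f_c, 1e6):
    print(f"f={f:.0f} Hz: |A| = {abs(low_pass_gain(f, R1, R2, C)):.3f}")
```

At the cutoff frequency the magnitude is the mid-band gain divided by $\sqrt{2}$ , ie 3 dB down.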

Equations

Below are some of the main equations that I have found useful to have on hand.

Use ./generateTables.sh ../src/es2c0/equations.md in the scripts folder.

Oscillators
Closed Loop Gain	$A_{c l} (jω) = \frac{v _{o}}{v _{i}} = \frac{A}{1 - A β ( jω )}$
Loop Gain	$L (j w) = A β (j w)$
Frequency Potential Divider ( $s$ )	$\frac{V _{o}}{V _{i}} (s) = \frac{Z _{2}}{Z _{1} + Z _{2}} = \frac{s CR}{1 + 3 s CR + s ^{2} C ^{2} R ^{2}}$
Frequency Potential Divider ( $j w$ )	$\frac{V _{o}}{V _{i}} (j w) = \frac{jω CR}{1 - ω ^{2} C ^{2} R ^{2} + 3 jω CR}$
Frequency of Unity Gain (0 phase shift)	$ω_{0} = \frac{1}{RC} \Rightarrow f_{0} = \frac{1}{2 π RC}$
60 Degrees of phase shift in CR network	$ω_{60°} = \frac{1}{\sqrt{3} RC}$
Transfer function of CR Network	$\frac{v _{o}}{v _{i}} = \frac{jω RC}{1 + jω RC}$
Transfer function of RC Network	$\frac{v _{o}}{v _{i}} = \frac{1}{1 + s CR} = \frac{1}{1 + j wCR}$
Transfer function of Inverse Frequency potential divider ( $s$ )	$\frac{V _{o}}{V _{i}} (s) = \frac{Z _{2}}{Z _{1} + Z _{2}} = \frac{( 1 + s CR ) ^{2}}{1 + 3 s CR + s ^{2} C ^{2} R ^{2}}$
Transfer function of Inverse Frequency potential divider ( $j w$ )	$\frac{V _{o}}{V _{i}} (j w) = \frac{( 1 + j wCR ) ^{2}}{( 1 - w ^{2} C ^{2} R ^{2} ) + 3 j wCR}$
Transfer function of Frequency potential divider (Inductor) ( $s$ )	$\frac{V _{o}}{V _{i}} (s) = \frac{s L R}{( R ^{2} + s ^{2} L ^{2} ) + 3 s L R}$
Transfer function of Frequency potential divider (Inductor) ( $j w$ )	$\frac{V _{o}}{V _{i}} (j w) = \frac{j w L R}{( R ^{2} - w ^{2} L ^{2} ) + 3 j w L R}$
Frequency of Unity Gain (0 phase shift) (Inductor)	$ω_{0} = \frac{R}{L} \Rightarrow f_{0} = \frac{R}{2 π L}$

BJT Transistors
Common Emitter Forward Gain, $β$	$β = \frac{I _{C}}{I _{B}} = \frac{α}{1 - α}$
Common Base Forward current gain, $α$	$α = \frac{β}{β + 1}$
NPN Emitter Current	$I_{E} = I_{B} + I_{C} = \frac{I _{C}}{α} = I_{C} + \frac{I _{C}}{β} = (1 + β) I_{B}$
Emitter Voltage Rule of Thumb	$V_{E} \approx \frac{V _{cc}}{10}$
Thevenin Resistance Rule of Thumb	$R_{T H} = \frac{β R _{E}}{10}$
Four Resistor Bias Circuit $R_{T H}$	$R_{T H} = R_{1} // R_{2} = \frac{R _{1} R _{2}}{R _{1} + R _{2}}$
Four Resistor Bias Circuit $V_{T H}$	$V_{T H} = V_{CC} \frac{R _{2}}{R _{1} + R _{2}}$
Transconductance	$g_{m} = \frac{I _{CQ}}{V _{T}} ≊ 40 I_{CQ}$

AC BJT Analysis
Amplifier Topologies	$x$
Transistor Input Impedance	$r_{π} = \frac{β V _{T}}{I _{CQ}}$
Gain of Collector Follower (Common Emitter) AC	$A_{v} = \frac{V _{O}}{V _{in}} = \frac{- β R _{C}}{r _{π} + ( 1 + β ) R _{E}} \approx - \frac{R _{C}}{R _{E}}$
Input Impedance of Collector Follower (Common Emitter)	$R_{in} = r_{π} + (1 + β) R_{E}$
Output Impedance of Collector Follower (Common Emitter)	$R_{o u t} = R_{C}$
Emitter Follower (Common Collector)	$N / A$
Voltage Gain of Emitter Follower (Common Collector)	$A_{v} = \frac{R _{E} ( 1 + β )}{r _{π} + R _{E} ( 1 + β )}$
Current Gain of Emitter Follower (Common Collector)	$A_{i} = \frac{( 1 + β ) i _{b}}{i _{b}}$
Input Impedance of Emitter Follower (Common Collector)	$R_{in} = r_{π} + (1 + β) R_{E}$
Output Impedance of Emitter Follower (Common Collector)	$R_{o u t} = \frac{R _{E} ( r _{π} + R _{S} )}{( r _{π} + R _{S} ) + R _{E} ( 1 + β )}$
Output Impedance of Emitter Follower (Common Collector) Simple	$R_{o u t} = \frac{r _{π}}{( 1 + β )} \approx \frac{1}{g _{m}}$

MOSFETs DC
Stages	$V_{D S (s a t)} = V_{GS} - V_{TN}$
Linear Region Drain Current	$i_{D} = K_{n} [2 (V_{GS} - V_{TN}) - V_{D S}] V_{D S}$
Saturation Drain Current	$i_{D} = K_{n} (V_{GS} - V_{TN})^{2}$
Saturation Drain Current -> VGS	$V_{GS} = (\frac{i _{d}}{K _{n}})^{1/2} + V_{TN}$
Small Signal Model	$i_{D} = g_{m} \times V_{GS}$
Transconductance $g_{m}$	$g_{m} = \frac{2 I _{D Q}}{V _{GS} - V _{TN}}$
MOSFET Bias Network	$\sqrt{i_{D}} = \frac{- K _{n}^{- 1/2} \pm ( K _{n}^{- 1} - 4 R _{s} ( V _{TN} - V _{T H} ) ) ^{\frac{1}{2}}}{2 R _{s}}$
MOSFET input impedance	$R_{in} = \infty$

MOSFET Common Source
Overall Input Impedance	$R_{in} = R_{T H}$
Overall Output Impedance	$R_{o u t} = R_{D}$
Bypassed Gain	$A_{v} = \frac{- g _{m} R _{D} V _{GS}}{V _{GS}} = - g_{m} R_{D}$
Common Drain (Source Follower)	$z$
Output Impedance	$R_{o u t} = \frac{1}{g_{m}}$

Differential Amplifier
Quiescent Current of Long Tail Pair	$I_{Q} = I_{E 1} + I_{E 2}$
Biasing	$I_{Q} = \frac{( V _{EE} ) + V _{E}}{R _{EE}}$
Collector Voltage of Grounded Long Tail Pair	$V_{C_{1}} = V_{C_{2}} = V_{CC} - I_{C} R_{C}$
Differential Gain without ground	$A_{i d} = \frac{V _{o d}}{V _{d}} = g_{m} R_{C}$
Differential Gain - Single Ended	$\frac{V _{o}}{V _{d}} = \frac{g _{m} R _{C}}{2}$
Differential Input Resistance	$R_{i d} = 2 r_{π}$
Differential Output Resistance	$R_{o d} = 2 R_{C}$
Common Mode Gain	$A_{c m} = \frac{V _{o 1}}{V _{c m}} = \frac{- β R _{C}}{r _{π} + 2 ( 1 + β ) R _{EE}} \approx \frac{- R _{C}}{2 R _{EE}}$
Common Mode Input Resistance	$R_{i c} = \frac{r _{π}}{2} + R_{EE} (1 + β)$
CMRR - Common Mode Rejection Ratio	$CMRR = \frac{A _{i d}}{A _{c m}}$
Generalised Differential Amplifier Output	$V_{o u t} = A_{i d} V_{i d} + A_{c m} V_{c m}$

Impedance Laplace
Resistor	$Z = R$
Capacitor	$Z = \frac{1}{s C} = \frac{1}{j wC} = - \frac{1}{wC} j$
Inductor	$Z = s L$
Resistor Capacitor Series	$Z = R + \frac{1}{s C}$
Resistor Capacitor Parallel	$Z = \frac{R}{1 + s CR}$
Resistor Inductor Series	$Z = R + s L$
Resistor Inductor Parallel	$Z = \frac{s L R}{R + s L}$

Op-Amps
Non-inverting Gain	$A = 1 + \frac{R_{2}}{R_{1}}$
Inverting Gain	$A = - \frac{R_{2}}{R_{1}}$

Misc
Source Regulation	$\frac{Δ V _{L}}{Δ V _{i}}$
Load Regulation	$\frac{Δ V _{L}}{V _{L, expected}}$

Oscillators

Closed Loop Gain

$A_{c l} (jω) = \frac{v _{o}}{v _{i}} = \frac{A}{1 - A β ( jω )}$

$A_{c l}$ is the closed loop gain of the system.
$A$ is the open loop gain (with no feedback)
$β$ is the feedback fraction, that feeds back a portion of the output voltage back to the input

Loop Gain

$L (j w) = A β (j w)$ For oscillation, the loop gain must be unity with a phase angle of $0°$ , so it must be real, which means $β$ must also be real.

Frequency Potential Divider ( $s$ )

$\frac{V _{o}}{V _{i}} (s) = \frac{Z _{2}}{Z _{1} + Z _{2}} = \frac{s CR}{1 + 3 s CR + s ^{2} C ^{2} R ^{2}}$

Frequency Potential Divider ( $j w$ )

$\frac{V _{o}}{V _{i}} (j w) = \frac{jω CR}{1 - ω ^{2} C ^{2} R ^{2} + 3 jω CR}$

Frequency of Unity Gain (0 phase shift)

$ω_{0} = \frac{1}{RC} \Rightarrow f_{0} = \frac{1}{2 π RC}$
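Evaluating the frequency potential divider at $ω_{0}$ shows why this frequency matters: the imaginary part vanishes and the network attenuates by exactly one third, so an amplifier gain of 3 gives unity loop gain. A quick numerical check, using hypothetical values $R = 10 kΩ$ and $C = 16 nF$ :

```python
import math

def frequency_divider(w, r, c):
    """V_o / V_i = jwCR / (1 - w^2 C^2 R^2 + 3jwCR)."""
    x = w * c * r
    return 1j * x / (1 - x * x + 3j * x)

R, C = 10e3, 16e-9             # hypothetical component values
w_0 = 1 / (R * C)
h = frequency_divider(w_0, R, C)
print(f"f_0 = {w_0 / (2 * math.pi):.1f} Hz")
print(f"gain = {abs(h):.4f}, phase = {math.degrees(math.atan2(h.imag, h.real)):.2f} degrees")
```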

60 Degrees of phase shift in CR network

$ω_{60°} = \frac{1}{\sqrt{3} RC}$

Transfer function of CR Network

$\frac{v _{o}}{v _{i}} = \frac{jω RC}{1 + jω RC}$ $\frac{v _{o}}{v _{i}}$ = Gain of CR network

Transfer function of RC Network

$\frac{v _{o}}{v _{i}} = \frac{1}{1 + s CR} = \frac{1}{1 + j wCR}$ $\frac{v _{o}}{v _{i}}$ = Gain of RC network

Transfer function of Inverse Frequency potential divider ( $s$ )

$\frac{V _{o}}{V _{i}} (s) = \frac{Z _{2}}{Z _{1} + Z _{2}} = \frac{( 1 + s CR ) ^{2}}{1 + 3 s CR + s ^{2} C ^{2} R ^{2}}$

Transfer function of Inverse Frequency potential divider ( $j w$ )

$\frac{V _{o}}{V _{i}} (j w) = \frac{( 1 + j wCR ) ^{2}}{( 1 - w ^{2} C ^{2} R ^{2} ) + 3 j wCR}$

Transfer function of Frequency potential divider (Inductor) ( $s$ )

$\frac{V _{o}}{V _{i}} (s) = \frac{s L R}{( R ^{2} + s ^{2} L ^{2} ) + 3 s L R}$

Transfer function of Frequency potential divider (Inductor) ( $j w$ )

$\frac{V _{o}}{V _{i}} (j w) = \frac{j w L R}{( R ^{2} - w ^{2} L ^{2} ) + 3 j w L R}$

Frequency of Unity Gain (0 phase shift) (Inductor)

$ω_{0} = \frac{R}{L} \Rightarrow f_{0} = \frac{R}{2 π L}$

BJT Transistors

Common Emitter Forward Gain, $β$

$β = \frac{I _{C}}{I _{B}} = \frac{α}{1 - α}$

Common Base Forward current gain, $α$

$α = \frac{β}{β + 1}$

NPN Emitter Current

$I_{E} = I_{B} + I_{C} = \frac{I _{C}}{α} = I_{C} + \frac{I _{C}}{β} = (1 + β) I_{B}$

Emitter Voltage Rule of Thumb

$V_{E} \approx \frac{V _{cc}}{10}$

Thevenin Resistance Rule of Thumb

$R_{T H} = \frac{β R _{E}}{10}$

Four Resistor Bias Circuit $R_{T H}$

$R_{T H} = R_{1} // R_{2} = \frac{R _{1} R _{2}}{R _{1} + R _{2}}$

Four Resistor Bias Circuit $V_{T H}$

$V_{T H} = V_{CC} \frac{R _{2}}{R _{1} + R _{2}}$

Transconductance

$g_{m} = \frac{I _{CQ}}{V _{T}} ≊ 40 I_{CQ}$

AC BJT Analysis

Amplifier Topologies

$x$

Transistor Input Impedance

$r_{π} = \frac{β V _{T}}{I _{CQ}}$ Where $V_{T}$ = 25mV, $I_{CQ}$ = Collector current at Q point.

Gain of Collector Follower (Common Emitter) AC

$A_{v} = \frac{V _{O}}{V _{in}} = \frac{- β R _{C}}{r _{π} + ( 1 + β ) R _{E}} \approx - \frac{R _{C}}{R _{E}}$

Input Impedance of Collector Follower (Common Emitter)

$R_{in} = r_{π} + (1 + β) R_{E}$ Into the transistor

Output Impedance of Collector Follower (Common Emitter)

$R_{o u t} = R_{C}$ As current source has infinite impedance.

Emitter Follower (Common Collector)

$N / A$

High input, low output impedance
High current gain
So acts as impedance transformer and buffer

Voltage Gain of Emitter Follower (Common Collector)

$A_{v} = \frac{R _{E} ( 1 + β )}{r _{π} + R _{E} ( 1 + β )}$ $A_{v} \approx 1$ as $r_{π} << R_{E} (1 + β)$ , so the voltage gain is low and the circuit acts as a current amplifier instead.

Current Gain of Emitter Follower (Common Collector)

$A_{i} = \frac{( 1 + β ) i _{b}}{i _{b}}$

Input Impedance of Emitter Follower (Common Collector)

$R_{in} = r_{π} + (1 + β) R_{E}$

Output Impedance of Emitter Follower (Common Collector)

$R_{o u t} = \frac{R _{E} ( r _{π} + R _{S} )}{( r _{π} + R _{S} ) + R _{E} ( 1 + β )}$ Where $R_{S}$ = source input impedance

Output Impedance of Emitter Follower (Common Collector) Simple

$R_{o u t} = \frac{r _{π}}{( 1 + β )} \approx \frac{1}{g _{m}}$ Simplified form, taking the source impedance $R_{S}$ as negligible and $R_{E} (1 + β) >> r_{π}$ .

MOSFETs DC

No current through gate in MOSFET (as voltage controlled) (infinite input impedance) $i_{G} = 0 \to i_{D} = i_{S}$

Stages

$V_{D S (s a t)} = V_{GS} - V_{TN}$

Cut off (no current flows, $V_{GS} < V_{TN}$ )
Linear $V_{D S} < V_{D S (s a t)}$
Saturation $V_{D S} > V_{D S (s a t)}$

Where $V_{TN}$ = Threshold Voltage

Linear Region Drain Current

$i_{D} = K_{n} [2 (V_{GS} - V_{TN}) - V_{D S}] V_{D S}$

for $0 < V_{D S} < (V_{GS} - V_{TN})$ , where $K_{n}$ = transconductance constant

Saturation Drain Current

$i_{D} = K_{n} (V_{GS} - V_{TN})^{2}$

Saturation Drain Current -> VGS

$V_{GS} = (\frac{i _{d}}{K _{n}})^{1/2} + V_{TN}$

Small Signal Model

$i_{D} = g_{m} \times V_{GS}$

Transconductance $g_{m}$

$g_{m} = \frac{2 I _{D Q}}{V _{GS} - V _{TN}}$

MOSFET Bias Network

$\sqrt{i_{D}} = \frac{- K _{n}^{- 1/2} \pm ( K _{n}^{- 1} - 4 R _{s} ( V _{TN} - V _{T H} ) ) ^{\frac{1}{2}}}{2 R _{s}}$ Must check the two different values to see which ones are valid solutions. A valid solution requires:

$V_{GS} > V_{TN}$ and $V_{D S} > V_{D S (s a t)}$

MOSFET input impedance

$R_{in} = \infty$ As no current flows into gate

MOSFET Common Source

Similar to BJT common emitter amplifier

Overall Input Impedance

$R_{in} = R_{T H}$ As two gate bias resistors act as impedances to input signals. Therefore used over BJTs when high impedance required.

Is actually in parallel with the source (input) impedance $R_{s}$ if there is one.

Overall Output Impedance

$R_{o u t} = R_{D}$ What the load resistor sees.

As current source has infinite impedance, $R_{D}$ is the only impedance seen.

Unless there is an $R_{l o a d}$ which would be in parallel with $R_{D}$ .

Bypassed Gain

$A_{v} = \frac{- g _{m} R _{D} V _{GS}}{V _{GS}} = - g_{m} R_{D}$

Common Drain (Source Follower)

$z$

Output Impedance

$R_{o u t} = \frac{1}{g_{m}}$

Differential Amplifier

Long tail pair:

Modes: Can operate in two modes.

Differential (Amplifies difference between two input signals)
Common mode (Works similar to regular BJT amp)

Common Mode: Same signal is connected to both input terminals.

Ideal differential amp rejects common mode input, but not realistic
Defined by CMRR

Better amps have a high ratio of differential to common gain, AKA Common Mode Rejection Ratio (CMRR).

Quiescent Current of Long Tail Pair

$I_{Q} = I_{E 1} + I_{E 2}$ Current through shared emitter resistor, $R_{EE}$ .

Biasing

$I_{Q} = \frac{( V _{EE} ) + V _{E}}{R _{EE}}$ $V_{1}$ and $V_{2}$ are grounded, therefore collector voltages are the same.

Collector Voltage of Grounded Long Tail Pair

$V_{C_{1}} = V_{C_{2}} = V_{CC} - I_{C} R_{C}$ And for matched transistors, $V_{C_{1}} - V_{C_{2}} = 0$ .

Differential Gain without ground

$A_{i d} = \frac{V _{o d}}{V _{d}} = g_{m} R_{C}$ Not really used

Differential Gain - Single Ended

$\frac{V _{o}}{V _{d}} = \frac{g _{m} R _{C}}{2}$

Differential Input Resistance

$R_{i d} = 2 r_{π}$

Differential Output Resistance

$R_{o d} = 2 R_{C}$

Common Mode Gain

$A_{c m} = \frac{V _{o 1}}{V _{c m}} = \frac{- β R _{C}}{r _{π} + 2 ( 1 + β ) R _{EE}} \approx \frac{- R _{C}}{2 R _{EE}}$

Common Mode Input Resistance

$R_{i c} = \frac{r _{π}}{2} + R_{EE} (1 + β)$

CMRR - Common Mode Rejection Ratio

$CMRR = \frac{A _{i d}}{A _{c m}}$

Generalised Differential Amplifier Output

$V_{o u t} = A_{i d} V_{i d} + A_{c m} V_{c m}$ Both common mode and differential mode input signals are factored in.

Impedance Laplace

Resistor

$Z = R$

Capacitor

$Z = \frac{1}{s C} = \frac{1}{j wC} = - \frac{1}{wC} j$

Inductor

$Z = s L$

Resistor Capacitor Series

$Z = R + \frac{1}{s C}$

Resistor Capacitor Parallel

$Z = \frac{R}{1 + s CR}$

Resistor Inductor Series

$Z = R + s L$

Resistor Inductor Parallel

$Z = \frac{s L R}{R + s L}$

Op-Amps

Non-inverting Gain

$A = 1 + \frac{R_{2}}{R_{1}}$

Inverting Gain

$A = - \frac{R_{2}}{R_{1}}$

Active Filter Gain

$\frac{V _{o u t}}{V _{in}} = A (jω) = - \frac{Z _{2}}{Z _{1}}$

Active Filter Gain, Z2 = R2 || C

$A (jω) = - \frac{R _{2} / R _{1}}{1 + jω C R _{2}}$

Low Pass filter
Cutoff where the denominator equals $1 + j$ (ie $ω C R_{2} = 1$ ), at $\frac{1}{2 π R _{2} C}$ Hz

Misc

Source Regulation

Ratio of the change in load voltage to the change in input voltage $\frac{Δ V _{L}}{Δ V _{i}}$

Load Regulation

Ratio of the change in load voltage to the expected load voltage $\frac{Δ V _{L}}{V _{L, expected}}$

ES2C6

Control Systems

A control system contains processes with the purpose of obtaining a desired output given a specific input
For example, consider a lift which rises from the ground to fourth floor:
- Pressing the button is a step input
- The lift rising is a transient response
Two major performance measures
- Steady-state error
- Transient response

Open loop control system configurations have an input that feeds directly into an output
- Cannot compensate for any disturbance
Closed loop systems feed the output signal back into the controller by subtracting it from the input
- Error drives controller to make corrections

General closed loop feedback control:

To design control systems, a system model is often needed. There are two general approaches:

From first principles
- Uses known physical properties and laws (Newton's laws, Kirchhoff's laws, etc)
Data-driven
- Identifies the system based on data collected
Models usually take the form of a differential equation which describes the system's dynamics
Used for simulation, control design, reference tracking, disturbance rejection, etc

Transfer Functions

Transfer functions give a ratio of output to input for a system.

Consider an $n$ th order linear differential equation, where $c (t)$ is the output, $r (t)$ the input, and $a_{i}$ and $b_{i}$ are the model parameters:

$a_{n} \frac{d ^{n} c ( t )}{d t ^{n}} + a_{n - 1} \frac{d ^{n - 1} c ( t )}{d t ^{n - 1}} + ... + a_{0} c (t) = b_{m} \frac{d ^{m} r ( t )}{d t ^{m}} + b_{m - 1} \frac{d ^{m - 1} r ( t )}{d t ^{m - 1}} + ... + b_{0} r (t)$

Taking Laplace transforms (with zero initial conditions) and forming the ratio of output over input:

$G (s) = \frac{C ( s )}{R ( s )} = \frac{b _{m} s ^{m} + b _{m - 1} s ^{m - 1} + ... + b _{0}}{a _{n} s ^{n} + a _{n - 1} s ^{n - 1} + ... + a _{0}}$

The transfer function of multiple systems in cascade is the product of the individual transfer functions, $G (s) = G_{1} (s) G_{2} (s) G_{3} (s)$
Working with transfer functions is easier than working with ODEs, as they don't involve any differentials.

Example

Given the transfer function $G (s) = \frac{1}{s + 2}$ , find the response to a unit step input $r (t) = u (t)$ :

$C (s) = \frac{R ( s )}{s + 2}$

Transfer function of the step input $R (s) = \frac{1}{s}$ , so:

$C (s) = \frac{1}{s + 2} \frac{1}{s} = \frac{1}{2 s} - \frac{1}{2 ( s + 2 )}$

Taking inverse laplace transforms:

$c (t) = (\frac{1}{2} - \frac{1}{2} e^{- 2 t}) u (t)$
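
As a sanity check on the result above, the following is a minimal sketch that integrates the equivalent ODE $\dot{c} + 2 c = r$ numerically and compares it with the analytic answer. The function names, the forward Euler scheme and the step size are illustrative choices, not part of the module material.

```python
import math


def simulate_step_response(pole=2.0, t_end=1.0, dt=1e-4):
    """Integrate dc/dt = -pole * c + 1 (unit step input) with forward Euler."""
    c = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        c += dt * (-pole * c + 1.0)
    return c


def analytic_step_response(t):
    """c(t) = 1/2 - 1/2 e^{-2t}, from the partial fraction expansion."""
    return 0.5 - 0.5 * math.exp(-2.0 * t)


numeric = simulate_step_response(t_end=1.0)
exact = analytic_step_response(1.0)
print(f"numeric={numeric:.4f} exact={exact:.4f}")
```

The two values should agree to several decimal places, and both tend towards the final value of $1/2$ as $t$ grows.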

Modelling

Two approaches to modelling a system:

Physical modelling
Data-driven modelling

Models are developed so we can obtain transfer functions for further system analysis. Focusing on mainly how to build physical models of systems from first principles, there are three main steps:

Structuring the problem
- Intended use of the model
- Inputs/outputs
- Other parameters
- How do subsystems interact
- Draw a block diagram
Formulate the basic equations
- Describe relationships between variables
- Write down conservation laws
- Write down relevant relationships
Formulate the ODE
- Express time derivatives of relevant variables
- Express outputs as function of inputs

There are two main physical systems relevant to this module, electrical, and rotational mechanical. The properties of the main components of these systems are shown in the tables below:

Rotational Systems

In a rotational system, we are interested in the relationship between applied torque and angular displacement. The sum of the applied torque is the sum of the moments of all the components. For example, obtain the equations of motion for the system shown:

The system has an input torque at $θ_{1}$ , two inertias $J_{1}$ and $J_{2}$ , the two bearings act as dampers $D_{1}$ and $D_{2}$ , and the torsion acts as a spring $K$ :

For inertia $J_{1}$ :

$K θ_{1} (t) + D_{1} \dot{θ}_{1} (t) + J_{1} \ddot{θ}_{1} (t) - K θ_{2} (t) = T (t)$

And for $J_{2}$ :

$K θ_{2} (t) + D_{2} \dot{θ}_{2} (t) + J_{2} \ddot{θ}_{2} (t) - K θ_{1} (t) = 0$

Note that for both these equations the form is [sum of impedances connected to motion] - [sum of impedance between motions] = [sum of applied torque at motion]. This general form can be applied to any rotational (or electrical) modelling problem.

Electrical Systems

Obtain the voltage-current relationship of the following electrical system:

Using KVL for loop 1:

$L \frac{d I _{1} ( t )}{d t} + R_{1} I_{1} (t) - L \frac{d I _{2} ( t )}{d t} = V (t)$

And loop 2:

$L \frac{d I _{2} ( t )}{d t} + R_{2} I_{2} (t) + \frac{1}{C} \int_{0}^{t} I_{2} (τ) d τ - L \frac{d I _{1} ( t )}{d t} = V_{c} (t)$

Again, noting that the form of the equation is the same as rotational: [sum of impedances around loop] - [sum of impedance between loops] = [sum of applied voltage]

Block Diagram Algebra

A subsystem can be represented as a block with an input, output, and transfer function. Multiple blocks are connected to form systems, which involve summing junctions and pickoff points:

There are a few familiar forms that always pop up in block diagrams, that can be reduced down into simpler blocks:

Cascade Form

In a cascade form, each signal is the product of the input and the transfer function. The transfer functions of blocks in a cascade are multiplied to form a single function.

Parallel Form

In a parallel form, there is a single input, and the output is the sum of the outputs of all the subsystems.

Feedback Form

Feedback form is the most important form encountered in control systems:

This can be reduced to a single transfer function:

$\frac{C ( s )}{R ( s )} = \frac{G ( s )}{1 \pm G ( s ) H ( s )}$
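
The feedback reduction can be sketched with polynomial coefficient lists. This assumes negative feedback (the $+$ sign in the denominator), with $G = N_G / D_G$ and $H = N_H / D_H$ so that the closed loop is $N_G D_H / (D_G D_H + N_G N_H)$. The helper names are hypothetical, and coefficients are listed highest power first.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two polynomials, left-padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = [0.0] * (n - len(p)) + list(p)
    q = [0.0] * (n - len(q)) + list(q)
    return [a + b for a, b in zip(p, q)]


def negative_feedback(g_num, g_den, h_num, h_den):
    """Return (numerator, denominator) of G / (1 + G H)."""
    num = poly_mul(g_num, h_den)
    den = poly_add(poly_mul(g_den, h_den), poly_mul(g_num, h_num))
    return num, den


# G(s) = 1 / (s + 2) with unity feedback H(s) = 1
num, den = negative_feedback([1.0], [1.0, 2.0], [1.0], [1.0])
print(num, den)
```

For this example the closed loop transfer function is $\frac{1}{s + 3}$, so the feedback has moved the pole from $-2$ to $-3$.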

Other Identities

Moving left past a summing junction:

Moving right past a summing junction:

Moving left past a pickoff point:

Moving right past a pickoff point:

Example

The goal is to rearrange diagrams into familiar forms that can then be collapsed

Forming the equivalent parallel system:

Collapsing the cascade:

We now have a single transfer function that is the ratio of output/input for the entire system.

Poles and Zeros

A system can be analysed to obtain time response characteristics

Transient response is the initial response that takes place over a time before reaching steady state
Steady state response is the final response of the system after the transient has diminished

Consider the general form of the transfer function:

$G (s) = \frac{( s + z _{1} ) ( s + z _{2} ) \dots ( s + z _{n} )}{( s + p _{1} ) ( s + p _{2} ) \dots ( s + p _{n} )}$

Poles are the roots of the denominator
- The values of $s$ that make $G (s)$ infinite
Zeros are the roots of the numerator
- The values of $s$ that make $G (s)$ zero

As $s = jω + σ$ is a complex number, poles and zeros can be plotted on an argand diagram. If $G (s) = \frac{s + 2}{s + 5}$ , then the transfer function has a pole at $s = - 5$ and a zero at $s = - 2$ :

To further analyse this transfer function, we can give it a step input to analyse its step response. The output is now given by:

$C (s) = \frac{1}{s} \frac{s + 2}{s + 5} = \frac{2}{5 s} + \frac{3}{5 ( s + 5 )}$

$c (t) = \frac{2}{5} + \frac{3}{5} e^{- 5 t}$

This shows that:

The pole of the input function generates the form of the forced response (constant term)
- Step input has a pole at the origin, which generates a step function at the output
The pole of the transfer function generates the form of the natural response
- $s = - 5$ gave the form $e^{- 5 t}$
A pole on the real axis generates an exponential response of the form $e^{- α t}$ , where $- α$ is the pole location
- The farther to the left a pole is, the faster the transient decays
- Poles to the right of the imaginary axis will generate unstable responses
Zeros and poles generate amplitudes for both forced and natural responses

Stability

Stability is the most important system specification in control design
Unstable systems are useless
The definition of stability used here is that of a linear time invariant system
- Any that can be represented as a transfer function

The response of any system can be expressed as the sum of its forced and natural responses:

A system is stable if the natural response decays to zero as $t \to \infty$
A system is unstable if the natural response grows without bound ( $\to \infty$ ) as $t \to \infty$
A system is marginally stable if the response is constant or oscillatory

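The stability of a system is defined by the poles of its closed loop transfer function:

If all the poles have negative real parts (lie in the left half plane), the system is stable and the natural response decays exponentially
An unstable system has at least one pole with a positive real part (in the right half plane)
If a pole lies on the imaginary axis and no poles lie in the right half plane, the system is marginally stable and the response is constant or oscillatory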
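
The pole-location test can be sketched as a small classifier. This is a simplified illustration: the function name and tolerance are assumptions, and it ignores the case of repeated poles on the imaginary axis, which also gives an unstable response.

```python
def classify_stability(poles, tol=1e-9):
    """Classify an LTI system from the real parts of its closed loop poles."""
    real_parts = [complex(p).real for p in poles]
    if any(re > tol for re in real_parts):
        return "unstable"           # at least one pole in the right half plane
    if any(abs(re) <= tol for re in real_parts):
        return "marginally stable"  # pole(s) on the imaginary axis
    return "stable"                 # all poles in the left half plane


print(classify_stability([-5]))                # G(s) = (s + 2)/(s + 5)
print(classify_stability([-1 + 2j, -1 - 2j]))  # decaying oscillation
print(classify_stability([2j, -2j]))           # sustained oscillation
print(classify_stability([1, -3]))             # growing exponential
```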

Transient Response Characteristics

The output response of a system for any given input depends on its order. First and second order systems respond differently to the same input.

First Order

A first order system only has one pole. A general first order system with one pole and no zeros, subject to a unit step input:

$C (s) = R (s) G (s) = \frac{1}{s} \frac{a}{s + a} = \frac{a}{s (s + a)}$

$c (t) = c_{f} (t) + c_{n} (t) = 1 - e^{- a t}$

Note that there is only a single parameter, $a$ that describes the dynamics of this system.

When $t = 1/ a$ , then $c (t) \approx 0.63$
- This is the time constant, $τ$ of the system
- The time it takes for the step response to rise to 63% if it's final value
- The further the pole from the imaginary axis, the faster the transient response and the lower the time constant
Rise time $t_{r}$ is the time for the response to go from 10% to 90%
- $t_{r} \approx 2.2/ a$
Settling time $t_{s}$ is the time for the response to reach, and stay within, 2% of it's final value
$t_{s} \approx 4/ a$
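
A minimal sketch of these first order metrics, using the approximations quoted above. The function names are illustrative only.

```python
import math


def first_order_step(a, t):
    """Unit step response c(t) = 1 - e^{-a t} of G(s) = a / (s + a)."""
    return 1.0 - math.exp(-a * t)


def first_order_metrics(a):
    """Time constant, rise time and settling time for a pole at s = -a."""
    return {
        "time_constant": 1.0 / a,
        "rise_time": 2.2 / a,      # 10% to 90%, approximately ln(9) / a
        "settling_time": 4.0 / a,  # within 2% of the final value
    }


metrics = first_order_metrics(2.0)
print(metrics)
print(first_order_step(2.0, metrics["time_constant"]))  # about 0.63
```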

Often it is not possible to obtain the transfer function of a system analytically, so we can obtain a time constant and other system parameters from data/graphs. The graph below shows a first order step response:

$C (s) = \frac{1}{s} \frac{K a}{s + a}$

The final value of the response is 0.72, so the time constant is where the response reaches roughly $0.63 \times 0.72 = 0.45$ , which is at about 0.13s. Hence $a = \frac{1}{0.13} = 7.7$ . To find $K$ , we can use the final value theorem:

$\lim_{s \to 0} s C (s) = \lim_{s \to 0} s \frac{1}{s} \frac{K a}{s + a} = K = 0.72$

Second Order

A second order system exhibits a wider range of responses than first order. A change in parameter changes the shape of the response entirely. There are four kinds of 2nd order response:

Overdamped response has two poles $σ_{1}$ and $σ_{2}$ , both on the real axis, which exhibit the combined exponential response of the two poles.

$c (t) = K_{1} e^{σ_{1} t} + K_{2} e^{σ_{2} t}$

Underdamped response has a conjugate pair of complex poles $- σ \pm jω$ , with the real part exhibiting exponential response, and the imaginary part sinusoidal.

$c (t) = A e^{- σ t} cos (ω t - ϕ)$

Undamped response has two imaginary poles, $\pm jω$ , exhibiting purely sinusoidal response.

$c (t) = A cos (ω t - ϕ)$

Critically damped response has two repeated real poles, $- σ$ , so exhibits an exponential response, and an exponential response multiplied by time:

$c (t) = K_{1} e^{- σ t} + K_{2} t e^{- σ t}$

There are two other meaningful parameters of a 2nd order response:

Natural frequency $ω_{n}$ is the frequency of oscillation of the system with no damping
Damping ratio $ζ$ is the ratio of exponential decay frequency to natural frequency

A general 2nd order transfer function is given by:

$G (s) = \frac{ω _{n}^{2}}{s ^{2} + 2 ζ ω _{n} s + ω _{n}^{2}}$

The damping ratio $ζ$ determines the characteristics of the system response:

There are additional metrics that describe the response:

Settling time $t_{s} = \frac{4}{ζ ω _{n}}$
Peak time $t_{p}$ is the time required to reach the first or maximum peak of the response
Percentage overshoot % $OS$ is the amount that the response overshoots the steady state value at its peak, expressed as a percentage of the steady state value
Rise time cannot be trivially defined for a 2nd order system

$t_{p} = \frac{π}{ω _{n} \sqrt{1 - ζ ^{2}}}$

$OS = \frac{c ( t _{p} ) - 1}{1} \times 100 = exp (- \frac{ζ π}{\sqrt{1 - ζ ^{2}}}) \times 100$

The damping ratio can also be defined in terms of these parameters:

$ζ = - \frac{ln ( OS /100 )}{\sqrt{π ^{2} + ( ln ( OS /100 ) ) ^{2}}}$

Example

Find the damping ratio, natural frequency, damping characteristics, peak time, overshoot, settling time of:

$G (s) = \frac{100}{s ^{2} + 15 s + 100}$

$ω_{n} = \sqrt{100} = 10$

$ζ = \frac{15}{2 \times 10} = 0.75$

As $0 < ζ < 1$ , this is an underdamped system.

$t_{p} = \frac{π}{10 \sqrt{1 - 0.75 ^{2}}} = 0.475$

$t_{s} = \frac{4}{10 \times 0.75} = 0.533$

$OS = exp (- \frac{0.75 π}{\sqrt{1 - 0.75 ^{2}}}) \times 100 = 2.84 \%$
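
The example can be checked with a short script. This is a sketch that assumes a transfer function of the form $\frac{c}{s^{2} + b s + c}$ , with the usual classification of the response by damping ratio ($ζ = 0$ undamped, $0 < ζ < 1$ underdamped, $ζ = 1$ critically damped, $ζ > 1$ overdamped). The function name is hypothetical.

```python
import math


def second_order_metrics(b, c):
    """Metrics for G(s) = c / (s^2 + b s + c), where c = wn^2 and b = 2 zeta wn."""
    wn = math.sqrt(c)
    zeta = b / (2.0 * wn)
    if zeta == 0:
        kind = "undamped"
    elif zeta < 1:
        kind = "underdamped"
    elif zeta == 1:
        kind = "critically damped"
    else:
        kind = "overdamped"
    result = {"wn": wn, "zeta": zeta, "kind": kind}
    if 0 < zeta < 1:
        root = math.sqrt(1.0 - zeta ** 2)
        result["peak_time"] = math.pi / (wn * root)
        result["overshoot_percent"] = 100.0 * math.exp(-zeta * math.pi / root)
        result["settling_time"] = 4.0 / (zeta * wn)
    return result


m = second_order_metrics(15.0, 100.0)
print(m)
```

Peak time and overshoot are only computed for the underdamped case, since the other responses have no overshoot peak to measure.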

Steady State Response Characteristics

Steady state response is the final response of the system after the transient has diminished. The primary design focus with control systems is around reducing steady state error, the difference between the input and the output ( $e_{ss} = r (t) - c (t)$ ). In the graph below, output 1 has zero error, while output 2 has finite steady state error. It is possible for a system to have infinite steady state error if it continues to diverge from the input.

For three different kinds of test input, the corresponding steady state errors are given as

Step input:

$e_{step} (\infty) = \frac{1}{1 + lim _{s \to 0} G ( s )} = \frac{1}{1 + K _{p}}$

Ramp input:

$e_{ramp} (\infty) = \frac{1}{lim _{s \to 0} s G ( s )} = \frac{1}{K _{v}}$

Parabolic input:

$e_{parabola} (\infty) = \frac{1}{lim _{s \to 0} s ^{2} G ( s )} = \frac{1}{K _{a}}$

$K_{p}$ , $K_{v}$ , and $K_{a}$ are static error constants associated with different input types.

In order to achieve zero steady state error for a step input the denominator of $G (s)$ has to be 0 as $s \to 0$ , which is only possible if $n \geq 1$ in the equation below:

$G (s) = \frac{( s + z _{1} ) ( s + z _{2} ) ...}{s ^{n} ( s + p _{1} ) ( s + p _{2} ) ...}$

Meaning that there must be at least one pure integrator (multiple of $1/ s$ ) present in $G (s)$ . For ramp and parabolic input, the same applies for $n \geq 2$ and $n \geq 3$ .
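
As a short worked example (the transfer function here is chosen purely for illustration), take a unity feedback system with $G (s) = \frac{10}{s (s + 2)}$ , which has one pure integrator ( $n = 1$ ):

$K_{p} = lim _{s \to 0} G (s) = \infty \Rightarrow e_{step} (\infty) = \frac{1}{1 + K_{p}} = 0$

$K_{v} = lim _{s \to 0} s G (s) = \frac{10}{2} = 5 \Rightarrow e_{ramp} (\infty) = \frac{1}{K_{v}} = 0.2$

$K_{a} = lim _{s \to 0} s^{2} G (s) = 0 \Rightarrow e_{parabola} (\infty) = \frac{1}{K_{a}} = \infty$

One integrator is therefore enough to remove the step error, leaves a finite error for a ramp, and cannot track a parabola.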

PID Controllers

PID controllers are a control method that consists of a proportional, integral, and derivative of an error input $e (t)$ :

$u (t) = K_{p} e (t) + K_{i} \int e (t) d t + K_{d} \frac{d e ( t )}{d t}$

$\frac{U ( s )}{E ( s )} = K_{p} + \frac{K _{i}}{s} + s K_{d}$

PID controllers are widely used as they are robust, versatile, and easy to tune. The tuning parameters are the three constants, $K_{p}$ , $K_{i}$ , and $K_{d}$

Increasing the proportional term increases the output for the same level of error
- Causes the controller to react harder to errors so will react more quickly but overshoot more
- Reduces steady-state error
The inclusion of an integrator helps to eliminate steady-state error
- If there is a persistent error the integrator builds and increases the control signal to reduce the error
- Can make the system respond slower and be more oscillatory
The derivative term allows the controller to anticipate error
- The control signal can become large if the error is sloping steeply upwards, irrespective of its magnitude
- Adds damping to the system to decrease overshoot
- Does not affect steady-state error

	Rise time	Overshoot	Settling time	Steady-state error
$K_{p}$	Decrease	Increase	Small change	Decrease
$K_{i}$	Decrease	Increase	Increase	Decrease
$K_{d}$	Small change	Decrease	Decrease	No change
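
The effect of the integral term on steady-state error can be illustrated with a discrete-time sketch. Everything here is an assumption made for the example: the `PID` class, the rectangular integration, the backward-difference derivative, and the first order plant $G (s) = \frac{1}{s + 2}$ simulated with forward Euler.

```python
class PID:
    """Minimal discrete PID: u = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, error):
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_closed_loop(controller, setpoint=1.0, pole=2.0, t_end=10.0):
    """Simulate the plant dc/dt = -pole*c + u under feedback; return final output."""
    c = 0.0
    for _ in range(int(round(t_end / controller.dt))):
        u = controller.update(setpoint - c)
        c += controller.dt * (-pole * c + u)
    return c


p_only = run_closed_loop(PID(kp=8.0, ki=0.0, kd=0.0, dt=1e-3))
with_integral = run_closed_loop(PID(kp=8.0, ki=10.0, kd=0.0, dt=1e-3))
print(f"P only: {p_only:.3f}, PI: {with_integral:.3f}")
```

With proportional control alone the output settles at 0.8, leaving a steady-state error of 0.2, while adding the integrator drives the output to the setpoint of 1.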

PID Tuning

Tuning a PID controller can be done easily if a model of the system can be derived, as then analytical techniques can be applied to determine the ideal parameters. If a model cannot be obtained, then an experimental approach is required. The Ziegler-Nichols method is one common approach. The three constants are determined based upon the transient response characteristics of a given system, and there are two different methods, both aiming to give less than 25% overshoot.

Note that for the Ziegler-Nichols method, integral and derivative times $T_{i}$ and $T_{d}$ are used rather than gains, where $K_{i} = K_{p} / T_{i}$ and $K_{d} = K_{p} T_{d}$

The first method involves experimentally obtaining the response to a unit step input
If the system involves neither an integrator nor dominant complex poles, then the output will look like an s-shaped curve
- This is when this method applies
- If this method doesn't apply, the system likely has a built in integrator, and the 2nd method is needed
The curve is characterised by two parameters, the delay time $L$ and time constant $T$ :

The transfer function can then be approximated by:

$\frac{C ( s )}{U ( s )} = \frac{K e ^{- L s}}{T s + 1}$

And the PID constants are set according to the following:

Controller type	$K_{p}$	$T_{i}$	$T_{d}$
P	$T / L$	$\infty$	0
PI	$0.9 T / L$	$L /0.3$	0
PID	$1.2 T / L$	$2 L$	$0.5 L$

For the second method:

Set $T_{i} = \infty$ and $T_{d} = 0$ . Using $K_{p}$ only
Increase the constant to a critical value at which the output exhibits sustained oscillation
- If this does not happen for any $K_{p}$ , this method is not applicable
The critical gain, $K_{cr}$ and corresponding critical oscillation period $P_{cr}$ are experimentally determined
These are then used to set the other constants as per the following:

Controller type	$K_{p}$	$T_{i}$	$T_{d}$
P	$0.5 K_{cr}$	$\infty$	0
PI	$0.45 K_{cr}$	$0.83 P_{cr}$	0
PID	$0.6 K_{cr}$	$0.5 P_{cr}$	$0.125 P_{cr}$
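
Both tables can be encoded as lookups that return the gains directly. This is a sketch with hypothetical function names; it converts $T_{i}$ and $T_{d}$ into $K_{i} = K_{p} / T_{i}$ and $K_{d} = K_{p} T_{d}$ , treating $T_{i} = \infty$ as $K_{i} = 0$ .

```python
import math


def to_gains(kp, ti, td):
    """Convert (Kp, Ti, Td) into (Kp, Ki, Kd)."""
    ki = 0.0 if math.isinf(ti) else kp / ti
    return kp, ki, kp * td


def zn_step_response(T, L, kind="PID"):
    """First method: delay time L and time constant T from the s-shaped curve."""
    rows = {
        "P": (T / L, math.inf, 0.0),
        "PI": (0.9 * T / L, L / 0.3, 0.0),
        "PID": (1.2 * T / L, 2.0 * L, 0.5 * L),
    }
    return to_gains(*rows[kind])


def zn_critical_gain(K_cr, P_cr, kind="PID"):
    """Second method: critical gain K_cr and oscillation period P_cr."""
    rows = {
        "P": (0.5 * K_cr, math.inf, 0.0),
        "PI": (0.45 * K_cr, 0.83 * P_cr, 0.0),
        "PID": (0.6 * K_cr, 0.5 * P_cr, 0.125 * P_cr),
    }
    return to_gains(*rows[kind])


print(zn_step_response(T=2.0, L=0.5))
print(zn_critical_gain(K_cr=10.0, P_cr=2.0))
```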

Sometimes further tuning is required beyond these two methods to fine-tune the parameters to gain a response suitable to the application.

Drive Systems

Rotary Systems

A rotary system is a system in which the load is rotating
A direct drive system is one in which the motor is directly driving a load through a shaft
- No other transmission system other than the shaft
- All components have the same angular velocity
Inertia is the rotary equivalent of mass
Torque is the rotary equivalent of force.

System parameters:

$J_{M}$ = motor inertia, kgm $^{2}$
$J_{L}$ = load inertia, kgm $^{2}$
$T_{L}$ = torque load, Nm
$T_{M}$ = motor torque, Nm
$B$ = shaft damping, Nm/rad/s
$\dot{θ}$ = angular velocity, rad/s

The system equation for how much torque the motor must provide is:

$(J_{M} + J_{L}) \ddot{θ} = T_{M} - B \dot{θ} - T_{L}$

$T_{M} = (J_{M} + J_{L}) \ddot{θ} + B \dot{θ} + T_{L}$

The system's total moment of inertia $(J_{M} + J_{L})$ is the sum of the inertias in the transmission system and load referred to the motor shaft, plus the inertia of the motor.

The inertias here can be summed as they have the same angular velocity
The load will accelerate or decelerate depending on whether the applied torque is greater than or less than the required driving torque
For an accelerating system, the motor must overcome the torque load, frictional forces, and the total inertia of the system
For a decelerating system, the frictional forces and torque load work to slow system down, but system inertia must still be overcome

Example

Using the same system shown above with parameters:

$J_{M} = 2 \times 1 0^{- 3}$
$J_{L} = 1 \times 1 0^{- 2}$
$T_{L} = 0.5$
$B = 1 \times 1 0^{- 2}$

$T_{M} = (J_{M} + J_{L}) \ddot{θ} + B \dot{θ} + T_{L} = (2 \times 1 0^{- 3} + 1 \times 1 0^{- 2}) \ddot{θ} + 1 0^{- 2} \dot{θ} + 0.5$

To rotate the load from stationary to 20 rad/s, at an acceleration of 10 rad/s $^{2}$ , the torque delivered is:

$T_{M} = 0.62$ Nm at $t = 0$
$T_{M} = 0.82$ Nm at $t = 2$
$T_{M} = 0.7$ Nm at $t > 2$
It can be seen that, for this motion trajectory, the maximum torque was required at the end of the acceleration phase, when the system was still accelerating but had just reached its final velocity
Decreasing the acceleration will reduce the maximum torque requirement, which will reduce load on the motor
More torque is required to accelerate a load than decelerate it due to friction
If there is a torque load remaining when the load is stationary, the motor must compensate for this
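
The three torque values in this example can be reproduced with a few lines. This is a sketch using the parameters listed above as defaults; the function name is illustrative.

```python
def motor_torque(accel, velocity, J_M=2e-3, J_L=1e-2, B=1e-2, T_L=0.5):
    """T_M = (J_M + J_L) * accel + B * velocity + T_L for a direct drive system."""
    return (J_M + J_L) * accel + B * velocity + T_L


# Accelerate at 10 rad/s^2 from rest to 20 rad/s (reached at t = 2 s), then hold.
start = motor_torque(accel=10.0, velocity=0.0)              # t = 0
end_of_ramp = motor_torque(accel=10.0, velocity=20.0)       # t = 2
constant_speed = motor_torque(accel=0.0, velocity=20.0)     # t > 2
print(start, end_of_ramp, constant_speed)
```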

Moments of Inertia

An object's moment of inertia is determined by its shape and the axis about which it rotates. For a point mass the moment of inertia $J = m r^{2}$ , where $m$ is the mass and $r$ the perpendicular distance from the mass to the axis. Infinite infinitesimally small masses can be considered to calculate the moment of inertia of an entire body through integration.

Fortunately, this is rarely needed as the inertias of common shapes through all 3 axes are given:

For rotation about an axis other than one through the centre of gravity, the parallel axis theorem can be used. The parallel axis theorem states that the moment of inertia about any axis $J$ is equal to the moment of inertia about a parallel axis through the centre of gravity $J_{G}$ , plus the mass of the body $m$ times the square of the distance between the two axes $d^{2}$ :

$J = J_{G} + m d^{2}$

Example 1

The body shown is modelled as two 30kg spheres with radii 0.1m, connected with a slender rod of length 1m and mass 10kg. The whole body rotates about the $z$ axis, shown. Calculate the total inertia.

First the moment of inertia of the rod about the $z$ axis:

$J_{r, z} = \frac{1}{12} m_{r} d_{r}^{2} = \frac{10}{12} = \frac{5}{6}$

The moment of inertia of the spheres requires the parallel axis theorem:

$J_{z, s} = \frac{2}{5} m_{s} R_{s}^{2} + m_{s} d^{2} = \frac{2}{5} (30) (0.1)^{2} + (30) (0.6)^{2} = 10.92$

Total inertia:

$J_{z, t} = 2 J_{z, s} + J_{r, z} = \frac{5}{6} + 2 \times 10.92 = 22.67$
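
The same calculation as a sketch in code, using the standard results for a slender rod about its centre and a solid sphere about its centre. The helper names are illustrative, and the sphere offset of 0.6 m is half the rod length plus the sphere radius, as used above.

```python
def parallel_axis(J_G, mass, distance):
    """J = J_G + m d^2 for an axis parallel to one through the centre of gravity."""
    return J_G + mass * distance ** 2


def slender_rod_about_centre(mass, length):
    return mass * length ** 2 / 12.0


def solid_sphere_about_centre(mass, radius):
    return 0.4 * mass * radius ** 2


J_rod = slender_rod_about_centre(mass=10.0, length=1.0)
J_sphere = parallel_axis(solid_sphere_about_centre(30.0, 0.1), mass=30.0, distance=0.6)
J_total = J_rod + 2.0 * J_sphere
print(f"J_rod={J_rod:.3f} J_sphere={J_sphere:.2f} J_total={J_total:.2f}")
```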

Example 2

Derive an equation for motor torque in the system below

$J_{M} = 2 \times 1 0^{- 3}$
$J_{L} = 1 0^{- 2}$
$J_{S} = 1 0^{- 3}$
$T_{L} = 0.5$
$B = 1 0^{- 2}$
Rod length $l_{r} = 0.5$
Rod mass $m_{r} = 0.2$
Encoder radius $r_{e} = 0.1$
Encoder length $l_{e} = 0.05$
Encoder mass $m_{e} = 0.1$
Shaft radius $r_{s} = 0.15$

Inertia of encoder:

$J_{e} = \frac{m _{e} r _{e}^{2}}{2} = \frac{0.1 \times 0. 1 ^{2}}{2} = 5 \times 1 0^{- 4}$

Inertia of rod using the parallel axis theorem, with the axis through its centre of mass halfway up its length parallel to the shaft:

$J_{r} = J_{G} + m d^{2} = \frac{m _{r} l _{r}^{2}}{12} + m_{r} (r_{s} + \frac{l _{r}}{2})^{2} = \frac{0.2 \times 0.5 ^{2}}{12} + 0.2 (0.15 + \frac{0.5}{2})^{2} \approx 3.6 \times 1 0^{- 2}$

The total inertia:

$J_{T} = J_{M} + J_{L} + J_{S} + J_{e} + J_{r} = 4.95 \times 1 0^{- 2}$

Deriving the equation for motor torque and then substituting in:

$T_{m} = J \ddot{θ} + B \dot{θ} + T_{L} = (4.95 \times 1 0^{- 2}) \ddot{θ} + (1 0^{- 2}) \dot{θ} + 0.5$

Geared Rotary Systems

Connecting a load to a motor via a gearbox allows a motor to drive higher torque loads, at the expense of reducing the angular velocity (or vice versa). Analysis of such systems is more complex as there are different velocities involved. Systems can be reduced to an equivalent direct drive system by referring torques across the gearbox.

Assuming a gearbox is 100% efficient, input and output power $P = T ω$ are the same
Angular velocity is decreased and torque increased by a factor of $N$
- If $N < 1$ , the inverse happens
The gear ratio $N$ is defined as the number of teeth on output gear $N_{2}$ over the number of teeth on the input gear $N_{1}$
- $N = N_{2} / N_{1}$
The sign of the output is determined by the structure of the gearbox, two gears will rotate in opposite directions
- Three gears in chain will rotate in the same direction

$T_{out} = \pm N T_{in}$

$\ddot{θ}_{out} = \pm \frac{\ddot{θ}_{in}}{N}$

$\dot{θ}_{out} = \pm \frac{\dot{θ}_{in}}{N}$

In general, terms reflected across a gear system are:

$J_{ref} = \frac{J}{N ^{2}}$

$B_{ref} = \frac{B}{N ^{2}}$

$T_{ref} = \frac{T}{N}$

Example 1

A geared rotary system is shown below. Derive an equation for the torque delivered by the motor

$J_{M} = 2 \times 1 0^{- 3}$
$J_{L} = 1 0^{- 2}$
$T_{L} = 0.5$
$B_{1} = 1 0^{- 3}$
$B_{2} = 1 0^{- 2}$
$N_{1} = 5$
$N_{2} = 20$

Total inertia is the motor inertia plus the load inertia reflected across the gearbox

$J_{t} = J_{M} + \frac{J _{L}}{N ^{2}} = 2 \times 1 0^{- 3} + \frac{1 0 ^{- 2}}{1 0 ^{2}} = 2.1 \times 1 0^{- 3}$

Reflecting the damping and torque load too:

$B_{t} = B_{1} + \frac{B _{2}}{N ^{2}} = 1 0^{- 3} + \frac{1 0 ^{- 2}}{1 0 ^{2}} = 1.1 \times 1 0^{- 3}$

$T_{L r} = \frac{T _{L}}{N} = \frac{0.5}{10} = 5 \times 1 0^{- 2}$

Final equation:

$T_{m} = J_{t} \ddot{θ} + B_{t} \dot{θ} + T_{L r} = (2.1 \times 1 0^{- 3}) \ddot{θ} + (1.1 \times 1 0^{- 3}) \dot{θ} + 5 \times 1 0^{- 2}$

Gear Ratios

The chosen gear ratio affects the behaviour of the system, so the gear ratio is an important design choice. Minimising the peak torque requirement of the motor is important and can be done through the gear ratio.

In the example above, the peak torque when accelerating to 20 rad/s at 10 rad/s $^{2}$ is 0.48 Nm. This is still less than direct drive, but there is an optimal gear ratio that minimises the strain on the motor. Through differentiation, this is found to be:

$N^{*} = \sqrt{\frac{J _{L}}{J _{M}}}$

The minimum torque in a geared assembly with no torque load is achieved when the reflected load inertia is equal to the motor inertia. There are a few reasons why this may not be achieved, however:

The $J_{M}$ term also must include additional components such as encoders, couplings, etc, each of which require energy input which is not then available to the load
The gears also have an inertia which represents a loss factor as torque is required to turn these
Off the shelf gears come in finite configurations so there may not be available components which match the theoretical optimum

Gear ratios may also be optimised to reduce the angular velocity and power of the motor, which may be a more desirable outcome. In practice, either acceleration or torque will be optimised for, or a compromise between the two must be made.

Example 2

For the geared system with a torque load shown below, find the gear ratio that minimises the torque delivered by the motor

$J_{L} = 3 \times 1 0^{- 1}$
$J_{M} = 2.4 \times 1 0^{- 4}$
$\dot{θ}_{L max} = 25$
$t_{acc} = 0.1$
$T_{L} = 100$

The motor torque is the acceleration times total inertia, plus referred torque load

$T_{M} = \ddot{θ}_{L} N (J_{M} + \frac{J _{L}}{N ^{2}}) + \frac{T _{L}}{N}$

Rearranging for acceleration:

$\ddot{θ}_{L} = \frac{T _{M} - T _{L} / N}{N ( J _{M} + \frac{J _{L}}{N ^{2}} )}$

The addition of a constant torque load changes the optimal gear ratio, which is now given by:

$\frac{\partial T _{M}}{\partial N} = \frac{\partial}{\partial N} (\ddot{θ}_{L} N (J_{M} + \frac{J _{L}}{N ^{2}}) + \frac{T _{L}}{N}) = 0$

$N^{*} = \sqrt{\frac{\ddot{θ}_{L} J _{L} + T _{L}}{\ddot{θ}_{L} J _{M}}}$

This is the optimal gear ratio for a geared rotary system with a constant torque load

$\ddot{θ}_{L} = \frac{\dot{θ}_{L max}}{t _{acc}} = \frac{25}{0.1} = 250$

$N^{*} = \sqrt{\frac{250 \times 3 \times 1 0 ^{- 1} + 100}{250 \times 2.4 \times 1 0 ^{- 4}}} = 54$

$T_{M}^{*} = \ddot{θ}_{L} N^{*} (J_{M} + \frac{J _{L}}{( N ^{*} ) ^{2}}) + \frac{T _{L}}{N ^{*}} = 250 \times 54 (2.4 \times 1 0^{- 4} + \frac{3 \times 1 0 ^{- 1}}{5 4 ^{2}}) + \frac{100}{54} = 6.48$
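
A sketch that reproduces these numbers, taking the optimum as $N^{*} = \sqrt{(\ddot{θ}_{L} J_{L} + T_{L}) / (\ddot{θ}_{L} J_{M})}$ , which is the form that gives the quoted value of 54. The function names are illustrative.

```python
import math


def geared_motor_torque(N, accel_load, J_M, J_L, T_L):
    """T_M = accel_load * N * (J_M + J_L / N^2) + T_L / N."""
    return accel_load * N * (J_M + J_L / N ** 2) + T_L / N


def optimal_ratio(accel_load, J_M, J_L, T_L):
    """Gear ratio that minimises motor torque with a constant torque load."""
    return math.sqrt((accel_load * J_L + T_L) / (accel_load * J_M))


accel = 25.0 / 0.1  # peak load velocity over acceleration time
params = dict(accel_load=accel, J_M=2.4e-4, J_L=3e-1, T_L=100.0)

N_opt = optimal_ratio(**params)
T_opt = geared_motor_torque(N_opt, **params)
print(f"N* = {N_opt:.1f}, T_M* = {T_opt:.2f} Nm")

# Ratios either side of the optimum need more motor torque.
print(geared_motor_torque(40.0, **params), geared_motor_torque(70.0, **params))
```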

Torque Loads

There are 4 main types of torque loads:

Windage torque $T_{w}$
- Motor is driving a component that moves fluid such as a propeller in water, or a fan
- Torque load is proportional to square of the speed of the motion
  - $T_{w} = k \dot{θ}^{2}$
- Can be useful, such as a fan
- Can be considered a loss, such as a motor doing work to move air when it should be rotating a shaft
Electromagnetic torque
- Exists in motors because that's how motors work
- If the same machine is being driven mechanically to generate electricity, ie in a generator, electromagnetic torque must be overcome
- $K_{i} = J \ddot{θ} + B \dot{θ} + T_{L}$ is the EM torque generated from input electrical energy
Resistive torque
- Any mechanical resistance to the torque, such as overcoming gravity by lifting a mass with a pulley
- Any resistive force seen by the motor as torque
Frictional torque
- Any two moving surfaces in contact
- Two models of friction
  - Coulomb friction
    - Constant, independent of velocity
    - Coefficient $μ_{c}$ multiplied by the sign of the velocity such that it always resists motion
    - $T_{c} = μ_{c} sign (\dot{θ})$
  - Viscous friction
    - Coefficient multiplied by velocity
    - $T_{v} = μ_{v} \dot{θ}$
- Both models summed to give a more accurate frictional model
- Values of constants can be found experimentally

Motion Profiles

Most rotary and linear systems can be categorised as either:

Incremental Motion
- Repetitive motion between two positions
- Time and distance are important
- Velocity is secondary
- For example, pick and place
- A conveyor belt that has stop/start behaviour
Constant Motion
- Velocity and distance are more important
- A machining operation such as CNC milling
- A conveyor belt that reaches a fixed velocity and keeps going

There are four types of motion profiles:

Triangular
Trapezoidal (the only examinable one)
Cosine
Polynomial

They are defined by:

Acceleration time $t_{acc}$
The time spent at constant velocity, slew time $t_{s}$
Deceleration time $t_{dec}$
Total motoring time $t_{M}$

The beginning and end of the motion are denoted $t_{a}$ and $t_{d}$ , while $t_{b}$ and $t_{c}$ are the beginning and end of the slew time.

Additionally, $k$ is a value which is defined as the fraction of the total runtime for which velocity is constant:

$k = \frac{t _{s}}{t _{M}} = \frac{t _{M} - t _{acc} - t _{dec}}{t _{M}}$

Trapezoidal Motion

We want to define the acceleration, velocity, and position in the three distinct time periods: accelerating $[t_{a}, t_{b}]$ , constant velocity $(t_{b}, t_{c}]$ , and decelerating $(t_{c}, t_{d}]$ .

$θ (t) = \begin{cases} \frac{1}{2} α t^{2} & t \in [t_{a}, t_{b}] \\ \frac{1}{2} α t_{b}^{2} + α t_{b} (t - t_{b}) & t \in (t_{b}, t_{c}] \\ \frac{1}{2} α t_{b}^{2} + α t_{b} (t_{c} - t_{b}) + α t_{b} (t - t_{c}) + α t_{c} (t - t_{c}) - \frac{1}{2} α (t^{2} - t_{c}^{2}) & t \in (t_{c}, t_{d}] \end{cases}$

$\dot{θ} (t) = \begin{cases} α t & t \in [t_{a}, t_{b}] \\ α t_{b} & t \in (t_{b}, t_{c}] \\ α t_{b} - α (t - t_{c}) & t \in (t_{c}, t_{d}] \end{cases}$

$\ddot{θ} (t) = \begin{cases} α & t \in [t_{a}, t_{b}] \\ 0 & t \in (t_{b}, t_{c}] \\ - α & t \in (t_{c}, t_{d}] \end{cases}$

We can also define the time periods with respect to $k$ :

$t_{a} = 0$

$t_{b} = t_{a} + \frac{1}{2} (t_{M} - k t_{M}) = t_{a cc}$

$t_{c} = t_{b} + k t_{M}$

$t_{d} = t_{c} + t_{b} = t_{M}$

And the max velocity/acceleration, where $L$ is the total angular displacement:

$\dot{θ}_{max} = \frac{2 L}{( 1 + k ) t _{M}}$

$\ddot{θ}_{max} = \pm (\frac{4}{1 - k ^{2}}) (\frac{L}{t _{M}^{2}}) = \pm α$
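
The trapezoidal profile can be sketched as a pair of functions built from the total displacement $L$ , motoring time $t_{M}$ and slew fraction $k$ . This assumes $t_{a} = 0$ , equal acceleration and deceleration times, and $0 \leq k < 1$ ; the function names are illustrative.

```python
def trapezoidal_profile(L, t_M, k):
    """Return (position, velocity, alpha, t_b, t_c) for a symmetric trapezoid."""
    t_b = 0.5 * (t_M - k * t_M)                    # end of acceleration
    t_c = t_b + k * t_M                            # end of slew
    alpha = 4.0 * L / ((1.0 - k ** 2) * t_M ** 2)  # acceleration magnitude

    def velocity(t):
        if t <= t_b:
            return alpha * t
        if t <= t_c:
            return alpha * t_b
        return alpha * t_b - alpha * (t - t_c)

    def position(t):
        if t <= t_b:
            return 0.5 * alpha * t ** 2
        if t <= t_c:
            return 0.5 * alpha * t_b ** 2 + alpha * t_b * (t - t_b)
        dt = t - t_c
        return (0.5 * alpha * t_b ** 2 + alpha * t_b * (t_c - t_b)
                + alpha * t_b * dt - 0.5 * alpha * dt ** 2)

    return position, velocity, alpha, t_b, t_c


position, velocity, alpha, t_b, t_c = trapezoidal_profile(L=10.0, t_M=2.0, k=0.5)
print(position(2.0), velocity(1.0), alpha)
```

The deceleration branch is written in terms of $t - t_{c}$ , which is algebraically the same as the expanded expression in the piecewise definition.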

Gearboxes

Rotary transmission systems (gearboxes) use multiple gears compounded together with intermediate shafts

Driven gears are rotated by another gear
Driver gears are rotated by a shaft
Used where higher gear ratios are needed

The gear ratio for compound gears like this:

Gears 2,4,6 are driven gears
Gears 1,3,5 are driver gears

$N = \frac{ω _{i}}{ω _{o}} = \frac{\text{product of teeth on driven gears}}{\text{product of teeth on driver gears}}$

Worm and Wheel

A worm and wheel gearbox changes the axis of rotation and provides a high gear ratio

The worm drives the wheel
- Wheel cannot drive worm
The lead is the distance the worm moves forward in one revolution
- $L = N_{1} p_{a}$
- $N_{1}$ is the number of teeth on the worm, $p_{a}$ is the axial pitch in metres
The axial pitch is the distance between each thread on the worm gear
A worm with one tooth is single start, two teeth double start, three teeth triple start
Gear ratio $N$ is wheel teeth / worm teeth
To drive the gearbox backwards, $tan λ > μ$
- $μ$ is coefficient of friction
- $λ$ is the lead angle, formed by the triangle between the lead of the worm $L$ and its circumference $π d_{w}$
  - $d_{w}$ is diameter of the worm gear
In most applications $tan λ \ll μ$ , so the gearbox cannot be driven backwards

Planetary Gearbox

A planetary gearbox is a co-axial gearbox, used in high-torque low-speed applications. It is cheap, compact, and efficient.

Four main components
- Sun gear in the centre connected to one shaft
- Carrier connected to another shaft
  - That fidget spinner-looking bit in the picture
- Outer ring
- Multiple planet gears connected to the carrier
Relationship between input and output torque depends on which components are fixed in place

$\frac{ω _{sun} - ω _{carrier}}{ω _{ring} - ω _{carrier}} = - \frac{N _{ring}}{N _{sun}} = - \frac{\text{teeth on ring}}{\text{teeth on sun}}$

One of these velocities will always be zero, so the relationships are given below between velocities and torques for different fixed components

Choosing a Gearbox

An appropriate gearbox should be chosen based on velocities and torques in the system:

Max intermittent and continuous velocities
Max intermittent and continuous torques
Gear ratio
Radial and axial loads

Equivalent torque $T_{e q u i v}$ is found based upon the motion profile and average torques:

$\bar{T}$ is average torque in a time period
$\bar{ω}$ is average velocity in a time period
$x$ is a constant depending upon the gear construction, usually between 0.3 and 10
- Always use 5 here

$T_{equiv} = \sqrt[x]{\frac{\bar{ω}_{acc} t_{acc} \bar{T}_{acc}^{x} + ω_{s} t_{s} \bar{T}_{s}^{x} + \bar{ω}_{dec} t_{dec} \bar{T}_{dec}^{x}}{\bar{ω}_{acc} t_{acc} + ω_{s} t_{s} + \bar{ω}_{dec} t_{dec}}}$

Mean velocity is also required:

$ω_{avg} = \frac{\bar{ω}_{acc} t_{acc} + ω_{s} t_{s} + \bar{ω}_{dec} t_{dec}}{t_{acc} + t_{s} + t_{dec}}$

The selection process for an appropriate gearbox is as follows:

Choose a gearbox whose maximum continuous torque (rated torque) is larger than $T_{equiv}$
Ensure max intermittent torque is greater than max torque load (torque at end of $t_{acc}$ )
Divide the max gearbox speed by $ω_{s}$ to determine maximum possible gear ratio
Select a standard gear ratio $N$ below this value
Input mean velocity is $N \times ω_{a vg}$
Input peak velocity is $N \times ω_{s}$
If either of these exceed gearbox velocity ratings, select a lower gear ratio and try again
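
The two quantities needed for the selection process can be sketched in a few lines of Python. This is only an illustration: the function names and the example motion profile are made up, and it assumes the equivalent torque is the $x$ -th root of the weighted mean of $\bar{T}^{x}$ , with $x = 5$ as suggested above.

```python
def equivalent_torque(segments, x=5.0):
    """Equivalent torque for a motion profile.

    segments is a list of (mean_speed, duration, mean_torque) tuples,
    one per phase of the profile (acceleration, constant speed, deceleration).
    """
    numerator = sum(w * t * abs(T) ** x for w, t, T in segments)
    denominator = sum(w * t for w, t, _ in segments)
    return (numerator / denominator) ** (1.0 / x)


def mean_speed(segments):
    """Time-weighted mean output speed over the whole profile."""
    return sum(w * t for w, t, _ in segments) / sum(t for _, t, _ in segments)


def max_gear_ratio(gearbox_max_speed, w_s):
    """Largest usable ratio: max gearbox input speed over the steady output speed."""
    return gearbox_max_speed / w_s


# Hypothetical trapezoidal profile: (mean speed in rad/s, time in s, mean torque in Nm)
profile = [(5.0, 0.2, 12.0), (10.0, 1.0, 4.0), (5.0, 0.2, 8.0)]
T_equiv = equivalent_torque(profile)
w_avg = mean_speed(profile)
```

A gearbox would then be picked with rated torque above `T_equiv`, and the chosen ratio $N$ checked against the gearbox velocity ratings using `N * w_avg` and `N * w_s`.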

Rotary to Linear Motion

Belt and Pulley

Transfers rotary motion across a distance

The rotational position, velocity, and acceleration of the motor and load are related by the relative diameters of the pulleys

$N = \frac{D _{P L}}{D _{PM}}$
$θ_{M} = N θ_{L}$
$\dot{θ}_{M} = N \dot{θ}_{L}$
$\ddot{θ}_{M} = N \ddot{θ}_{L}$

The total inertia of the torque load is the inertia of the motor, pulleys, belt, and load, all referred to the motor

$J_{T} = J_{M} + J_{PM} + J_{P L \to M} + J_{B \to M} + J_{L \to M}$
$J_{P L \to M} = J_{P L} \times \frac{1}{N ^{2}}$
$J_{B \to M} = M_{B} \times \frac{D _{PM}^{2}}{N ^{2}}$
$J_{L \to M} = J_{L} \times \frac{1}{N ^{2}}$

The torque load must also be referred across the belt and pulley system using the equation

$T_{L \to M} = T_{L} \frac{1}{N}$

The total torque the motor must provide for the belt and pulley system shown is:

$T_{M} = (J_{M} + J_{PM} + J_{P L} \times \frac{1}{N ^{2}} + M_{B} \times \frac{D _{PM}^{2}}{N ^{2}} + J_{L} \times \frac{1}{N ^{2}}) \ddot{θ}_{M} + (B_{M} + B_{L} \frac{1}{N ^{2}}) \dot{θ}_{M} + \frac{T _{L}}{N}$

Lead and Screw

The screw is rotated by the motor, which makes the nut move along the thread of the screw
The distance the nut moves in one rotation is the lead $L$
The pitch is the distance between two adjacent threads
The starts is the number of independent threads in a screw, typically 1-3
$L = s t a r t s \times p i t c h$
The relationship between rotary velocity of the screw and linear velocity of the nut is $ω = V / L$

The diagram below shows a lead and screw subject to three forces

Push-pull $F_{P}$
Gravity $F_{g} = M_{L} g \sin γ$
Friction $F_{f} = μ M_{L} g \cos γ$

The forces must be referred to the motor as a torque, which is done using the lead

$T_{M} = (J_{M} + J_{S} + M_{L} (\frac{L}{2 π})^{2}) \ddot{θ} + \frac{L}{2 π} (F_{P} + F_{g} + F_{f})$

The equation above is written using lead in m/rev. Lead is sometimes given in m/rad, and the conversion is given as:

$1 \frac{m}{re v} = \frac{1}{2 π} \frac{m}{r a d}$
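
A rough sketch of the lead screw torque calculation is given below. The function name and arguments are invented for illustration, the lead is taken in m/rev, and the load mass is assumed to reflect to the motor as $M_{L} (L / 2 π)^{2}$ .

```python
import math


def lead_screw_motor_torque(J_M, J_S, M_L, lead, accel, F_P, gamma, mu, g=9.81):
    """Motor torque in Nm for a lead screw.

    lead is in m/rev, accel is the screw angular acceleration in rad/s^2,
    gamma is the incline angle in radians.
    """
    k = lead / (2 * math.pi)                 # metres of nut travel per radian
    F_g = M_L * g * math.sin(gamma)          # gravity component along the screw
    F_f = mu * M_L * g * math.cos(gamma)     # friction force
    J_total = J_M + J_S + M_L * k ** 2       # load mass reflected as an inertia
    return J_total * accel + k * (F_P + F_g + F_f)


def screw_speed(v, lead):
    """Screw speed in rev/s needed for a nut velocity v in m/s."""
    return v / lead
```

For example, `screw_speed(0.05, 0.005)` gives the rev/s needed to move the nut at 50 mm/s on a 5 mm lead.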

Conveyor Belt

The position, velocity, and acceleration of the motor and the load can be related using the following formulae:

$θ_{M} = \frac{x_{L}}{D_{P1} / 2}$
$\dot{θ}_{M} = \frac{v_{L}}{D_{P1} / 2}$
$\ddot{θ}_{M} = \frac{a_{L}}{D_{P1} / 2}$

The inertia of each of the pulleys depends on their relative diameters, so the total inertia referred to the motor is its own inertia, plus the inertia of each pulley, plus the load:

$J_{T} = J_{M} + J_{P 1} + J_{P 2} (\frac{D _{P 1}}{D _{P 2}})^{2} + J_{P 3} (\frac{D _{P 1}}{D _{P 3}})^{2} + J_{L \to M}$

$J_{L \to M} = (M_{B} + M_{L}) (\frac{D_{P1}}{2})^{2}$

The forces from the load must be referred to the motor as a torque, which is done using the diameter of the pulley the motor is connected to also:

$T_{L \to M} = (F_{P} + F_{g} + F_{f}) \frac{D _{P 1}}{2}$

Rack and Pinion

The equations for position, velocity, acceleration, inertia, and torque are all exactly the same as for a conveyor belt.

Transmission Efficiency and RMS Torque

No gearboxes have 100% efficiency
Efficiency modifies torque, not velocity
$P_{in} = T_{in} \dot{θ}_{in}$
$P_{o u t} = (η T_{o u t}) \dot{θ}_{o u t}$

RMS torque is a useful metric of a system to inform the choice of motor used in design. Assuming a trapezoidal motion profile:

$T_{RMS} = \sqrt{\frac{\frac{1}{2} (T_{M,acc}^{2} + T_{P,acc}^{2}) t_{acc} + T_{P,s}^{2} t_{s} + \frac{1}{2} (T_{M,dec}^{2} + T_{P,dec}^{2}) t_{dec}}{t_{M}}}$

Sensors

Sensors measure physical quantities that are outputs from electromechanical systems. A sensed signal will go through a few steps before we have access to the data:

The physical phenomenon, the signal source, will happen
The sensor will detect this by some mechanism and output a noisy signal
Some signal conditioning/processing will take place to make the signal easier to read
Analogue to Digital conversion samples and digitises the data
The digitised data is presented to software as binary information

Performance of Sensors

There are a number of metrics used to measure the performance of a sensor, and which metrics are considered will depend upon the use case.

Accuracy
- How close is the output to the true value of the input?
- A sensor with high accuracy will give readings close to the quantity being sensed
Precision
- How consistent are the readings for the same input?
- How repeatable are the readings?
- Precise data is close to each other, but not necessarily to the true value
- High precision with low accuracy may be acceptable if the systematic inaccuracy can be compensated for
Drift
- Changes in the output of the sensor not related to the input
- Often related to temperature, as this affects electrical properties
Hysteresis
- The difference between the output when the input is increasing, and the output when the input is decreasing
- Quantities may be sensed differently depending upon their rate of change
- Common phenomenon and is often useful in other applications
- Often provided as an average percentage
Linearity
- How the output changes with input over its operating range
- Linear behaviour is ideal as it simplifies output processing
- Many sensors have a linearity error of how much the output deviates from linear behaviour
Resolution
- Changes in measured quantity may be too small to detect
- Sensor will have a max resolution which is the smallest changes it can sense
- Resolution also limited by ADC
Gain
- How much the output changes with the input
- Too high and small changes will give large output swings and low noise tolerance
- Too low and the system will not respond to small changes
- Often given as how much voltage changes per measured unit
  - A temperature sensor will have a gain in mV/°C
Range
- The max and min values that can be sensed
- Can also define a linear range, the range for which the sensor has linear behaviour
- Can set a fixed operating range, to increase sensitivity or resolution over a smaller range
- Wider range usually gives lower sensitivity/resolution
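
As a small illustration of accuracy versus precision, the sketch below estimates both from a set of repeated readings of a known input. The function name and the readings are hypothetical; the mean error is used as a stand-in for accuracy and the sample standard deviation for precision.

```python
import statistics


def accuracy_and_precision(readings, true_value):
    """Return (bias, spread) for repeated readings of the same input."""
    bias = statistics.mean(readings) - true_value   # systematic error: accuracy
    spread = statistics.stdev(readings)             # repeatability: precision
    return bias, spread


# Hypothetical temperature sensor reading a true 20.0 degrees C
bias, spread = accuracy_and_precision([20.9, 21.0, 21.1], 20.0)
```

Here the readings are tightly grouped but offset by about one degree, which is the "high precision, low accuracy" case that can be compensated by subtracting the bias.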

Signal Conditioning

Generally sensor output is some voltage, which will be given as input to a microcontroller. Voltage signals can be too large, too small, or too noisy, so some conditioning/processing is required

Filtering to remove noise
Amplification to increase the range of the signal
Attenuation to decrease the range of the signal
- Too large a voltage may damage the electronics

Op-amp circuits are usually involved in signal conditioning.

$V_{out} = A (V_{2} - V_{1})$
- $A$ is the open loop gain
- Both open loop gain and input resistance are $\infty$ in an ideal op amp
No current flows in or out of the inputs
The two inputs are always at the same voltage

Buffer

The output is connected to the inverting input
- Negative feedback
Provides decoupling between circuits
No current flows into $V_{2}$ , but $V_{out}$ will still equal $V_{2}$ as the two inputs are always at the same voltage
- Ensures no current flows to provide protection
No current is drawn from the signal source by the op-amp
$V_{out} = V_{2}$

Comparator

Amplifies the difference between the two input voltages
Output saturates at power rail voltages
Useful for indicating when output reaches a threshold

Inverting Op-Amp

$\frac{V _{out}}{V _{in}} = - \frac{R _{2}}{R _{1}}$

Inverts and amplifies the input
Amplifies small sensor output voltages
(see ES191)

Non-Inverting Op-Amp

$\frac{V _{out}}{V _{in}} = 1 + \frac{R _{1}}{R _{2}}$

Amplifies and does not invert input

Attenuation

Voltage attenuation can be easily achieved with just a voltage divider

$V_{in}$ has range 0 to 20V
$R_{1} = 30 k Ω$ , $R_{2} = 10 k Ω$
- $V_{out} / V_{in} = R_{2} / (R_{1} + R_{2}) = 0.25$
$V_{out}$ has range 0 to 5V
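
The gains of the conditioning circuits above are simple resistor ratios, sketched below. The function names are made up, and the resistor labelling follows the formulas as written in these notes (the circuit diagrams are not reproduced here, so treat the labelling as an assumption).

```python
def divider_ratio(R1, R2):
    """Voltage divider: V_out / V_in with the output taken across R2."""
    return R2 / (R1 + R2)


def inverting_gain(R1, R2):
    """Inverting op-amp: V_out / V_in = -R2 / R1."""
    return -R2 / R1


def non_inverting_gain(R1, R2):
    """Non-inverting op-amp: V_out / V_in = 1 + R1 / R2, as labelled in these notes."""
    return 1 + R1 / R2


# The attenuation example: 0 to 20 V in, 30k and 10k resistors
v_out_max = 20.0 * divider_ratio(30e3, 10e3)
```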

Low Pass Filter

A low pass filter attenuates the high frequency components of a signal:

This is a voltage divider with a capacitor:

$V_{out} = V_{in} \frac{X_{c}}{\sqrt{R^{2} + X_{c}^{2}}}$

The reactance of a capacitor $X_{c}$ is dependent upon frequency: $X_{c} = 1 / (2 π f C)$
- Higher frequency, lower reactance
The corner/cutoff frequency $f_{c}$ is where the output is 3 decibels smaller than the input (about 71%)
- $f_{c} = 1 / (2 π R C)$
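
A quick numerical check of the filter behaviour, as a sketch only. It assumes the output magnitude is $X_{c} / \sqrt{R^{2} + X_{c}^{2}}$ times the input, and the component values are arbitrary.

```python
import math


def cutoff_frequency(R, C):
    """Corner frequency of an RC low pass filter in Hz."""
    return 1.0 / (2 * math.pi * R * C)


def rc_gain(f, R, C):
    """Magnitude of V_out / V_in at frequency f."""
    X_c = 1.0 / (2 * math.pi * f * C)      # capacitor reactance falls with frequency
    return X_c / math.sqrt(R ** 2 + X_c ** 2)


R, C = 1e3, 1e-6                           # hypothetical 1 kOhm and 1 uF
f_c = cutoff_frequency(R, C)
gain_at_cutoff = rc_gain(f_c, R, C)        # about 0.707, i.e. 3 dB down
```

Frequencies well below `f_c` pass almost unchanged, while a signal at ten times `f_c` is attenuated to roughly a tenth of its amplitude.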

Reading Signals and ADC

Signals are typically read with microcontrollers
Input to microcontrollers has a maximum which if exceeded will damage the part
Signals are read and digitised so they can be understood by digital electronics
Signal is sampled at discrete time steps, at a sampling frequency $f_{s}$
- Each sample is the value of the signal at time $t$
The sample value is held until the next sample, when the sample value is updated
- This creates a digital signal, an approximation to the input signal
Sampling frequency has a large effect on how close the digital signal is to the original
- To maintain the highest frequency components of the signal $f_{s} \geq 2 f_{max}$
- $f_{max}$ is the highest frequency present in the signal, and $2 f_{max}$ is the Nyquist rate
- In practice, sample rate should be much higher than double
Signal sample levels may only take a finite, discrete number of values
- Quantisation level
- Samples are rounded to nearest quantum
- Higher sampling resolution means more accurate digital signal
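
Sampling and quantisation can be sketched as below. This is an idealised model, not how any particular ADC is implemented: the function name, the clamping of out-of-range inputs, and the even spacing of levels between `v_min` and `v_max` are all assumptions.

```python
import math


def sample_and_quantise(signal, f_s, duration, bits, v_min, v_max):
    """Sample signal(t) at f_s Hz and round each sample to the nearest level.

    Returns the list of integer codes and the size of one quantisation step.
    """
    levels = 2 ** bits
    step = (v_max - v_min) / (levels - 1)
    codes = []
    for n in range(int(duration * f_s)):
        v = signal(n / f_s)                      # value of the signal at time t
        v = min(max(v, v_min), v_max)            # clamp to the input range
        codes.append(round((v - v_min) / step))  # nearest quantum
    return codes, step


# A 1 Hz sine between 0 and 5 V, sampled at 100 Hz with a 4-bit ADC
codes, step = sample_and_quantise(
    lambda t: 2.5 + 2.5 * math.sin(2 * math.pi * t), 100, 1.0, 4, 0.0, 5.0
)
```

Increasing `bits` shrinks `step`, and increasing `f_s` gives more samples per cycle, which are the two ways the digital signal gets closer to the original.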

A signal measured with a 4-bit ADC:

The circuit below shows a 3-bit ADC implemented with a priority encoder and op amps:

Wheatstone Bridge

A Wheatstone bridge is a common circuit used to measure an unknown resistance:

4 resistors, one with an unknown value
Input is a known voltage $V_{S}$
Output is the measured difference between $V_{C}$ and $V_{D}$
- Output of two potential dividers in parallel
When $V_{out} = 0$ , the bridge is balanced
- $R_{1} / R_{2} = R_{3} / R_{4}$

This can be exploited to find the value of an unknown resistance. If $V_{o u t} = 0$ , and $R_{1}$ is unknown and the rest are fixed values:

$R_{1} = R_{2} \frac{R _{3}}{R _{4}}$

Can also derive an expression for $R_{1}$ in terms of the rest of the circuit, if $V_{o u t}$ is non-zero:

$R_{1} = R_{2} (\frac{V _{S} ( R _{3} + R _{4} )}{V _{o u t} ( R _{3} + R _{4} ) + V _{S} R _{4}} - 1)$
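
The expression above can be checked numerically. The sketch below assumes $V_{C}$ is the midpoint of the $R_{1}$ / $R_{2}$ divider and $V_{D}$ the midpoint of the $R_{3}$ / $R_{4}$ divider, with $R_{2}$ and $R_{4}$ on the ground side; this is the arrangement that reproduces the formula, but the function names and values are illustrative.

```python
def bridge_output(V_S, R1, R2, R3, R4):
    """V_out = V_C - V_D for two potential dividers in parallel."""
    V_C = V_S * R2 / (R1 + R2)
    V_D = V_S * R4 / (R3 + R4)
    return V_C - V_D


def unknown_resistance(V_S, V_out, R2, R3, R4):
    """Recover R1 from the measured bridge output."""
    return R2 * (V_S * (R3 + R4) / (V_out * (R3 + R4) + V_S * R4) - 1)


# Hypothetical sensor at 1.2 kOhm in a bridge of 1 kOhm resistors, 5 V supply
V_out = bridge_output(5.0, 1200.0, 1000.0, 1000.0, 1000.0)
R1 = unknown_resistance(5.0, V_out, 1000.0, 1000.0, 1000.0)
```

With `V_out = 0` the second function reduces to the balanced case $R_{1} = R_{2} R_{3} / R_{4}$ .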

The unknown resistance may be some sensor which changes its resistance based upon a physical quantity, ie an LDR or strain gauge. The circuit below shows a photoresistor in a wheatstone bridge, with buffered outputs connected to a differential amplifier, which will provide an output voltage:

The gain of the differential amplifier is calculated using the following, where $R_{1} = R_{2}$ and $R_{3} = R_{4}$

$V_{o u t} = \frac{R _{3}}{R _{1}} (V_{C} - V_{D})$

Force and Torque Sensors

Strain Gauge

A thin strip of semiconductor which is wafer thin and can be stuck onto things
The strip deforms as the surface deforms
When subject to a strain, its resistance changes
- $\frac{Δ R}{R} = Gε$
- $G$ is the gauge factor, $ε$ is the strain
Strain is the ratio of change in length to original length, so this will measure how much a material has stretched by
- The diagram below shows how

Load Cell

A load cell uses strain gauges to measure force:

As the force causes the shape to deform, the strain gauges sense this and the applied force can be calculated
Important factors to consider are:
- Maximum force load
- How the force can be applied to the cell
- Rated output

Rotary Torque Sensor

Torque sensors work similarly to load cells, using strain gauges to detect deformation.

The sensor is coupled to a rotating shaft
The rotation of the shaft causes small deformations within the torque sensor, which are detected by strain gauges

Position and Speed Sensors

An encoder is a device that gives a digital output dependent upon linear or angular displacement.

Incremental encoders detect changes in rotary position from a starting point
Absolute encoders give a rotational position

Incremental Encoder

Incremental encoders contain a disc with multiple holes
As the disc rotates, the holes will create pulses of light, with each pulse representing a displacement of a certain number of degrees
Outer two layers slightly offset so direction of rotation can be determined
Innermost hole counts number of revolutions
The one shown has 12 holes so a 30° resolution

Absolute Encoder

An absolute encoder works on a similar principle to an incremental encoder
The output takes the form of binary code whose value is related to the absolute position of the disc
- Multiple layers used to provide unique encoding for each disc segment
Encoders use gray coding so that if any holes are misaligned then error is minimised
An 8-bit encoder has 360/256 = 1.4° resolution
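
Gray coding and encoder resolution can be sketched as follows. The conversions are the standard reflected binary Gray code; the function names are made up, and mapping code 0 to 0° is an assumption.

```python
def resolution_degrees(bits):
    """Angular resolution of an absolute encoder with the given number of bits."""
    return 360.0 / (2 ** bits)


def binary_to_gray(n):
    """Reflected binary Gray code of n."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Invert the Gray code by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def gray_to_angle(g, bits):
    """Disc angle in degrees for a Gray-coded reading."""
    return gray_to_binary(g) * resolution_degrees(bits)
```

Adjacent positions differ in exactly one bit, so a slightly misaligned hole can only produce an error of one step rather than a jump to an unrelated position.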

Speed sensors

Encoders can also be used to measure angular velocity by measuring the time taken between pulses within the encoder
Reflective photoelectric sensors work by reflecting light off a disc with reflective and matte colours, and measuring the rate at which the reflected light changes intensity
Slotted photoelectric sensors work by detecting if a rotating part is blocking a beam of light or not

Current Sensors

Current Sense Resistors

Due to Ohm's law, a current passing through a resistor will cause a voltage drop
That voltage can be measured, and the current through it calculated
This will modify the voltage across the load and cause a power drop
- A small resistor should be used, typically less than 10 ohms

Hall Effect Sensors

Hall effect sensors use the physical phenomenon of flowing electrons being deflected in a magnetic field to measure current
A magnetic field will cause electrons to be deflected, which will charge either side of a sensor plate depending upon current direction

The potential difference between either side of the plate is given by

$V = K_{H} \frac{B I}{t}$

$K_{H}$ is hall coefficient
$B$ is the flux density of the magnetic field
$I$ is current
$t$ is plate thickness

Since $K_{H}$ , $B$ , and $t$ are constants, the relationship between current and voltage is linear.

Electromagnetics & Motors

There are 3 basic elements of any electrical machine

Something to create a magnetic field on demand
Something to channel said magnetic field
Something to usefully be acted upon by the field

Magnetic Fields

Magnets are dipoles, with a north and south seeking pole
Moving charge creates a magnetic field
A magnetic field is a region of influence where a force can act on a particle

Field lines are closed loops from north to south poles
Lines never cross
Closer the lines, stronger the field
Lines are elastic, will always act to shorten themselves

Moving charges create a magnetic field, so a current moving through a wire will induce a magnetic field around the wire:

The field radiates outwards from the wire
Field is stronger close to the wire
The number of field lines passing through a unit area is the magnetic flux density $B$ , measured in Teslas
Area 1 has a higher flux density than area 2
The direction of the field is determined by the corkscrew rule
- Make a fist with your right hand
- Thumb is the current direction
- Fingers point in field direction

The magnetic flux density $B$ around a conductor is calculated:

$B = \frac{μ _{0} I}{2 π r}$

$B$ is flux density in Teslas (T)
$I$ is current in Amps (A)
$r$ is the distance from the conductor in meters (m)
$μ_{0}$ is the permeability of free space in Henries per meter H/m

Flux density may also be expressed in terms of flux $ϕ$ :

$B = \frac{ϕ}{A}$

$ϕ$ is magnetic flux in Webers (Wb)
$A$ is the enclosed area in square meters (m $^{2}$ )

When there is more than one conducting wire, current in the same direction will augment a field

A long wire with $N$ coils will create a solenoid
Each extra turn develops a given flux, reinforced with each turn
The total flux available in a solenoid is the flux linkage $λ = Nϕ$ in weber-turns

Permeability $μ$ is a measure of how well a material builds a magnetic field under the influence of a magnetising source. A coil of $N$ turns carrying a current $I$ with length $L$ develops a magnetic field intensity $H$ , in amp-turns per meter:

$H = \frac{N I}{L}$

The useful magnetic field from which is then

$B = μ H$

By using a material with higher magnetic permeability, we can create a higher magnetic flux density.

Permeability $μ$ is often given in terms of the permeability of free space, and the material's relative permeability: $μ = μ_{r} μ_{0}$
$μ_{0} = 4 π \times 1 0^{- 7} H m^{- 1}$
Ferromagnetic materials have high permeability
Non-ferrous materials have low permeability
Magnetic cores of ferrous materials are used in solenoids to channel the field
- An iron core has a higher permeability than air
Stronger field creates a higher flux density
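
The effect of a core on flux density can be seen with a few lines of Python. This is a sketch with made-up function names and values, and it ignores saturation, so $B = μ H$ is treated as linear.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space in H/m


def wire_flux_density(I, r):
    """Flux density in Teslas at distance r from a long straight wire."""
    return MU_0 * I / (2 * math.pi * r)


def field_intensity(N, I, length):
    """Field intensity H in amp-turns per meter for a coil."""
    return N * I / length


def core_flux_density(N, I, length, mu_r=1.0):
    """Flux density B = mu_r * mu_0 * H inside the coil."""
    return mu_r * MU_0 * field_intensity(N, I, length)


B_air = core_flux_density(300, 1.0, 0.2)                 # air-cored coil
B_iron = core_flux_density(300, 1.0, 0.2, mu_r=1000.0)   # hypothetical iron core
```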

A current-carrying wire will interact with a magnetic field to create a force

Fleming's left hand rule explains how this works, with force, magnetic field and current all acting in mutually perpendicular directions.

A loop of wire in a field will have current flowing through it in opposite directions, so the wire will spin as equal forces will be induced on it in opposite directions. This is the basic principle behind how motors work.

The force on a conductor in a magnetic field can be calculated:

$F = B I L sin θ$

$F$ is the force on the conductor in Newtons (N)
$B$ is the flux density in Teslas (T)
$I$ is the current in Amps (A)
$L$ is the wire length in meters (m)
$θ$ is the angle between the conductor and the magnetic field lines

Magnetic Circuits

Magnetic circuits can be thought of in a similar way to electrical:

Magneto-motive force $F$ causes flux $ϕ$ to flow through various reluctances $R$
$F = ϕ R$ - Hopkinson's Law

Magneto-motive force is considered the potential for a device to produce flux, and is related to the current and field intensity by:

$F = N I = H l = ϕ R$

Flux is akin to magnetic current
Reluctance defines how much flux a given potential develops
Reluctance is a function of the geometry and material of the flux pathway
- Similar to electrical resistivity

$R = \frac{l}{μ _{0} μ _{r} A}$

Hysteresis/ B-H Curves

The magnetic field obtained is a function of field intensity, the direction it is applied, and the existing field
Saturation is the max possible field strength
Remanence is the field left when the magnetising source is removed
Coercivity is how hard it is to swap field direction
Soft materials are easier to de-magnetise and re-magnetise

Example

A steel ring, with a coil around it. The ring is 0.2m long with area 400mm $^{2}$ , the coil has 300 turns:

Calculate the magneto-motive force for 500 $μ Wb$ to flow, and the amount of current required to sustain this.

Flux density:

$B = \frac{ϕ}{A} = \frac{500 \times 1 0 ^{- 6}}{400 \times 1 0 ^{- 6}} = 1.25 T$

The field intensity is given from the table describing the hysteresis characteristics, $H = 1500 A t / m$ . Relating magneto-motive force, current and field intensity:

$F = H l = 1500 \times 0.2 = 300 A t$

$F = N I = 300$

$I = \frac{F}{N} = \frac{300}{300} = 1 A$
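
The same example worked through in Python, as a sanity check. The value of $H$ is the one read from the table above; the reluctance and relative permeability at the end are extra quantities implied by the numbers, not part of the original question.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space in H/m

flux = 500e-6      # required flux in Wb
area = 400e-6      # ring cross-section in m^2 (400 mm^2)
length = 0.2       # ring length in m
turns = 300
H = 1500.0         # field intensity in At/m, read from the B-H table

B = flux / area                               # flux density in T
mmf = H * length                              # magneto-motive force in At
current = mmf / turns                         # coil current in A
reluctance = mmf / flux                       # Hopkinson's law, F = flux * R
mu_r = length / (MU_0 * reluctance * area)    # implied relative permeability
```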

Lenz's Law

The direction of an induced EMF is always such that the current it produces acts to oppose the change in flux or motion causing the induced EMF.

A clockwise field is generated by the first coil
The flux generated by the first coil links with the second coil's turns
If this flux is changing, an EMF is induced in the second coil
- More turns = more linkage = more emf
The EMF induces a current in the second coil
The current in the coil causes it to generate its own flux, in opposition to the flux of the first coil
EMF out and current out are a function of the ratio between coil turns due to flux linkage
- This is how transformers work

To induce an EMF, the flux linking the coil must be changing, so typically an AC signal is used. The magnitude of this induced EMF $e$ is the rate of change of flux linkage

$e = \frac{d}{d t} (Nϕ) = \frac{d λ}{d t}$

Reluctance and Force

An armature exposed to a magnetic field will try to move to the point in the field where the least resistance to flux exists

A current is applied to the coil to develop a field
A soft iron bar is inserted which becomes magnetised
The force drags the bar in toward the centre of the coil
As the bar moves in the field a counter current is generated in the coil due to Lenz's law, which reduces net field and force
The field is not uniform, and is strongest in the centre
The bar moves back and forth and eventually comes to rest in the centre of the field, where the force is strongest and reluctance is lowest

The energy stored in the coil does work by moving the bar, and the energy comes from inductance, the property of a magnetic field that defines its ability to store energy. The voltage across an inductor is given as:

$V = L \frac{d I}{d t}$

Thus the power is:

$V I = P = L I \frac{d I}{d t}$

The total work done in Joules is the integral of the power over time:

$W_{f} = \int_{0}^{t} P d t = \frac{1}{2} L I^{2}$

The force developed in a field is the Maxwell pulling force, and can be determined in several ways:

$F = \frac{L I ^{2}}{2 x} = \frac{N ^{2} I ^{2}}{2 R x} = μ A \frac{N ^{2} I ^{2}}{2 x ^{2}} = \frac{B ^{2} A}{2 μ}$

$L$ is inductance in henries (H)
$I$ is current in amps (A)
$x$ is field length/air gap in meters (m)
$N$ is coil turns
$R$ is reluctance
$μ$ is material or air gap permeability
$A$ is field area in square meters (m $^{2}$ )
$B$ is flux density in Teslas (T)

The equation relating flux, current, turns and inductance is: $L = \frac{ϕ N}{I}$
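
The four forms of the pulling force agree when the reluctance is that of the gap, $R = x / (μ A)$ , and the inductance is $L = N^{2} / R$ (which follows from $L = ϕ N / I$ and $ϕ = N I / R$ ). The sketch below checks this numerically; the function name and the values are invented, and fringing and core reluctance are ignored.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space in H/m


def maxwell_force_forms(N, I, x, A, mu=MU_0):
    """Return the pulling force computed four ways, in Newtons."""
    R = x / (mu * A)            # reluctance of the gap
    L = N ** 2 / R              # inductance of the coil
    B = mu * N * I / x          # flux density in the gap
    return (
        L * I ** 2 / (2 * x),
        N ** 2 * I ** 2 / (2 * R * x),
        mu * A * N ** 2 * I ** 2 / (2 * x ** 2),
        B ** 2 * A / (2 * mu),
    )


# Hypothetical 300 turn coil, 1 A, 1 mm air gap, 400 mm^2 area
forces = maxwell_force_forms(300, 1.0, 1e-3, 400e-6)
```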

PMDC Motors

Permanent magnet DC motors are widely used in a variety of applications due to their simplicity of control. They consist of two main parts: a stator, and an armature. Stationary magnets are attached to the stator, and coils of wire are wound around the rotating armature:

The circuit below is commonly used as a model of a PMDC motor:

Using this model, the following equations can be derived:

$V = L \frac{d i}{d t} + R i + k_{e} \dot{θ}$

$k_{t} i = J \ddot{θ} + b \dot{θ} + T_{L}$

$V$ is applied voltage in Volts (V)
$L$ is armature inductance in Henries (H)
$R$ is armature resistance in ohms ( $Ω$ )
$J$ is inertia in kgm $^{2}$
$b$ is friction in Nm/rad/s
$k_{e}$ is the back emf constant in V/rad/s
$k_{t}$ is the torque constant in Nm/A
$T_{L}$ is torque load in Nm
$i$ is current in Amps (A)
$θ$ is position in radians (rad)
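
The two equations can be stepped forward in time to see how the motor spins up. Below is a minimal forward-Euler sketch starting from rest; the function name, step size and motor parameters are all made up for illustration, and a proper simulation would use a better integrator.

```python
def simulate_pmdc(V, T_L, R, L, k_e, k_t, J, b, dt=1e-3, t_end=10.0):
    """Integrate the electrical and mechanical motor equations from rest.

    Returns the final armature current (A) and angular velocity (rad/s).
    """
    i = 0.0
    w = 0.0
    for _ in range(int(t_end / dt)):
        di_dt = (V - R * i - k_e * w) / L        # from V = L di/dt + R i + k_e w
        dw_dt = (k_t * i - b * w - T_L) / J      # from k_t i = J dw/dt + b w + T_L
        i += di_dt * dt
        w += dw_dt * dt
    return i, w


# Hypothetical small motor driven at 1 V with no load
i_final, w_final = simulate_pmdc(
    V=1.0, T_L=0.0, R=1.0, L=0.5, k_e=0.01, k_t=0.01, J=0.01, b=0.1
)
```

After a few time constants the derivatives go to zero and the current and speed settle to constant values.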

Operating Points

The voltage applied causes motion, and the speed is determined by torque. The motor has linear relationships in speed, torque, and current.

For a given voltage, speed will decrease with torque and current will increase with increasing torque
The motor can operate over a range of input voltages
The voltage applied determines the exact relationship between speed, current, and load
If a certain known torque load wants to be driven at a certain speed, then a set input voltage can be calculated, which will draw a set amount of current
- To increase the speed of the same torque load, increase the voltage, which will increase the current
The combination of speed, current, and load is the motor's operating point

Any given voltage and torque produces a speed and current, and the ideal operating point of a motor will be between the maximum efficiency and maximum output power points. When a motor is at a constant speed and current, the dynamic equations can be simplified to steady-state equations (also $k_{t} = k_{e} = k$ ):

$V = R i + k \dot{θ}$

$ki = b \dot{θ} + T_{L}$

Steady state current and velocity are therefore:

$i = \frac{bω + T _{L}}{k}$

$ω = \frac{kV - R T _{L}}{R b + k ^{2}}$

Increasing $V$ will cause an increase in $ω$
Increasing $T_{L}$ will cause a decrease in $ω$
Increasing $ω$ will cause an increase in $i$
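
The steady-state equations give the operating point directly. A short sketch, with a made-up function name and motor parameters, assuming $k_{t} = k_{e} = k$ as above:

```python
def steady_state(V, T_L, R, b, k):
    """Return (speed in rad/s, current in A) for a given voltage and torque load."""
    w = (k * V - R * T_L) / (R * b + k ** 2)
    i = (b * w + T_L) / k
    return w, i


# Hypothetical motor: R = 1 Ohm, b = 1e-4 Nm/rad/s, k = 0.05
w, i = steady_state(V=12.0, T_L=0.1, R=1.0, b=1e-4, k=0.05)
w_heavier, i_heavier = steady_state(V=12.0, T_L=0.2, R=1.0, b=1e-4, k=0.05)
```

Doubling the torque load here lowers the speed and raises the current, as expected from the list above.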

Power and Efficiency

The useful output power of a motor is rotational mechanical power. $J \ddot{θ}$ and $b \dot{θ}$ are considered losses.

$P_{o u t} = T_{L} ω$

Input electrical power is $P_{in} = iV$ , so electrical losses are mainly $P_{loss} = i^{2} R$ . The efficiency is output mechanical power over input electrical power:

$η = \frac{T _{L} ω}{iV} = \frac{T _{L}}{V} \times \frac{k ^{2} V - k R T _{L}}{( bω + T _{L} ) ( R b + k ^{2} )}$

Decreasing the friction $b$ will always increase efficiency, but as the other terms appear in both numerator and denominator, it is hard to find an optimum.
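
Since an optimum is hard to find by hand, one option is simply to sweep the torque load and pick the best point numerically. The sketch below does this with a brute-force search; the function names, the motor parameters and the number of steps are arbitrary choices.

```python
def efficiency(V, T_L, R, b, k):
    """Steady-state efficiency: mechanical output power over electrical input power."""
    w = (k * V - R * T_L) / (R * b + k ** 2)
    i = (b * w + T_L) / k
    return (T_L * w) / (i * V)


def best_load(V, R, b, k, steps=1000):
    """Torque load between zero and stall that maximises efficiency."""
    T_stall = k * V / R                       # load at which the speed reaches zero
    loads = [T_stall * n / steps for n in range(1, steps)]
    return max(loads, key=lambda T: efficiency(V, T, R, b, k))


# Hypothetical motor: 12 V, R = 1 Ohm, b = 1e-4 Nm/rad/s, k = 0.05
T_best = best_load(12.0, 1.0, 1e-4, 0.05)
eta_best = efficiency(12.0, T_best, 1.0, 1e-4, 0.05)
```

Efficiency is zero both at no load (no output torque) and at stall (no speed), so the maximum sits somewhere in between.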

Wound DC Motors

Wound DC motors have a magnetic field generated by an electromagnet instead of a permanent magnet, so are generally more powerful and controllable.

Separately excited DC motors use a source of current separate from the armature current to generate the field
Series connected DC motors have the field windings in series with the armature
Shunt connected DC motors have the field windings in parallel with the armature

Separately Excited

Two separate input voltages
Both windings use DC current
Most controllable as field strength is isolated from armature current
Mutual inductance couples the motor equations as the flux from $L_{f}$ and $L_{a}$ interact
Used when a DC motor with high controllability and high power output is required, such as in electric trains

$\frac{d i_{a}}{d t} = - \frac{R_{a}}{L_{a}} i_{a} - \frac{L_{af}}{L_{a}} i_{f} ω_{r} + \frac{u_{a}}{L_{a}}$

$\frac{d i _{f}}{d t} = - \frac{R _{f}}{L _{f}} i_{f} + \frac{u _{f}}{L _{f}}$

$\frac{d ω _{r}}{d t} = \frac{L _{a f}}{J} i_{a} i_{f} - \frac{B _{m}}{J} ω_{r} - \frac{τ _{L}}{J}$

$i_{a}$ , $i_{f}$ armature/field current
$u_{a}$ , $u_{f}$ armature/field voltage
$R_{a}$ , $R_{f}$ armature/field resistance
$L_{a}$ , $L_{f}$ armature/field inductance
$L_{a f}$ mutual inductance
$J$ armature inertia
$B_{m}$ armature damping
$ω_{r}$ armature velocity
$τ_{L}$ torque load

Series Connected

Self-exciting: no separate input to excite magnetic field
Field lines are cut by armature field lines
High starting torque
Should not be run with no load as they have very high speeds
Used in heavy industrial equipment with high torques

$\frac{d i _{a}}{d t} = - \frac{R _{a} + R _{f}}{L _{a} + L _{f}} i_{a} - \frac{L _{a f}}{L _{a} + L _{f}} i_{a} ω_{r} + \frac{u _{a}}{L _{a} + L _{f}}$

$\frac{d ω _{r}}{d t} = \frac{L _{a f}}{J} i_{a}^{2} - \frac{B _{m}}{J} ω_{r} - \frac{τ _{L}}{J}$

Shunt Connected

Field windings are connected in parallel with armature windings
Very good speed regulation
Better at maintaining speed over a range of torque loads
Best used where torque loads can vary ie in machining tools

$\frac{d i _{a}}{d t} = - \frac{R _{a}}{L _{a}} i_{a} - \frac{L _{a f}}{L _{a}} i_{f} ω_{r} + \frac{u _{a}}{L _{a}}$

$\frac{d i _{f}}{d t} = - \frac{R _{f}}{L _{f}} i_{f} + \frac{u _{f}}{L _{f}}$

$\frac{d ω _{r}}{d t} = \frac{L _{a f}}{J} i_{a} i_{f} - \frac{B _{m}}{J} ω_{r} - \frac{τ _{L}}{J}$

Motor Control

Changes in speed are often required in a system.
This can be done in PMDC motors by changing armature voltage
Microcontrollers output a control signal to control the voltage

Pulse Width Modulation

Works by providing a high-frequency square wave
The fraction of each period that the signal is high is called the duty ratio
Effectively turns a transistor on/off very quickly
Duty ratio determines voltage across motor

The graph below shows a PWM signal along with the average voltages

The signal switches on and off very quickly, meaning the motor control circuit is turned on/off, but the motor has a high inductance meaning it does not respond as quickly
This has the effect of averaging the voltage
The PWM frequency is typically very high, and the period must be lower than the response time of the load

Motors can be modelled as a circuit with an inductance in series with a resistor, and an emf representing the motor's back emf:

The power supply is connected and disconnected by a switch controlled by PWM
The instantaneous $V_{B}$ and average $E_{r}$ voltages across the motor are shown on the graph for two different duty ratios
The motor does not stop when disconnected because of the rise and fall time of the current in the RL circuit
The diode is a freewheeling diode that allows a current path when the voltage switch is off

Low Side Drive Circuit

The basic circuit for implementing motor speed control is shown below, known as a "Low Side PMDC Motor Drive Circuit"
- A high side version swaps the transistor and motor

The circuit is built around a transistor used to switch the voltage on and off
- N-type MOSFET generally the best choice
Freewheeling diode provides a current path for motor current when the switch is off
- Typically a schottky diode
- Forward rated current should be greater than max current
- Reverse voltage should be higher than motor voltage
Pull down resistor ensures transistor gate voltage is 0 when no input is applied
- Typically 10k
Current limiting resistor protects transistor from damage

The signal from the controller will be connected to the transistor gate, switching on and off at the PWM frequency. The duty ratio $D$ is the fraction of each period for which the switch is on, so the average voltage is:

$\bar{V} = D \times V_{cc}$
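
The averaging effect can be checked by building an ideal PWM waveform and taking its mean. This is a sketch with invented function names; it treats the switch as ideal and ignores the diode drop and the motor's back emf.

```python
def pwm_waveform(duty, v_cc, samples_per_period=1000, periods=3):
    """Ideal PWM samples: v_cc for the on part of each period, 0 V for the rest."""
    on = int(round(duty * samples_per_period))
    one_period = [v_cc] * on + [0.0] * (samples_per_period - on)
    return one_period * periods


def average_voltage(waveform):
    """Mean of the sampled waveform."""
    return sum(waveform) / len(waveform)


# 25% duty ratio on a 12 V supply averages to 3 V
v_avg = average_voltage(pwm_waveform(0.25, 12.0))
```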

H-Bridge

An H-Bridge is a power electronic circuit that can convert DC to AC current. For motor control, it can be used to drive a motor in either direction or apply PWM control.

The switches $S_{1}$ and $S_{4}$ , and $S_{2}$ and $S_{3}$ work in pairs
The state of each pair should always be opposite
Current flowing in different directions causes the motor to rotate in different directions
There are also 3 other states:
- Shorting is when one side of the circuit has both switches closed and current flows straight to ground
  - This is a short circuit and will cause damage
  - Do not do this
- Braking
  - $S_{2}$ and $S_{4}$ are closed, connecting both terminals to ground and causing the motor to brake sharply
- Coasting
  - All switches open, motor will continue to spin until mechanical load brings it to a stop
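
The switch states can be summarised as a small lookup, sketched below. It assumes $S_{1}$ / $S_{2}$ form one leg and $S_{3}$ / $S_{4}$ the other, with $S_{2}$ and $S_{4}$ on the ground side (consistent with the braking case above). The function name and the labels "forward" and "reverse" are arbitrary.

```python
def h_bridge_state(s1, s2, s3, s4):
    """Classify an H-bridge switch combination (True means closed)."""
    if (s1 and s2) or (s3 and s4):
        return "short"        # one leg fully closed: supply shorted to ground
    if s1 and s4:
        return "forward"      # current flows through the motor one way
    if s2 and s3:
        return "reverse"      # current flows the other way
    if s2 and s4:
        return "brake"        # both motor terminals tied to ground
    if not (s1 or s2 or s3 or s4):
        return "coast"        # all switches open
    return "other"            # combinations not covered in these notes


state = h_bridge_state(True, False, False, True)
```

Checking for the shorting case first means control code built on this can refuse any command that would close both switches in one leg.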

Equations

Below are just the majority of the equations in one place without having to scroll :)

$B = \frac{μ _{0} I}{2 π r}$

$B = \frac{ϕ}{A}$

$H = \frac{N I}{L}$

$B = μ H$

$F = B I L sin θ$

$F = N I = H l = ϕ R$

$R = \frac{l}{μ _{0} μ _{r} A}$

$e = \frac{d}{d t} (Nϕ) = \frac{d λ}{d t}$

$V = L \frac{d I}{d t}$

$V I = P = L I \frac{d I}{d t}$

$W_{f} = \int_{0}^{t} P d t = \frac{1}{2} L I^{2}$

$F = \frac{L I ^{2}}{2 x} = \frac{N ^{2} I ^{2}}{2 R x} = μ A \frac{N ^{2} I ^{2}}{2 x ^{2}} = \frac{B ^{2} A}{2 μ}$

$L = \frac{ϕ N}{I}$

$V = L \frac{d i}{d t} + R i + k_{e} \dot{θ}$

$k_{t} i = J \ddot{θ} + b \dot{θ} + T_{L}$

AC Power

The overwhelming majority of electrical power is AC power, single phase power from the mains at 240V 50-60 Hz.

Reactance of Capacitors and Inductors

$X_{L} = ω L = 2 π f L$

$Z_{L} = j X_{L} = j ω L = j (2 π f L)$

$X_{C} = \frac{1}{ω C} = \frac{1}{2 π f C}$

$Z_{C} = \frac{X_{C}}{j} = \frac{1}{j ω C} = \frac{1}{j \times 2 π f C} = - j \frac{1}{2 π f C}$

When two impedances are in parallel, the total impedance is:

$\frac{1}{Z_{t}} = \frac{1}{Z_{1}} + \frac{1}{Z_{2}}$

$Z_{t} = \frac{Z_{1} Z_{2}}{Z_{1} + Z_{2}}$
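
Python's built-in complex numbers make these impedance calculations easy to try out. A small sketch with made-up function names and component values:

```python
import math


def z_inductor(f, L):
    """Impedance of an inductor: j * 2 * pi * f * L."""
    return 1j * 2 * math.pi * f * L


def z_capacitor(f, C):
    """Impedance of a capacitor: 1 / (j * 2 * pi * f * C)."""
    return 1 / (1j * 2 * math.pi * f * C)


def parallel(z1, z2):
    """Total impedance of two impedances in parallel."""
    return z1 * z2 / (z1 + z2)


# Hypothetical 100 Ohm resistor in parallel with a 1 uF capacitor at 50 Hz
z_total = parallel(100.0, z_capacitor(50.0, 1e-6))
```

`abs(z_total)` gives the magnitude of the combined impedance, and `cmath.phase(z_total)` from the standard library would give its phase angle.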

RMS Power

AC voltages and currents alternate polarities so it is useful to define a DC equivalent, an average voltage/current. This is obtained by taking the root mean square of the sine wave:

$V_{RMS} = \frac{1}{\sqrt{2}} V_{p}$

$I_{RMS} = \frac{1}{\sqrt{2}} I_{p}$

Real Power

Assume a simple circuit with just an AC source and resistor. The time taken for voltage and current to complete one cycle is $T = 2 π / ω = 1/ f$ . The power dissipated in a resistor over a full cycle is:

$P = \frac{1}{T} \int_{0}^{\frac{2 π}{ω}} (V I) d t = \frac{ω}{2 π} \int_{0}^{\frac{2 π}{ω}} (V I) d t = \frac{V _{p} I _{p}}{2} = V_{RMS} I_{RMS}$

There is a real power dissipated by a resistor
- Also called active, average or useful power.
Measured in Watts
Useful because it is converted to non-electrical forms like heat, light, or torque

Reactive Power

Assume a simple circuit with just an AC source and an inductor:

$V = V_{p} cos ω t$

$I = \frac{1}{L} \int V d t = \frac{V _{p} sin ω t}{ω L}$

The power dissipated in one cycle $T$ is:

$P = \frac{1}{T} \int_{0}^{\frac{2 π}{ω}} (V I) d t = \frac{ω}{2 π} \int_{0}^{\frac{2 π}{ω}} (V_{p} cos ω t) (\frac{V _{p} sin ω t}{ω L}) d t = 0$

The average power dissipated by an inductor is 0
No useful work is done as there is no energy conversion
Energy is exchanged between the magnetic field of the inductor and the power supply
Instantaneous power is not zero
Power consumed by a reactance is called reactive power and is measured in VARS (Volt-Amp Reactives)
The same can be done for a capacitor, which exchanges energy between the power supply and its electric field

Complex Power

In a pure resistance, the voltage and current are in phase, and all power is positive and is dissipated
In a pure reactance, the voltage and current are out of phase by 90 degrees, and the average power over a cycle is 0
- Instantaneous power, the power at any given point in time, is $P (t) = V (t) I (t)$
Most AC circuits contain both real and reactive components
- Resistors are real and dissipate active power in Watts
- Capacitors/inductors are reactive and dissipate reactive power in VARS
The power supply will deliver both real and reactive power in proportion to the magnitudes of real and reactive components
Total power delivered is the complex power, a vector sum of real and reactive power
- Measured in Volt-Amps (VA)

Say an AC circuit applies a voltage $V_{p} ∠ θ_{v}$ across an impedance $Z = R + j X$ , causing a current of $I_{p} ∠ θ_{i}$ to flow. The impedance can be written:

$Z = ∣ Z ∣∠ ϕ \qquad ∣ Z ∣ = \sqrt{R^{2} + X^{2}} \qquad ϕ = tan^{- 1} \frac{X}{R}$

By Ohm's law:

$I_{P} ∠ θ_{i} = \frac{V _{p} ∠ θ _{v}}{∣ Z ∣∠ ϕ}$

$ϕ = θ_{v} - θ_{i}$

$ϕ$ is the load angle, which can be used to sketch a load triangle representing the complex power:

The load multiplied by the current squared gives the power ( $P = i^{2} R$ ):

$i_{r m s}^{2} Z = i_{r m s}^{2} R + j i_{r m s}^{2} X_{L}$

$S = P + j Q$

$S$ is the complex power, comprised of the real and reactive power.

$S = V_{r m s} I_{r m s} = P + j Q = ∣ S ∣∠ ϕ \qquad ∣ S ∣ = \sqrt{P^{2} + Q^{2}}$

$P = I_{r m s} V_{r m s} cos ϕ \qquad Q = I_{r m s} V_{r m s} sin ϕ$

$cos ϕ$ is the power factor. The closer it is to 1, the more real, useful power is being dissipated in the system, which we want to maximise.

If $ϕ$ is positive, the power factor is lagging, meaning that the phase of the current is lagging the voltage
- The load is inductive, as current lags voltage in an inductance
If $ϕ$ is negative, the power factor is leading, current leads voltage
- The load is capacitive

Power Factor Correction

Electrical power sources have to produce both real and reactive power
Real power is useful and does work, reactive power does not
- Most reactive power is inductance in transmission lines
We want to maximise the real power in the system, the ratio of which is given by the power factor $cos ϕ = P /∣ S ∣$
Inductive loads cause a positive phase angle
- Lagging power factor as current lags voltage
Capacitive loads cause a negative phase angle
- Leading power factor as current leads voltage
Additional capacitors or inductors can be added to a power system to make the power factor as close to 1 as possible

The power triangle below shows a reduction in $ϕ$ reducing the reactive power but keeping the same real power

Example 1

Improve the power factor of the AC system shown to 0.98 lagging by adding a shunt reactance to the circuit

Reducing the system to a single impedance:

$Z_{T} = \frac{Z _{C} ( R + Z _{L} )}{Z _{C} + R + Z _{L}} = \frac{\frac{- j}{ω C} \times ( R + jω L )}{\frac{- j}{ω C} + R + jω L} = \frac{( - j 6.37 ) ( 2 + j 1.57 )}{- j 6.37 + 2 + j 1.57} = 3.11∠15.5° Ω = 3 + j 0.83 Ω$

$Z = R + j X = 3 + j 0.83$

Calculating the complex power:

$I_{r m s} = \frac{V}{Z} = \frac{25∠0}{3.11∠15.5} = 8.04∠ - 15.5$

$P = I^{2} R = 194$

$Q = I^{2} X = 56.63$

$∣ S ∣ = \sqrt{P^{2} + Q^{2}} = 202$

The current power triangle is therefore:

With a power factor of $cos (tan^{- 1} \frac{56.63}{194}) = 0.961$ . The new load angle we require is $cos^{- 1} (0.98) = 11.48$ . This will require a capacitance in parallel with the current impedance, which will dissipate more reactive power $Q_{C}$ to give a new overall reactive power $Q_{1}$ :

$Q_{C} = Q - Q_{1} = 56.63 - P tan 11.48 = 17.23$

$X_{C} = \frac{V ^{2}}{Q _{C}} = \frac{2 5 ^{2}}{17.23} = 36.27$

$C = \frac{1}{2 π f X _{C}} = \frac{1}{2 π \times 50 \times 36.27} = 8.775 \times 1 0^{- 5}$

The shunt capacitance should have a value of $87.75 μ F$ to increase the power factor to 0.98 lagging.
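The correction steps in this example can be sketched as a small function. This is a minimal sketch under the same assumptions as the worked example (lagging load, capacitor in shunt across the supply); the function name and parameters are illustrative, and the inputs are the rounded values of $P$ and $Q$ used above.

```python
import math

def shunt_capacitance(p, q, target_pf, v_rms, f):
    """Capacitance (in farads) that corrects a lagging load to target_pf.

    p: real power in W, q: reactive power in VAR (inductive positive),
    v_rms: supply voltage, f: supply frequency in Hz.
    """
    q_target = p * math.tan(math.acos(target_pf))  # reactive power left at the target pf
    q_c = q - q_target                             # reactive power the capacitor must cancel
    x_c = v_rms ** 2 / q_c                         # reactance of the shunt capacitor
    return 1 / (2 * math.pi * f * x_c)

c = shunt_capacitance(p=194, q=56.63, target_pf=0.98, v_rms=25, f=50)
print(f"{c * 1e6:.1f} uF")
```

The result agrees with the hand calculation to within rounding of the intermediate values.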

Example 2

Add a component to this system to improve the power factor to 0.8 lagging.

The total impedance of the system:

$Z_{T} = 1 + \frac{j 6 ( 2 - 2 j )}{j 6 + 2 - 2 j} = 4.6 - j 1.2 = 4.75∠ - 14.6$

$I = \frac{V}{Z} = \frac{8∠ - 40}{4.75∠ - 14.6} = 1.68∠ - 25.4$

Calculating the power:

$∣ S ∣ = ∣ V_{r m s} ∣∣ I_{r m s} ∣ = \frac{8}{\sqrt{2}} \times \frac{1.68}{\sqrt{2}} = 6.72$

$P = I^{2} R = \frac{1.6 8 ^{2}}{2} \times 4.6 = 6.5$

$Q = I^{2} X = \frac{1.6 8 ^{2}}{2} \times 1.2 = 1.7$

The current load angle is 14.6° leading (the impedance is capacitive), so we need to add an inductance to make the system have a load angle of $arccos 0.8 = 36.9°$ lagging

The power dissipated by the new inductor:

$Q_{L} = Q + Q_{1} = 1.7 + 6.5 tan (36.9) = 6.58$

$X_{L} = \frac{V _{r m s}^{2}}{Q _{L}} = \frac{8 ^{2} /2}{6.58} = 4.86$

$L = \frac{X _{L}}{ω} = \frac{4.86}{2} = 2.43$

A shunt inductor of $2.43$ H is added to the system.

Resonant Circuits

In any RLC circuit, it is possible to select a frequency at which the impedance is purely real
At this frequency the circuit will draw only real power
All the reactance will cancel out
In cases where frequency is controllable this is useful to improve efficiency
To calculate:
- Derive expression for the total circuit impedance
- Split into real and imaginary parts
- Derive a value of $ω$ such that the imaginary part is 0

Example

Find an expression for the resonant frequency:

$Z_{1} = R_{1} + jω L$

$Z_{2} = \frac{R _{2}}{1 + jω R _{2} C} = \frac{R _{2}}{1 + ω ^{2} R _{2}^{2} C ^{2}} - j \frac{ω R _{2}^{2} C}{1 + ω ^{2} R _{2}^{2} C ^{2}}$

$Z_{T} = R_{1} + jω L + \frac{R _{2}}{1 + ω ^{2} R _{2}^{2} C ^{2}} - j \frac{ω R _{2}^{2} C}{1 + ω ^{2} R _{2}^{2} C ^{2}}$

$re (Z_{T}) = R_{1} + \frac{R _{2}}{1 + ω ^{2} R _{2}^{2} C ^{2}}$

$im (Z_{T}) = ω L - \frac{ω R _{2}^{2} C}{1 + ω ^{2} R _{2}^{2} C ^{2}}$

We require the value of $ω$ such that $im (Z_{T}) = 0$ :

$0 = ω L - \frac{ω R _{2}^{2} C}{1 + ω ^{2} R _{2}^{2} C ^{2}}$

$ω = \sqrt{\frac{1}{L C} - \frac{1}{R _{2}^{2} C ^{2}}}$
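Setting the imaginary part to zero and dividing through by $ω$ gives $ω^{2} = \frac{1}{L C} - \frac{1}{R_{2}^{2} C^{2}}$ . The sketch below checks this numerically with Python's built-in complex numbers. The component values are hypothetical, chosen only so that the expression under the square root is positive.

```python
import math

def total_impedance(w, r1, r2, l, c):
    """Z_T = (R1 + jwL) in series with (R2 in parallel with C)."""
    z1 = r1 + 1j * w * l
    z2 = r2 / (1 + 1j * w * r2 * c)
    return z1 + z2

def resonant_frequency(r2, l, c):
    """Angular frequency at which the imaginary part of Z_T vanishes."""
    return math.sqrt(1 / (l * c) - 1 / (r2 ** 2 * c ** 2))

# Hypothetical component values (ohms, henries, farads).
r1, r2, l, c = 10.0, 1000.0, 0.01, 1e-6

w0 = resonant_frequency(r2, l, c)
z = total_impedance(w0, r1, r2, l, c)
print(w0, z)  # the imaginary part of z should be zero up to rounding
```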

Transformers

Transformers are the link between power systems of different voltage levels
- Step-up and step-down voltage
- An increase in voltage gives decrease in current and vice versa
Have full-load efficiencies of around 98% and are highly reliable
Similar to how mechanical gears increase/decrease torque/velocity dependent upon gear ratio, electrical transformers increase/decrease voltage/current dependent upon turns ratio
Consist of an iron/ferromagnetic core with wires wrapped around either side

Changing voltage across one coil induces magneto-motive force channelled through core
The other coil links the changing flux, inducing a voltage across it

Ideal Transformers

Ratios of input/output for an ideal transformer are given by:

$N = \frac{N _{P}}{N _{S}} = \frac{E _{in}}{E _{o u t}} = \frac{I _{o u t}}{I _{in}}$

When referring electrical properties over a transformer, multiply or divide by $N^{2}$

$Z_{P} = Z_{S} N^{2}$

An ideal transformer is assumed to be 100% efficient:

$S_{in} = S_{o u t}$

Example

A single phase, 2 winding transformer is rated at 20kVA, 480V/120V, 50Hz. A source connected to the 480V (primary) winding supplies an impedance load connected to the 120V (secondary) winding. The load absorbs 15kVA at 0.8pf lagging when the load voltage is 118V.

The turns ratio is given by the ratio of voltages:

$N = \frac{480}{120} = 4$

The voltage across the primary winding is then calculated from the load voltage on the secondary winding:

$E_{1} = N E_{2} = 4 \times 118 = 472 V$

The current on the secondary side is calculated from the power:

$I_{2} = \frac{S _{2}}{E _{2}} = \frac{15000∠ cos ^{- 1} ( 0.8 )}{118∠0} = 127.12∠36.87$

The power factor is lagging so the current is lagging voltage, the current should have a negative phase angle:

$I_{2} = 127.12∠ - 36.87$

The load impedance can then be calculated from this:

$Z_{2} = \frac{E _{2}}{I _{2}} = \frac{118}{127.12∠ - 36.87} = 0.928∠36.87$

The load impedance referred over the transformer, as seen by the primary winding:

$Z_{2}^{'} = N^{2} Z_{2} = 16 \times 0.928∠36.87 = 14.85∠36.87$

The real and reactive power supplied to the primary winding is calculated easily as this is an ideal transformer, so $S_{1} = S_{2}$

$S_{1} = S_{2} = 15000∠36.87 = 15000 cos 36.87 + j 15000 sin 36.87 = 12000 + j 9000$
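The same example can be reproduced with complex arithmetic. This is a sketch only: it uses the standard convention $S = V I^{*}$ (complex power is voltage times the conjugate of current), which gives the negative current angle directly instead of flipping the sign by hand. Variable names are illustrative.

```python
import cmath
import math

n = 480 / 120                             # turns ratio from the rated voltages
e2 = 118                                  # load voltage, taken as the phase reference
s2 = cmath.rect(15000, math.acos(0.8))    # 15 kVA at 0.8 pf lagging

e1 = n * e2                               # primary voltage
i2 = (s2 / e2).conjugate()                # S = V I*, so I = (S / V)*
z2 = e2 / i2                              # load impedance
z2_referred = n ** 2 * z2                 # load impedance seen from the primary

print(e1)
print(abs(i2), math.degrees(cmath.phase(i2)))
print(abs(z2_referred), math.degrees(cmath.phase(z2_referred)))
print(s2.real, s2.imag)                   # real and reactive power
```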

Non-Ideal Transformers

In reality:

Windings have resistance
Core has a reluctance
Flux is not entirely confined to the core
There are real and reactive power losses so efficiency is not 100%

To model a transformer more accurately, introduce a resistance in series to model windings resistance, and inductance in series to model flux being not confined to core:

The model above shows a non-ideal transformer modelled with a single extra resistance and inductance, where the impedances from one side have been referred to the other to create a single impedance with values shown.

Note that in large power transformers, the winding resistance is tiny compared to leakage reactance, so series resistances may sometimes be omitted.

Example

An example of a power system containing an ideal transformer is shown below. An AC generator with internal impedance $Z_{g e n}$ is connected to a transmission line with impedance $Z_{l in e}$ . The voltage is then stepped up by an ideal transformer with a turns ratio of 0.1 and supplies a load impedance of $Z_{l o a d}$ with a voltage $V_{l o a d}$ . Find the real power dissipated by the line impedance and the voltage across the load.

Refer the load across the transformer to create a single circuit:

$Z_{l o a d}^{'} = Z_{l o a d} N^{2} = (10 + j 15) \times 0. 1^{2} = 0.1 + j 0.15$

Now the circuit is a simple AC circuit with three impedances in series:

$Z_{g e n} + Z_{l in e} + Z_{l o a d}^{'} = 4.1 + j 6.15 = Z_{T}$

Current delivered by the generator:

$I_{in} = \frac{V}{Z _{T}} = \frac{240∠30}{7.39∠56.3} = 32.47∠ - 26.3$

Real power dissipated by the line impedance is:

$P_{l in e} = ∣ I ∣^{2} R = 32.4 7^{2} \times 3 = 3.164 kW$

To calculate the load voltage, we first need to refer current across the transformer:

$I_{l o a d} = N I_{in} = 0.1 \times 32.47∠ - 26.3 = 3.247∠ - 26.3$

The load voltage is then:

$V_{l o a d} = I_{l o a d} Z_{l o a d} = (3.247∠ - 26.3) (18∠56.3) = 58.45∠30$

Three Phase AC Systems

3 phase systems exist because generators are usually designed to have 3 outputs
Power is transmitted as 3 phase AC power
The 3 phases are all AC signals 120° out of phase with each other
A balanced system has voltages and currents of the same amplitude and frequency shifted 120°
- Assumes all 3 transmission lines and loads have the same impedance
- Each of the three phases can be connected to identical loads, and the system would consist of three single phase circuits
Phase sequence determines the order that the peaks of each phase pass
Positive phase sequence means the peaks pass in the order ABC
- Phase A leads B by 120°
- Phase B leads C by 120°
Negative phase sequence means the peaks pass in the order ACB
- Phase A leads C by 120°
- Phase C leads B by 120°
- Phasors are rotating clockwise

Star and Delta Connected Systems

There are two ways to connect 3 phase sources and loads:

Star connected systems
- The negative of each phase is connected to ground
Delta connected systems
- The negative of each phase is connected to another phase
The phase voltage is the voltage between a phase and the ground, eg $V_{an}$
The line voltage is the voltage between two transmission lines, eg $V_{ab}$
The phase current is the current flowing through a phase, eg $I_{ab}$
The line current is the current flowing out of each phase, eg $I_{a}$

Star Connected

The phase voltages are measured across a single phase:

$V_{an} = V_{p h} ∠0 \qquad V_{bn} = V_{p h} ∠ - 120 \qquad V_{c n} = V_{p h} ∠ - 240$

Line voltages are measured between each pair of lines, and are different from the phase voltages
Phase currents are measured in each phase, and are the same as the line currents

Positive Sequence

All three line voltages are $\sqrt{3} \times$ the phase voltages, and lead them by 30°:

$V_{ab} = \sqrt{3} ∣ V_{an} ∣∠30 \qquad V_{b c} = \sqrt{3} ∣ V_{bn} ∣∠ - 90 \qquad V_{c a} = \sqrt{3} ∣ V_{c n} ∣∠ - 210$
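The $\sqrt{3}$ factor and the 30° lead follow from $V_{ab} = V_{an} - V_{bn}$ . A minimal sketch with an arbitrary illustrative phase voltage magnitude, assuming a positive sequence star source:

```python
import cmath
import math

def phasor(magnitude, degrees):
    """Build a complex phasor from a magnitude and an angle in degrees."""
    return cmath.rect(magnitude, math.radians(degrees))

v_ph = 230.0  # illustrative phase voltage magnitude
v_an, v_bn, v_cn = phasor(v_ph, 0), phasor(v_ph, -120), phasor(v_ph, -240)

# A line voltage is the difference between two phase voltages.
v_ab = v_an - v_bn

ratio = abs(v_ab) / abs(v_an)
lead_deg = math.degrees(cmath.phase(v_ab) - cmath.phase(v_an))
print(ratio, lead_deg)  # expect sqrt(3) and 30 degrees
```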

All 6 voltage phasors are shown in the diagram below:

Negative Sequence

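All three line voltages are $\sqrt{3} \times$ the phase voltages, and lag them by 30°. With the negative sequence phase voltages $V_{an} = V_{p h} ∠0$ , $V_{bn} = V_{p h} ∠120$ , $V_{c n} = V_{p h} ∠240$ :

$V_{ab} = \sqrt{3} ∣ V_{an} ∣∠ - 30 \qquad V_{b c} = \sqrt{3} ∣ V_{bn} ∣∠90 \qquad V_{c a} = \sqrt{3} ∣ V_{c n} ∣∠210$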

Delta connected

Phase voltages are measured across a single phase:

$V_{an} = V_{p h} ∠ - 30 \qquad V_{ab} = V_{p h} ∠ - 150 \qquad V_{c n} = V_{p h} ∠ - 270$

Line voltages are measured between the lines, and are the same as the phase voltages
Phase currents are measured in each phase, and are different from the line currents

Positive Sequence

All three line currents are $\sqrt{3} \times$ the phase currents, and lag them by 30°:

$I_{a} = \sqrt{3} ∣ I_{ab} ∣∠ - 30 \qquad I_{b} = \sqrt{3} ∣ I_{b c} ∣∠ - 150 \qquad I_{c} = \sqrt{3} ∣ I_{c a} ∣∠ - 270$

Negative Sequence

All three line currents are $\sqrt{3} \times$ the phase currents, and lead them by 30°:

$I_{a} = \sqrt{3} ∣ I_{ab} ∣∠30 \qquad I_{b} = \sqrt{3} ∣ I_{b c} ∣∠ - 90 \qquad I_{c} = \sqrt{3} ∣ I_{c a} ∣∠ - 210$

Three Phase Loads

3-phase loads can also be star or delta connected
Phases are assumed to be balanced because shit gets fucked if they're not
Sometimes it is necessary to convert between star and delta loads

$R_{Δ}$ denotes the load in a delta connected system
$R_{Y}$ denotes the load in a star connected system

Delta to Star

$R_{a} = \frac{R _{2} R _{3}}{R _{1} + R _{2} + R _{3}} \qquad R_{b} = \frac{R _{1} R _{3}}{R _{1} + R _{2} + R _{3}} \qquad R_{c} = \frac{R _{1} R _{2}}{R _{1} + R _{2} + R _{3}}$

For a balanced load where $R_{1} = R_{2} = R_{3} = R_{Δ}$ :

$R_{Y} = \frac{R _{Δ}}{3}$

Star to Delta

$R_{1} = R_{b} + R_{c} + \frac{R _{b} R _{c}}{R _{a}} \qquad R_{2} = R_{a} + R_{c} + \frac{R _{a} R _{c}}{R _{b}} \qquad R_{3} = R_{a} + R_{b} + \frac{R _{a} R _{b}}{R _{c}}$

For a balanced load where $R_{a} = R_{b} = R_{c} = R_{Y}$ :

$R_{Δ} = 3 R_{Y}$
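The two conversions are inverses of each other, which a short sketch can confirm. The function names and the test values are illustrative; the same code works for complex impedances as well as plain resistances.

```python
def delta_to_star(r1, r2, r3):
    """Convert delta branches (R1, R2, R3) to star branches (Ra, Rb, Rc)."""
    total = r1 + r2 + r3
    return (r2 * r3 / total, r1 * r3 / total, r1 * r2 / total)

def star_to_delta(ra, rb, rc):
    """Convert star branches (Ra, Rb, Rc) to delta branches (R1, R2, R3)."""
    return (rb + rc + rb * rc / ra,
            ra + rc + ra * rc / rb,
            ra + rb + ra * rb / rc)

star = delta_to_star(30.0, 60.0, 90.0)
print(star)                      # star equivalent of an unbalanced delta load
print(star_to_delta(*star))      # converting back recovers the delta values
print(delta_to_star(9.0, 9.0, 9.0))  # balanced load: each branch is R_delta / 3
```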

System Configurations

There are four possible configurations of sources and loads. It is easiest to perform analysis on star to star connected systems as it allows single phase analysis, so converting delta to star loads is often needed.

Power in Three Phase Circuits

The total power delivered by a 3-phase generator and absorbed by a three phase load is the sum of the power in each of the three phases, or 3 times the power in one phase in a balanced system. Power can be expressed in terms of phase voltages and currents.

For both star and delta connected loads:

Power	Equation
Active power per phase	$P_{p h} = ∣ I_{p h} ∣∣ V_{p h} ∣ cos ϕ$
Three phase active power	$P_{3 p h} = 3∣ I_{p h} ∣∣ V_{p h} ∣ cos ϕ$
Reactive power per phase	$Q_{p h} = ∣ I_{p h} ∣∣ V_{p h} ∣ sin ϕ$
Three phase reactive power	$Q_{3 p h} = 3∣ I_{p h} ∣∣ V_{p h} ∣ sin ϕ$
Apparent power per phase	$∣ S_{p h} ∣ = ∣ I_{p h} ∣∣ V_{p h} ∣$
Three phase apparent power	$∣ S_{3 p h} ∣ = 3∣ I_{p h} ∣∣ V_{p h} ∣$

Example 1

A balanced 3-phase star connected positive sequence source delivers power to a balanced 3-phase star connected load:

Line-to-Line voltage at each source is $V_{A B} = 415∠40 V_{r m s}$
Each transmission line had a resistance of 1 Ohm and an inductance of 9.5 mH
Each phase is a 4 Ohm resistance and a 20 mH inductor
System operates at 50Hz

Converting line voltage to phase voltage:

$V_{an} = \frac{V _{ab}}{\sqrt{3}} ∠ - 30 = \frac{415}{\sqrt{3}} ∠ (40 - 30) = 239.6∠10$

The impedance of the load and the transmission line:

$Z_{Y} = 4 + j 2 π \times 50 \times 20 \times 1 0^{- 3} = 7.44∠57.5 \qquad Z_{l in e} = 1 + j 2 π \times 50 \times 9.5 \times 1 0^{- 3} = 3.14∠71.45$

The line and load current (they're the same in star systems) are calculated using the phase voltage and the total impedance:

$I = \frac{V _{an}}{Z _{Y} + Z _{l in e}} = \frac{239.6∠10}{10.52∠61.6} = 22.8∠ - 51.6$

The voltage across each phase load is the product of the line current and the load impedance:

$V_{p h, l o a d} = I Z_{Y} = (22.8∠ - 51.6) (7.44∠57.5) = 169.6∠5.9$

The total active and reactive power dissipated by all phases of the load can then be calculated:

$P_{l o a d} = 3∣ V_{p h, l o a d} ∣∣ I ∣ cos ϕ = 3∣169.6∣∣22.8∣ cos (57.5) = 6.233 kW$

$Q_{l o a d} = 3∣ V_{p h, l o a d} ∣∣ I ∣ sin ϕ = 3∣169.6∣∣22.8∣ sin (57.5) = 9.784 kVAR$

$S_{l o a d} = 6.233 + j 9.784 kVA$

The total active and reactive power consumed by the line:

$P_{l in e} = 3∣ I^{2} ∣ R_{l in e} = 3 \times 22. 8^{2} \times 1 = 1.56 kW$

$Q_{l in e} = 3∣ I^{2} ∣ X_{l in e} = 3 \times 22. 8^{2} \times 2.98 = 4.647 kVAR$

$S_{l in e} = 1.56 + j 4.647 kV A$

Therefore the total complex power delivered by the source is:

$S = S_{l o a d} + S_{l in e} = 1.56 + 6.233 + j (4.647 + 9.784) = 7.79 + j 14.43 kVA$

Example 2

A balanced 3-phase star connected positive sequence voltage source delivers power to a balanced 3-phase delta connected load:

Line-to-Line voltage at each source is $V_{A B} = 415∠40 V_{r m s}$
Each transmission line had a resistance of 1 Ohm and an inductance of 9.5 mH
Each phase is a 4 Ohm resistance and a 20 mH inductor
System operates at 50Hz

The delta connected load must be converted to its star equivalent, by dividing the impedances and phase shifting voltages and currents where necessary.

Converting line to phase voltage to get the voltage of each phase at the source:

$V_{an} = \frac{V _{ab}}{\sqrt{3}} ∠ - 30 = \frac{415}{\sqrt{3}} ∠ (40 - 30) = 239.6∠10$

The impedance of each line:

$Z_{l in e} = 1 + j 2 π \times 50 \times 9.5 \times 1 0^{- 3} = 3.14∠71.45$

The impedance of each load, then converted to its star equivalent to calculate individual line currents:

$Z_{Δ} = 4 + j 2 π \times 50 \times 20 \times 1 0^{- 3} = 7.44∠57.5$

$Z_{Y} = \frac{Z _{Δ}}{3} = 2.48∠57.5$

The line and phase currents in a delta load are different, so the line current is calculated from the source phase voltage and total impedance (star equivalent load and line impedances):

$I_{L} = \frac{V _{an}}{Z _{Y} + Z _{l in e}} = \frac{239.6∠10}{2.48∠57.5 + 3.14∠71.45} = 42.76∠ - 55.33$

The phase current of the delta load can then be calculated from the line current:

$I_{p h, Δ} = \frac{I _{L}}{\sqrt{3}} ∠30 = \frac{42.76}{\sqrt{3}} ∠ (30 - 55.33) = 24.7∠ - 25.33$

The phase voltage of each delta load is then:

$V_{p h, Δ} = I_{p h, Δ} Z_{Δ} = (24.7∠ - 25.33) (7.44∠57.5) = 183.8∠32.17$

The power consumed by the load is then:

$P_{Δ} = 3∣ I_{p h, Δ}^{2} ∣ R_{Δ} = 3 \times 24. 7^{2} \times 4 = 7.3 kW$

$Q_{Δ} = 3∣ I_{p h, Δ}^{2} ∣ X_{Δ} = 3 \times 24. 7^{2} \times (2 π \times 50 \times 20 \times 1 0^{- 3}) = 11.45 kVAR$

$S_{l o a d} = 7.3 + j 11.45 kV A$

The power consumed by the line:

$P_{l in e} = 3∣ I_{L}^{2} ∣ R_{l in e} = 3 \times 42.7 6^{2} \times 1 = 5.49 kW$

$Q_{l in e} = 3∣ I_{L}^{2} ∣ X_{l in e} = 3 \times 42.7 6^{2} \times 2.98 = 16.37 kVAR$

$S_{l in e} = 5.49 + j 16.37 kVA$

Total power delivered:

$S = 7.3 + j 11.45 + 5.49 + j 16.37 = 12.8 + j 27.8 kVA$

ES2C7

Binomial Theorem & Taylor Series

Binomial Theorem

Taking powers of binomial expressions yields polynomial expansions, the coefficients of which form Pascal's triangle:

$(a + b)^{0} = 1$

$(a + b)^{1} = a + b$

$(a + b)^{2} = a^{2} + 2 ab + b^{2}$

$(a + b)^{3} = a^{3} + 3 a^{2} b + 3 a b^{2} + b^{3}$

This can be generalised to:

$(a + b)^{n} = a^{n} + n a^{n - 1} b + \frac{n ( n - 1 )}{2 !} a^{n - 2} b^{2} + ... + b^{n}$

For the particular case where $a = 1$ and $b = x$ , we have:

$(1 + x)^{n} = 1 + n x + \frac{n ( n - 1 )}{2 !} x^{2} + \frac{n ( n - 1 ) ( n - 2 )}{3 !} x^{3} + ... + x^{n}$

When $n$ is not a positive integer and $- 1 < x < 1$ :

$(1 + x)^{n} = 1 + n x + \frac{n ( n - 1 )}{2 !} x^{2} + ... + \frac{n ( n - 1 ) ... ( n - r + 1 )}{r !} x^{r} + ...$

Note that this is now an infinite series which converges. Can be used to approximate functions by ignoring higher order terms.
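As an illustration of ignoring higher order terms, the sketch below sums the first few terms of the series and compares the result with a direct evaluation. The function name and the test values ( $n = 0.5$ , $x = 0.1$ ) are arbitrary choices for the example.

```python
def binomial_series(x, n, terms):
    """Sum the first `terms` terms of the expansion of (1 + x)**n."""
    total = 0.0
    coeff = 1.0  # coefficient of x**0
    for r in range(terms):
        total += coeff * x ** r
        # Next coefficient: n(n-1)...(n-r) / (r+1)!
        coeff *= (n - r) / (r + 1)
    return total

# Approximating sqrt(1.1) = (1 + 0.1)**0.5 with an increasing number of terms.
for terms in (1, 2, 3, 5):
    print(terms, binomial_series(0.1, 0.5, terms))
print("exact", 1.1 ** 0.5)
```

For a positive integer $n$ the coefficients become zero after $n + 1$ terms, so the same function reproduces the finite expansion.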

Sequences

A sequence is any arrangement of numbers, functions, terms, etc, in a specific order.

May be finite or infinite
The $k^{t h}$ term of the sequence $Z$ is denoted $Z [k]$

A sequence of functions, $Z$ :

$Z [1] = (1 + \frac{x}{2}) \qquad Z [2] = (1 + \frac{x}{3})^{2} \qquad Z [3] = (1 + \frac{x}{4})^{3}$

$Z [k] = (1 + \frac{x}{k + 1})^{k}, \quad k = 1, 2, 3, 4...$

Series

A series is obtained by summing a sequence

$\sum_{k = 0}^{\infty} (\frac{1}{4})^{k} = 1 + \frac{1}{4} + \frac{1}{4 ^{2}} + \frac{1}{4 ^{3}} + ...$

Arithmetic sequences/series have a common difference, $d$ , between terms

$a, a + d, a + 2 d, ...$

$Z [k] = a + (k - 1) d$

$S_{n} = \frac{n}{2} (2 a + (n - 1) d)$

Geometric series are obtained by multiplying the previous term by a fixed number, the common ratio

$a, a r, a r^{2}, ...$

$Z [k] = a r^{(k - 1)}$

$S_{n} = \frac{a ( 1 - r ^{n} )}{1 - r}$

Limits

It is important to know if a sequence converges to a value as $k \to \infty$ , or diverges to $\pm \infty$ as $k \to \infty$ . Consider:

$1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, ..., \frac{1}{k}$

$\lim_{k \to \infty} \frac{1}{k} = 0$

A sequence converges if it has a limit. If not, it diverges

Convergence of Infinite Series

Manipulating the sequence can make it easier to see if the sequence converges or diverges. For example:

$x [k] = \frac{2 k ^{2} + 3 k - 1}{7 k ^{2} + 4 k + 2}, k = 1, 2, 3, ...$

Divide by the highest power of k:

$\frac{2 + \frac{3}{k} - \frac{1}{k ^{2}}}{7 + \frac{4}{k} + \frac{2}{k ^{2}}}$

Since $\frac{1}{k}$ and $\frac{1}{k ^{2}}$ both tend to 0 as $k \to \infty$ , the sequence converges to $\frac{2}{7}$ .

Another example, consider the series

$\sum_{k = 1}^{\infty} \frac{1}{\sqrt{k}} = 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + ... + \frac{1}{\sqrt{n}} + ...$

Clearly, $\lim_{k \to \infty} \frac{1}{\sqrt{k}} = 0$ , however the partial sum $S_{n}$ (the sum of terms up to $n$ ) has $n$ terms, the smallest being $\frac{1}{\sqrt{n}}$ . Thus:

$S_{n} \geq n (\frac{1}{\sqrt{n}}) = \sqrt{n} \Rightarrow \lim_{n \to \infty} S_{n} = \infty$

The series is divergent, as can be seen from the limit of partial sums. In order to see whether an infinite series converges to a limit, $S$ , (a finite sum for infinite number of terms) we look at the sequence of partial sums, $S_{n}$ , up to $n$ terms. Another example:

$\sum_{k = 1}^{\infty} (\frac{1}{k} - \frac{1}{k + 1}) = \lim_{k \to \infty} [(\frac{1}{1} - \frac{1}{2}) + (\frac{1}{2} - \frac{1}{3}) + ... + (\frac{1}{k} - \frac{1}{k + 1})]$

Sequence of partial sums:

$S_{1} = (\frac{1}{1} - \frac{1}{2}) = \frac{1}{2}$

$S_{2} = (\frac{1}{1} - \frac{1}{2}) + (\frac{1}{2} - \frac{1}{3}) = \frac{2}{3}$

$S_{3} = (\frac{1}{1} - \frac{1}{2}) + (\frac{1}{2} - \frac{1}{3}) + (\frac{1}{3} - \frac{1}{4}) = \frac{3}{4}$

$\lim_{n \to \infty} S_{n} = \lim_{n \to \infty} [1 - \frac{1}{n + 1}] = 1 - 0 = 1$

The sequence of partial sums shows that the series converges.

Infinite arithmetic series are always divergent.
Infinite geometric series are convergent iff $∣ r ∣ < 1$
- Sum is $\frac{a}{1 - r}$
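Partial sums are easy to tabulate numerically, which gives a feel for how quickly a series approaches its limit. A minimal sketch for the telescoping series and the geometric series with $a = 1$ , $r = \frac{1}{4}$ used above; the function names are illustrative.

```python
def telescoping_partial_sum(n):
    """S_n for the series with terms 1/k - 1/(k+1), k = 1..n."""
    return sum(1 / k - 1 / (k + 1) for k in range(1, n + 1))

def geometric_partial_sum(a, r, n):
    """Sum of the first n terms a, ar, ar^2, ..."""
    return sum(a * r ** k for k in range(n))

for n in (1, 2, 3, 10, 1000):
    print(n, telescoping_partial_sum(n), geometric_partial_sum(1, 0.25, n))

# Limits predicted by the formulas: 1 and a / (1 - r) = 4/3.
```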

Tests for Convergence

Comparison Test

A series of positive terms is convergent if the value of each of its terms is less than or equal to the corresponding terms of another series of positive terms that is convergent.

A series of positive terms is divergent if the value of each of its terms is greater than or equal to the corresponding terms of another series of positive terms that is divergent

Ratio Test

The series of positive terms

$\sum_{k = 1}^{\infty} a_{k} = a_{1} + a_{2} + ... + a_{k} + ...$

is convergent if:

$\lim_{k \to \infty} \frac{a _{k + 1}}{a _{k}} < 1$

and divergent if:

$\lim_{k \to \infty} \frac{a _{k + 1}}{a _{k}} > 1$

Example

Testing the following series $S$ for convergence:

$S = \sum_{k = 1}^{\infty} \frac{1}{k 3 ^{k - 1}} = 1 + \frac{1}{2 \times 3} + \frac{1}{3 \times 3 ^{2}} + ...$

Compare it term by term with a series that is known to be convergent and whose terms are at least as large:

$1 + \frac{1}{3} + \frac{1}{3 ^{2}} + ... + \frac{1}{3 ^{k - 1}} + ...$

$1 = 1$

$\frac{1}{2 \times 3} < \frac{1}{3}$

$\frac{1}{3 \times 3 ^{2}} < \frac{1}{3 ^{2}}$

$\frac{1}{k 3 ^{k - 1}} \leq \frac{1}{3 ^{k - 1}}$

Thus $S$ is convergent.

Taylor & Maclaurin Series

Taylor and Maclaurin series provide polynomial approximations to any function. Suppose that a function $f (x)$ is infinitely differentiable, and its derivatives known at a particular point, $x^{*} = a$ . This function can then be expressed as an infinite polynomial series.

$f (x) = \sum_{n = 0}^{\infty} c_{n} (x - a)^{n} = c_{0} + c_{1} (x - a) + c_{2} (x - a)^{2} + c_{3} (x - a)^{3} + ...$

This series can be repeatedly differentiated to obtain values for all the constants:

$c_{0} = f (a) \qquad c_{1} = f^{'} (a) \qquad c_{2} = \frac{1}{2 !} f^{(2)} (a) \qquad ... \qquad c_{n} = \frac{1}{n !} f^{(n)} (a)$

Therefore the Taylor series expansion of $f (x)$ about the point $x^{*} = a$ is:

$f (x) = \sum_{n = 0}^{\infty} \frac{1}{n !} f^{(n)} (a) (x - a)^{n} = f (a) + f^{'} (a) (x - a) + \frac{1}{2 !} f^{(2)} (a) (x - a)^{2} + ... + \frac{1}{n !} f^{(n)} (a) (x - a)^{n} + ...$

Alternatively expressed as

$f (a + h) = f (a) + h f^{'} (a) + \frac{h ^{2}}{2 !} f^{(2)} (a) + ... + \frac{h ^{n}}{n !} f^{(n)} (a)$

Maclaurin Series

If expanding about the point $a = 0$ , then the Taylor series becomes the Maclaurin series:

$f (x) = \sum_{n = 0}^{\infty} \frac{1}{n !} f^{(n)} (0) x^{n} = f (0) + f^{'} (0) x + \frac{1}{2 !} f^{(2)} (0) x^{2} + ... + \frac{1}{n !} f^{(n)} (0) x^{n} + ...$

Example

Finding Maclaurin series for $sin (x)$ :

$f (0) = sin (0) = 0$

$f^{'} (0) = cos (0) = 1$

$f^{''} (0) = - sin (0) = 0$

$f^{'''} (0) = - cos (0) = - 1$

$f (x) = f (0) + f^{'} (0) x + \frac{1}{2 !} f^{(2)} (0) x^{2} + \frac{1}{3 !} f^{(3)} (0) x^{3} + ...$

$f (x) = 0 + 1 x + \frac{1}{2 !} 0 x^{2} - \frac{1}{3 !} x^{3} + ...$

$f (x) = x - \frac{x ^{3}}{3 !} + \frac{x ^{5}}{5 !} - ...$

$f (x) = \sum_{k = 0}^{\infty} \frac{( - 1 ) ^{k}}{( 2 k + 1 )!} x^{2 k + 1} = sin (x)$

The image below shows the polynomial Maclaurin approximations to $sin (x)$ for increasing $n$ . You can see how accuracy improves as $n \to \infty$
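The improvement in accuracy can also be seen numerically. The sketch below evaluates truncated Maclaurin polynomials for $sin (x)$ and prints the error against `math.sin`; the function name and the evaluation point are illustrative.

```python
import math

def sin_maclaurin(x, terms):
    """Sum the first `terms` non-zero terms of the Maclaurin series for sin(x)."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

x = 2.0
for terms in (1, 2, 3, 5, 8):
    approx = sin_maclaurin(x, terms)
    print(terms, approx, abs(approx - math.sin(x)))
```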

Matrices & Quadratic Forms

Linear Algebra

Linear algebra is the formalisation/generalisation of linear equations involving vectors and matrices. A linear algebraic equation looks like

$A x = b$

where $A$ is a matrix, and $x$ , $b$ are vectors. In an equation like this, we're interested in the existence of and the number of solutions. Linear ODEs are also of interest, looking like

$\dot{x} (t) = A x (t)$

where $A$ is a matrix, $t$ is a scalar (time), and $x$ is a vector-valued function of $t$ .

I'm really not about to go into what a matrix or its transpose is
$A^{'}$ denotes the transpose of $A$
$b$ is a column vector, indexed $b_{i}$
$b^{'}$ is a row vector
You can index matrices using the notation $A_{ij}$ , which is the element in row $i$ and column $j$ , indexed from 1

Matrices can be partitioned into sub-matrices:

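$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$

Column and row partitions give row/column vectors.

$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{bmatrix}$

$w^{(1)^{'}} = \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix}$

$w^{(2)^{'}} = \begin{bmatrix} 5 & 6 & 7 & 8 \end{bmatrix}$

$w^{(3)^{'}} = \begin{bmatrix} 9 & 10 & 11 & 12 \end{bmatrix}$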

A square matrix of order $n$ has dimensions $n$ x $n$
The leading diagonal is entries $A_{11}, A_{22}, A_{33}, ..., A_{nn}$
- The trace of a square matrix is the sum of the leading diagonal
A diagonal matrix has only entries on the leading diagonal
The identity matrix is a diagonal matrix of ones

The Inner Product

The inner product of two vectors $w^{'}$ , a row vector, and $v$ , a column vector:

$w^{'} v = w_{1} v_{1} + w_{2} v_{2} + ... + w_{n} v_{n}$

(1x $n$ ) matrix times ( $n$ x1) to yield a scalar
If the inner product is zero, then $w$ and $v$ are orthogonal
In euclidian space, the inner product is the dot product
The norm/magnitude/length of a vector is $∣∣ w ∣∣ = \sqrt{w^{'} w} = \sqrt{w_{1}^{2} + w_{2}^{2} + ... + w_{n}^{2}}$
- If norm is one, vector is unit vector

Linear Independence

Consider a set of vectors all of equal dimensions, $v^{(1)}, v^{(2)}, ..., v^{(r)}$ . The vector $v^{(r)}$ is linearly dependent on the vectors $v^{(1)}, v^{(2)}, ..., v^{(r - 1)}$ if there exist $(r - 1)$ scalars $α_{1}, α_{2}, ..., α_{r - 1}$ , not all zero, such that:

$v^{(r)} = α_{1} v^{(1)} + α_{2} v^{(2)} + ... + α_{r - 1} v^{(r - 1)}$

If no such scalars exist, the set of vectors are linearly independent.

Finding the linearly independent rows in a matrix:

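$A = \begin{bmatrix} 0 & 1 & 2 & 3 \\ 4 & 5 & 6 & 7 \\ 4 & 7 & 10 & 13 \end{bmatrix} = \begin{bmatrix} w^{(1)^{'}} \\ w^{(2)^{'}} \\ w^{(3)^{'}} \end{bmatrix}$

$w^{(1)^{'}} = \begin{bmatrix} 0 & 1 & 2 & 3 \end{bmatrix} \qquad w^{(2)^{'}} = \begin{bmatrix} 4 & 5 & 6 & 7 \end{bmatrix} \qquad w^{(3)^{'}} = \begin{bmatrix} 4 & 7 & 10 & 13 \end{bmatrix}$

$w^{(2)^{'}}$ is independent of $w^{(1)^{'}}$ since $w^{(2)^{'}} \neq k w^{(1)^{'}}$ for any $k$
$w^{(3)^{'}} = w^{(2)^{'}} + 2 w^{(1)^{'}}$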
- Row 3 is linearly dependent on rows 1 and 2
There are 2 linearly independent rows
It can also be found that there are two linearly independent columns

Any matrix has the same number of linearly independent rows and linearly independent columns

A more formalised approach is to put the matrix into row echelon form, and then count the number of non-zero rows. $A$ in row echelon form may be obtained by gaussian elimination:

$\begin{bmatrix} 1 & 0 & - 1 & - 2 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
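Counting non-zero rows after elimination can be automated. The following is a minimal sketch of forward Gaussian elimination using exact fractions to avoid rounding issues; the function name is illustrative and it is not tuned for large matrices.

```python
from fractions import Fraction

def matrix_rank(matrix):
    """Rank of a matrix, found by reducing it to row echelon form."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    count = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        # Find a row at or below the current pivot position with a non-zero entry.
        pivot = next((r for r in range(count, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[count], rows[pivot] = rows[pivot], rows[count]
        # Eliminate this column from every row below the pivot row.
        for r in range(count + 1, len(rows)):
            factor = rows[r][col] / rows[count][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[count])]
        count += 1
    return count

a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [4, 7, 10, 13]]
print(matrix_rank(a))
```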

Minors, Cofactors, and Determinants

For an $n$ x $n$ matrix $A$ , the determinant is defined as

$det (A) = \sum_{j = 1}^{n} a_{ij} γ_{ij} \quad \text{for any } i = 1, 2, ..., n$

$i$ denotes a chosen row along which to compute the sum
$γ_{ij}$ is the cofactor of element $a_{ij}$
- $γ_{ij} = (- 1)^{i + j} μ_{ij}$
$μ_{ij}$ is the minor of element $a_{ij}$
The minor is obtained by calculating the determinant from the matrix obtained by deleting row $i$ and column $j$
The cofactor is the minor with the appropriate sign from the matrix of signs
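The definition translates directly into a recursive function. This is a sketch that always expands along the first row ( $i = 1$ ); it is fine for small matrices but far too slow for large ones. The function names and example matrices are illustrative.

```python
def submatrix(a, i, j):
    """Matrix obtained by deleting row i and column j (0-indexed)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    # With 0-indexed columns the sign (-1)**(i + j) for the first row is (-1)**j.
    return sum((-1) ** j * a[0][j] * det(submatrix(a, 0, j))
               for j in range(len(a)))

print(det([[1, 2], [3, 4]]))
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```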

Determinant Properties

$det (A) = det (A^{'})$
If a constant scalar $α$ times any row/column is added to any other row/column, the $det (A)$ is unchanged
If $A$ and $B$ are of the same order, then $det (A B) = det (A) det (B)$
$det (A) = 0$ iff the rank of $A$ is less than its order, for a square matrix.

Rank

The rank of a matrix is the number of linearly independent columns/rows

Any non-zero $n$ x $m$ matrix $A$ has rank $r$ if at least one of its $r$ -square minors is non-zero, while every $(r + 1)$ -square minor is zero.

$r$ -square denotes the order of the determinant used to calculate the minor

For example:

$det (A) = \begin{vmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 5 & 7 \end{vmatrix} = 0$

The determinant is 0
The rank is less than 3
The minor of $a_{33}$ is $μ_{33} = - 1 \neq 0$
The order of this minor is 2
Thus, the rank of $A$ is 2

There are two other ways to find the rank of a matrix, via gaussian elimination into row-echelon form, or by the definition of linear independence.

Inverses of Matrices

The inverse of a square matrix is defined:

$A^{- 1} = \frac{1}{det ( A )} adj (A)$

$A A^{- 1} = A^{- 1} A = I_{n}$
$A^{- 1}$ is unique

$adj (A)$ is the adjoint of $A$ , the transpose of the matrix of cofactors:

$adj (A) = \begin{bmatrix} γ_{11} & γ_{12} & \dots & γ_{1 n} \\ γ_{21} & γ_{22} & \dots & γ_{2 n} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ γ_{n 1} & γ_{n 2} & \dots & γ_{nn} \end{bmatrix}^{'}$

If $det (A) = 0$ , $A$ is singular and has no inverse.

Pseudo-inverse of a Non-Square Matrix

Given a more general $n$ x $m$ matrix $A$ , we want some inverse such that $A B = I_{m}$ , or $B A = I_{n}$ .

If $m < n$ (more columns than rows, matrix is fat), and $det (A A^{'}) \neq 0$ , then the right pseudo-inverse is defined as:

$A_{R}^{+} = A^{'} (A A^{'})^{- 1}$

If $n < m$ (more rows than columns, matrix is tall), and $det (A^{'} A) \neq 0$ , then the left pseudo-inverse is defined as:

$A_{L}^{+} = (A^{'} A)^{- 1} A^{'}$

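For example, the right pseudo inverse of $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ :

$A^{'} = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$

$A A^{'} = \begin{bmatrix} 14 & 32 \\ 32 & 77 \end{bmatrix}$

$det (A A^{'}) = 54$

$(A A^{'})^{- 1} = \frac{1}{54} \begin{bmatrix} 77 & - 32 \\ - 32 & 14 \end{bmatrix}$

$A_{R}^{+} = A^{'} (A A^{'})^{- 1} = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \frac{1}{54} \begin{bmatrix} 77 & - 32 \\ - 32 & 14 \end{bmatrix} = \frac{1}{54} \begin{bmatrix} - 51 & 24 \\ - 6 & 6 \\ 39 & - 12 \end{bmatrix}$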
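
The worked example can be checked with a few lines of exact arithmetic. This is a sketch only: the helpers `transpose`, `matmul` and `inverse_2x2` are written for this example, and $A$ is taken to be the 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6).

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A = [[1, 2, 3], [4, 5, 6]]
At = transpose(A)
# Right pseudo-inverse: A' (A A')^-1
right_pinv = matmul(At, inverse_2x2(matmul(A, At)))

# A times its right pseudo-inverse should be the 2 x 2 identity.
print(matmul(A, right_pinv))
```

Note that the product in the other order, $A_{R}^{+} A$ , is a 3 x 3 matrix and is not the identity, which is why this is only a right inverse.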

Symmetric Matrices

A matrix $A$ is symmetric if $A = A^{'}$

$A = \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix} = A^{'}$

A matrix is skew-symmetric if $A = - A^{'}$

$A = \begin{bmatrix} 0 & - 2 \\ 2 & 0 \end{bmatrix} = - A^{'}$

For any square matrix $A$ :

$A A^{'}$ is a symmetric matrix
$A + A^{'}$ is a symmetric matrix
$A - A^{'}$ is a skew-symmetric matrix

Every square matrix $A$ can be written as the sum of a symmetric matrix $B$ and skew-symmetric matrix $C$ :

$A = B + C$

$B = \frac{1}{2} (A + A^{'})$

$C = \frac{1}{2} (A - A^{'})$

Quadratic forms

Consider a polynomial with $n$ variables $x_{i}$ and $n^{2}$ constants $a_{ij}$ of the form:

$Q (x_{1}, x_{2}, ..., x_{n}) = \sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{ij} x_{i} x_{j}$

When expanded:

$Q = a_{11} x_{1}^{2} + a_{22} x_{2}^{2} + ... + a_{nn} x_{n}^{2} + ... + (a_{12} + a_{21}) x_{1} x_{2} + ... + (a_{n - 1, n} + a_{n, n - 1}) x_{n - 1} x_{n}$

This is known as a quadratic form, and can be written:

$Q (x) = x^{'} A x$

where $x$ is an $n \times 1$ column vector, and $A$ is an $n \times n$ symmetric matrix. In two variables:

$Q (x_{1}, x_{2}) = d_{11} x_{1}^{2} + d_{22} x_{2}^{2} + d_{12} x_{1} x_{2} = \begin{bmatrix} x_{1} & x_{2} \end{bmatrix} \begin{bmatrix} d_{11} & d_{12} / 2 \\ d_{12} / 2 & d_{22} \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}$

Linear forms are also a thing. A general linear form in three variables $x_{1}$ , $x_{2}$ , $x_{3}$ :

$Q (x_{1}, x_{2}, x_{3}) = d_{1} x_{1} + d_{2} x_{2} + d_{3} x_{3} = \begin{bmatrix} d_{1} & d_{2} & d_{3} \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}$

This allows us to represent any quadratic function $f (x)$ as a sum of:

$f (x) = x^{'} A x + b^{'} x + c$

For example:

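$f (x_{1}, x_{2}) = - x_{1}^{2} + 4 x_{1} x_{2} + 2 x_{2}^{2} + 3 x_{1} - 3 x_{2} + 7 = \begin{bmatrix} x_{1} & x_{2} \end{bmatrix} \begin{bmatrix} - 1 & 2 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} + \begin{bmatrix} 3 & - 3 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} + 7$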

Linear Simultaneous Equations

The general form of a set of linear simultaneous equations:

$a_{11} x_{1} + a_{12} x_{2} + ... + a_{1 n} x_{n} = b_{1}$

$a_{21} x_{1} + a_{22} x_{2} + ... + a_{2 n} x_{n} = b_{2}$

$\vdots$

$a_{m 1} x_{1} + a_{m 2} x_{2} + ... + a_{mn} x_{n} = b_{m}$

This can be rewritten in a matrix/vector form:

$A x = b$

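$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1 n} \\ a_{21} & a_{22} & \dots & a_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m 1} & a_{m 2} & \dots & a_{mn} \end{bmatrix}$

$b = \begin{bmatrix} b_{1} \\ b_{2} \\ \vdots \\ b_{m} \end{bmatrix}$

$x = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}$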

Equations of this form have three cases for their solutions:

The system has no solution
The system has a unique solution
The system has an infinite number of solutions
- $x$ can take a number of values

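An over-determined system has more equations than unknowns and, in general, has no solution:

$\begin{bmatrix} 1 & 1 \\ - 1 & 2 \\ - 2 & 1 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$

An under-determined system has more unknowns than equations and, in general, has infinitely many solutions:

$\begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = \begin{bmatrix} 2 \end{bmatrix}$

A consistent system with a square, full rank $A$ has a unique solution:

$\begin{bmatrix} 1 & 1 \\ - 1 & 2 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$

The solution for this system is $x = [1 \; 1]^{'}$ . Note that the rank and order of $A$ are both 2, and $A^{- 1}$ exists in this case. If the determinant of a square $A$ is 0, there is no unique solution: the system has either no solution or infinitely many.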

Solutions of Equations

To determine which of the three cases a system is:

Introduce the augmented matrix: $\tilde{A} = [A ∣ b]$
Calculate the rank of $A$ and $\tilde{A}$

No Solution

If $rank (A) \neq rank (\tilde{A})$ , then the system has no solution
All vectors $x$ will result in an error vector
A particular error vector will minimise the norm of the equation error
- The least square error solution, $x^{*}$

$\| ϵ \|^{2} = \| A x - b \|^{2} = (A x - b)^{'} (A x - b)$

$x^{*} = A_{L}^{+} b = (A^{'} A)^{- 1} A^{'} b$
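
As an illustration, the least square error solution can be computed for a small over-determined system. The system used here has rows (1, 1), (-1, 2), (-2, 1) and right-hand side (2, 1, 2); this is how the over-determined example in this section is read, which is an assumption about its layout. The helper functions are written for this sketch only.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A = [[1, 1], [-1, 2], [-2, 1]]
b = [[2], [1], [2]]          # column vector
At = transpose(A)

# x* = (A' A)^-1 A' b
x_star = matmul(matmul(inverse_2x2(matmul(At, A)), At), b)

# Error vector A x* - b and its squared norm
error = [r[0] - t[0] for r, t in zip(matmul(A, x_star), b)]
squared_norm = sum(e * e for e in error)
print(x_star, squared_norm)
```

No other choice of $x$ gives a smaller squared error for this system, although the error itself cannot be made zero.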

Unique Solution

$rank (A) = rank (\tilde{A}) = n$ where $n$ is the number of variables in $x$

$det (A) \neq 0$ (for a square $A$ ).

$x = A^{- 1} b$

Infinite Solutions

$rank (A) = rank (\tilde{A}) = k < n$
Parameters can be assigned to any $n - k$ elements of the vector $x$ and the remaining $k$ elements can be computed in terms of these parameters
A particular vector $x^{*}$ minimises the square of the norm of the solution vector (the minimum-norm solution)

$x^{*} = A_{R}^{+} b = A^{'} (A A^{'})^{- 1} b$

Homogeneous Systems

A system of homogeneous equations takes the form:

$A x = o_{n}$

$b = o_{n} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$

$A$ is an $n$ x $n$ matrix of known coefficients
$b$ is an $n$ x $1$ null column vector
$x$ is an $n$ x $1$ vector of unknowns

The augmented matrix is $\tilde{A} = [A ∣ o_{n}]$ and $rank (A) = rank (\tilde{A})$ , so there is at least one solution vector $x = o_{n}$ (the trivial solution). There are two possible cases for other solutions:

If $rank (A) = n$ and $det (A) \neq 0$ , then the trivial solution is the only solution
If $rank (A) < n$ and $det (A) = 0$ , then there is an infinite number of non-trivial solutions $x \neq o_{n}$
- This includes the trivial solution $x = o_{n}$

Example 1

Solutions to:

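$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 0 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix} = \begin{bmatrix} 6 \\ 15 \\ 15 \end{bmatrix}$

$\tilde{A} = \left[ \begin{array}{ccc|c} 1 & 2 & 3 & 6 \\ 4 & 5 & 6 & 15 \\ 7 & 8 & 0 & 15 \end{array} \right]$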

First calculate the determinant of $A$ :

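$\begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 0 \end{vmatrix} = 1 \begin{vmatrix} 5 & 6 \\ 8 & 0 \end{vmatrix} - 2 \begin{vmatrix} 4 & 6 \\ 7 & 0 \end{vmatrix} + 3 \begin{vmatrix} 4 & 5 \\ 7 & 8 \end{vmatrix} = - 48 + 84 - 9 = 27$

$det (A) = 27 \neq 0$ so $A$ is a full rank matrix (rank = order = 3). We know solutions exist, but need to find the rank of $\tilde{A}$ to check if there are unique or infinite solutions. Using Gaussian elimination to put $\tilde{A}$ into row-echelon form:

$\left[ \begin{array}{ccc|c} 1 & 2 & 3 & 6 \\ 0 & - 3 & - 6 & - 9 \\ 0 & 0 & - 9 & - 9 \end{array} \right]$

$rank (A) = rank (\tilde{A}) = 3$ , so there is a unique solution $x = A^{- 1} b$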

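$A^{- 1} = \frac{1}{27} \begin{bmatrix} - 48 & 24 & - 3 \\ 42 & - 21 & 6 \\ - 3 & 6 & - 3 \end{bmatrix}$

$x = \frac{1}{27} \begin{bmatrix} - 48 & 24 & - 3 \\ 42 & - 21 & 6 \\ - 3 & 6 & - 3 \end{bmatrix} \begin{bmatrix} 6 \\ 15 \\ 15 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$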

Example 2

Solutions to:

$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

There is the trivial solution $x = o_{n}$ , but we need to know if there are infinite solutions, which we can determine from $rank (A)$ . Putting it into row-echelon form:

$\begin{bmatrix} 1 & 2 & 3 \\ 0 & - 3 & - 6 \\ 0 & 0 & 0 \end{bmatrix}$

$rank (A) = 2 < n$ , so there are infinite solutions. We can introduce a parameter $α$ to express the solutions in terms of. Using the coefficients from the row-echelon form:

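$x_{1} + 2 x_{2} + 3 x_{3} = 0$

$- 3 x_{2} - 6 x_{3} = 0$

$x_{1} = x_{3} = α$ , $x_{2} = - 2 x_{3} = - 2 α$

$x = \begin{bmatrix} α \\ - 2 α \\ α \end{bmatrix}$ , which reduces to the trivial solution $x = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ when $α = 0$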

Eigenvalues & Eigenvectors

For a square $n \times n$ matrix $A$ , a scalar $λ$ is an eigenvalue of $A$ if there is a non-zero vector $v$ (an eigenvector) such that:

$A v = λ v$

This can be rewritten as a homogeneous equation in an unknown vector $v$ :

$(A - λ I_{n}) v = o_{n}$

This equation has infinitely many non-trivial solutions for $v$ , where:

$det (A - λ I_{n}) = 0$

This is the characteristic equation of $A$ , and the eigenvalues $λ$ are scalars that satisfy this. Since the characteristic equation is an $n$ -th degree polynomial, an $n \times n$ matrix will have $n$ eigenvalues $λ_{i}$ for $i = 1, 2..., n$ .

Corresponding to each eigenvalue $λ_{i}$ , eigenvectors $v^{(i)}$ are non-trivial solutions of:

$(A - λ_{i} I_{n}) v^{(i)} = o_{n}$

Example

Eigenvalues and vectors of:

$A = \begin{bmatrix} 0 & 1 \\ - 2 & - 3 \end{bmatrix}$

The characteristic equation and its solutions:

$det (A - λ I_{n}) = 0$

$\begin{vmatrix} - λ & 1 \\ - 2 & - 3 - λ \end{vmatrix} = λ^{2} + 3 λ + 2 = 0$

$λ = - 1, - 2$

Eigenvector for $λ = - 1$ :

$(A - λ I_{n}) v = o_{n}$

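$\left( \begin{bmatrix} 0 & 1 \\ - 2 & - 3 \end{bmatrix} - \begin{bmatrix} - 1 & 0 \\ 0 & - 1 \end{bmatrix} \right) \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = 0$

$x_{1} + x_{2} = 0$ $- 2 x_{1} - 2 x_{2} = 0$

$x_{1} = - x_{2}$ , so $x = \begin{bmatrix} - α \\ α \end{bmatrix}$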

Eigenvector for $λ = - 2$ :

$(A - λ I_{n}) v = o_{n}$

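$\left( \begin{bmatrix} 0 & 1 \\ - 2 & - 3 \end{bmatrix} - \begin{bmatrix} - 2 & 0 \\ 0 & - 2 \end{bmatrix} \right) \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = 0$

$2 x_{1} + x_{2} = 0$ $- 2 x_{1} - x_{2} = 0$

$- 2 x_{1} = x_{2}$ , so $x = \begin{bmatrix} α \\ - 2 α \end{bmatrix}$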
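
For a 2 x 2 matrix the characteristic equation is the quadratic $λ^{2} - tr (A) λ + det (A) = 0$ , so the example can be checked with the quadratic formula. This is a sketch: the function names are invented here, and `cmath` is used so that complex eigenvalues would also be handled.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of the characteristic equation lambda^2 - tr(A) lambda + det(A) = 0."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def residual(m, lam, v):
    """Components of A v - lambda v, which should vanish for an eigenpair."""
    return [sum(m[i][j] * v[j] for j in range(2)) - lam * v[i] for i in range(2)]


A = [[0, 1], [-2, -3]]
print(eigenvalues_2x2(A))
print(residual(A, -1, [-1, 1]), residual(A, -2, [1, -2]))
```

The two residuals being zero confirms the eigenvectors found above (taking $α = 1$ ).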

Spectral Decomposition

An $n$ x $n$ matrix $A$ has $n$ eigenvalues $λ_{1} ... λ_{n}$ and $n$ associated eigenvectors $v^{(1)} ... v^{(n)}$ .

$V$ is an $n$ x $n$ matrix of column eigenvectors, and $Λ$ is an $n$ x $n$ diagonal matrix of eigenvalues

$V = [v^{(1)} v^{(2)} ... v^{(n)}]$

$Λ = \begin{bmatrix} λ_{1} & 0 & \dots & 0 \\ 0 & λ_{2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & λ_{n} \end{bmatrix}$

$A V = V Λ$ for all $n \times n$ matrices

In general, eigenvectors of $A$ are linearly independent and so $V^{- 1}$ exists. The spectral decomposition of a matrix $A$ can then be written:

$A = V Λ V^{- 1}$

This allows for diagonalisation of a matrix in terms of its eigenvectors, and for breaking down a multi-dimensional problem into a set of single dimensional problems.

This is only possible if $A$ has $n$ linearly independent eigenvectors.
- If any eigenvalues are repeated then this may not be the case

If $A$ is a symmetric matrix, then the eigenvectors are mutually orthogonal, ie $v^{(i)} \cdot v^{(j)} = 0$ for all $i \neq j$ . If these eigenvectors are orthonormalised (scaled to unit length), then the matrix of eigenvectors $V$ is an orthogonal matrix, meaning its transpose is equal to its inverse. Hence, the spectral resolution of a symmetric matrix $A$ is:

$A = V Λ V^{- 1} = V Λ V^{'}$

Example

Find the spectral resolution of, and hence diagonalise:

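$A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$

The eigenvalues of $A$ are $λ_{1} = 3$ and $λ_{2} = - 1$ . These can then be used to compute the corresponding eigenvectors:

$v^{(1)} = \begin{bmatrix} α \\ α \end{bmatrix}$ , $v^{(2)} = \begin{bmatrix} α \\ - α \end{bmatrix}$

Using $α = 1$ :

$Λ = \begin{bmatrix} λ_{1} & 0 \\ 0 & λ_{2} \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & - 1 \end{bmatrix}$

$V = \begin{bmatrix} v_{1}^{(1)} & v_{1}^{(2)} \\ v_{2}^{(1)} & v_{2}^{(2)} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & - 1 \end{bmatrix}$

$V^{- 1} = \begin{bmatrix} 0.5 & 0.5 \\ 0.5 & - 0.5 \end{bmatrix}$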

The spectral resolution of $A$ is given by:

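$A = V Λ V^{- 1} = \begin{bmatrix} 1 & 1 \\ 1 & - 1 \end{bmatrix} \begin{bmatrix} 3 & 0 \\ 0 & - 1 \end{bmatrix} \begin{bmatrix} 0.5 & 0.5 \\ 0.5 & - 0.5 \end{bmatrix}$

$A$ can then be diagonalised by $V^{- 1} A V$ :

$V^{- 1} A V = \begin{bmatrix} 0.5 & 0.5 \\ 0.5 & - 0.5 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & - 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & - 1 \end{bmatrix}$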
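
Both products in this example are easy to verify numerically. The snippet below is only a check of the arithmetic, with `matmul` as a throwaway helper and the matrices typed in from the example ( $A$ with rows (1, 2) and (2, 1)).

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[1.0, 2.0], [2.0, 1.0]]
V = [[1.0, 1.0], [1.0, -1.0]]          # columns are the eigenvectors
V_inv = [[0.5, 0.5], [0.5, -0.5]]
Lam = [[3.0, 0.0], [0.0, -1.0]]

reconstructed = matmul(matmul(V, Lam), V_inv)   # V Lam V^-1, should equal A
diagonalised = matmul(matmul(V_inv, A), V)      # V^-1 A V, should equal Lam
print(reconstructed)
print(diagonalised)
```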

Oscillators & State Space Systems

Oscillators are systems such as coupled masses and springs or pendulums, which can be analysed using modal analysis:

Start with a complex coupled system
Use spectral decomposition to diagonalise the system into simpler uncoupled systems
Solve for each system

Single Degree of Freedom Oscillators

Mass-Spring

The equation of motion is:

$\frac{d ^{2} z ( t )}{d t ^{2}} = - k z (t)$

where $k$ is the normalised stiffness, $k = \bar{k} / m$ . Assuming an oscillatory solution:

$z (t) = A cos ω_{0} t + B sin ω_{0} t$ $\dot{z} (t) = - ω_{0} A sin ω_{0} t + ω_{0} B cos ω_{0} t$ $\ddot{z} (t) = - ω_{0}^{2} A cos ω_{0} t - ω_{0}^{2} B sin ω_{0} t$

Substituting back in gives $ω_{0}^{2} = k$ , so $ω_{0} = \sqrt{k}$ .

Setting $z (0) = 0$ and $\dot{z} (0) = v_{0}$ :

$z (t) = a sin (ω_{0} t)$ , $a = \frac{v _{0}}{ω _{0}}$

This system oscillates at a single frequency, $ω_{0} = \sqrt{\bar{k} / m}$ .

Pendulum

The equation of motion for a pendulum in the tangential direction is:

$m l \ddot{θ} = - m g sin θ$ $\ddot{θ} (t) = - k θ (t)$ (for small $θ$ )

Where $k = g / l$

The system oscillates at the frequency $ω_{0} = \sqrt{k} = \sqrt{g / l}$
This system has the same form, and therefore solution, as the mass-spring.
The frequency depends only on the length, not the mass, a property unique to pendulums.

Multiple Degrees of Freedom

This single degree of freedom can be generalised to a 2nd order $n$ -degree of freedom system:

$\frac{d ^{2} y}{d t ^{2}} = - Ky (t)$

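$K = \begin{bmatrix} k_{11} & k_{12} & \dots & k_{1 n} \\ k_{21} & k_{22} & \dots & k_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ k_{n 1} & k_{n 2} & \dots & k_{nn} \end{bmatrix}$ , $y = \begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix}$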

$K$ is an $n \times n$ matrix
$y$ is an $n$ -dimensional column vector

The goal is to find frequencies $ω$ such that the solution $y (t)$ can be expressed as harmonic functions of $sin (ω t)$ . This is done by spectral decomposition:

$K = V Λ V^{- 1}$ $\frac{d ^{2} y}{d t ^{2}} = - V Λ V^{- 1} y (t)$ $\frac{d ^{2} V ^{- 1} y}{d t ^{2}} = - V^{- 1} V Λ V^{- 1} y (t)$

Introduce a new variable $z (t) = V^{- 1} y (t)$ , so that for $z (t)$ :

$\frac{d ^{2} z}{d t ^{2}} = - Λ z (t)$

$Λ = \begin{bmatrix} λ_{1} & 0 & \dots & 0 \\ 0 & λ_{2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & λ_{n} \end{bmatrix}$

This equation involving a diagonal matrix can then be decomposed to $n$ uncoupled scalar equations (the normal modes) for each scalar $z_{i}$ in $z$ :

$\frac{d ^{2} z _{i}}{d t ^{2}} = - λ_{i} z_{i} (t)$

This is a single degree of freedom scalar equation, as in the previous two examples, thus:

$λ_{i} = ω_{i}^{2}$
$z_{i} (t) = a_{i} sin (ω_{i} t)$

The solution of the 2nd order $n$ -DoF system is defined by a superposition of the normal modes $z_{i} (t)$

$λ_{i} = ω_{i}^{2}$ is an eigenvalue of $K$
- $ω_{i}$ is the frequency of the normal mode
$v_{i}$ is an eigenvector of $K$
- Specifies the shape of the normal mode

Example 1

Consider a system of two coupled masses:

Two masses $m_{1}$ and $m_{2}$
Two displacements $y = [y_{1} y_{2}]$
- The variable to solve for
Three springs $k_{1}$ , $k_{2}$ , $k_{3}$

Two equations of motion, one for each mass:

$m_{1} \frac{d ^{2} y _{1}}{d t ^{2}} = - k_{1} y_{1} - k_{3} (y_{1} - y_{2})$ $m_{2} \frac{d ^{2} y _{2}}{d t ^{2}} = - k_{2} y_{2} + k_{3} (y_{1} - y_{2})$

Rearranging into a matrix equation:

$\frac{d ^{2} y _{1}}{d t ^{2}} = \frac{- ( k _{1} + k _{3} ) y _{1} + k _{3} y _{2}}{m _{1}}$ $\frac{d ^{2} y _{2}}{d t ^{2}} = \frac{k _{3} y _{1} - ( k _{2} + k _{3} ) y _{2}}{m _{2}}$

$\frac{d ^{2} y}{d t ^{2}} = - \begin{bmatrix} \frac{k _{1} + k _{3}}{m _{1}} & \frac{- k _{3}}{m _{1}} \\ \frac{- k _{3}}{m _{2}} & \frac{k _{2} + k _{3}}{m _{2}} \end{bmatrix} \begin{bmatrix} y_{1} \\ y_{2} \end{bmatrix} = - Ky$

Let $k_{1} = k_{2} = 2$ , and $k_{3} = m_{1} = m_{2} = 1$ :

$K = \begin{bmatrix} 3 & - 1 \\ - 1 & 3 \end{bmatrix}$

To solve the system, we need to compute the eigenvalues and eigenvectors of $K$ , and hence the normal modes. Starting with the eigenvalues:

$\begin{vmatrix} 3 - λ & - 1 \\ - 1 & 3 - λ \end{vmatrix} = 0$

$λ^{2} - 6 λ + 8 = 0$ , so $λ_{1} = 2$ and $λ_{2} = 4$

Since $ω_{i} = \sqrt{λ_{i}}$ , the two natural frequencies of oscillation are $ω_{1} = \sqrt{2}$ and $ω_{2} = 2$ . Now for the eigenvectors:

$(K - λ_{i} I) v^{(i)} = 0$

For $λ_{1} = 2$ : $\begin{bmatrix} 1 & - 1 \\ - 1 & 1 \end{bmatrix} \begin{bmatrix} v_{1}^{(1)} \\ v_{2}^{(1)} \end{bmatrix} = 0$

$v_{1}^{(1)} = v_{2}^{(1)}$

$v^{(1)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$

For $λ_{2} = 4$ : $\begin{bmatrix} - 1 & - 1 \\ - 1 & - 1 \end{bmatrix} \begin{bmatrix} v_{1}^{(2)} \\ v_{2}^{(2)} \end{bmatrix} = 0$

$v_{1}^{(2)} = - v_{2}^{(2)}$

$v^{(2)} = \begin{bmatrix} 1 \\ - 1 \end{bmatrix}$

The first mode ( $λ_{1} = 2$ , $v^{(1)} = [1 \; 1]^{'}$ , $ω_{1} = \sqrt{2}$ ) implies that both bodies move in unison at the frequency of the mode $f = \frac{ω_{1}}{2 π} = \frac{1}{\sqrt{2} π}$ Hz. The spring between the two masses does not stretch or contract.

The second mode ( $λ_{2} = 4$ , $v^{(2)} = [1 \; - 1]^{'}$ , $ω_{2} = 2$ ) implies that both bodies move in opposition at the frequency of the mode $f = \frac{ω_{2}}{2 π} = \frac{1}{π}$ Hz, with the connecting spring stretching and contracting.
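
The numbers in this example can be reproduced with a short script. This is a sketch for a symmetric 2 x 2 stiffness matrix only: the function name `symmetric_2x2_modes` is made up, the eigenvectors are returned unnormalised, and the off-diagonal entry is assumed to be non-zero.

```python
import math


def symmetric_2x2_modes(K):
    """Eigenvalues and (unnormalised) eigenvectors of a symmetric 2x2 matrix."""
    (a, b), (_, d) = K
    mean, half_gap = (a + d) / 2, math.hypot((a - d) / 2, b)
    modes = []
    for lam in (mean - half_gap, mean + half_gap):
        # (K - lam I) v = 0  =>  (a - lam) v1 + b v2 = 0, assuming b != 0,
        # which is satisfied by v = [b, lam - a].
        modes.append((lam, [b, lam - a]))
    return modes


K = [[3.0, -1.0], [-1.0, 3.0]]
for lam, v in symmetric_2x2_modes(K):
    omega = math.sqrt(lam)                 # rad/s
    print(lam, v, omega, omega / (2 * math.pi))
```

The eigenvectors come out as multiples of (1, 1) and (1, -1), matching the in-phase and out-of-phase mode shapes described above.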

Example 2

The full nonlinear equations of motion for a double pendulum are:

$(m_{1} + m_{2}) L_{1} \ddot{θ_{1}} + m_{2} L_{2} \dot{θ_{2}}^{2} sin (θ_{1} - θ_{2}) + m_{2} L_{2} \ddot{θ_{2}} cos (θ_{1} - θ_{2}) + (m_{1} + m_{2}) g sin (θ_{1}) = 0$ $m_{2} L_{2} \ddot{θ_{2}} + m_{2} L_{1} \ddot{θ_{1}} cos (θ_{1} - θ_{2}) - m_{2} L_{1} \dot{θ_{1}}^{2} sin (θ_{1} - θ_{2}) + m_{2} g sin θ_{2} = 0$

Assuming small angles, and therefore neglecting square terms and making small angle trigonometric approximations:

$(m_{1} + m_{2}) L_{1} \ddot{θ_{1}} + m_{2} L_{2} \ddot{θ_{2}} + (m_{1} + m_{2}) g θ_{1} = 0$ $m_{2} L_{2} \ddot{θ_{2}} + m_{2} L_{1} \ddot{θ_{1}} + m_{2} g θ_{2} = 0$

Let:

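$m_{1} = 1$
$m_{2} = 5$
$L_{1} = 2$
$L_{2} = 3$
$g = 10$ (the value implied by the coefficients used here)

$12 \ddot{θ_{1}} + 15 \ddot{θ_{2}} + 60 θ_{1} = 0$ $10 \ddot{θ_{1}} + 15 \ddot{θ_{2}} + 50 θ_{2} = 0$

$\begin{bmatrix} 12 & 15 \\ 10 & 15 \end{bmatrix} \begin{bmatrix} \ddot{θ_{1}} \\ \ddot{θ_{2}} \end{bmatrix} + \begin{bmatrix} 60 & 0 \\ 0 & 50 \end{bmatrix} \begin{bmatrix} θ_{1} \\ θ_{2} \end{bmatrix} = 0$

$\begin{bmatrix} 12 & 15 \\ 10 & 15 \end{bmatrix} \begin{bmatrix} \ddot{θ_{1}} \\ \ddot{θ_{2}} \end{bmatrix} = - \begin{bmatrix} 60 & 0 \\ 0 & 50 \end{bmatrix} \begin{bmatrix} θ_{1} \\ θ_{2} \end{bmatrix}$

To put into the form $\ddot{θ} = - K θ$ , we can premultiply by the inverse of the first matrix:

$\begin{bmatrix} \ddot{θ_{1}} \\ \ddot{θ_{2}} \end{bmatrix} = - \begin{bmatrix} 12 & 15 \\ 10 & 15 \end{bmatrix}^{- 1} \begin{bmatrix} 60 & 0 \\ 0 & 50 \end{bmatrix} \begin{bmatrix} θ_{1} \\ θ_{2} \end{bmatrix}$

$\begin{bmatrix} \ddot{θ_{1}} \\ \ddot{θ_{2}} \end{bmatrix} = - \begin{bmatrix} 30 & - 25 \\ - 20 & 20 \end{bmatrix} \begin{bmatrix} θ_{1} \\ θ_{2} \end{bmatrix}$

Now we have $K$ , we can compute its normal modes.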

Mode 1 $λ_{1}$ :

$λ_{1} = 2.0871$
$ω_{1} = 1.4447$ rad/s
- The system oscillates at a low frequency
$v^{(1)} = [0.6672 0.7449]$
- System oscillates in-phase

Mode 2 $λ_{2}$ :

$λ_{2} = 47.9129$
$ω_{2} = 6.9219$ rad/s
- The system oscillates at a high frequency
$v^{(2)} = [0.8129 \; - 0.5824]^{'}$
- System oscillates out of phase

The oscillation of the overall system will be the superposition of these two modes.

State Space Linear Systems

Consider a second order linear ODE of the form $\ddot{y} (t) + a \dot{y} (t) + b y (t) = 0$ . Two variables are needed to uniquely specify the state of the system at any moment in time, the displacement $x_{1} = y (t)$ , and the velocity $x_{2} = \dot{y} (t)$ . The system can be rewritten in terms of these:

$\dot{x}_{1} = \dot{y} = x_{2}$ $\dot{x}_{2} = \ddot{y} = - a \dot{y} - b y = - a x_{2} - b x_{1}$

$\begin{bmatrix} \dot{x}_{1} \\ \dot{x}_{2} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ - b & - a \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}$

This has replaced a 2nd order scalar equation with a two-state 1st order matrix equation. This concept can be generalised to express an $n$ th order linear ODE as an $n$ -state first order linear matrix ODE:

$\begin{bmatrix} \dot{x}_{1} \\ \dot{x}_{2} \\ \vdots \\ \dot{x}_{n - 1} \\ \dot{x}_{n} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \\ - a_{n} & - a_{n - 1} & - a_{n - 2} & \dots & - a_{1} \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n - 1} \\ x_{n} \end{bmatrix}$

Where the state vector $x (t) = [x_{1} x_{2} \dots x_{n}]^{'}$

$x_{1} = y$
$x_{2} = y^{'} = \dot{x}_{1}$
$x_{3} = y^{''} = \dot{x}_{2}$
$x_{n} = y^{(n - 1)} = \dot{x}_{n - 1}$

Now to work out how to solve it. In the scalar case, the solution to $\dot{x} (t) = a x (t)$ with $x (0) = x_{0}$ has the form

$x (t) = e^{a t} x_{0} = x_{0} (1 + a t + \frac{( a t ) ^{2}}{2 !} + \frac{( a t ) ^{3}}{3 !} + \dots)$

The sign of $a$ determines the stability of the system:

$a$ is negative: the system decays exponentially and is stable
$a$ is zero: nothing ever happens
$a$ is positive: the system rises exponentially and is unstable

The matrix case has the same solution:

$x (t) = e^{A t} x_{0} = (I_{n} + A t + \frac{( A t ) ^{2}}{2 !} + \frac{( A t ) ^{3}}{3 !} + \dots) x_{0}$

The task is then to compute the matrix exponential, and characterise the dynamics of the solution using the matrix $A$ .

Suppose $A$ has the spectral decomposition $A = V Λ V^{- 1}$ :

$\dot{x} (t) = A x (t) = V Λ V^{- 1} x (t)$ $V^{- 1} \dot{x} (t) = Λ V^{- 1} x (t)$

Defining $z (t) = V^{- 1} x (t)$ again:

$\dot{z} (t) = Λ z (t)$

$z (t) = \begin{bmatrix} e^{λ_{1} t} & 0 & \dots & 0 \\ 0 & e^{λ_{2} t} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & e^{λ_{n} t} \end{bmatrix} z (0)$

Since $Λ$ is diagonal, this is now a set of uncoupled equations:

$\dot{z}_{1} (t) = λ_{1} z_{1} (t) ⟹ z_{1} (t) = e^{λ_{1} t} z_{1} (0)$ $\dot{z}_{2} (t) = λ_{2} z_{2} (t) ⟹ z_{2} (t) = e^{λ_{2} t} z_{2} (0)$ $\dot{z}_{n} (t) = λ_{n} z_{n} (t) ⟹ z_{n} (t) = e^{λ_{n} t} z_{n} (0)$

$z_{1} (t), z_{2} (t), \dots, z_{n} (t)$ are the individual modes of the solution $x (t)$ and are defined by the eigenvalues $λ_{1}, λ_{2}, \dots, λ_{n}$ alone. The matrix exponential $e^{A t}$ is given by:

$e^{A t} = V e^{Λ t} V^{- 1} = V diag \{e^{λ_{1} t}, e^{λ_{2} t}, \dots, e^{λ_{n} t}\} V^{- 1}$

Multiplying by the starting state $x (0) = x_{0}$ gives:

$x (t) = e^{A t} x_{0} = V diag \{e^{λ_{1} t}, e^{λ_{2} t}, \dots, e^{λ_{n} t}\} V^{- 1} x_{0}$

The solution is a linear combination of the terms $e^{λ_{1} t}, e^{λ_{2} t}, \dots, e^{λ_{n} t}$
Hence, behaviour is defined by the eigenvalues
The system is stable if all eigenvalues are negative (for complex eigenvalues, if all real parts are negative)
If at least one is positive, the system is unstable

Example

Consider an elementary RLC circuit with all components in series, with a non-zero initial charge $q_{0}$ on the capacitor. The instantaneous charge $q (t)$ in the circuit is described by a linear state space differential equation where $x (t) = [q (t) \; \dot{q} (t)]^{'}$ and $x_{0} = [q_{0} \; 0]^{'}$ . Suppose:

$q_{0} = 6$
$R = 7$
$L = 1$
$C = 0.1$

Find the particular solution for this system and discuss its stability.

The state space equation for the system in the form $\dot{x} (t) = A x (t)$ is:

$\begin{bmatrix} \dot{x}_{1} (t) \\ \dot{x}_{2} (t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ - 10 & - 7 \end{bmatrix} \begin{bmatrix} q (t) \\ \dot{q} (t) \end{bmatrix}$

The eigenvalues and eigenvectors of $A$ are:

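$λ_{1} = - 2$ , $v^{(1)} = \begin{bmatrix} 1 \\ - 2 \end{bmatrix}$

$λ_{2} = - 5$ , $v^{(2)} = \begin{bmatrix} 1 \\ - 5 \end{bmatrix}$

The spectral resolution of $A$ :

$V = \begin{bmatrix} 1 & 1 \\ - 2 & - 5 \end{bmatrix}$

$V^{- 1} = - \frac{1}{3} \begin{bmatrix} - 5 & - 1 \\ 2 & 1 \end{bmatrix}$

$Λ = \begin{bmatrix} - 2 & 0 \\ 0 & - 5 \end{bmatrix}$

$A = - \frac{1}{3} \begin{bmatrix} 1 & 1 \\ - 2 & - 5 \end{bmatrix} \begin{bmatrix} - 2 & 0 \\ 0 & - 5 \end{bmatrix} \begin{bmatrix} - 5 & - 1 \\ 2 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ - 10 & - 7 \end{bmatrix}$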

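The solution is given by $x (t) = e^{A t} x_{0}$ , and the matrix exponential term $e^{A t} = V diag \{e^{λ_{1} t}, e^{λ_{2} t}, \dots, e^{λ_{n} t}\} V^{- 1}$ :

$x (t) = \begin{bmatrix} 1 & 1 \\ - 2 & - 5 \end{bmatrix} \begin{bmatrix} e^{- 2 t} & 0 \\ 0 & e^{- 5 t} \end{bmatrix} (\frac{- 1}{3}) \begin{bmatrix} - 5 & - 1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 6 \\ 0 \end{bmatrix} = \begin{bmatrix} 10 e^{- 2 t} - 4 e^{- 5 t} \\ - 20 e^{- 2 t} + 20 e^{- 5 t} \end{bmatrix}$

Thus the solution (which satisfies $q (0) = 10 - 4 = 6$ and $\dot{q} (0) = 0$ ):

$q (t) = 10 e^{- 2 t} - 4 e^{- 5 t}$ $\dot{q} (t) = - 20 e^{- 2 t} + 20 e^{- 5 t}$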

Also, since both eigenvalues $< 0$ , the system is stable.
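
As a sanity check, the matrix exponential can also be approximated straight from its power series and compared against the closed form $q (t) = 10 e^{- 2 t} - 4 e^{- 5 t}$ , $\dot{q} (t) = - 20 e^{- 2 t} + 20 e^{- 5 t}$ (the form that satisfies $q (0) = 6$ ). This is a rough sketch: `expm_series` and the choice of 60 terms are assumptions made for the example, and a truncated series is not how this would be done for large or stiff systems.

```python
import math


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def expm_series(a, t, terms=60):
    """Truncated power series I + At + (At)^2/2! + ... for a square matrix."""
    n = len(a)
    at = [[a[i][j] * t for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (At)^k / k!
        term = [[v / k for v in row] for row in matmul(term, at)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


A = [[0.0, 1.0], [-10.0, -7.0]]
x0 = [6.0, 0.0]


def state_series(t):
    e = expm_series(A, t)
    return [e[0][0] * x0[0] + e[0][1] * x0[1], e[1][0] * x0[0] + e[1][1] * x0[1]]


def state_closed_form(t):
    q = 10 * math.exp(-2 * t) - 4 * math.exp(-5 * t)
    dq = -20 * math.exp(-2 * t) + 20 * math.exp(-5 * t)
    return [q, dq]


print(state_series(0.5), state_closed_form(0.5))
```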

Differential Matrix Calculus

The Derivative of a Matrix

Consider $x (t)$ where $x = [x_{1} x_{2} ... x_{n}]^{'}$ and $t$ is a scalar. The derivative of $x (t)$ with respect to time $t$ is:

$\frac{d x}{d t} = \begin{bmatrix} \frac{d x _{1}}{d t} \\ \frac{d x _{2}}{d t} \\ \vdots \\ \frac{d x _{n}}{d t} \end{bmatrix}$

The derivative of a matrix with respect to a scalar is just the derivative of all the values. Similarly for an $m \times n$ matrix $X$

$\frac{d X}{d t} = \begin{bmatrix} \frac{d x _{11}}{d t} & \frac{d x _{12}}{d t} & \dots & \frac{d x _{1 n}}{d t} \\ \frac{d x _{21}}{d t} & \frac{d x _{22}}{d t} & \dots & \frac{d x _{2 n}}{d t} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{d x _{m 1}}{d t} & \frac{d x _{m 2}}{d t} & \dots & \frac{d x _{mn}}{d t} \end{bmatrix}$

Vector-Valued Functions

The set of $m$ functions on the same $n$ variables can be represented as a vector-valued function $f$ over the vector $x$

$f (x) = \begin{bmatrix} f_{1} (x_{1}, x_{2}, ..., x_{n}) \\ f_{2} (x_{1}, x_{2}, ..., x_{n}) \\ \vdots \\ f_{m} (x_{1}, x_{2}, ..., x_{n}) \end{bmatrix}$

Each element of the vector $f$ is a function of the $n$ variables $x_{1}, x_{2}, ... x_{n}$

$f$ is an $m \times 1$ vector function over $x$
$x$ is an $n \times 1$ vector

The Matrix Form of the Chain Rule

If $f = f (x)$ and $x = x (θ)$ such that $f = f (x (θ))$ :

$\frac{\partial f}{\partial θ} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial θ}$

This is the same as the scalar case, but note that matrix multiplication is not commutative so the order matters.

The Jacobian Matrix

The derivative of a vector function $f$ with respect to a column vector $x$ is defined formally as the $m \times n$ Jacobian matrix:

$f = \begin{bmatrix} f_{1} \\ f_{2} \\ \vdots \\ f_{m} \end{bmatrix}$ , $x = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}$

$J = \frac{\partial f}{\partial x} = \begin{bmatrix} \frac{\partial f _{1}}{\partial x _{1}} & \frac{\partial f _{1}}{\partial x _{2}} & \dots & \frac{\partial f _{1}}{\partial x _{n}} \\ \frac{\partial f _{2}}{\partial x _{1}} & \frac{\partial f _{2}}{\partial x _{2}} & \dots & \frac{\partial f _{2}}{\partial x _{n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f _{m}}{\partial x _{1}} & \frac{\partial f _{m}}{\partial x _{2}} & \dots & \frac{\partial f _{m}}{\partial x _{n}} \end{bmatrix}$

$J_{ij} = \frac{\partial f _{i}}{\partial x _{j}}$

The Jacobian matrix is the derivative of a multivariate function, representing the best linear approximation to a differentiable function near a point. Geometrically, it defines a tangent plane to the function $f$ at the point $x$

Linearisation of a Matrix Differential Equation

Assume that $x^{*}$ is a stationary point (equilibrium state) of a non-linear system described by a matrix differential equation:

$\dot{x} (t) = f [x (t)]$ , $f [x^{*} (t)] = 0$

$f (x) = \begin{bmatrix} f_{1} (x_{1}, x_{2}, ..., x_{n}) \\ f_{2} (x_{1}, x_{2}, ..., x_{n}) \\ \vdots \\ f_{m} (x_{1}, x_{2}, ..., x_{n}) \end{bmatrix}$

The linearisation of this system is the evaluation of the Jacobian matrix at $x^{*}$ . The linearised equation is $\dot{x} (t) = Ax (t)$ , with the matrix of constants $A$ .

$A = \frac{\partial f}{\partial x}_{x = x^{*}} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1 n} \\ a_{21} & a_{22} & \dots & a_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m 1} & a_{m 2} & \dots & a_{mn} \end{bmatrix}$

Example

Linearise the system around an equilibrium state:

$\frac{d}{d t} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix} = f (x) = \begin{bmatrix} σ (x_{1} - x_{2}) \\ x_{1} (ρ - x_{3}) - x_{2} \\ x_{1} x_{2} - β x_{3} \end{bmatrix}$

$σ$ , $ρ$ , and $β$ are parameters. At its equilibrium, $f (x) = 0$

$f (x) = \begin{bmatrix} σ (x_{1} - x_{2}) \\ x_{1} (ρ - x_{3}) - x_{2} \\ x_{1} x_{2} - β x_{3} \end{bmatrix} = 0$

There are three solutions to this system of algebraic equations, but we're interested in the one at the origin where $[x_{1} x_{2} x_{3}]^{'} = [0 \; 0 \; 0]^{'}$ . Evaluating the Jacobian at this point:

$J = \frac{\partial f}{\partial x} = \begin{bmatrix} \frac{\partial f _{1}}{\partial x _{1}} & \frac{\partial f _{1}}{\partial x _{2}} & \frac{\partial f _{1}}{\partial x _{3}} \\ \frac{\partial f _{2}}{\partial x _{1}} & \frac{\partial f _{2}}{\partial x _{2}} & \frac{\partial f _{2}}{\partial x _{3}} \\ \frac{\partial f _{3}}{\partial x _{1}} & \frac{\partial f _{3}}{\partial x _{2}} & \frac{\partial f _{3}}{\partial x _{3}} \end{bmatrix} = \begin{bmatrix} σ & - σ & 0 \\ ρ - x_{3} & - 1 & - x_{1} \\ x_{2} & x_{1} & - β \end{bmatrix}$

$J (x^{*}) = J (0) = \frac{\partial f}{\partial x}_{x = 0} = \begin{bmatrix} σ & - σ & 0 \\ ρ & - 1 & 0 \\ 0 & 0 & - β \end{bmatrix}$

The linearised equation is therefore:

$\dot{x} = Ax = \begin{bmatrix} σ & - σ & 0 \\ ρ & - 1 & 0 \\ 0 & 0 & - β \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}$
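
The Jacobian at the origin can be cross-checked numerically with central differences. This is only a sketch: the parameter values $σ = 10$ , $ρ = 28$ , $β = 8 / 3$ are sample values chosen for the illustration (they are not given in the notes), and `jacobian` is a made-up helper.

```python
def f(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the example system, with assumed parameter values."""
    x1, x2, x3 = x
    return [sigma * (x1 - x2), x1 * (rho - x3) - x2, x1 * x2 - beta * x3]


def jacobian(func, x, h=1e-6):
    """Central-difference estimate of J[i][j] = d f_i / d x_j at x."""
    n = len(x)
    cols = []
    for j in range(n):
        forward = list(x)
        forward[j] += h
        backward = list(x)
        backward[j] -= h
        f_plus, f_minus = func(forward), func(backward)
        cols.append([(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)])
    # cols[j][i] holds d f_i / d x_j, so transpose to get rows indexed by i.
    return [list(row) for row in zip(*cols)]


J0 = jacobian(f, [0.0, 0.0, 0.0])
for row in J0:
    print(row)
```

With these values the rows should be close to $(σ, - σ, 0)$ , $(ρ, - 1, 0)$ and $(0, 0, - β)$ .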

The Derivative of a Scalar Function With Respect to a Vector

If $Q (x) = Q (x_{1}, x_{2}, ..., x_{n})$ is a scalar quantity that depends on a vector $x$ of $n$ variables, then the derivative of $Q$ with respect to $x$ is a row vector:

$\frac{\partial Q}{\partial x} = [\frac{\partial Q}{\partial x _{1}} \frac{\partial Q}{\partial x _{2}} \dots \frac{\partial Q}{\partial x _{n}}] = \nabla Q$

This is the gradient $grad (Q)$ or nabla ( $\nabla Q$ )

The Derivative of the Quadratic Form

Using an auxiliary result

$α = y^{'} Bx$

$\frac{\partial α}{\partial x} = y^{'} B$

$\frac{\partial α}{\partial y} = x^{'} B^{'}$

We can compute the derivative of a quadratic form $Q = x^{'} Ax$ :

$\frac{\partial Q}{\partial x} = \frac{\partial}{\partial x} (x^{'} Ax) + \frac{\partial}{\partial ( x ^{'} )} (x^{'} Ax) = x^{'} A + x^{'} A^{'} = x^{'} (A + A^{'})$

Since $A$ is symmetric by definition of the quadratic form, $A = A^{'}$ , the derivative of the quadratic form is a row vector:

$\frac{\partial Q}{\partial x} = 2 x^{'} A$

Example

Consider the polynomial $Q (x_{1}, x_{2}) = x_{1}^{2} + 4 x_{1} x_{2} + 2 x_{2}^{2}$ . Find $\nabla Q$ . First putting the equation into quadratic form:

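$Q (x_{1}, x_{2}) = \begin{bmatrix} x_{1} & x_{2} \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}$

The derivative $\frac{\partial Q}{\partial x} = 2 x^{'} A$ :

$\frac{\partial Q}{\partial x} = 2 x^{'} A = 2 \begin{bmatrix} x_{1} & x_{2} \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 2 x_{1} + 4 x_{2} & 4 x_{1} + 4 x_{2} \end{bmatrix}$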
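
A quick way to convince yourself of the $2 x^{'} A$ result is to evaluate it at a point and compare with the partial derivatives of the polynomial. The sketch below assumes the symmetric matrix with rows (1, 2) and (2, 2), which reproduces $x_{1}^{2} + 4 x_{1} x_{2} + 2 x_{2}^{2}$ ; the function names and the test point (1, 2) are arbitrary choices for the example.

```python
def quadratic_form(A, x):
    """Q(x) = x' A x for a square matrix A and vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def gradient(A, x):
    """Row vector 2 x' A, valid when A is symmetric."""
    n = len(x)
    return [2 * sum(x[i] * A[i][j] for i in range(n)) for j in range(n)]


A = [[1, 2], [2, 2]]
x = [1, 2]
print(quadratic_form(A, x))   # x1^2 + 4 x1 x2 + 2 x2^2 at (1, 2)
print(gradient(A, x))         # [2 x1 + 4 x2, 4 x1 + 4 x2] at (1, 2)
```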

Optimisation

Multidimensional Taylor Series

The scalar case of the Taylor series is an expansion of the function $f (x)$ about the point $x^{*} = a$ :

$f (x) = \sum_{n = 0}^{\infty} \frac{1}{n !} f^{(n)} (a) (x - a)^{n}$

This can be generalised to a matrix case. Let $f$ be a scalar function of a column vector $x$ . The Taylor series expansion of $f (x)$ about the point $x^{*} = a$ is:

$f (x) = \sum_{n = 0}^{\infty} \frac{1}{n !} (x - a)^{n} (\partial^{(n)} f) (a)$

The result $f$ is a scalar. Consider the first three terms:

$f (x) \approx f (a) + (\frac{\partial f}{\partial x}_{x = a}) (x - a) + \frac{1}{2 !} (x - a)^{'} (\frac{\partial ^{2} f}{\partial x ^{2}}_{x = a}) (x - a)$

$g_{x = a}$ is the $1 \times n$ gradient row vector evaluated at the point $a$ :

$(\frac{\partial f}{\partial x}_{x = a}) = [\frac{\partial f}{\partial x _{1}}_{x = a} \frac{\partial f}{\partial x _{2}}_{x = a} \dots \frac{\partial f}{\partial x _{n}}_{x = a}]$

$H_{x = a}$ is the $n \times n$ matrix of second derivatives, called the Hessian matrix, evaluated at point $a$

The Hessian matrix is generally symmetric
Matrix of mixed partial derivatives

$H_{x = a} = (\frac{\partial ^{2} f}{\partial x ^{2}}_{x = a}) = [\frac{\partial ^{2} f}{\partial x _{i} \partial x _{j}}_{x = a}]$

Taylor series can be used to approximate multidimensional functions:

Let $S$ be a scalar function of an $n \times 1$ vector $x$ , $S (x)$
Expand $S$ about a point $x$ , assuming displacements $h$ about $x$
The first term is a linear form
Second term a quadratic form
Higher order terms are ignored

$S (x + h) = S (x) + g_{x} h + \frac{1}{2} h^{'} H_{x} h + ...$

Multidimensional Optimisation

Optimisation tasks involve finding $x$ such that $S$ is at an extremum (max/min).

Consider a continuous function $S (x)$ , expanded about the point $x_{0}$ , with a vector $ζ$ as the displacement from $x_{0}$ :

$S (x_{0} + ζ) \approx S (x_{0}) + (\frac{\partial S}{\partial x})^{'} ζ + \frac{1}{2} ζ^{'} (\frac{\partial ^{2} S}{\partial x ^{2}}) ζ = S (x_{0}) + g_{x_{0}}^{'} ζ + \frac{1}{2} ζ^{'} H_{x} ζ$

The point $x_{0}$ is an extremum if the gradient vector $g^{'} (x) = 0$ when $x = x_{0}$ . The homogeneous nonlinear equation $g^{'} (x_{0}) = 0$ therefore defines an extremum $x_{0}$ .

If $g^{'} (x_{0}) = 0$ , then:

$S (x_{0}) + g_{x_{0}}^{'} ζ + \frac{1}{2} ζ^{'} H_{x} ζ ⟹ S (x_{0}) + \frac{1}{2} ζ^{'} H_{x} ζ$

Therefore:

$S (x_{0} + ζ) - S (x_{0}) = \frac{1}{2} ζ^{'} H_{x} ζ$

This is the important result that defines the extremum of a function $S (x)$

To determine the nature of the extremum, the sign of the $ζ^{'} H_{x} ζ$ must be determined. By the spectral resolution, this is determined by the eigenvalues $λ_{i}$ of $H$ . It is said that the sign definiteness of $H$ is determined by $λ_{i}$

Let $Q (x)$ be a quadratic form $Q (x) = ζ^{'} H_{x} ζ$ , where $H$ is a symmetric $n \times n$ matrix. The eigenvalues of $H$ are $λ_{1}, λ_{2}, ..., λ_{n}$ . The definiteness is determined by all of the eigenvalues:

$λ_{i}, i = 1... n$	Definiteness of $H$	Nature of Point $x_{0}$
$> 0$	Positive Definite	Minimum
$\geq 0$	Positive Semidefinite	Probable Valley
$> 0$ and $< 0$	Indefinite	Saddle Point
$\leq 0$	Negative Semidefinite	Probable Ridge
$< 0$	Negative Definite	Maximum
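
The table translates into a small classification routine. This is a sketch: the function name, the returned strings and the tolerance `tol` (used to treat tiny floating point values as zero) are all choices made for this example.

```python
def definiteness(eigenvalues, tol=1e-12):
    """Classify a symmetric matrix from its eigenvalues."""
    positive = any(v > tol for v in eigenvalues)
    negative = any(v < -tol for v in eigenvalues)
    zero = any(abs(v) <= tol for v in eigenvalues)
    if positive and negative:
        return "indefinite"
    if positive:
        return "positive semidefinite" if zero else "positive definite"
    if negative:
        return "negative semidefinite" if zero else "negative definite"
    return "zero matrix"


print(definiteness([2.0, 4.0]))
print(definiteness([0.0, 2.0]))
print(definiteness([3.56, -0.56]))
```

The semidefinite cases are the ones where the eigenvalues alone do not settle the nature of the point, and further analysis is needed.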

Extrema of a Multivariate Quadratic

For a quadratic $Q (x) = x^{'} Ax + b^{'} x + c$ , the extremum is at the point where $\frac{\partial Q}{\partial x} = 0$ :

$\frac{\partial Q}{\partial x} = 2 x^{'} A + b^{'} = 0$

$2 Ax_{0} = - b$

$x_{0} = - \frac{1}{2} A^{- 1} b$

A minimum/maximum exists at the point $x_{0}$ if the matrix $A$ is positive/negative definite.

Example 1

Find the extremum (and its nature) of the quadratic function:

$f (x_{1}, x_{2}) = 4 x_{1}^{2} + 2 x_{1} x_{2} - x_{1} + 2 x_{2} + 1$

Put into matrix form:

$f (x) = x^{'} A x + b^{'} x + 1 = x^{'} \begin{bmatrix} 4 & 1 \\ 1 & 0 \end{bmatrix} x + \begin{bmatrix} - 1 & 2 \end{bmatrix} x + 1$

The extremum of this quadratic form exists at the point $x_{0}$ where:

$x_{0} = - \frac{1}{2} A^{- 1} b$

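$A^{- 1} = \begin{bmatrix} 0 & 1 \\ 1 & - 4 \end{bmatrix}$

$x_{0} = - \frac{1}{2} A^{- 1} b = - \frac{1}{2} \begin{bmatrix} 0 & 1 \\ 1 & - 4 \end{bmatrix} \begin{bmatrix} - 1 \\ 2 \end{bmatrix} = \begin{bmatrix} - 1 \\ 4.5 \end{bmatrix}$

$x_{0} = \begin{bmatrix} - 1 \\ 4.5 \end{bmatrix}$

To determine the nature, find the eigenvalues of $A$ , which are the roots of $λ^{2} - 4 λ - 1 = 0$ :

$λ_{1} = 4.236$ , $λ_{2} = - 0.236$

The eigenvalues lie either side of zero, which makes $A$ indefinite, and the extremum is therefore a saddle point.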
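
The stationary point can be double-checked by solving $2 A x_{0} = - b$ directly and substituting back into the gradient. The sketch below uses Cramer's rule with exact fractions; `solve_2x2` is a helper invented for the example, and the gradient components are written out by hand from $f$ .

```python
from fractions import Fraction


def solve_2x2(m, rhs):
    """Solve m x = rhs for a 2x2 matrix by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [Fraction(rhs[0] * d - b * rhs[1], det),
            Fraction(a * rhs[1] - c * rhs[0], det)]


A = [[4, 1], [1, 0]]
b = [-1, 2]

# Stationary point: 2 A x0 = -b
x0 = solve_2x2([[2 * v for v in row] for row in A], [-v for v in b])

# Gradient of f = 4 x1^2 + 2 x1 x2 - x1 + 2 x2 + 1, evaluated at x0
grad = [8 * x0[0] + 2 * x0[1] - 1, 2 * x0[0] + 2]
print(x0, grad)
```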

Example 2

Find the stationary points (and their nature) for the function:

$f (x) = f (x_{1}, x_{2}) = x_{1}^{2} + 3 x_{1} x_{2}^{2} - x_{2}^{3} + 4$

The stationary points lie where $g (x_{0}) = 0$ :

$g (x) = \begin{bmatrix} \frac{\partial f}{\partial x _{1}} & \frac{\partial f}{\partial x _{2}} \end{bmatrix} = \begin{bmatrix} 2 x_{1} + 3 x_{2}^{2} & 6 x_{1} x_{2} - 3 x_{2}^{2} \end{bmatrix}$

The solutions are therefore:

$\begin{bmatrix} 2 x_{1} + 3 x_{2}^{2} & 6 x_{1} x_{2} - 3 x_{2}^{2} \end{bmatrix} = 0$

$2 x_{1} + 3 x_{2}^{2} = 0$ $6 x_{1} x_{2} - 3 x_{2}^{2} = 0$

Two solutions:

$x_{0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ , $x_{0} = \begin{bmatrix} - 1/6 \\ - 1/3 \end{bmatrix}$

To determine the nature of the extremum, we need the Hessian matrix:

$H (x) = \begin{bmatrix} \frac{\partial ^{2} f}{\partial x _{1}^{2}} & \frac{\partial ^{2} f}{\partial x _{1} \partial x _{2}} \\ \frac{\partial ^{2} f}{\partial x _{2} \partial x _{1}} & \frac{\partial ^{2} f}{\partial x _{2}^{2}} \end{bmatrix} = \begin{bmatrix} 2 & 6 x_{2} \\ 6 x_{2} & 6 x_{1} - 6 x_{2} \end{bmatrix}$

The eigenvalues at each point will give the nature. For $x_{0} = [0 \; 0]^{'}$ :

$λ_{1} = 0$ , $λ_{2} = 2$

The Hessian matrix is positive semidefinite, so the point is probably a valley, but further analysis is required to determine the nature of the point.

For $x_{0} = [- 1/6 \; - 1/3]^{'}$ :

$λ_{1} = 3.56$ , $λ_{2} = - 0.56$

The hessian matrix is indefinite, so the point is a saddle point.

Fourier Series and Transforms

Fourier Series

Fourier series provide a way of representing any periodic function as a sum of trigonometric functions. For a periodic function $f (t)$ with period $2 L$ , the Fourier series is given by:

$f (t) = \frac{a _{0}}{2} + \sum_{n = 1}^{\infty} a_{n} cos \frac{nπ t}{L} + \sum_{n = 1}^{\infty} b_{n} sin \frac{nπ t}{L}$

Where the coefficients $a_{n}$ and $b_{n}$ are called the Fourier coefficients, integrals calculated over the period of the function:

$a_{0} = \frac{1}{L} \int_{- L}^{L} f (t) d t$ $a_{n} = \frac{1}{L} \int_{- L}^{L} f (t) cos \frac{nπ t}{L} d t$ $b_{n} = \frac{1}{L} \int_{- L}^{L} f (t) sin \frac{nπ t}{L} d t$
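
When the integrals are awkward by hand, the coefficients can be estimated numerically. The sketch below uses a simple midpoint rule; the function name, the number of samples, and the test signal $f (t) = t$ on $[- π, π]$ are assumptions for the illustration (for that signal the exact values are $a_{n} = 0$ and $b_{n} = 2 (- 1)^{n + 1} / n$ ).

```python
import math


def fourier_coefficients(f, L, n, samples=20000):
    """Approximate a_n and b_n over [-L, L] with the midpoint rule."""
    dt = 2 * L / samples
    a_n = b_n = 0.0
    for k in range(samples):
        t = -L + (k + 0.5) * dt
        a_n += f(t) * math.cos(n * math.pi * t / L) * dt
        b_n += f(t) * math.sin(n * math.pi * t / L) * dt
    return a_n / L, b_n / L


for n in (1, 2, 3):
    print(n, fourier_coefficients(lambda t: t, math.pi, n))
```

Because $f (t) = t$ is odd, the cosine coefficients come out (numerically) zero, in line with the note on odd functions.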

Note that if the function is even $f (t) = f (- t)$ , then the $b_{n}$ term is always 0, and the series is comprised of cosine terms only:

$f (t) = \frac{a _{0}}{2} + n = 1 \sum \infty a_{n} cos \frac{nπ t}{L}$

Likewise for odd functions $f (t) = - f (- t)$ , the $a_{n}$ term is always zero, and the series is comprised of sine terms only:

$f (t) = n = 1 \sum \infty b_{n} sin \frac{nπ t}{L}$

The Fourier series uniquely represents a function if:

The integral of the function over its period is finite
The function has a finite number of discontinuities over any finite interval

Most (if not all) functions/signals of any engineering interest will satisfy these conditions.

Exponential Representation

The Fourier series can be rewritten using Euler's formula $e^{j θ} = cos θ + j sin θ$ :

$f (t) = \sum_{n = - \infty}^{\infty} c_{n} e^{j 2 n π t / T}$

$c_{n} = \frac{1}{T} \int_{0}^{T} f (t) e^{- j 2 n π t / T} d t \quad \text{for } n = 0, \pm 1, \pm 2, \pm 3, \dots$

Note that T = 2L, the period of the function.

Frequency Spectrum Representation

The spectrum representation gives the magnitude $A_{n}$ and phase $ϕ_{n}$ of the harmonic components defined by the frequencies $f_{n}$ contained in a signal $f (t)$ :

$f (t) = \frac{a_{0}}{2} + \sum_{n = 1}^{\infty} A_{n} \sin (2 π f_{n} t + ϕ_{n}) = \frac{a_{0}}{2} + \sum_{n = 1}^{\infty} a_{n} \cos \frac{n π t}{L} + \sum_{n = 1}^{\infty} b_{n} \sin \frac{n π t}{L}$

$A_{n} = \sqrt{a_{n}^{2} + b_{n}^{2}}, \quad ϕ_{n} = \tan^{- 1} \frac{a_{n}}{b_{n}}, \quad f_{n} = \frac{n}{2 L}$

This gives two spectra:

The frequency spectrum, describing the magnitude $A_{n}$ for each frequency $f_{n}$ present in the signal
The phase spectrum, describing the phase $ϕ_{n}$ for each frequency $f_{n}$ present in the signal
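As a rough sketch of how one harmonic's magnitude and phase follow from its coefficients, the snippet below converts a pair $(a_{n}, b_{n})$ into $(A_{n}, ϕ_{n})$. The function names are made up for illustration, and `math.atan2` is used in place of a plain inverse tangent (an assumption, so that $b_{n} = 0$ and the quadrant are handled).

```python
import math

def amplitude_phase(a_n, b_n):
    """Return (A_n, phi_n) so that a_n*cos(x) + b_n*sin(x) == A_n*sin(x + phi_n)."""
    amplitude = math.hypot(a_n, b_n)   # sqrt(a_n^2 + b_n^2)
    phase = math.atan2(a_n, b_n)       # quadrant-aware tan^-1(a_n / b_n)
    return amplitude, phase

def harmonic_frequency(n, L):
    """Frequency of the n-th harmonic for a function of period 2L."""
    return n / (2.0 * L)

A, phi = amplitude_phase(3.0, 4.0)
x = 0.7
print(A, phi)
print(3.0 * math.cos(x) + 4.0 * math.sin(x), A * math.sin(x + phi))
```

The last line prints both sides of the identity at an arbitrary point so they can be compared by eye.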

The diagram below shows the frequency spectrum for the functions $f (t) = 3 sin 5 t$ and $f (t) = 7 sin 6 t - 2 sin 3 t$ , respectively:

Example

Find the Fourier series of the following function:

$f (t) = \begin{cases} - 1 & - π \leq t \leq 0 \\ 1 & 0 \leq t \leq π \end{cases}$

$f (t)$ is an odd function with period $2 π$ ( $L = π$ ), hence we only need the $b_{n}$ integral:

$b_{n} = \frac{1}{L} \int_{- L}^{L} f (t) \sin \frac{n π t}{L} d t = \frac{1}{π} (\int_{- π}^{0} - \sin n t \, d t + \int_{0}^{π} \sin n t \, d t)$

$= \frac{1}{π} ([\frac{1}{n} - \frac{1}{n} \cos (- n π)] + [- \frac{1}{n} \cos (n π) + \frac{1}{n}]) = \frac{2}{n π} (1 - \cos (n π))$

Since $cos nπ = (- 1)^{n}$ :

$b_{n} = \frac{2}{nπ} (1 - (- 1)^{n})$

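$b_{n}$ is zero for even $n$ and $\frac{4}{n π}$ for odd $n$ , so we can introduce a new index $k$ , such that $n = 2 k - 1$ , and keep only the odd terms:

$b_{k} = \frac{4}{(2 k - 1) π} \quad \text{for } k = 1, 2, 3, \dots$

The Fourier series for $f (t)$ is therefore given by:

$f (t) = \frac{4}{π} \sum_{k = 1}^{\infty} \frac{1}{2 k - 1} \sin ((2 k - 1) t)$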
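To see the series converge, the partial sums can be evaluated directly. This is only a sketch: the function name and the numbers of terms are arbitrary choices, not part of the module material.

```python
import math

def square_wave_partial_sum(t, terms):
    """Sum the first `terms` odd harmonics of (4/pi) * sum sin((2k-1)t) / (2k-1)."""
    total = 0.0
    for k in range(1, terms + 1):
        n = 2 * k - 1
        total += math.sin(n * t) / n
    return 4.0 / math.pi * total

# Evaluate in the middle of the positive half period, where f(t) = 1
for terms in (1, 10, 100, 1000):
    print(terms, square_wave_partial_sum(math.pi / 2, terms))
```

The printed values should settle towards 1 as more terms are included, and towards -1 if $t$ is negated.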

Fourier Transforms

Fourier series give a representation of periodic signals, but non-periodic signals cannot be analysed in the same way. The Fourier transform works by replacing a sum of discrete sinusoids with a continuous integral of sinusoids over frequency, transforming from the time domain to the frequency domain. A non-periodic function $f (t)$ can be expressed as:

$f (t) = \int_{0}^{\infty} [A (ω) \cos ω t + B (ω) \sin ω t] d ω$

$A (ω) = \frac{1}{π} \int_{- \infty}^{\infty} f (t) \cos ω t \, d t$

$B (ω) = \frac{1}{π} \int_{- \infty}^{\infty} f (t) \sin ω t \, d t$

Provided that:

$f (t)$ and $f^{'} (t)$ are piecewise continuous in every finite interval
$\int_{- \infty}^{\infty} ∣ f (t) ∣ d t$ exists

This can also be expressed in complex notation:

$f (t) = \frac{1}{2 π} \int_{- \infty}^{\infty} F (ω) e^{jω t} d ω$

$F (ω) = \int_{- \infty}^{\infty} f (t) e^{- jω t} d t$

$F (ω)$ is the Fourier transform of $f (t)$ , denoted $\mathcal{F} \{f (t)\}$
$f (t)$ is the inverse Fourier transform of $F (ω)$ , denoted $\mathcal{F}^{- 1} \{F (ω)\}$

For periodic signals:

Fourier series break a signal down into components with discrete frequencies
- Amplitude and phase of components can be calculated from coefficients
- Plots of amplitude and phase against frequency give frequency spectrum of a signal
- The spectrum is discrete for periodic signals

For non-periodic signals:

Fourier Transforms represent a signal as a continuous integral over a range of frequencies
- The frequency spectrum of the signal is continuous rather than discrete
- $∣ F (ω) ∣$ gives the spectrum amplitude
- $\arg (F (ω))$ gives the spectrum phase

Fourier Transform Properties

Fourier transforms are linear, the same as z and Laplace transforms.

Time Shift

For any constant $a$ :

$\mathcal{F} \{f (t - a)\} = e^{- j ω a} F (ω)$

If the original function $f (t)$ is shifted in time by a constant amount, this does not affect the magnitude of its frequency spectrum $F (ω)$ . Since the complex exponential always has a magnitude of 1, the time delay alters the phase of $F (ω)$ but not its magnitude.
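A short sketch of why the time shift property holds, starting from the definition of the transform and substituting $τ = t - a$ ( $τ$ is just a dummy variable introduced here):

$\mathcal{F} \{f (t - a)\} = \int_{- \infty}^{\infty} f (t - a) e^{- j ω t} d t = \int_{- \infty}^{\infty} f (τ) e^{- j ω (τ + a)} d τ$

$= e^{- j ω a} \int_{- \infty}^{\infty} f (τ) e^{- j ω τ} d τ = e^{- j ω a} F (ω)$

Taking magnitudes, $∣ e^{- j ω a} F (ω) ∣ = ∣ e^{- j ω a} ∣ ∣ F (ω) ∣ = ∣ F (ω) ∣$ , so only the phase changes.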

Frequency Shift

For any constant $a$ :

$\mathcal{F} \{e^{j a t} f (t)\} = F (ω - a)$

Example

Find the Fourier integral representation of

$f (t) = \begin{cases} 1 & - 1 \leq t \leq 1 \\ 0 & ∣ t ∣ > 1 \end{cases}$

$F (ω) = \int_{- \infty}^{\infty} f (t) e^{- jω t} d t = \int_{- 1}^{1} (1) e^{- jω t} d t = [\frac{e ^{- jω t}}{- jω}]_{- 1}^{1}$

$F (ω) = \frac{e^{- j ω} - e^{j ω}}{- j ω} = \frac{e^{j ω} - e^{- j ω}}{j ω}$

This is the Fourier transform of $f (t)$ . Using Euler's relation $\sin θ = \frac{e^{j θ} - e^{- j θ}}{2 j}$ :

$F (ω) = \frac{e^{j ω} - e^{- j ω}}{j ω} = \frac{2 \sin ω}{ω}$

Therefore, the integral representation is:

$f (t) = \frac{1}{2 π} \int_{- \infty}^{\infty} \frac{2 sin ω}{ω} e^{jω t} d ω$
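The result $\frac{2 \sin ω}{ω}$ can be sanity-checked by integrating numerically. The sketch below uses a plain midpoint rule with an arbitrary step count; the function names are illustrative only.

```python
import cmath
import math

def rect_transform_numeric(omega, steps=20000):
    """Midpoint-rule estimate of the integral of exp(-j*omega*t) over -1 <= t <= 1."""
    dt = 2.0 / steps
    total = 0j
    for i in range(steps):
        t = -1.0 + (i + 0.5) * dt
        total += cmath.exp(-1j * omega * t) * dt
    return total

def rect_transform_exact(omega):
    """Closed form 2*sin(omega)/omega from the worked example (omega != 0)."""
    return 2.0 * math.sin(omega) / omega

for omega in (0.5, 1.0, 3.0):
    print(omega, rect_transform_numeric(omega), rect_transform_exact(omega))
```

The imaginary part of the numeric result should be negligible, since the pulse is even in $t$.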

Z Transforms

Difference Equations

A difference equation is a discrete equivalent of a differential equation, used in situations where only discrete values can be measured:

$\frac{d ^{2} x}{d t ^{2}} + 5 \frac{d x}{d t} + 2 x = 3 t$

becomes

$x [n + 2] + 5 x [n + 1] + 2 x [n] = 3 n, \quad n = 0, 1, 2, \dots$

These can be solved numerically by just evaluating the output for each value of n. For example:

$y [n] - 0.5 y [n - 1] = x [n]$

$x [n] = \begin{cases} 0 & n < 0 \\ 1 & n \geq 0 \end{cases}$

Taking $y [- 1] = 0$ , this evaluates to:

$y [0] = 0.5 y [- 1] + x [0] = 0.5 (0) + 1 = 1$

$y [1] = 0.5 y [0] + x [1] = 0.5 (1) + 1 = 1.5$

$y [2] = 0.5 y [1] + x [2] = 0.5 (1.5) + 1 = 1.75$

$\dots$
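The same step-by-step evaluation is easy to automate. A minimal sketch, assuming a zero initial condition by default (the function names are made up for illustration):

```python
def unit_step(n):
    """x[n] = 1 for n >= 0, else 0."""
    return 1.0 if n >= 0 else 0.0

def solve_difference_equation(count, y_before=0.0):
    """Iterate y[n] = 0.5*y[n-1] + x[n] for n = 0..count-1, starting from y[-1] = y_before."""
    outputs = []
    previous = y_before
    for n in range(count):
        current = 0.5 * previous + unit_step(n)
        outputs.append(current)
        previous = current
    return outputs

print(solve_difference_equation(5))
```

The first three values reproduce the hand calculation, and later values approach 2.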

Alternatively, there is an analytical solution...

The z Transform

Consider a discrete sequence $f [k], k \in \mathbb{N}$ . The z transform of this sequence is defined as:

$F (z) = \mathcal{Z} \{f [k]\} = \sum_{k = 0}^{\infty} f [k] z^{- k} = f [0] + \frac{f [1]}{z} + \frac{f [2]}{z^{2}} + \frac{f [3]}{z^{3}} + \dots$

A closed-form expression can generally be found by the sum of the infinite series. For example, the z transform of the unit step $f [k] = 1$ :

$\sum_{k = 0}^{\infty} f [k] z^{- k} = 1 + \frac{1}{z} + \frac{1}{z^{2}} + \dots$

This is a geometric series with $a = 1$ , $r = z^{- 1}$ , hence the sum is $\frac{a}{1 - r}$ :

$F (z) = \frac{1}{1 - z^{- 1}} = \frac{z}{z - 1} \quad \text{for } ∣ z ∣ > 1$
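A quick numerical check of the closed form, as a sketch: truncate the series after a finite number of terms and compare with $\frac{z}{z - 1}$ at a test point with $∣ z ∣ > 1$ . The test value of $z$ and the number of terms are arbitrary.

```python
def unit_step_z_partial(z, terms):
    """Partial sum of z**(-k) for k = 0..terms-1 (unit step sequence f[k] = 1)."""
    return sum(z ** (-k) for k in range(terms))

def unit_step_z_closed(z):
    """Closed-form z transform of the unit step, valid for |z| > 1."""
    return z / (z - 1)

z = 2.0
print(unit_step_z_partial(z, 40), unit_step_z_closed(z))
```

For $∣ z ∣ \leq 1$ the partial sums do not settle, which is why the region of convergence is stated alongside the result.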

Taking the z transform of a difference equation converts it to a function of the continuous variable $z$ . The z domain is similar to the Laplace domain, but for discrete time signals instead.

Common z Transforms

z Transform Properties

z transforms are linear, the same as Laplace and Fourier transforms.

First Shift Theorem

If $f [k]$ is a sequence and $F (z)$ its transform, then

$\mathcal{Z} \{f [k + i]\} = z^{i} F (z) - (z^{i} f [0] + z^{i - 1} f [1] + \dots + z f [i - 1])$

For example, if $i = 1$ :

$\mathcal{Z} \{f [k + 1]\} = z F (z) - z f [0]$

For $i = 2$ :

$\mathcal{Z} \{f [k + 2]\} = z^{2} F (z) - z^{2} f [0] - z f [1]$
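A sketch of where the $i = 1$ case comes from, working from the definition of the z transform and re-indexing the sum with $m = k + 1$ ( $m$ is a dummy index introduced here):

$\mathcal{Z} \{f [k + 1]\} = \sum_{k = 0}^{\infty} f [k + 1] z^{- k} = z \sum_{k = 0}^{\infty} f [k + 1] z^{- (k + 1)} = z \sum_{m = 1}^{\infty} f [m] z^{- m}$

$= z (\sum_{m = 0}^{\infty} f [m] z^{- m} - f [0]) = z F (z) - z f [0]$

Repeating the same step $i$ times gives the general form.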

Second Shift Theorem

The function $f (t) u (t)$ is defined:

$f (t) u (t) = \begin{cases} f (t) & t \geq 0 \\ 0 & t < 0 \end{cases}$

Where $u (t)$ is the unit step function. The function $f (t - i T) u (t - i T)$ , where $i$ is a positive integer, represents a shift to the right of this function by $i$ sample intervals. If this shifted function is sampled, we have $f [k - i] u [k - i]$ . The second shift theorem states:

$\mathcal{Z} \{f [k - i] u [k - i]\} = z^{- i} F (z)$

Inverse z Transforms

z transforms are inverted using lookup tables, but to get them into a recognisable form, some manipulation is often needed, including partial fractions. For example, finding the inverse transform of $F (z) = \frac{z + 3}{z - 2}$ :

$F (z) = \frac{z + 3}{z - 2} = \frac{z}{z - 2} + \frac{3}{z - 2}$

The first term can be seen immediately from the table:

$\mathcal{Z}^{- 1} \{\frac{z}{z - 2}\} = 2^{k}$

The second term rearranges to give:

$\frac{3}{z - 2} = \frac{3}{z} \frac{z}{z - 2} = 3 z^{- 1} \frac{z}{z - 2}$

This is in the form of the second shift theorem, so this can be applied to give:

$\mathcal{Z} \{3 (2)^{k - 1} u [k - 1]\} = 3 z^{- 1} \frac{z}{z - 2}$

Thus,

$\mathcal{Z}^{- 1} \{\frac{z + 3}{z - 2}\} = 2^{k} + 3 (2)^{k - 1} u [k - 1]$
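One way to check an inverse transform is to push the sequence back through the definition of the z transform. The sketch below does this numerically at an arbitrary test point inside the region of convergence ( $∣ z ∣ > 2$ ); the function names are illustrative.

```python
def inverse_sequence(k):
    """f[k] = 2**k + 3 * 2**(k-1) * u[k-1], the claimed inverse of (z + 3)/(z - 2)."""
    shifted = 3.0 * 2.0 ** (k - 1) if k >= 1 else 0.0
    return 2.0 ** k + shifted

def z_transform_partial(sequence, z, terms):
    """Truncated z transform: sum of sequence(k) * z**(-k) for k = 0..terms-1."""
    return sum(sequence(k) * z ** (-k) for k in range(terms))

z = 4.0
print([inverse_sequence(k) for k in range(4)])
print(z_transform_partial(inverse_sequence, z, 80), (z + 3) / (z - 2))
```

The two numbers on the last line should agree closely if the inverse is right.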

Example

Solve $y [k + 2] - 5 y [k + 1] + 6 y [k] = 0$ , where $y [0] = 0$ , $y [1] = 2$ .

Taking z transforms of both sides:

$\mathcal{Z} \{y [k + 2]\} - 5 \mathcal{Z} \{y [k + 1]\} + 6 \mathcal{Z} \{y [k]\} = 0$

Applying the first shift theorem:

$z^{2} Y (z) - z^{2} y [0] - z y [1] - 5 z Y (z) + 5 z y [0] + 6 Y (z) = 0$

Rearranging and using initial conditions:

$(z^{2} - 5 z + 6) Y (z) = 2 z$

$Y (z) = \frac{2 z}{z ^{2} - 5 z + 6}$

Using partial fractions:

$Y (z) = \frac{2 z}{z - 3} - \frac{2 z}{z - 2}$

Using inverse transforms straight from the table to get the solution:

$y [k] = 2 \times 3^{k} - 2 \times 2^{k}$
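The closed-form solution can be compared against direct iteration of the difference equation. A small sketch (function names are made up for illustration):

```python
def solve_recurrence(count):
    """Iterate y[k+2] = 5*y[k+1] - 6*y[k] from y[0] = 0, y[1] = 2."""
    values = [0, 2]
    while len(values) < count:
        values.append(5 * values[-1] - 6 * values[-2])
    return values[:count]

def closed_form(k):
    """y[k] = 2*3**k - 2*2**k from the inverse z transform."""
    return 2 * 3 ** k - 2 * 2 ** k

print(solve_recurrence(6))
print([closed_form(k) for k in range(6)])
```

Both lines should print the same sequence.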

Partial Differential Equations

PDEs are used to model many kinds of problems. Their solutions give the evolution of a function $y (x, t)$ as a function of time and space. Boundary conditions involving time and space are used as initial conditions.

A method of separation of variables is used for solving them, where it is assumed that $V (x, y) = X (x) Y (y)$ . Two other auxiliary ODE results are also needed:

$\frac{d ^{2} y}{d x ^{2}} = a^{2} y ⟹ y = A_{1} e^{a x} + A_{2} e^{- a x}$

$\frac{d^{2} y}{d x^{2}} = - a^{2} y ⟹ y = B_{1} \sin a x + B_{2} \cos a x$

Another auxiliary ODE result is needed for some situations:

$\frac{d y}{d x} = - a y ⟹ y = C_{1} e^{- a x}$

The general process for solving PDEs:

Apply separation of variables
Make an appropriate choice of constant $\pm μ^{2}$
- Nearly always $- μ^{2}$
Solve resulting ODEs
Combine ODE solutions to form general PDE solution
Apply boundary conditions to obtain particular PDE solution
- Work out values for the arbitrary constants

Laplace's Equation

Laplace's equation describes many problems involving flow in a plane:

$\frac{\partial ^{2} V ( x , y )}{\partial x ^{2}} + \frac{\partial ^{2} V ( x , y )}{\partial y ^{2}} = 0$

Find the solution with the following boundary conditions:

$V (0, y) = 0$ and $V (π, y) = 0$
$V (x, y) \to 0$ as $y \to \infty$
$V (x, 0) = 10 sin x + 7 sin 3 x$

Starting with separation of variables:

$\frac{\partial V ( x , y )}{\partial x} = Y (y) \frac{d X ( x )}{d x} \Rightarrow \frac{\partial ^{2} V ( x , y )}{\partial x ^{2}} = Y (y) \frac{d ^{2} X ( x )}{d x ^{2}}$

$\frac{\partial V ( x , y )}{\partial y} = X (x) \frac{d Y ( y )}{d y} \Rightarrow \frac{\partial ^{2} V ( x , y )}{\partial y ^{2}} = X (x) \frac{d ^{2} Y ( y )}{d y ^{2}}$

Substituting back into the original PDE:

$Y (y) \frac{d ^{2} X ( x )}{d x ^{2}} + X (x) \frac{d ^{2} Y ( y )}{d y ^{2}} = 0$

$\frac{1}{X ( x )} \frac{d ^{2} X ( x )}{d x ^{2}} = - \frac{1}{Y ( y )} \frac{d ^{2} Y ( y )}{d y ^{2}}$

We have separated the variables: the left side is a function of $x$ only and the right side is a function of $y$ only. The only circumstances under which the two sides can be equal for all values of $x$ and $y$ is if both sides are independent of $x$ and $y$ and equal to a constant. Since the constant is arbitrary, let it be $μ^{2}$ . Now we have two ODEs and their solutions from the auxiliary results earlier:

$\frac{1}{X (x)} \frac{d^{2} X (x)}{d x^{2}} = μ^{2}, \quad - \frac{1}{Y (y)} \frac{d^{2} Y (y)}{d y^{2}} = μ^{2}$

$\frac{d^{2} X (x)}{d x^{2}} = μ^{2} X (x), \quad \frac{d^{2} Y (y)}{d y^{2}} = - μ^{2} Y (y)$

$X (x) = A_{1} e^{μ x} + A_{2} e^{- μ x}, \quad Y (y) = B_{1} \sin μ y + B_{2} \cos μ y$

Substituting the solutions back into $V (x, y) = X (x) Y (y)$ , we have a general solution to our PDE in terms of 4 arbitrary constants:

$V (x, y) = (A_{1} e^{μx} + A_{2} e^{- μx}) (B_{1} sin μ y + B_{2} cos μ y)$

We can now apply boundary conditions:

Substituting in $V (0, y) = 0$ gives $A_{1} + A_{2} = 0$
Substituting in $V (π, y) = 0$ gives $A_{1} e^{μ π} + A_{2} e^{- μ π} = 0$
Using the two together gives $A_{1} (1 - e^{- 2 μ π}) = 0$ , so either:
- $A_{1} = 0$
- $(1 - e^{- 2 μ π}) = 0$
If $A_{1} = 0$ , then $A_{2} = 0$ , so $V (x, y) = 0$
- This is the trivial solution and is of no interest
If $(1 - e^{- 2 μ π}) = 0$ , then $μ = 0$
- This also implies that $V (x, y) = 0$ , so is useless too

The issue is that we selected our arbitrary constant badly. If we use $- μ^{2}$ instead, then our solutions are the other way round:

$X (x) = A_{1} \sin μ x + A_{2} \cos μ x, \quad Y (y) = B_{1} e^{μ y} + B_{2} e^{- μ y}$

Checking the boundary conditions again:

First condition, $V (0, y) = 0$
- Gives $A_{2} = 0$
Second condition, $V (π, y) = 0$
- Gives $A_{1} sin μ π = 0$
  - Either $A_{1} = 0$ (not interested)
  - $sin μ π = 0$
- $μ$ is an integer, $n$

We now have:

$V (x, y) = (A_{1} sin n x) (B_{1} e^{n y} + B_{2} e^{- n y})$

Where $n$ is any integer. Using the other boundary conditions:

$V (x, y) \to 0$ as $y \to \infty$
If $n$ is positive, then $B_{1} = 0$ (otherwise $B_{1} e^{n y} \to \infty$ )
If $n$ is negative, then $B_{2} = 0$ (otherwise $B_{2} e^{- n y} \to \infty$ )

Taking $n$ as positive, the form of the solutions is:

$V (x, y) = C_{1} e^{- y} \sin x \quad (C_{1} = A_{1} B_{2} \text{ for } n = 1)$

$V (x, y) = C_{2} e^{- 2 y} \sin 2 x \quad (C_{2} = A_{1} B_{2} \text{ for } n = 2)$

$V (x, y) = C_{n} e^{- n y} \sin n x \quad (C_{n} = A_{1} B_{2} \text{ for } n)$

The most general form is the sum of these:

$V (x, y) = \sum_{n = 1}^{\infty} C_{n} e^{- n y} \sin n x$

Applying the final boundary condition: $V (x, 0) = 10 sin x + 7 sin 3 x$

$C_{1} = 10$
$C_{3} = 7$
$C_{n} = 0$ for all other $n$

The complete solution is therefore:

$V (x, y) = 10 e^{- y} sin x + 7 e^{- 3 y} sin 3 x$
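A final answer like this can be checked by substituting it back into the PDE and the boundary conditions. The sketch below does so numerically with central differences; the step size and sample points are arbitrary choices.

```python
import math

def V(x, y):
    """Particular solution V = 10 e^{-y} sin x + 7 e^{-3y} sin 3x."""
    return 10.0 * math.exp(-y) * math.sin(x) + 7.0 * math.exp(-3.0 * y) * math.sin(3.0 * x)

def laplacian(func, x, y, h=1e-3):
    """Central-difference estimate of d2V/dx2 + d2V/dy2."""
    d2x = (func(x + h, y) - 2.0 * func(x, y) + func(x - h, y)) / h ** 2
    d2y = (func(x, y + h) - 2.0 * func(x, y) + func(x, y - h)) / h ** 2
    return d2x + d2y

print(laplacian(V, 1.0, 0.5))                                  # PDE residual, should be near zero
print(V(0.0, 1.0), V(math.pi, 1.0))                            # side boundaries
print(V(0.7, 0.0), 10.0 * math.sin(0.7) + 7.0 * math.sin(2.1))  # bottom boundary
print(V(0.7, 50.0))                                            # decay as y grows
```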

The Heat Equation

The heat equation describes diffusion of energy or matter. With a diffusion coefficient $c$ :

$\frac{\partial ^{2} u ( x , t )}{\partial x ^{2}} = \frac{1}{c ^{2}} \frac{\partial u ( x , t )}{\partial t}$

Solving with the following boundary conditions:

$u (0, t) = 0$
$u (2, t) = 0$
$u (x, 0) = 10$

Separating variables, $u (x, t) = X (x) T (t)$ , and substituting, exactly the same as Laplace's equation, we have:

$\frac{1}{X ( x )} \frac{d ^{2} X ( x )}{d x ^{2}} = \frac{1}{c ^{2} T ( t )} \frac{d T ( t )}{d t}$

Setting both sides again equal to a constant $- μ^{2}$ , we have two ODEs (one 2nd order, one 1st):

$\frac{d ^{2} X ( x )}{d x ^{2}} = - μ^{2} X (x) \Rightarrow X (x) = A cos μx + B sin μx$

$\frac{d T ( t )}{d t} = - μ^{2} c^{2} T (t) \Rightarrow T (t) = C e^{- μ^{2} c^{2} t}$

The general solution is therefore:

$u (x, t) = (A \cos μ x + B \sin μ x) C e^{- μ^{2} c^{2} t}$

Tidying up a bit, let $D = A C$ , $E = BC$ , $λ = μ c$ :

$u (x, t) = (D \cos \frac{λ}{c} x + E \sin \frac{λ}{c} x) e^{- λ^{2} t}$

Applying the first boundary condition:

$u (0, t) = 0$
Gives $0 = D e^{- λ^{2} t}$
Since $e^{- λ^{2} t} \neq 0$ for all $t$ , $D = 0$

We now have $u (x, t) = E \sin (\frac{λ x}{c}) e^{- λ^{2} t}$ . The second boundary condition:

$u (2, t) = 0$ , so $0 = E \sin (\frac{2 λ}{c}) e^{- λ^{2} t}$
For the non-trivial solution $E \neq 0$ , and since $e^{- λ^{2} t} \neq 0$ , $\sin \frac{2 λ}{c} = 0$ , so $\frac{2 λ}{c} = n π$
Therefore, $λ = \frac{n π c}{2}$ for $n = 1, 2, 3, ...$

Substituting this in gives:

$u (x, t) = E \sin (\frac{n π x}{2}) e^{\frac{- n^{2} π^{2} c^{2} t}{4}} \quad \text{for } n = 1, 2, 3, \dots$

The above equation is valid for any $n = 1, 2, 3, ...$ , so summing these gives the most general solution:

$u (x, t) = \sum_{n = 1}^{\infty} E_{n} \sin (\frac{n π x}{2}) e^{\frac{- n^{2} π^{2} c^{2} t}{4}}$

The last boundary condition is $u (x, 0) = 10$ :

$10 = \sum_{n = 1}^{\infty} E_{n} \sin \frac{n π x}{2}$

This is in the form of a Fourier sine series on $0 \leq x \leq L$ , here with $L = 2$ :

$f (x) = \sum_{n = 1}^{\infty} b_{n} \sin \frac{n π x}{L}, \quad b_{n} = \frac{2}{L} \int_{0}^{L} f (x) \sin \frac{n π x}{L} d x$

We have:

$10 = \sum_{n = 1}^{\infty} E_{n} \sin \frac{n π x}{2}, \quad E_{n} = \frac{2}{2} \int_{0}^{2} 10 \sin \frac{n π x}{2} d x$

$\int_{0}^{2} 10 \sin \frac{n π x}{2} d x = 10 [\frac{- 2}{n π} \cos \frac{n π x}{2}]_{0}^{2} = \frac{- 20}{n π} [(- 1)^{n} - 1]$

$E_{n} = \begin{cases} 0 & n \text{ even} \\ \frac{40}{n π} & n \text{ odd} \end{cases}$

Substituting this into $u$ , and letting $n = 2 k - 1$ :

$u (x, t) = \frac{40}{π} \sum_{k = 1}^{\infty} \frac{1}{2 k - 1} \sin (\frac{(2 k - 1) π x}{2}) e^{- \frac{(2 k - 1)^{2}}{4} π^{2} c^{2} t}$
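The series solution can be explored numerically by truncating the sum. In the sketch below the value `c = 1.0` and the number of terms are arbitrary assumptions, chosen only to illustrate the behaviour.

```python
import math

def heat_solution(x, t, c=1.0, terms=2000):
    """Partial sum of the series solution; c = 1.0 is an arbitrary illustrative value."""
    total = 0.0
    for k in range(1, terms + 1):
        n = 2 * k - 1
        decay = math.exp(-(n ** 2) * math.pi ** 2 * c ** 2 * t / 4.0)
        total += math.sin(n * math.pi * x / 2.0) * decay / n
    return 40.0 / math.pi * total

print(heat_solution(1.0, 0.0))                           # initial condition at the midpoint
print(heat_solution(0.0, 0.1), heat_solution(2.0, 0.1))  # both ends are held at zero
print(heat_solution(1.0, 0.1), heat_solution(1.0, 1.0))  # midpoint value decays with time
```

At $t = 0$ the sum converges slowly (it is the square-wave series again), so many terms are needed to get close to 10.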

The Wave Equation

The wave equation is used to describe vibrational problems:

$\frac{\partial ^{2} y ( x , t )}{\partial x ^{2}} = c^{2} \frac{\partial ^{2} y ( x , t )}{\partial t ^{2}}$

Solving the equation with the boundary conditions:

$y (0, t) = 0$
$y (l, t) = 0$
$\frac{\partial y (x, t)}{\partial t} = 0$ at $t = 0$
$y (x, 0) = 12 sin \frac{3 π x}{l}$

Doing the usual separation of variables and substitution, and choosing a constant $- μ^{2}$ :

$\frac{1}{X ( x )} \frac{d ^{2} X ( x )}{d x ^{2}} = \frac{c ^{2}}{T ( t )} \frac{d ^{2} T ( t )}{d t ^{2}} = - μ^{2}$

Solving both ODEs:

$\frac{d ^{2} X ( x )}{d x ^{2}} = - μ^{2} X (x) \Rightarrow X (x) = A cos μx + B sin μx$

$\frac{d^{2} T (t)}{d t^{2}} = \frac{- μ^{2}}{c^{2}} T (t) \Rightarrow T (t) = C \cos \frac{μ}{c} t + D \sin \frac{μ}{c} t$

$y (x, t) = (A cos μx + B sin μx) (C cos \frac{μ}{c} t + D sin \frac{μ}{c} t)$

This is the general solution. Start applying boundary conditions:

$y (0, t) = 0$ implies that $0 = A (C \cos \frac{μ}{c} t + D \sin \frac{μ}{c} t)$
- As this is true for all $t$ , $A = 0$
$y (l, t) = 0$ implies that $0 = B \sin μ l (C \cos \frac{μ}{c} t + D \sin \frac{μ}{c} t)$
- This is also true for all $t$ , so $B \sin μ l = 0$
- Required that $B \neq 0$ , so $0 = \sin μ l$
- $μ = \frac{n π}{l}$ for $n = 1, 2, 3, ...$

We now have:

$y (x, t) = B \sin \frac{n π x}{l} (C \cos \frac{μ}{c} t + D \sin \frac{μ}{c} t) \quad \text{for } n = 1, 2, 3, \dots$

Applying the third boundary condition, $\frac{\partial y (x, t)}{\partial t} = 0$ at $t = 0$ :

$\frac{\partial y (x, t)}{\partial t} = B \sin \frac{n π x}{l} (- C \frac{μ}{c} \sin \frac{μ}{c} t + D \frac{μ}{c} \cos \frac{μ}{c} t)$

$\frac{\partial y (x, 0)}{\partial t} = 0 = B \sin \frac{n π x}{l} (D \frac{μ}{c})$

This must hold for all $x$ , and $B \sin \frac{n π x}{l}$ is not zero everywhere, so $D = 0$ . Letting $E = BC$ and substituting $μ = \frac{n π}{l}$ , we now have:

$y (x, t) = E \sin \frac{n π x}{l} \cos \frac{n π t}{c l} \quad \text{for } n = 1, 2, 3, \dots$

The general solution is then:

$y (x, t) = \sum_{n = 1}^{\infty} E_{n} \sin \frac{n π x}{l} \cos \frac{n π t}{c l}$

Applying the final boundary condition of $y (x, 0) = 12 sin \frac{3 π x}{l}$ , gives $E_{3} = 12$ , else $E_{n} = 0$ . The particular solution is therefore:

$y (x, t) = 12 sin \frac{3 π x}{l} cos \frac{3 π t}{c l}$
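As with Laplace's equation, the particular solution can be substituted back into the PDE in the form used in these notes ( $\frac{\partial^{2} y}{\partial x^{2}} = c^{2} \frac{\partial^{2} y}{\partial t^{2}}$ ) as a check. The values of `c` and `l` below are arbitrary test values, and the finite-difference step is an assumption.

```python
import math

def wave_solution(x, t, c=2.0, l=1.0):
    """y = 12 sin(3 pi x / l) cos(3 pi t / (c l)); c and l are arbitrary test values."""
    return 12.0 * math.sin(3.0 * math.pi * x / l) * math.cos(3.0 * math.pi * t / (c * l))

def pde_residual(x, t, c=2.0, l=1.0, h=1e-4):
    """Central-difference estimate of d2y/dx2 - c^2 * d2y/dt2, which should vanish."""
    def y(a, b):
        return wave_solution(a, b, c, l)
    d2x = (y(x + h, t) - 2.0 * y(x, t) + y(x - h, t)) / h ** 2
    d2t = (y(x, t + h) - 2.0 * y(x, t) + y(x, t - h)) / h ** 2
    return d2x - c ** 2 * d2t

print(pde_residual(0.3, 0.4))                         # small compared with the size of d2y/dx2
print(wave_solution(0.0, 0.4), wave_solution(1.0, 0.4))  # fixed ends
print(wave_solution(0.3, 0.0), 12.0 * math.sin(0.9 * math.pi))  # initial shape
```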

ES2E3

Flash cards:

The ever so kind Aaron has made some flashcards. They are somewhat brief and don't cover everything, but they are better than nothing :)

Click here for quizlet.

Logic

Whilst it's only recapped in some of the lectures, this module assumes knowledge of logic from the engineering module, which the computer science module Computer Organisation and Architecture (CS132) also covers.

A fair bit of this information is already in CS132 Logic Page. There are some engineering specific things, and stuff that's just handy to have on one page.

Boolean Algebra Laws

There are several laws of boolean algebra which can be used to simplify logic expressions:

Name	AND form	OR form
Identity Law	$1 A = A$	$0 + A = A$
Null Law	$0 A = 0$	$1 + A = 1$
Idempotent Law	$AA = A$	$A + A = A$
Inverse Law	$A \bar{A} = 0$	$A + \bar{A} = 1$
Commutative Law	$A B = B A$	$A + B = B + A$
Associative Law	$(A B) C = A (BC) = A BC$	$(A + B) + C = A + (B + C) = A + B + C$
Distributive Law	$A (B + C) = A B + A C$	$A + BC = (A + B) (A + C)$
Absorption Law	$A (A + B) = A$	$A + A B = A$
De Morgan's Law	$\overline{A \cdot B} = \bar{A} + \bar{B}$	$\overline{A + B} = \bar{A} \cdot \bar{B}$

Can go from AND to OR form (and vice versa) by swapping AND for OR, and 0 for 1

Most are fairly intuitive, but some less so. The important ones to remember are:

$A + BC = (A + B) (A + C)$
$A (B + C) = A B + A C$
$A (A + B) = A$
$A + A B = A$
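With only three variables, identities like these can be confirmed by brute force over every input combination. A minimal sketch, using Python's `&` and `|` on 0/1 values to stand in for AND and OR (the function name is made up):

```python
from itertools import product

def check_laws():
    """Exhaustively test the four identities for every 0/1 assignment of A, B, C."""
    for a, b, c in product((0, 1), repeat=3):
        assert (a | (b & c)) == ((a | b) & (a | c))   # A + BC = (A + B)(A + C)
        assert (a & (b | c)) == ((a & b) | (a & c))   # A(B + C) = AB + AC
        assert (a & (a | b)) == a                     # A(A + B) = A
        assert (a | (a & b)) == a                     # A + AB = A
    return True

print(check_laws())
```

The same exhaustive approach works for any of the laws in the table, since each variable only takes two values.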

Latches

SR Latch

When $S$ is asserted, $Q$ goes high.
When $R$ is asserted, $Q$ goes low.
When both are de-asserted (low and low), $Q$ holds its value
When both are asserted (high and high), $Q$ and $\overline{Q}$ both go low (not intended!)

D latch

Passes through the $D$ input whenever $C L K$ is high, and holds its value when $C L K$ is low.

D Flip Flop

Will copy the $D$ input to the $Q$ output at rising edges of $C L K$ . Bit storage.

Hardware Description Languages

So far, we've been restricted to describing circuits using equations and diagrams. Diagrams can convey structure, but behaviour can be hard to see and they become unwieldy as they grow. HDLs are languages that describe hardware with a hierarchical design.

History

Programmable Array Logic (PAL) allows for implementing sum of products logic, building circuits by blowing fuses in certain places.
- This got cumbersome for larger circuits
PALASM developed as a language for mapping functional specifications to PAL
Other languages developed around this concept, all with the idea of introducing more layers of abstraction
There are two main languages in use today:
- Verilog
- VHDL
Verilog started as a proprietary language, released to the public in 1991 and standardised in 1995 by the IEEE
- Standard revised in 2001
SystemVerilog is an extension of Verilog with more capabilities

Verilog

Verilog designs are broken down into modules
- A module is an encapsulation of a unit of functionality
- Good designs have appropriate levels of hierarchy
- At each level, modules below are treated as black boxes
Modules are declared using the module keyword and a list of ports
- Can indicate the direction of the port as input or output
- endmodule indicates the end of a module
Identifiers are the names of modules, signals, ports, etc
- Must start with a letter, and can't clash with keywords
Wires can be declared within modules using the wire keyword
Verilog is case sensitive, ignores whitespace and uses C-style //comments

Structural Verilog Design

Circuits are described structurally, by the structure of their constituent parts
Primitives are included for all basic gates:
- and(x, a, b) is equivalent to $x = a \cdot b$
- or(z, a, b, c, d) is equivalent to $z = a + b + c + d$
- Arguments are either ports or wires declared within module

Consider an and-or inverter: $o u t = ((a \cdot b) + (c \cdot d))^{'}$

module andorinv (input a, b, c, d, output out);
    wire and1out, and2out;
    and (and1out, a, b);
    and (and2out, c, d);
    nor(out, and1out, and2out);
endmodule

Note the two internal wires being used here. Gates can also be given identifiers, which helps with testing and readability:

module andorinv (input a, b, c, d, output out);
    wire and1out, and2out;
    and g1(and1out, a, b);
    and g2(and2out, c, d);
    nor g3(out, and1out, and2out);
endmodule

The order of statements in Verilog is irrelevant, as each statement describes a piece of hardware, so there is no sequence of steps, unlike when writing procedural code.

It is also important to obey the usual connection rules for combinational circuits:

Every node of the circuit is either an input, or connects to exactly one output terminal of a gate
The same wire cannot be driven by multiple gates
There can be no cycles in the circuit

Structural Verilog

Have seen how to write Verilog for combinational modules consisting of gates
Each time we use a gate, we are creating an instance of that gate connected to the wires in the brackets
This concept extends to all Verilog modules

Binary Adder

A half adder takes two 1-bit inputs and generates a sum and a carry out:

A	B	sum	carry
0	0	0	0
0	1	1	0
1	0	1	0
1	1	0	1

Can see there are two gates in this design:

Sum is an XOR
Carry is an AND

Can express in verilog as follows:

module add_half(input  a, b,        // two inputs
                output sum, carry); // two outputs

  xor g1 (sum, a, b);   // xor gate for sum output
  and g2 (carry, a, b); // and gate for carry output

endmodule

Full Adder

A full adder is similar but accepts a carry in to chain carries out

Cin	A	B	Cout	Sum
0	0	0	0	0
0	0	1	0	1
0	1	0	0	1
0	1	1	1	0
1	0	0	0	1
1	0	1	1	0
1	1	0	1	0
1	1	1	1	1

Can see that this is made using half adders:

Structural verilog allows for building modules from other modules to create a hierarchy. Can instantiate our half adder module twice to reuse it in our full adder module to create a hierarchical design.

module full_add(input a, b, Cin,
                output sum, Cout);

  wire w1, w2, w3;

  //instance of add_half
  add_half m1 (a, b, w1, w2);
  add_half m2 (Cin,w1,sum,w3);
  or(Cout,w2,w3);


endmodule

Instantiation in Verilog

Instantiate a module by invoking its name and then naming that instance
Example above creates two add_halfs named m1 and m2
Connects the signals and ports referenced in the parentheses with the corresponding ports of the instantiated module
- Same as gate modules
Order of signals determines connections
This is error prone, as you have to remember the order of the ports
If the port specification is changed, the instantiation has to change too
Should always instead use a named connection:

add_half m1 (.a(a), .b(b), .sum(w1), .carry(w2));

The port name of the instantiated module is preceded with a dot ., and the signal it connects to in the enclosing module is given in brackets.

Assign Statements

Verilog has assign statements to express combinational logic

assign result = a & b;

This is called a continuous assignment: it allows us to assign the result of a boolean expression to a signal. There is a range of bitwise operators:

Operator	Function
`&`	`AND`
`|`	`OR`
`~`	`NOT`
`^`	`XOR`
`~&`	`NAND`
`~|`	`NOR`

Here is the full adder from earlier using assign statements instead of gates. There is no need to describe the structure in terms of gates, only logic functions. As with gate instances, the order of assign statements is irrelevant.

module full_add(input a, b, Cin,
                output sum, Cout);

  assign sum = a ^ b ^ Cin;
  assign Cout = (a & b) | (b & Cin) | (a & Cin);


endmodule
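The structural and assign-based descriptions should describe the same hardware. As a rough software check (a Python model, not a Verilog testbench), the sketch below mirrors both versions of full_add and compares them over all eight input combinations. The Python function names are made up for illustration.

```python
from itertools import product

def half_adder(a, b):
    """Mirror of add_half: returns (sum, carry)."""
    return a ^ b, a & b

def full_adder_structural(a, b, cin):
    """Mirror of the structural full_add: two half adders and an OR gate."""
    w1, w2 = half_adder(a, b)
    total, w3 = half_adder(cin, w1)
    return total, w2 | w3

def full_adder_dataflow(a, b, cin):
    """Mirror of the assign-statement full_add: returns (sum, Cout)."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)

for a, b, cin in product((0, 1), repeat=3):
    assert full_adder_structural(a, b, cin) == full_adder_dataflow(a, b, cin)
    total, cout = full_adder_dataflow(a, b, cin)
    print(cin, a, b, cout, total)
```

Each printed row follows the column order of the truth table above (Cin, A, B, Cout, Sum).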

It is also possible to assign implicitly in a wire declaration:

wire y;
assign y = (a & b) ^ c;
// equivalent to
wire y = (a & b) ^ c;

User-Defined Primitives

Verilog also allows you to create your own primitive modules which are defined using a truth table (though this isn't used much).

Can only have one output and it must be the first port
? signifies a don't-care condition

primitive mux_prim(output mux_out,
                   input select, a, b);

  table
  // select a b : mux_out
      0     0 ? : 0;
      0     1 ? : 1;
      1     ? 0 : 0;
      1     ? 1 : 1;
      ?     0 0 : 0;
      ?     1 1 : 1;
  endtable

endprimitive

Conditional Assignment

It is possible to have conditional assignment. The output is assigned to one of two possible expressions, dependent upon a condition:

// a multiplexer
assign y = sel ? x1 : x0;

The signal y will be connected to x1 if sel is 1, else it will be connected to x0.

Multi-bit Signals

Verilog supports multi-bit signals, called vectors or buses. A signal is declared as a bus by specifying a range:

wire [31:0] databus; //32-bit bus

//ports can also be multiple bits wide
module add16(input [15:0] a, b,
             output [15:0] sum,
             output cout);

By convention, ranges are specified [MSB:LSB], meaning a 16-bit signal is [15:0]. The range is specified preceding the signal name.

Numeric Literals

Literals use the format <size>'<radix><value>

size is the width of the number in bits
radix is binary (b), decimal (d), octal (o) or hexadecimal (h)
4'b0000
- 4 binary bits 0000
8'h4F
- 8 bit wide hex number 4F
8'b0100_1111
- 8 bit wide binary number
- Underscores can split long strings
1'b1
- A single 1 bit

Working with Vectors

When using the vector name, all the bits are being operated on. Logic operations performed on vectors are bitwise.

wire [3:0] a = 4'b0110;
wire [3:0] b = 4'b1010;

wire [3:0] x = a & b;
wire [3:0] y = a ^ b;

Can access parts of a vector by specifying a range after the signal name

assign y = some[3];
- Assign 4th bit of signal some to y
assign z = some[4:3];
- Creates two bit signal z from 5th/4th bit of some

The widths of vectors in assignments should match. Verilog doesn't check and will let you do:

assign x[2:0] = y[1];
assign x[2:1] = a;

This is probably not what you wanted to do. Always check widths and remember that LSB is 0.

Combinational Arithmetic

Verilog supports basic arithmetic and comparison:

Arithmetic +, -, *, /
Comparison <, >, <=, >=, ==, !=
- Return 1 for true and 0 for false

assign sum = a+b;
assign diff = curr - prev;
assign max = (a > b) ? a : b;

Vectors are all treated as unsigned numbers

Parameters

Constants that are local to a module that can be optionally redefined on an instance-by-instance basis.

module some_mod #(parameter SIZE = 8)
                 (input  [SIZE-1:0] X, Y,
                  output [SIZE-1:0] Z);

When module is instantiated parameters can be changed. The module above is instantiated twice below, but each instance is 16 bits:

module some_other_mod(input [15:0] a, b, c, output [15:0] D, E);

some_mod #(.SIZE(16)) U1 (.X(a), .Y(b), .Z(D));
some_mod #(.SIZE(16)) U2 (.X(c), .Y(b), .Z(E));
endmodule

Concatenation and Replication

Signals can be concatenated into a single signal using brace syntax.

//b is 8 bit
assign b = {a[3:0], 4'b0000};

wire [3:0] a, b;
wire [7:0] y;

//join two 4 bit signals to create 8 bit bus
assign y = {a,b};

Signals can also be replicated with a preceding constant count. The replication needs its own pair of braces when it is used inside a concatenation.

//c is also 8 bit
assign c = {{4{a[3]}}, a[3:0]};
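Replicating the top bit of a and concatenating it with a[3:0] is sign extension from 4 to 8 bits, while concatenating a[3:0] with four zero bits is a shift left by four. The Python model below is only a sketch of what the two concatenations compute (the function names are made up, and this is not Verilog):

```python
def sign_extend_4_to_8(a):
    """Model of replicating a[3] four times and concatenating with a[3:0]."""
    low = a & 0xF                 # a[3:0]
    msb = (low >> 3) & 1          # a[3]
    high = 0xF if msb else 0x0    # four copies of a[3]
    return (high << 4) | low      # high nibble first, then low nibble

def pad_low_nibble(a):
    """Model of concatenating a[3:0] with 4'b0000."""
    return (a & 0xF) << 4

for value in (0b0110, 0b1010):
    print(format(value, "04b"), format(sign_extend_4_to_8(value), "08b"),
          format(pad_low_nibble(value), "08b"))
```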

Example: 2-bit comparator

A verilog module to compare two 2-bit signals a [1:0] and b [1:0]

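module comp_2bit (input [1:0] a, b, output a_gt_b);

// could write out the gate-level logic by hand:
// assign a_gt_b = /* complex combinational logic */;

// alternatively, use a comparison and let the tools work it out
assign a_gt_b = (a > b);

endmodule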

Behavioural Verilog

Rather than describe how the circuit is constructed or its raw function, describe how it behaves
Implementation tools work out how to make hardware that fulfils the behaviour, considering the target architecture

The `always` block

An always block contains procedural statements that describe the behaviour of the required hardware.

always @ (a,b)
    begin
        x = a & b;
        y = a | b;
    end

The always keyword starts a block
The sensitivity list (in brackets after the @) contains the names of any signals that affect the block's output
- The block is sensitive to a and b
- Signals the circuit should respond to
- Shorthand always @ * includes all signals in sensitivity list
Procedural statements between begin and end
Give a more readable description of logic by describing how the output should change.
assign keyword not used - always block is an alternative to using it

`reg` signals

Since we are modelling at a higher level of abstraction, we use something other than wires
Signals assigned to from within always blocks must be declared as of type reg
A reg is like a wire but can only be assigned to from within an always block
- A wire is a connection between components and does not have its own value
Cannot assign to a reg using an assign statement or use it to connect to the output of a module
If you want to assign to an output port from inside an always block, it must be declared as reg in the module header too

The following two are functionally equivalent:


// x and y must be reg
always @ *
begin
  x = a & b;
  y = a | b;
end

//and

assign x = a & b;
assign y = a | b;

`if` Statements

Allows to describe a combinational circuit at a higher level of abstraction

always @ *
begin
  if (x < 6)
    alarm = 1'b0;
  else
    alarm = 1'b1;
end

Each branch can have more than one statement
Use begin and end the same as braces in C
if statements can be nested inside one another
Condition can be anything that evaluates to a boolean value
Can use comparisons and equality operators
Can combine conditions with logical operators !, &&, ||

`case` Statements

Verilog features case statements that let us choose from multiple possibilities, similar to C.

always @ *
case (sel)
  2'b00 : y = a;
  2'b01 : y = b;
  2'b10 : y = c;
  default: y = 4'b1010;
endcase

A decoder is a good use case for a case statement

module decoder3_8(input [2:0] ival, output reg [7:0] d_out);
always @ *
  case(ival)
    3'b000 : d_out = 8'b00000001;
    3'b001 : d_out = 8'b00000010;
    //etc...
    3'b111 : d_out = 8'b10000000;
  endcase
endmodule

Can also describe a multiplexer:

module mux4 (input [3:0] d, input [1:0] sel, output reg q);
  always @ * begin
    case (sel)
      2'b00 : q = d[0];
      2'b01 : q = d[1];
      2'b10 : q = d[2];
      2'b11 : q = d[3];
    endcase
  end
endmodule

Can assign to multiple signals from inside one always block
If you assign to a signal from inside an always block, must never do so anywhere else
- Using assign
- In another always block
- Like connecting multiple outputs to the same wire: not allowed
Order matters in an always block as we are describing behaviour
If a signal is assigned to more than once, the last one takes precedence

Avoiding latches

always @ *
begin
  if (valid) begin
    x = a | b;
    y = c;
  end
  else
    x = a;
end

What happens to y in the else branch?
No value for y is explicitly specified
- y latches on previous value
- Not ideal
All outputs from the always block must be assigned to in all circumstances
An output not being assigned to implies it should be latched or stored
If no output is specified, output is no longer combinational
Compiler would understand it to be a latch

A way to avoid this is to always use a default assignment at the top of the always block. The default will be overwritten by any subsequent assignments

always @ * begin
  y = x;
  if(valid) begin
    c = a | b;
    y = z;
  end
  else begin
    c = a;
    // y is x here
  end
end

Must always include every signal the block reads in the sensitivity list (always @ * does this automatically)
Must assign to an output signal in all possible cases
- This is to maintain combinational logic

FPGA Design Flow

How do you go from HDL to a circuit? The key development was tools that could take HDL and generate a circuit automatically. Design flow is the process by which we specify and design a system all the way through to implementation.

The Design Process

Design is always informed by a specification
- What does the circuit do?
- What I/O does it need?
- Performance requirements
- Space/Power budget
- Edge cases
This is the most important stage as it defines what the design will be verified against
- Also influences the choice of target architecture
- Errors in interpretation of the specification/requirements can cause issues
Next is design entry
- Writing the actual HDL files
- Modules and sub-modules are defined
- I/O is defined
There are two main aspects to architecture design
- The datapath, logic that acts on data to compute the required functions
- The control path, logic that manages the movement of data and controls the datapath
Functional verification is performed throughout the design process
- It is important to verify that the HDL meets the specification
- Usually start at the lowest level modules and move up
- This is an iterative process, and testing should be continuous
Synthesis is the process by which the design is converted into circuits
- Lots of optimisation goes on at this stage
- Combinational logic is minimised
- Arithmetic operators are expanded into primitive operations
- Basic structures like memories, multiplexers, decoders are inferred
The result of synthesis is a netlist, a low-level representation of the circuit using basic blocks
Mapping takes the netlist and works out how to build it on the target architecture
- For ASIC design, each node in the netlist is mapped to cells from a cell library
- For FPGA design, each node is mapped to resources available on the FPGA
  - Maps combinational logic to LUTs
  - Synchronous components mapped to flip-flops
  - Arithmetic mapped to ALUs or DSPs
- This gives an architecture-specific netlist
Synthesis verification checks the circuit is valid
- Checks circuit fits on FPGA
- Estimates timing, power usage, performance
Place and Route is when the netlist is mapped onto specific locations on the FPGA, and routing is configured to connect the blocks
- Often multiple iterations are needed to get it right
Timing verification checks timing constraints have been met
The bitstream is a file that is loaded onto the FPGA that tells it how to configure itself

Intellectual Property

IP cores are premade designs
Implementing complex hardware is pointless when you can re-use other modules
Similar to software libraries
Lots of open source cores are available online
Vendors and FPGA companies sell IPs
Good IP works as a black-box
- Well-defined
- Configurable
- Thoroughly tested and verified
- Provided with data sheets like any other piece of hardware

FPGA Architecture

Anything written in HDL will probably eventually end up as a real circuit that the mapping tool has to generate from basic components.

In ASICs, CMOS is the most common technology, however:
- Fabrication is complex and expensive
- Designs are inflexible
- High start-up costs
FPGAs are more attractive because they are cheaper and more flexible
ASICs can be cheaper for large volumes however, so there is a cost tradeoff

Understanding FPGA architecture gives us a better understanding of the mapping process to make our circuits easier to map, and to make most efficient use of FPGA resources. There are primarily four distinct types of resources on an FPGA:

Flexible logic: basic configurable blocks to implement combinational logic, coupled with clocked elements to enable synchronous logic and pipelining
Flexible routing: significant chip area dedicated to wires and switch boxes that enable connections between all components
Flexible I/O: multi-standard interfacing to external pins, with a range of speed capabilities
Embedded hard modules: a number of different resources optimised for speed and area, including DSPs and memories

Logic Blocks

Logic blocks do most of the computation on FPGAs
Made up of basic elements called Configurable Logic Blocks (CLBs), which consist of:
- A LUT to implement combinational functions
- Some arithmetic logic
- Flip-flops
LUTs and flip-flops can be used together or independently
Most FPGAs are built using SRAM technology
- An n-input LUT is just a $2^{n}$ x 1-bit memory
- The truth table for the function is stored in the LUT
- When an input pattern is applied, the bit at the corresponding location is the output
The propagation delay through a LUT is independent of the function it computes
LUTs can be broken down for smaller functions or combined for larger functions
LUTs are grouped together in groups of 4 to form a slice
- Slices also contain clocked elements, ALUs, etc
- Multiple slices form CLBs
LUTs can also be used as mini-memories to form distributed RAM
Each 6-input LUT can also implement a 32-bit shift register, without using the flip-flops in the slice
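
The idea that an n-input LUT is just a small memory holding a truth table can be sketched as a software model. This is an illustrative Python sketch rather than anything vendor-specific: the names make_lut and lut_read are hypothetical, and the first input is assumed to be the most significant address bit.

```python
from itertools import product

def make_lut(func, n):
    """Build the 2^n-entry truth table for an n-input boolean function."""
    table = []
    for bits in product([0, 1], repeat=n):   # bits[0] is the most significant address bit
        table.append(func(*bits) & 1)
    return table

def lut_read(table, *inputs):
    """The inputs form the memory address; the stored bit is the output."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return table[addr]

# Example function: y = abc + a'bc' + ab'c', which fits in one 3-input LUT
lut = make_lut(
    lambda a, b, c: (a & b & c) | ((1 - a) & b & (1 - c)) | (a & (1 - b) & (1 - c)),
    3,
)
print(len(lut))                 # number of stored bits for 3 inputs
print(lut_read(lut, 1, 1, 1))   # output for a=1, b=1, c=1
```

Reading the table takes the same steps whatever function was stored, which mirrors why the delay through a LUT does not depend on the function.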

Routing

There is a large grid of wires throughout the FPGA
Connection boxes allow different elements to connect to this network
Switch boxes allow tracks to connect to each other
Place & route tools work out how to most efficiently make these connections
Routing is a key factor in the performance of a design
- Longer wires = higher latency
- Dedicated wires between blocks exist and are faster and save the general routing for other uses
- The individual bits of multi-bit wide signals may take different routes
As architectures evolve, connectivity keeps improving
- A mix of wire lengths helps improve performance

I/O

A key feature of FPGAs is highly flexible I/O.

Individual groups of pins can be interfaced according to different standards
High end FPGAs include high-speed serial interfaces
- Support for 10GigE, SATA, PCIe
- IP blocks included to configure these
On modern FPGAs, rates of over 32 Gb/s can be achieved

Block Memory

LUTs can implement very small memories, but hard blocks of synchronous memories are also included as block RAMs.

36Kb, and can be split into two 18Kb blocks
Can run at well over 500MHz
Support different sizes and configurations
All the features of a high-end memory system

DSP

FPGAs excel in Digital Signal Processing applications, so modern FPGAs include hard DSP blocks.

Usable for any multiply/add/accumulate operations
Highly parallel dataflow arrangement
Much faster than LUTs

DSP blocks are highly configurable

Configurable number of pipeline stages
Dynamically configurable ALU function
Dynamically configurable bypass for pre-adder and multiplier
Can cascade signals for combining DSP blocks

Synthesis and mapping tools work out how best to utilise all the resources most efficiently, but Verilog should always be written to optimise for and take advantage of the target architecture.

Sequential Verilog

We can design combinational circuits using
- Gate-level structural design
- Assign statements
- Behavioural always blocks
Important to note that the circuits are purely combinational in all of these cases
It is possible to design sequential circuits
- Most designs will be synchronous: synced with a clock

Latches

SR Latch

Two inputs
Two outputs
Two NOR gates

module srlatch(input R, S,
             output Q, Qbar);

nor N1 (Q, R, Qbar);
nor N2 (Qbar, S, Q);

// Alternatively, replace the two gate instances above with:
// assign Q = ~(R | Qbar);
// assign Qbar = ~(S | Q);

endmodule

D Latch

A D latch is level-sensitive: it is gated by an enable input, where an SR latch isn't:


module dlatch(input EN, D,
              output reg Q, Qbar);

always @ (D, EN)
    if(EN) begin
        Q <= D;
        Qbar <= ~D;
    end
endmodule

D goes to Q if enable is high: circuit is described succinctly.

Generally, FPGA designs will be synchronous as it allows us to more easily understand the timing of the circuit. Most of the logic we will look at will be edge-triggered, which is described as follows:

module simplereg(input d, clk, output reg q);
    always @ (posedge clk)
        q <= d;
endmodule

The posedge keyword can be used in a sensitivity list to define a trigger on the rising edge of a clock (negedge is also available for the falling edge). In this case, a simple register is created. A multi-bit register/flip-flop is defined below:

module simplereg(input [3:0] d, input clk, output reg [3:0] q);

    always @ (posedge clk)
        q <= d;
endmodule

Clocks and Reset

All circuits should be synchronised based on the same clock signal
Clock can be named whatever (usually clk) and defined as an input to the module
We often need to reset the contents of a register or state of circuit to 0/a default
- Two types of reset:
  - Asynchronous: whenever the reset input is asserted, the reset is triggered
  - Synchronous: if the reset is asserted on the rising edge, reset is triggered
- In modern FPGA design, we use synchronous reset

An 8 bit register with synchronous reset:


module reg8bit(input [7:0] d,
               input clk, rst,
               output reg [7:0] q);

always @ (posedge clk) begin
    if(rst)
        q <= 8'b00000000;
    else
        q <= d;
end
endmodule

For an asynchronous reset, the reset signal is added to the sensitivity list so that the block can be triggered independently of the clock. However, this will desynchronise the always block from the rest of the circuit, so it is not the preferred way to do it.


module reg8bit(input [7:0] d,
               input clk, rst,
               output reg [7:0] q);

always @ (posedge clk or posedge rst) begin
    if(rst)
        q <= 8'b00000000;
    else
        q <= d;
end
endmodule

Registers

Can control multiple registers from the same block. Each assignment in a synchronous always block creates a register controlled by the same block. This verilog module contains 3 8-bit registers.

module multireg(input [7:0] a, b, c,
                input clk, rst,
                output reg [7:0] q, r, s);

always @ (posedge clk) begin
    if(rst) begin
        q <= 0;
        r <= 0;
        s <= 0;
    end
    else begin
        q <= a;
        r <= b;
        s <= c;
    end
end

endmodule

When drawing block diagrams, clock and reset can be omitted as they should always be there
Putting a triangle on an input in a block diagram shows that the input is edge-triggered

Non-Blocking assignment

The <= operator is called non-blocking assignment
For combinational always blocks, we use blocking assignment (=), and order matters
For a synchronous block, order does not matter
- Everything only happens on the rising edge
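
The difference between the two assignment types can be illustrated with a small software model of one rising clock edge. This is a simplified Python sketch of the semantics, not of how a simulator is actually implemented; the register names q1, q2, q3 and q are assumptions borrowed from a four-stage shift register.

```python
def clock_edge_nonblocking(state, y):
    """Model of q1 <= y; q2 <= q1; q3 <= q2; q <= q3; on one rising edge."""
    old = dict(state)                 # every right-hand side reads the pre-edge values
    return {"q1": y, "q2": old["q1"], "q3": old["q2"], "q": old["q3"]}

def clock_edge_blocking(state, y):
    """Model of the same statements written with blocking assignment (=)."""
    s = dict(state)
    s["q1"] = y                       # each statement sees the updates made before it
    s["q2"] = s["q1"]
    s["q3"] = s["q2"]
    s["q"] = s["q3"]
    return s

start = {"q1": 0, "q2": 0, "q3": 0, "q": 0}
print(clock_edge_nonblocking(start, 1))   # only q1 changes: the value shifts one stage
print(clock_edge_blocking(start, 1))      # y falls straight through to q
```

With non-blocking assignment the result is the same whatever order the statements are written in, because every right-hand side is sampled before any register updates.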

Counters

A register where the value increments on the rising edge (or decrements if down signal is asserted).

module simplecount(input clk, rst, down, output reg [3:0] q);
    always @ (posedge clk) begin
        if(rst)
            q <= 4'b0000;
        else
            if(down)
                q <= q - 1'b1;
            else
                q <= q + 1'b1;
    end
endmodule

Can alter to include an enable signal. Since it's a synchronous component, we don't need to account for all branches: when no branch assigns to q, the register simply holds its value.

module simplecount(input clk, rst, down, enable, output reg [3:0] q);
    always @ (posedge clk) begin
        if(rst)
            q <= 4'b0000;
        else
            if(enable)
                if(down)
                    q <= q - 1'b1;
                else
                    q <= q + 1'b1;
    end
endmodule

Can again alter to include the ability to load a value.

module simplecount(input clk, rst, down, load, input [3:0] cnt_in, output reg [3:0] q);
    always @ (posedge clk) begin
        if(rst)
            q <= 4'b0000;
        else
            if(load)
                q <= cnt_in;
            else
                if(down)
                    q <= q - 1'b1;
                else
                    q <= q + 1'b1;
    end
endmodule

Shift Registers

1 bit serial in serial out shift register. Propagation occurs on the rising edge of the clock.

Order of assignment does not matter

module shiftreg(input clk, y, output reg q);
    reg q1,q2,q3;

    always @ (posedge clk) begin
        q1 <= y;
        q2 <= q1;
        q3 <= q2;
        q <= q3;
    end
endmodule

Can make the module simpler using vectors, where each stage in the shift register is a separate position in the vector. The LSB is replaced by the input, and the MSB is the output.

module shiftreg(input clk, y, output reg q_out);
    reg [4:0] q;

    always @ (posedge clk) begin
        q[0] <= y;
        q[4:1] <= q[3:0];
        q_out <= q[4];
    end
endmodule

Memory

64 element memory requires 6-bit address input, with each word as 16 bits
Declare internal 64-element array, where each position is 16 bits.

module spram(input clk, en, write_en,
             input [5:0] addr,
             input [15:0] d_in,
             output reg [15:0] d_out);

    reg [15:0] ram [0:63];

    always @ (posedge clk) begin
        if (en) begin
            if (write_en) begin
                ram[addr] <= d_in;
            end
            d_out <= ram [addr];
        end
    end
endmodule

On each clock cycle:

output the 16 bit word stored at the provided address
if write_en, then d_in is stored at the memory location addr
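
A software model of this memory is useful as a reference during verification. The following is a minimal Python sketch, assuming the same 64 x 16-bit organisation; the class name SinglePortRam is hypothetical. It also makes one consequence of the non-blocking assignments explicit: on a write, d_out receives the word that was stored before the edge.

```python
class SinglePortRam:
    """Software model of a 64 x 16-bit synchronous single-port RAM."""

    def __init__(self, depth=64, width=16):
        self.mask = (1 << width) - 1
        self.ram = [0] * depth
        self.d_out = 0

    def clock(self, en, write_en, addr, d_in):
        """Apply one rising clock edge and return the registered output."""
        if en:
            old_word = self.ram[addr]        # the read sees the pre-edge contents
            if write_en:
                self.ram[addr] = d_in & self.mask
            self.d_out = old_word
        return self.d_out                    # held unchanged when en is low

mem = SinglePortRam()
print(hex(mem.clock(1, 1, 5, 0xBEEF)))   # write cycle: output is the old word
print(hex(mem.clock(1, 0, 5, 0)))        # read cycle: output is the written word
```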

Finite State Machines

Take a binary counter as an example, the output of which is a sequence of numbers, increasing by one each step. The behaviour is described in terms of a register and an incrementer circuit. This is fairly easy to reason about as a state machine:

At each point in time the system is in a state that determines what the output is (the contents of the register)
On each transition, the state changes (counter increments)

This is a state machine. A finite state machine describes a system using a finite number of states, and associated transitions. In synchronous design, an FSM is in one state for the duration of each clock cycle, and may transition on each rising edge depending upon the input.

State transition diagrams show the different states of a system and transitions between them. The diagram below shows a 3-bit binary counter with an enable signal. The state only transitions if enable is high.

The diagram consists of nodes with states, edges between the states, and conditions that determine which transitions may occur. Diagrams may be simplified by only including conditions that result in state-changing transitions:

Consider an up/down counter:

Two input signals, dn and en
Counts only when en is high
Counts up when dn is low, down when dn is high
An input of 10 is en high and dn low
- Other combinations with en = 0 result in no transition

States can be labelled with a more meaningful name, like in the example below

This FSM always produces 3 high cycles followed by one low, with the output x shown with the states. The FSM is off in the off state, and then on for the three on states.

The diagram below shows the same FSM, but it will only output the three-cycle pulse when an input signal b is set high. The output of each state is also shown in the circle next to the state: state/output. It is important to label diagrams with a proper legend to make it clear what means what.

State transition information can also be presented in tables, which is effectively a truth table for the next state based on the current state

Current state	Input `b`	Next state	Output `x`
off	0	off	0
off	1	on1	0
on1	0	on2	1
on1	1	on2	1
on2	0	on3	1
on2	1	on3	1
on3	0	off	1
on3	1	off	1

Can see that the state only transitions from off to on1 when input b is asserted.
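
The state table can be turned directly into a software model by storing each row in a lookup table. This is a small Python sketch for illustration; the names TABLE and run are hypothetical, and the output x is taken from the current state's row in each cycle.

```python
# (current state, b) -> (next state, output x); rows copied from the table above
TABLE = {
    ("off", 0): ("off", 0), ("off", 1): ("on1", 0),
    ("on1", 0): ("on2", 1), ("on1", 1): ("on2", 1),
    ("on2", 0): ("on3", 1), ("on2", 1): ("on3", 1),
    ("on3", 0): ("off", 1), ("on3", 1): ("off", 1),
}

def run(inputs, state="off"):
    """Return the output x seen in each clock cycle for a sequence of b values."""
    outputs = []
    for b in inputs:
        next_state, x = TABLE[(state, b)]
        outputs.append(x)
        state = next_state
    return outputs

print(run([0, 1, 0, 0, 0, 0]))   # [0, 0, 1, 1, 1, 0]
```

A single cycle with b high produces the three-cycle pulse, after which the machine returns to off.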

Implementing FSM

Using a state table, you can build the combinational circuit that determines the next state from the current state. Connected with a register holding state, this forms the structure of a finite state machine.

Consider an example of designing a lock that only unlocks (output u = 1) when input buttons are pressed in a fixed sequence. Inputs are 4 buttons: start, red, blue, green, and an input that indicates if any button has been pressed.

We can capture the lock's behaviour in a state diagram:

Start in an initial state
If start is pressed then move to another state (input s)
If a button is pressed and it's red, move to next state (input ar)
- So on for button presses
If at any point a button is pressed but it is not the correct one, go back to the start again

However, there is still another issue: if someone holds down all the buttons, the machine will just cycle through to the end state. This can be fixed by attaching conditions to the transitions that check the other buttons aren't pressed, ensuring the machine is robust with regards to the requirements.

Consider another example of a vending machine:

Accepts only £1 and 50p coins
Dispenses drink for £1.50 and change if necessary
Two inputs, c100 for £1 coin, c50 for 50p coin
Assume only ever 1 input high, and it tells us which coin is inserted
Two outputs, vend to release a drink, and change to give 50p of change.

However, notice that some of these states are equivalent. These can be merged to reduce the number of states in the diagram, hence simplifying it

Moore and Mealy Machines

In all the previous examples, outputs depend only on the state. These are called Moore machines. The alternative is Mealy machines, where the output depends on the state and the current value of the inputs. Mealy machines are harder to design and analyse, but can be more compact. In Mealy machines, the outputs can't be drawn in the state circles, so are added to the edges, as the output is a function of state and input. The diagram below shows a previous example with the outputs on the arrows.

The two diagrams below show the same machine, that outputs a 1 when the last two inputs were 0 then 1.
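
The difference between the two styles can be sketched in software for this "0 then 1" detector. This is an illustrative Python model; the state names (idle, got0, got01) are hypothetical since they are not taken from the diagrams, and the output is sampled in each cycle before the state updates.

```python
def mealy(inputs):
    """Two states; the output depends on the state and the current input."""
    last_was_zero = False            # state: was the previous input a 0?
    outputs = []
    for bit in inputs:
        outputs.append(1 if last_was_zero and bit == 1 else 0)
        last_was_zero = (bit == 0)
    return outputs

MOORE_NEXT = {                       # three states; the output depends on the state only
    ("idle", 0): "got0", ("idle", 1): "idle",
    ("got0", 0): "got0", ("got0", 1): "got01",
    ("got01", 0): "got0", ("got01", 1): "idle",
}

def moore(inputs):
    state = "idle"
    outputs = []
    for bit in inputs:
        outputs.append(1 if state == "got01" else 0)
        state = MOORE_NEXT[(state, bit)]
    return outputs

seq = [0, 1, 0, 1, 1]
print(mealy(seq))   # [0, 1, 0, 1, 0]
print(moore(seq))   # [0, 0, 1, 0, 1]
```

In this model the Mealy version needs one fewer state and responds in the same cycle as the input, while the Moore version produces the same output one cycle later.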

State Encoding & Transition Logic

In a synchronous design, we can assume that the FSM changes states only on rising edges, so we can put together a circuit like this, with a register and two sets of combinational logic:

The state register stores the current state
- To encode state, a binary value is assigned to each state
- Need $\lceil \log_{2} m \rceil$ bits for $m$ states
Must also build the transition logic
- This can be done from a state table, by replacing names with encodings

Consider the state transition logic for the example with the three-cycle pulse:

Current state: `s[1:0]`	Input `b`	Next state: `ns[1:0]`	Output `x`
00	0	00	0
00	1	01	0
01	0	10	1
01	1	10	1
10	0	11	1
10	1	11	1
11	0	00	1
11	1	00	1

This is now a binary truth table, from which we can determine equations for each output bit, mapping s to ns.

ns[1] = s[1] & !s[0] | !s[1] & s[0]
ns[0] = !s[1] & !s[0] & b | s[1] & !s[0]
x = s[1] | s[0]
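
Equations derived by hand are easy to get wrong, so it is worth checking them exhaustively against the truth table. The following Python sketch does this for the three equations above; the names ROWS and logic are hypothetical, and 1 - v is used in place of !v on single bits.

```python
from itertools import product

# truth table rows: (s1, s0, b) -> (ns1, ns0, x), copied from the encoded table
ROWS = {
    (0, 0, 0): (0, 0, 0), (0, 0, 1): (0, 1, 0),
    (0, 1, 0): (1, 0, 1), (0, 1, 1): (1, 0, 1),
    (1, 0, 0): (1, 1, 1), (1, 0, 1): (1, 1, 1),
    (1, 1, 0): (0, 0, 1), (1, 1, 1): (0, 0, 1),
}

def logic(s1, s0, b):
    """Evaluate the next state and output equations for one input combination."""
    ns1 = (s1 & (1 - s0)) | ((1 - s1) & s0)
    ns0 = ((1 - s1) & (1 - s0) & b) | (s1 & (1 - s0))
    x = s1 | s0
    return ns1, ns0, x

mismatches = [row for row in product([0, 1], repeat=3) if logic(*row) != ROWS[row]]
print(mismatches)   # an empty list means the equations agree with the table
```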

We can now create the two circuits and connect them into a state register. As a Verilog module, this requires a register, an always block, and combinational assignments connecting s and ns.

module pulse3 (input clk, rst, b, output x);

reg [1:0] s;
wire [1:0] ns;

assign ns[1] = s[1] ^ s[0];
assign ns[0] = (!s[1] & !s[0] & b) | (s[1] & !s[0]);
assign x = s[1] | s[0];

always @ (posedge clk) begin
    if (rst) begin
        s <= 2'b00;
    end else begin
        s <= ns;
    end
end
endmodule

When implementing a finite state machine, always ensure the state register has a reset and a defined initial state, otherwise the starting state of the FSM is unpredictable.

More Complex FSM

More complex FSMs with more states and inputs can be hard to construct truth tables and equations for. Verilog's behavioural abstractions can be used instead.

Each state can be assigned a binary value and used as a named constant
Still need two reg signals for the state and next state
Synchronous logic to move state into next state
Behavioural combinatorial always block with a case statement for state transitions

Consider the more complex example with the button lock again:

module lock(input clk, rst, s, r, g, b, a, output u);

//define states as parameters
parameter wt = 3'b000, str = 3'b001, rd1 = 3'b010,
    blu = 3'b011, grn = 3'b100, rd2 = 3'b101;

//state registers
reg [2:0] nst, st;

//output logic
//output u is only high when state is rd2
assign u = (st == rd2);

//synchronous logic for changing state
always @ (posedge clk) begin
    if (rst) st <= wt;
    else st <= nst;
end

//input logic
//combinatorial logic for defining state transitions
always @ * begin
  nst = st;
  case(st)
    wt:
        if(s) nst = str;
    str:
        if(a) begin
            if(r&~b&~g) nst = rd1;
            else nst = wt;
        end
    rd1:
        if(a) begin
            if(b&~r&~g) nst = blu;
            else nst = wt;
        end
    blu:
        if(a) begin
            if(g&~r&~b) nst = grn;
            else nst = wt;
        end
    grn:
        if(a) begin
            if(r&~g&~b) nst = rd2;
            else nst = wt;
        end
    rd2:
        nst = wt;
    default:
        nst = wt;
  endcase
end

endmodule

The general structure of a state machine will always follow the example above.

Always ensure next state is assigned in every case
Use a default next state and output assignment at the top of the state transition block to minimise the number of statements
Using a combinatorial always block, it becomes easy to verify that the FSM is correct, as we can verify against the state transition diagram.
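
A software model of the transition logic gives something to check the state machine against. The following is a minimal Python sketch of the lock's next-state function, assuming the button order start, red, blue, green, red described earlier; the names EXPECTED, next_state and unlocks are hypothetical.

```python
WT, STR, RD1, BLU, GRN, RD2 = range(6)

# the only button combination (r, g, b) that each state accepts
EXPECTED = {STR: (1, 0, 0), RD1: (0, 0, 1), BLU: (0, 1, 0), GRN: (1, 0, 0)}

def next_state(st, s, r, g, b, a):
    """Next state for one clock cycle, mirroring the case statement."""
    if st == WT:
        return STR if s else WT
    if st == RD2:
        return WT
    if not a:
        return st                      # no button pressed: hold the current state
    return st + 1 if (r, g, b) == EXPECTED[st] else WT

def unlocks(cycles):
    """True if the sequence of (s, r, g, b, a) inputs ever reaches the unlock state."""
    st = WT
    for s, r, g, b, a in cycles:
        st = next_state(st, s, r, g, b, a)
        if st == RD2:
            return True
    return False

correct = [(1, 0, 0, 0, 1), (0, 1, 0, 0, 1), (0, 0, 0, 1, 1),
           (0, 0, 1, 0, 1), (0, 1, 0, 0, 1)]
print(unlocks(correct))                     # True
print(unlocks([(1, 1, 1, 1, 1)] * 10))      # False: holding every button fails
```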

Verification

Testbenches

Testing by loading a design onto the FPGA takes ages. Testbenches make it easier to verify the correctness of Verilog designs in simulation.

Algorithmic verification: is the selected algorithm suitable for the desired application?
Functional verification: does the designed architecture correctly implement the algorithm?
Synthesis verification: is the design fully synthesisable and implementable on the target design platform?
Timing verification: once synthesised, placed, and routed, does it meet timing constraints?

Sources of error in design

The specification may be incorrect or incomplete
- Even if it meets specification, it may not function as intended
Specification may have been misunderstood
- What has been implemented matches what you think the specification means, not what it actually means
Specification has been implemented incorrectly
Errors in code

Most of our time will be spent in functional verification.

Does design perform all functions in spec?
Are all required features implemented?
Does it handle corner/edge cases?

What is a testbench?

A self contained module, with no inputs or outputs
Instantiates the unit under test (UUT) - the module we want to verify
Contains a number of blocks
- Clock generator for driving synchronous elements
- Data and control signal generators for mimicking circuit inputs
- Data and status signal monitors for checking outputs match spec

If the module under test gives correct results when given inputs, then we can assume that it works.

In verilog, a testbench is just a normal module with no ports:

module testbench;
    //testbench statements
endmodule

The inputs to our unit under test will be driven by the testbench and must be declared as reg signals. Outputs must be declared wire. Inputs and outputs are then connected to the instantiated module to be tested.

Initial Block

Another type of procedural block used in testbenches only
- Cannot be synthesised
Runs concurrently with always blocks
Used to initialise values when system first starts up
Can also set values with delay using #10 a = 1'b1; statements
- This tells the simulator to wait 10 time steps and then set a to 1
Delays are only for simulation and cannot be synthesised

`display`

The simulator has a console where messages are printed
The \$display system task allows us to print info to the console
Allows for C-style format strings
Argument can be an expression also

Verifying Combinational Modules

We want to verify a simple combinational module that computes y = abc + a'bc' + ab'c'. Manually stimulate the inputs to cover all 8 possible input values:

module simplecomb(input a,b,c, output y);
assign y = (a & b & c) | (~a & b & ~c) | (a & ~b & ~c);
endmodule

module comb_test();

reg at,bt,ct;
wire yt;
simplecomb uut(.a(at),.b(bt),.c(ct),.y(yt));

initial begin
    at = 1'b0;
    bt = 1'b0;
    ct = 1'b0;

    //increment every 10 time steps
    #10 ct = 1'b1;
    #10 bt = 1'b1; ct = 1'b0;
    #10 ct = 1'b1;
    #10 at = 1'b1; bt = 1'b0;  ct = 1'b0;
    #10 ct = 1'b1;
    #10 bt = 1'b1; ct = 1'b0;
    #10 ct = 1'b1;

    #10 $finish;
end
endmodule

Unit under test is instantiated, connecting ports to signals
Start with input values for a, b, c of 000
Wait 10 timesteps and change inputs to 001
Continue cycling through all possible values
\$finish terminates simulation
We want to see what output waveform is generated by the module so we can verify it exhibits the correct behaviour

Checking the waveform manually is tedious, so we can instead add assertions into testbench to \$display an error if the output does not match the expected value

at = 1'b0; bt = 1'b0; ct = 1'b0;
#10 if(yt != 1'b0) $display("000 failed"); //wait for the output to settle before checking
//.. and so on

This is still tedious, as we still have to work out the correct value in advance. We could carry out the verification using verilog's language features instead:

//check 5 time steps after each input change, once the output has settled
always @ (at, bt, ct)
    #5 if (yt != ((at&bt&ct) | (~at&bt&~ct) | (at&~bt&~ct)))
        $display("testbench failed for %b %b %b",at,bt,ct);

In testbench design we can be much more relaxed about using language constructs:

initial is not synthesisable but is fine in testbenches
Delays on assignments can be used
Assigning to a signal from multiple blocks is not an issue

Testbenches are not designed to be turned into circuits: they are software, not hardware.

Synchronous Verification

For synchronous testbenches, we need a clock input to oscillate between 0 and 1
Initial value (high or low) is important and can be done either way
Verilog below sets clock to change on each timestep (50% duty cycle)

initial clk = 0;
always #1 clk = ~clk;

Timing in Testbenches

So far, we have assumed dimensionless time
We can specify the time dimensions with the directive `timescale 1ns / 100ps
- This line is placed at the top of the testbench source file
- Specifies unit time is 1ns
- Specifies max rounding precision to be 100ps
  - A delay of 1.25 time units (10/8) cannot be represented exactly, so it is rounded to one decimal place
Most simulation tools require the timescale to be stated in order to simulate
During functional simulation this means nothing since timing is not factored in

Since for most designs, clock and reset behaviour is the same, we can use a standard template:

module sync_test;

reg clk, rst;

initial begin
    $display("Start of Simulation");
    clk = 1'b0;
    rst = 1'b1;
    #10 rst = 1'b0;
end

always #5 clk = ~clk;

endmodule

Reset is held high for 10 time steps, then brought down to enable circuit. Clock oscillates continually.

Accessing files

The set of inputs driving the circuit is a test vector
Creating test vectors within a testbench is generally only feasible for simple parts of a circuit
It is also possible to access test data stored in external files
- Allows us to prepare more complex types of test data, eg images
Can also store simulation outputs in an external file
- Allows for analysis using more suitable tools, eg scripting with matlab/python
File I/O in verilog is very similar to C
- Need a file handle (stored as an integer)
- Can use read/write/append mode r/w/a
A self-checking testbench can be constructed by reading a set of inputs and outputs from files, and seeing if the unit under test matches them

integer infile, status;

initial begin
    infile = $fopen("inputfile.txt","r");
    while (!$feof(infile)) begin
        @(posedge clk);
        status = $fscanf(infile,"%h %b\n",data,mode);
    end
    $fclose(infile);
end

This example reads one hex and one binary value on each line of a test file in each clock cycle, and assigns them to the data and mode signals.
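
Test vector files in this format are convenient to generate with a script. The following Python sketch writes one hex value and one binary value per line; the function names and the widths (16-bit data, 2-bit mode) are assumptions for illustration, as the notes do not specify them.

```python
def format_vector(data, mode, data_bits=16, mode_bits=2):
    """Format one line: data as zero-padded hex, mode as zero-padded binary."""
    hex_digits = (data_bits + 3) // 4
    return f"{data:0{hex_digits}x} {mode:0{mode_bits}b}"

def write_vectors(path, vectors):
    """Write one formatted line per (data, mode) pair."""
    with open(path, "w") as f:
        for data, mode in vectors:
            f.write(format_vector(data, mode) + "\n")

print(format_vector(0xBEEF, 2))   # beef 10
print(format_vector(5, 1))        # 0005 01
```

Calling write_vectors("inputfile.txt", [(0xBEEF, 2), (5, 1)]) would produce a two-line file that a testbench can read back one line per clock cycle.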

Advanced Verification

When working with testbenches, should always use the same clock throughout
If driving an input reg signal on one clock, that data will only enter the module at the next rising edge
Simple combinational circuits can be tested using counters and inspection of outputs
For more complex circuits, prepare test data and load from/write to files
\$random function generates a 32-bit random number and can be used for random testing
If there are too many input possibilities, focus on edge cases or cases more likely to cause error
Finite state machines are nicely decomposed for testing
- Test combinational state transition logic separately
- Test the whole state machine, manipulating inputs.
Testing process should be iterative and integrated.

How to verify

Start with a good design specification
Prototyping is important: develop a prototype first
Software models can be constructed at various levels of detail
- Simple model with no reflection of hardware design
- Model that mimics overall functional architecture
- A cycle-accurate model
- A bit-accurate model
More detailed models give a better reflection of the hardware, but take longer to develop (and can have more bugs)
A functionally correct circuit should produce the same results as a software model of the function, however some discrepancies may still be present
- Number representation can cause differences
- May use different calculation methods
- Can take shortcuts or refactor parts of an algorithm to simplify implementation in hardware
- Should be aware of these discrepancies and know when it is safe to ignore them
Can apply the same set of input vectors to the hardware design and software model and compare the outputs
Can also implement the inverse function in software
- Run data through hardware module
- Put outputs into software inverse
- If software inverse outputs original hardware inputs, design is correct

Simulation Environments

Vivado comes with simulator built in
Waveform window shows signals in design
- Signal values plotted as wave over time
- Useful for debugging
- More complex designs require more complex verification techniques

Modern Verification

There have been many recent developments in electronic design automation around verification
SystemVerilog adds new verification features
Formal mathematical circuit verification involves proving a design is correct
Sources of error can occur in places other than the design
- Faulty specification
- Buggy tools

Consider a large multiprocessor SoC:

Test each processor and all layers in a hierarchy
Test communication and interfaces
Test contention for resources
Test different clocks for different units
Predict effects of cache misses and race conditions

For simple systems we work with:

Prepare a software model
Construct testbenches for simple logic
Use self-checking testbenches if possible
Use external files for test data if appropriate
Make testing an iterative process

Testing can and should consume most of your time!

FPGA Arithmetic

FPGAs demonstrate their power specifically in applications that require complex computation at high data throughput, so the specifics of how arithmetic is carried out are important.

Number Format

The binary number format is positional
The value of a binary number is the sum of each bit multiplied by the weight of its position
The rightmost bit is the LSB
Leftmost bit is MSB
Bits are indexed by their power
- LSB is bit 0
- MSB is bit n-1
The range of an unsigned n-bit number is $0$ to $2^{n} - 1$
Sign-magnitude can be used to represent signed numbers
An offset can also be used, where the number range is shifted by an amount
Two's complement is the most commonly used representation, where the MSB has a negative weight
- To negate a number, invert the bits and add 1
- If MSB = 1, the number is negative
- Has range $- 2^{n - 1}$ to $2^{n - 1} - 1$
To widen a two's complement number, you need to sign extend
- Add more bits to the left with the same value as the current sign bit
- $-35_{10} = 1011101_{2} \to 11111011101_{2} = -35_{10}$
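The two's complement rules above can be checked with a small software model. This is a sketch: the function names are illustrative, and bit patterns are held as non-negative Python integers.

```python
def to_signed(bits: int, width: int) -> int:
    """Interpret a width-bit pattern as two's complement (MSB weight is negative)."""
    msb = 1 << (width - 1)
    return (bits & (msb - 1)) - (bits & msb)


def negate(bits: int, width: int) -> int:
    """Invert the bits and add 1, keeping only width bits."""
    mask = (1 << width) - 1
    return ((~bits & mask) + 1) & mask


def sign_extend(bits: int, width: int, new_width: int) -> int:
    """Widen a pattern by copying the sign bit into the new upper bits."""
    if (bits >> (width - 1)) & 1:
        bits |= ((1 << new_width) - 1) & ~((1 << width) - 1)
    return bits
```

For example, `to_signed(0b1011101, 7)` gives -35, and sign extending the same pattern to 11 bits gives `0b11111011101`, which still reads as -35.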

Adders

The full adder adds two operand bits and a carry in, producing a sum and a carry output. This can be extended into a ripple adder, where multiple full adders are chained together to create a multi-bit adder.

Carry bits are passed up the chain from LSB to MSB
- The carry ripples through the circuit
We have to wait for all the carry bits to propagate through the circuit to get the correct result
- Not efficient

The ripple adder can also be adapted to be able to subtract using XOR gates, and by adding one using the input carry bit.

In a synchronous system, we place the operands in registers and sum the registers
The speed at which we can run the clock to update the registers depends on the propagation delay of the adder
Can only clock the circuit as fast as the critical path allows
- In an adder, this is the carry from LSB to MSB
- Wider adders lengthen this period

Carry-Lookahead Adders

A bit position generates a carry if it produces a carry, no matter what the carry in to that stage is
A bit position propagates a carry if it produces a carry whenever its carry in is high
This can be expressed as logical expressions
- $g_{i} = a_{i} \,\&\, b_{i}$
- $p_{i} = a_{i} \mid b_{i}$
The carry out of bit position $i$ is
- $c_{out, i} = g_{i} \mid (p_{i} \,\&\, c_{in, i})$
Also, $c_{in, i} = c_{out, i - 1}$
- $c_{1} = g_{0} \mid (p_{0} \,\&\, c_{0})$
- $c_{2} = g_{1} \mid (p_{1} \,\&\, c_{1}) = g_{1} \mid (p_{1} \,\&\, g_{0}) \mid (p_{1} \,\&\, p_{0} \,\&\, c_{0})$
- The carry of each bit position can be expressed in terms of the previous ones
At each stage, the sum, generate, and propagate can be computed
This allows us to compute any intermediate carry bit
Since $g$ and $p$ signals depend only on $a$ and $b$ , there is no more ripple
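The lookahead equations can be modelled in software to confirm they give the same result as ordinary addition. The sketch below expands every carry directly in terms of $g$, $p$ and $c_{0}$; the function name and the 4-bit default width are assumptions for illustration.

```python
def cla_add(a: int, b: int, width: int = 4, c0: int = 0):
    """Add two width-bit numbers using generate/propagate lookahead carries."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # g_i = a_i & b_i
    p = [x | y for x, y in zip(a_bits, b_bits)]  # p_i = a_i | b_i

    # c_{i+1} = g_i | p_i g_{i-1} | ... | p_i ... p_0 c_0
    # Each carry depends only on g, p and c0, never on the previous carry.
    carries = [c0]
    for i in range(width):
        carry = 0
        prefix = 1  # running product p_i & p_{i-1} & ...
        for j in range(i, -1, -1):
            carry |= prefix & g[j]
            prefix &= p[j]
        carry |= prefix & c0
        carries.append(carry)

    total = 0
    for i in range(width):
        total |= (a_bits[i] ^ b_bits[i] ^ carries[i]) << i
    return total, carries[width]  # (sum, carry out)
```

In hardware each of these carry expressions is a flat two-level function, which is why the gate count grows quickly with width.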

Several can be chained to implement a much wider adder
Wider lookahead logic requires many more gates
Instead of building wider adders from just gates, larger adders can be built up hierarchically from smaller adders (the $PG$ and $GG$ output signals are chained)

Other techniques for fast adders include carry-skip adders, which allow carries to skip over bits, and the Manchester carry chain, which uses shared logic for lookahead

Multipliers

Binary multiplication is done similarly to decimal long multiplication
Multiplication between two bits is an AND operation
After each multiplication stage, one of the operands is shifted
A partial product is generated for each shift position, from the product of each pair of bits
These partial products are then all summed
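The shift-and-add scheme above can be sketched as a software model. This assumes unsigned operands that fit in `width` bits; the function name is illustrative.

```python
def shift_add_multiply(a: int, b: int, width: int = 4) -> int:
    """Multiply two unsigned width-bit numbers by summing partial products."""
    product = 0
    for i in range(width):
        b_bit = (b >> i) & 1
        # Each bit of the partial product is a_j AND b_i
        partial = a if b_bit else 0
        # Shift the partial product to this bit position, then accumulate
        product += partial << i
    return product
```

For two 4-bit operands this produces four partial products and a result of up to 8 bits.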

Alternative architectures try to reduce the amount of addition, eg the Wallace multiplier
FPGA tools take care of implementing multipliers efficiently using LUTs
Wider multipliers are mapped to DSP blocks
- Very wide ones might use multiple DSPs

Fixed Point Arithmetic

Fixed point notation allows us to work with fractional numbers.

Place a binary point at any location within the number
Arithmetic is performed as integers
The location of the binary point is kept track of
Designer can select a precision suited to the application

The only difference when calculating a fixed point value is that some bits have a weight that is a negative power of two, ie $11_{2}$ with the binary point in the middle is $2^{0} + 2^{-1} = 1.5$ .

The binary number now consists of two parts
- The integer part determines the range
- The fractional part determines the precision
Choosing a different position for the point allows trading accuracy for range
- 4 integer bits gives a range of 0-15
- 6 fractional bits can represent values with a precision of up to $2^{- 6} = 0.015625$
There is no fixed notation for stating the position of a binary point, so it is important to be clear
- An $m$ bit fixed point number with $q$ fractional bits has $m - q$ integer bits
- If the number is signed, the first bit also has negative weight
Not all numbers can be represented exactly in a given fixed point format
- This causes some error, but selecting an appropriate precision for the use case can make this tolerable
- True also for floating point

Fixed Point Conversion

The easiest way to convert a fractional number to a given fixed point format is as follows:

Multiply the number by $2^{I}$ where $I$ is the number of fractional bits
Round the result to an integer
Convert the integer to binary in the standard way
Use the binary representation of that number as the fixed point representation
- Remember where the position of the binary point is for the calculation

For example, convert 2.384 to an 8 bit number with 6 fractional bits

$2.384 \times 2^{6} = 152.576 \approx 153$
$153_{10} = 10011001_{2}$
$2.384_{10} \approx 10.011001_{2}$ is the fixed point approximation
$2.390625_{10}$ is the actual value of the approximation
The error is 0.006625 absolute or 0.28% relative
- Probably fine, depending upon the design
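The conversion steps above can be reproduced in a few lines. This is a minimal sketch for unsigned values; note that Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding used elsewhere.

```python
def to_fixed(value: float, frac_bits: int) -> int:
    """Scale by 2^frac_bits and round to the nearest integer."""
    return round(value * (1 << frac_bits))


def from_fixed(raw: int, frac_bits: int) -> float:
    """Recover the real value represented by a fixed point integer."""
    return raw / (1 << frac_bits)


raw = to_fixed(2.384, 6)         # integer stored in hardware
bits = format(raw, "08b")        # 8-bit pattern, binary point after 2 bits
approx = from_fixed(raw, 6)      # value actually represented
abs_error = approx - 2.384
rel_error = abs_error / 2.384
```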

When converting, there are some things to watch out for:

Need to maintain the same binary point position for all values
If the converted number exceeds the width of the format then integer bits are lost
Always work out the max integer width and precision you need first based on expected integer range
When numbers are signed, the MSB has negative weight

Arithmetic affects the binary point:

Addition and subtraction don't change the position, but an extra bit may be needed to prevent overflow
Multiplication of an $m$ bit number with $n$ fractional bits and a $p$ bit number with $q$ fractional bits yields an $m + p$ bit number with $n + q$ fractional bits
It is important to keep track of where integer and fractional parts are in circuits
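A short sketch of how the binary point moves under multiplication. The values and the 4 fractional bits are chosen purely for illustration, and the helper name is hypothetical.

```python
def fixed_mul(a_raw: int, a_frac: int, b_raw: int, b_frac: int):
    """Integer multiply; the product has a_frac + b_frac fractional bits."""
    return a_raw * b_raw, a_frac + b_frac


# 1.5 and 2.25, each stored with 4 fractional bits
a_raw = int(1.5 * 16)    # 0001.1000
b_raw = int(2.25 * 16)   # 0010.0100
prod_raw, prod_frac = fixed_mul(a_raw, 4, b_raw, 4)
product = prod_raw / (1 << prod_frac)
```

The raw product is an ordinary integer; only the bookkeeping of `prod_frac` says where the binary point sits.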

Fixed Point in Verilog

Verilog has no native support for fixed point, so the designer must keep track of the positions within the code. Vector slicing is used to choose the required bits. The module below multiplies two numbers with 4 integer and 12 fractional bits.

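module mul_short(input signed [15:0] a,b, output signed [11:0] prod);
// 4.12 x 4.12 gives an 8.24 result, so x is 32 bits wide to prevent truncation
wire signed [31:0] x = a * b;
// keep the 8 integer bits and the top 4 fractional bits
assign prod = x[31:20];
endmodule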

All Verilog signals are treated as unsigned numbers by default, and we can use built in arithmetic operators on them.

Signed Arithmetic in Verilog

Any reg or wire is considered unsigned, unless it is declared as a signed signal

wire signed [3:0] x;
wire signed [15:0] y;

Signals like this are considered signed, and the design tools take care of generating signed circuits. For signed operations, all operands must be declared signed or Verilog will default back to unsigned arithmetic. Signals can be cast using the $signed() and $unsigned() functions.

A basic signed adder with 3-bit inputs and a 4-bit sum:

module add_signed(input signed [2:0] a,b, output signed [3:0] sum);
assign sum = a+b;
endmodule

A 3-bit signed adder with a carry out will be generated, and any sign-extension is done automatically.

Signed literals can also be used

reg signed [15:0] count_limit = -16'sd47;
reg signed [7:0] bits_left = 8'sd12;

When using unsigned vectors, Verilog will automatically zero-extend when needed, which is bad for signed numbers. When the value being widened is signed, Verilog automatically sign-extends instead. Note that the literal itself must be signed (the s in 8'sb) for this to happen.

reg signed [15:0] x = 8'sb1001_1111;
//results in x = 1111_1111_1001_1111

To mix signed and unsigned numbers, it is important to manually cast unsigned to signed:

module add_signed(input signed [2:0] a,b, input carry_in, output signed [3:0] sum);
assign sum = a + b + $signed({1'b0, carry_in});
endmodule

carry_in is cast so signed circuitry is generated
It is extended with a 0 because just casting a single bit to signed would result in a -1
- This is important to do to prevent numbers becoming negative

A signed number can be truncated to narrow its width, but only safely when the upper bits are all the same as the new MSB:

11110101 safely truncates to 10101 (-11)
000000011100 safely truncates to 011100 (28)
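The safe-truncation rule can be expressed as a simple check on the bits being dropped. This is a sketch with illustrative function names, treating bit patterns as non-negative integers.

```python
def can_truncate(bits: int, width: int, new_width: int) -> bool:
    """True if every dropped bit equals the new MSB, so the value is preserved."""
    upper = bits >> (new_width - 1)  # the new MSB plus every dropped bit
    all_ones = (1 << (width - new_width + 1)) - 1
    return upper in (0, all_ones)


def truncate(bits: int, new_width: int) -> int:
    """Keep only the lowest new_width bits."""
    return bits & ((1 << new_width) - 1)
```

Both examples above pass the check, whereas a pattern such as `01110101` would change value if cut down to 5 bits.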

Verilog silently truncates MSBs whenever a value is assigned to a narrower signal, so care must always be taken when working with signed arithmetic

Look out for synthesis warnings
Make internal signals as wide as needed then truncate at the output

Floating Point

Floating point represents fractional numbers with an adjustable scale
32 bits are decomposed into separate fields that make up a number

Sign	Exponent	Mantissa
1 bit	8 bits	23 bits

$(-1)^{\text{sign}} \times \text{mantissa} \times 2^{\text{exponent} - 127}$

Can represent numbers as small as $\pm 2^{- 126}$ and as large as $\pm (2 - 2^{- 23}) \times 2^{127}$
Not all numbers can be accurately represented
The exponent determines which powers of two the window of values covers
- The mantissa determines where within the window the value is
- As the size of the window increases, the values within it become less precise
Floating point circuits are large and complicated
- Not supported in most synthesis tools
IP blocks are provided for floating point computation, but it should be considered how necessary it is before use
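The field layout can be inspected in software using Python's standard `struct` module. This is a sketch for normalised single precision values only (zero, subnormals, infinities and NaN are not handled); the function names are illustrative.

```python
import struct


def decode_float32(value: float):
    """Split a number into its single precision sign, exponent and fraction fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


def field_value(sign: int, exponent: int, fraction: int) -> float:
    """Apply the formula for normalised numbers (exponent field 1 to 254)."""
    mantissa = 1 + fraction / (1 << 23)  # implicit leading 1
    return (-1) ** sign * mantissa * 2.0 ** (exponent - 127)
```

For example, 12.375 decodes to a sign of 0, a biased exponent of 130 and a fraction field beginning `100011`.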

Timing & Pipelining

So far, digital circuits have been considered as instantaneous, where outputs are available immediately. This is an approximation, as there are delays in the propagation of signals through circuits. There are delays associated with many elements in circuits, and these need to be analysed and taken into account.

Every gate or circuit element exhibits a propagation delay
A change in the input causes a change in output, but only after a propagation delay: $t_{p}$
Delay arises due to low level factors to do with analog properties including capacitance
- Figures usually supplied by manufacturers
- Can differ for different gates
- Can be affected by temperature
- Low-to-high may differ from high-to-low delay
- Also related to fanout
  - The number of inputs the output is driving
This information can give the total propagation delay for a whole circuit
- Sum up all delays along all paths from inputs to outputs
  - Worst case delay is the one we're concerned with
- Different path delays can cause internal glitches
- The worst case through a circuit is the deciding factor in how fast we can supply inputs

Combinational Timing

It is typically easiest to trace through a circuit gate by gate, working out what the delay would be at each step. The gate delay can be indicated on timing diagrams, but only when specifically interested in them. The diagram shows a circuit, along with its timing diagram including propagation delay.

Attaching some numbers to the delays:

$t_{pAND} = t_{pOR} = 3$ ns
$t_{pINV} = 2$ ns
$t_{pXOR} = 4$ ns

There are four inputs and one output in this circuit, so four possible paths. All paths have the same delay of 6ns.

Another example:

The first two paths have a 6ns delay
The other two paths also include an inverter, which adds another 2ns of delay for a total of 8ns

Looking again at a single full adder stage of the ripple adder, and assuming each gate has a unit delay:

a,b to sum has 2 delays
cin to sum has 1 delay
a,b to cout has 3 delays
cin to cout has 2 delays
Worst case is 3 delays, so this is how long we must wait for signals to propagate fully

Uneven path delays mean there may be invalid intermediate values before outputs settle, called glitches. It is important to wait for all signals to propagate to avoid incorrect results in the circuit.

The total delay of an n-stage ripple adder is $4 + 2 (n - 2)$ gate delays:

The first stage has 3 delays (a,b to cout)
Each of the $n - 2$ intermediate ripple stages has 2 delays (cin to cout)
The final stage has 1 delay to its sum output (cin to sum)

Any combinational circuit element will have a delay, which can be determined from its datasheet. Combining larger complex combinational elements must follow the same rules as combining gates. Consider a circuit which computes $f = a x^{2} + b x + c$ :

The longest path from input to output is through the 2 multipliers and 2 adders
Assuming 4ns and 2ns delay respectively, the total worst case delay is 12ns
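The worst case delay can be found by computing the latest arrival time at each node. The netlist below is an assumed structure for the polynomial circuit (the original figure is not reproduced here), chosen so that the longest path passes through two multipliers and two adders; all node names are hypothetical.

```python
from functools import lru_cache

DELAY_NS = {"mul": 4, "add": 2}

# node -> (element type, inputs); primary inputs x, a, b, c have no entry
CIRCUIT = {
    "x_sq":   ("mul", ("x", "x")),
    "a_x_sq": ("mul", ("a", "x_sq")),
    "b_x":    ("mul", ("b", "x")),
    "sum1":   ("add", ("a_x_sq", "b_x")),
    "f":      ("add", ("sum1", "c")),
}


@lru_cache(maxsize=None)
def arrival_time(node: str) -> int:
    """Latest time (ns) at which a node's output settles after the inputs change."""
    if node not in CIRCUIT:
        return 0  # primary inputs are available at time 0
    kind, inputs = CIRCUIT[node]
    return DELAY_NS[kind] + max(arrival_time(i) for i in inputs)


worst_case_ns = arrival_time("f")
```

This is essentially what static timing analysis does, using real delays for each element and route.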

Synchronous Timing

When composing large combinational circuits, the timing characteristics of each part must be considered to ensure that inputs and outputs are all timed correctly, analysing all paths. Any change to the circuit requires re-analysis of the timing behaviour. Sequential circuits are more complex to analyse.

A synchronous system has a single clock that marks the timesteps when the new inputs are passed to sub-circuits
At each rising edge, register outputs change
Any combinational path must process this input, and have the result ready before the next clock cycle
The clock must be slow enough to accommodate the slowest path between two flip-flops

At the first clock edge, values emerge from the first set of registers and propagate through the circuit, taking 12ns
After 12ns, the values at the combinational output are stable and correct
At the next clock edge, this output is stored in the register, and a new set of values enter the combinational circuit
The maximum frequency is $1/12\text{ns} \approx 83$ MHz
- Any slower clock will also work

When looking at a larger synchronous circuit:

There will be a delay between any pair of registers
The output of any register must have enough time to propagate through all the combinational logic to the next input
The paths between all pairs of registers are considered, and the longest is selected as the critical path
- The critical path determines the maximum clock frequency for the whole circuit

Flip-Flop Timing

Recall how flip-flops are constructed, in a master/slave arrangement such that the master is active low and the slave active high, which traps data on the rising edge of the clock.

int represents the output of the master latch, and follows the input as long as clk is low. When clk is high the slave latch becomes transparent, passing through the trapped value.

The actual timing characteristics are a little more complex:

There is a clock-to-q delay, which is the delay between the clock edge and output changing
Any desired input must arrive and be stable for a portion of time before the rising edge: the setup time
The input must be held for a short while after the rising edge: the hold time

For a more accurate minimum clock period, this must be factored in
Min clock period $T$ is given by $t_{cQ} + t_{p} + t_{s} < T$
This does not factor in hold time
- The previous register must not produce an output that can reach the next register's input before $t_{h}$ after the rising edge
- As long as $t_{cQ} > t_{h}$ , this cannot happen
  - $t_{h}$ is often 0 in modern devices

Example

Assuming a gate delay of 1ns, $t_{cQ} = 0.6$ ns, and $t_{s} = 0.4$ ns, determine the max clock frequency for a 6 bit ripple adder.

$n$ bit ripple adder has $4 + 2 (n - 2)$ gate delays
- For 6 bits, delay is 12ns
Require $T > t_{cQ} + t_{p} + t_{s}$
$T > 13$ ns
$f_{max} = 1/13\text{ns} = 76.9$ MHz
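The same calculation as a short script, which makes it easy to try other adder widths. The function names are illustrative; the delay formula is the one quoted above.

```python
def ripple_adder_delay_ns(n_bits: int, gate_delay_ns: float) -> float:
    """Propagation delay of an n-bit ripple adder: 4 + 2(n - 2) gate delays."""
    return (4 + 2 * (n_bits - 2)) * gate_delay_ns


def max_clock_mhz(t_cq_ns: float, t_p_ns: float, t_s_ns: float) -> float:
    """Maximum frequency from the minimum period T = t_cQ + t_p + t_s."""
    period_ns = t_cq_ns + t_p_ns + t_s_ns
    return 1000.0 / period_ns


t_p = ripple_adder_delay_ns(6, 1.0)
f_max = max_clock_mhz(0.6, t_p, 0.4)
```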

Synchronous Design

When designing a circuit, we want to process large amounts of data
Software programs used for data processing spend most of their time inside loops
- eg processing an image means applying a filter which involves maths over each pixel of the frame
Combinational circuits with no clocks become difficult to build as they increase in size
- Difficult to keep track of the delay
- A synchronous system gives a predictable, fixed rate of data movement so the system can be modelled more easily
I/O typically supplies data at a regular rate
- Sensors/ADCs
- I/O Busses such as PCIe
- Outputs read from a memory
Building a data processing pipeline allows computation to be done quicker than a processor
- Can exploit parallelism
- Implement complex, custom datapaths
Throughput is the number of input values that can be processed per unit time
- Determines the real speed of the circuit
- A fully synchronous circuit that can accept one set of inputs per cycle has a throughput of clock speed x amount of data per input
Latency is the time between an input entering the system, and the computed output emerging from the system
- Less critical as it is a fixed delay
Generally, it is desirable to maximise throughput, even if this comes at a slight cost of latency
The limiting factor in any circuit is combinational propagation delay
- The largest chunk of combinational logic between any two registers determines the max frequency
- Large chunks of logic can be broken down by adding another register in the middle
  - This increases latency, but allows the circuit to be clocked faster

Pipelining Circuits

Recall the polynomial calculation circuit from earlier. If another register stage is added in the middle, we now have two sets of paths between three registers:

The longest path between the first two register stages is through two multipliers: 8ns
Longest path between second pair is 4ns
Critical path is now 8ns, so can be clocked at 125MHz
- Latency is now 2 clock cycles, 16ns
Throughput has been increased

This can be broken down further to add yet another register stage:

Critical path now 4ns, clock speed now 250MHz
Latency now 3 cycles (12ns)
Throughput has increased, and latency has decreased compared to the two-stage version
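The figures for each version of the circuit can be derived from the combinational delay of each pipeline stage. This is a simplified sketch that ignores register overhead (clock-to-Q and setup time); the function name is illustrative.

```python
def pipeline_stats(stage_delays_ns):
    """Clock period is set by the slowest stage; latency is one cycle per stage."""
    period = max(stage_delays_ns)
    stages = len(stage_delays_ns)
    return {
        "period_ns": period,
        "f_max_mhz": 1000 / period,
        "latency_cycles": stages,
        "latency_ns": period * stages,
    }


unpipelined = pipeline_stats([12])
two_stage = pipeline_stats([8, 4])
three_stage = pipeline_stats([4, 4, 4])
```

The unbalanced two-stage split wastes half of the second stage's clock period, which is why its latency in nanoseconds is worse than either of the other versions.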

It would be pointless to add another stage between the two final adders as the critical path would still be elsewhere. It is important to place register stages to balance delays between pairs.

As a general rule, clock frequency can be increased by splitting up combinational logic. This is pipelining.

A heavily pipelined circuit has many pipeline stages to allow the clock to be as fast as possible
Even if cycle latency increases, may actually still be faster due to increased clock
Leads to more complex designs and increased resource usage

To add a pipeline stage to a circuit:

Find the largest block of combinational logic
Break all paths between the registers with a new pipeline stage
Wherever the break crosses a signal, place a register
Registers are drawn combined but each signal requires a separate register
Some registers will do nothing but delay signals so they align correctly
The widths of registers will depend on the signals going in/out of them

Timing in FPGAs

On FPGAs, logic is implemented in LUTs, so gate delays are not relevant
A 6-input LUT can implement any function of 6 inputs
- The propagation delay is the same no matter what function it implements
If a function is too large for a single LUT, it will be spread across multiple, increasing delay
- The delay of a single LUT in a Xilinx Virtex-6 is around 240ps
Other FPGA resources have specified delays too
- DSP blocks have a specific combinational delay
  - Registers can be enabled to decrease the critical path
Total combinational delay in an FPGA is composed of
- Logic delay: delay through LUTs, DSPs, etc
- Routing delay: the delay through the routing fabric
Synthesis and mapping tools will break logic into blocks, but pipelining is done as coded in Verilog
Place and route tools will minimise routing delay
- Use numbers from datasheet to find critical paths
FPGAs have many flip flops around the chip to allow deep pipelining
Timing characteristics are given in datasheet
- A slice register has clock-to-output delay, $T_{C K O}$ , of around 0.4ns
- Setup and hold times are around 0.4ns and 0.2ns, respectively
- Specifics depend on which outputs are used
- Slice multiplexers affect timing

Interfaces

FPGAs come in a wide variety of packages with a range of IO capabilities
- Most pins are reserved for specific uses such as voltage rails, clocks, configuration
- Other pins are multifunction and used for I/O
FPGAs can be incorporated into a system in many ways
- Standalone, interfacing with peripherals and implementing all functionality
- As a peer to a more general purpose processor, connected with high bandwidth
- As an accelerator on a high performance bus with shared memory
- As a separate device that communicates with another processor over a lower throughput bus
How to integrate and communicate with an FPGA depends on the application
- Tightly coupled offers good bandwidth but requires complex OS support
- Treating it as an accelerator like a GPU allows it to work with the CPU
New hybrid FPGA designs include an embedded processor in the same fabric
- Design built around a processor subsystem along with programmable logic
- High throughput interconnect

ADCs and DACs

Interfacing with the real, analog world requires converting between analog and digital signals
Analog-to-digital converters take an analog voltage level and convert it to a digital word
Digital-to-analog converters take a digital word and convert to an analog voltage level
ADCs and DACs are characterised by
- Sampling rate: the number of values the device can create/consume per second
  - Determines the bandwidth based on the Nyquist theorem
- Resolution: the number of different levels the device can differentiate between
- Various fidelity characteristics such as linearity, noise, jitter
In most cases, external ADCs/DACs are used with FPGAs
Modern FPGAs include analog interfaces with internal ADCs
Recent RFSoC radio-focused FPGAs include high speed ADCs and DACs on chip for integrated RF implementation

GPIO

Most FPGAs and microcontrollers have pins for general purpose I/O
Each pin can be set as an input or output for a single bit
The I/O voltage level is customisable for banks of GPIO pins
Easiest way to get data in and out of an FPGA
Support switching rates of over 200MHz
The number of pins is generally limited and insufficient for creating large parallel data busses
- Parallel I/O at high speeds requires detailed timing calibration and synchronisation

PWM

Method of switching an output on and off, where the ratio of on to off, the duty cycle, gives an average output level
Used for changing motor speed, servo direction, LED brightness
Works due to the inertial load of output devices
- High speed switching means the overall output level is the average of the high and low periods
- An LED flickering at 500Hz cannot be detected as flickering by a human eye
Microcontrollers use timers to generate waveforms, and the number of timers available limits the number of PWM signals that can be generated
FPGAs can create counters specifically for PWM

module pwmgen #(parameter CNTR_BITS=6) (input clk, rst,
                input [CNTR_BITS-1:0] duty,
                output pwm_out);

reg [CNTR_BITS-1:0] pwm_step;

always @ (posedge clk) begin
    if(rst)
        pwm_step <= 1'b0;
    else
        pwm_step <= pwm_step + 1'b1;
end

assign pwm_out = (duty >= pwm_step);

endmodule

CNTR_BITS is the width of the counter
duty sets how long the pwm signal is high for in each period (with the >= comparison above, duty + 1 of the 2^CNTR_BITS counter steps)
pwm_step is the internal counter for each period
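A software model of one counter period is a quick way to check the resulting duty cycle. This sketch uses the same `duty >= pwm_step` comparison as the module, and assumes the counter wraps naturally after 2^CNTR_BITS steps; the function names are illustrative.

```python
def pwm_period(duty: int, cntr_bits: int = 6):
    """pwm_out for one full counter period, using the module's comparison."""
    steps = 1 << cntr_bits
    return [1 if duty >= pwm_step else 0 for pwm_step in range(steps)]


def duty_cycle(duty: int, cntr_bits: int = 6) -> float:
    """Fraction of the period for which pwm_out is high."""
    wave = pwm_period(duty, cntr_bits)
    return sum(wave) / len(wave)
```

With the default 6-bit counter, a `duty` of 31 gives a 50% duty cycle, and the output can never be fully off because step 0 always satisfies the comparison.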

UART

Universal Asynchronous Receiver/Transmitter is the easiest way of sending multi-bit data between two systems
- Uses a single wire
- Asynchronous because no clock line between
  - Baud rate is pre-agreed
Data is transmitted in frames
- Frames can vary in bit length, and sometimes include parity, start, and stop bits
Shift register is used at either end for parallel-serial conversion
Rx of one device connected to Tx of another
Combination of start and stop bit means frames can always be detected
Can be issues when clocks are not well matched, which limits possible throughput
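Framing can be illustrated with a small model. This sketch assumes a common configuration of one start bit, 8 data bits sent LSB first, an optional parity bit and one stop bit; real links may be configured differently, and the function names are illustrative.

```python
def uart_frame(byte: int, parity: str = "none"):
    """Line levels for one frame. The line idles high between frames."""
    data = [(byte >> i) & 1 for i in range(8)]  # LSB first
    frame = [0] + data                          # start bit pulls the line low
    if parity == "even":
        frame.append(sum(data) % 2)             # total number of 1s becomes even
    elif parity == "odd":
        frame.append(1 - sum(data) % 2)         # total number of 1s becomes odd
    frame.append(1)                             # stop bit returns the line high
    return frame


def uart_decode(frame):
    """Recover the byte, checking the start and stop bits first."""
    if frame[0] != 0 or frame[-1] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(frame[1:9]))
```

Without parity, each byte costs 10 bit periods on the wire, so the usable data rate is 80% of the baud rate.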

SPI

Serial Peripheral Interface is a synchronous communication protocol that uses a shared clock at both transmitter and receiver
Master initiates communication and generates clock
Slave devices used as peripherals
- A single master can communicate with multiple slaves on the same SPI bus
Four signals required
- SCLK - the clock generated by the master
- MISO - master in slave out
  - Data input from slave to master
- MOSI - master out slave in
  - Data output from master to slave
- SS - slave select
  - Select which slave is being communicated with
  - Typically active low
Each slave connected to a master requires a separate slave select line
Master outputs the same clock for synchronous communication

To initiate communication, the master sets the required slave select line low and sends a clock signal
On each clock edge, the data can be sent bi-directionally on MOSI and MISO
With multiple slaves, the MISO line must only be driven by one at a time so other slaves must be set to high impedance
All devices must agree on clock frequency, polarity and phase
- Specified in datasheets

I2C

Inter-Integrated Circuit protocol is similar to SPI but has different features
- Uses fewer wires due to lack of slave select lines
- Uses addressing to allow a large number of devices to share the same lines
Only two wires
- SDA - serial data
- SCL - serial clock
- I2C clock is usually 100kHz
All devices connected to an I2C bus act the same
Whichever device is transmitting is the master for that communication
Pull-up resistors keep each line high when no device is transmitting
The device intending to communicate indicates this by pulling SDA low
Data is then put onto the bus while SCL is low and sampled by slave devices during the rising edges
Simpler signalling means more complicated data framing
- Pulled low to start
- 7 bit address sent
- 1 bit for read/write mode
- 1 bit slave ack
- 8 bit word
- 1 bit ack signal
- Stop bit

Takes 20 cycles to read a single byte
- Vs 10 for SPI
I2C is also half-duplex with a slow clock
I2C is used when there are fewer pins available, SPI is needed for higher data throughput
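The cycle count for a single byte follows directly from the frame fields listed above. The sketch below adds them up and shows how the address and read/write bit are packed into the first byte on the wire; the names are illustrative and the 100kHz clock is the typical value mentioned above.

```python
# (field name, width in clock cycles) for a single-byte transfer
I2C_FRAME = [
    ("start", 1),
    ("address", 7),
    ("read/write", 1),
    ("address ack", 1),
    ("data", 8),
    ("data ack", 1),
    ("stop", 1),
]


def frame_cycles(fields) -> int:
    """Total clock cycles needed for the frame."""
    return sum(width for _, width in fields)


def transfer_time_us(fields, clock_hz: float) -> float:
    """Time on the bus for one frame, in microseconds."""
    return frame_cycles(fields) / clock_hz * 1e6


def address_byte(address: int, read: bool) -> int:
    """7-bit address followed by the read/write bit (1 = read)."""
    return ((address & 0x7F) << 1) | int(read)
```

At 100kHz a single byte therefore occupies the bus for about 200 microseconds.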

High Speed Serial I/O

Higher speed communication off chip is facilitated by special serialiser/deserialiser (SerDes) blocks
- These take data words and serialise them, and transmit them over differential pairs of I/O pins
- Controlled by high-speed clocks
- Can achieve speeds of up to 10s of gigabits per second
Differential signalling is used to improve noise resistance at high speed
- Signal sent twice, one an inverted copy of the other
- Balanced lines means better resistance to EM interference
Clock information is encoded in data that is sent
Data is encoded and scrambled to ensure sufficient transitions between 1s and 0s for receiver to be able to decode
Extra bits are added to the data bits to ensure sufficient transitions and DC balance
Specific schemes are specified by different physical layer standards
- 8b/10b means 2 extra bits are added to each byte
Effective data rate is determined from two specifications
- Baud rate
- Encoding scheme
- For example, a 2Gbaud line rate with 8b/10b encoding gives 1.6Gb/s of data, or 200MB/s
  - 20% of baud rate is encoding overhead
Multiple lanes are used to improve throughput
- PCIe gen 3 has a transfer rate of 8GT/s per lane and uses 128b/130b encoding
  - 985 MB/s
  - 1.5% encoding overhead
  - 16 lanes (PCIe3 x16) gives about 16GBps
Used in many interfaces
- Serial ATA for disks and storage
- Gigabit ethernet
- Used over a variety of physical media
Circuits required to interface with high speed I/O have to be designed carefully to meet strict timing requirements
- Vendors usually provide IP for this
- IP blocks designed to specific standard for the interface they are meant to be using
The simplest form of communicating between modules in design is the ready/valid handshaking
- One module is a source, another a sink
- The sink module asserts a ready signal when it is ready to consume data
- The source module asserts a valid signal when it is outputting valid data
- At any clock edge when both ready and valid are asserted, data is transferred on the data line
- Can introduce a bottleneck
In the source module, the pipeline can be halted when the sink is not ready, and resumed when ready
- In the sink, ready is asserted when data is ready to be accepted
- Such an interface allows a FIFO buffer to be inserted between modules to offer more isolation
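The ready/valid handshake can be modelled cycle by cycle. In this sketch the source holds each word with valid asserted until the sink's ready is high on the same clock edge; the function name and the way ready is supplied as a list of per-cycle values are assumptions for illustration.

```python
def simulate_handshake(data, ready_pattern):
    """Return the words received and a per-cycle log of (cycle, valid, ready, transfer)."""
    received = []
    log = []
    index = 0
    for cycle, ready in enumerate(ready_pattern):
        valid = index < len(data)            # source has a word to offer
        transfer = bool(valid and ready)     # both asserted at this clock edge
        if transfer:
            received.append(data[index])
            index += 1                       # source moves on to the next word
        log.append((cycle, valid, bool(ready), transfer))
    return received, log
```

Cycles where valid is high but ready is low are stalls, which is where the bottleneck mentioned above appears.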

AXI4

Most hybrid FPGAs include an ARM processor
Advanced microcontroller bus architecture (AMBA) is an on-chip interconnect specification introduced by ARM for use in SoCs
Defines a number of interfaces
- AXI4 for high performance memory mapped communication
- AXI4-Lite is a simpler interface for low throughput
- AXI4-Stream is for high speed streaming data
Reads are initiated by a master over the read address channel
- The slave responds with data over the read data channel
Writes are similar, with address and control data being placed on the write address channel
- The master sends data over the write data channel
- Slave responds on the write response channel
Read and write channels are separate, allowing bidirectional communication
AXI4 supports bursts of up to 256 words
Each master/slave pair can have a separate clock
A system consists of multiple masters and slaves connected on an interconnect
Most vendor IP is provided with an AXI4 interface to simplify integration into a design
- Different interface specifications are shown in datasheets

Processor Implementation

Fixed Purpose Processors

Digital circuits designed to implement a specific application, when fabricated in silicon, are Application Specific Integrated Circuits (ASICs).
The alternative is creating FPGA bitstreams and loading them into FPGAs
Changing the function of an FPGA is easy, creating new ASICs is expensive.

Custom datapaths for specific applications have the benefit of high performance due to being tailored for the use case, and being able to exploit parallelism. When repeating the same computation on a stream of data, a simple feed forward datapath is most performant, and can be pipelined to improve throughput.

The example below shows a feed-forward data path for multiplying two complex numbers, with six pipeline stages.

Finite Impulse Response (FIR) filters are also easy to map to hardware. The delay blocks are just registers, and the arithmetic blocks are implemented directly. Using the transpose form shortens the critical path to improve performance further.
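The two filter structures can be compared with a software model. This is a sketch with illustrative function names: the lists stand in for the hardware registers, and each loop iteration represents one clock cycle.

```python
def fir_direct(samples, coeffs):
    """Direct form: a delay line of past inputs feeds the multipliers and an adder chain."""
    delay_line = [0] * len(coeffs)
    out = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]
        out.append(sum(c * d for c, d in zip(coeffs, delay_line)))
    return out


def fir_transposed(samples, coeffs):
    """Transposed form: the input drives every multiplier, registers sit between adders."""
    n = len(coeffs)
    regs = [0] * (n - 1)
    out = []
    for x in samples:
        products = [c * x for c in coeffs]
        out.append(products[0] + (regs[0] if regs else 0))
        # Each register stores its product plus the next register's old value
        regs = [products[i + 1] + (regs[i + 1] if i + 1 < n - 1 else 0)
                for i in range(n - 1)]
    return out
```

Both forms produce identical outputs, but in the transposed form the path between registers is only one multiplier and one adder, rather than a multiplier followed by the whole adder chain.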

General Purpose Processors

General purpose processors need to support:

A set of arithmetic operations
Movement of data in and out of arithmetic logic
A way of breaking down functions into discrete steps
A way to program the circuit to carry out the steps

Each of these components can be constructed in Verilog using basic synchronous elements.

Program Counter

Just a register with an input and output (32 bits).

module pc_reg(input clk, rst, input [31:0] pcnext, output reg [31:0] pc);

always @ (posedge clk) begin
    if (rst) //point to base address on reset
        pc <= 32'd0;
    else
        pc <= pcnext;
end
endmodule

Register File

The register file contains 32 32-bit registers, and has two read ports and one write port.

Two read addresses, one for each port (ra1,ra2)
A write address (wa3)
A write data input (wd3)
Two read outputs (rd1, rd2)
A write enable input (we3)


module regfile (input clk, we3,
    input [4:0] ra1, ra2, wa3,
    input [31:0] wd3,
    output [31:0] rd1, rd2);

reg [31:0] rf [0:31];

always @ (posedge clk) begin
    if(we3) rf[wa3] <= wd3;
end

// register 0 always reads as zero
assign rd1 = (ra1 != 5'd0) ? rf[ra1] : 32'd0;
assign rd2 = (ra2 != 5'd0) ? rf[ra2] : 32'd0;

endmodule

RAM

Standard memory with one read and one write port
Reads are combinational and writes synchronous

module dmem (input clk, we,
    input [31:0] ad, wd,
    output [31:0] rd);

reg [31:0] ram [0:65535];

// byte-addressing to word-aligned
always @ (posedge clk)
    if(we) ram[ad[31:2]] <= wd;

assign rd = ram[ad[31:2]];

endmodule

Combinational elements

The processor also contains other combinational elements, such as multiplexers, incrementers and sign extension, all of which are fairly easy to implement. The ALU may be more complex, but a simple example is shown below. It supports 8 different functions, selected using a function control input func[2:0].

module alu (input [31:0] a,b, input [2:0] func,
    output reg [31:0] out);

wire [31:0] bfin = func[2] ? ~b : b;
wire [31:0] sumout = a + bfin + func[2];

always @ *
    case (func[1:0])
        2'b00: out = a & bfin;
        2'b01: out = a | bfin;
        2'b10: out = sumout;
        2'b11: out = sumout[31];
    endcase

endmodule
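
To make the function encoding easier to check by hand, here is a small Python model of the ALU above. It is only a sketch that mirrors the Verilog line by line; the Python function name is an assumption and values are masked to 32 bits to imitate the hardware wrap-around. Reading off the code, func[2] inverts b and supplies the carry-in (so 110 is subtract), and 111 returns the sign bit of a - b, which acts as a signed "set less than" when the subtraction does not overflow.

```python
MASK = 0xFFFFFFFF  # keep results to 32 bits, as the hardware would


def alu(a, b, func):
    """Behavioural model of the 3-bit-function ALU shown above."""
    invert = (func >> 2) & 1                     # func[2]
    bfin = (~b & MASK) if invert else (b & MASK)
    sumout = (a + bfin + invert) & MASK          # a + b, or a - b when inverted
    op = func & 0b11                             # func[1:0]
    if op == 0b00:
        return a & bfin
    if op == 0b01:
        return a | bfin
    if op == 0b10:
        return sumout
    return (sumout >> 31) & 1                    # sign bit of the sum


print(alu(5, 3, 0b010))  # add -> 8
print(alu(5, 3, 0b110))  # subtract -> 2
print(alu(3, 5, 0b111))  # 3 < 5 -> 1
```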

Processor control

The processor also has a control unit, which asserts the signals that tell the datapath how to process a particular instruction. The control unit uses combinational logic to decode the instruction and then outputs signals to control the rest of the processor.

Pipelining

A pipeline processor requires register stages to be added within the data and control paths.

ES3C5 - Signal Processing

Brief Notes + Equations (Aaron)

Brief Notes + Equations

This is just a collection of notes for ES3C5 Signal Processing that I have found useful to have on hand and easily accessible.

The notes made by Adam (MO) cover everything so this is just intended to be an easy to search document.

Download lecture notes here

Use ./generateTables.sh ../src/es2c5/brief-notes.md in the scripts folder.

Laplace Conversion
Laplace Table	Insert table here
Finding Time Domain Output $y (t)$
Input as Delta Function $δ (t)$	$x (t) = δ (t)$
Input as Step Function $u (t)$	$x (t) = u (t)$
LTI System Properties	LTI =

3 - Poles and Zeros
General Transfer Function as 2 polynomials	$H (s) = \frac{b _{0} s ^{M} + b _{1} s ^{M - 1} + \dots + b _{M - 1} s + b _{M}}{a _{0} s ^{N} + a _{1} s ^{N - 1} + \dots + a _{N - 1} s + a _{N}}$
Factorised Transfer Function	$H (s) = K \frac{( s - z _{1} ) ( s - z _{2} ) \dots ( s - z _{M} )}{( s - p _{1} ) ( s - p _{2} ) \dots ( s - p _{N} )}$
Real system as real	$M \leq N$
Zero Definition	Roots z of the numerator. When $s =$ any $z$ , $H (s) = 0$
Pole Definition	Roots p of the denominator. When $s =$ any $p$ , $H (s)$ approaches $\infty$
Transfer Function Gain	K is the overall transfer function gain. (Coefficient of $s^{M}$ and $s^{N}$ is 1.)
Stable System	A system is considered stable if its impulse response tends to zero or a finite ...
Components to Response	Real Components $\Rightarrow$ Exponential Response $∣$ Imaginary $\Rightarrow$ angular f...

4 - Analog Frequency Response
Frequency Response	Frequency response of a system = output in response to sinusoid input of unit ma...
Continuous Fourier Transform	$F (jω) = \int_{t = 0}^{\infty} f (t) e^{- jω t} d t$
Inverse Fourier Transform	$f (t) = \frac{1}{2 π} \int_{ω = - \infty}^{\infty} F (jω) e^{jω t} d ω$
Magnitude of Frequency Response (MFR) $∣ H (jω) ∣$	$∣ H (jω) ∣ = ∣ K ∣ \frac{\prod _{i = 1}^{M} ∣ jω - z _{i} ∣}{\prod _{i = 1}^{N} ∣ jω - p _{i} ∣}$
Phase Angle of Frequency Response (PAFR) $∠ H (jω)$ - $K > 0$	$∠ H (jω) = \sum_{i = 1}^{M} ∠ (jω - z_{i}) - \sum_{i = 1}^{N} ∠ (jω - p_{i})$
Phase Angle of Frequency Response (PAFR) $∠ H (jω)$ - $K < 0$	$∠ H (jω) = \sum_{i = 1}^{M} ∠ (jω - z_{i}) - \sum_{i = 1}^{N} ∠ (jω - p_{i}) + π$

5 - Analog Filter Design
Ideal Filters	Each ideal filter has unambiguous
Realisability	System starts to respond to input before input is applied. Non-zero for $t < 0$ .
Causality	Output depends only on past and current inputs, not future inputs.
Realising Filters	Realise as we seek smooth behaviour.
Gain $G_{dB}$ (linear $\to$ dB)	$G_{dB} = 20 \log_{10} (G_{linear})$
Gain $G_{linear}$ (dB $\to$ linear)	$G_{linear} = 10^{\frac{G_{dB}}{20}}$
Transfer Function of Nth Order Butterworth Low Pass Filter	$H (s) = \frac{ω _{c}^{N}}{\prod _{n = 1}^{N} ( s - p _{n} )}$
Frequency Response of common Low pass Butterworth filter	$∣ H (jω) ∣ = \frac{1}{\sqrt{1 + ( \frac{ω}{ω_{c}} )^{2 N}}}$
Normalised Frequency Response of common Low pass Butterworth filter	$∣ H (jω) ∣ = \frac{1}{\sqrt{1 + ω^{2 N}}}$
Minimum Order for Low Pass Butterworth	$N = \frac{\log ( \frac{10^{- \frac{G_{s}}{10}} - 1}{10^{- \frac{G_{p}}{10}} - 1} )}{2 \log ( \frac{ω_{s}}{ω_{p}} )}$
Low pass Butterworth Cut-off frequency $ω_{c}$ (Pass)	$ω_{c} = \frac{ω _{p}}{( 1 0 ^{- \frac{G _{p}}{10}} - 1 ) ^{\frac{1}{2 N}}}$
Low pass Butterworth Cut-off frequency $ω_{c}$ (Stop)	$ω_{c} = \frac{ω _{s}}{( 1 0 ^{- \frac{G _{s}}{10}} - 1 ) ^{\frac{1}{2 N}}}$

6 - Periodic Analogue Functions
Exponential Representation from Trigonometric representation	$e^{j x} = \cos x + j \sin x$
Trigonometric from exponential - Real (cos)	$\cos x = \operatorname{Re}\{e^{j x}\} = \frac{e^{j x} + e^{- j x}}{2}$
Trigonometric from exponential - Imaginary (sin)	$\sin x = \operatorname{Im}\{e^{j x}\} = \frac{e^{j x} - e^{- j x}}{2 j}$
Fourier Series	$x (t) = \sum_{k = - \infty}^{\infty} X_{k} e^{jk ω_{0} t}$
Fourier Coefficients	$X_{k} = \frac{1}{T _{0}} \int_{T_{0}} x (t) e^{- jk ω_{0} t} d t$
Fourier Series of Periodic Square Wave (Example)	$x (t) = \sum_{k = - \infty}^{\infty} \frac{A τ}{T_{0}} \operatorname{sinc} (k ω_{0} \frac{τ}{2}) e^{jk ω_{0} t}$
Output of LTI system from Signal with multiple frequency components	$y (t) = \sum_{k = - \infty}^{\infty} H (jk ω_{0}) X_{k} e^{jk ω_{0} t}$
Filtering Periodic Signal (Example 6.2)	See example 6.2 below...

7 - Computing with Analogue Signals

8 - Signal Conversion between Analog and Digital
Digital Signal Processing Workflow	See diagram:
Sampling	Convert signal from continuous-time to discrete-time. Record amplitude of the an...
Oversample	Sample too often, use more complexity, wasting energy
Undersample	Not sampling often enough, get
Aliasing	Multiple signals of different frequencies yield the same data when sampled.
Nyquist Rate	$ω_{s} = 2 ω_{B}$
Quantisation	The mapping of
Data Interpolation	Convert digital signal back to analogue domain, reconstruct continous signal fro...
Hold Circuit	Simplest interpolation in a DAC, where amplitude of continuous-time signal match...
Resolution	$\frac{1}{2 ^{W}} \times 100$
Dynamic range	$20 \log_{10} 2^{W} \approx 6 W \text{ dB}$

9 - Z-Transforms and LSI Systems
LSI Rules	Linear Shift-Invariant
Common Components of LSI Systems	For digital systems, only need 3 types of LSI circuit components.
Discrete Time Impulse Function	Impulse response is very similar in digital domain, as it is the system output w...
Impulse Response Sequence	$h [n] = F \{δ [n]\}$
LSI Output	$y [n] = \sum_{k = - \infty}^{\infty} x [k] h [n - k] = x [n] * h [n] = h [n] * x [n]$
Z-Transform	$Z \{f [n]\} = F (z) = \sum_{k = 0}^{\infty} f [k] z^{- k}$
Z-Transform Examples	Simple examples...
Binomial Theorem for Inverse Z-Transform	$\sum_{n = 0}^{\infty} a^{n} = \frac{1}{1 - a}$
Z-Transform Properties	Linearity, Time Shifting and Convolution
Sample Pairs	See example
Z-Transform of Output Signal	$Y (z) = Z \{y [n]\} = Z \{x [n] * h [n]\} = Z \{x [n]\} Z \{h [n]\} = X (z) H (z) \Rightarrow Y (z) = X (z) H (z)$
Finding time-domain output $y [n]$ of an LSI System	Transform, product, inverse.
Difference Equation	Time domain output $y [n]$ directly as a function of time-domain input $x [n]$ as ...
Z-Transform Table	See table...

10 - Stability of Digital Systems
Z-Domain Transfer Function	$H (z) = \frac{b [ M ] z ^{- M} + b [ M - 1 ] z ^{1 - M} + \dots + b [ 1 ] z ^{- 1} + b [ 0 ]}{a [ N ] z ^{- N} + a [ N - 1 ] z ^{1 - N} + \dots + a [ 1 ] z ^{- 1} + 1}$
General Difference Equation	$y [n] = \sum_{k = 0}^{M} b [k] x [n - k] - \sum_{k = 1}^{N} a [k] y [n - k]$
Poles and Zeros of Transfer Function	$H (z) = K \frac{( z - z_{1} ) ( z - z_{2} ) \dots ( z - z_{M} )}{( z - p_{1} ) ( z - p_{2} ) \dots ( z - p_{N} )} = K \frac{\prod_{i = 1}^{M} ( z - z_{i} )}{\prod_{i = 1}^{N} ( z - p_{i} )}$
Bounded Input and Bounded Output (BIBO) Stability	Stable if bounded input sequence yields bounded output sequence.

11 - Digital Frequency Response
LSI Frequency Response	Output in response to a sinusoid input of unit magnitude and some specified freq...
Discrete-Time Fourier Transform (DTFT) - Digital Frequency Response	$F (e^{j Ω}) = F (Ω) = \sum_{k = 0}^{\infty} f [k] e^{- jk Ω}$
Inverse Discrete-Time Fourier Transform (Inverse DTFT)	$f [k] = \frac{1}{2 π} \int_{- π}^{π} F (e^{j Ω}) e^{jk Ω} d Ω$
LSI Transfer Function	$H (e^{j Ω}) = K \frac{\prod _{i = 1}^{M} e ^{j Ω} - z _{i}}{\prod _{i = 1}^{N} e ^{j Ω} - p _{i}}$
Magnitude of Frequency Response (MFR) $∣ H (e^{j Ω}) ∣$	$∣ H (e^{j Ω}) ∣ = ∣ K ∣ \frac{\prod_{i = 1}^{M} ∣ e^{j Ω} - z_{i} ∣}{\prod_{i = 1}^{N} ∣ e^{j Ω} - p_{i} ∣}$
Phase Angle of Frequency Response (PAFR) $∠ H (e^{j Ω})$ - $K > 0$	$∠ H (e^{j Ω}) = \sum_{i = 1}^{M} ∠ (e^{j Ω} - z_{i}) - \sum_{i = 1}^{N} ∠ (e^{j Ω} - p_{i})$
Example 11.1 - Simple Digital High Pass Filter	See image...

12 - Filter Difference equations and Impulse responses
Z-Domain Transfer Function	$H (z) = \frac{b [ M ] z ^{- M} + b [ M - 1 ] z ^{1 - M} + \dots + b [ 1 ] z ^{- 1} + b [ 0 ]}{a [ N ] z ^{- N} + a [ N - 1 ] z ^{1 - N} + \dots + a [ 1 ] z ^{- 1} + 1}$
General Difference Equation	$y [n] = \sum_{k = 0}^{M} b [k] x [n - k] - \sum_{k = 1}^{N} a [k] y [n - k]$
Example 12.1 Proof y[n] can be obtained directly from H(z)	See image...
Order of a filter	$\text{order} = \max (N, M)$
Taps in a filter	Minimum number of unit delay blocks required. Equal to the order of the filter.
Example 12.2 Filter Order and Taps	See example...
Tabular Method for Difference Equations	Given a difference equation, and its input x[n], can write specific output y[n] ...
Example 12.3 Tabular Method Example	See example
Infinite Impulse Response (IIR) Filters	IIR filters have
Example 12.4 IIR Filter	See example
Finite Impulse Response (FIR) Filters	FIR filters are non-recursive (ie, no feedback components), so a[k] = 0 for k!=0...
FIR Difference Equation	$y [n] = \sum_{k = 0}^{M} b [k] x [n - k]$
FIR Transfer function	$H (z) = b [M] z^{- M} + b [M - 1] z^{1 - M} + \dots + b [1] z^{- 1} + b [0]$
FIR Transfer Function - Roots	$H (z) = \frac{b [ M ] + b [ M - 1 ] z + \dots + b [ 1 ] z ^{M - 1} + b [ 0 ] z ^{M}}{z ^{M}} = K \frac{\prod _{k = 0}^{M} ( z - z _{k} )}{z ^{M}}$
FIR Stability	FIR FILTERS ARE ALWAYS STABLE. As in transfer function, all M poles are all on t...
FIR Linear Phase Response	Often have a linear phase response. The phase shift at the output corresponds to...
FIR Filter Example	See example 12.5
Ideal Digital Filters	Four main types of filter magnitude responses (defined over $0 \le \Omega \le \p...$
Realising Ideal Digital Filters	Use poles and zeros to create simple filters. Only need to consider response ove...
Example 12.6 - Simple High Pass Filter Design	See diagram

13 - FIR Digital Filter Design
Discrete Time Radial Frequency	$Ω = \frac{ω}{f _{s}} = \frac{2 π f}{f _{s}}$
Realising Ideal Digital Filter	Aim is to get as close as possible to
Practical Digital Filters	Good digital low pass filter will try to realise the (unrealisable) ideal respon...
Windowing	Window Method - design process: start with ideal $h_{i} [n]$ and windowing infinite...
Windowing Criteria
Practical FIR Filter Design Example 13.2	See example...
Specification for FIR Filters Example 13.3	See example...

14 - Discrete Fourier Transform and FFT
Discrete Fourier Transform DFT	$X [k] = \sum_{n = 0}^{N - 1} x [n] e^{- jnk \frac{2 π}{N}}$
Inverse DFT	$x[n] = \frac{1}{N}\sum_{k=0}^{N-1}X[k]e^{jnk\frac{2\pi}{N}}, \quad n = \{ 0,1,2, \cdots , N-1 \}$
Example 14.1 DFT of Sinusoid	See example
Zero Padding	Artificially increase the length of the time domain signal $x [n]$ by adding zero...
Example 14.2 Effect of Zero Padding	See example
Fast Fourier Transform FFT	Family of algorithms that evaluate DFT with complexity of $O (N \log_{2} N)$ compare...

15 - Computing Digital Signals

16 - Digital vs Analogue Recap
Aperiodic (simple periodic) continuous-time signal f(t)	Laplace, fourier transform.
More Complex Continuous-time signal f(t)	Fourier series, multiples of fundamental, samples of frequency response.
Discrete-time signal f[n] (infinite length)	Z-Domain, Discrete-time fourier transform
Discrete-time signal f[n] (finite length)	Finite Length N, convert to frequency domain (DFT), N points distributed over 2 ...
Stability	S-domain: negative real component, Z domain: poles within unit circle.
Bi-Linearity	Not core module content.

17 - Probabilities and random signals
Random Variable	A quantity that takes a non-deterministic values (ie we don't know what the valu...
Probability Distribution	Defines the probability that a random variable will take some value.
Probability Density Function (PDF) - Continuous random variables	$\int_{x = x_{min}}^{x_{max}} p (x) d x = 1$
Probability mass function (PMF) - Discrete random variables	$\sum_{x = x_{min}}^{x_{max}} p (x) = 1$
Moments	Discrete: $E [X^{n}] = \sum_{x = x_{min}}^{x_{max}} x^{n} p (x)$ . Continuous: $E [X^{n}] = \int_{x = x_{min}}^{x_{max}} x^{n} p (x) d x$
Uniform Distribution	Equal probability for a random variable to take any value in its domain, ie over...
Bernoulli	Discrete probability distribution with only 2 possible values (yes no, 1 0, etc)...
Gaussian (Normal) Distribution	Continuous probability distribution over $(- \infty, \infty)$ , where values closer...
Central Limit Theorem (CLT)	Sum of independent random variables can be approximated with Gaussian distributi...
Independent Random Variables	No dependency on each other (i.e., if knowing the value of one random variable g...
Empirical Distributions	Scaled histogram by total number of samples.
Random Signals	Random variables can appear in signals in different ways, eg:

18 - Signal estimation
Signal Estimation	Signal estimation, refers to estimating the values of parameters embedded in a s...
Linear Model	See equation
Generalised Linear Form	See equation
Optimal estimate	See equation
Predicted estimate	See equation
Observation Matrix $Θ$	See below
Mean Square Error (MSE)	See equation
Example 18.1	See example
Example 18.2	See example
Linear Regression	$\text{theta} = \text{Obs} \backslash y$
Weighted Least Squares Estimate	Weighted least squares, includes a weight matrix W, where each sample associated...
Maximum Likelihood Estimation (MLE)	See equation

19 - Correlation and Power spectral density
Correlation	Correlation gives a measure of time-domain similarity between two signals.
Cross Correlation	$R_{x_{1} x_{2}} [k] \approx \frac{1}{N - k} \sum_{n = 0}^{N - k} x_{1} [n] x_{2} [k + n]$
Example 19.1 - Discrete Cross-Correlation	See example
Autocorrelation	Correlation of a signal with itself, ie $x_{2} [n] = x_{1} [n]$ or $x_{2} (t) = x_{1} (t)$
Example 19.2 - Discrete Autocorrelation	See example
Example 19.3 - Correlation in MATLAB	See example

20 - Image Processing
Types of colour encoding	Binary (0, 1), Indexed (colour map), Greyscale (range 0->1), True Colour (RGB)
Notation	See below
Digital Convolution	$y [n] = \sum_{k = - \infty}^{\infty} x [k] h [n - k] = x [n] * h [n] = h [n] * x [n]$
Example 20.1 - 1D Discrete Convolution	See example
Example 20.2 - Visual 1D Discrete Convolution	See example
Image Filtering	Determine output y[i][j] from input x[i][j] through filter (kernel) h[i][j]
Edge Handling	Zero-padding and replicating
Kernels	Different types of kernels.
Example 20.3 - Image Filtering	See example

Part 1 - Analogue Signals and Systems

Laplace Conversion

Laplace Table

Insert table here

Finding Time Domain Output $y (t)$

Transform $x (t)$ and $h (t)$ into Laplace domain
Find product $Y (s) = X (s) H (s)$
Take inverse Laplace transform $Y (s)$

Input as Delta Function $δ (t)$

$x (t) = δ (t)$ Then $X (s) = 1$ , so $Y (s) = H (s)$ .

Input as Step Function $u (t)$

$x (t) = u (t)$ Then $X (s) = \frac{1}{s}$ , so $Y (s) = \frac{H ( s )}{s}$ .

LTI System Properties

LTI = Linear Time Invariant.

LTI systems are linear. Given system $F \{\cdot\}$ and signals $x_{1} (t)$ , $x_{2} (t)$ etc:
- LTI is additive: $F \{x_{1} (t) + x_{2} (t)\} = F \{x_{1} (t)\} + F \{x_{2} (t)\}$
- LTI is scalable (or homogeneous): $F \{α x_{1} (t)\} = α F \{x_{1} (t)\}$
LTI is time-invariant, i.e. if output $y (t) = F \{x_{1} (t)\}$ then:
- $y (t - τ) = F \{x_{1} (t - τ)\}$

3 - Poles and Zeros

General Transfer Function as 2 polynomials

$H (s) = \frac{b _{0} s ^{M} + b _{1} s ^{M - 1} + \dots + b _{M - 1} s + b _{M}}{a _{0} s ^{N} + a _{1} s ^{N - 1} + \dots + a _{N - 1} s + a _{N}}$

Factorised Transfer Function

$H (s) = K \frac{( s - z_{1} ) ( s - z_{2} ) \dots ( s - z_{M} )}{( s - p_{1} ) ( s - p_{2} ) \dots ( s - p_{N} )}$ The polynomials are factorised and rewritten as a ratio of products: $H (s) = K \frac{\prod_{i = 1}^{M} ( s - z_{i} )}{\prod_{i = 1}^{N} ( s - p_{i} )}$

Real system as real

$M \leq N$ , where the numerator is an $M$ th order polynomial with coefficients $b$ and the denominator is an $N$ th order polynomial with coefficients $a$ . For a system to be real, the order of the numerator polynomial must be no greater than the order of the denominator polynomial, i.e. $M \leq N$ .

Zero Definition

Roots z of the numerator. When $s =$ any $z$ , $H (s) = 0$

Pole Definition

Roots p of the denominator. When $s =$ any $p$ , $H (s)$ approaches $\infty$

Transfer Function Gain

K is the overall transfer function gain. (Coefficient of $s^{M}$ and $s^{N}$ is 1.)

Stable System

A system is considered stable if its impulse response tends to zero or a finite value in the time domain.

Requires the real components of all poles to be negative, i.e. all poles lie in the left half of the complex s-plane on a pole-zero plot (to the left of the imaginary s-axis).

Components to Response

Real Components $\Rightarrow$ Exponential Response $∣$ Imaginary $\Rightarrow$ angular frequency of oscillating responses.

4 - Analog Frequency Response

Frequency Response

Frequency response of a system = output in response to sinusoid input of unit magnitude and specified frequency, $ω$ . Response is measured as magnitude and phase angle.

Continuous Fourier Transform

$F (jω) = \int_{t = 0}^{\infty} f (t) e^{- jω t} d t$

Laplace transform evaluated on the imaginary s-axis at some frequency $s = jω$ .

$ω =$ radial frequency, $\frac{\text{rad}}{\text{s}}$

Inverse Fourier Transform

$f (t) = \frac{1}{2 π} \int_{ω = - \infty}^{\infty} F (jω) e^{jω t} d ω$

Magnitude of Frequency Response (MFR) $∣ H (jω) ∣$

$∣ H (jω) ∣ = ∣ K ∣ \frac{\prod _{i = 1}^{M} ∣ jω - z _{i} ∣}{\prod _{i = 1}^{N} ∣ jω - p _{i} ∣}$

In words, the magnitude of the frequency response (MFR) $∣ H (jω) ∣$ is equal to the gain multiplied by the magnitudes of the vectors corresponding to the zeros, divided by the magnitudes of the vectors corresponding to the poles.

Phase Angle of Frequency Response (PAFR) $∠ H (jω)$ - $K > 0$

$∠ H (jω) = \sum_{i = 1}^{M} ∠ (jω - z_{i}) - \sum_{i = 1}^{N} ∠ (jω - p_{i})$

Phase Angle of Frequency Response (PAFR) $∠ H (jω)$ - $K < 0$

$∠ H (jω) = \sum_{i = 1}^{M} ∠ (jω - z_{i}) - \sum_{i = 1}^{N} ∠ (jω - p_{i}) + π$

In words, the phase angle of the frequency response (PAFR) $∠ H (jω)$ is equal to the sum of the phases of the vectors corresponding to the zeros, minus the sum of the phases of the vectors corresponding to the poles, plus $π$ if the gain is negative.

Each phase vector is measured from the positive real s-axis (or a line parallel to the real s-axis if the pole or zero is not on the real s-axis).
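
As a minimal sketch of the two formulas above, the Python below evaluates the magnitude and phase at one frequency directly from a list of zeros and poles. The function name and the example system $H (s) = \frac{1}{s + 1}$ are illustrative assumptions, not taken from the lecture notes.

```python
import cmath
import math


def freq_response(K, zeros, poles, omega):
    """Return (magnitude, phase in radians) of H(j*omega) from poles and zeros."""
    s = 1j * omega
    magnitude = abs(K)
    phase = math.pi if K < 0 else 0.0    # negative gain adds pi to the phase
    for z in zeros:
        magnitude *= abs(s - z)          # length of the vector from the zero to j*omega
        phase += cmath.phase(s - z)      # angle measured from the positive real axis
    for p in poles:
        magnitude /= abs(s - p)
        phase -= cmath.phase(s - p)
    return magnitude, phase


# Hypothetical first-order low pass: single pole at s = -1, no zeros, K = 1
mag, ph = freq_response(1, [], [-1], 1.0)
print(mag, ph)   # about 0.7071 and -0.7854 (i.e. 1/sqrt(2) and -pi/4)
```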

5 - Analog Filter Design

Ideal Filters

Each ideal filter has unambiguous pass bands, which are ranges of frequencies that pass through the system without distortion, and stop bands, which are ranges of frequencies that are rejected and do not pass through the system without significant loss of signal strength. The transition band between stop and pass bands in ideal filters has a size of 0; transitions occur at single frequencies.

Realisability

A system is unrealisable if it starts to respond to an input before the input is applied, i.e. its impulse response is non-zero for $t < 0$ .

Causality

Output depends only on past and current inputs, not future inputs.

Realising Filters

Realise as we seek smooth behaviour.

Drop $h_{i} (t)$ for $t < 0$ ( $h_{i} (t) u (t)$ )
- Would not get suitable behaviour in frequency domain, as discarded 50% of system energy
But can tolerate delays
- So shift sinc to the right
- Time domain shift = scaling by a complex exponential in the Laplace domain
- Also true for the Fourier transform, so a delay in time maintains the magnitude but changes the phase of the frequency response
Truncate
- As can't wait for infinity, so truncate impulse response.

Gain $G_{dB}$ (linear $\to$ dB)

$G_{dB} = 20 \log_{10} (G_{linear})$

Gain $G_{linear}$ (dB $\to$ linear)

$G_{linear} = 10^{\frac{G_{dB}}{20}}$
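
A quick sketch of the two conversions in Python (the function names are assumptions chosen for this example):

```python
import math


def to_db(g_linear):
    """Linear amplitude gain -> dB."""
    return 20 * math.log10(g_linear)


def from_db(g_db):
    """dB -> linear amplitude gain."""
    return 10 ** (g_db / 20)


print(to_db(10))                  # 20.0
print(from_db(-20))               # 0.1
print(to_db(1 / math.sqrt(2)))    # about -3.01
```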

Transfer Function of Nth Order Butterworth Low Pass Filter

$H (s) = \frac{ω _{c}^{N}}{\prod _{n = 1}^{N} ( s - p _{n} )}$

Butterworth = Maximally flat in pass band (freq response magnitudes are flat as possible for given order)

$p_{n}$ = nth pole
- = $jω_{c} e^{\frac{jπ}{2 N} (2 n - 1)}$
- = $- ω_{c} \sin (\frac{π ( 2 n - 1 )}{2 N}) + j ω_{c} \cos (\frac{π ( 2 n - 1 )}{2 N})$
- Form semi-circle to left of imaginary s-axis
$ω_{c}$ = half-power cut-off frequency
- Frequency where filter gain is $G_{linear} = \frac{1}{\sqrt{2}}$ (half of the power) or $G_{dB} = - 3 \text{ dB}$
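
The pole formula can be checked numerically. The sketch below (function names are assumptions) generates the $N$ poles, confirms they all sit to the left of the imaginary axis, and evaluates the gain at $ω = ω_{c}$ from the transfer function above, which comes out at $\frac{1}{\sqrt{2}}$ , i.e. about $- 3$ dB.

```python
import cmath
import math


def butterworth_poles(N, wc):
    """Poles p_n = j*wc*exp(j*pi*(2n-1)/(2N)) for n = 1..N."""
    return [1j * wc * cmath.exp(1j * math.pi * (2 * n - 1) / (2 * N))
            for n in range(1, N + 1)]


def butterworth_gain(N, wc, omega):
    """|H(j*omega)| for H(s) = wc^N / prod(s - p_n)."""
    H = wc ** N
    for p in butterworth_poles(N, wc):
        H /= (1j * omega - p)
    return abs(H)


for p in butterworth_poles(2, 1.0):
    print(p)                              # both poles have real part about -0.7071
print(butterworth_gain(2, 1.0, 1.0))      # about 0.7071
```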

Frequency Response of common Low pass Butterworth filter

$∣ H (jω) ∣ = \frac{1}{\sqrt{1 + ( \frac{ω}{ω_{c}} )^{2 N}}}$

Increasing order improves approximation of ideal behaviour

Normalised Frequency Response of common Low pass Butterworth filter

$∣ H (jω) ∣ = \frac{1}{\sqrt{1 + ω^{2 N}}}$

To convert normalised frequency form to non-normalised = multiply $ω$ by the actual $ω_{c}$

Minimum Order for Low Pass Butterworth

$N = \frac{\log ( \frac{10^{- \frac{G_{s}}{10}} - 1}{10^{- \frac{G_{p}}{10}} - 1} )}{2 \log ( \frac{ω_{s}}{ω_{p}} )}$

Round up as want to over-satisfy not under-satisfy

Low pass Butterworth Cut-off frequency $ω_{c}$ (Pass)

$ω_{c} = \frac{ω _{p}}{( 1 0 ^{- \frac{G _{p}}{10}} - 1 ) ^{\frac{1}{2 N}}}$

Gain in dB

Low pass Butterworth Cut-off frequency $ω_{c}$ (Stop)

$ω_{c} = \frac{ω _{s}}{( 1 0 ^{- \frac{G _{s}}{10}} - 1 ) ^{\frac{1}{2 N}}}$

Gain in dB
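
A short worked sketch of the three design formulas above. The specification used here (pass band gain of at least $- 2$ dB up to $ω_{p} = 10$ rad/s, stop band gain of at most $- 20$ dB from $ω_{s} = 20$ rad/s) is a made-up example, and the function names are assumptions.

```python
import math


def min_order(Gp, Gs, wp, ws):
    """Minimum Butterworth order; Gp and Gs are gains in dB (negative numbers)."""
    ratio = (10 ** (-Gs / 10) - 1) / (10 ** (-Gp / 10) - 1)
    return math.ceil(math.log10(ratio) / (2 * math.log10(ws / wp)))  # round up


def cutoff_from_pass(wp, Gp, N):
    """Cut-off that meets the pass band spec exactly."""
    return wp / (10 ** (-Gp / 10) - 1) ** (1 / (2 * N))


def cutoff_from_stop(ws, Gs, N):
    """Cut-off that meets the stop band spec exactly."""
    return ws / (10 ** (-Gs / 10) - 1) ** (1 / (2 * N))


N = min_order(-2, -20, 10, 20)
print(N)                               # 4
print(cutoff_from_pass(10, -2, N))     # about 10.69 rad/s
print(cutoff_from_stop(20, -20, N))    # about 11.26 rad/s
```

Any $ω_{c}$ between the two values satisfies both specifications, because rounding $N$ up leaves some slack.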

6 - Periodic Analogue Functions

Exponential Representation from Trigonometric representation

$e^{j x} = cos x + j sin x$

Trigonometric from exponential - Real (cos)

$\cos x = \operatorname{Re}\{e^{j x}\} = \frac{e^{j x} + e^{- j x}}{2}$

Trigonometric from exponential - Imaginary (sin)

$\sin x = \operatorname{Im}\{e^{j x}\} = \frac{e^{j x} - e^{- j x}}{2 j}$

Fourier Series

$x (t) = \sum_{k = - \infty}^{\infty} X_{k} e^{jk ω_{0} t}$ Periodic signal = sum of complex exponentials.

Fundamental frequency $f_{0}$ , such that all frequencies in signal are multiples of $f_{0}$ .

Fundamental period $T_{0} = 1/ f_{0}$

$ω_{0} = 2 π f_{0} = 2 π / T_{0}$

Fourier spectra only exist at harmonic frequencies (ie integer multiples of fundamental frequency)

Fourier Coefficients

$X_{k} = \frac{1}{T _{0}} \int_{T_{0}} x (t) e^{- jk ω_{0} t} d t$

An important property of the Fourier series is how it represents real signals $x (t)$ .

Even magnitude spectrum $\to ∣ X_{k} ∣ = ∣ X_{- k} ∣$
Odd phase spectrum $\to ∠ X_{k} = - ∠ X_{- k}$

Fourier Series of Periodic Square Wave (Example)

$x (t) = \sum_{k = - \infty}^{\infty} \frac{A τ}{T_{0}} \operatorname{sinc} (k ω_{0} \frac{τ}{2}) e^{jk ω_{0} t}$

Where $X_{k} = \frac{A τ}{T_{0}} \operatorname{sinc} (k ω_{0} \frac{τ}{2})$
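
To see the series in action, the sketch below sums a finite number of harmonics. It assumes the square wave is a train of rectangular pulses of height $A$ and width $τ$ centred on $t = 0$ , and that $\operatorname{sinc} (x) = \frac{\sin x}{x}$ (unnormalised); both are assumptions about the example, as are the function names and the parameter values.

```python
import cmath
import math


def sinc(x):
    """Unnormalised sinc, with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def coeff(k, A, tau, T0):
    """Fourier coefficient X_k of the pulse train."""
    w0 = 2 * math.pi / T0
    return A * tau / T0 * sinc(k * w0 * tau / 2)


def partial_sum(t, A, tau, T0, K):
    """Approximate x(t) using harmonics k = -K..K."""
    w0 = 2 * math.pi / T0
    total = sum(coeff(k, A, tau, T0) * cmath.exp(1j * k * w0 * t)
                for k in range(-K, K + 1))
    return total.real    # imaginary parts cancel for a real signal


print(coeff(0, 1, 1, 4))                 # 0.25, the mean value A*tau/T0
print(partial_sum(0.0, 1, 1, 4, 200))    # close to 1 (middle of the pulse)
print(partial_sum(2.0, 1, 1, 4, 200))    # close to 0 (between pulses)
```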

Output of LTI system from Signal with multiple frequency components

$y (t) = \sum_{k = - \infty}^{\infty} H (jk ω_{0}) X_{k} e^{jk ω_{0} t}$

Or in other words:

$Y_{k} = H (jk ω_{0}) X_{k}$

The output of an LTI system due to a signal with multiple frequency components can be found by superposition of the outputs due to the individual frequency components, i.e. the system changes the amplitude and phase of each frequency in the input.

Filtering Periodic Signal (Example 6.2)

See example 6.2 below...

7 - Computing with Analogue Signals

This topic isn't examined as it is MATLAB

8 - Signal Conversion between Analog and Digital

Digital Signal Processing Workflow

See diagram:

Low pass filter applied to time-domain input signal $x (t)$ to limit frequencies
An analogue-to-digital converter (ADC) samples and quantises the continuous time analogue signal to convert it to discrete time digital signal $x [n]$ .
Digital signal processing (DSP) performs operations required and generates output signal $y [n]$ .
A digital-to-analogue converter (DAC) uses hold operations to reconstruct an analogue signal from $y [n]$
An output low pass filter removes high frequency components introduced by the DAC operation to give the final output $y (t)$ .

Sampling

Convert signal from continuous-time to discrete-time. Record amplitude of the analogue signal at specified times. Usually sampling period is fixed.

Oversample

Sample too often, use more complexity, wasting energy

Undersample

Not sampling often enough, get aliasing of our signal (multiple signals of different frequencies yield the same data when sampled.)

Aliasing

Multiple signals of different frequencies yield the same data when sampled.

If we sample the black sinusoid at the times indicated with the blue marker, it could be mistaken for the red dashed sinusoid. This happens when under-sampling, and the lower signal is called the alias. The alias makes it impossible to recover the original data.

Nyquist Rate

$ω_{s} = 2 ω_{B}$

Minimum anti-aliasing sampling frequency.

Signals sampled at or above this rate ( $ω_{s} \geq 2 ω_{B}$ ) remain distinguishable.
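
A small numerical illustration of aliasing, with made-up frequencies: a 9 Hz cosine sampled at 10 Hz (well below its Nyquist rate of 18 Hz) produces exactly the same samples as a 1 Hz cosine. The function name is an assumption.

```python
import math


def sample_cosine(f, fs, n_samples):
    """Samples of cos(2*pi*f*t) taken at t = n / fs."""
    return [math.cos(2 * math.pi * f * n / fs) for n in range(n_samples)]


fs = 10.0                               # sampling frequency in Hz
slow = sample_cosine(1.0, fs, 20)       # 1 Hz: sampled well above its Nyquist rate
fast = sample_cosine(9.0, fs, 20)       # 9 Hz: under-sampled
# The largest difference between the two sample sets is only rounding error
print(max(abs(a - b) for a, b in zip(slow, fast)))
```

From the samples alone there is no way to tell which of the two signals was the input, which is why the input low pass filter is applied before the ADC.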

Quantisation

The mapping of continuous amplitude levels to a binary representation.

IE: $W$ bits then there are $2^{W}$ quantisation levels. ADC Word length $= W$ .

Continuous amplitude levels are approximated to the nearest level (rounding). Resulting error between nearest level and actual level = quantisation noise
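
A minimal sketch of a uniform quantiser. The input range of $- 1$ to $1$ and the choice to place levels at both ends of the range are assumptions made for this example; real ADCs differ in how the levels are laid out.

```python
def quantise(x, W, vmin=-1.0, vmax=1.0):
    """Round x to the nearest of 2**W uniformly spaced levels in [vmin, vmax]."""
    levels = 2 ** W
    step = (vmax - vmin) / (levels - 1)        # spacing between adjacent levels
    index = round((x - vmin) / step)           # nearest level index
    index = max(0, min(levels - 1, index))     # clip values outside the range
    return vmin + index * step


W = 3                                          # 3 bits -> 8 levels
x = 0.3
xq = quantise(x, W)
print(xq, x - xq)    # quantised value and the quantisation error
```

The error is never larger than half a step for inputs inside the range, so each extra bit roughly halves the worst-case quantisation noise.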

Data Interpolation

Convert digital signal back to analogue domain, reconstruct continuous signal from discrete time series of points.

Hold Circuit

Simplest interpolation in a DAC, where amplitude of continuous-time signal matches that of the previous discrete time signal.

IE: Hold amplitude until the next discrete time value. Produces staircase like output.

Resolution

$\frac{1}{2^{W}} \times 100 \%$

Space between levels, often represented as a percentage.

For $W$ -bit DAC, with uniform levels

Dynamic range

$20 \log_{10} 2^{W} \approx 6 W \text{ dB}$

Range of signal amplitudes that a DAC can resolve between its smallest and largest (undistorted) values.
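
Both figures are easy to evaluate for a given word length. The sketch below uses an 8-bit converter as an example; the function names are assumptions.

```python
import math


def resolution_percent(W):
    """Spacing between levels as a percentage of full scale."""
    return 100 / 2 ** W


def dynamic_range_db(W):
    """Dynamic range in dB for a W-bit converter."""
    return 20 * math.log10(2 ** W)


print(resolution_percent(8))    # 0.390625
print(dynamic_range_db(8))      # about 48.16, close to the 6W = 48 dB rule of thumb
```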

9 - Z-Transforms and LSI Systems

LSI Rules

Linear Shift-Invariant

Common Components of LSI Systems

For digital systems, only need 3 types of LSI circuit components.

A multiplier scales the current input by a constant, i.e., $y [n] = b [1] x [n]$ .
An adder outputs the sum of two or more inputs, e.g., $y [n] = x_{1} [n] + x_{2} [n]$ .
A unit delay imposes a delay of one sample on the input, i.e, $y [n] = x [n - 1]$ .

Discrete Time Impulse Function

Impulse response is very similar in digital domain, as it is the system output when the input is an impulse.

Impulse Response Sequence

$h [n] = F \{δ [n]\}$

LSI Output

$y [n] = \sum_{k = - \infty}^{\infty} x [k] h [n - k] = x [n] * h [n] = h [n] * x [n]$

Discrete Convolution of input signal with the impulse response.
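
For finite-length causal sequences the infinite sum reduces to a double loop. A minimal sketch (the function name and example sequences are assumptions):

```python
def convolve(x, h):
    """Discrete convolution of two finite sequences that both start at n = 0."""
    y = [0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm      # x[k] * h[n - k] contributes to y[n], n = k + m
    return y


print(convolve([1, 2, 3], [1, 1]))   # [1, 3, 5, 3]
print(convolve([1, 1], [1, 2, 3]))   # same result, since convolution commutes
```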

Z-Transform

$Z \{f [n]\} = F (z) = \sum_{k = 0}^{\infty} f [k] z^{- k}$

Converts discrete-time domain function $f [n]$ into complex domain function $F (z)$ , in the z-domain. Assume $f [n]$ is causal, i.e. $f [n] = 0, \forall n < 0$

Discrete time equivalent to Laplace Transform. However, it can be written by direct inspection (as it has a summation instead of an integral). The inverse is equally simple.

Z-Transform Examples

Simple examples...

Binomial Theorem for Inverse Z-Transform

$\sum_{n = 0}^{\infty} a^{n} = \frac{1}{1 - a}$ , valid for $∣ a ∣ < 1$

Cannot always find inverse Z-transform by immediate inspection, in particular if the Z-transform is written as a ratio of polynomials of $z$ . Can use Binomial theorem to convert into single (sometimes infinite length) polynomial of $z$
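
A quick numerical check of both ideas, using made-up sequences: a finite sequence transforms by inspection, and the sequence $f [n] = a^{n}$ sums to $\frac{1}{1 - a z^{- 1}}$ by the series above (with $a z^{- 1}$ in place of $a$ ). The function name is an assumption.

```python
def z_transform(f, z):
    """Evaluate F(z) = sum of f[k] * z**(-k) for a finite causal sequence f."""
    return sum(fk * z ** (-k) for k, fk in enumerate(f))


# By inspection: f = [1, 2, 3]  ->  F(z) = 1 + 2 z^-1 + 3 z^-2
print(z_transform([1, 2, 3], 2.0))          # 1 + 1 + 0.75 = 2.75

# Geometric sequence f[n] = a^n, truncated after 60 terms
a, z = 0.5, 2.0
truncated = z_transform([a ** n for n in range(60)], z)
closed_form = 1 / (1 - a / z)
print(truncated, closed_form)               # both about 1.3333
```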

Z-Transform Properties

Linearity, Time Shifting and Convolution

Sample Pairs

See example

Z-Transform of Output Signal

$Y (z) = Z \{y [n]\} = Z \{x [n] * h [n]\} = Z \{x [n]\} Z \{h [n]\} = X (z) H (z) \Rightarrow Y (z) = X (z) H (z)$

Where $H (z)$ = Pulse Transfer Function (as it is also the system output when the time-domain input is a unit impulse.) but by convention can refer to $H (z)$ as the Transfer Function

Finding time-domain output $y [n]$ of an LSI System

Transform, product, inverse.

Transform $x [n]$ and $h [n]$ into z-domain
Find product $Y (z) = X (z) H (z)$
Taking the inverse Z-transform of $Y (z)$

Difference Equation

Time domain output $y [n]$ directly as a function of time-domain input $x [n]$ as well as previous time-domain outputs $y [n - k]$ (i.e. there can be feedback).

Z-Transform Table

See table...

10 - Stability of Digital Systems

Z-Domain Transfer Function

$H (z) = \frac{b [ M ] z ^{- M} + b [ M - 1 ] z ^{1 - M} + \dots + b [ 1 ] z ^{- 1} + b [ 0 ]}{a [ N ] z ^{- N} + a [ N - 1 ] z ^{1 - N} + \dots + a [ 1 ] z ^{- 1} + 1}$

Negative powers of z.

No constraint on $M$ and $N$ to be real (unlike analogue) but often assume $M = N$

General Difference Equation

$y [n] = \sum_{k = 0}^{M} b [k] x [n - k] - \sum_{k = 1}^{N} a [k] y [n - k]$

Poles and Zeros of Transfer Function

$H (z) = K \frac{( z - z_{1} ) ( z - z_{2} ) \dots ( z - z_{M} )}{( z - p_{1} ) ( z - p_{2} ) \dots ( z - p_{N} )} = K \frac{\prod_{i = 1}^{M} ( z - z_{i} )}{\prod_{i = 1}^{N} ( z - p_{i} )}$

Coefficient of each $z$ in this form is 1.
Poles $p_{i}$ and zeros $z_{i}$ carry same meaning as analogue
Unfortunately symbol for variable $z$ and zeros $z_{i}$ are very similar (take care)
Insightful to plot

Bounded Input and Bounded Output (BIBO) Stability

Stable if bounded input sequence yields bounded output sequence.

A system is BIBO stable if all of the poles lie inside the $∣ z ∣ = 1$ unit circle

A system is conditionally stable if there is at least 1 pole directly on the unit circle.

Explanation:

An input sequence $x [n]$ is bounded if the magnitude of each element in the sequence is smaller than some finite value $A$ .
An output sequence $y [n]$ corresponding to $x [n]$ is bounded if the magnitude of each element in the sequence is smaller than some finite value $B$ .
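
The effect of the pole position can be seen by simulation. The sketch below uses a hypothetical first-order system $y [n] = x [n] + p y [n - 1]$ , whose transfer function $\frac{z}{z - p}$ has a single pole at $z = p$ , and drives it with a bounded unit step. The function name is an assumption.

```python
def first_order_response(p, x):
    """Simulate y[n] = x[n] + p * y[n-1], starting from rest."""
    y = []
    previous = 0.0
    for sample in x:
        previous = sample + p * previous
        y.append(previous)
    return y


step = [1.0] * 100                                # bounded input, |x[n]| <= 1
inside = first_order_response(0.5, step)          # pole inside the unit circle
outside = first_order_response(1.1, step)         # pole outside the unit circle

print(max(abs(v) for v in inside))    # stays below 2: bounded output
print(outside[-1])                    # keeps growing: unbounded output
```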

11 - Digital Frequency Response

LSI Frequency Response

Output in response to a sinusoid input of unit magnitude and some specified frequency. Shown in two plots (magnitude and phase) as a function of input frequency.

Discrete-Time Fourier Transform (DTFT) - Digital Frequency Response

$F (e^{j Ω}) = F (Ω) = \sum_{k = 0}^{\infty} f [k] e^{- jk Ω}$

Where angle $Ω$ is the angle of the unit vector measured from the positive real $z$ -axis. It denotes digital radial frequency, measured in radians per sample ( $\frac{\text{rad}}{\text{sample}}$ ).

$F (e^{j Ω})$ is the spectrum of $f [n]$ (frequency response).

By convention the DTFT is written as $F (e^{j Ω})$ or simply $F (Ω)$

Derivation: Using Z-Transform Definition. $Z \{f [n]\} = F (z) = \sum_{k = 0}^{\infty} f [k] z^{- k}$

Let $z$ be written in polar coordinates ( $z = r e^{j Ω}$ ), i.e. magnitude $r$ and angle $Ω$ . Hence rewrite $F (r e^{j Ω}) = \sum_{k = 0}^{\infty} f [k] r^{- k} e^{- jk Ω}$

Then let $r = 1$ , so that any point lies on the unit circle. $F (e^{j Ω}) = F (Ω) = \sum_{k = 0}^{\infty} f [k] e^{- jk Ω}$

Inverse Discrete-Time Fourier Transform (Inverse DTFT)

$f [k] = \frac{1}{2 π} \int_{- π}^{π} F (e^{j Ω}) e^{jk Ω} d Ω$

LSI Transfer Function

$H (e^{j Ω}) = K \frac{\prod _{i = 1}^{M} e ^{j Ω} - z _{i}}{\prod _{i = 1}^{N} e ^{j Ω} - p _{i}}$

$H (e^{j Ω})$ is a function of vectors from the system's poles and zeros to the unit circle at angle $Ω$ .Thus from pole-zero plot, can geometrically determine magnitude and phase of frequency response.

Magnitude of Frequency Response (MFR) $∣ H (e^{j Ω}) ∣$

$H (e^{j Ω}) = ∣ K ∣ \frac{\prod _{i = 1}^{M} e ^{j Ω} - z _{i}}{\prod _{i = 1}^{N} ∣ e ^{j Ω} - p _{i} ∣}$

In words, the magnitude of the frequency response (MFR) $∣ H (e^{j Ω}) ∣$ is equal to the gain multiplied by the magnitudes of the vectors corresponding to the zeros, divided by the magnitudes of the vectors corresponding to the poles.

Repeats every $2 π$ (by Euler's formula, $e^{j Ω}$ is periodic in $Ω$ ). Due to the symmetry of poles and zeros about the real $z$ -axis, the magnitude response is symmetric about $Ω = π$ , so it only needs to be found over one interval of length $π$ .

Phase Angle of Frequency Response (PAFR) $∠ H (e^{j Ω})$ - $K > 0$

$∠ H (e^{j Ω}) = \sum_{i = 1}^{M} ∠ (e^{j Ω} - z_{i}) - \sum_{i = 1}^{N} ∠ (e^{j Ω} - p_{i})$

In words, the phase angle of the frequency response (PAFR) $∠ H (e^{j Ω})$ is equal to the sum of the phases of the vectors corresponding to the zeros, minus the sum of the phases of the vectors corresponding to the poles, plus $π$ if the gain is negative.
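
The geometric rules for magnitude and phase can be checked numerically. The sketch below is only an illustration: the function name `freq_response` and the example pole and zero positions are assumptions, not taken from the module.

```python
import cmath
import math

def freq_response(zeros, poles, gain, omega):
    """Evaluate |H(e^{jΩ})| and ∠H(e^{jΩ}) from pole-zero vectors."""
    point = cmath.exp(1j * omega)          # point on the unit circle at angle Ω
    magnitude = abs(gain)
    phase = math.pi if gain < 0 else 0.0   # a negative gain adds π to the phase
    for z in zeros:
        vec = point - z                    # vector from the zero to the unit circle
        magnitude *= abs(vec)
        phase += cmath.phase(vec)
    for p in poles:
        vec = point - p                    # vector from the pole to the unit circle
        magnitude /= abs(vec)
        phase -= cmath.phase(vec)
    return magnitude, phase

# Hypothetical high pass example: zero at z = 1, pole at z = -0.5, K = 1
for omega in (0.0, math.pi / 2, math.pi):
    mag, ph = freq_response([1.0], [-0.5], 1.0, omega)
    print(f"Omega = {omega:.3f}: |H| = {mag:.3f}, angle = {ph:.3f} rad")
```

The magnitude is zero at $Ω = 0$ (the point on the unit circle sits on the zero) and largest at $Ω = π$ (closest to the pole), which is the expected high pass shape.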

Example 11.1 - Simple Digital High Pass Filter

See image...

12 - Filter Difference equations and Impulse responses

Z-Domain Transfer Function

$H (z) = \frac{b [ M ] z ^{- M} + b [ M - 1 ] z ^{1 - M} + \dots + b [ 1 ] z ^{- 1} + b [ 0 ]}{a [ N ] z ^{- N} + a [ N - 1 ] z ^{1 - N} + \dots + a [ 1 ] z ^{- 1} + 1}$

General Difference Equation

$y [n] = \sum_{k = 0}^{M} b [k] x [n - k] - \sum_{k = 1}^{N} a [k] y [n - k]$

The real coefficients $b [\cdot]$ and $a [\cdot]$ are the same as in the transfer function. (Note $a [0] = 1$ , so there is no coefficient corresponding to $y [n]$ and the feedback sum starts at $k = 1$ .)

It is easier to convert directly between the transfer function $H (z)$ (with negative powers of $z$ ) and the difference equation for the output $y [n]$ than to go via the time-domain impulse response $h [n]$ . The difference equation is ideal for implementing the system.

Example 12.1 Proof y[n] can be obtained directly from H(z)

See image...

Order of a filter

$\text{order} = \max (N, M)$

Taps in a filter

Minimum number of unit delay blocks required. Equal to the order of the filter.

Example 12.2 Filter Order and Taps

See example...

Tabular Method for Difference Equations

Given a difference equation, and its input x[n], can write specific output y[n] using tabular method.

Starting with input $x [n]$ , make a column for every input and output that appears in the difference equation
Assume every output and delayed input is initially zero (ie the filter is causal and initially has no memory, hence the system is quiescent)
Fill in the column for $x [n]$ with the given system input for all rows needed, and fill in the delayed versions of $x [n]$
Evaluate $y [0]$ from the initial input, and propagate the value of $y [0]$ to the delayed outputs (as relevant)
Evaluate $y [1]$ from the $x [\cdot]$ s and $y [0]$
Continue evaluating the output and propagating delayed outputs.

This can be an alternative method for finding the time-domain impulse response $h [n]$ (apply a unit impulse as the input).
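
The tabular method is just the difference equation evaluated one row at a time. A minimal sketch, assuming the feedback coefficients are indexed from $a [0] = 1$ and the filter is quiescent; the function name `run_filter` and the example filter are hypothetical.

```python
def run_filter(b, a, x):
    """Evaluate y[n] = sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k] row by row.

    b[0..M] are the feedforward coefficients and a[0..N] the feedback
    coefficients with a[0] = 1. Assumes x[n] = y[n] = 0 for n < 0.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:                 # delayed inputs before n = 0 are zero
                acc += bk * x[n - k]
        for k in range(1, len(a)):
            if n - k >= 0:                 # delayed outputs before n = 0 are zero
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

# Hypothetical first-order IIR filter: y[n] = x[n] + 0.5 y[n-1]
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
print(run_filter([1.0], [1.0, -0.5], impulse))
```

Feeding in a unit impulse gives the first samples of $h [n]$ , here halving at every step.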

Example 12.3 Tabular Method Example

See example

Infinite Impulse Response (IIR) Filters

IIR filters have infinite length impulse responses because they are recursive (ie there are feedback terms associated with non-zero poles in the transfer function, so terms in $y [n - k]$ appear in the difference equation).

The standard transfer function and difference equation can be used to represent them.

$H (z) = \frac{b [M] z^{- M} + b [M - 1] z^{1 - M} + \dots + b [1] z^{- 1} + b [0]}{a [N] z^{- N} + a [N - 1] z^{1 - N} + \dots + a [1] z^{- 1} + 1}$

$y [n] = \sum_{k = 0}^{M} b [k] x [n - k] - \sum_{k = 1}^{N} a [k] y [n - k]$

Not possible to have a linear phase response (so there are different delays associated with different frequencies), and they are not always stable (depending on the exact locations of the poles).

IIR filters are more efficient than FIR designs at controlling gain of response.

Although response is technically infinite, in practice decays towards zero or can be truncated to zero (assume response is $h [n] = 0$ beyond some value $n$ )

Example 12.4 IIR Filter

See example

Finite Impulse Response (FIR) Filters

FIR filters are non-recursive (ie no feedback components), so $a [k] = 0$ for $k \neq 0$ .

Finite in length, and strictly zero beyond that ( $h [n] = 0$ for $n > M$ ). Therefore the number of filter taps dictates the length of an FIR impulse response

Since there is no feedback, can write impulse response $h [n]$ as: $h [n] = b [n]$

FIR Difference Equation

$y [n] = \sum_{k = 0}^{M} b [k] x [n - k]$

FIR Transfer function

$H (z) = b [M] z^{- M} + b [M - 1] z^{1 - M} + \dots + b [1] z^{- 1} + b [0]$

Simplified from the transfer function of the general difference equation.

FIR Transfer Function - Roots

$H (z) = \frac{b [M] + b [M - 1] z + \dots + b [1] z^{M - 1} + b [0] z^{M}}{z^{M}} = K \frac{\prod_{k = 1}^{M} (z - z_{k})}{z^{M}}$

More convenient to work with positive powers of z, so multiply top and bottom by $z^{M}$ then factor.

FIR Stability

FIR FILTERS ARE ALWAYS STABLE. As the transfer function shows, all $M$ poles are at the origin ( $z = 0$ ) and so are always inside the unit circle.

FIR Linear Phase Response

Often have a linear phase response. The phase shift at the output corresponds to a time delay.

FIR Filter Example

See example 12.5

Ideal Digital Filters

Four main types of filter magnitude responses (defined over $0 \leq Ω \leq π$ , mirrored over $π \leq Ω < 2 π$ and repeated every $2 π$ )

Low Pass - pass frequencies less than cut-off frequency $Ω_{c}$ and reject frequencies greater.
High Pass - rejects frequencies less than cut-off frequency $Ω_{c}$ and pass frequencies greater.
Band Pass - Passes frequency within specified range, ie between $Ω_{1}$ and $Ω_{2}$ , and reject frequencies that are either below or above the band within $[0, π]$
Band Stop - Rejects frequency within specified range, ie between $Ω_{1}$ and $Ω_{2}$ , and passes all other frequencies within $[0, π]$

The ideal responses appear to be fundamentally different from the ideal analogue ones; however, we only care about the fundamental band $[- π, π)$ , where the behaviour is identical.

Realising Ideal Digital Filters

Use poles and zeros to create simple filters. Only need to consider response over the fixed $[- π, π)$ band.

Key Concepts:

To be physically realisable, complex poles and zeros need to be in conjugate pairs
Can place zeros anywhere, so will often place directly on unit circle when frequency / range of frequency needs to be attenuated
Poles on the unit circle should generally be avoided (conditionally stable). Can try to keep all poles at origin so can be FIR, otherwise IIR, so feedback. Poles used to amplify response in the neighbourhood of some frequency.
Low Pass - zeros at or near $Ω = π$ , poles near $Ω = 0$ which can amplify maximum gain, or be used at a higher frequency to increase size of pass band.
High Pass - literally inverse of low pass. Zeros at or near $Ω = 0$ , poles near $Ω = π$ which can amplify maximum gain, or be used at a lower frequency to increase size of pass band.
Band Pass - Place zeros at or near both $Ω = 0$ and $Ω = π$ ; so must be at least second order. Place a pole if needed to amplify the signal in the neighbourhood of the pass band.
Band Stop - Place zeros at or near the stop band. Zeros must be complex so such a filter must be at least second order. Place poles at or near both $Ω = 0$ and $Ω = π$ if needed.

Example 12.6 - Simple High Pass Filter Design

See diagram

13 - FIR Digital Filter Design

Discrete Time Radial Frequency

$Ω = \frac{ω}{f _{s}} = \frac{2 π f}{f _{s}}$

As long as $Ω \leq π$ - otherwise there will be an alias at a lower frequency. So $2 f \leq f_{s}$ to avoid aliasing.

Realising Ideal Digital Filter

Aim is to get as close as possible to ideal behaviour. But when using the inverse DTFT, the ideal low pass impulse response is $h_{i} [n] = \frac{Ω_{c}}{π} \text{sinc} (n Ω_{c})$ .

This is analogous to the inverse Fourier transform of the ideal analogue low pass frequency response.

This is a sampled, scaled sinc function with non-zero values for $n < 0$ . The filter would need to respond to an input before the input is applied, so it is unrealisable.

Practical Digital Filters

Good digital low pass filter will try to realise the (unrealisable) ideal response. Will try to do this with FIR filters (always stable, tend to have greater flexibility to implement different frequency responses).

Need to induce a delay to capture most of the ideal signal energy in causal time, ie: use $h_{i} [n - k] u [n]$
Truncate response to delay tolerance $k_{d}$ , such that $h [n] = 0$ for $n > k_{d}$ . Also limits complexity of filter: shorter = smaller order
Window response, scales each sample, attempt to mitigate negative effects of truncation
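
The three steps (delay, truncate, window) can be sketched as below. This is an illustration only: the function name `windowed_sinc_lowpass`, the choice of a Hamming window and the example cut-off are assumptions, not the module's specified design.

```python
import math

def windowed_sinc_lowpass(cutoff, length):
    """Delay, truncate and window the ideal low pass response.

    cutoff is the cut-off frequency in rad/sample; length is the number of
    taps (odd here so the delay (length - 1) / 2 is a whole number of samples).
    """
    delay = (length - 1) // 2
    h = []
    for n in range(length):
        m = n - delay                      # shift the ideal response into causal time
        if m == 0:
            ideal = cutoff / math.pi       # limit of sin(m * cutoff) / (m * pi) at m = 0
        else:
            ideal = math.sin(cutoff * m) / (math.pi * m)
        # Hamming window (one common choice, assumed here) scales each sample
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        h.append(ideal * window)
    return h

taps = windowed_sinc_lowpass(math.pi / 4, 21)
dc_gain = sum(taps)                        # H(e^{j0}) is the sum of the taps
print(f"{len(taps)} taps, DC gain = {dc_gain:.4f}")
```

Because the result is an FIR filter, the taps are the impulse response $h [n]$ directly, and the symmetry about the centre tap is what gives the linear phase response.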

Windowing

Window Method - design process: start with the ideal $h_{i} [n]$ and window the infinite time-domain response to obtain a realisable $h [n]$ that can be implemented.

Windowing Criteria

Main Lobe Width - Width in frequency of the main lobe.
Roll-off rate - how sharply main lobe decreases, measured in dB/dec (dB per decade).
Peak side lobe level - Peak magnitude of the largest side lobe relative to the main lobe, measured in dB.
Pass Band ripple - The amount the gain over the pass band can vary about unity, between $1 - δ_{p} /2$ and $1 + δ_{p} /2$
Pass Band Ripple Parameter, dB - $r_{p} = 20 \log_{10} (\frac{1 + δ_{p} /2}{1 - δ_{p} /2})$
Stop Band ripple - Gain over the stop band, must be less than the stop band ripple $δ_{s}$
Transition Band - $Ω_{Δ} = Ω_{s} - Ω_{p}$

Practical FIR Filter Design Example 13.2

See example...

Specification for FIR Filters Example 13.3

See example...

14 - Discrete Fourier Transform and FFT

Discrete Fourier Transform DFT

$X [k] = \sum_{n = 0}^{N - 1} x [n] e^{- jnk \frac{2 π}{N}}$

For $k = \{0, 1, 2, \dots, N - 1\}$

This is the forward discrete Fourier transform. (It is not the discrete-time Fourier transform, but samples of it over the interval $[0, 2 π)$ .)

Explanation:

The discrete-time Fourier transform (DTFT) takes a discrete-time signal and provides a continuous spectrum that repeats every $2 π$ . It is defined for an infinite length sequence $x [n]$ and gives a continuous spectrum with values at all frequencies.

$X (e^{j Ω}) = \sum_{n = 0}^{\infty} x [n] e^{- jn Ω}$

Digital systems often have finite length sequences. (Also, the inverse DTFT uses integration, so it has to be approximated.) So assume the sequence $x [n]$ has length $N$ .

$X (e^{j Ω}) = \sum_{n = 0}^{N - 1} x [n] e^{- jn Ω}$

Sample the spectrum $X (e^{j Ω})$ . It repeats every $2 π$ , so can sample over $[0, 2 π)$ .

Take the same number of samples in the frequency domain as the length of the time domain signal, so $N$ evenly spaced samples of $X (e^{j Ω})$ (aka bins).

These occur at multiples of the fundamental frequency $Ω_{0} = 2 π / N$ : $Ω = \{0, \frac{2 π}{N}, \frac{4 π}{N}, \dots, \frac{2 π (N - 1)}{N}\}$

Substitute into the DTFT.

$X [k] = \sum_{n = 0}^{N - 1} x [n] e^{- jnk \frac{2 π}{N}}$

For $k = \{0, 1, 2, \dots, N - 1\}$

$f_{k} = k \frac{f _{s}}{N}, 0 \leq k \leq N - 1$

Inverse DFT

$x [n] = \frac{1}{N} \sum_{k = 0}^{N - 1} X [k] e^{jnk \frac{2 π}{N}}, n = \{0, 1, 2, \dots, N - 1\}$
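
A direct $O (N^{2})$ evaluation of the forward and inverse DFT makes the two formulas concrete. This is a minimal sketch; the function names `dft` and `idft` and the test signal are assumptions.

```python
import cmath

def dft(x):
    """Forward DFT: X[k] = sum_n x[n] e^{-j n k 2π/N}."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse DFT: x[n] = (1/N) sum_k X[k] e^{+j n k 2π/N}."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 0.0, -1.0, 0.0]                  # one cycle of a cosine, sampled 4 times
X = dft(x)
print([round(abs(v), 6) for v in X])       # magnitude in each frequency bin
print([round(v.real, 6) for v in idft(X)]) # round trip back to the time domain
```

For this real sinusoid the energy appears in bin $k = 1$ and its mirror $k = N - 1$ , and the inverse transform recovers the original samples.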

Example 14.1 DFT of Sinusoid

See example

Zero Padding

Artificially increase the length of the time domain signal $x [n]$ by adding zeros to the end, to see more detail in the DTFT. The DFT provides a sampled view of the DTFT, so we only see the DTFT at $N$ frequencies; padding increases $N$ .

Example 14.2 Effect of Zero Padding

See example

Fast Fourier Transform FFT

Family of algorithms that evaluate the DFT with complexity of $O (N \log_{2} N)$ compared to $O (N^{2})$ . Achieved with no approximations.

Details are beyond the module, but it can be used in MATLAB with the fft function.
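
For reference only, one member of the FFT family is the recursive radix-2 decimation-in-time algorithm. The sketch below assumes the input length is a power of two and is not how MATLAB's fft is implemented.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2])                    # DFT of the even-indexed samples
    odd = fft(x[1::2])                     # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

print([round(abs(v), 6) for v in fft([1.0, 0.0, -1.0, 0.0])])
```

Each level of recursion does $O (N)$ work and there are $\log_{2} N$ levels, which is where the $O (N \log_{2} N)$ complexity comes from.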

15 - Computing Digital Signals

This topic isn't examined as it is MATLAB

16 - Digital vs Analogue Recap

Aperiodic (simple periodic) continuous-time signal f(t)

Laplace, fourier transform.

Aperiodic (simple periodic) continuous-time signal $f (t)$
Convert to the Laplace domain (s-domain) via the Laplace transform
Which for $s = jω$ is the (continuous) Fourier transform.
Fourier transform of the signal is its frequency response $F (jω)$ , generally defined for all $ω$
The Laplace and Fourier transforms have corresponding inverse transforms, which convert $F (s)$ or $F (jω)$ back to $f (t)$

More Complex Continuous-time signal f(t)

Fourier series, multiples of fundamental, samples of frequency response.

For a more complex periodic continuous-time signal f (t)
Fourier series representation decomposes the signal into its frequency components $F_{k}$ at multiples of the fundamental frequency $ω_{0}$ .
Can be interpreted as samples of the frequency response $F (jω)$ ,
which then corresponds to periodicity of $f (t)$ over time.
The coefficients $F_{k}$ are found over one period of $f (t)$ .

Discrete-time signal f[n] (infinite length)

Z-Domain, Discrete-time fourier transform

Discrete-time signal $f [n]$ , we can convert to the z-domain via the Z-transform,
Which for $z = e^{j Ω}$ is the discrete-time Fourier transform DTFT.
The discrete-time Fourier transform of the signal is its frequency response $F (e^{j Ω})$
Repeats every $2 π$ (i.e., sampling in time corresponds to periodicity in frequency).
There are corresponding inverse transforms to convert $F (z)$ or $F (e^{j Ω})$ back to $f [n]$

Discrete-time signal f[n] (finite length)

Finite Length N, convert to frequency domain (DFT), N points distributed over 2 pi (periodic)

For discrete-time signal $f [n]$ with finite length (or truncated to) N,
Can convert to the frequency domain using the discrete Fourier transform,
which is also discrete in frequency.
The discrete Fourier transform also has N points distributed over $2 π$ and is otherwise periodic.
Here, we see sampling in both frequency and time, corresponding to periodicity in the other domain (that we usually ignore in analysis and design because we define both the time domain signal $f [n]$ and frequency domain signal $F [k]$ over one period of length N).

Stability

S-domain: poles have negative real components. Z-domain: poles lie within the unit circle.

Bi-Linearity

Not core module content.

17 - Probabilities and random signals

Random Variable

A quantity that takes non-deterministic values (ie we don't know what the value will be in advance).

Probability Distribution

Defines the probability that a random variable will take some value.

Probability Density Function (PDF) - Continuous random variables

$\int_{x = x_{\min}}^{x_{\max}} p (x) d x = 1$

For a random variable $X$ that takes values between $x_{\min}$ and $x_{\max}$ (could be $\pm \infty$ ), $p (x)$ is the probability density at $X = x$ .

The integration of these probabilities is equal to 1.

Can take integral over subset to calculate the probability of X being within that subset.

Probability mass function (PMF) - Discrete random variables

$\sum_{x = x_{\min}}^{x_{\max}} p (x) = 1$

For a random variable $X$ that takes values between $x_{\min}$ and $x_{\max}$ (could be $\pm \infty$ ), $p (x)$ is the probability that $X = x$ .

The sum of these probabilities is equal to 1.

Can take summation over subset to calculate the probability of X being within that subset.

Moments

$E [X^{n}] = \sum_{x = x_{\min}}^{x_{\max}} x^{n} p (x)$

$E [X^{n}] = \int_{x = x_{\min}}^{x_{\max}} x^{n} p (x) d x$

For a PMF and a PDF respectively.

$n = 1$ , called the mean $μ_{x}$ - Expected (average) value
$n = 2$ , called the mean-squared value, describes spread of random variable.

The second order moment is often given as the variance $σ_{X}^{2}$ , which is the mean-squared value with a correction for the mean: $σ_{X}^{2} = E [(X - μ_{x})^{2}] = E [X^{2}] - (E [X])^{2}$

Standard deviation $σ_{X} = \sqrt{σ_{X}^{2}}$
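
As a quick worked example of the moment definitions, the sketch below computes the mean, mean-squared value, variance and standard deviation of a fair six-sided die. The function name `moment` and the dictionary representation of the PMF are assumptions for illustration.

```python
def moment(pmf, n):
    """n-th moment E[X^n] of a discrete random variable given as {value: probability}."""
    return sum((x ** n) * p for x, p in pmf.items())

die = {face: 1 / 6 for face in range(1, 7)}   # fair six-sided die
mean = moment(die, 1)                          # first moment
mean_square = moment(die, 2)                   # second moment
variance = mean_square - mean ** 2             # E[X^2] - (E[X])^2
std_dev = variance ** 0.5
print(mean, mean_square, variance, std_dev)
```

The mean comes out as 3.5, which is also the average of the minimum and maximum values, as expected for a uniform distribution.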

Uniform Distribution

Equal probability for a random variable to take any value in its domain, ie over $x_{\min} \leq x \leq x_{\max}$ .

PDF continuous version: $p (x) = \frac{1}{x_{\max} - x_{\min}}$ for $x_{\min} \leq x \leq x_{\max}$ , and 0 otherwise.

Discrete uniform distributions: result of dice roll, coin toss etc. Average is the average of min and max.

Bernoulli

Discrete probability distribution with only 2 possible values (yes no, 1 0, etc). Values have different probabilities, in general $p (1) = 1 - p (0)$ .

Mean: $μ_{x} = p (1)$ , Variance: $σ_{X}^{2} = p (1) p (0)$

Gaussian (Normal) Distribution

Continuous probability distribution over $(- \infty, \infty)$ , where values closer to mean are more likely.

Arguably the most important continuous distribution, as it appears everywhere.

PDF is

$p (x) = \frac{1}{\sqrt{2 π σ_{X}^{2}}} \exp (- \frac{(x - μ_{x})^{2}}{2 σ_{X}^{2}})$

Central Limit Theorem (CLT)

Sum of independent random variables can be approximated with a Gaussian distribution. The approximation improves as more random variables are included in the sum. True for any probability distribution with finite mean and variance.

Independent Random Variables

No dependency on each other (i.e., if knowing the value of one random variable gives you no information to be able to better guess another random variable, then those random variables are independent of each other).

Empirical Distributions

Scaled histogram by total number of samples.

To observe behaviour that would match a PDF or PMF, require infinite number of samples. In practice can make histogram.

Random Signals

Random variables can appear in signals in different ways, eg:

Thermal noise - in all electronics, from the thermal agitation of electrons. Often modelled by adding a Gaussian random variable to the signal
Signal processing techniques introduce noise - aliasing, quantisation, non-ideal filters.
Random variables can be used to store information, e.g., data can be encoded into bits and delivered across a communication channel. A receiver does not know the information in advance and can treat each bit as a Bernoulli random variable that it needs to estimate.
Signals can be drastically transformed by the world (wireless signals obstructed by buildings trees etc) - Analogue signals passing through unknown system $h (t)$ , which can vary with time etc

18 - Signal estimation

Signal Estimation

Signal estimation, refers to estimating the values of parameters embedded in a signal. Signals have noise, so can't just calculate parameters.

Linear Model

See equation

$y = Θ ϕ + w$

Can include polynomial terms; linearity means $y [n]$ must be a linear function of the unknown parameters.

EG: $y [n] = A + B n + w [n]$

A,B are unknown parameters
$w [n]$ refers to noise - assume Gaussian random variables with mean $μ_{w} = 0$ and variance $σ_{w}^{2}$ - also assume white noise.

Write as column vector for each n.

Create observation matrix $Θ$ .

Since there are 2 parameters, $N \times 2$ matrix.

Can therefore be written as

$y = Θ ϕ + w$

With Optimal estimate $\hat{ϕ}$ : $\hat{ϕ} = (Θ^{T} Θ)^{- 1} Θ^{T} y$

Can write prediction: $\hat{y} = Θ \hat{ϕ}$

Calculate MSE
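
For the two-parameter example $y [n] = A + B n + w [n]$ the estimator can be written out by hand, since $Θ^{T} Θ$ is only $2 \times 2$ . A minimal sketch, assuming $n$ runs from 0 to $N - 1$ with $N \geq 2$ ; the function names and the observation values are made up for illustration.

```python
def fit_line(y):
    """Least-squares estimate of [A, B] for y[n] = A + B n + w[n], n = 0..N-1.

    Builds the entries of Θ^T Θ and Θ^T y for the N x 2 observation matrix
    whose rows are [1, n], then solves the 2 x 2 normal equations directly.
    """
    N = len(y)
    s1 = float(N)                                 # sum of 1 * 1
    sn = float(sum(range(N)))                     # sum of 1 * n
    snn = float(sum(n * n for n in range(N)))     # sum of n * n
    sy = sum(y)                                   # first row of Θ^T y
    sny = sum(n * yn for n, yn in enumerate(y))   # second row of Θ^T y
    det = s1 * snn - sn * sn
    A = (snn * sy - sn * sny) / det
    B = (s1 * sny - sn * sy) / det
    return A, B

def mse(y, A, B):
    """Mean square error between the prediction A + B n and the observations."""
    return sum((A + B * n - yn) ** 2 for n, yn in enumerate(y)) / len(y)

# Hypothetical observations of y[n] = 1 + 2n with small made-up noise values
y = [1.1, 2.9, 5.2, 6.8, 9.0]
A_hat, B_hat = fit_line(y)
print(A_hat, B_hat, mse(y, A_hat, B_hat))
```

With more parameters the same normal equations apply, but a matrix library would normally be used rather than inverting by hand.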

Generalised Linear Form

See equation

$y = Θ ϕ + s + w$

Where $s$ is a vector of known samples. Convenient when our signal is contaminated by some large interference with known characteristics.

Account for this in the estimator by subtracting $s$ from both sides.

$\hat{ϕ} = (Θ^{T} Θ)^{- 1} Θ^{T} (y - s)$

Optimal estimate

See equation

$\hat{ϕ} = (Θ^{T} Θ)^{- 1} Θ^{T} y$

Predicted estimate

See equation

$\hat{y} = Θ \hat{ϕ}$

Without noise.

Observation Matrix $Θ$

See below

$N \times P$ matrix where $P$ is the number of parameters, and $N$ is the number of time steps.

Each column is the coefficients of the corresponding parameter at the given timestamp (per row).

Mean Square Error (MSE)

See equation

$MSE (\hat{y}) = \frac{1}{N} \sum_{n = 0}^{N - 1} (\hat{y} [n] - y [n])^{2}$

Example 18.1

See example

Example 18.2

See example

Linear Regression

theta = Obs \ y;

AKA Ordinary least squares (OLS).

Form of observation matrix had to be assumed but may be unknown. If so can try different ones and find simplest that has best MSE.

Weighted Least Squares Estimate

Weighted least squares includes a weight matrix $W$ , where each sample is associated with a positive weight.

Places more emphasis on more reliable samples.

Good choice of weight $W [n]$ : $W [n] = \frac{1}{σ _{n}^{2}}$

Therefore resulting in:

$\hat{ϕ} = (Θ^{T} W Θ)^{- 1} Θ^{T} W y$

In MATLAB this can be computed with lscov, where W is the column vector of weights.

theta = lscov(Obs, y, W);

Maximum Likelihood Estimation (MLE)

See equation

Found by determining the $\hat{ϕ}$ that maximises the PDF of the signal $y [n]$ , which depends on the statistics of the noise $w [n]$ .

Given some type of probability distribution, the MLE can be found.

MATLAB mle function from the Statistics and Machine Learning Toolbox.

19 - Correlation and Power spectral density

Correlation

Correlation gives a measure of time-domain similarity between two signals.

Cross Correlation

$R_{x_{1} x_{2}} [k] \approx \frac{1}{N - k} \sum_{n = 0}^{N - k} x_{1} [n] x_{2} [k + n]$

$R_{x_{1} x_{2}} [k] \approx \frac{1}{N - k} (x_{1} [0] x_{2} [k] + x_{1} [1] x_{2} [k + 1] + \dots + x_{1} [N - k] x_{2} [N])$

$k$ is the time shift of $x_{2} [n]$ sequence relative to the $x_{1} [n]$ sequence.
Approximation as signal lengths are finite and the signals could be random.
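
A minimal sketch of the estimate for non-negative shifts. It averages over the products that stay inside both sequences, which is an assumption about how the normalisation is applied to finite data; the function name `cross_correlation` is hypothetical.

```python
def cross_correlation(x1, x2, k):
    """Estimate R_{x1 x2}[k] for a shift k >= 0 by averaging x1[n] * x2[n + k]."""
    overlap = min(len(x1), len(x2) - k)    # number of products that stay in range
    if overlap <= 0:
        raise ValueError("shift k leaves no overlapping samples")
    total = sum(x1[n] * x2[n + k] for n in range(overlap))
    return total / overlap

x = [1.0, 2.0, 3.0, 4.0]
# Correlating a signal with itself gives its autocorrelation
print([cross_correlation(x, x, k) for k in range(len(x))])
```

At $k = 0$ the result is the mean square value of the signal, and for this example it is the largest value over all shifts.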

Example 19.1 - Discrete Cross-Correlation

See example

Autocorrelation

Correlation of a signal with itself, ie $x_{2} [n] = x_{1} [n]$ or $x_{2} (t) = x_{1} (t)$

Gives a measure of whether the current value of the signal says anything about a future value. Especially good for random signals.

Key Properties

Autocorrelation for zero delay is the same as the signal's mean square value. The autocorrelation is never bigger for any non-zero delay.
Autocorrelation is an even function of $k$ or $τ$ , ie $R_{x_{1} x_{1}} [k] = R_{x_{1} x_{1}} [- k]$
Autocorrelation of the sum of two uncorrelated signals is the same as the sum of the autocorrelations of the two individual signals.

If $x_{1} [n]$ and $x_{2} [n]$ are uncorrelated,

$y [n] = x_{1} [n] + x_{2} [n] \Rightarrow R_{yy} [k] = R_{x_{1} x_{1}} [k] + R_{x_{2} x_{2}} [k]$

Example 19.2 - Discrete Autocorrelation

See example

Example 19.3 - Correlation in MATLAB

See example

20 - Image Processing

Types of colour encoding

Binary (0, 1), Indexed (colour map), Greyscale (range 0->1), True Colour (RGB)

Binary has value 0 and 1 to represent black and white
Indexed each pixel has one value corresponding to pre-determined list of colours (colour map)
Greyscale - each pixel has value within 0 (black) and 1 (white) - often write as whole numbers and then normalise
True colour - Three associated values, RGB

But focus on binary and greyscale for hand calculations

Notation

See below

$f [i] [j]$ - follows same indexing conventions as MATLAB

ie: $i$ refers to vertical coordinate (row)

$j$ refers to horizontal coordinate (column)

$(i, j) = (1, 1)$ is the top left pixel.

Digital Convolution

$y [n] = \sum_{k = - \infty}^{\infty} x [k] h [n - k] = x [n] * h [n] = h [n] * x [n]$
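
For finite sequences the infinite sum reduces to a double loop. A minimal sketch, assuming both sequences start at index 0 and are zero elsewhere; the function name `convolve` is hypothetical.

```python
def convolve(x, h):
    """Full discrete convolution y[n] = sum_k x[k] h[n - k] for finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm            # the term x[k] h[n - k] lands at n = k + m
    return y

print(convolve([1, 2, 3], [1, 1]))
print(convolve([1, 1], [1, 2, 3]))         # same result: convolution is commutative
```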

Example 20.1 - 1D Discrete Convolution

See example

Example 20.2 - Visual 1D Discrete Convolution

See example

Image Filtering

Determine output y[i][j] from input x[i][j] through filter (kernel) h[i][j]

Filter (Kernel) = $h [i] [j]$ , assume square matrix with odd rows and columns so obvious middle

Flip impulse response $h [i] [j]$ to get $h^{*} [i] [j]$
1. Achieved by mirroring all elements around center element.
2. By symmetry, sometimes $h [i] [j] = h^{*} [i] [j]$
Move flipped impulse response $h^{*} [i] [j]$ along input image $x [i] [j]$ .
Each time kernel is moved, multiply all elements of $h^{*} [i] [j]$ by corresponding covered pixels in $x [i] [j]$ .
1. Add together products and store sum in output $y [i] [j]$ - corresponds to middle pixel covered by kernel
2. Only consider overlaps between $h^{*} [i] [j]$ and $x [i] [j]$ where the middle element of the kernel covers a pixel in the input image
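
The flip-and-slide procedure can be sketched as follows, using zero padding for off-image pixels. Note the sketch uses Python's 0-based indices rather than the 1-based MATLAB convention used in these notes, and the function name `filter_image` is an assumption.

```python
def filter_image(x, h):
    """Filter image x with an odd-sized square kernel h, zero padding off-image pixels."""
    rows, cols = len(x), len(x[0])
    size = len(h)
    half = size // 2
    # Flip the kernel by mirroring every element about the centre element
    flipped = [row[::-1] for row in h[::-1]]
    y = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for u in range(size):
                for v in range(size):
                    r, c = i + u - half, j + v - half   # pixel covered by flipped[u][v]
                    if 0 <= r < rows and 0 <= c < cols: # off-image pixels count as 0
                        acc += flipped[u][v] * x[r][c]
            y[i][j] = acc                               # stored at the kernel's middle pixel
    return y

impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(filter_image(impulse, kernel))
```

Filtering a single bright pixel reproduces the (unflipped) kernel around it, which is a handy check that the flip has been done the right way round.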

Edge Handling

Zero-padding and replicating

Zero Padding - Treat all off-image pixels as having value 0, ie $x [i] [j] = 0$ beyond the defined image. Simplest but may lead to unusual artefacts at the edges of the image. Only option available for conv2 and default for imfilter.
Replicating the border - Assume off-image pixels have the same value as the nearest element along the edge of the image. IE: Assume pixels at the outside corner take the value of the corner pixel, $x [0] [1] = x [0] [0] = x [1] [0] = x [1] [1]$

Kernels

Different types of kernels.

Larger kernels have increased sensitivity but more expensive to compute.

Values add to 0 = Removes signal strength to accentuate certain details
Values add to 1 = maintain signal strength by redistributing
Low Pass Filter - Equivalent to taking a weighted average around the neighbourhood of a pixel. Adds to 1
Blurring filter - Similar to low pass, but elements add up to more than 1, so washes out the image more
High Pass Filter - Accentuates transitions between colours, can be used as simple edge detection (important task, first step to detecting objects)
Sobel operator - More effective at detecting edges than high pass filter. Do need to apply different kernels for different directions. X-gradient = detecting vertical edges, Y-gradient = detecting horizontal edges

Example 20.3 - Image Filtering

See example

CS325

The notes here are very brief. Most notably they don't contain a lot of the detail of the methods/algorithms (below) needed for the exam.

Algorithms you need to know how to do on paper, by hand, in an exam

Because we just invented computers for fun, apparently.

Lexing
- NFA/DFA stuff
Parsing
- Grammar transformations
  - Eliminating epsilon productions
  - Eliminating left recursion
  - Adding precedence
  - Left factoring
  - Removing ambiguity
- Computing First and Follow sets
- LL(1) parsing
  - Constructing LL(1) parse table
- LR(k) parsing
  - Shift-reduce
  - Constructing set of LR(0) items
  - Constructing LR(0) automaton
  - Constructing LR(0) parse table
- Constructing SLR(1) parse table
Semantic analysis
- Annotating parse trees
- Constructing attribute grammars
- Constructing SDDs
- Constructing SDTs
IRs
- Generating 3-address code for codegen stuff like addressing array elements and control flow
Runtime Environments
- Working out access links/activation records and displays under different calling mechanisms
  - Call-by-value
  - Call-by-reference
  - Call-by-name
  - Copy-restore
- Garbage collection (less sure about this)
  - Mark and sweep
  - Pointer reversal
Optimisation
- Computing basic blocks of a program
- Dataflow analysis algorithms
  - Reaching definitions
  - Live variable analysis
  - Available expressions
- Applying various optimisations to code
  - Algebraic simplification
  - Constant folding
  - Unreachable block elimination
  - Common subexpression elimination
  - Copy/constant propagation
  - Dead code elimination
  - Strength reduction in induction variables
  - Induction variable elimination
- Applying various transformations to loops
  - Loop unrolling
  - Loop coalescing
  - Loop collapsing
  - Loop peeling
  - Loop normalisation
  - Loop invariant code motion
  - Loop unswitching
  - Loop interchange
  - Strip mining
  - Loop tiling
  - Loop distribution
  - Loop fusion
Codegen
- Instruction selection by replacing operations with sequences of assembly
  - Using register and address descriptors
- Peephole optimisation
  - Removing redundant loads/stores
  - Removing jumps over jumps
  - Algebraic optimisations
  - Machine idioms
- Optimal codegen for expressions using Ershov numbers
  - Including spilling to memory
- Instruction selection by tree rewriting
  - Optimal tiling
- Graph colouring for register allocation
  - Chaitin's algorithm - graph colouring heuristic
  - Including spilling to memory

Lexing

We want to transform a stream of characters into a stream of tokens (token name, attribute value)

A lexeme is the sequence of chars from source code
A Regex is formal notation for a recogniser
Recognisers are represented as finite automata
- $S$ - finite set of states in the recogniser along with error state $S_{e}$
- $Σ$ is finite alphabet used by recogniser
- $δ (s, c)$ is transition function
  - Maps states $s$ and characters $c$ to next state
- $s_{0}$ is the start state
- Set $S_{A}$ are accepting states
An alphabet is a finite set of symbols
- String is sequence of symbols from alphabet
- Language is set of strings over the alphabet
  - Defined using grammars
- We want to check if string on alphabet is a member of language
  - Use a recognising automaton
    - Diagrams are large, use regex to express
- A language defined by a regex is the set of all strings that can be described by that regex

Tokenising

Construct a regex matching all lexemes for all tokens
- Union of regexes for the token classes gives a regex $R$ that defines a language
Given an input sequence of characters, want to check if some number of characters belong to the language $R$
- For $1 \leq i \leq n$ check if $C_{1}, ..., C_{i}$ is in $L (R)$
- If true then we remove that string as a token and continue
- Always select the longest sequence - maximal munch
- If more than one token matches use the token class specified first
- If no match then error
Can build a scanner from regex
- Require simulation of a DFA
- Thompson’s construction goes from RE -> NFA
- Subset construction builds a DFA that simulates an NFA
- Hopcroft’s algorithm minimises a DFA
- Kleene’s construction derives an RE from a DFA
NFAs allow transitions on the empty string
- States may have multiple transitions on the same character.
- Can combine multiple FAs by just joining them with epsilon transitions
DFAs only have a single transition on each character from each state
- No epsilon transitions
- Can simulate any NFA
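
The maximal munch and priority rules can be sketched with Python's `re` module. This is an illustration only: the token classes are hypothetical, and `re` is a backtracking matcher rather than the DFA simulation a generated scanner would use.

```python
import re

# Hypothetical token classes, listed in priority order (earlier wins ties)
TOKEN_CLASSES = [
    ("IF", r"if"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("EQ", r"=="),
    ("ASSIGN", r"="),
    ("SKIP", r"\s+"),
]

def tokenise(text):
    """Split text into (token name, lexeme) pairs using maximal munch."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_lexeme = None, ""
        for name, pattern in TOKEN_CLASSES:
            match = re.compile(pattern).match(text, pos)
            # Strictly longer matches win; on a tie the earlier class is kept
            if match and len(match.group()) > len(best_lexeme):
                best_name, best_lexeme = name, match.group()
        if best_name is None:
            raise SyntaxError(f"no token matches at position {pos}")
        if best_name != "SKIP":            # whitespace is matched but not emitted
            tokens.append((best_name, best_lexeme))
        pos += len(best_lexeme)
    return tokens

print(tokenise("if iffy == 42"))
```

Here `if` is a keyword because IF is listed before IDENT, while `iffy` becomes an identifier because the longer match wins.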

Thompson's Construction - RE to NFA

Use a template for building an NFA that corresponds to
- A single letter regex
- Transformation on NFAs that models the effect of regex operators
- Combine fragments using epsilon transitions
- Take into account precedence

Subset Construction - NFA to DFA

Convert NFA to DFA to make it easier to simulate.

Combine states based on epsilon transitions to eliminate them
Create subset of states, then only consider transitions between subsets
Set of states that can be reached from some state $n$ along only epsilon transitions is the epsilon closure of $n$

Where there are several possible choices of next state, take all choices simultaneously and form a set of the possible next states. This set of NFA states becomes a single DFA state.
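
A minimal sketch of the epsilon closure and subset construction. The dictionary-based NFA representation and the example NFA (for the regex `a|ab`) are assumptions chosen for illustration.

```python
def epsilon_closure(states, eps):
    """All NFA states reachable from `states` using only epsilon transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(start, delta, eps, alphabet):
    """Build DFA transitions whose states are frozensets of NFA states.

    delta maps (state, symbol) -> set of next states; eps maps state -> set of states.
    """
    d_start = epsilon_closure({start}, eps)
    d_trans = {}
    worklist = [d_start]
    seen = {d_start}
    while worklist:
        current = worklist.pop()
        for c in alphabet:
            moved = set()
            for s in current:              # take all choices simultaneously
                moved |= delta.get((s, c), set())
            if not moved:
                continue                   # no transition: implicit error state
            target = epsilon_closure(moved, eps)
            d_trans[(current, c)] = target
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    return d_start, d_trans, seen

# Hypothetical NFA for a|ab: 0 -eps-> 1, 0 -eps-> 3, 1 -a-> 2, 3 -a-> 4, 4 -b-> 5
delta = {(1, "a"): {2}, (3, "a"): {4}, (4, "b"): {5}}
eps = {0: {1, 3}}
d_start, d_trans, d_states = subset_construction(0, delta, eps, "ab")
print(d_start, d_trans)
```

A DFA state is accepting if it contains at least one accepting NFA state, so here both `{2, 4}` and `{5}` would be accepting.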

Hopcroft's algorithm - Minimising a DFA

Some states can be merged - partition states into groups of states that produce the same behaviour on any input string.

Start by partitioning into accepting and non-accepting states
Consider each subgroup
- Partition into new subgroups such that two states $s$ and $t$ are in the same subgroup iff for all input symbols $a$ , $s$ and $t$ have transitions on $a$ into the same group $G$
- Replace group with new partitioning
Keep going until convergence
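
The refinement loop can be sketched as below. This is the plain repeated-splitting procedure described in the steps above, not the worklist-optimised form of Hopcroft's algorithm; the function name `minimise` and the example DFA (strings over a, b ending in "ab") are assumptions.

```python
def minimise(states, alphabet, delta, accepting):
    """Partition DFA states into groups with identical behaviour.

    delta maps (state, symbol) -> state; a missing entry means the error state.
    Returns a set of frozensets, one per state of the minimal DFA.
    """
    accepting = set(accepting)
    partition = [g for g in (frozenset(accepting), frozenset(set(states) - accepting)) if g]
    while True:
        def group_of(s):
            for index, group in enumerate(partition):
                if s in group:
                    return index
            return None                    # the error state belongs to no group

        new_partition = []
        for group in partition:
            buckets = {}
            for s in group:
                # Signature: which group each input symbol leads to from s
                key = tuple(group_of(delta.get((s, a))) for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(frozenset(b) for b in buckets.values())
        if len(new_partition) == len(partition):   # nothing was split: converged
            return set(new_partition)
        partition = new_partition

# Hypothetical DFA with a redundant state C that behaves exactly like A
delta = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "C",
}
print(minimise("ABCD", "ab", delta, {"D"}))
```

States A and C end up in the same group, so the minimal DFA has three states.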

Syntax Analysis

Take a stream of words and parse it to check it’s correct
- Builds a parse tree
- If invalid then produce a syntax error

Context Free Grammars

CFGs are formal mechanism for specifying syntax of source language
Parsers parse text according to a grammar
- LL(1) top-down recursive descent
- LR(1) bottom up, canonical LR(1), LALR parser
CFGs - stmt -> if (expr) stmt else stmt
- Four components
  - Set of terminal symbols
  - Set of nonterminal symbols
  - One of the nonterminals is a start symbol
  - Set of productions for nonterminals
- A grammar derives a sentence
- Parsing is the process of figuring out if a sentence is valid
  - Rewrite expressions using grammar

Parse Trees

Parse tree represents derivation as graph
- Terminals at leaves, nonterminals as nodes
- In order gives input, postorder gives evaluation
- Right and leftmost derivations can give different results - grammar is ambiguous
  - Bad property for a program to have
  - Want to be able to rewrite them to be unambiguous
    - Cannot be done automatically
- Also want to give correct mathematical precedence in parse tree
  - Create a non-terminal for each level of precedence
  - Isolate corresponding part of grammar
  - Force parser to recognise high precedence subexpression first

Top-down Parsing

Top-down parsing starts at the root and grows the tree toward leaves.

At each step, select a node for some nonterminal and extend it with a subtree that rewrites the nonterminal
- Always expand leftmost fringe of tree
- If the parser chooses the wrong production it must backtrack
  - Expensive way to discover errors

Grammars can be transformed to make them top-down parsable:

Eliminate left recursion
- A grammar is left-recursive if it has a nonterminal $A$ such that there is some derivation $A \to A a$
  - The nonterminal at the head is the leftmost symbol of the body
  - Topdown parsers cannot handle this
- Can easily eliminate direct left recursion
  - Eliminate epsilon productions
  - Eliminate cycles
  - Given productions $A \to A a_{1} \mid A a_{2} \mid \dots \mid A a_{m} \mid B_{1} \mid B_{2}$ , replace $A$ productions by
    - $A \to B_{1} A' \mid B_{2} A'$
    - $A' \to a_{1} A' \mid a_{2} A' \mid \dots \mid a_{m} A' \mid ε$
- Indirect left recursion still a problem
  - Need a more systematic approach
  - Omitted for sanity, see slide 54 onwards
- Symbol is nullable if it can be expanded with epsilon productions - can disappear to an empty string
  - Find nullable non-terminals, if nullable then create a new production by replacing it with epsilon
  - Can increase grammar size
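
The rewrite for direct left recursion above can be sketched in Python. This is a minimal sketch, assuming productions are stored as tuples of symbols with the empty tuple standing for $ε$; the function name and the representation are illustrative, not from the module.

```python
def eliminate_direct_left_recursion(head, bodies):
    """Rewrite A -> A a_i | B_j as A -> B_j A' and A' -> a_i A' | epsilon.

    bodies is a list of tuples of symbols; () stands for epsilon.
    """
    new_head = head + "'"
    # The a_i: whatever follows the leading A in a left-recursive body
    recursive = [b[1:] for b in bodies if b and b[0] == head]
    # The B_j: bodies that do not start with A
    others = [b for b in bodies if not b or b[0] != head]
    if not recursive:
        return {head: list(bodies)}
    return {
        head: [b + (new_head,) for b in others],
        new_head: [a + (new_head,) for a in recursive] + [()],
    }


# E -> E + T | T   becomes   E -> T E'   and   E' -> + T E' | epsilon
grammar = eliminate_direct_left_recursion("E", [("E", "+", "T"), ("T",)])
```

Indirect left recursion is not handled here, it needs the more systematic approach mentioned above.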

Recursive descent parsers are programs with one parse function per nonterminal (see coursework). Backtrack-free grammars are grammars that can be parsed by such parsers without having to backtrack.
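
As a rough illustration, here is a small recursive descent parser in Python with one method per nonterminal. The grammar (expr, term, factor), the tokeniser and all names are assumptions made for this sketch; the repetition in expr and term stands in for the tail nonterminals you get after eliminating left recursion, and the one-nonterminal-per-precedence-level layout is what makes `*` bind tighter than `+`.

```python
import re


def tokenize(text):
    # Numbers and single-character operators; anything else is silently skipped
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # One token of lookahead, None at end of input
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self):  # expr -> term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.eat()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):  # term -> factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.eat()
            rhs = self.factor()
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):  # factor -> NUM | '(' expr ')'
        if self.peek() == "(":
            self.eat("(")
            value = self.expr()
            self.eat(")")
            return value
        tok = self.eat()
        if not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)


def parse(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return value
```

This version evaluates as it parses instead of building a tree, and never backtracks because the next token always decides which branch to take.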

If a top-down parser picks the wrong production it has to backtrack
- Can use a lookahead in input stream and use context to choose correct production
$FIRST$ set of $a$ is the set of terminals that begin strings derived from $a$
- If $a$ is a terminal then $FIRST (a) = \{a\}$
- For a nonterminal $A$ then $FIRST (A)$ is the complete set of terminal symbols that can appear as the leading symbol derived from $A$
- If nonterminal is nullable then $ε$ needs to be in first set
$FOLLOW$ set of terminals that can appear immediately to the right of $A$
- $FOLLOW (A)$ is the symbols that can appear to the right of $A$
- If $A$ is rightmost symbol in some sentential form then eof is in $FOLLOW (A)$
- For a production $A \to a B b$ everything in $FIRST (b)$ except $ε$ is in $FOLLOW (B)$
- For $A \to a B$ or $A \to a B b$ (where $b$ is nullable), everything in $FOLLOW (A)$ is in $FOLLOW (B)$
Grammars where this always works with one symbol of lookahead are LL(1) grammars - can always predict the correct expansion at each point in the parse
- Choose production $N \to a$ on a symbol $c$ if
  - $c$ in $FIRST (a)$
  - $a$ is nullable and $c$ in $FOLLOW (N)$
Left factoring - convert grammar to have LL(1) property
- Rewrite nonterminals such that productions with common prefixes are factored into new nonterminals
Table-driven LL(1) parsers are most common
- build first and follow sets
- the production $p$ of the form $N \to a$ is in the table at $(N, c)$ if terminal $c$ is in $FIRST (a)$ OR if $a$ is nullable and $c$ (a terminal or eof) is in $FOLLOW (N)$
- if table has conflicts then grammar is not LL(1)
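
The FIRST and FOLLOW rules above can be computed by iterating to a fixed point. A minimal Python sketch, assuming a grammar stored as a dict from nonterminal to a list of bodies (tuples of symbols, with the empty tuple for $ε$); the example grammar and every name here are illustrative assumptions.

```python
EPS, EOF = "ε", "eof"


def first_of_string(symbols, first, nonterminals):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in nonterminals else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol was nullable, or the sequence is empty
    return result


def first_follow(grammar, start):
    nts = set(grammar)
    first = {n: set() for n in nts}
    follow = {n: set() for n in nts}
    follow[start].add(EOF)
    changed = True
    while changed:  # repeat until no set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                first[head] |= first_of_string(body, first, nts)
                changed |= len(first[head]) != before
                for i, sym in enumerate(body):
                    if sym not in nts:
                        continue
                    rest = first_of_string(body[i + 1:], first, nts)
                    before = len(follow[sym])
                    follow[sym] |= rest - {EPS}
                    if EPS in rest:  # everything after sym can vanish
                        follow[sym] |= follow[head]
                    changed |= len(follow[sym]) != before
    return first, follow


# Expression grammar with left recursion already removed
GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
first, follow = first_follow(GRAMMAR, "E")
```

From these sets the LL(1) table follows directly: put $N \to a$ at $(N, c)$ for each $c$ in `first_of_string(a, ...)`, plus each $c$ in `follow[N]` when $a$ is nullable.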

Bottom-up parsing

Bottom-up parsing begins at the leaves and grows towards the root

Identify a substring of the parse tree’s upper fringe that matches RHS of some production, build node for LHS and connect to tree
- Parser adds layers of nonterminals on top of leaves
- Reduces a string to the start symbol of the grammar
Uses a stack that holds grammar symbols
Shift reduce parsing:
- Parser shifts zero or more input symbols onto stack until ready to reduce a string $b$
- Reduce $b$ into head of appropriate production
- Repeat until error detected or until stack contains start symbol and input is exhausted
LR(k) parsers are most prevalent bottom up parsers
- L - scan Left to right, R - Rightmost derivation (constructed in reverse)
- k can be 0, consider both 0 and 1 cases
- More powerful than LL(1) but harder
  - Proper superset of predictive or LL methods
- For a grammar to be LR(k) must be able to recognise occurrence of right side of production in a right-sentential form, with k input symbols of lookahead
Shift-reduce decisions
- LR parser makes shift-reduce decisions by maintaining state to keep track of where we are in parse
- Each state represents a set of items where item indicates how much of a production we have seen at a given point
- An item of a grammar $G$ is a production of $G$ , with a dot at some position of the body - this is an LR(0) item
Collection of LR(0) items provides the basis for constructing a DFA called the LR(0) automaton that is used to make parsing decisions
- Steps:
  - Create augmented grammar - add a $ for end symbol to indicate when it should stop parsing
  - Compute closure set of items
    - Every possible starting state of the automaton
  - Compute GOTO functions for the set
    - Defines transitions for automaton
- Can codify LR(0) automaton in a table to use for making shift-reduce decisions
  - If a string of symbols takes the automaton from state i to state j then shift on the next input symbol a if state j has a transition on a
    - Otherwise reduce
  - Get shift/reduce conflicts in the table where do not have enough context on what to do
Can use SLR(1) parsing table to avoid conflicts - use next symbol and $FOLLOW$ set
- Uses same LR(0) items but uses an extra symbol of lookahead to make shift-reduce decisions
- Use $FOLLOW$ set of nonterminal to determine if a reduction is correct
All LR parsing is the same - table with input string and stack
There are context-free grammars for which shift-reduce parsing does not work - either get shift/reduce or reduce/reduce conflicts
More powerful parsers exist also
- LR(1) uses full set of lookahead symbols
- LALR parsers are based on LR(0) sets and carefully introduce lookaheads into LR(0) items
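
To make the LR(0) item sets concrete, here is a minimal Python sketch of the closure and GOTO computations. It assumes an item is a `(head, body, dot)` tuple and uses a tiny augmented grammar chosen for illustration; none of the names come from the module.

```python
def closure(items, grammar):
    """Add X -> .gamma for every nonterminal X that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:  # dot sits before a nonterminal
            for prod in grammar[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """Move the dot over symbol in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


# Augmented grammar: S' -> S,  S -> ( S ) | x
LR_GRAMMAR = {"S'": [("S",)], "S": [("(", "S", ")"), ("x",)]}
i0 = closure({("S'", ("S",), 0)}, LR_GRAMMAR)
```

Here `i0` is the start state of the automaton, and repeatedly applying `goto` to each new item set until nothing new appears gives the remaining states.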

Semantic Analysis

A valid parse tree can be built that is grammatically correct, but the program may still be wrong according to the semantics of the language.

Syntax Directed Definitions

Attach rules to the productions of a grammar to compute attribute values (types, values, code, etc)

Each nonterminal has a string-valued attribute that represents the expression generated by that nonterminal
- Symbol $∣∣$ used for string concat
- Notation $X . a$ is attribute $a$ of $X$
- Attributes can be of any kind - numbers, types, table references, strings
It's a context free grammar with attributes and rules
- Can be done in a parse tree - use semantic rules for each node and transform tree in-order
  - Gives annotated parse tree - has attribute values at each node
Synthesised attributes are those where the value at the node is determined from attribute values of children
- A synthesised attribute for nonterminal $A$ at node $N$ is defined by a semantic rule associated with the production at $N$
- Production must have $A$ at its head
- Has the desirable property that they can be evaluated during a single bottom-up traversal
SDD with only synthesised attributes is called S-attributed - each rule computes attribute for nonterminal at the head from attributes taken from body
Inherited attributes differ from synthesised attributes
- An inherited attribute for nonterminal $B$ at a parse tree node $N$ is defined by a semantic rule associated with the production at the parent of $N$
  - Production must have $B$ as a symbol in its body
  - Inherited attributes defined in terms of $N$'s parent, itself and siblings
SDDs have issues - makes grammar large
- Copy rules copy sets of info around the parse tree
  - Increase space and complexity
  - Can be avoided with a symbol table but that’s outside of this formalism

Dependency Graphs

Determines evaluation order for attribute instances in parse tree

Depict flow of information among attribute instances
Edge from one attribute instance to another means that value of first is needed to compute second
Gives order of evaluation - a topological sort of the graph
If there are any cycles there are no topological sorts and SDD cannot be evaluated
S-attributed grammars are those where all attributes are synthesised
- Can be evaluated in any bottom-up order
- Can evaluate using a post-order traversal
  - Corresponds to order in which LR parse reduces production to head
L-attributed grammars are those with synthesised and inherited attributes, but such that dependency graph edges can only go from left to right
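
A small Python sketch of finding an evaluation order with a topological sort, using the standard library `graphlib` module (Python 3.9+). The attribute instance names are hypothetical, loosely modelled on a production like $E \to E_{1} + T$ with rule $E.val = E_{1}.val + T.val$.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(deps):
    """deps maps each attribute instance to the set of instances it depends on.

    Returns one valid evaluation order, or None if the graph has a cycle.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# E.val needs both children first; the children have no dependencies
order = evaluation_order({"E.val": {"E1.val", "T.val"}, "E1.val": set(), "T.val": set()})

# A circular definition has no topological sort, so it cannot be evaluated
cyclic = evaluation_order({"A.s": {"A.i"}, "A.i": {"A.s"}})
```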

Syntax Directed Translations

SDTs are based on SDDs - context free grammar augmented with program fragments called semantic actions

Semantic actions can appear anywhere within production body
SDTs more implementation oriented than SDDs - indicate order in which actions are evaluated
Implemented during parsing without building parse tree
Use a symbol table
Denoted with braces placed around actions
- $$ refers to result location for current production
- $1, $2, ..., $n refer to locations for symbols on the RHS of production
To build SDT:
- Build parse tree ignoring actions
- For each interior node add additional children for the actions of the productions, from left to right
  - Actions appear to right of productions in tree
  - This gives postfix SDTs
- Do preorder traversal to evaluate
Typically SDTs are done without building a parse tree
- Consider semantic actions as part of production body
- During parsing, an action is executed as soon as the grammar symbols to its left have been matched
- Can have productions like $B \to X {a} Y$
  - If parse bottom-up then $a$ is performed as soon as $X$ appears on top of stack
  - If parse top-down, $a$ is performed before we attempt to expand $Y$
Postfix SDTs are always LR-parsable, always S-attributed with semantic action at end of production
SDTs implementing L-attribute definitions are LL-parsable - pop and perform action when it comes to top of parse stack

Intermediate Representations

An IR is a data structure with all of the compiler's knowledge of a program

Can be an AST or some sort of machine code (LLVM IR)
Graphical IRs encode info in a graph
- Nodes, edges, lists, trees, etc
- Memory consuming
Linear IRs are pseudo-code for some abstract machine on varying levels of abstraction
Hybrid IRs combine elements of both
- Use LLVM IR to represent blocks and a graph of the control flow between blocks
Parse trees are an IR

ASTs

An abstract syntax tree retains the structure of a parse tree but ditches non-terminal nodes

Can have a DAG to identify common sub-expressions
Encodes redundancy - basic optimisation
Must produce pure sub-expression
Can use SDDs to construct a DAG
- Functions leaf and node create a fresh node each time
  - If constructing DAG, then check identical node exists and if so then return that one
- Equivalence between nodes node(op, left, right) established if node with label op already exists with same left and right, in that order
A control flow graph (also abbreviated CFG) models flow of control between basic blocks in program
- A Directed graph
- Typically used in conjunction with another IR
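
The DAG construction described above (return an existing node instead of a fresh one) can be sketched in Python. The `DagBuilder` class and the example expression are assumptions for illustration; nodes are identified by their index in a list.

```python
class DagBuilder:
    def __init__(self):
        self.nodes = []   # a node's id is its index in this list
        self.index = {}   # signature -> id of the node already built for it

    def _intern(self, signature):
        # Reuse an identical node if one exists, otherwise create it
        if signature not in self.index:
            self.index[signature] = len(self.nodes)
            self.nodes.append(signature)
        return self.index[signature]

    def leaf(self, name):
        return self._intern(("leaf", name))

    def node(self, op, left, right):
        # Equivalent only if op, left and right all match, in that order
        return self._intern((op, left, right))


# a + a * (b - c) + (b - c) * d
dag = DagBuilder()
a, b, c, d = (dag.leaf(name) for name in "abcd")
first_sub = dag.node("-", b, c)
second_sub = dag.node("-", dag.leaf("b"), dag.leaf("c"))  # found, not rebuilt
root = dag.node("+", dag.node("+", a, dag.node("*", a, first_sub)),
                dag.node("*", second_sub, d))
```

The two occurrences of `b - c` share one node, so the DAG has 9 nodes where the plain tree would have 13.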

Linear IRs

Sequences of instructions executed in order. Like asm but with ✨abstraction✨.

One-address code

Models the behaviour of an accumulator machine or stack machine

JVM, CPython do this
Easy to generate and execute

Three-address code

Three-address code is expressions like i = j op k

At most one operator per line
- Unravels multi-op expressions
Compact and can be easily rearranged which is good for optimisation
Most modern processors implement 3-address ops natively
Can also be represented as a linearised syntax tree
An address can be
- A name - pointer to symbol table entry
- A constant
- A compiler-generated temporary
Instructions can be
- Assignment (unary)
- Assignment with a binary op
- Copies
- Jumps (conditional/unconditional)
- Procedure call
- Indexed copy (like index into arrays)
- Address and pointer stuff (think * and &)
Representing linear IRs
- Usually objects/records/structs with fields for operator and operands
- Quadruples have four fields - op, arg1, arg2, result
- Triples have just op, arg1, arg2
  - Refers to result by location in array of instruction
  - Instructions cannot be easily re-arranged - requires changing references
- Indirect triples are similar but use a list of pointers to triples
  - Can re-order by reordering instruction list without affecting triples themselves
Different IRs exist on different levels of abstraction
- Structural IRs are usually high level
- Linear IRs usually lower level
- Can have a lower-level tree showing address calculations and registers n shit
SSA is an IR that facilitates optimisations
- Names correspond uniquely to definition points in the code
- Each name is defined by a single operation
- Uses phi functions to merge the different definitions of a variable that reach a join point in control flow (like a ternary that picks based on which path was taken)
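
A minimal Python sketch of unravelling an expression tree into three-address code stored as quadruples `(op, arg1, arg2, result)`. The tuple-based AST and the `t1, t2, ...` temporary naming are assumptions made for this example.

```python
from itertools import count


def gen_tac(ast):
    """Return (quadruples, address holding the final result).

    An AST node is either a string (a name or constant, already an address)
    or a tuple (op, left, right).
    """
    quads = []
    temps = count(1)

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        arg1, arg2 = walk(left), walk(right)
        result = f"t{next(temps)}"  # compiler-generated temporary
        quads.append((op, arg1, arg2, result))
        return result

    result = walk(ast)
    return quads, result


# a + b * c  becomes  t1 = b * c ; t2 = a + t1
quads, result = gen_tac(("+", "a", ("*", "b", "c")))
```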

SDTs to generate IR

Actual program storage is runtime allocated, but relative addresses can be computed at compile time for local declarations

From types we can determine storage size
Type and relative address are saved in symbol table entry
Dynamic types are handled by saving a pointer to runtime storage
Can use an SDT to compute types and their widths
- Synthesised attributes for type and width of nonterminals

Can use an SDT to generate 3-address code for expressions too

Array addressing is important when generating addresses
- Most languages number 0 to n-1
  - Fortran numbers from 1 to n (cringe)
- Address of array element is base + (i - low) * width
- Can generalise to multiple dimensions
  - base + i1*w1 + i2*w2 + ... + ik * wk
    - Based on row-major layout - the way you’d expect
    - Fortran uses column-major
- Can use this to generate grammar for array references - semantic actions for generating 3-address code to address arrays
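
The address formulas above, sketched as a Python function. It assumes row-major layout, a single lower bound `low` shared by every dimension, and illustrative parameter names; a compiler would emit the equivalent arithmetic as three-address code instead of computing it like this.

```python
def element_address(base, indices, dims, width, low=0):
    """Row-major address of a[i1][i2]...[ik].

    dims holds the number of elements in each dimension, width is the size of
    one element in bytes, low is the index of the first element.
    """
    offset = 0
    for i, n in zip(indices, dims):
        # Horner-style evaluation of i1*w1 + i2*w2 + ... with the widths factored out
        offset = offset * n + (i - low)
    return base + offset * width


# int a[2][3] at address 1000 with 4-byte elements: a[1][2] is element 5
addr_2d = element_address(1000, [1, 2], [2, 3], 4)
# Fortran-style 1-based vector of 8-byte elements: element 3 is 2 slots in
addr_1d = element_address(100, [3], [10], 8, low=1)
```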

Types are used by compilers to generate code and optimise

Type synthesis builds type of expression from types of sub-exprs
Type inference determines the type of an expression from the way it is used
Type conversion can be explicit casts or implicit coercions
Can use semantic actions for all of these

Control flow to IR is tied to translation of bools

Used for flow of control and for logical values (and, or, not)
Can use SDDs to evaluate boolean expressions and generate jumps and addresses for control flow
May need to use backpatching
- Leave jump targets unspecified, do second pass to fill them in

A symbol table is a data structure used to hold info about source-program constructs

May contain:
- Identifiers - data type, addresses, lexeme
- Arrays - dimensions
- Records/structs - fields and types
- Functions - number of params, types
Localises info - no need to annotate parse trees and makes stuff more efficient
Scopes handled by having a separate symbol table for each scope
Can use an SDT with semantic actions to generate a symbol table
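
A minimal Python sketch of one symbol table per scope, chained to the enclosing scope. The class, the field names (`type`, `offset`) and the example declarations are all hypothetical.

```python
class SymbolTable:
    def __init__(self, parent=None):
        self.parent = parent  # enclosing scope, None for the outermost one
        self.entries = {}

    def declare(self, name, **info):
        if name in self.entries:
            raise KeyError(f"{name} already declared in this scope")
        self.entries[name] = info

    def lookup(self, name):
        # Search this scope first, then walk outwards through enclosing scopes
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        return None


global_scope = SymbolTable()
global_scope.declare("x", type="int", offset=0)
inner_scope = SymbolTable(parent=global_scope)
inner_scope.declare("x", type="float", offset=0)  # shadows the outer x
inner_scope.declare("y", type="int", offset=8)
```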

Runtime Environments

We need to understand the computer or abstract machine we are generating code for.

Program Layout

Compilers usually assume each executing program runs in own logical address space
- Mapped to physical addresses by OS
- Compiler is responsible for layout and manipulation of data
Code goes at bottom of address space, followed by static storage, followed by heap (grows up towards stack), followed by stack (grows down towards heap)
Storage layout influenced strongly by addressing constraints
- Alignment - 32 or 64 bit aligned?
- Compiler inserts padding in data types
Static (compile time) and dynamic (runtime) memory are separate
- Stack stores static data local to procedure and sorts out call/return stuff via activation records
- Heap storage is for long lived stuff and may involve GC
Stack allocation assumes execution is sequential and that control flow always returns to point of call
- Allocation made possible by activations of procedure nesting in time
- Lifetimes of activations are properly nested
  - Can use a tree to represent them
    - Sequence of procedure calls corresponds to a pre-order traversal of activation tree
    - Sequence of returns is a post-order traversal
    - Live activations are those that correspond to a node and its ancestors
- Calls and returns managed by control stack - each live activation is a frame on the stack
  - Top of stack is the currently active function
  - Stack frame for a function contains
    - Temporaries
    - Local data
    - Saved machine info (registers)
    - Links: access, control, return

Procedure Calls

Calls are implemented by calling and return sequences - code inserted by compiler to push/pop from stack

Caller evaluates function parameters and pushes
- Pushes return address
- Pushes caller’s local data and temps
Callee saves register values and other status info
Callee vs caller-saves registers - designated per-register
- Caller-saves - save only registers that hold live variables
  - Caller saves before function call
  - May end up saving registers that callee does not use
- Callee-saves - save only registers that function actually uses in its body
  - Callee saves before re-using registers in its own function body
  - May end up saving registers that do not have live values
- Cannot avoid unnecessary saves
  - Use a mixed strategy to optimise
  - Designate some as caller and some as callee

Variable Length Stack Data

Memory for data local to a procedure which has dynamic size (like C/C++ variable length arrays) may be stack allocated

Avoids the expense of heap allocation
Activation record does not hold storage for arrays - only a pointer to the beginning of each array
- Pointers are at known offsets from top-of-stack pointer
top - actual top of stack, points to where next activation record will begin
top_sp - used to find local, fixed-length fields of current top activation record
- Points to end of machine status field
Both of the above can be generated at compile time

Scoping & Access Links

(cringe warning, this is confusing and terrible)

Accessing non-local stack data - mechanism for finding data within another procedure

Static/lexical scope - find required data in enclosing scope
- Global vars have static storage - accessed through known addresses
Dynamic scope/runtime binding - leave decision to runtime and look for closest stack frame which has required data
Access links are pointers to activation records
- If procedure p is nested within procedure q, then access link in any activation of p points to most recent activation of q
- Forms a chain from the activation record at top of stack to activations at lower depths
Displays
- Access links inefficient if nesting depth large
- Faster access to nonlocals can be done using an array of pointers to activation records - a display
- d[i] is a pointer to the highest activation record on the stack for any procedure at depth i
- If procedure p is executing and needs to access element x belonging to some procedure q
  - Look in d[i]
  - Follow the pointer to get the activation record
  - Variable is found at known offset
- Compiler knows what i is so can generate code for this
Dynamic scope - new activation inherits existing bindings of nonlocal names

Parameter Passing

Actual parameters are the ones passed into the call
Formal parameters are those used in the function declaration
l-values (memory location) vs r-values (expressions (not l-values))
Call by value
- Treat formal parameters as local names, with storage for them in the callee's activation record (stack frame)
- Caller evaluates parameters and puts r-values into storage
- Can pass pointers to affect caller
Call by reference
- Passes a pointer to the storage address of each actual parameter
- If lvalue, then lvalue is passed
- If rvalue then it’s evaluated and stored in a temporary and that lvalue passed
Copy-restore
- Hybrid of the above two
- Copy-in copy-out
- Rvalues are passed as in call by value
- Lvalues are determined during call
- When control returns, current r-values copied back to lvalues computed earlier
Call by name
- Procedure treated like a macro
- Body substituted for the call in the caller, actual parameters literally substituted for formal params
- Local names of called procedure are kept distinct from names of calling procedure
Inlining
- Similar to call by name
  - Parameter passing becomes assignments
  - Scoping managed correctly
- (usually) An optimisation to improve execution time
- Increases code size -> different instruction cache performance

Memory Management

Values that outlive the procedure that creates them cannot be kept in the activation record
Heap is used for data that lives indefinitely or for a while
Memory manager is subsystem that allocates/deallocates space within heap
- Deals with free/delete calls
- Java - GC
- Should be efficient
  - Low runtime overhead
  - Facilitate performance of programs
  - Minimise heap space and fragmentation
Fragmentation caused by holes
- When freeing stuff, combine adjacent free chunks
- Best-fit placement - allocate memory in the smallest hole that is big enough - not good for spatial locality
- Next-fit placement - allocate in last split hole if enough space available
  - Improves spatial locality as chunks allocated at same time are placed together
Manual allocation/deallocation (C/C++) is error-prone - forget to free and memory leaks, free too early and you get dangling pointers
GC automatically reclaims free space by deleting unused objects
- Determine reachability of objects by starting from registers and following pointers
- Mark and sweep - mark reachable objects, then sweep the heap and free everything left unmarked
  - Coalesce gaps during sweep phase
- Marking needs memory to track the objects still to visit, but GC runs exactly when memory has run out
  - Use pointer reversal - when a pointer is followed to get a reachable object it is reversed to point at its parent
  - Gives an implicit stack to enable depth-first search of all reachable objects
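
A toy Python sketch of mark and sweep over a heap modelled as a dict from object id to the ids it points at. This is an assumption-heavy simplification: it uses an explicit stack for the mark phase instead of pointer reversal, and "freeing" is just deleting the dict entry.

```python
def mark_and_sweep(heap, roots):
    """heap maps object id -> list of ids it references. Returns the freed ids."""
    marked = set()
    stack = list(roots)
    while stack:  # mark phase: depth-first search from the roots
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            stack.extend(heap[obj])
    garbage = set(heap) - marked  # sweep phase: everything left unmarked
    for obj in garbage:
        del heap[obj]
    return garbage


# c and d point at each other but nothing reachable points at them
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
freed = mark_and_sweep(heap, roots=["a"])
```

Note that the unreachable cycle between `c` and `d` is still collected, since reachability is decided from the roots only.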

Optimisations

Ideally compilers improve our code for us so it runs faster and uses less memory. Optimisations must preserve meaning however, so this is hard.

Basic Blocks

Basic blocks partition IR program into maximal sequences

Flow of control enters only through the first instruction in a block
- No jumps in middle
- Flow of control leaves block at end
- Last instruction may branch
Find branch instructions, identify targets, get basic blocks
Blocks become nodes of control flow graph
Compilers apply optimisations either locally, globally (entire function), or interprocedurally
- 1, 2 are common, 3 rare and has lower payoff
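
Finding basic blocks by marking leaders (the first instruction, every jump target, and every instruction that follows a jump) can be sketched in Python. The instruction encoding, tuples whose first field is the opcode and whose last field is the target index for jumps, is an assumption for this example.

```python
def basic_blocks(instrs):
    """Split a list of instructions into blocks of instruction indices."""
    if not instrs:
        return []
    leaders = {0}  # the first instruction always starts a block
    for i, ins in enumerate(instrs):
        if ins[0] in ("goto", "if"):
            leaders.add(ins[-1])  # a jump target starts a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)  # so does whatever follows a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]


program = [
    ("assign", "i = 0"),        # 0
    ("if", "i >= 10", 5),       # 1
    ("assign", "s = s + i"),    # 2
    ("assign", "i = i + 1"),    # 3
    ("goto", 1),                # 4
    ("return", "s"),            # 5
]
blocks = basic_blocks(program)
```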

Local Optimisations

Algebraic simplification - reduction in strength
- Replace complex ops with simple ones
- Replace muls with shifts
- Replace exponents with muls
Constant folding
- Do operations at compile time
- Have to be careful when doing cross compilation due to different mathematical semantics on different architectures
Eliminate unreachable basic blocks
- Makes code smaller and faster
Common subexpression elimination
- Using SSA, two assignments with the same RHS compute the same value
Copy propagation
- Using SSA, after a copy u = v, uses of u can be replaced with v
- No huge performance effect but facilitates constant folding and dead code elimination
Dead code elimination gets rid of code that does not contribute to a program’s result

Local optimisations do very little on their own but they typically interact. Compilers usually just do them until stuff stops happening.
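
A small Python sketch of two of these interacting inside one basic block: constants are propagated into later operands, which lets more operations be folded. It assumes quadruples of the form `(op, arg1, arg2, result)` with Python ints as constants and only `+`, `-`, `*` supported; all names are illustrative.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(quads):
    """Return (remaining quadruples, names known to hold constants)."""
    consts = {}
    out = []
    for op, a, b, dst in quads:
        # Propagate: replace operands whose value is already known
        a, b = consts.get(a, a), consts.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            consts[dst] = OPS[op](a, b)  # fold: computed at compile time
        else:
            consts.pop(dst, None)  # dst no longer holds a known constant
            out.append((op, a, b, dst))
    return out, consts


block = [("*", 2, 3, "t1"), ("+", "t1", 4, "t2"), ("+", "x", "t2", "y")]
optimised, known = fold_constants(block)
```

The folded instructions are dropped outright here, which is only safe if `t1` and `t2` are not needed after the block; a real pass would leave that decision to dead code elimination.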

Aliasing causes problems with optimisations - regions of memory that overlap

Ones here assume no aliasing
C lets you declare that memory does not overlap with the restrict keyword
- Compiler does not check this

Global Optimisations

Global common subexpression elimination - can be done across blocks
Knowing when values will be used next is useful for optimising
- A variable is live at a particular point in a program if its value is used in future
  - To compute, look into future and work backwards
- Algorithm to compute liveness and next use within a basic block, scanning backwards from the last statement:
  - For each statement i: x = y op z do:
    - Attach to i the current information in the symbol table regarding next use and liveness of x, y, z
    - In symbol table, set x to not live and no next use (x is assigned new value)
    - In symbol table, set y and z to live and next uses of y and z to i
- Liveness propagated backwards, against flow of control
Data flow analysis
- Derive info about flow of data along execution paths
- Dataflow values before and after statement are constrained by the semantics of that statement
  - Relationship between before-after values is the transfer function
  - Transfer function may describe dataflow in either direction
    - $OUT [s] = f_{s} (IN [s])$ - forward along execution path
    - $IN [s] = f_{s} (OUT [s])$ - backwards
- Easy for basic blocks - dataflow value into a statement is the same as dataflow value out of previous statement
  - CFG edges create more complex constraints
  - Transfer function of basic block is the composition of transfer functions of statements in block
- Constraints due to control flow between blocks can be rewritten substituting $IN [B]$ and $OUT [B]$ for $IN [s_{1}]$ and $OUT [s_{n}]$ , where $s_{1} ... s_{n}$ are the statements of $B$

Reaching Definitions

A definition of a variable is a statement that assigns to it
The definition $d$ reaches a point $p$ if there is a path from the point immediately following $d$ to $p$ such that $d$ is not killed along the path
Statements may generate and kill definitions
- Transfer function of a definition $d$ can be expressed $f_{d} (x) = gen_{d} \cup (x - kill_{d})$
  - $gen_{d}$ is the set of definitions generated by statement $d$
  - $kill_{d}$ is the set of all other definitions of the same variable, which $d$ kills
  - $x$ is the set of all definitions reaching $d$ , ie $IN [d]$
Composition of transfer functions like this is gen-kill form
- Extends to basic blocks with any number of statements
Basic blocks also generate and kill sets of definitions
- Gen set is definitions that are downward exposed
- Kill set is union of all definitions killed by individual statements
- A definition may appear in both, gen takes precedence
Iterative algorithm for computing reaching definitions
- $OUT [entry]$ is initialised to $\emptyset$
- For each basic block $B$ other than entry, initialise $OUT [B]$ to $\emptyset$
- While there are any changes to any $OUT$ set (repeat until convergence), for each basic block $B$ other than entry
  - $IN [B]$ = union of $OUT$ of predecessor blocks
  - $OUT [B] = gen_{B} \cup (IN [B] - kill_{B})$
Used for optimisations - check if a definition is constant
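
The iterative algorithm, sketched in Python. The three-block CFG, the definition labels `d1` to `d3` and the dict-based representation are assumptions for illustration; the gen and kill sets are given by hand instead of being derived from code.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Forward analysis with union as the meet. Returns (IN, OUT)."""
    IN = {b: set() for b in blocks}
    OUT = {b: set() for b in blocks}
    changed = True
    while changed:  # repeat until no OUT set changes
        changed = False
        for b in blocks:
            IN[b] = set().union(*(OUT[p] for p in preds[b]))
            new_out = gen[b] | (IN[b] - kill[b])
            if new_out != OUT[b]:
                OUT[b], changed = new_out, True
    return IN, OUT


# B1: d1: x = 1      B2 (loops to itself): d2: x = x + 1      B3: d3: y = x
rd_in, rd_out = reaching_definitions(
    blocks=["B1", "B2", "B3"],
    preds={"B1": [], "B2": ["B1", "B2"], "B3": ["B2"]},
    gen={"B1": {"d1"}, "B2": {"d2"}, "B3": {"d3"}},
    kill={"B1": {"d2"}, "B2": {"d1"}, "B3": set()},
)
```

Both `d1` and `d2` reach the top of `B2` (first entry versus the back edge), but only `d2` survives to `B3`.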

Live Variable Analysis

We wish to know for variable $x$ and point $p$ if the value of $x$ at $p$ could be used along some path in the control flow graph starting at $p$

A variable is live if
- $x$ is used along some path starting at $p$ and there is no definition of $x$ along the path before the use
A variable is dead if
- There is no use of $x$ on any path from $p$ to exit node or all paths from $p$ redefine $x$ before using it
Need to look at future use of vars and work backwards
Used for register allocation and dead code elimination
Given the $def$ and $use$ sets for a block, can relate live vars at beginning to live vars at end by $IN [B] = use \cup (OUT [B] - def)$
Variable is live coming into a block if either:
- Used before redefinition in the block
- Is live coming out of the block and not redefined in the block
Variable is live coming out of a block iff it is live coming into one of its successors
Liveness is calculated backward starting from exit node
Algorithm
- Assume all vars are dead at entry to a block
- Iterate starting from final node
  - $OUT [B]$ = union of the $IN$ sets of all successor blocks
  - $IN [B] = use \cup (OUT [B] - def)$
- Repeat until convergence
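
The same style of sketch for liveness, this time running backwards. The four-block loop and its hand-written use and def sets are assumptions for illustration.

```python
def live_variables(blocks, succs, use, defs):
    """Backward analysis with union as the meet. Returns (IN, OUT)."""
    IN = {b: set() for b in blocks}
    OUT = {b: set() for b in blocks}
    changed = True
    while changed:  # repeat until no IN set changes
        changed = False
        for b in reversed(blocks):  # visiting from the exit converges faster
            OUT[b] = set().union(*(IN[s] for s in succs[b]))
            new_in = use[b] | (OUT[b] - defs[b])
            if new_in != IN[b]:
                IN[b], changed = new_in, True
    return IN, OUT


# B1: i = 0; s = 0     B2: if i < n goto B3 else B4
# B3: s = s + i; i = i + 1; goto B2     B4: return s
live_in, live_out = live_variables(
    blocks=["B1", "B2", "B3", "B4"],
    succs={"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []},
    use={"B1": set(), "B2": {"i", "n"}, "B3": {"s", "i"}, "B4": {"s"}},
    defs={"B1": {"i", "s"}, "B2": set(), "B3": {"s", "i"}, "B4": set()},
)
```

Only `n` is live on entry to `B1`, since `i` and `s` are both defined there before any use.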

Available Expressions

An expression $x + y$ is available at a point $p$ if:
- Every path from entry node to $p$ evaluates $x + y$ before reaching $p$
- There are no assignments to $x$ or $y$ after the evaluation but before $p$
Block kills expression $x + y$ if it assigns to $x$ or $y$ and does not subsequently recompute $x + y$
Block generates expression $x + y$ if it evaluates $x + y$ and does not subsequently redefine $x$ or $y$
If an expression is available at use then there is no need to re-evaluate it - global common subexpression elimination
Expression is available at beginning of block iff available at the end of all predecessors
- Intersection is meet operator

Summary of dataflow analysis algorithms:

	Reaching Definitions	Live Variables	Available Expressions
Domain	sets of definitions	sets of variables	sets of expressions
Direction	forwards	backwards	forwards
Transfer function	$gen \cup (x - kill)$	$use \cup (x - def)$	$e\_gen \cup (x - e\_kill)$
Boundary	$OUT [entry] = \emptyset$	$IN [exit] = \emptyset$	$OUT [entry] = \emptyset$
Meet	union	union	intersection
Equations	$OUT [B] = f_{B} (IN [B])$	$IN [B] = f_{B} (OUT [B])$	$OUT [B] = f_{B} (IN [B])$
Initialise	$OUT [B] = \emptyset$	$IN [B] = \emptyset$	$OUT [B] = U$ (set of all expressions)

Loop Optimisation

Loop optimisation is important to decrease overhead, exploit locality, increase parallelism, etc.

In a loop a variable whose value is derived from number of iterations is called an induction variable
- Can be optimised by computing it with a single increment per loop iteration
- Where there are two or more induction vars may be possible to reduce to a single one
- Involves strength reduction
When optimising loops, work inside-out
- Start with inner loops and then move to outer loops
Loops are key, esp inner loops where lots of computation is done
- Can optimise loop by decreasing number of instructions in an inner loop
- Code motion - take an expression that yields same result independent of loop iteration and move it outside the loop
Dependence is a relationship between two computations that constrains their execution order
- Control - determines control flow
- Data dependence - one computes something the other needs
  - Flow dependence - statement 1 writes a variable that is later read by statement 2
  - Antidependence - statement 1 reads a variable that is later written by statement 2
    - Has consequences for parallelisation
  - Output dependence - two statements write to the same variable
- Have to describe dependence between iterations - loop carried dependencies
  - Dependencies between two successive iterations
Different classes of loop optimisations
- Loop restructuring
  - Unrolling, coalescing, collapsing, peeling
- Dataflow-based loop transformations
  - Loop-based strength reduction, induction variable elimination, invariant code motion
- Loop re-ordering
  - Change the relative order of execution of iterations of a loop nest
    - Expose parallelism and improve locality
  - Loop interchange, strip mining, loop tiling, loop fusion
Unrolling
- Replicate the loop body by an unrolling factor u
- Iterate by u steps instead of 1
- Less overhead in loop conditions, longer basic blocks for better optimisations
Coalescing
- Combine loop nest into a single loop
- Compute indices from resulting single induction var
- Improves scheduling on parallel machine
- Reduces overhead of loop nest
Collapsing
- Less general version of coalescing in which the number of dimensions of the array is reduced
- Eliminates nested loops and multidimensional array indexing
- Best suited for loops that iterate over contiguous memory
Peeling
- Small number of iterations removed from beginning/end and executed separately
- Removes dependence created by first or last few iterations
Normalisation
- Converts all loops so that induction variable is initially 0 and always incremented by 1
- Exposes opportunities for fusion and simplifies analysis
Invariant code motion
- Move computations outside loop where they do not change between iterations
- Reduce register pressure or avoid ALU latency
Unswitching
- Instead of having a conditional within a loop, have a loop within each branch
- Saves the repeated branching overhead
Interchange
- Exchanges position of two loops in a perfect nest
  - Perfect nest means the body of every loop except the innermost contains only a loop
- Enables vectorisation, reduces stride, improves parallel performance
- Increases number of loop-invariant expressions in inner loop
Strip mining
- Adjust granularity of operation
- Similar to unrolling
- Choose number of independent computations in innermost loop of a nest
- Involves cleanup code in case number of iterations is not perfect multiple of strip
Loop tiling
- Generalisation of strip mining in multiple dimensions
- Improve cache reuse by dividing iteration space into tiles
- Critical for high performance in dense matrix multiplication
Loop distribution
- Break a loop into many with same iteration space but subsets of statements of original loop
- Creates perfect loop nests
- Creates subloops with fewer dependencies
- Improves cache usage
- Reduce memory requirements
- Increase register reuse
Loop fusion
- Opposite of the above
- Reduces loop overhead
- Increase instruction parallelism

Codegen

Want to take IR and output assembly that is semantically equivalent.

The main tasks involved are:
- Instruction selection
- Register allocation
- Instruction ordering/scheduling

It is undecidable what the optimal program for any given IR is - we use heuristics.

Instruction selection

Just translate each IR instruction to one or more machine code instructions
- Not very efficient
- Simple to implement but results in repeated loads/stores
Keep track of values in registers to avoid unnecessary loads/stores
Consider each instruction in turn
- Work out what loads are needed
- Generate code for loads
- Generate code for operation
- Generate code for stores
Need to keep track of registers and memory locations for variables
- Register descriptor - (register, variable name) pairs
- Address descriptor - (variable name, location) pairs
  - Location can be a register, memory address, stack location, etc
Need some criteria for selecting registers
- If var currently in a register then no load needed
- If var not in a register then pick an empty one
- If var not in a register and no empty ones then need to pick one to reuse
  - Make sure that the value we reuse is either not needed or stored elsewhere
Statement-by-statement codegen can be optimised with peephole optimisations
- Redundant load/store pairs on the same location can be eliminated
  - Only works if instructions are in same basic block
- Remove jumps over jumps
  - useful in combination with constant propagation
  - eg, removing debug info
- Flow control optimisations
  - jumps-to-jumps can be eliminated
- Algebraic optimisations
  - Eliminate instructions like x = x + 0
- Use of machine idioms
  - target machine may have auto-increment addressing mode
  - May have instructions that implement complex operations
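
A minimal sketch of a peephole pass over one basic block, covering two of the cases above (a redundant load straight after a store, and an add of zero). The tuple instruction format `("LD", reg, loc)`, `("ST", loc, reg)`, `("ADD", dst, src, operand)` is an assumption invented for this example, not any real instruction set.

```python
def peephole(instrs):
    """Single pass over a list of instruction tuples from one basic block."""
    out = []
    for ins in instrs:
        # Algebraic optimisation: ADD r, r, 0 leaves r unchanged.
        if ins[0] == "ADD" and ins[1] == ins[2] and ins[3] == 0:
            continue
        # Redundant load: ST x <- r immediately followed by LD r <- x.
        if out and ins[0] == "LD" and out[-1] == ("ST", ins[2], ins[1]):
            continue
        out.append(ins)
    return out
```

Because eliminating one instruction can expose another pattern, a real pass would typically be repeated until nothing changes.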

Optimal codegen from ASTs

Can use the AST of an expression to generate an optimal code sequence
Proven to generate shortest sequence of instructions
Uses Ershov numbers
- Label any leaf 1
- Label of interior node with one child is the label of its child
- Label of an interior node with two children is
  - Larger of the labels of its children if labels differ
  - One plus the label of its children if labels same
- Label of node is the least number of registers in which expression can be evaluated using no stores of temporary results
Can generate code from labelled expression tree
- Start at root of tree
- gencode for a node with label $k$ and base register $b$ uses registers $R_{b} ... R_{b + k - 1}$ and leaves its result in $R_{b + k - 1}$
- For a node with label $k$ and two children with equal labels
  - gencode(right child) using base register $b + 1$
    - Result appears in $R_{b + k - 1}$
  - gencode(left child) using base register $b$
    - Result appears in $R_{b + k - 2}$
  - Generate instruction OP $R_{b + k - 1}, R_{b + k - 2}, R_{b + k - 1}$
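- For an interior node with label $k$ and unequal child labels (big child has label $k$ , small child has label $m < k$ )
  - gencode(big child) using base register $b$
    - Result appears in $R_{b + k - 1}$
  - gencode(small child) using base register $b$
    - Result appears in $R_{b + m - 1}$
  - Generate instruction OP $R_{b + k - 1}, R_{b + m - 1}, R_{b + k - 1}$ (source operands swapped if the big child is the left child)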
- Evaluating expressions with insufficient register supply means you need extra memory
  - Spill from registers into memory
  - For interior node with label $k >$ number of registers, work on each side of the tree separately, evaluating the larger subtree first and storing its result in memory
  - Generate the store straight after the code that evaluates the big child
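
The labelling rules can be sketched in a few lines. The tree encoding here (a leaf is a string, an interior node is a tuple of an operator followed by its children) is just an assumption for the example.

```python
def ershov(node):
    """Return the Ershov number of an expression tree.

    A leaf is a str; an interior node is (op, child) or (op, left, right).
    """
    if isinstance(node, str):
        return 1  # any leaf is labelled 1
    children = node[1:]
    if len(children) == 1:
        return ershov(children[0])  # one child: same label as the child
    left, right = (ershov(c) for c in children)
    # Two children: the larger label if they differ, one more if equal.
    return max(left, right) if left != right else left + 1


# (a - b) + e * (c + d) needs 3 registers without spilling
tree = ("+", ("-", "a", "b"), ("*", "e", ("+", "c", "d")))
```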

Tree Rewriting

Above algorithm works with RISC instruction sets but CISC instruction sets allow steps to be condensed into one instruction
Treat instruction selection as a tree rewriting problem
Machine instructions implement fragments of IR trees
- Match tree patterns with instructions
- ind operator is dereferencing, $C_{a}$ is a constant offset
- Attempt to tile the subtree
  - Tiles are set of tree patterns that correspond to legal machine instructions
  - Cover the tree with non-overlapping tiles
  - If template matches, matching subtree is replaced with replacement node of rule and machine instruction emitted
- Has its issues
  - Often multiple possibilities
    - Best tiling corresponds to shortest sequence of instructions
  - If no template matches then the rewriting process blocks
  - Need to guard against possibility of single node being rewritten indefinitely
- Optimal tiling - maximal munch
  - Start at root
  - Find largest tile that covers root node
  - Generate that instruction
  - Repeat from the first step for each subtree left uncovered
  - Generates instructions in reverse order
- Optimum tiling - dynamic programming
  - Bottom up rewrite system
  - Omitted for sanity

Register allocation

Decide what to keep in registers and what in memory

Efficient register use is important
When code has more live values than registers, spill to memory
- this is costly
Register allocation is NP complete
Register assignment can be solved in polynomial time
Can re-order instructions based on dataflow to optimise register assignment and reduce spill

Graph colouring

Allocate based on liveness
Works across basic blocks
Steps:
- Compute live variables for each point in program
- Generate an interference graph
  - Each variable becomes a node
  - If variables are live at the same time then make an edge connecting them
    - They cannot be in the same register
- Colour the graph
  - Nodes connected by edge cannot be the same colour
  - A k-colourable graph uses no more than k registers
  - NP hard too, use heuristics
  - Algorithm to colour graph $G$ with $k$ colours - Chaitin's algorithm
  - Step 1 - simplify the graph
    - While $G$ has some node $t$ with fewer than $k$ neighbours
      - Put $t$ on stack and remove it (and its edges) from $G$
    - If all nodes get removed then the graph is k-colourable, otherwise a spill candidate is needed
  - Step 2 - assign colours to nodes
    - Pop the node on top of the stack and add it back to the graph, including its edges
    - Pick a colour not used by any of its neighbours already in the graph
    - Repeat until stack empty
If colouring not found then have to spill to memory
- Will occur when each node has $k$ or more neighbours
- Pick candidate node for spilling and remove from graph, continue as before
  - Have to insert loads/stores for spilled node
  - Which one to spill? Any is fine but affects performance
    - Spill those with most conflicts
    - Spill those with few uses
    - Avoid spilling in loops
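
A small sketch of the simplify/select phases, without spilling. The graph representation (a dict from node to a set of neighbours) and the function name `chaitin_colour` are assumptions for the example; returning `None` stands in for "a spill would be needed here".

```python
def chaitin_colour(graph, k):
    """graph: dict node -> set of neighbours (symmetric).

    Returns dict node -> colour in range(k), or None if the heuristic gets stuck.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        # Step 1: find a node with fewer than k neighbours still in the graph.
        candidate = next((n for n in work if len(work[n]) < k), None)
        if candidate is None:
            return None  # every remaining node has >= k neighbours
        stack.append(candidate)
        for m in work.pop(candidate):
            work[m].discard(candidate)
    colours = {}
    while stack:
        # Step 2: re-insert nodes in reverse order of removal.
        n = stack.pop()
        used = {colours[m] for m in graph[n] if m in colours}
        colours[n] = next(c for c in range(k) if c not in used)
    return colours
```

A node popped in step 2 had fewer than `k` neighbours when it was removed, so a free colour always exists at that point.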

ES3E6 - RF Electronics and Microwave Engineering

RF Semiconductors

Transmission Lines

A transmission line is a two port network that connects a source to a load

Modes

Modes describe the field pattern of propagating waves
- Can be found by solving Maxwell's equations in a transmission line
In a transmission line, electric and magnetic fields are orthogonal to each other, and both orthogonal to the direction of propagation
- This is TEM (Transverse Electromagnetic) mode
A TEM transmission line is represented by two parallel wires
- To reason about voltages and currents within it, we divide it into differential sections $δz$
- Each section is represented by an equivalent lumped element circuit

$R^{'}$ - the combined resistance of both conductors per unit length, in $Ω/ m$
$L^{'}$ - the combined inductance of both conductors per unit length, in $H / m$
$C^{'}$ - the combined capacitance of both conductors per unit length, in $F / m$
$G^{'}$ - the conductance of the insulation medium between the two conductors per unit length, in $S / m$

The table below gives parameters for some common transmission lines

Conductors have magnetic permeability $μ_{c}$ and conductivity $σ_{c}$
The insulating/spacing material has permittivity $ε$ , permeability $μ$ and conductivity $σ$
All TEM transmission lines share the relations
- $L^{'} C^{'} = μ ε$
- $G^{'} / C^{'} = σ / ε$
The complex propagation constant of a line is $γ = α + j β = \sqrt{(R^{'} + jω L^{'}) (G^{'} + jω C^{'})}$
- $α$ is the attenuation constant (Np/m)
- $β$ is the phase constant (rad/m)
The travelling wave solutions of a line are
- $\tilde{V} (z) = V_{0}^{+} e^{- γ z} + V_{0}^{-} e^{γ z}$
- $\tilde{I} (z) = I_{0}^{+} e^{- γ z} + I_{0}^{-} e^{γ z}$
- $z$ represents position along the line
- $V_{0}^{+}, I_{0}^{+}$ represents the incident wave from source to load
- $V_{0}^{-}, I_{0}^{-}$ represents the reflected wave from load to source

We therefore have the characteristic impedance of the TEM transmission line:

$Z_{0} = \frac{V_{0}^{+}}{I_{0}^{+}} = - \frac{V_{0}^{-}}{I_{0}^{-}} = \sqrt{\frac{R^{'} + jω L^{'}}{G^{'} + jω C^{'}}}$

Both the voltage and current waves propagate with a phase velocity $u_{p} = f λ$ . The presence of the two waves propagating in opposite directions produces a standing wave.
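
As a quick numerical check, $γ = \sqrt{(R^{'} + jω L^{'}) (G^{'} + jω C^{'})}$ and $Z_{0} = \sqrt{(R^{'} + jω L^{'}) / (G^{'} + jω C^{'})}$ can be evaluated directly from the per-unit-length parameters. The function name `line_constants` and the example values (250 nH/m, 100 pF/m) are made up for illustration.

```python
import cmath
import math


def line_constants(R, L, G, C, f):
    """Return (gamma, Z0) for per-unit-length R, L, G, C at frequency f in Hz."""
    w = 2 * math.pi * f
    series = complex(R, w * L)  # R' + jwL'
    shunt = complex(G, w * C)   # G' + jwC'
    gamma = cmath.sqrt(series * shunt)  # alpha + j*beta
    z0 = cmath.sqrt(series / shunt)
    return gamma, z0


# Lossless example: L' = 250 nH/m, C' = 100 pF/m at 1 GHz
gamma, z0 = line_constants(0.0, 250e-9, 0.0, 100e-12, 1e9)
```

For these values the attenuation constant is zero and $Z_{0}$ comes out at 50 Ohm, as expected from $\sqrt{L^{'} / C^{'}}$ .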

The Lossless Transmission Line

In most practical situations, we can assume a transmission line to be lossless:

$R^{'} ≪ ω L^{'}$ , and $G^{'} ≪ ω C^{'}$
Assume $R^{'} = G^{'} \approx 0$ , so $γ = jω \sqrt{L^{'} C^{'}}$
Therefore, as $γ = α + j β$ :
- $α = 0$
- $β = ω \sqrt{L^{'} C^{'}} = ω \sqrt{μ ε}$
- $Z_{0} = \sqrt{L^{'} / C^{'}}$

This then gives velocity and wavelength:

$u_{p} = \frac{ω}{β} = \frac{1}{\sqrt{L^{'} C^{'}}} = \frac{1}{\sqrt{μ ε}}$

$λ = \frac{2 π}{β} = \frac{2 π}{ω \sqrt{L^{'} C^{'}}} = \frac{1}{f \sqrt{μ ε}}$

As the insulating material is usually non-magnetic, we have $μ = μ_{0}$

$c = \frac{1}{\sqrt{μ_{0} ε_{0}}}$

$u_{p} = \frac{c}{\sqrt{ε_{r}}}$

$λ = \frac{c}{f \sqrt{ε_{r}}} = \frac{λ_{0}}{\sqrt{ε_{r}}}$

Voltage Reflection Coefficient

Assume a transmission line driven by a generator with impedance $Z_{g}$ and terminated by a load impedance $Z_{L}$ .

At any position on the line, the total voltage and current is:

$V (z) = V_{0}^{+} e^{- γ z} + V_{0}^{-} e^{γ z}$

$I (z) = \frac{V_{0}^{+}}{Z_{0}} e^{- γ z} - \frac{V_{0}^{-}}{Z_{0}} e^{γ z}$

At the load at position $z = 0$ , the load impedance is:

$Z_{L} = \frac{V _{L}}{I _{L}}$

Using this we can find an expression for the ratio of backwards wave amplitude and forward wave amplitude. We obtain the equation below, the voltage reflection coefficient

$Γ = \frac{V _{0}^{-}}{V _{0}^{+}} = \frac{Z _{L} - Z _{0}}{Z _{L} + Z _{0}}$

$Z_{0}$ for a lossless line is a real number, but $Z_{L}$ may be a complex quantity
In general, the reflection coefficient is also complex, $Γ = ∣Γ∣ exp (j θ_{Γ})$
- Note that $∣Γ∣ \leq 1$ for any passive load
A load is matched to a line when $Z_{L} = Z_{0}$ , as then $Γ = 0$
- No reflection by the load $V_{0}^{-} = 0$
If $Z_{L} = \infty$ then $Γ = 1$ , then $V_{0}^{-} = V_{0}^{+}$
- Open circuit load
If $Z_{L} = 0$ then $Γ = - 1$ , then $V_{0}^{-} = - V_{0}^{+}$
- Short circuit load

Standing Waves

The standing wave equation gives an expression for the standing wave voltage at position $z = - l$

$∣ \tilde{V} (z = - l) ∣ = ∣ V_{0}^{+} ∣∣ (1 + ∣Γ∣ e^{j (θ_{Γ} - 2 βl)}) ∣$

The ratio of $∣ V ∣_{max}$ to $∣ V ∣_{min}$ is called the Voltage Standing Wave Ratio, or VSWR. Max occurs when $e^{j (θ_{Γ} - 2 βl)} = 1$ , and min occurs when $e^{j (θ_{Γ} - 2 βl)} = - 1$ , so:

$VSWR = \frac{∣ V ∣_{max}}{∣ V ∣_{min}} = \frac{1 + ∣Γ∣}{1 - ∣Γ∣}$
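
A minimal sketch tying the reflection coefficient to the standing wave ratio, using $Γ = (Z_{L} - Z_{0}) / (Z_{L} + Z_{0})$ and $VSWR = (1 + ∣Γ∣) / (1 - ∣Γ∣)$ . The function names are invented for the example, and returning infinity for total reflection is a choice made here to avoid dividing by zero.

```python
import cmath
import math


def reflection_coefficient(z_load, z0):
    """Voltage reflection coefficient of load z_load on a line of impedance z0."""
    return (z_load - z0) / (z_load + z0)


def vswr(gamma):
    """Voltage standing wave ratio for a given reflection coefficient."""
    mag = abs(gamma)
    # |gamma| = 1 (short or open circuit) gives an infinite standing wave ratio.
    return math.inf if mag >= 1 else (1 + mag) / (1 - mag)


# Normalised load z_L = 2 - j, so z0 = 1
g = reflection_coefficient(2 - 1j, 1.0)
angle_deg = math.degrees(cmath.phase(g))
```

For $z_{L} = 2 - j$ this gives $Γ = 0.4 - j 0.2$ , i.e. a magnitude of about 0.45 at about $-26.6^{\circ}$ .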

Input Impedance of Lossless Lines

The input impedance $Z_{in}$ of a transmission line is the ratio of the total voltage to the total current at any point $z = - l$ on the line

$Z_{in} = Z (z = - l) = \frac{V ( z = - l )}{I ( z = - l )} = Z_{0} \frac{1 + Γ e ^{- j 2 βl}}{1 - Γ e ^{- j 2 βl}} = Z_{0} \frac{Z _{L} + j Z _{0} tan ( βl )}{Z _{0} + j Z _{L} tan ( βl )}$

For a short circuit line $Z_{L} = 0$ , $Z_{in} = j Z_{0} tan (βl)$
For an open circuit line $Z_{L} = \infty$ , $Z_{in} = - j Z_{0} cot (βl)$

The Smith Chart

The Smith chart is a graphical tool for analysing and designing transmission line circuits. It represents the reflection coefficient's complex plane.

The image below shows the complex plane

Point A is the reflection coefficient $Γ = 0.3 + j 0.4 = 0.5 e^{j 53^{\circ}}$
Point B is the reflection coefficient $Γ = - 0.5 - j 0.2 = 0.54 e^{j 202^{\circ}}$

The Smith chart shows circles of constant normalised resistance $r_{L}$ , and constant normalised reactance $x_{L}$ , within the unit circle plane.

Given the normalised value of a load impedance $z_{L} = Z_{L} / Z_{0} = r_{L} + j x_{L}$ , we can find the value of the corresponding reflection coefficient, and vice-versa.

Example

In the example below, point $P$ is plotted on the $r_{L} = 2$ and $x_{L} = - 1$ lines, representing a normalised impedance of $z_{L} = 2 - j$ .

The length of the line between the point $P$ and the centre $O$ corresponds to the magnitude of the reflection coefficient
The angle between the x axis and the point $P$ is $- 26.6^{\circ}$
$Γ = 0.45 e^{- j 26.6^{\circ}}$

Phase Shifting

Based on the input impedance in terms of the reflection coefficient, we obtain

$z (- l) = \frac{Z ( z = - l )}{Z _{0}} = \frac{1 + Γ e ^{- j 2 βl}}{1 - Γ e ^{- j 2 βl}} = \frac{1 + Γ _{l}}{1 - Γ _{l}}$

$Γ_{l}$ is the phase shifted reflection coefficient. $Γ$ at $z = - l$ on a transmission line is equal to the reflection coefficient at the load ( $z = 0$ ), shifted by $- 2 βl$ :

$Γ_{l} = Γ e^{- j 2 βl} = ∣Γ∣ e^{j θ_{Γ}} e^{- j 2 βl} = ∣Γ∣ e^{j (θ_{Γ} - 2 βl)}$

This phase shift can be achieved on the Smith chart by maintaining constant magnitude $∣Γ∣$ and decreasing the phase $θ_{Γ}$ by $2 βl$ , corresponding to a clockwise rotation through an angle of $2 βl$ radians.

A complete rotation of $2 π$ radians corresponds to a change in length of $l = λ /2$ . The outermost scale on the chart "wavelengths toward the generator" denotes movement on the transmission line toward the source, in units of wavelength.

Example

Point $A$ is a normalised load of $z_{L} = 2 - j$ at $0.287 λ$ . If the load terminates a transmission line of length $0.1 λ$ , what is its input impedance?

Move clockwise by $0.1 λ$ around a constant $∣Γ∣$ circle
Read the Smith chart at point $B$ to get $z_{in} = 0.6 - j 0.66$
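
The chart reading can be cross-checked numerically with the lossless-line formula $Z_{in} = Z_{0} (Z_{L} + j Z_{0} tan βl) / (Z_{0} + j Z_{L} tan βl)$ . This is only a sketch; the function name `input_impedance` and taking the length in wavelengths are choices made for the example.

```python
import math


def input_impedance(z_load, z0, length_in_wavelengths):
    """Input impedance of a lossless line of the given length terminated in z_load."""
    bl = 2 * math.pi * length_in_wavelengths  # electrical length beta*l in radians
    t = math.tan(bl)
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)


# Normalised load 2 - j on a 0.1 wavelength line (z0 = 1 for normalised values)
z_in = input_impedance(2 - 1j, 1.0, 0.1)
```

The result is roughly $0.60 - j 0.66$ , in agreement with point $B$ . Lengths of exactly $λ /4$ are best avoided in this form, since $tan βl$ blows up there.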

Admittance

For some problems, it is more convenient to work with admittances than with impedances

$Y = G + j B = \frac{1}{Z} = \frac{1}{R + j X}$

Normalised admittance $y$ is therefore:

$y = \frac{Y}{Y_{0}} = \frac{G}{Y_{0}} + j \frac{B}{Y_{0}} = g + jb$

Rotation by $λ /4$ on the SWR circle transforms $z$ into $y$ , and vice-versa
$r_{L}$ circles become $g_{L}$ circles
$x_{L}$ circles become $b_{L}$ circles

Example

Point $A$ represents a normalised load impedance of $z_{L} = 0.6 + j 1.4$ . Moving on the SWR circle by $λ /4$ gives point $B$ , the corresponding normalised admittance of $y_{L} = 0.25 - j 0.6$

Narrowband Matching

A transmission line of characteristic impedance $Z_{0}$ is matched to a load $Z_{L}$ when $Z_{L} = Z_{0}$ : no waves incident upon the load are reflected back towards the source. A matching network, placed between the load and the line, is used to achieve these conditions. Examples of matching networks include

The $λ /4$ transformer
- A transmission line in series of length $λ /4$
A capacitor/inductor in shunt
A short circuit stub in parallel

Note that for lines of length $l = nλ /2$ , since $βl = nπ$ , the input impedance of the line is equal to the load impedance and the line does not modify the impedance of the load to which it is connected.

$Z_{in} = Z (z = - l) = Z_{0} \frac{Z _{L} + j Z _{0} tan ( nπ )}{Z _{0} + j Z _{L} tan ( nπ )} = Z_{L}$

$λ /4$ Transformer

The input impedance of a line of length $λ /4$ is

$Z_{01} = Z_{in} = Z_{02} \frac{Z _{L} + j Z _{02} tan ( π /2 )}{Z _{02} + j Z _{L} tan ( π /2 )} = \frac{Z _{02}^{2}}{Z _{L}}$

$Z_{02} = \sqrt{Z_{01} Z_{L}}$

Choosing $Z_{02}$ this way makes $Z_{in} = Z_{01}$ , which eliminates reflections at $A A^{'}$ .

$Γ = \frac{Z_{in} - Z_{0}}{Z_{in} + Z_{0}}$

At the frequency for which the transformer is a perfect $λ /4$ , there is a perfect match, and $Γ = 0$ . However, as we deviate from the match frequency, the performance degrades:

$∣Γ∣ = \frac{1}{( 1 + ( 4 Z_{0} Z_{L} / ( Z_{L} - Z_{0} )^{2} ) \sec^{2} θ )^{1/2}}$ , where $θ = βl$

We use $Γ_{m}$ as an acceptable maximum reflection coefficient, for which the bandwidth is defined:

$Δ θ = 2 (\frac{π}{2} - θ_{m})$

Solving for $cos θ_{m}$ from the above equations gives:

$cos θ_{m} = \frac{Γ_{m}}{\sqrt{1 - Γ_{m}^{2}}} \frac{2 \sqrt{Z_{0} Z_{L}}}{∣ Z_{L} - Z_{0} ∣}$

Assuming TEM lines, where $f_{0}$ is the designed frequency, we can then link $θ_{m}$ with $f_{m}$ , the max/min frequency at which our match has an acceptable performance:

$θ = βl = \frac{2 π f}{u_{p}} \frac{u_{p}}{4 f_{0}} = \frac{π f}{2 f_{0}}$

$θ_{m} = \frac{π f_{m}}{2 f_{0}}$

The fractional bandwidth of a matching section (where $cos θ_{m}$ is derived above):

$\frac{Δ f}{f _{0}} = \frac{2 ( f _{0} - f _{m} )}{f _{0}} = 2 - \frac{2 f _{m}}{f _{0}} = 2 - \frac{4 θ _{m}}{π}$

The smaller the load mismatch, the larger the bandwidth.
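
A short sketch of the quarter-wave design and its fractional bandwidth, assuming a purely real load and using $cos θ_{m} = \frac{Γ_{m}}{\sqrt{1 - Γ_{m}^{2}}} \frac{2 \sqrt{Z_{0} Z_{L}}}{∣ Z_{L} - Z_{0} ∣}$ with $Δ f / f_{0} = 2 - 4 θ_{m} / π$ . The function names and the 50 Ohm to 10 Ohm example values are illustrative only.

```python
import math


def quarter_wave_z(z0, r_load):
    """Characteristic impedance of the quarter-wave matching section."""
    return math.sqrt(z0 * r_load)


def fractional_bandwidth(z0, r_load, gamma_m):
    """Fractional bandwidth over which |gamma| stays below gamma_m."""
    c = (gamma_m / math.sqrt(1 - gamma_m ** 2)) * 2 * math.sqrt(z0 * r_load) / abs(r_load - z0)
    # If c >= 1 the reflection never exceeds gamma_m, so clamp before acos.
    theta_m = math.acos(min(c, 1.0))
    return 2 - 4 * theta_m / math.pi


z_section = quarter_wave_z(50.0, 10.0)
bw = fractional_bandwidth(50.0, 10.0, 0.2)
```

For this 5:1 mismatch and $Γ_{m} = 0.2$ the section is about 22.4 Ohm and the fractional bandwidth comes out at roughly 29%. Moving the load closer to 50 Ohm increases it, as stated above.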

Lumped Element Matching Networks

An L-section uses two reactive elements to match a load impedance to a transmission line. If $Z_{L}$ falls within the $1 + j x$ circle on the Smith chart, then the left configuration is used, else the right configuration is used.

Let $Z_{L} = R_{L} + j X_{L}$ where $R_{L} > Z_{0}$ (inside $1 + j x$ circle). For an impedance match:

$Z_{0} = j X + (j B + \frac{1}{R _{L} + j X _{L}})^{- 1}$

Solving for $X$ and $B$ :

$B = \frac{X_{L} \pm \sqrt{R_{L} / Z_{0}} \sqrt{R_{L}^{2} + X_{L}^{2} - Z_{0} R_{L}}}{R_{L}^{2} + X_{L}^{2}}$

$X = \frac{1}{B} + \frac{X _{L} Z _{0}}{R _{L}} - \frac{Z _{0}}{B R _{L}}$

Two solutions are possible, and both are physically realisable with capacitors/inductors.

Consider the alternative where $R_{L} < Z_{0}$ (outside $1 + j x$ circle):

$\frac{1}{Z_{0}} = j B + \frac{1}{R_{L} + j ( X + X_{L} )}$

$X = \pm \sqrt{R_{L} (Z_{0} - R_{L})} - X_{L}$

$B = \pm \frac{\sqrt{( Z_{0} - R_{L} ) / R_{L}}}{Z_{0}}$
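
As a sanity check on the $R_{L} > Z_{0}$ case, the sketch below computes both $(X, B)$ solutions from $B = \frac{X_{L} \pm \sqrt{R_{L} / Z_{0}} \sqrt{R_{L}^{2} + X_{L}^{2} - Z_{0} R_{L}}}{R_{L}^{2} + X_{L}^{2}}$ and $X = \frac{1}{B} + \frac{X_{L} Z_{0}}{R_{L}} - \frac{Z_{0}}{B R_{L}}$ , then evaluates the impedance seen by the line. The function names and the example load of $200 - j 100$ Ohm on a 100 Ohm line are assumptions for illustration.

```python
import math


def l_section_high_r(z_load, z0):
    """(X, B) pairs for the shunt-B-then-series-X network; needs Re(z_load) > z0."""
    r, x = z_load.real, z_load.imag
    root = math.sqrt(r / z0) * math.sqrt(r * r + x * x - z0 * r)
    solutions = []
    for sign in (+1, -1):
        b = (x + sign * root) / (r * r + x * x)
        xs = 1 / b + x * z0 / r - z0 / (b * r)
        solutions.append((xs, b))
    return solutions


def seen_impedance(z_load, xs, b):
    """Impedance looking into the network: jX in series with (jB parallel to load)."""
    return 1j * xs + 1 / (1j * b + 1 / z_load)


solutions = l_section_high_r(200 - 100j, 100.0)
```

Both solutions should transform the load to exactly $Z_{0}$ at the design frequency; which one is preferable depends on the component values and bandwidth wanted.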

Shunt Lumped Element Matching

We use a lumped element in parallel with the load to achieve matching as shown in the figure. As the element $Y_{s}$ is in shunt, we work in the admittance domain.

Assuming $Z_{L} = R_{L} + j X_{L}$ , the aim is at terminal $M M^{'}$ to transform $R_{L}$ to $Z_{0}$ , and $X_{L}$ to $0$ . Assuming $Y_{s} = j B_{s}$ and $Y_{d} = G_{d} + j B_{d}$ , the aim is to choose a length $d$ and value of $Y_{s}$ to match $Y_{0}$ of the feedline to $Y_{in}$ , given by the sum of $Y_{d}$ and $Y_{s}$ .

$Y_{in} = j B_{s} + G_{d} + j B_{d} = G_{d} + j (B_{s} + B_{d})$

$G_{d} = Y_{0} = 1/ Z_{0}$

$B_{s} + B_{d} = 0$

Broadband Matching

Multi-section transformers can be used where a wider bandwidth of matching is required than can be achieved by a single $λ /4$ transformer.

Small Reflections

To derive such a transformer, we start with the theory of small reflections, applied to a single-section transformer. The incident wave will partially reflect and partially transmit at the $Z_{1} / Z_{2}$ interface, which will then reflect at the load, and then reflect again at the boundary, and so on.

$Γ_{1} = \frac{Z_{2} - Z_{1}}{Z_{2} + Z_{1}}$

$Γ_{2} = - Γ_{1}$

$Γ_{3} = \frac{Z_{L} - Z_{2}}{Z_{L} + Z_{2}}$

Summing the reflections/transmissions, the total reflection seen by the feedline is:

$Γ = Γ_{1} + T_{12} T_{21} Γ_{3} e^{- 2 j θ} \sum_{n = 0}^{\infty} Γ_{2}^{n} Γ_{3}^{n} e^{- 2 jn θ}$

This is a geometric series, which sums to:

$Γ = \frac{Γ _{1} + Γ _{3} e ^{- 2 j θ}}{1 + Γ _{1} Γ _{3} e ^{- 2 j θ}}$

Multisection Transformer

Consider now a multisection transformer, which is just lots of small sections of transmission line of equal length

$Γ_{0} = \frac{Z_{1} - Z_{0}}{Z_{1} + Z_{0}}$

$Γ_{n} = \frac{Z_{n + 1} - Z_{n}}{Z_{n + 1} + Z_{n}}$

$Γ_{N} = \frac{Z_{L} - Z_{N}}{Z_{L} + Z_{N}}$

Making a few assumptions:

Assume the differences between adjacent impedances are small
Assume all $Z_{n}$ increase or decrease monotonically
Assume $Z_{L}$ is real
- $Γ_{n}$ will be real and of the same sign

The total reflection coefficient is therefore:

$Γ (θ) = Γ_{0} + Γ_{1} e^{- 2 j θ} + ... + Γ_{N} e^{- 2 j N θ}$

Any desired value of $Γ$ can be synthesised by suitably choosing $Γ_{n}$ and $N$ .

The Binomial Transformer

We show how to realise such a transformer with a maximally flat total reflection coefficient, a binomial transformer.

For an $N$ -section transformer:

Set the first $N - 1$ derivatives of $∣Γ (θ) ∣$ to 0 at the center frequency $f_{0}$
- Provided by a reflection coefficient of the form $Γ (θ) = A (1 + e^{- 2 j θ})^{N}$
- Magnitude is then $∣Γ (θ) ∣ = 2^{N} ∣ A ∣∣ cos θ ∣^{N}$
To determine $A$ , let $f \to 0$
- $θ = βl \to 0$
- Expression reduces to $Γ (0) = 2^{N} A = \frac{Z _{L} - Z _{0}}{Z _{L} + Z _{0}}$
- All sections are of 0 electrical length as $f \to 0$
- $A = 2^{- N} \frac{Z _{L} - Z _{0}}{Z _{L} + Z _{0}}$

$Γ (θ)$ is expressed as a binomial series:

$Γ (θ) = A (1 + e^{- 2 j θ})^{N} = A \sum_{n = 0}^{N} C_{n}^{N} e^{- 2 jn θ}$

$Γ_{n} = A C_{n}^{N}$

Because we assume $Γ_{n}$ are all small, we can approximate the characteristic impedances as:

$Γ_{n} = \frac{Z _{n + 1} - Z _{n}}{Z _{n + 1} + Z _{n}} \approx \frac{1}{2} ln \frac{Z _{n + 1}}{Z _{n}}$

$Z_{n + 1} \approx exp (ln Z_{n} + 2^{- N} C_{n}^{N} ln \frac{Z _{L}}{Z _{0}})$

To find the bandwidth of the binomial transformer, let $Γ_{m}$ be the maximum tolerated reflection coefficient over the passband.

$Γ_{m} = 2^{N} ∣ A ∣ cos^{N} θ_{m}$

$θ_{m} < π /2$ is the lower edge of the passband. Therefore:

$cos θ_{m} = \frac{1}{2} (\frac{Γ _{m}}{∣ A ∣})^{1/ N}$

And the fractional bandwidth:

$\frac{Δ f}{f _{0}} = \frac{2 ( f _{0} - f _{m} )}{f _{0}} = 2 - \frac{4 θ _{m}}{π}$
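
The design procedure can be sketched as below, using $Z_{n + 1} \approx exp (ln Z_{n} + 2^{- N} C_{n}^{N} ln (Z_{L} / Z_{0}))$ for the section impedances and $A = 2^{- N} (Z_{L} - Z_{0}) / (Z_{L} + Z_{0})$ for the bandwidth. The function name and the example (three sections matching 50 Ohm to a 100 Ohm line with $Γ_{m} = 0.05$ ) are assumptions for illustration, and no check is made that $Γ_{m} / ∣ A ∣$ stays small enough for the arccos to exist.

```python
import math


def binomial_transformer(z0, z_load, n_sections, gamma_m):
    """Return (section impedances Z_1..Z_N, fractional bandwidth)."""
    ratio = math.log(z_load / z0)
    impedances = []
    z = z0
    for n in range(n_sections):
        # ln Z_{n+1} = ln Z_n + 2^-N * C(N, n) * ln(Z_L / Z_0)
        z = math.exp(math.log(z) + 2.0 ** -n_sections * math.comb(n_sections, n) * ratio)
        impedances.append(z)
    a = 2.0 ** -n_sections * (z_load - z0) / (z_load + z0)
    theta_m = math.acos(0.5 * (gamma_m / abs(a)) ** (1 / n_sections))
    return impedances, 2 - 4 * theta_m / math.pi


sections, bw = binomial_transformer(100.0, 50.0, 3, 0.05)
```

This gives sections of roughly 91.7, 70.7 and 54.5 Ohm and a fractional bandwidth of about 71%, far wider than a single quarter-wave section for the same mismatch.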

Rectangular Waveguides

Waveguides are just rectangular tubes full of air for transmission of power waves at high frequencies. Waveguides with a single conductor support either TE or TM waves, but not TEM waves.

Modes define the properties of how a wave propagates through a guide
Modes are defined by $m$ and $n$
- Obtained through solving the wave equations for different boundary conditions
Mode with lowest cutoff frequency is the dominant mode
- Dominant TM mode is $T M_{11}$
- Dominant TE mode is $T E_{10}$

TM Modes

Phase Constant

A wave is travelling inside the guide along the z-direction. Its phase factor is $e^{- j β z}$ with:

$β = \sqrt{k^{2} - k_{c}^{2}} = \sqrt{ω^{2} μ ε - (\frac{mπ}{a})^{2} - (\frac{nπ}{b})^{2}}$

Cutoff Frequency

Corresponding to each mode there is a cutoff frequency $f_{c mn}$ at which $β = 0$ . A mode can only propagate if $f > f_{c mn}$ , as only then is $β$ real.

$f_{c mn} = \frac{u_{p 0}}{2} \sqrt{(\frac{m}{a})^{2} + (\frac{n}{b})^{2}}$

$u_{p 0} = 1/ \sqrt{μ ε}$ is the phase velocity of a TEM wave in an unbounded medium with parameters $μ$ and $ε$ .

Phase Velocity

$u_{p} = \frac{ω}{β} = \frac{u_{p 0}}{\sqrt{1 - ( f_{c mn} / f )^{2}}}$

Wave Impedance

$Z_{TM} = \frac{β η}{k} = η \sqrt{1 - (\frac{f_{c mn}}{f})^{2}}$

$η = \sqrt{μ / ε}$ is the intrinsic impedance of the lossless medium.

TE Mode

All the parameters are the same as for TM mode, except for wave impedance

$Z_{TE} = \frac{η}{\sqrt{1 - ( \frac{f_{c mn}}{f} )^{2}}}$

The TE dominant mode, assuming $a > b$ where $a$ and $b$ are the width and height of the waveguide, is $T E_{10}$ with

$f_{c 10} = \frac{1}{2 a \sqrt{μ ε}}$
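
A small numerical sketch of $f_{c mn} = \frac{u_{p 0}}{2} \sqrt{(m / a)^{2} + (n / b)^{2}}$ and the two wave impedances, assuming a non-magnetic filling ( $μ = μ_{0}$ ). The function names and the example guide dimensions (22.86 mm by 10.16 mm, air filled) are illustrative assumptions.

```python
import math

C0 = 299_792_458.0     # speed of light in vacuum, m/s
ETA0 = 376.730313668   # intrinsic impedance of free space, Ohm


def cutoff_frequency(m, n, a, b, eps_r=1.0):
    """Cutoff frequency in Hz of the TE_mn / TM_mn mode; a, b in metres."""
    u_p0 = C0 / math.sqrt(eps_r)
    return (u_p0 / 2) * math.sqrt((m / a) ** 2 + (n / b) ** 2)


def wave_impedances(f, f_c, eps_r=1.0):
    """Return (Z_TE, Z_TM) at frequency f > f_c."""
    eta = ETA0 / math.sqrt(eps_r)
    factor = math.sqrt(1 - (f_c / f) ** 2)
    return eta / factor, eta * factor


fc10 = cutoff_frequency(1, 0, 0.02286, 0.01016)
z_te, z_tm = wave_impedances(10e9, fc10)
```

For these dimensions the $T E_{10}$ cutoff is about 6.56 GHz. Note that $Z_{TE} Z_{TM} = η^{2}$ at any frequency above cutoff, which is a handy check.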

Zigzag Reflections

For the $T E_{10}$ mode, the field component can be expressed as the sum of two TEM plane waves, both travelling in the $+ z$ direction, but zigzagging between opposite walls of the waveguide. The phase velocity of these waves is $u_{p 0}$ and their direction is at angles. The phase velocity of their combination $u_{p}$ is that of the $T E_{10}$ mode.

$θ^{'} = arctan \frac{π}{β a}$

$θ^{''} = - arctan \frac{π}{β a}$

$θ^{'} = arctan \frac{1}{\sqrt{( f / f_{c 10} )^{2} - 1}}$

Table

Coaxial & Microstrip Lines

Coaxial

The coaxial line is a waveguide
Unlike the rectangular waveguide, coax supports the TEM mode, as well as higher order modes
- Field profiles for which can be found by solving the wave equations in cylindrical coordinates
Using the cutoff frequency for the $T E_{11}$ mode, the monomode frequency can be obtained
- The highest usable frequency before $T E_{11}$ mode starts to propagate
- Cutoff wave number $k_{c}$ is approximated as $k_{c} = 2/ (a + b)$
  - $a$ , $b$ are the radii of inner and outer sheaths of cable
- Cutoff frequency found as $k_{c} = 2 π f_{c} / v$
  - $v = c / \sqrt{ε_{r}}$
Most common coax cables and connectors are 50 Ohm
- Attenuation in an air-filled coax line is lowest at 77 Ohm
- Max power capacity is at 30 Ohms
- 50 Ohms is the tradeoff between the two
75 Ohms used in TV systems

Microstrip Lines

Microstrips are a conductor of width $W$ printed on a thin, grounded dielectric substrate of thickness $d$ and relative permittivity $ε_{r}$
If there were no dielectric substrate, then we'd have a two wire TEM line with $u_{p} = c$ and $β = k_{0}$ .
- We don't
- The dielectric complicates the analysis
- It's almost-TEM, kind of a hybrid
- Some field lines are in the air region above the substrate, so no pure TEM wave
Can approximate behaviour from quasi-static solutions
- $u_{p} = c / \sqrt{ε_{eff}}$
- $β = (ω / c) \sqrt{ε_{eff}}$
- $ε_{eff}$ is the effective dielectric constant of the microstrip
  - $1 < ε_{eff} < ε_{r}$

The effective dielectric constant can be interpreted as the dielectric constant of a homogeneous medium that equivalently replaces the air and dielectric regions of the microstrip line

$ε_{eff} = \frac{ε_{r} + 1}{2} + \frac{ε_{r} - 1}{2} \frac{1}{\sqrt{1 + 12 d / W}}$

The characteristic impedance can be calculated as:

$Z_{0} = \begin{cases} \frac{60}{\sqrt{ε_{eff}}} \ln \left( \frac{8 d}{W} + \frac{W}{4 d} \right) & \text{for } \frac{W}{d} \leq 1 \\ \frac{120 π}{\sqrt{ε_{eff}} \left( W / d + 1.393 + 0.667 \ln ( W / d + 1.444 ) \right)} & \text{for } \frac{W}{d} \geq 1 \end{cases}$

For a given $Z_{0}$ and $ε_{r}$ , we can also determine the ratio $W / d$

$\frac{W}{d} = \begin{cases} \frac{8 e^{A}}{e^{2 A} - 2} & \text{for } \frac{W}{d} < 2 \\ \frac{2}{π} \left( B - 1 - \ln (2 B - 1) + \frac{ε_{r} - 1}{2 ε_{r}} \left[ \ln (B - 1) + 0.39 - \frac{0.61}{ε_{r}} \right] \right) & \text{for } \frac{W}{d} > 2 \end{cases}$

$A = \frac{Z_{0}}{60} \sqrt{\frac{ε_{r} + 1}{2}} + \frac{ε_{r} - 1}{ε_{r} + 1} (0.23 + \frac{0.11}{ε_{r}})$

$B = \frac{377 π}{2 Z_{0} \sqrt{ε_{r}}}$
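
The quasi-static microstrip formulas are easy to get wrong by hand, so here is a sketch that implements the synthesis ( $W / d$ from $Z_{0}$ and $ε_{r}$ ) and analysis ( $Z_{0}$ from $W / d$ ) directions with the square roots written out explicitly. The function names are made up, and picking the branch by trying the narrow-strip formula first is an assumption about how to apply the $W / d < 2$ condition.

```python
import math


def effective_permittivity(eps_r, w_over_d):
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 / w_over_d)


def microstrip_z0(eps_r, w_over_d):
    """Characteristic impedance in Ohm for a given W/d."""
    eps_eff = effective_permittivity(eps_r, w_over_d)
    if w_over_d <= 1:
        return 60 / math.sqrt(eps_eff) * math.log(8 / w_over_d + w_over_d / 4)
    return 120 * math.pi / (
        math.sqrt(eps_eff) * (w_over_d + 1.393 + 0.667 * math.log(w_over_d + 1.444))
    )


def microstrip_w_over_d(z0, eps_r):
    """W/d needed for a target Z0 on a substrate with permittivity eps_r."""
    a = z0 / 60 * math.sqrt((eps_r + 1) / 2) + (eps_r - 1) / (eps_r + 1) * (0.23 + 0.11 / eps_r)
    narrow = 8 * math.exp(a) / (math.exp(2 * a) - 2)
    if narrow < 2:
        return narrow  # narrow-strip branch is self-consistent
    b = 377 * math.pi / (2 * z0 * math.sqrt(eps_r))
    return 2 / math.pi * (
        b - 1 - math.log(2 * b - 1)
        + (eps_r - 1) / (2 * eps_r) * (math.log(b - 1) + 0.39 - 0.61 / eps_r)
    )


w_d = microstrip_w_over_d(50.0, 2.2)
eps_eff = effective_permittivity(2.2, w_d)
```

For a 50 Ohm line on an $ε_{r} = 2.2$ substrate this gives $W / d$ of about 3.08 and an effective dielectric constant of about 1.87. Feeding the result back through `microstrip_z0` lands within about an Ohm of the target, which reflects the approximate nature of the formulas.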

Again considering the microstrip as a quasi-TEM line, we can determine the attenuation due to dielectric loss $a_{d}$ and conductor loss $a_{c}$

$a_{d} = \frac{k_{0} ε_{r} ( ε_{eff} - 1 ) \tan δ}{2 \sqrt{ε_{eff}} ( ε_{r} - 1 )}$

$a_{c} = \frac{R_{s}}{Z_{0} W}$

Where $R_{s} = \sqrt{ω μ_{0} /2 σ}$ is the surface resistivity of the conductor, and $\tan δ$ the loss tangent of the dielectric. For most substrates, $a_{c} > a_{d}$ .

Waveguide Discontinuities

Transmission lines often include discontinuities to perform an electrical function. Usually, these can be represented as equivalent circuits for analysis and design. Some common microstrip discontinuities and their equivalent lumped element circuits are shown below.

Striplines

A stripline is a planar transmission line used in microwave integrated circuits.

Thin conducting strip of width $W$ between two wide conducting ground plates of separation $b$ ,
- Between the ground plates is filled with dielectric
Supports usual TEM mode
- Can support higher-order modes, but these can usually be avoided by restricting spacing and geometry

Network Parameters

Impedance & Admittance Parameters

Consider an N-port microwave network
Forward and backward voltage and current waves can be defined for TEM waves
- Can define matrices of impedances(/admittances) to relate voltage and current port parameters to each other
Ports may be any type of transmission line for a single propagating mode
At a specified point on the $n$th port, a terminal plane $t_{n}$ is defined
- Terminal planes provide a phase reference for wave phasors
- Equivalent incident and reflected voltage and current also defined

At the $n$th terminal, total voltage and current are given by
- $V_{n} = V_{n}^{+} + V_{n}^{-}$
- $I_{n} = I_{n}^{+} + I_{n}^{-}$
- Assumes coordinate along which propagation occurs is zero at terminal

The impedance matrix $V = Z I$ relates these voltages and currents:

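$\begin{bmatrix} V_{1} \\ V_{2} \\ \vdots \\ V_{N} \end{bmatrix} = \begin{bmatrix} Z_{11} & Z_{12} & \dots & Z_{1N} \\ Z_{21} & Z_{22} & \dots & Z_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ Z_{N1} & Z_{N2} & \dots & Z_{NN} \end{bmatrix} \begin{bmatrix} I_{1} \\ I_{2} \\ \vdots \\ I_{N} \end{bmatrix}$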

Can similarly define an admittance matrix $I = YV$

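$\begin{bmatrix} I_{1} \\ I_{2} \\ \vdots \\ I_{N} \end{bmatrix} = \begin{bmatrix} Y_{11} & Y_{12} & \dots & Y_{1N} \\ Y_{21} & Y_{22} & \dots & Y_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ Y_{N1} & Y_{N2} & \dots & Y_{NN} \end{bmatrix} \begin{bmatrix} V_{1} \\ V_{2} \\ \vdots \\ V_{N} \end{bmatrix}$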

The two matrices are inverses of each other: $Y = Z^{- 1}$ . Both matrices relate total port voltages and currents.

$Z_{ij}$ can be found by driving port $j$ with current $I_{j}$ , open circuiting all other ports, and measuring the open circuit voltage at port $i$

$Z_{ij} = \left. \frac{V_{i}}{I_{j}} \right|_{I_{k} = 0 \text{ for } k \neq j}$

$Z_{ii}$ is the input impedance looking into port $i$ , and $Z_{ij}$ is the transfer impedance between ports $i$ and $j$ .

The admittance matrix parameters are found similarly:

$Y_{ij} = \left. \frac{I_{i}}{V_{j}} \right|_{V_{k} = 0 \text{ for } k \neq j}$

If a network is reciprocal (contains no active devices), then the matrix is symmetric
- $Y_{ij} = Y_{ji}$
- $Z_{ij} = Z_{ji}$
For a reciprocal lossless network, all the $Z_{ij}$ or $Y_{ij}$ elements are purely imaginary
- $R e (Z_{mn}) = 0$ for any $m$ and $n$

Any two port network can be reduced to an equivalent $T$ or $Π$ network:

Scattering Parameters

Direct measurement of total voltage and current is impractical at high frequencies, where what can actually be measured is the magnitude and phase of travelling waves
The scattering matrix representation is more in line with the direct measurement of waves
Provides a complete description of an $N$ -port network, relating incident and reflected waves on ports.

The S-matrix is defined $V^{-} = S V^{+}$

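$\begin{bmatrix} V_{1}^{-} \\ V_{2}^{-} \\ \vdots \\ V_{N}^{-} \end{bmatrix} = \begin{bmatrix} S_{11} & S_{12} & \dots & S_{1N} \\ S_{21} & S_{22} & \dots & S_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ S_{N1} & S_{N2} & \dots & S_{NN} \end{bmatrix} \begin{bmatrix} V_{1}^{+} \\ V_{2}^{+} \\ \vdots \\ V_{N}^{+} \end{bmatrix}$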

$S_{ij}$ is found by sending port $j$ an incident wave $V_{j}^{+}$ and measuring at port $i$ the reflected amplitude $V_{i}^{-}$ . The incident waves on the rest of the ports are set to 0, meaning all ports are terminated in matched loads to avoid reflections.

$S_{ij} = \left. \frac{V_{i}^{-}}{V_{j}^{+}} \right|_{V_{k}^{+} = 0 \text{ for } k \neq j}$

$S_{ii}$ is the reflection coefficient looking into port $i$
$S_{ij}$ is the transmission coefficient (gain) from port $j$ to port $i$
The scattering matrix for a reciprocal network is symmetric $S = S^{T}$
The scattering matrix for a lossless network is unitary $S^{T} S^{*} = I$
- $I$ is the identity matrix

Shifting Reference Planes

In the original network, the terminal planes are assumed to be at $z_{n} = 0$ , where $z_{n}$ is measured along the lossless line feeding the $n$th port. The matrix with this set of planes is $S$ . If the new reference planes are defined $z_{n} = l_{n}$ , then we get a new scattering matrix defined $V^{' -} = S^{'} V^{' +}$ . From travelling waves on a lossless line:

$V_{n}^{' +} = V_{n}^{+} e^{j β_{n} l_{n}}$

$V_{n}^{' -} = V_{n}^{-} e^{- j β_{n} l_{n}}$

We can use this shift to define $S^{'}$ in terms of $S$

$S^{'} = \begin{bmatrix} e^{- j β_{1} l_{1}} & 0 & \dots & 0 \\ 0 & e^{- j β_{2} l_{2}} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & e^{- j β_{N} l_{N}} \end{bmatrix} S \begin{bmatrix} e^{- j β_{1} l_{1}} & 0 & \dots & 0 \\ 0 & e^{- j β_{2} l_{2}} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & e^{- j β_{N} l_{N}} \end{bmatrix}$

Transmission (ABCD) Parameters

Practical microwave networks consist of a cascade connection of two or more 2-port networks. It is useful to define a 2x2 transmission, or ABCD matrix, for each 2-port network such that the transmission matrix of the cascade connection can be obtained as the product of the transmission matrices of the individual networks.

$\begin{bmatrix} V_{1} \\ I_{1} \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} V_{2} \\ I_{2} \end{bmatrix}$

Note the sign convention, which has $I_{1}$ flowing into port 1, and $I_{2}$ flowing out of port 2.

If two networks are cascaded, ie network 1 outputs into network 2, the transmission matrix of the cascaded network is the product of the two individually

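$\begin{bmatrix} V_{1} \\ I_{1} \end{bmatrix} = \begin{bmatrix} A_{1} & B_{1} \\ C_{1} & D_{1} \end{bmatrix} \begin{bmatrix} V_{2} \\ I_{2} \end{bmatrix}$

$\begin{bmatrix} V_{2} \\ I_{2} \end{bmatrix} = \begin{bmatrix} A_{2} & B_{2} \\ C_{2} & D_{2} \end{bmatrix} \begin{bmatrix} V_{3} \\ I_{3} \end{bmatrix}$

$\begin{bmatrix} V_{1} \\ I_{1} \end{bmatrix} = \begin{bmatrix} A_{1} & B_{1} \\ C_{1} & D_{1} \end{bmatrix} \begin{bmatrix} A_{2} & B_{2} \\ C_{2} & D_{2} \end{bmatrix} \begin{bmatrix} V_{3} \\ I_{3} \end{bmatrix}$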
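
Cascading is just 2x2 matrix multiplication, sketched below with plain tuples `(A, B, C, D)`. The helper names are invented for the example; the two element matrices used (a series impedance $Z$ has $A = 1, B = Z, C = 0, D = 1$ and a shunt admittance $Y$ has $A = 1, B = 0, C = Y, D = 1$ ) are the standard textbook results.

```python
def series(z):
    """ABCD parameters of a series impedance z."""
    return (1, z, 0, 1)


def shunt(y):
    """ABCD parameters of a shunt admittance y."""
    return (1, 0, y, 1)


def cascade(*networks):
    """Multiply ABCD matrices in order, first network nearest port 1."""
    a, b, c, d = 1, 0, 0, 1  # start from the identity matrix
    for a2, b2, c2, d2 in networks:
        a, b, c, d = (a * a2 + b * c2, a * b2 + b * d2,
                      c * a2 + d * c2, c * b2 + d * d2)
    return (a, b, c, d)


# 10 Ohm in series followed by 0.1 S in shunt
A, B, C, D = cascade(series(10), shunt(0.1))
```

The order matters, since matrix multiplication does not commute: swapping the two elements gives a different network. For reciprocal networks like this one, $A D - B C = 1$ .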

Some useful ABCD parameters for common networks are shown below

Port Parameter Conversion Table

Filters

Filters are two port networks used to control frequency response.

Insertion Loss Method

We utilise the insertion loss method to design microwave filters.

We define a filter response by its power loss ratio, the ratio of power available from the source to that delivered to the load.

$P_{LR} = \frac{P_{inc}}{P_{load}} = \frac{1}{1 - ∣Γ ( ω ) ∣^{2}}$

The insertion loss (in dB) is then

$IL = 10 \log_{10} P_{LR}$

As $∣Γ (ω) ∣^{2}$ is an even function of $ω$ , it can be expressed in terms of polynomials in $ω^{2}$ , which gives:

$P_{L R} = 1 + \frac{M ( ω ^{2} )}{N ( ω ^{2} )}$

By choosing coefficients of $M$ and $N$ , we can design filters with a specific frequency response.

Maximally Flat Response

Also known as binomial or Butterworth response. For a given filter order, it provides the flattest response in the passband. For a low pass filter of order $N$ with cutoff frequency $ω_{c}$ :

$P_{L R} = 1 + k^{2} (\frac{ω}{ω _{c}})^{2 N}$

At the cutoff frequency, the power loss ratio is $1 + k^{2}$ .
- If this is chosen as the -3 dB point then $k = 1$ ,
  - Usually the case
The first $2 N - 1$ derivatives are zero at $ω = 0$
For $ω ≫ ω_{c}$ , the insertion loss increases at a rate of $20 N$ dB/decade
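
The properties listed above can be checked with a few lines of Python. This is a minimal sketch with illustrative function names; it evaluates the maximally flat power loss ratio and confirms the 3 dB point at cutoff and the $20 N$ dB/decade roll-off.

```python
import math


def butterworth_plr(omega, omega_c, order, k=1.0):
    """Power loss ratio of a maximally flat low pass filter."""
    return 1.0 + k**2 * (omega / omega_c) ** (2 * order)


def insertion_loss_db(plr):
    """Insertion loss in dB for a given power loss ratio."""
    return 10.0 * math.log10(plr)


order = 3
# Insertion loss at the cutoff frequency with k = 1 (about 3 dB)
il_cutoff = insertion_loss_db(butterworth_plr(1.0, 1.0, order))

# Roll-off over one decade well above cutoff (about 20 * order dB)
il_low = insertion_loss_db(butterworth_plr(100.0, 1.0, order))
il_high = insertion_loss_db(butterworth_plr(1000.0, 1.0, order))
rolloff_per_decade = il_high - il_low

print(il_cutoff, rolloff_per_decade)
```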

Equal Ripple Response

A Chebyshev polynomial $T_{N} (x)$ is used to specify the insertion loss:

$P_{L R} = 1 + k^{2} T_{N}^{2} (ω / ω_{c})$

Results in a sharper cutoff
Passband response will have ripples of amplitude $1 + k^{2}$ , as $T_{N} (x)$ oscillates between $\pm 1$ for $∣ x ∣ \leq 1$
$k^{2}$ determines the passband ripple level
For large $x$ , $T_{N} (x) \approx (2 x)^{N} /2$
- For $ω ≫ ω_{c}$ , the power loss ratio is $(k^{2} /4) (2 ω / ω_{c})^{2 N}$
  - Increases at the same rate of $20 N$ dB per decade
For $ω ≫ ω_{c}$ , the power loss ratio at any given $ω$ is a factor of $(2^{2 N}) /4$ greater than that of the binomial filter
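
As a rough numerical check of the points above, the sketch below evaluates $T_{N} (x)$ with the standard recurrence $T_{N} (x) = 2 x T_{N - 1} (x) - T_{N - 2} (x)$ (an assumption, as the recurrence is not stated in these notes) and compares the equal ripple and maximally flat power loss ratios far above cutoff. The function names are illustrative.

```python
def chebyshev(n, x):
    """Chebyshev polynomial T_n(x) via the three-term recurrence."""
    t_prev, t_curr = 1.0, x  # T_0 and T_1
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def chebyshev_plr(omega, omega_c, order, k=1.0):
    """Power loss ratio of an equal ripple low pass filter."""
    return 1.0 + k**2 * chebyshev(order, omega / omega_c) ** 2


order = 3
omega = 100.0  # far above the normalised cutoff of 1
equal_ripple = chebyshev_plr(omega, 1.0, order)
max_flat = 1.0 + omega ** (2 * order)

# Ratio approaches 2^(2N) / 4 for omega >> omega_c
ratio = equal_ripple / max_flat
print(ratio, 2 ** (2 * order) / 4)
```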

Linear Phase Response

A linear phase response in the passband is important where signal distortion is to be avoided. A sharp-cutoff response is generally incompatible with a good phase response. Linear phase response can be achieved by:

$ϕ (ω) = A ω [1 + p (\frac{ω}{ω _{c}})^{2 N}]$

$ϕ (ω)$ is the phase of the voltage transfer function of the filter
$p$ is a constant

Normalised Design

We can normalise impedance and frequency values to simplify the design of filters.

Maximally Flat Response

Consider an LC circuit as shown below, with a source impedance of 1, a load impedance $R$ , and a cutoff frequency normalised to 1. The desired power loss ratio will be $1 + ω^{4}$ for $N = 2$ .

The power loss ratio of this filter can be derived from its input impedance and reflection coefficient:

$P_{L R} = \frac{∣ Z _{in} + 1 ∣ ^{2}}{2 ( Z _{in} + Z _{in}^{*} )} = 1 + ω^{4}$

This equation solves to give $L = C = \sqrt{2}$ and $R = 1$ , for the case $N = 2$ .

The same process can be repeated for different values of $N$ to give the element values for the ladder-type circuits shown. The values are numbered from $g_{0}$ (source impedance) to $g_{N + 1}$ (load impedance) for a filter with $N$ reactive elements alternating between series and shunt connections.
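
Rather than solving the power loss ratio equation for each order, the tabulated values for the maximally flat case can be reproduced from the closed-form expression $g_{k} = 2 \sin (\frac{(2 k - 1) π}{2 N})$ . This expression is an assumption taken from standard filter design references, not from these notes, and the function name is illustrative.

```python
import math


def butterworth_prototype(order):
    """Return [g_0, g_1, ..., g_N, g_(N+1)] for a maximally flat prototype.

    Assumes unity source and load (g_0 = g_(N+1) = 1) and a cutoff
    frequency normalised to 1.
    """
    elements = [
        2.0 * math.sin((2 * k - 1) * math.pi / (2 * order))
        for k in range(1, order + 1)
    ]
    return [1.0] + elements + [1.0]


for n in (1, 2, 3):
    print(n, [round(g, 4) for g in butterworth_prototype(n)])
```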

The graph shows attenuation vs normalised frequency for filter prototypes

Equal Ripple Response

For Chebyshev polynomials, $T_{N} (0) = 0$ when $N$ is odd, and $T_{N} (0) = \pm 1$ when $N$ is even, so there are two cases for the power loss ratio depending on $N$ . Considering the same LC circuit shown above, for even $N$ it can be shown that $R$ is not unity, so there will be an impedance mismatch if the load has a unity impedance, which can be corrected with a $λ /4$ transformer. For odd $N$ this is not an issue: it can be shown that $R = 1$ .

The tables for equal ripple responses depend on the passband ripple level.

Scaling

In the prototype designs above, the source and load resistances are all unity. A source resistance of $R_{0}$ is obtained by multiplying all the impedances of the prototype design by $R_{0}$

$L^{'} = R_{0} L \qquad C^{'} = \frac{C}{R_{0}} \qquad R_{s}^{'} = R_{0} \qquad R_{L}^{'} = R_{0} R_{L}$

To change the cutoff frequency from unity to $ω_{c}$ , replace $ω$ by $ω / ω_{c}$

Applying both impedance and frequency scaling, the new reactive element values are:

$L_{k}^{'} = \frac{R_{0} L_{k}}{ω_{c}} \qquad C_{k}^{'} = \frac{C_{k}}{R_{0} ω_{c}}$
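
A minimal sketch of the combined scaling, assuming a 50 Ω system and a 1 GHz cutoff purely for illustration (neither value comes from the notes):

```python
import math


def scale_inductor(l_k, r_0, omega_c):
    """Scaled inductance in henries from a normalised prototype value."""
    return r_0 * l_k / omega_c


def scale_capacitor(c_k, r_0, omega_c):
    """Scaled capacitance in farads from a normalised prototype value."""
    return c_k / (r_0 * omega_c)


r_0 = 50.0                      # assumed filter impedance in ohms
omega_c = 2.0 * math.pi * 1e9   # assumed cutoff of 1 GHz in rad/s

inductance = scale_inductor(1.0, r_0, omega_c)
capacitance = scale_capacitor(1.0, r_0, omega_c)
print(inductance, capacitance)
```

For equal prototype values the ratio $L^{'} / C^{'}$ equals $R_{0}^{2}$ , which is a quick sanity check on the scaled values.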

High Pass Transformation

The substitution $ω \leftarrow (- ω_{c} / ω)$ is used to convert a low pass to high pass response. This maps $ω = 0 \to \pm \infty$ and vice-versa.

The impedance and frequency scaling for mapping a normalised prototype to a high pass filter are:

$C_{k}^{'} = \frac{1}{R_{0} ω_{c} L_{k}} \qquad L_{k}^{'} = \frac{R_{0}}{ω_{c} C_{k}}$
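
The high pass mapping swaps element types: each prototype inductor becomes a capacitor and each prototype capacitor becomes an inductor. A short sketch, again assuming an illustrative 50 Ω, 1 GHz design:

```python
import math


def high_pass_capacitor(l_k, r_0, omega_c):
    """Capacitance (farads) replacing a prototype inductor l_k."""
    return 1.0 / (r_0 * omega_c * l_k)


def high_pass_inductor(c_k, r_0, omega_c):
    """Inductance (henries) replacing a prototype capacitor c_k."""
    return r_0 / (omega_c * c_k)


r_0 = 50.0                      # assumed filter impedance in ohms
omega_c = 2.0 * math.pi * 1e9   # assumed cutoff of 1 GHz in rad/s

hp_capacitance = high_pass_capacitor(2.0, r_0, omega_c)
hp_inductance = high_pass_inductor(2.0, r_0, omega_c)
print(hp_capacitance, hp_inductance)
```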

Filter Implementation

Lumped elements are fine at low frequencies but usually don't work at RF. Richards' transformations can be used to convert lumped elements to transmission line sections:

$j X_{L} = j L \tan (βl) \qquad j B_{C} = j C \tan (βl)$

The stub length of the lines is $λ /8$ at $ω_{c}$ with unity impedance.

The Kuroda identities can convert shunt to series. Each box represents a transmission line of the indicated characteristic impedance at length $λ /8$ at $ω_{c}$ . The inductors and capacitors represent short and open circuit stubs, respectively.

$n^{2} = 1 + Z_{2} / Z_{1}$

Stepped-Impedance Low Pass Filters

Low pass filters can be implemented in microstrip using alternating sections of high and low impedance lines. For a low-pass filter prototype, the series inductors can be replaced by high impedance line sections $Z_{0} = Z_{h}$ , and the shunt capacitors by low impedance line sections $Z_{0} = Z_{l}$ . The ratio $Z_{h} / Z_{l}$ should be as large as can practically be fabricated. The lengths of the lines can then be determined from:

$βl = \frac{L R_{0}}{Z_{h}} \text{ (inductor)} \qquad βl = \frac{C Z_{l}}{R_{0}} \text{ (capacitor)}$

Where $R_{0}$ is the filter impedance, and $L$ and $C$ are the normalised element values from the prototype. To obtain the best response, the lengths should be evaluated at the cutoff frequency.
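
A small sketch of the length calculation, with assumed values $R_{0} = 50$ Ω, $Z_{h} = 120$ Ω and $Z_{l} = 20$ Ω chosen only for illustration:

```python
import math


def inductor_section_length(l_norm, r_0, z_high):
    """Electrical length (radians) of a high impedance section."""
    return l_norm * r_0 / z_high


def capacitor_section_length(c_norm, r_0, z_low):
    """Electrical length (radians) of a low impedance section."""
    return c_norm * z_low / r_0


r_0, z_high, z_low = 50.0, 120.0, 20.0  # assumed impedances in ohms

# Normalised element values 1.0 (series L) and 2.0 (shunt C)
inductor_degrees = math.degrees(inductor_section_length(1.0, r_0, z_high))
capacitor_degrees = math.degrees(capacitor_section_length(2.0, r_0, z_low))
print(inductor_degrees, capacitor_degrees)
```

The approximation behind these formulas relies on short sections, so resulting electrical lengths much larger than about 45° suggest the impedance ratio is too small.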

Power Dividers, Couplers & Resonators

Power dividers divide one input signal into two or more output signals. Power couplers take two or more inputs and combine them into a single output.

Wilkinson Power Divider

The equal split (3dB) Wilkinson power divider will be considered, although it can be designed to give arbitrary power division.

The circuit is formed of two $λ /4$ lines of characteristic impedance $\sqrt{2} Z_{0}$ , with a resistor of $2 Z_{0}$ connected in shunt across the output ends of the two lines. The scattering parameters:

$S_{11} = 0$
- $Z_{in} = Z_{0}$ at port 1, the input
$S_{22} = S_{33} = 0$
- Ports 2 and 3 are matched for even and odd modes of excitation
$S_{12} = S_{21} = - j / \sqrt{2}$
- Symmetry due to reciprocity
$S_{13} = S_{31} = - j / \sqrt{2}$
$S_{23} = S_{32} = 0$
- Due to short or open circuit at the bisection
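
The scattering matrix of the equal split divider can be assembled and inspected as below. This is a sketch using $- j / \sqrt{2}$ for the transmission terms, so that each output receives half the input power; the helper name is illustrative.

```python
import math

s = -1j / math.sqrt(2.0)

# Rows and columns are ports 1, 2, 3
S = [
    [0, s, s],
    [s, 0, 0],
    [s, 0, 0],
]


def output_powers(matrix, input_index, power_in=1.0):
    """Power leaving each port when only one port is driven."""
    return [abs(row[input_index]) ** 2 * power_in for row in matrix]


# Driving port 1: power splits equally between ports 2 and 3
from_port_1 = output_powers(S, 0)

# Driving port 2: half reaches port 1, none reaches port 3,
# and the remainder is dissipated in the resistor
from_port_2 = output_powers(S, 1)
print(from_port_1, from_port_2)
```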

Directional Coupler

A directional coupler is shown below

Power supplied to port 1 is coupled to port 3
- The coupled port
The remainder of the input power is delivered to port 2
- The through port
No power is delivered to port 4
- The isolated port

The quantities used to characterize a directional coupler:

Coupling factor $C = 10 \log_{10} (P_{1} / P_{3})$ - the fraction of the input power that is coupled to the coupled port
Directivity $D = 10 \log_{10} (P_{3} / P_{4})$ - a measure of the coupler's ability to isolate forward and backward waves
Isolation $I = 10 \log_{10} (P_{1} / P_{4})$ - a measure of the power delivered to the isolated port
Insertion loss $L = 10 \log_{10} (P_{1} / P_{2})$ - a measure of the power delivered to the through port
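
The four quantities follow directly from the port powers. The sketch below uses made-up power levels for a nominal 20 dB coupler and shows that $I = C + D$ when all three are expressed in dB.

```python
import math


def db_ratio(p_numerator, p_denominator):
    """Power ratio expressed in dB."""
    return 10.0 * math.log10(p_numerator / p_denominator)


def coupler_figures(p1, p2, p3, p4):
    """Coupling, directivity, isolation and insertion loss in dB."""
    return {
        "coupling": db_ratio(p1, p3),
        "directivity": db_ratio(p3, p4),
        "isolation": db_ratio(p1, p4),
        "insertion_loss": db_ratio(p1, p2),
    }


# Hypothetical port powers in watts, chosen for illustration
figures = coupler_figures(p1=1.0, p2=0.98, p3=0.01, p4=1e-5)
print(figures)
```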

The quadrature hybrid directional coupler is a 3dB directional coupler with all ports matched, and input power divided evenly between ports 2 and 3. No power is coupled to port 4. The coupler is symmetrical, and any port can be used as the input/output ports.

$S = \frac{- 1}{\sqrt{2}} \begin{bmatrix} 0 & j & 1 & 0 \\ j & 0 & 0 & 1 \\ 1 & 0 & 0 & j \\ 0 & 1 & j & 0 \end{bmatrix}$
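
As a quick check on the quadrature hybrid, the sketch below builds its scattering matrix with a $- 1/ \sqrt{2}$ prefactor and verifies that it is unitary, as expected for a lossless network. The helper name is illustrative.

```python
import math

scale = -1.0 / math.sqrt(2.0)
pattern = [
    [0, 1j, 1, 0],
    [1j, 0, 0, 1],
    [1, 0, 0, 1j],
    [0, 1, 1j, 0],
]
S = [[scale * entry for entry in row] for row in pattern]


def is_unitary(matrix, tol=1e-12):
    """True if the rows of the matrix are orthonormal."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            dot = sum(
                matrix[i][k] * matrix[j][k].conjugate() for k in range(n)
            )
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return True


# Fraction of port 1 input power reaching ports 2, 3 and 4
split = [abs(S[row][0]) ** 2 for row in (1, 2, 3)]
print(is_unitary(S), split)
```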

Resonators

RLC Resonators

Resonators at microwave frequency are similar to lumped element RLC circuits:

Input impedance:

$Z_{in} = R + jω L - j \frac{1}{ω C}$

Power delivered:

$P_{in} = \frac{1}{2} Z_{in} \frac{∣ V _{in} ∣ ^{2}}{∣ Z _{in} ∣ ^{2}}$

The resistor dissipates power $P_{loss}$ , while the inductor and capacitor store magnetic and electric energy $W_{m}$ and $W_{e}$ respectively:

$P_{loss} = \frac{1}{2} |I|^{2} R \qquad W_{m} = \frac{1}{4} |I|^{2} L \qquad W_{e} = \frac{1}{4} |I|^{2} \frac{1}{ω^{2} C}$

At the resonant frequency of $ω_{0} = 1/ \sqrt{L C}$ , $W_{m} = W_{e}$ and $Z_{in} = R$

The quality factor of a resonant circuit is defined as $ω$ times the ratio of the average energy stored to the power dissipated:

$Q (ω) = ω \frac{W _{m} + W _{e}}{P _{l oss}}$

$Q$ measures the loss of the circuit: higher $Q$ means lower loss. An external connecting network may introduce additional loss, so the $Q$ of the resonator itself (the unloaded $Q$ ) is:

$Q_{0} = \frac{1}{ω _{0} RC}$

The input impedance of the series resonator at a frequency $ω = ω_{0} + Δ ω$ , where $Δ ω$ is small:

$Z_{in} \approx R + j 2 L Δ ω = R + j \frac{2 R Q_{0} Δ ω}{ω_{0}}$

A lossy resonator can be modelled as a lossless resonator whose resonant frequency $ω_{0}$ is replaced by a complex effective resonant frequency $ω_{0} (1 + \frac{j}{2 Q_{0}})$ .

When the frequency $ω$ is such that $|Z_{in}|^{2} = 2 R^{2}$ , the real power delivered to the circuit is half that delivered at $ω_{0}$ . We use this to define the fractional (half-power) bandwidth as $1/ Q_{0}$ .

The same analysis can be done for a parallel RLC resonator. The properties of the two are compared in the table below

In general, resonators are coupled to other circuitry, which lowers the overall $Q$ to the loaded value $Q_{L}$ . If we couple a resonant circuit to an external load $R_{L}$ and define the external quality factor $Q_{e}$ , then:

$\frac{1}{Q _{L}} = \frac{1}{Q _{e}} + \frac{1}{Q _{0}}$

For a series RLC resonator, effective resistance is $R + R_{L}$ , and $Q_{e} = ω_{0} L / R_{L}$ . In parallel, effective resistance is $(1/ R + 1/ R_{L})^{- 1}$ and $Q_{e} = R_{L} / (ω_{0} L)$
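
The series resonator relations can be tied together in a short sketch. The component values are arbitrary, and the resonant frequency is taken as $ω_{0} = 1/ \sqrt{L C}$ .

```python
import math


def series_rlc(r, l, c, r_load=None):
    """Resonant frequency and Q factors of a series RLC resonator."""
    omega_0 = 1.0 / math.sqrt(l * c)
    q_0 = 1.0 / (omega_0 * r * c)  # unloaded Q
    result = {"omega_0": omega_0, "q_0": q_0}
    if r_load is not None:
        q_e = omega_0 * l / r_load              # external Q
        q_l = 1.0 / (1.0 / q_e + 1.0 / q_0)     # loaded Q
        result.update({"q_e": q_e, "q_l": q_l})
    return result


# Illustrative values: 1 ohm, 10 nH, 1 pF, coupled to a 50 ohm load
resonator = series_rlc(r=1.0, l=10e-9, c=1e-12, r_load=50.0)
print(resonator)
```

The loaded $Q$ comes out equal to $ω_{0} L / (R + R_{L})$ , consistent with the effective resistance $R + R_{L}$ .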

Transmission Line Resonators

Open Circuit $λ /2$ line

A practical resonator that is often used in microstrip circuits is an open circuit length of transmission line of length $λ /2$ , which behaves as a parallel resonator circuit. The input impedance is:

$Z_{in} = Z_{0} coth ((α + j β) l)$

In practice, low loss transmission lines are used, so we can approximate $tanh (α l) \approx α l$ . Using $ω = ω_{0} + Δ ω$ again for small $Δ ω$ near to the resonant frequency, we have:

$Z_{in} \approx \frac{Z_{0}}{α l + j (π Δ ω / ω_{0})}$

At resonance, $βl = π$ , the unloaded $Q$ of this resonator is:

$Q_{0} = ω_{0} RC = \frac{π}{2 α l} = \frac{β}{2 α}$

Gap-Coupled Microstrip Resonator

Consider a $λ /2$ open-circuit microstrip gap-coupled to the end of a microstrip transmission line. The normalised input impedance is:

$z (ω) = \frac{Z}{Z_{0}} = - j \frac{\tan (βl) + b_{c}}{b_{c} \tan (βl)} \qquad b_{c} = Z_{0} ω C$

The resonant frequency occurs when $z = 0$ , ie when $tan (βl) = - b_{c}$

The first resonant frequency $ω_{1}$ is close to the resonant frequency of the unloaded resonator, so we have $b_{c} ≪ 1$ . The coupling of the resonator to the feedline has the effect of lowering its resonant frequency.

The presence of a coupling capacitor turns the uncoupled $λ /2$ line from a parallel to a series RLC circuit near resonance. At resonance:

$R = \frac{Z _{0} π}{2 Q _{0} b _{c}^{2}}$

For critical coupling ( $Q_{0} / Q_{e} = 1$ ), the resonator is matched to the feedline:

$R = Z_{0} \qquad b_{c} = \sqrt{\frac{π}{2 Q_{0}}}$
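
Setting $R = Z_{0}$ in the expression for $R$ at resonance gives $b_{c}^{2} = π / (2 Q_{0})$ , from which the coupling capacitance follows via $b_{c} = Z_{0} ω C$ . The sketch below works through this for an assumed $Q_{0} = 100$ , 50 Ω line and 5 GHz resonance; these numbers are illustrative only.

```python
import math


def critical_coupling(q_0, z_0, frequency_hz):
    """Normalised susceptance, gap capacitance and electrical length
    of a critically coupled gap-coupled half-wave resonator."""
    b_c = math.sqrt(math.pi / (2.0 * q_0))
    capacitance = b_c / (z_0 * 2.0 * math.pi * frequency_hz)
    # First resonance: tan(beta * l) = -b_c, just below beta * l = pi
    beta_l = math.pi - math.atan(b_c)
    return b_c, capacitance, beta_l


b_c, capacitance, beta_l = critical_coupling(
    q_0=100.0, z_0=50.0, frequency_hz=5e9
)
print(b_c, capacitance, math.degrees(beta_l))
```

The electrical length at resonance is slightly less than $π$ , which reflects the lowering of the resonant frequency by the coupling capacitor.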
