# Graphs

A graph is a collection of edges and vertices, a pair $(V,E)$, where

- $V$ is a set of nodes, called
*vertices* - $E$ is a collection of pairs of vertices, called
*edges* - Vertices and edges are positions and store elements

Examples of graphs include routes between locations, users of a social network and their friendships, and the internet.

There are a number of different types of edge in a graph, depending upon what the edge represents:

*Directed*edge- Ordered pair of vertices $(u,v)$
- First vertex $u$ is the origin
- Second vertex $v$ is the destination
- For example, a journey between two points

*Undirected*edge- Unordered pair of vertices $(u,v)$

- In a directed graph, all edges are directed
- In an undirected graph, all edged are undirected

## Graph Terminology

- Adjacent vertices
- Two vertices $U$ and $V$ are adjacent (ie connected by an edge)

- Edges incident on a vertex
- The edges connect to a vertex
- $a$, $d$, and $b$ are incident on $V$

- End vertices or endpoints of an edge
- The vertices connected to an edge
- $U$ and $V$ are endpoints of $a$

- The degree of a vertex
- The number of edges connected to it
- $X$ has degree 5

- Parallel edges
- Edges that make the same connection
- $h$ and $i$ are parallel

- Self-loop
- An edge that has the same vertex at both ends
- $j$ is a self-loop

- Path
- A sequence of alternating vertices and edges
- Begins and ends with a vertex
- Each edge is preceded and followed by its endpoints
- $P_{1}=(V,b,X,h,Z)$ is a simple path

- Cycle
- A circular sequence of alternating vertices and edges
- A circular path

- A simple cycle is one where all edges and vertices are distinct
- A non-simple cycle contains an edge or vertex more than once
- A graph without cycles (acyclic) is a tree

- A circular sequence of alternating vertices and edges
- Length
- The number of edges in a path
- The number of edges in a cycle

### Graph Properties

Notation:

- $n$ is the number of vertices
- $m$ is the number of edges
- $g(v)$ is the degree of vertex $v$

The sum of the degrees of the vertices of a graph is always an even number. Each edge is counted twice, as it connects to two vertices, so $∑_{v}g(v)=2m$. For example, the graph shown has $n=4$ and $m=6$. $g(v)=3⇒∑_{v}g(v)=2m=12$

In an undirected graph with no self loops and no multiple edges, $m≤n2n−1 $. Each vertex has degree at most $(n−1)$ and $∑_{v}g(v)=2m$. For the graph shown, $m=6≤n2n−1 =6$

## The Graph ADT

A graph is a collection of vertices and edges, which are modelled as a combination of 3 data types: `Vertex`

, `Edge`

and `Graph`

.

- A
`Vertex`

is just a box object storing an element provided by the user - An
`Edge`

also stores an associated value which can be retrieved

```
public interface Graph{
int numVertices();
Collection vertices(); //returns all the graph's vertices
int numEdges();
Collection<Edge> edges(); //returns all the graph's edges
Edge getEdge(u,v); //returns the edge between u and v, if on exists
// for an undirected graph getEdge(u,v) == getEdge(v,u)
Pair<Vertex, Vertex> endVertices(e); //returns the endpoint vertices of edge e
Vertex oppsite(v,e); //returns the vertex adjacent to v along edge e
int outDegree(v); //returns the number of edges going out of v
int inDegree(v); //returns the number of edges coming into v
//for an undirected graph, inDegree(v) == outDegree(v)
Collection<Vertex> outgoingEdges(v); //returns all edges that point out of vertex v
Collection<Vertex> incomingEdges(v); //returns all edges that point into vertex v
//for an undirected graph, incomingEdges(v) == outgoingEdges(v)
Vertex insertVertex(x); //creates and returns a new vertex storing element x
Edge insertEdge(u,v,x); //creates and returns a new edge from vertices u to v, storing element x in the edge
void removeVertex(v); //removes vertex v and all incident edges from the graph
void removeEdge(e); //removes edge e from the graph
}
```

### Representations

There are many different ways to represent a graph in memory.

#### Edge List

An edge list is just a list of edges, where each edge knows which two vertices it points to.

- The
`Edge`

object stores- It's element
- It's origin
`Vertex`

- It's destination
`Vertex`

- The edge list stores a sequence of
`Edge`

objects

#### Adjacency List

In an adjacency list, each vertex stores an array of the vertices adjacent to it.

- The
`Vertex`

object stores- It's element
- A collection/array of all it's incident edges

- The adjacency list stores all
`Vertex`

Objects

#### Adjacency Matrix

An adjacency matrix is an $n×n$ matrix, where $n$ is the number of vertices in the graph. It acts as a lookup table, where each cell corresponds to an edge between two vertices.

- If there is an edge between two vertices $u$ and $v$, the matrix cell $(u,v)$ will contain the edge.
- Undirected graphs are symmetrical along the leading diagonal

## Subgraphs

- A subgraph $S$ of a graph $G$ is a graph such that:
- The vertices of $S$ are a subset of the vertices of $G$
- The edges of $S$ are a subset of the edges of $G$

- A spanning subgraph of $G$ is a subgraph that contains
*all*the vertices of $G$ - A graph is
*connected*if there is a path between every pair of vertices - A tree is an undirected graph $T$ such that
- $T$ is connected
- $T$ has no cycles

- A forest is an undirected graph without cycles
- The connected components of a forest are trees

- A spanning tree of a connected graph is a spanning subgraph that has all vertices covered with a minimum possible number of edges
- A spanning tree is not unique unless the graph is a tree
- Multiple spanning trees exist

- Spanning trees have applications in the design of communication networks
- A spanning forest of a graph is a spanning subgraph that is a forest

- A spanning tree is not unique unless the graph is a tree

## Depth First Search

DFS is a general technique for traversing a graph. A DFS traversal of a graph $G$ will:

- Visit all vertices and edges of $G$
- Determine whether $G$ is connected
- Compute the spanning components of $G$
- Compute the spanning forest of $G$

DFS on a graph with $n$ vertices and $m$ edges takes $O(n+m)$ time. The algorithm is:

- For a graph $G$ and a vertex $u$ of $G$
- Mark vertex $u$ as visited
- For each of $u$'s outgoing edges $e=(u,v)$
- If $v$ has not been visited then
- Record $e$ as the discovery edge for vertex $v$
- Recursively call DFS with on $v$

- If $v$ has not been visited then

`DFS(G,V)`

visits all vertices and edges in the connected component of `v`

, and the discovery edges labelled by `DFS(G,V)`

form a spanning tree of the connected component of `v`

.

DFS can also be extended to path finding, to find a path between two given vertices $u$ and $v$. A stack is used to keep track of the path, and the final state of the stack is the path between the two vertices. As soon as the destination vertex $v$ is encountered, the contents of the stack is returned.

DFS can be used for cycle detection too. A stack is used to keep track of the path between the start vertex and the current vertex. As soon as a back edge $(v,w)$ (an edge we have already been down in the opposite direction) is encountered, we return the cycle as the portion of the stack from the top to the vertex $w$.

To perform DFS on every connected component of a graph, we can loop over every vertex, doing a new DFS from each unvisited one. This will detect all vertices in graphs with multiple connected components.

## Breadth First Search

BFS is another algorithm for graph traversal, similar to DFS. It also requires $O(n+m)$ time. The difference between the two is that BFS uses a stack while DFS uses a queue. The algorithm is as follows:

- Mark all vertices and edges as unexplored
- Create a new queue
- Add the starting vertex $s$ to the queue
- Mark $s$ as visited
- While the queue is not empty
- Pop a vertex $v$ from the queue
- For all neighbouts $w$ of $v$
- If $w$ is not visited
- Push $w$ into the queue
- Mark $w$ as visited

- If $w$ is not visited

For a connected component $G_{s}$ of graph $G$ containing $s$:

- BFS visits all vertices and edges of $G_{s}$
- The discovery edges labelled by
`BFS(G,s)`

form a spanning tree of $G_{s}$ - The path of the spanning tree formed by the BFS is the shortest path between the two vertices

BFS can be specialised to solve the following problems in $O(n+m)$ time:

- Compute the connected components of a graph
- Compute a spanning forest of a graph
- Find a simple cycle in G
- Find the shortest path between two vertices
**DFS cannot do this, this property is unique to BFS**

## Directed Graphs

A *digraph* (short for directed graph) is a graph whose edges are all directed.

- Each edge goes in only one direction
- Edge $(a,b)$ goes from a to b but
**not**from b to a - If the graph is simple and has $n$ vertices and $m$ edges, $m≤n2n−1 $
- DFS and BFS can be specialised to traversing directed edges
- A directed DFS starting at a vertex $s$ determines the vertices
*reachable*from $s$ - One vertex is reachable from another if there is a directed path to it

- A directed DFS starting at a vertex $s$ determines the vertices

### Strong Connectivity

A digraph is said to be *strongly connected* if each vertex can reach all other vertices. This property can be identified in $O(n+m)$ time with the following algorithm:

- Pick a vertex $v$ in the graph $G$
- Perform a DFS starting from $v$
- If theres a vertex not visited, return false

- Let $G_{′}$ be $G$ with all the edge directions reversed
- Perform a DFS starting from $v$ in $G_{′}$
- If theres a vertex not visited, return false
- Else, return True

### Transitive Closure

Given a digraph $G$, the transitive closure of $G$ is the digraph $G∗$ such that:

- $G∗$ has the same vertices as $G$
- If $G$ has a directed path from $u$ to $v$, then G* also has a directed *edge* from $u$ to $v$
- In $G∗$, every pair of vertices with a path between them in $G$ is now adjacent
- The transitive closure provides reachability information about a digraph

The transitive closure can be computed by doing a DFS starting at each vertex. However, this takes $O(n(n+m))$ time. Alternatively, there is the Floyd-Warshall algorithm:

- For the graph $G$, number the vertices $1,2,...,n$
- Compute the graphs $G_{0},...,G_{n}$
- $G_{0}=G$
- $G_{k}$ has directed edge $(v_{i},v_{j})$ if $G$ has a directed path from $v_{i}$ to $v_{j}$ with intermediate vertices $v_{1},...,v_{k}$

- Digraph $G_{k}$ is computed from $G_{k−1}$
- $G_{n}=G∗$
- Add $(v_{i},v_{j})$ if edges $(v_{i},v_{k})$ and $(v_{k},v_{j})$ appear in $G_{k−1}$

In pseudocode:

```
for k=1 to n
Gk = Gk_1
for i=1 to n (i != k)
for j=1 to n (j != i, j!=k)
if G_(k-1).areAdjacent(vi,vk) && G_(k-1).areAdjacent(vk,vj)
if !G_(k-1).areAdjacent(vi,vj)
G_k.insertDirectedEdge(vi,vj,k)
return G_n
```

This algorithm takes $O(n_{3})$ time. Basically, at each iteration a new vertex is introduced, and each vertex is checked to see if a path exists through the newly added vertex. If it does, a directed edge is inserted to transitively close the graph.

### Topological Ordering

- A Directed Acyclic Graph (DAG) is digraph that has no directed cycles
- A topological ordering of a digraph is a numbering $v_{1},v_{2},...,v_{n}$ of the vertices such that for every edge $(v_{i},v_{j})$, $i<j$
- The vertex it points to is always greater than it

- A digraph can have a topological ordering
**if and only if it is a DAG**

A topological ordering can be calculated using a DFS:

```
public static void topDFS(Graph G, Vertex v){
v.visited = true
for(Edge e: v.edges){
w = opposite(v,e)
if(w.visited = false)
topDFS(G,w)
else{
v.label = n
n = n-1
}
}
}
```

The first node encountered in the DFS is assigned $n$, the one after that $n−1$, and so on until all nodes are labelled.