Graphs
A graph is a collection of edges and vertices, a pair , where
- is a set of nodes, called vertices
- is a collection of pairs of vertices, called edges
- Vertices and edges are positions and store elements
Examples of graphs include routes between locations, users of a social network and their friendships, and the internet.
There are a number of different types of edge in a graph, depending upon what the edge represents:
- Directed edge
- Ordered pair of vertices
- First vertex is the origin
- Second vertex is the destination
- For example, a journey between two points
- Undirected edge
- Unordered pair of vertices
- In a directed graph, all edges are directed
- In an undirected graph, all edged are undirected
Graph Terminology
- Adjacent vertices
- Two vertices and are adjacent (ie connected by an edge)
- Edges incident on a vertex
- The edges connect to a vertex
- , , and are incident on
- End vertices or endpoints of an edge
- The vertices connected to an edge
- and are endpoints of
- The degree of a vertex
- The number of edges connected to it
- has degree 5
- Parallel edges
- Edges that make the same connection
- and are parallel
- Self-loop
- An edge that has the same vertex at both ends
- is a self-loop
- Path
- A sequence of alternating vertices and edges
- Begins and ends with a vertex
- Each edge is preceded and followed by its endpoints
- is a simple path
- Cycle
- A circular sequence of alternating vertices and edges
- A circular path
- A simple cycle is one where all edges and vertices are distinct
- A non-simple cycle contains an edge or vertex more than once
- A graph without cycles (acyclic) is a tree
- A circular sequence of alternating vertices and edges
- Length
- The number of edges in a path
- The number of edges in a cycle
Graph Properties
Notation:
- is the number of vertices
- is the number of edges
- is the degree of vertex
The sum of the degrees of the vertices of a graph is always an even number. Each edge is counted twice, as it connects to two vertices, so . For example, the graph shown has and .
In an undirected graph with no self loops and no multiple edges, . Each vertex has degree at most and . For the graph shown,
The Graph ADT
A graph is a collection of vertices and edges, which are modelled as a combination of 3 data types: Vertex
, Edge
and Graph
.
- A
Vertex
is just a box object storing an element provided by the user - An
Edge
also stores an associated value which can be retrieved
public interface Graph{
int numVertices();
Collection vertices(); //returns all the graph's vertices
int numEdges();
Collection<Edge> edges(); //returns all the graph's edges
Edge getEdge(u,v); //returns the edge between u and v, if on exists
// for an undirected graph getEdge(u,v) == getEdge(v,u)
Pair<Vertex, Vertex> endVertices(e); //returns the endpoint vertices of edge e
Vertex oppsite(v,e); //returns the vertex adjacent to v along edge e
int outDegree(v); //returns the number of edges going out of v
int inDegree(v); //returns the number of edges coming into v
//for an undirected graph, inDegree(v) == outDegree(v)
Collection<Vertex> outgoingEdges(v); //returns all edges that point out of vertex v
Collection<Vertex> incomingEdges(v); //returns all edges that point into vertex v
//for an undirected graph, incomingEdges(v) == outgoingEdges(v)
Vertex insertVertex(x); //creates and returns a new vertex storing element x
Edge insertEdge(u,v,x); //creates and returns a new edge from vertices u to v, storing element x in the edge
void removeVertex(v); //removes vertex v and all incident edges from the graph
void removeEdge(e); //removes edge e from the graph
}
Representations
There are many different ways to represent a graph in memory.
Edge List
An edge list is just a list of edges, where each edge knows which two vertices it points to.
- The
Edge
object stores- It's element
- It's origin
Vertex
- It's destination
Vertex
- The edge list stores a sequence of
Edge
objects
Adjacency List
In an adjacency list, each vertex stores an array of the vertices adjacent to it.
- The
Vertex
object stores- It's element
- A collection/array of all it's incident edges
- The adjacency list stores all
Vertex
Objects
Adjacency Matrix
An adjacency matrix is an matrix, where is the number of vertices in the graph. It acts as a lookup table, where each cell corresponds to an edge between two vertices.
- If there is an edge between two vertices and , the matrix cell will contain the edge.
- Undirected graphs are symmetrical along the leading diagonal
Subgraphs
- A subgraph of a graph is a graph such that:
- The vertices of are a subset of the vertices of
- The edges of are a subset of the edges of
- A spanning subgraph of is a subgraph that contains all the vertices of
- A graph is connected if there is a path between every pair of vertices
- A tree is an undirected graph such that
- is connected
- has no cycles
- A forest is an undirected graph without cycles
- The connected components of a forest are trees
- A spanning tree of a connected graph is a spanning subgraph that has all vertices covered with a minimum possible number of edges
- A spanning tree is not unique unless the graph is a tree
- Multiple spanning trees exist
- Spanning trees have applications in the design of communication networks
- A spanning forest of a graph is a spanning subgraph that is a forest
- A spanning tree is not unique unless the graph is a tree
Depth First Search
DFS is a general technique for traversing a graph. A DFS traversal of a graph will:
- Visit all vertices and edges of
- Determine whether is connected
- Compute the spanning components of
- Compute the spanning forest of
DFS on a graph with vertices and edges takes time. The algorithm is:
- For a graph and a vertex of
- Mark vertex as visited
- For each of 's outgoing edges
- If has not been visited then
- Record as the discovery edge for vertex
- Recursively call DFS with on
- If has not been visited then
DFS(G,V)
visits all vertices and edges in the connected component of v
, and the discovery edges labelled by DFS(G,V)
form a spanning tree of the connected component of v
.
DFS can also be extended to path finding, to find a path between two given vertices and . A stack is used to keep track of the path, and the final state of the stack is the path between the two vertices. As soon as the destination vertex is encountered, the contents of the stack is returned.
DFS can be used for cycle detection too. A stack is used to keep track of the path between the start vertex and the current vertex. As soon as a back edge (an edge we have already been down in the opposite direction) is encountered, we return the cycle as the portion of the stack from the top to the vertex .
To perform DFS on every connected component of a graph, we can loop over every vertex, doing a new DFS from each unvisited one. This will detect all vertices in graphs with multiple connected components.
Breadth First Search
BFS is another algorithm for graph traversal, similar to DFS. It also requires time. The difference between the two is that BFS uses a stack while DFS uses a queue. The algorithm is as follows:
- Mark all vertices and edges as unexplored
- Create a new queue
- Add the starting vertex to the queue
- Mark as visited
- While the queue is not empty
- Pop a vertex from the queue
- For all neighbouts of
- If is not visited
- Push into the queue
- Mark as visited
- If is not visited
For a connected component of graph containing :
- BFS visits all vertices and edges of
- The discovery edges labelled by
BFS(G,s)
form a spanning tree of - The path of the spanning tree formed by the BFS is the shortest path between the two vertices
BFS can be specialised to solve the following problems in time:
- Compute the connected components of a graph
- Compute a spanning forest of a graph
- Find a simple cycle in G
- Find the shortest path between two vertices
- DFS cannot do this, this property is unique to BFS
Directed Graphs
A digraph (short for directed graph) is a graph whose edges are all directed.
- Each edge goes in only one direction
- Edge goes from a to b but not from b to a
- If the graph is simple and has vertices and edges,
- DFS and BFS can be specialised to traversing directed edges
- A directed DFS starting at a vertex determines the vertices reachable from
- One vertex is reachable from another if there is a directed path to it
Strong Connectivity
A digraph is said to be strongly connected if each vertex can reach all other vertices. This property can be identified in time with the following algorithm:
- Pick a vertex in the graph
- Perform a DFS starting from
- If theres a vertex not visited, return false
- Let be with all the edge directions reversed
- Perform a DFS starting from in
- If theres a vertex not visited, return false
- Else, return True
Transitive Closure
Given a digraph , the transitive closure of is the digraph such that:
- has the same vertices as
- If has a directed path from to , then G* also has a directed *edge* from to
- In , every pair of vertices with a path between them in is now adjacent
- The transitive closure provides reachability information about a digraph
The transitive closure can be computed by doing a DFS starting at each vertex. However, this takes time. Alternatively, there is the Floyd-Warshall algorithm:
- For the graph , number the vertices
- Compute the graphs
- has directed edge if has a directed path from to with intermediate vertices
- Digraph is computed from
- Add if edges and appear in
In pseudocode:
for k=1 to n
Gk = Gk_1
for i=1 to n (i != k)
for j=1 to n (j != i, j!=k)
if G_(k-1).areAdjacent(vi,vk) && G_(k-1).areAdjacent(vk,vj)
if !G_(k-1).areAdjacent(vi,vj)
G_k.insertDirectedEdge(vi,vj,k)
return G_n
This algorithm takes time. Basically, at each iteration a new vertex is introduced, and each vertex is checked to see if a path exists through the newly added vertex. If it does, a directed edge is inserted to transitively close the graph.
Topological Ordering
- A Directed Acyclic Graph (DAG) is digraph that has no directed cycles
- A topological ordering of a digraph is a numbering of the vertices such that for every edge ,
- The vertex it points to is always greater than it
- A digraph can have a topological ordering if and only if it is a DAG
A topological ordering can be calculated using a DFS:
public static void topDFS(Graph G, Vertex v){
v.visited = true
for(Edge e: v.edges){
w = opposite(v,e)
if(w.visited = false)
topDFS(G,w)
else{
v.label = n
n = n-1
}
}
}
The first node encountered in the DFS is assigned , the one after that , and so on until all nodes are labelled.