Transport services provide logical communication between application processes running on different hosts
Break messages into segments, add header, pass to network layer on the send side
Reassemble segments into messages and pass up to application layer on the receive side
UDP provides bare minimum services
No effort to recover lost packets or re-order packets
Connectionless
Each segment treated individually
No congestion control
Sender can send as fast as they want, possibly overloading receiver or infrastructure
Used when fast and low latency is needed
UDP header is smaller
Can send data as fast as wanted
Video games, internet streaming
It is the programmers responsibility to make UDP reliable
TCP is connection-oriented and more reliable
Provides flow and congestion control
Manages packets out of order to make packets appear in order
Enhances unreliable network layer services
Bits often flipped due to noise and packets re-ordered
Checksums in headers detect bit errors
Acknowledgments (ACKs) indicated packets are correctly received
Sender times out if ACK not received within a timeout interval
Automatic Repeat Requests (ARQs) are sent to retransmit lost or corrupt packets
Packets include a sequence number to detect lost or duplicated packets
Stop and wait ARQ is a protocol for ARQs
Sender sends a packet and waits until it receives an ACK
If ACK arrives, send the next packet
If ACK times out, retransmit the same packet
Duplicate detection is possible because sequence numbers are used
Sufficient to use 1 bit sequence number since there can be at most one outstanding packet
Known as the alternating bit protocol
Reliable, but slow as sender has to wait for ACK to send next packet
Suppose a 1Gbps (R=109) link with a packet length of L=8000 bits
RTT is 30ms
Utilisation U is the fraction of time the link is spent transmitting
L/R time spent sending, RTT time spent waiting
U=R×RTT+LL=0.00027
The sender should be allowed to send more packets without waiting for an ACK
There is R×RTT bits of additional data that could be sent during the RTT interval
R×RTT is the delay-bandwith product
Indicates the length of the pipeline
Receiving buffer can also be a bottleneck
Typically has a finite buffer of B bits
May not be reading from buffer all the time
Sender should not send more than B bits at a time to prevent overflow
The maximum number of bits without waiting for an ACK is min(B,L+R×RTT)
Pipelined protocols allow multiple unacknowledged packets in the pipeline
ACKs are sent individually or cumulatively
Range of sequence number must be increased from alternating bits
Go-back-n is a common protocol
Sender maintains a window of N packets that can be sent without waiting for ACK
Depends on delay-bandwith product, receive buffer size, other factors
Receiver maintains expected sequence number variable, keeps track of the next expected packet
If the receiver receives the packet with the expected sequence number, then it sends ACK(n), which acknowledges all packets up to n, making the ACK cumulative
If the sequence number is not the expected one, then the receiver discard the incoming packet and sends ACK(n−1), acknowledging all up to the last correctly received packet
Waits for packet n to be correctly received before acknowledging any further packets
The sender moves the send window forward for every ACK received
Maintains a timer for the oldest unacknowledged packet, if an ACK times out then the packet is resent
Selective repeat does not discard out of order packets as long as they fall inside a receive window
ACKs are individual and not cumulative
Sender selectively retransmits packets whose ACK did not arrive
Maintains a timer for each unacknowledged packet in the send window
Does not have to retransmit out-of-order packets
Packets arriving out of order are buffered, but receive window not moved forward
Window size should be less than or equal to half the max sequence number
Avoids packets being recognised incorrectly
Send window moved forward when ACK received
TCP uses a combination of GBN and SR protocols
Uses cumulative ACKs
Only retransmits the packet causing timeout
Each byte of data is numbered in TCP
Sequence number of a packet is the byte number of the first byte of the segment
TCP ACK number is the number of the next byte expected from the other side
Cumulative ACKs are used
TCP is duplex, so ACKs are piggybacked onto data segments. A segment can carry data and serve as an ACK
The timout period is often relatively long, so on 3 duplicate ACKs, the sender re-transmits that segment without waiting for timeout
Duplicate ACKs are good indicators of high packet loss
TCP headers contain a few fields
Sequence number is the 32 bit number of the segment indicating the number of the first byte in the packet
Acknowledgement number is the number of the next byte expected to be transmitted
Receive window is used for flow control
TCP uses flow control to ensure that the data in the pipeline does not exceed the receive buffer size
Receiver advertises free buffer spaces in the receive windows field - rwnd
Sender limits amount of unacknowledged data to the receiver's rwnd value
(last byte send - last byte ACK'd) ≤rwnd
TCP provides congestion control to control the rate of transmission according to the level of perceived congestion in the network
Congestion occurs when input rate > output rate
Results in lost packets, buffer overflows, long delays due to queuing at routers
As a transmission link approaches maximum capacity queues build up and delay approaches infinity
There is no benefit in increasing transmission rate beyond network capacity
In a circular network where the transmission rate via link i is λ′ and the capacity of the link is C
If λ≤C/2, then λ′′=λ′=λ: the links can all transmit at the same rate and there is no congestion
If λ>C/2, then only a portion of the traffic can be carried by each link
λ′=(Cλ)/(λ+λ′)
λ′′=(Cλ′)/(λ′+λ)
λ′′=C−2λ(1+4C/λ−1)
The throughput increases linearly up to a max of λ=C/2, then decreases exponentially towards 0 from there causing congestion collapse
Throughput control aims to limit send rates such that congestion collapse does not occur, and flows get a fair share of network resources
TCP detects network congestion through delays and losses
Congestion is assumed when timeout occurs or 3 duplicate ACKs are received
TCP is a window-based pipelined protocol, where the rate of transmission is window size W/RTT
Controlling W controls the transmission rate
Maximum size of a TCP segment is the MSS, Maximum Segment Size, which is determined by the maximum frame size specified by the link layer
Number of segments to transmit all data is W/MSS
The sender maintains a congestion window size, denoted cwnd
W = LastByteSent - LastByteAcked <= min(cwnd, rwnd)
When rwnd is large, sender cwnd determines the transmission rate, which ≈RTTcwnd bps
cwnd is a function of perceived network congestion
Varied using additive increases and multiplicative decreases (AIMD)
Increase cwnd by 1MSS every RTT until loss detected
Cut cwnd in half after loss
Achieves a fair allocation rate among competing flows
Additive increase gives a slope of 1 as throughput increases
The main function of the network layer is to move packets from the source to destination node through intermediate nodes (routers)
Main protocol on this layer is IP
At the source, the IP header is added with source and destination IP addresses
Routers check destination IP addresses to decide the next hop
At the destination, the IP header is stripped and the packet is delivered to the transport layer
Routers have two key functions
Forwarding, moving packets from the input to the appropriate output
Routing, constructing routing tables and running routing protocols
Routing tables map destination IP ranges to their output links
Mapping all 4 billion IP addresses would be impractical
If the IP ranges don't divide up so nicely, longest prefix matching is used
When looking for a table entry for a given destinaion address, use the longest address prefix that matches the destination address
IPv4 addresses are 32bits to uniquely identify network interfaces
IP addresses belonging to the same subnet have the same prefix, the subnet mask
Interfaces on the same subnet are connected by a link layer switch and communicate directly
IP addresses have their subnet mask specified as the number of bits as a prefix
CIDR notation is xxx.xxx.xxx.xxx/xx
A sender checks if the destination IP has the same subnet mask
If it does then obtain the MAC address of the destination and forward the packet to the link layer switch
If source and destination belong to different subnets, then the source forwards the packet to it's default gateway
Gateway routers connect subnets
If A want so communicate with B on a different subnet, it forwards the packets to R, the default gateway
R will look up in it's routing table to forward A's packets to the correct outgoing interface
When the packet reaches the interface, it will be forwarded to B through the switch in B's subnet
Nodes have two options for acquiring IP addresses
Network admins can manually configure the IP of each host on the network
DHCP is an application layer protocol that dynamically assigns IP addresses from the server to clients
Both subnet mask and default gateway must be provided for both
Networks are allocated subnets from the ISP's address space
Global authority ICANN is responsible for allocating IP addresses to ISPs
Network Address Translation (NAT) is used so that each IP address on a subnet does not need a globally unique IP, as ICANN have run out of them (4 billion is not enough)
Unique IP addresses are provided to public gateway routers
Private IP addresses that are unique only on the subnet are allocated by the gateway router
Devices in home or private networks need not be visible to the public internet, they can use private IP addresses to communicate with each other and communication with the internet is done via the gateway router
Packets with private IP addresses cannot be carried by the public internet
Private source IP addresses are converted to the public IP address of the router facing the internet
Incoming packets for different hosts are distinguished by different ports on the router
Address shortage is solved by IPv6 with 128-bit addresses, but it is not in wide use yet
At each router, a routing protocol such as RIP or OSPF constructs the routing table
Each routing protocol implements a routing algorithm
Networks are abstracted as graphs G=(N,E)
N=u,v,w,x,y,z is the set of routers
E=(u,v),(u,x),(v,x),(v,w),(x,w),(x,y),(w,y),(w,z),(y,z) is the set of links
Each edge (x,y) has a cost associated with if c(x,y)
c(x,y)=∞ if x and y are not direct neighbours
The cost of a path (x1,x2,x3,...,xp)=c(x1,x2)+c(x2,x3)+...+c(xp−1,xp)
The idea is that given a source x and destination y, what is the least cost path from x to y
Need the shortest past from each node to every other node to populate the routing table
Two type of routing algorithm are used
Global requires the knowledge of the complete topology at each router including costs
Link state algorithms
Local requires only knowledge of the network surrounding the router
Dijkstra’s algorithm is a link-state routing algorithm that computes leas cost path from one node (the source) to all other nodes
Implemented in Open Shortest Path First (OSPF) protocol
Each node requires the entire topology, which is obtained through broadcasting link states
Maintains a set of visited nodes N′, initially only the source
For all nodes v
If v adjacent to u
D(v) = c(u, v) , store current estimates of shortest distance
p(v) = u, store predecessor node of v along with current shortest path from u to v
else, D(v)=∞, p(v)=null, initialise all other nodes to be infinite distance away with no known predecessor yet
While all nodes w not in N′, not yet visited
Add node w to N′
For all v adjacent to w and not in N′
If D(v)>D(w)+c(w,v)
D(v)=D(w)+c(w,v), update distance to the unvisited neighbour v of w if it is smaller
p(v)=w
The Distance Vector (DV) algorithm is used in the Routing Information Protocol (RIP)
Uses local information from neighbouring nodes to compute shortest paths
Based on the Bellman-Ford equation
dx(y) is the length of the shortest path from x to y
BF equation relates dx(y) to dv(y), where v∈N(x) (the set of neighbouts of x)
dx(y)=minc(x,v)+dv(y)
If v∗ minimises the above sum, then it is the next-hop node in the shortest path
Dx(y) is the current estimate of the minimum distance from x to y (different to actual minimum distance dx(y))
DV algorithm tries to converge estimates to their actual values
Each node maintains a distance vector Dx=[Dx(y):y∈N]
Node x performs the update Dx(y)=minvc(x,v)+Dv(y)
Node x needs toe cost of each neighbour, and the distance vector of each neighbour (obtained via message passind)
Whenever any of these is updated, the node recomputes it's distance vector and update all it's neighbours
Each node:
Wait for a change in local link cost or a message from neighbour
Recompute estimates using BF equation
If DV to any destination has changed, notify neighbours