Networking For The Multitude: May 2012

It was only recently that I got my head around what an MTU is, what it means for traffic going from, say, Node A to Node B, and the different ways we can tweak it to speed up our transfer rates. I feel I can explain the concept better in a question and answer format, which lets readers take it one question at a time or jump directly to the part they need clarified. So, starting with the basic and simple question:

What is MTU ?

MTU stands for Maximum Transmission Unit. As the name suggests, it is the maximum length of an IP datagram that can be handled by a specific transmission link. So even if a packet ends up larger than the MTU by a single byte, it will be fragmented unless the DF (Don't Fragment) bit is set (explained later). Some of the MTU values for the different links are as below.

There can also be scenarios where your traffic passes through a GRE tunnel, which adds another header of its own. Your network can be a mix and match of all of these transmission media, and sending a packet from Node A to Node B with a packet size larger than the least MTU in the path will mean that the PDU (Protocol Data Unit) gets fragmented and passed through.

Well, why is it so bad if it will get fragmented and passed along? How does fragmentation impact your network?

In the above example you can see that if Node A were to send a packet with a size of 1400 bytes to Node B, Router 1 would need to break that packet into two in order to fit it through the transmission medium leading to Router 2. Router 2, on the other hand, has to reassemble this IP packet before passing it along.
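To make the split concrete, here is a minimal sketch of how an IPv4 packet is cut into fragments. The link MTU of 1000 bytes and the function name `fragment` are assumptions for illustration only (the MTU between Router 1 and Router 2 is not stated in the text), and the sketch assumes a 20-byte IP header with no options.

```python
def fragment(total_length, mtu, ip_header=20):
    """Split one IPv4 packet into fragments that fit a link MTU.

    Returns a list of (fragment_length, offset_in_8_byte_units, more_fragments).
    """
    payload = total_length - ip_header
    # Every fragment except the last must carry a multiple of 8 payload bytes,
    # because the fragment offset field counts in 8-byte units.
    max_payload = (mtu - ip_header) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        more_fragments = offset + size < payload
        fragments.append((size + ip_header, offset // 8, more_fragments))
        offset += size
    return fragments


# A 1400-byte packet crossing a hypothetical link with an MTU of 1000 bytes.
frags = fragment(1400, 1000)
print(frags)  # [(996, 0, True), (424, 122, False)]
```

Note that the two fragments add up to 1420 bytes on the wire, because the second fragment needs an IP header of its own. That extra header is the Layer 3 overhead fragmentation brings with it.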

In earlier versions of Cisco IOS, the fragmentation of each packet was done by the CPU (process switching), which is much slower and becomes processor intensive as the number of packets to be fragmented increases. The newer versions, however, perform fragmentation at the Cisco Express Forwarding (CEF) level.

IP fragmentation also increases the Layer 3 overhead and thus ends up reducing throughput. The Layer 4 information (such as port numbers) is missing from the non-first IP fragments, as the TCP or UDP header is not copied into all fragments. As a result, some firewalls might be configured to drop IP fragments, while others have to consume additional CPU resources to reassemble the fragments and inspect their actual contents. Intrusion Detection/Prevention Systems (IDS/IPS) have to provide similar functionality to effectively detect intrusion signatures.

There is additional load on the end system (Router 2) to reassemble the packets before passing them on. Reassembled packets are process switched, which makes this slower and processor intensive. So to get the best throughput from a given link, the size of the packet should be equal to or less than the least MTU in the path.

What is MSS ?

The maximum segment size (MSS) is the largest amount of data, specified in bytes, that a computer or communications device can handle in a single, unfragmented piece. The MSS value is carried as a TCP option in the SYN and SYN ACK segments, which are the first two packets in a TCP communication, and you can see it when you expand the TCP header of either one.

There is a negotiation which happens between Node A and Node B to agree on the MSS for the communication (the lesser of the two). I have given an example of a sample communication in Wireshark.

You can see the SYN packet has an MSS of 1260 and the returning SYN ACK has an MSS of 1430. So once the SYN and SYN ACK are exchanged, the TCP communication settles on the lower of the two values, i.e. 1260.
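For readers who want to see where that number lives in the packet, here is a small sketch that pulls the MSS out of raw TCP option bytes and picks the lower of the two values. The MSS option is kind 2 with a length of 4; the byte strings below are hand-built stand-ins for the SYN and SYN ACK options (not taken from the capture), and `parse_mss` is a name made up for this example.

```python
import struct


def parse_mss(options):
    """Return the MSS from raw TCP option bytes, or None if it is absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:  # End of Option List
            break
        if kind == 1:  # No-Operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed option
        if kind == 2 and length == 4:  # Maximum Segment Size
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


# Hand-built option bytes: MSS 1260 followed by NOP, NOP, SACK Permitted.
syn_options = bytes([2, 4]) + struct.pack("!H", 1260) + bytes([1, 1, 4, 2])
# Hand-built option bytes: MSS 1430 only.
syn_ack_options = bytes([2, 4]) + struct.pack("!H", 1430)

effective_mss = min(parse_mss(syn_options), parse_mss(syn_ack_options))
print(effective_mss)  # 1260
```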

How is MSS different from MTU ?

MSS is the Maximum Segment Size, which is the maximum size of the Layer 4 data (not including the Layer 4 header), whereas MTU is the Maximum Transmission Unit (the Layer 3 packet including the Layer 3 header). Perhaps this diagram from the CCNA curriculum will refresh your memory.

And here is a simple relation between the Packet Size and MSS

Packet Size = MSS + TCP Header + IP Header

Default TCP header Size = 20 bytes

Default IP header Size = 20 bytes

Therefore ,

Packet size = MSS + 40 bytes

So from the previous explanation about fragmentation we could derive that for best throughput:

Packet Size <= Least MTU in the path, and therefore, for best efficiency:

MSS = Least MTU in the path - 40 bytes
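The same arithmetic as a tiny sketch. The list of per-link MTUs is made up for illustration, and the 20-byte header sizes are the defaults quoted above (no IP or TCP options).

```python
IP_HEADER = 20   # default IP header size in bytes
TCP_HEADER = 20  # default TCP header size in bytes


def best_mss(path_mtus):
    """Largest MSS that avoids fragmentation on every link in the path."""
    return min(path_mtus) - IP_HEADER - TCP_HEADER


# Hypothetical path: three links with different MTUs.
path = [1500, 1476, 1400]
print(best_mss(path))  # 1360
```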

Now comes the big question: how does Node A know the least MTU in the path for reaching Node B? In comes something called Path MTU Discovery.

How does PMTU Discovery work ?

Path MTU Discovery uses functionalities of IP and ICMP to find the least MTU in a communication path. Whenever the first PMTUD-aware session with a new destination host is started, the MTU of the outgoing interface is assumed to be the MTU of the overall path.

All outgoing IP datagrams are sent with the DF bit set. Whenever the Layer 4 session happens to send an oversized datagram, a router in the path will drop the packet and send back an ICMP "Fragmentation Needed" message (Type 3, Code 4) reporting that the local egress MTU was exceeded. A field in that ICMP message indicates the maximum MTU the router can support on the outgoing link. The MTU reported by the intermediate router is cached as the new MTU for the destination host, and all future outgoing datagrams will not exceed it. The TCP stack in the originating host, or a PMTUD-aware UDP application, has to retransmit the data in smaller datagrams.

Below is a diagram which explains it better.

1. Packet sent with an MTU of 1500.
2. ICMP reply sent back from the router to the host, indicating the MTU of the forwarding interface.
3. MTU value stored for the specific host on the source device.
4. Packet sent with the new MTU size.
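The steps above can be sketched as a small simulation. This is only a toy model of the behaviour, not how a real TCP/IP stack is written: the hop MTUs are made up, `send_with_df` stands in for sending a DF-marked packet and receiving the ICMP reply, and both function names are hypothetical.

```python
def send_with_df(size, hop_mtus):
    """Pretend to send a DF-marked packet along the path.

    Returns None if every hop can forward it, otherwise the MTU reported
    by the first hop that has to drop it (the value carried in the ICMP reply).
    """
    for mtu in hop_mtus:
        if size > mtu:
            return mtu
    return None


def discover_path_mtu(local_mtu, hop_mtus):
    """Start from the outgoing interface MTU and shrink on each ICMP reply."""
    path_mtu = local_mtu
    attempts = []
    while True:
        attempts.append(path_mtu)
        reported = send_with_df(path_mtu, hop_mtus)
        if reported is None:
            return path_mtu, attempts
        path_mtu = reported  # cache the reported MTU for this destination


# Hypothetical path with three links.
result, tried = discover_path_mtu(1500, [1500, 1400, 1300])
print(result, tried)  # 1300 [1500, 1400, 1300]
```

Each failed attempt costs a dropped packet and a round trip, which is why discovery only works if those ICMP replies actually make it back to the sender.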

However, there can at times be issues with PMTU Discovery, for reasons such as the returning ICMP message being blocked or not being sent back at all. As a network administrator you do have a few options available to control what the MTU size should be. I am going to explain the easiest one of them here.

What does "ip tcp adjust-mss < > " command do ?

If you read back over the explanation of the MSS, you can see that during any TCP communication the two hosts negotiate the MSS between them down to the lower of the two values.

The command ip tcp adjust-mss <500-1460>, when applied at the interface level, looks into all the TCP SYN packets going in and out of that interface and replaces the MSS value with the one configured on the interface if the advertised value is larger, hence limiting the size of the TCP packets going through that interface to the configured MSS + 40 bytes. How cool is that?

So in the above example, the MSS value of any TCP SYN packet passing through that interface in either direction will be changed if it is larger than the configured MSS, indirectly limiting the packet size to 1350 (remember that packet size = MSS + TCP header + IP header).
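Here is a minimal sketch of that clamping logic. The configured value of 1310 is an assumption: it is simply the MSS that a resulting packet size of 1350 implies (1350 - 40). The function name `adjust_mss` is made up for the example; it models the behaviour described here, not Cisco's implementation.

```python
HEADERS = 40  # default TCP header (20) + default IP header (20)


def adjust_mss(advertised_mss, configured_mss):
    """Rewrite the MSS in a passing SYN only if it exceeds the configured value."""
    return min(advertised_mss, configured_mss)


CONFIGURED_MSS = 1310  # assumed value, implied by the 1350-byte packet size

clamped = adjust_mss(1460, CONFIGURED_MSS)
print(clamped, clamped + HEADERS)  # 1310 1350

# A SYN that already advertises a smaller MSS passes through unchanged.
print(adjust_mss(1260, CONFIGURED_MSS))  # 1260
```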

Hope this was helpful!

Saturday, May 26, 2012

MTU and MSS. Why should we care ?
