Some actual packets - DF=0 and DF=1

Robin Whittle rw@firstpr.com.au 2008-08-13


Contents

>> Most hosts send packets with a non-zero Identifier, including those with DF=0.

>> Solved:  This was illusory. The TCP/IP stack was sending larger-than-MTU packets to the Ethernet driver, because the Ethernet chip (Broadcom BCM5751) can break an oversize TCP packet into smaller separate TCP packets, using "Large Send Offload" (AKA TSO). See my message to the RRG list:  http://psg.com/lists/rrg/2008/msg02175.html  The original puzzle: my server regularly appeared to send out TCP HTTP packets to web clients way longer than 1500 bytes - up to 8.8k bytes or so - and I couldn't figure out how it could do this, since the MSS values at the start of the TCP connection were both less than 1500 bytes.

>> Google servers have an MSS of 1430 and (assuming the client's MSS is this or higher) send only DF=0 packets, launching straight into maximum sized packets of 1470 bytes, with no attempt to do DF=1 RFC 1191 style PMTUD.



Most hosts send packets with a non-zero Identifier, including those with DF=0.


Let's have a look at how the Identifier is typically set:

tcpdump -x -n -i ppp0 | grep 0x0000: > tcpdump.txt

results in a hex dump of the first 16 bytes of packets coming into my home server's DSL line.

     45 = Version and header length - every packet is the same.
     ||
     ||** DiffServ & ECN
     ||||
     |||| LLLL = Total Length
     |||| |  |
     |||| |  | IIII = Identification
     |||| |  | |  |
     4500 007e 1c07 0000 6e11 c13d 4533 f116
         Evil bit__/|\\\
               DF _/| \\\
                   MF  Fragment
                       Offset  

                           Next            
                           Protocol
                       TTL || Checksum
                         \\|| ////
     4500 007e 1c07 0000 6e11 c13d 4533 f116
                                   |||| ||||
                                   Source Address

The first hex nybble of the Flags field (Evil bit, DF, MF and the MSB of the Fragment Offset) is either:

0 means Don't Fragment = 0, More Fragments = 0
or                    
4 means Don't Fragment = 1, More Fragments = 0
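
The field layout in the diagram above can be checked mechanically. What follows is a minimal sketch, not the tooling I used for the capture: the function name decode_first_16_bytes and the dictionary keys are made up for illustration. It takes one line of eight hex words, as left by the grep above once the "0x0000:" offset has been stripped, and pulls out the fields named in the diagram.

```python
def decode_first_16_bytes(line):
    """Decode the IPv4 fields visible in the first 16 bytes of a header.

    `line` is eight 16-bit hex words, e.g. the tcpdump -x output line
    with the leading "0x0000:" offset removed.
    """
    words = [int(w, 16) for w in line.split()]
    if len(words) != 8:
        raise ValueError("expected eight 16-bit hex words")
    flags_frag = words[3]
    return {
        "version": words[0] >> 12,
        "header_len_bytes": ((words[0] >> 8) & 0xF) * 4,
        "total_length": words[1],
        "identification": words[2],
        # Top three bits of the fourth word: reserved ("Evil bit"), DF, MF.
        "df": (flags_frag >> 14) & 1,
        "mf": (flags_frag >> 13) & 1,
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": words[4] >> 8,
        "protocol": words[4] & 0xFF,
    }


fields = decode_first_16_bytes("4500 007e 1c07 0000 6e11 c13d 4533 f116")
for name, value in fields.items():
    print(name, value)
```

For the example line, this reports a 126 byte UDP packet (protocol 17) with DF=0, MF=0 and a non-zero Identification.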

In several thousand packets I captured, none were fragments.

There were plenty of long packets with DF=0, which surprised me.

More than half the packets were like this:

4500 05ac 38c0 0000 3606 5838 d071 e501

which I think were mainly from Google.  These are 1452 bytes long (hex 05ac) and have DF=0.  This shows more about the packets:

tcpdump -x -i ppp0

Here is an edited example:

22:42:04.727180 IP cf-in-f99.google.com.http > my-host.4879:
    0x0000:  4500 05ac 4e74 0000 3906 9717 4a7d 1363
    0x0010:  9665 a27b 0050 130f f509 2f31 2b14 f681
    0x0020:  5010 46e0 d37f 0000 f070 7460 0d8e f60f
    0x0030:  0770 3c18 8fd4 f110 b85e be9e 9834 3557
    0x0040:  cea3 b048 7db4 9d11 b5fd cef2 82e4 46e0
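
Since this dump includes the whole 20 byte IPv4 header and the start of the TCP header, the addresses and ports can be decoded and the header checksum verified. This is only a sketch using the Python standard library; the helper name header_checksum_ok is invented here, and it assumes a header without IP options, as the leading 45 indicates.

```python
import ipaddress
import struct

# First 24 bytes of the dump above: 20-byte IPv4 header plus the TCP ports.
DUMP = """
4500 05ac 4e74 0000 3906 9717 4a7d 1363
9665 a27b 0050 130f
"""


def header_checksum_ok(header):
    """True if the one's-complement sum of all header words is 0xFFFF."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


raw = bytes.fromhex("".join(DUMP.split()))
(ver_ihl, tos, length, ident, flags_frag,
 ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
src_port, dst_port = struct.unpack("!HH", raw[20:24])

print("source      ", ipaddress.IPv4Address(src), "port", src_port)
print("destination ", ipaddress.IPv4Address(dst), "port", dst_port)
print("total length", length, " DF =", (flags_frag >> 14) & 1)
print("checksum ok ", header_checksum_ok(raw[:20]))
```

The decoded source port 80 and destination port 4879 agree with the "http > my-host.4879" in the tcpdump summary line.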

This is quite interesting in itself.  It seems Google figures it is OK to send out packets of 1452 bytes, expecting the network to fragment them if there are any MTU problems.  

The 1452 byte length probably arises from my ifcfg-ppp0 file setting CLAMPMSS=1412.  The 1452 byte packet length is the MSS of 1412 plus 20 bytes of TCP header plus 20 bytes of IP header.

See below where a Google server sends out 1470 byte packets with DF=0.  Presumably they wouldn't do this if there were, in general, any MTU problems.  I guess it saves their servers having to keep a record of recently sent bytes so they can be resent if there is a PTB message.  




Solved:  This was illusory. The TCP/IP stack was sending larger-than-MTU packets to the Ethernet driver, because the Ethernet chip (Broadcom BCM5751) can break an oversize TCP packet into smaller separate TCP packets, using "Large Send Offload" (AKA TSO). See my message to the RRG list:  http://psg.com/lists/rrg/2008/msg02175.html  The original puzzle: my server regularly appeared to send out TCP HTTP packets to web clients way longer than 1500 bytes - up to 8.8k bytes or so - and I couldn't figure out how it could do this, since the MSS values at the start of the TCP connection were both less than 1500 bytes.

 
See the false-alarm.html page in this directory for the stuff I wrote about this.



Google servers have an MSS of 1430 and (assuming the client's MSS is this or higher) send only DF=0 packets, launching straight into maximum sized packets of 1470 bytes, with no attempt to do DF=1 RFC 1191 style PMTUD.

What does Google do with a "browser" which can handle jumboframes?  I found it hard to get Google to send anything to my Texas server, from an image search URL - maybe because wget doesn't look like a browser to it.  However, I was able to get large images from Google's press center which was a simple HTTP image download.  I used a www.google.com IP address so I could access the same machine from Australia without any funny DNS stuff pointing me to an Australian server:

 
wget http://72.14.205.147/press/images/gallery/solarpanels1_lg.jpg

Here the maximum size packets were DF=0 with a size of 1470 bytes (05be):

4500 05be 0d1c 0000 3506 894e 480e cd93

I would be surprised if there was a 1500 byte MTU 100Mbps Ethernet link between my server and Google, so my guess is that Google simply sends out DF=0 packets with a length of 1470 bytes, or less according to the TCP MSS reported by the destination host.

Here are the lines from the first packets sent by Google's server when it served the file:

4500 0034 0be1 0000 3506 9013 480e cd93
4500 05be 0be2 0000 3506 8a88 480e cd93
4500 05be 0be3 0000 3506 8a87 480e cd93
4500 05be 0be4 0000 3506 8a86 480e cd93

So Google's server goes straight to 1470 bytes, all DF=0 - no trying DF=1 to test the Path MTU as a well-behaved server should.
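
The same conclusion can be read off the four lines programmatically. This is a small sketch with an invented helper, parse, that looks only at the Total Length, Identification and DF fields of each line.

```python
# The first four packets from Google's server, as listed above.
LINES = [
    "4500 0034 0be1 0000 3506 9013 480e cd93",
    "4500 05be 0be2 0000 3506 8a88 480e cd93",
    "4500 05be 0be3 0000 3506 8a87 480e cd93",
    "4500 05be 0be4 0000 3506 8a86 480e cd93",
]


def parse(line):
    """Return Total Length, Identification and DF for one 16-byte line."""
    words = [int(w, 16) for w in line.split()]
    return {
        "length": words[1],
        "ident": words[2],
        "df": (words[3] >> 14) & 1,
    }


packets = [parse(line) for line in LINES]
lengths = [p["length"] for p in packets]
idents = [p["ident"] for p in packets]

all_df0 = all(p["df"] == 0 for p in packets)
# Does the Identification go up by one from each packet to the next?
consecutive = all(b - a == 1 for a, b in zip(idents, idents[1:]))

print("lengths:", lengths)
print("all DF=0:", all_df0)
print("Identification increments by one:", consecutive)
```

Here the first packet is 52 bytes, the next three are 1470 bytes, all have DF=0, and the Identification is non-zero and goes up by one each time.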

What TCP MSS did my server provide when it opened the connection?  1460. That would allow for a 1500 byte packet size.  The Google server responded with an MSS of 1430, which was used for the session - hence the 1470 byte packets.

If there is a router between the Google server and the client with an MTU of, for instance, 1400 bytes then the client should be configured to supply an MSS of 1360 bytes, which sets a maximum packet size of 1400 bytes for TCP.
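
The MSS arithmetic used on this page can be written out as a short sketch. The function names are made up for illustration, and it assumes 20 byte IPv4 and TCP headers with no options. Taking the smaller of the two advertised MSS values is a simplification that matches the session described above; it does not model everything a real TCP stack does.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def max_packet_size(mss):
    """Largest IPv4 packet a TCP segment of `mss` payload bytes produces."""
    return mss + IPV4_HEADER + TCP_HEADER


def mss_for_mtu(mtu):
    """MSS a host should advertise so its packets fit within `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(client_mss, server_mss):
    """Simplified: the session uses the smaller of the two advertised values."""
    return min(client_mss, server_mss)


print(max_packet_size(1412))                       # CLAMPMSS=1412 case
print(max_packet_size(effective_mss(1460, 1430)))  # Texas server and Google
print(mss_for_mtu(1400))                           # router with a 1400 byte MTU
```

These give 1452, 1470 and 1360 respectively, the figures quoted in the text.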

In the future, when more and more clients can be reached with jumboframes (such as packets of 8k bytes), it will become increasingly tempting for Google etc. not to limit their servers to 1470 byte packets.  Then they will presumably need to do proper PMTUD, since there will be situations where a host gives an 8k or so MSS but there is a possibility that some router or data link en route has a 1500 byte MTU.

But whose problem would this be?  Google's, or the person whose host this is?  It should be Google's problem, for not using RFC 1191 PMTUD.  However if Google and enough other companies do this - just send packets as big as the client's MSS allows - AND if there are problems for some clients, with the bottlenecks being closer to the clients, then it will be the clients' ISPs who probably cop the flak, since Google etc. in their server farms can say it works fine for most people, apart from those at certain ISPs . . .








Here are some lines of packets on my home server.  The second column is the size, the fourth column the flags.

45 = Version and header length - every packet is the same.
||
||** DiffServ & ECN
||||
|||| LLLL = Total Length
|||| |  |
|||| |  | IIII = Identification
|||| |  | |  |
|||| |  | |  | 4 => DF=1
|||| |  | |  | |   
4500 007e 1c07 0000 6e11 c13d 4533 f116
45c0 009a f4dd 0000 4001 159b 9665 a27b
4500 0083 870c 0000 6c11 af75 c9d9 152e
45c0 009f 4607 0000 4001 1baf 9665 a27b
4500 003c 6abf 4000 3811 d450 482e 8292
4500 00a0 0000 4000 4011 36ac 9665 a27b
4500 004a 0000 4000 4011 3b92 9665 a27b
4500 0084 0000 4000 3611 4558 86b2 3f7e
4500 003c 60cd 4000 3f06 a15e 9665 a27b
4500 0028 60cd 0000 3606 ea72 cb3f 3570
4500 0028 60ce 4000 3f06 a171 9665 a27b
4500 0028 3ad3 0000 3706 0f6d cb3f 3570
4500 0240 60cf 4000 3f06 9f58 9665 a27b
4500 0096 60d0 4000 3f06 a101 9665 a27b
4500 0028 db8a 0000 3606 6fb5 cb3f 3570
4500 0514 db8b 0000 3606 6ac8 cb3f 3570
4500 0028 60d1 4000 3f06 a16e 9665 a27b
4500 0514 db8c 0000 3606 6ac7 cb3f 3570
4500 0028 60d2 4000 3f06 a16d 9665 a27b
4500 0514 db8d 0000 3606 6ac6 cb3f 3570
4500 0028 60d3 4000 3f06 a16c 9665 a27b
4500 0514 db8e 0000 3606 6ac5 cb3f 3570
4500 0028 60d4 4000 3f06 a16b 9665 a27b
4500 0322 db8f 0000 3606 6cb6 cb3f 3570
4500 0028 60d5 4000 3f06 a16a 9665 a27b
4500 0514 db90 0000 3606 6ac3 cb3f 3570
4500 0028 60d6 4000 3f06 a169 9665 a27b
4500 0514 db91 0000 3606 6ac2 cb3f 3570
4500 0028 60d7 4000 3f06 a168 9665 a27b
4500 0514 db92 0000 3606 6ac1 cb3f 3570
4500 0028 60d8 4000 3f06 a167 9665 a27b
4500 033a db93 0000 3606 6c9a cb3f 3570
4500 0028 60d9 4000 3f06 a166 9665 a27b
4500 0514 db94 0000 3606 6abf cb3f 3570
4500 0028 60da 4000 3f06 a165 9665 a27b
4500 0514 db95 0000 3606 6abe cb3f 3570
4500 0028 60db 4000 3f06 a164 9665 a27b
4500 0514 db96 0000 3606 6abd cb3f 3570
4500 0028 60dc 4000 3f06 a163 9665 a27b
4500 0514 db97 0000 3606 6abc cb3f 3570
4500 0028 60dd 4000 3f06 a162 9665 a27b
4500 0514 db98 0000 3606 6abb cb3f 3570
4500 0028 60de 4000 3f06 a161 9665 a27b
4500 0514 db99 0000 3606 6aba cb3f 3570
4500 0028 60df 4000 3f06 a160 9665 a27b
4500 0514 db9a 0000 3606 6ab9 cb3f 3570
4500 0028 60e0 4000 3f06 a15f 9665 a27b
4500 0514 db9b 0000 3606 6ab8 cb3f 3570
4500 0028 60e1 4000 3f06 a15e 9665 a27b
4500 0514 db9c 0000 3606 6ab7 cb3f 3570
4500 0028 60e2 4000 3f06 a15d 9665 a27b
4500 0514 db9d 0000 3606 6ab6 cb3f 3570
4500 0028 60e3 4000 3f06 a15c 9665 a27b
4500 0514 db9e 0000 3606 6ab5 cb3f 3570
4500 0028 60e4 4000 3f06 a15b 9665 a27b
4500 0410 db9f 0000 3606 6bb8 cb3f 3570
4500 0028 60e5 4000 3f06 a15a 9665 a27b
4500 0514 dba0 0000 3606 6ab3 cb3f 3570
4500 0028 60e6 4000 3f06 a159 9665 a27b
4500 0514 dba1 0000 3606 6ab2 cb3f 3570
4500 0028 60e7 4000 3f06 a158 9665 a27b
4500 0514 dba2 0000 3606 6ab1 cb3f 3570
4500 0514 dba3 0000 3606 6ab0 cb3f 3570
4500 0028 60e8 4000 3f06 a157 9665 a27b
4500 0514 dba4 0000 3606 6aaf cb3f 3570
4500 0514 dba5 0000 3606 6aae cb3f 3570
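
A listing like this is easier to read when tallied per source address, since the last two columns are the source address. The following is only a sketch: the function name summarize and the statistics it collects are my own choices, and the sample is just five lines copied from the listing above. In practice the lines would come from a file such as tcpdump.txt with the "0x0000:" offset stripped.

```python
import ipaddress
from collections import defaultdict

# Five lines copied from the listing above, as a small sample.
SAMPLE = """
4500 003c 6abf 4000 3811 d450 482e 8292
4500 0028 60cd 0000 3606 ea72 cb3f 3570
4500 0028 60ce 4000 3f06 a171 9665 a27b
4500 0514 db8b 0000 3606 6ac8 cb3f 3570
4500 0240 60cf 4000 3f06 9f58 9665 a27b
"""


def summarize(lines):
    """Count packets, DF=1 packets and the longest packet per source address."""
    stats = defaultdict(lambda: {"packets": 0, "df1": 0, "max_len": 0})
    for line in lines:
        words = line.split()
        if len(words) != 8:
            continue  # skip blank or malformed lines
        source = str(ipaddress.IPv4Address(int(words[6] + words[7], 16)))
        entry = stats[source]
        entry["packets"] += 1
        entry["df1"] += (int(words[3], 16) >> 14) & 1
        entry["max_len"] = max(entry["max_len"], int(words[1], 16))
    return dict(stats)


summary = summarize(SAMPLE.splitlines())
for source, entry in summary.items():
    print(source, entry)
```

In this sample, both packets from 203.63.53.112 have DF=0 and the longest is 1300 bytes (hex 0514), while both packets from 150.101.162.123 have DF=1.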