Why clients need to retransmit Confirms
The following shows problems caused
by a client not retransmitting Confirm feature-negotiation options while
in state PARTOPEN.
1. Connection aborted due to Ack lost in the network
The early implementation of feature negotiation did not retransmit Confirm feature-negotiation
options while in state PARTOPEN. This led to aborting the connection
when the single Ack confirming the DCCP-Response got lost, triggering a
"feature negotiation failed" system log message at the server, such as
seen in the bug
To illustrate the problem, consider the following message exchange that
happened with the early implementation.
The corresponding capture files to illustrate the problem can be found here and here.
The sequence of events is:
- Client sends Request #451 with Change options,
- Server replies with Response #171, containing Confirm and Change options,
- Client is happy with the Confirms
sent by the server and continues sending,
- client Ack #452 is lost in the network,
- only DataAck #453 arrives at the server (client is still in PARTOPEN),
- server misses the Confirm
options and aborts connection setup.
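The failure can be reproduced with a toy model. The sketch below is not taken from any implementation: `Packet` and `server_outcome` are hypothetical names, and the server is reduced to a single rule, namely that it decides on the first client packet that actually arrives.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    kind: str       # "Ack" or "DataAck"
    seqno: int
    confirms: bool  # True if the packet carries Confirm options

def server_outcome(sent, lost_seqnos):
    """Toy server waiting for Confirms: judge the first packet that arrives."""
    for pkt in sent:
        if pkt.seqno in lost_seqnos:
            continue  # dropped by the network
        return "OPEN" if pkt.confirms else "ABORT"
    return "RESPOND"  # nothing has arrived yet

# Early behaviour: only the first Ack carries the Confirm options.
early = [Packet("Ack", 452, True), Packet("DataAck", 453, False)]

print(server_outcome(early, lost_seqnos=set()))   # OPEN
print(server_outcome(early, lost_seqnos={452}))   # ABORT
```

Losing the single Ack #452 is enough to make the model abort, which matches the exchange listed above.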
2. Requirements and a more robust solution
Aborting the connection setup when feature negotiation fails is essential to ensure that
both endpoints are in a sane state
before entering the data transmission phase.
Clients stay in PARTOPEN until they can be sure that the server has
received the acknowledgment of its Response packet. This happens when
the client "receives a valid packet
other than DCCP-Response, DCCP-Reset, or DCCP-Sync from the server"
(RFC 4340, 8.1).
Thus, although the RFC does not mandate retransmitting Confirm options, the client
needs to retransmit them as long as it stays in PARTOPEN,
since the initial Ack, any retransmitted Acks or newly sent DataAcks
can get lost in the network.
3. Proof of concept
After this issue had been reported,
the implementation was changed to let the client retransmit Confirm options while in state
PARTOPEN. This is achieved by flushing the client feature-negotiation
queue only at the moment when transitioning to OPEN.
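The idea of keeping the queue until the transition to OPEN can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the class `PartopenClient`, its method names and the option string are all hypothetical, and a Reset is simply ignored here rather than tearing the connection down.

```python
class PartopenClient:
    # Packet types that do not move the client out of PARTOPEN
    # (RFC 4340, 8.1).
    NON_OPENING = {"Response", "Reset", "Sync"}

    def __init__(self, confirms):
        self.state = "PARTOPEN"
        self.featneg_queue = list(confirms)  # pending Confirm options

    def options_for_next_packet(self):
        # While the queue is non-empty, every outgoing packet repeats it.
        return list(self.featneg_queue)

    def on_valid_packet(self, packet_type):
        if self.state == "PARTOPEN" and packet_type not in self.NON_OPENING:
            self.state = "OPEN"
            self.featneg_queue.clear()  # flush only on entering OPEN

client = PartopenClient(["Confirm (hypothetical option)"])
client.on_valid_packet("Response")        # duplicate Response: stay in PARTOPEN
print(client.options_for_next_packet())   # Confirms are still attached
client.on_valid_packet("Ack")             # server Ack: enter OPEN, flush queue
print(client.state, client.options_for_next_packet())
```

Because the queue is emptied in exactly one place, any Ack or DataAck sent before that point carries the Confirm options again.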
To test the implementation, a test-client was modified to drop the
acknowledgment following the Response, forcing it to retransmit the
Confirm options on the subsequent DataAcks. This is shown in the
screenshot below; the corresponding capture file is here.
The client starts with Request #069 and drops Ack #70. At DataAck #73
it is still in PARTOPEN, so it retransmits the Confirm options shown in the
screenshot. The server then acknowledges #073, after itself entering
OPEN state as a result of receiving the requisite Confirm options. The
Ack #058 by the server is actually a cumulative
acknowledgment, since it contains a CCID-2 Ack Vector (check the
The important point is that the communication continues successfully in
spite of losing the initial Ack, which was not possible with the
early implementation.
4. And yet another problem to solve
Unfortunately there is another problem. Feature negotiation options
take more space than usual options, in part due to the dynamic-length
format of server-priority lists. DCCP allows clients to send DataAck packets
in state PARTOPEN. So we can either
- reduce the MPS by the larger-than-usual amount that
feature-negotiation options take, or
- devise a special case
for clients with large payloads.
The size of the feature-negotiation options usually varies between 20 and 70
bytes. Always reducing the MPS by 72 bytes (a multiple of 4 bytes) would
take too much away from the normal payload.
Hence a special case was
devised: clients in PARTOPEN retransmit the
Confirm options by sending an extra Ack before sending DataAcks with
large payloads. This means that the client has replied with a total of
two Acks to the DCCP-Response. If both get lost in the network then the
user needs to try again to connect.
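The decision behind the special case can be sketched as below. The function `plan_transmission` and the MPS value of 1424 bytes are assumptions made purely for illustration, and `mps` here simply means the room available for options plus payload. The sketch covers a single send decision and does not model what happens to the queue afterwards.

```python
def plan_transmission(payload_len, confirm_len, mps):
    """Return the packets a PARTOPEN client sends for one payload.

    Each entry is (packet type, payload bytes, Confirm option bytes).
    """
    if confirm_len == 0 or payload_len + confirm_len <= mps:
        # The Confirm options fit next to the payload.
        return [("DataAck", payload_len, confirm_len)]
    # Too little room: carry the Confirms on an extra Ack first.
    return [("Ack", 0, confirm_len), ("DataAck", payload_len, 0)]

MPS = 1424  # hypothetical value, chosen only for this example

print(plan_transmission(1000, 20, MPS))  # one DataAck carrying the Confirms
print(plan_transmission(1420, 20, MPS))  # extra Ack, then a bare DataAck
```

With a 1420-byte payload and 20 bytes of options, the sketch produces the extra Ack followed by a DataAck without Confirm options.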
The special case is demonstrated in the following screenshot; the capture file
shows more details.
The first Ack carrying Confirm
options is #412, answering Response #428. Until the server-Ack #429
arrives (packet 7), the
client remains in PARTOPEN. The DataAck #414
has too little room for the 20 bytes of feature-negotiation options
(due to a payload size of 1420
bytes), hence the client sends the
second Ack, #413, to carry the Confirm options.
The Ack #429 by the server is a cumulative
acknowledgment with an Ack Vector of run length 2, i.e. down to
#413. Since Ack Vectors are relative only to OPEN state, this shows
that the server entered OPEN state as expected when receiving Ack #412.
So this second Ack saved the day for the connection. Had it not been
used, packet 5 would have been dropped due to over-length. In CCID-3
this causes ugly throughput reduction due to packet loss.
On the other hand, if Ack #412 had been lost, the second Ack #413 would
have allowed the server to enter OPEN and carry on as if nothing had
happened. If the second Ack were lost too, it may be time to get a cup
of coffee and try to dial in again...