Partial Packets in NIO ?

Endemoniada · March 24, 2005, 2:18pm

Hi guys,

How do you normally handle partial packets ?

Here is some pseudo-code:

if(key.isReadable()){
…
client.read(buffer);
…
// check if it’s a full packet here ?
}

Should I be using something like XML tags, for example:

the stuff he sent

Then I can check it that way. And is it possible that the server can receive these kinds of packets:

// packet 01
this stuff
// packet 02
wasn’t complete
// packet 03
smallstart of next
// packet 04
packet

See what I mean ?

I’m pretty new to network programming but I’m a pro at C(++) so I’ll get the hang of this stuff eventually.

Thanks.

vrm · March 25, 2005, 6:53am

I use another trick: the first 4 bytes of the packet are an integer representing the total size of the packet, so I know how many bytes I need to wait for.
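
A minimal sketch of the length-prefix idea described above, not vrm's actual code. The class name `LengthPrefixedDecoder`, the fixed-capacity holding buffer, and the choice to let the 4-byte integer count only the payload bytes (not the header itself) are all assumptions made for illustration. The point it shows is that leftover bytes from one read are kept until the rest of the frame arrives.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Collects bytes from successive reads and hands back complete length-prefixed frames. */
final class LengthPrefixedDecoder {
    private static final int HEADER_BYTES = 4; // assumption: big-endian int, payload length only

    // Holds leftover bytes between calls; always left in "write mode".
    private final ByteBuffer pending;

    LengthPrefixedDecoder(int capacity) {
        pending = ByteBuffer.allocate(capacity);
    }

    /** Feeds the bytes one read() produced and returns every frame that is now complete. */
    List<byte[]> feed(ByteBuffer chunk) {
        pending.put(chunk);            // BufferOverflowException if capacity is exceeded
        pending.flip();                // switch to read mode
        List<byte[]> frames = new ArrayList<>();
        while (pending.remaining() >= HEADER_BYTES) {
            // Peek at the length with an absolute get so nothing is consumed yet.
            int length = pending.getInt(pending.position());
            if (length < 0 || length > pending.capacity() - HEADER_BYTES) {
                throw new IllegalStateException("bad frame length: " + length);
            }
            if (pending.remaining() < HEADER_BYTES + length) {
                break;                 // partial frame: wait for the next read
            }
            pending.getInt();          // now consume the header
            byte[] payload = new byte[length];
            pending.get(payload);
            frames.add(payload);
        }
        pending.compact();             // keep any partial frame, back to write mode
        return frames;
    }

    public static void main(String[] args) {
        // Two frames ("hello", "hi") arriving split at an arbitrary byte boundary.
        ByteBuffer wire = ByteBuffer.allocate(32);
        wire.putInt(5).put("hello".getBytes(StandardCharsets.US_ASCII));
        wire.putInt(2).put("hi".getBytes(StandardCharsets.US_ASCII));
        byte[] bytes = new byte[wire.position()];
        wire.flip();
        wire.get(bytes);

        LengthPrefixedDecoder decoder = new LengthPrefixedDecoder(64);
        System.out.println(decoder.feed(ByteBuffer.wrap(bytes, 0, 6)).size());                // 0
        System.out.println(decoder.feed(ByteBuffer.wrap(bytes, 6, bytes.length - 6)).size()); // 2
    }
}
```

In a selector loop like the pseudo-code at the top of the thread, the buffer filled by `client.read(buffer)` would be flipped, passed to `feed`, and then cleared for the next read.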

endolf · March 26, 2005, 5:37pm

In my case I’m using simple messaging (enough for what I need): I use the first 2 bytes as a message type code and the next two as the data length, then I read bytes off the stream until I’ve got that many and process them.
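
A rough sketch of that type-plus-length header, not endolf's actual code. The names `TypedMessage` and `tryDecode` are hypothetical, and it assumes both fields are unsigned 16-bit values in big-endian order (the `ByteBuffer` default) with the length counting only the data bytes.

```java
import java.nio.ByteBuffer;

/** One decoded message: a 16-bit type code plus its data bytes. */
final class TypedMessage {
    private static final int HEADER_BYTES = 4; // 2 bytes type + 2 bytes length

    final int type;
    final byte[] data;

    TypedMessage(int type, byte[] data) {
        this.type = type;
        this.data = data;
    }

    /**
     * Tries to decode one message from buf, which must be in read mode.
     * Returns null and leaves the position untouched if the message is incomplete.
     */
    static TypedMessage tryDecode(ByteBuffer buf) {
        if (buf.remaining() < HEADER_BYTES) {
            return null;
        }
        int start = buf.position();
        int type = buf.getShort(start) & 0xFFFF;       // absolute gets: nothing consumed yet
        int length = buf.getShort(start + 2) & 0xFFFF;
        if (buf.remaining() < HEADER_BYTES + length) {
            return null;                               // wait for more bytes
        }
        buf.position(start + HEADER_BYTES);
        byte[] data = new byte[length];
        buf.get(data);
        return new TypedMessage(type, data);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putShort((short) 7).putShort((short) 3).put(new byte[] {1, 2, 3});
        buf.flip();
        TypedMessage message = tryDecode(buf);
        System.out.println(message.type + " " + message.data.length); // 7 3
    }
}
```

Because an incomplete message leaves the buffer position alone, the caller can simply `compact()` the buffer and try again after the next read.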

Jeff · April 16, 2005, 1:44am

This isn’t really an NIO issue, it’s a TCP/IP issue. TCP is a stream protocol, not a packet protocol, and as such doesn’t necessarily return all the bits pushed into the stream at the same rate out the other end. UDP is just unreliable and therefore can split up or break packets.

While there are some knobs on TCP/IP you can twiddle and some assumptions you can make for small UDP packets, the general rule is that you need to implement your own packet protocol on top of either.

This is what the guys were talking about with size headers: they built simple protocols that let them understand what they were receiving on the far end as discrete packets.

As I do packet communication in my project, the bottom-most layer of my app’s communication stack is similar: an integer size precedes each packet of data. NIO channels provide gathering write methods that allow you to build up protocol layers without having to copy data.
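
A small sketch of a gathering write, assuming a 4-byte size header as discussed earlier in the thread. The `GatheringSend` class and `writeFrame` method are hypothetical names, and a `Pipe` stands in for a socket so the example runs on its own. `GatheringByteChannel.write(ByteBuffer[])` is the real NIO call; it sends the header and payload buffers in sequence without first copying them into one combined buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

final class GatheringSend {
    /** Writes a 4-byte length header followed by the payload using gathering writes. */
    static void writeFrame(GatheringByteChannel channel, ByteBuffer payload) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(4);
        header.putInt(payload.remaining());
        header.flip();
        ByteBuffer[] parts = { header, payload };
        // A write may stop short, so repeat until both buffers are drained.
        // Assumes a blocking channel; a non-blocking one should wait on OP_WRITE instead.
        while (header.hasRemaining() || payload.hasRemaining()) {
            channel.write(parts);
        }
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open(); // stands in for a socket in this sketch
        writeFrame(pipe.sink(), ByteBuffer.wrap("abc".getBytes(StandardCharsets.US_ASCII)));
        ByteBuffer in = ByteBuffer.allocate(7);
        while (in.hasRemaining()) {
            pipe.source().read(in);
        }
        in.flip();
        System.out.println(in.getInt()); // 3
        pipe.sink().close();
        pipe.source().close();
    }
}
```

`SocketChannel` also implements `GatheringByteChannel`, so the same `writeFrame` call would accept a connected socket.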

divzero · April 17, 2005, 8:11am

Jeff, I thought UDP did provide a pretty good (albeit unreliable) packet transmission protocol already. The only reason I can see for creating your own packet protocol over the top would be if your datagrams were dangerously large (i.e. a larger chance of getting lost) or if you needed some sort of reliability, in which case you start to lose the benefits of UDP to begin with.

Will.

Jeff · April 18, 2005, 4:06am

[quote]Jeff, I thought UDP did provide a pretty good (albeit unreliable) packet transmission protocol already,
[/quote]
I believe as part of that unreliability UDP gives you no guarantee that a packet will arrive in one piece (or even that all of it will arrive at all), but I’ll double check as it’s been a bit…

Jeff · April 18, 2005, 4:13am

Okay, nice summary of UDP from here…

http://www-net.cs.umass.edu/kurose/transport/UDP.html

UDP, defined in [RFC 768], does just about as little as a transport protocol can. Aside from the multiplexing/demultiplexing function and some light error checking, it adds nothing to IP. In fact, if the application developer chooses UDP instead of TCP, then the application is talking almost directly with IP.

UDP takes messages from application process, attaches source and destination port number fields for the multiplexing/demultiplexing service, adds two other fields of minor importance, and passes the resulting “segment” to the network layer. The network layer encapsulates the segment into an IP datagram and then makes a best-effort attempt to deliver the segment to the receiving host.

If the segment arrives at the receiving host, UDP uses the port numbers and the IP source and destination addresses to deliver the data in the segment to the correct application process.

So the real question is what, if any, guarantees does IP give you.

Succinct definition of IP from here:

http://www.freesoft.org/CIE/Topics/79.htm

IP is the Internet’s most basic protocol. In order to function in a TCP/IP network, a network segment’s only requirement is to forward IP packets. In fact, a TCP/IP network can be defined as a communication medium that can transport IP packets. Almost all other TCP/IP functions are constructed by layering atop IP.

IP is documented in RFC 791, and IP broadcasting procedures are discussed in RFC 919. The Encyclopedia’s Programmed Instruction Course includes an IP Section.

IP is a datagram-oriented protocol, treating each packet independently. This means each packet must contain complete addressing information. Also, IP makes no attempt to determine if packets reach their destination or to take corrective action if they do not. Nor does IP checksum the contents of a packet, only the IP header.

IP provides several services:
* Addressing. IP headers contain 32-bit addresses which identify the sending and receiving hosts. These addresses are used by intermediate routers to select a path through the network for the packet.

* Fragmentation. IP packets may be split, or fragmented, into smaller packets. This permits a large packet to travel across a network which can only handle smaller packets. IP fragments and reassembles packets transparently.

* Packet timeouts. Each IP packet contains a Time To Live (TTL) field, which is decremented every time a router handles the packet. If TTL reaches zero, the packet is discarded, preventing packets from running in circles forever and flooding a network.

* Type of Service. IP supports traffic prioritization by allowing packets to be labeled with an abstract type of service.

* Options. IP provides several optional features, allowing a packet's sender to set requirements on the path it takes through the network (source routing), trace the route a packet takes (record route), and label packets with security features.

So by my reading of these sources, which seem well researched with proper references back to the RFCs, I was correct and UDP indeed CAN both fragment and garble data.

endolf · April 18, 2005, 4:41am

Hi

I think the important bits to note are that fragmentation and reassembly can take place without you knowing, and there is no checksum on the data. The other parts of UDP (no reliability of packet arrival, no resending at protocol level etc.) are known. This means that your data has a reasonable chance of not only not appearing at the remote side, but of being a little on the screwy side. Just because your packet arrives doesn’t mean it’s the same size or contents as the one you sent. Time to stick a checksum byte at the front of your packets, peeps.

Endolf

kevglass · April 18, 2005, 5:30am

While IP does fragment, that doesn’t affect you since according to that quote above:

As to checksums, assuming Ethernet (which of course we can’t) you’ve got a CRC. I believe ATM, the only other protocol I know of that is widely used on the internet, also uses a CRC (sometimes referred to as a CFF).

The UDP header (http://www.networksorcery.com/enp/protocol/udp.htm), like the TCP header, includes a checksum covering a pseudo header taken from the IP header, the UDP header and the data. Having written code to generate both the UDP and TCP checksums by hand in my previous job, I’m scarred by this memory.

So, if you send a UDP packet, at the other end you’ll only receive it if your IP packet has been reassembled and the data was correct (well, as much as a 16-bit checksum can give you).

Kev

Jeff · April 18, 2005, 4:19pm

[quote]While IP does fragment, that doesn’t affect you since according to that quote above:
[/quote]
You are correct and I stand corrected.
It was late and I missed that. I do want to do more research though, because I don’t entirely believe it. I’m pretty sure, for instance, that if you go over the MTU you can see split packets. If not, then you would never see splits in TCP, because TCP is built on top of UDP, but you do.

Yup, we can’t. That’s the very definition of “internet”: you don’t know the exact transports between you and the other end.

[quote] I believe ATM, the only other protocol I know of that is widely used on the internet, also uses a CRC (sometimes referred to as a CFF).
[/quote]
Well, when I was in college there was BitNet too. I have no idea what’s out there. If you want to make extra-protocol assumptions you can… but they may suddenly and inexplicably fail, so I don’t like to.

Jeff · April 18, 2005, 4:28pm

Whelp Kev, I give, you’re right… here’s a good high level explanation of reassembly:

Nice IP fragmentation FAQ in general here:

http://www.geocities.com/SiliconValley/Vista/8672/network/ipfrag.html#A1

This raises the issue, though, of why you should ever see partial packets out of a TCP stream. The only reason I can think of is if bandwidth vs. buffering of the stream on the sending end caused it to divide the data in transmission…

divzero · April 19, 2005, 3:39am

Indeed, that was also my understanding: you either read out your datagram in full at the other end (noting that if your datagram is larger than the ByteBuffer you read it into, the remaining bytes will be silently discarded) or it never arrives.
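
A small loopback sketch of that truncation behaviour, assuming the local machine allows UDP over the loopback interface. The `DatagramTruncationDemo` class and `roundTrip` method are hypothetical names. It sends a 10-byte datagram and then a 3-byte one, receiving each into a 4-byte buffer: the first comes back cut to 4 bytes, and the next receive returns the second datagram rather than the lost tail of the first.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

final class DatagramTruncationDemo {
    /** Sends each message as one datagram over loopback and receives it into a bufferSize-byte buffer. */
    static String[] roundTrip(int bufferSize, String... messages) throws IOException {
        try (DatagramChannel receiver = DatagramChannel.open();
             DatagramChannel sender = DatagramChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback address.
            receiver.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            String[] received = new String[messages.length];
            for (int i = 0; i < messages.length; i++) {
                byte[] bytes = messages[i].getBytes(StandardCharsets.US_ASCII);
                sender.send(ByteBuffer.wrap(bytes), receiver.getLocalAddress());
                ByteBuffer small = ByteBuffer.allocate(bufferSize);
                receiver.receive(small);   // blocks until a datagram arrives
                small.flip();
                received[i] = StandardCharsets.US_ASCII.decode(small).toString();
            }
            return received;
        }
    }

    public static void main(String[] args) throws IOException {
        String[] got = roundTrip(4, "ABCDEFGHIJ", "xyz");
        System.out.println(got[0]); // ABCD  (the bytes EFGHIJ are gone)
        System.out.println(got[1]); // xyz   (a new datagram, not the remainder)
    }
}
```

In practice this means the receive buffer should be at least as large as the biggest datagram the protocol can send.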

I suspected this was the case when I started investigating UDP in NIO (really, if it wasn’t the case, UDP would be a lot less useful). However, I found it surprisingly hard to confirm. I read several books, most of which skim over UDP or just provide code samples and don’t mention fundamental concepts like this (Java Network Programming, 3rd Edition was pretty useless in this regard).

I did, however, find an excellent reference which confirmed my assumptions about UDP handling in NIO, the relevant section of which I have quoted for you.

The book is the Java NIO book.

Quoting from section 3.5.4:

[quote]Invoking send( ) sends the content of the given ByteBuffer object, from its current position to its limit, to the destination address and port described by the given SocketAddress object. If the DatagramChannel object is in blocking mode, the invoking thread may sleep until the datagram can be queued for transmission. If the channel is nonblocking, the return value will be either the number of bytes in the byte buffer or 0. Sending datagrams is an all-or-nothing proposition. If the transmit queue does not have sufficient room to hold the entire datagram, then nothing at all is sent.
[/quote]

If you have any NIO questions then I highly recommend that book! I would have saved many hours of searching if that was the first book I found.

Cheers,

Will.

Jeff · April 19, 2005, 3:59am

Cool, I’ll have to add it to my reference library.

Thanks Will!

JK

blahblahblahh · April 19, 2005, 4:47am

Sorry for not replying earlier; very busy at the moment, and have a 1.5 hour conference lecture to give tomorrow (!).

Did the TCP bible have nothing to say on fragmentation?

Yeah, they’re not worth the time of day. To a certain extent, it’s legitimate that they don’t bother, precisely because everyone has a copy of the bible :(.

But, mainly, it’s that the authors don’t tend to know much about the details of their subject, AFAICS :(.

Note that the author apparently hasn’t tried a lot of what he tells you to do (going by the fact that the book tells you to do things that for the first 2 years of publication were wrong or impossible). The book gives up at most points just as it’s about to get into telling you something truly useful and difficult to deduce for yourself :(. So … YMMV. Personally, whilst I find it’s not a bad book, I can’t say it’s particularly good either - I was disappointed. Shrug.

dsellars · April 19, 2005, 6:13am

I know this is a bit out of context, but I’m pretty sure that this isn’t correct:

[quote]TCP is built on top of UDP
[/quote]
The only thing they have in common is that they are both transport protocols.

Dan.

kevglass · April 19, 2005, 6:19am

I’m sure he meant IP.

The segmentation you see at TCP isn’t the same as the segmentation at IP. TCP segments are segments of the buffer that is being transmitted. When TCP sends out its segments via IP they could be resegmented, but IP would be responsible for reassembling the segments before passing them up to TCP for it to perform its own resegmentation.

Cooooool

Kev

endolf · April 19, 2005, 5:41pm

Did I miss the point of Kev’s post when it comes to checksumming? I know that the header is checksummed, but what about the data? If the packet gets fragmented, then it’s either rebuilt totally (fragment count wise) or not at all, but what about corruption during transmission?

There is the possibility of this: either part of a fragment’s data or, when not fragmented, the data of the packet might get corrupted. So how does this show that we don’t need checksumming?

Cheers

Endolf

kevglass · April 19, 2005, 5:43pm

UDP performs a checksum on the DATA.

Kev

endolf · April 19, 2005, 5:45pm

Re-read RFC 768. I must have misread it, as I thought it was just the header, but …

[quote]Checksum is the 16-bit one’s complement of the one’s complement sum of a
pseudo header of information from the IP header, the UDP header, and the
data, padded with zero octets at the end (if necessary) to make a
multiple of two octets.
[/quote]
Endolf
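
A short sketch of the arithmetic in that RFC 768 sentence: a one's complement sum over 16-bit big-endian words, padded with a zero octet if the length is odd, and then complemented. The `InternetChecksum` class name is hypothetical, and the sketch assumes the caller has already laid out the pseudo header, UDP header (with its checksum field zeroed) and data in one byte array; it does not build those headers itself.

```java
final class InternetChecksum {
    /** Returns the 16-bit one's complement of the one's complement sum of data. */
    static int checksum(byte[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            int high = data[i] & 0xFF;
            // An odd trailing byte is padded with a zero octet, as the RFC describes.
            int low = (i + 1 < data.length) ? (data[i + 1] & 0xFF) : 0;
            sum += (high << 8) | low;
        }
        // Fold any carries out of the low 16 bits back in (end-around carry).
        while ((sum >>> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >>> 16);
        }
        return (int) (~sum & 0xFFFF);
    }

    public static void main(String[] args) {
        byte[] sample = {0x00, 0x01, (byte) 0xf2, 0x03, (byte) 0xf4, (byte) 0xf5, (byte) 0xf6, (byte) 0xf7};
        System.out.println(Integer.toHexString(checksum(sample))); // 220d
    }
}
```

A receiver can run the same function over the data with the transmitted checksum included; an undamaged packet yields 0. RFC 768 additionally says a computed checksum of zero is transmitted as all ones, which this sketch leaves to the caller.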

divzero · April 19, 2005, 11:05pm

[quote]Sorry for not replying earlier; very busy at the moment, and have a 1.5 hour conference lecture to give tomorrow (!).

Did the TCP bible have nothing to say on fragmentation?
[/quote]
I don’t have that book, do you highly recommend it?

I think that’s exactly it, they didn’t know much about it but figured they needed to have a chapter on it, so they just cooked up an example from the API without actually understanding it (stupid really, since understanding how it works is exactly why the reader is digging deeper and reading the book to begin with)

You may be right. I have found what I have read so far very useful, especially the paragraphs I quoted. That’s not to say it is the ultimate NIO reference, just that it answered beautifully a question I had searched quite hard to answer/confirm. My comments on both books mentioned were purely related to their handling of DatagramChannel.

The section on Selectors in Java NIO also looks very good, I will soon be tackling that.

Cheers,

Will.