NIO problem sending data when multiple requests in the queue

zidsal · March 15, 2014, 12:24pm

I have my networking code set up so if we wish to send data we simply add it to the queue and mark the selectionkey for OP_WRITE

	public void send( byte[] message) throws IOException{
		if(this.socketChannel == null){
			throw new SocketException("Socket connection has closed!");
		}
		
		messages.add(message);
		selectionKey = selectionKey.interestOps(SelectionKey.OP_WRITE);
	}

in my handleWrite() method I simply loop over all the data we find in the queue and send it

	public void handleWrite(SelectionKey key) throws IOException
	{
		this.socketChannel = (SocketChannel) key.channel();
		
                //error checking removed to conserve space
		
		while (!messages.isEmpty()) {
			writeBuffer = ByteBuffer.wrap(messages.peek());
			
			if(socketChannel.write(writeBuffer) == 0){
				break;
			}
		     
			if (writeBuffer.remaining() > 0) {
				break;
			}
			writeBuffer.compact();
			messages.remove(); //everything went well so remove head
		}
		
		selectionKey.interestOps(SelectionKey.OP_READ);		
	}

in my read method I simply take the bytes I get and transform it into a packet object

	public Packet handleRead(SelectionKey key, PacketFactory factory) throws IOException{
		
		this.socketChannel = (SocketChannel) key.channel();
		
           //error checking removed to conserve space
		
		int bytesRead = 0;
		readBuffer.clear();
		
		try {
			bytesRead = this.socketChannel.read(readBuffer);
			readBuffer.flip();
		} catch (IOException e) {
			log.message("connection closed at: " + this.socketChannel.getRemoteAddress(), Log.INFO);
			key.cancel();
			close();
		}
		
      //error checking removed to conserve space
		byte[] data = readBuffer.array();
	
		return factory.buildPacket(transformer.getType(data), transformer.getData(data));
	}

my problem arises from the fact of when there are mutliple messages in the queue. So lets say I have requested to send a packet 3 times [3]:FOO:[3]:BAR, [3]:FOO:[3]:BAR, [3]:FOO:[3]:BAR. I would expect my read to be called 3 times and return 3 FOO packets with each one having the arguments BAR. Instead it returns 1 big packet FOO with the arguments BAR,FOO,BAR,FOO,BAR.

whats the best way to split this up? Have I done something wrong with my reading and writing or is this behavour to be expected? should I instead change my protocol so packets have a termination character so we know when a new packet begins?

TeamworkGuy2 · March 15, 2014, 2:32pm

Looks like you’ve got a good understanding of how socket channels work. Took me weeks to understand them
Yes, if you use TCP (Socket, SocketChannel) the underlying OS/socket protocol stack will arbitrarily decide when to send a packet, you can somewhat affect this by calling flush if you are using a Socket, but I’d have to do some research.
The JavaDoc for OutputStream’s flush() method mentions:

However, this doesn’t really apply to SocketChannel’s read() or write() method. So I’m not sure that there are any guarantees about the behavior of either method.

If you use a DatagramSocket or DatagramChannel and keep the data you write under the maximum UDP transmission size, then the write methods will almost always send one packet per each write() method call, although it may vary from OS to OS.

So yes, the behavior you are describing is expected. One way around it is to use a special marker byte/character between packets.
However, if the data in your packets might contain that special marker, then you will need to use an escape marker for your special marker to differentiate between the end of a packet and a marker that happens to be data in the packet.

Another solution is to include the length of the packet at the beginning of each packet, this way you don’t need any special markers or escape markers. Just read the length whenever you receive data and keep reading until number-of-bytes-read == length, then read the next packet’s length and so on.

Riven · March 15, 2014, 5:13pm

Additionally, no matter how packets leave a computer, the routers and switches in between are at liberty to split/merge packets as they see fit. Never make any assumptions on packet boundaries.

zidsal · March 15, 2014, 5:59pm

ok thanks for the details guys. I’ve decided to add a termination character to the end of each packet so I can detect if I have multiple packets being sent to the read.

jonjava · March 15, 2014, 7:26pm

You could also additionally or instead put the length of the packet at the start so you know when the packet has been read fully and prepare for the next packet. This has the benefit of you knowing when to stop reading and e.g. check if the packet length is something ridiculously large and skip it.

Riven · March 15, 2014, 9:06pm

you can’t skip packets without reading them…

jonjava · March 15, 2014, 9:33pm

Yes, you’re right. But you can avoid allocating memory for erroneous packets. I guess it makes more sense if you have some sort of packet identifier.

Riven · March 15, 2014, 10:23pm

Upon my recommendation (claim to fame!), Nate used chucked encoding in Kryo. This allows streaming of data without knowing the packet length ahead of time, and equally reading back the stream without holding it in memory.

With chunked encoding you basically say that the current message has N more bytes, and after N bytes you send the size of the next chunck. A chuck size of 0 signals the end of the message.