Riven, I was thinking a lot about what you said about poor cache hits, etc. What you said about how the server JVM eliminates the array-buffer bottleneck also got me thinking. For a while, the only way I could see direct ByteBuffer access working was if every connection had its own ByteBuffer. Of course, in a highly concurrent situation, that would cause even worse cache performance than my array-copy system.
Here’s an example of why I thought of having one ByteBuffer per client… Suppose a client sends a few integers and the server is expecting integers. What if a packet’s payload isn’t a multiple of 4? That means there will be up to 3 bytes left over that the server program can’t use until the rest of that int arrives. So you’d think that to keep track of this, each connection on the server would need its own buffer.
I spent ages trying to find a solution that involved storing the leftover data somewhere, then retrieving it and appending the new ByteBuffer data onto it. Then I realised that instead of complicating SocketManager even more, I could pass this responsibility to the implementor of HostListener. I did this by removing hostDataAvailable() and adding hostDataFound() and hostDataBuffered(). hostDataFound() is called after an OP_READ event is triggered and inputBuffer (a ByteBuffer) has been cleared. In it, the listener can put back any leftover data from the previous invocation of hostDataBuffered() (such as those pesky 3 bytes when ints are wanted). Once this method returns, the selector loop reads from the channel into inputBuffer, then calls the listener’s hostDataBuffered() method if zero or more bytes were read; if the read returns -1 (end of stream), the host is disconnected instead.
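To make the order of calls concrete, here is a rough sketch of one OP_READ step. The parameter lists, the ReadStep class, the 4096-byte buffer size and the flip() before hostDataBuffered() are all my assumptions for illustration; only the two method names and the clear / callback / read / callback order come from the description above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical signatures: only the two method names are taken from the post.
interface HostListener {
    // Called right after inputBuffer is cleared; may put() leftover bytes into it.
    void hostDataFound(ByteBuffer inputBuffer);

    // Called after the channel read; here the buffer is assumed to be flipped already.
    void hostDataBuffered(ByteBuffer inputBuffer);
}

// Hypothetical stand-in for the read branch of the selector loop.
final class ReadStep {
    // One buffer shared by every connection (size chosen arbitrarily).
    private final ByteBuffer inputBuffer = ByteBuffer.allocate(4096);

    // Handles one OP_READ event. Returns false if the host should be disconnected.
    boolean handleRead(ReadableByteChannel channel, HostListener listener) {
        inputBuffer.clear();
        listener.hostDataFound(inputBuffer);   // listener restores its leftovers first
        int bytesRead;
        try {
            bytesRead = channel.read(inputBuffer);   // new data lands after the leftovers
        } catch (IOException e) {
            return false;                      // treat an I/O error as a disconnect
        }
        if (bytesRead < 0) {
            return false;                      // -1 means end of stream
        }
        inputBuffer.flip();                    // switch the buffer from writing to reading
        listener.hostDataBuffered(inputBuffer);
        return true;
    }
}
```

Because the leftovers are copied back in before the read, the shared buffer never has to remember anything between events; each listener carries its own few bytes of state.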
I think a very simple way of parsing in hostDataBuffered() would be to put everything inside one big try block and catch BufferUnderflowException. The catch clause would then copy the remaining bytes out of the buffer and hold on to them, so they can be put back into the buffer on the next hostDataFound().
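Something like the following is what I have in mind for a host that only sends ints. It is just a sketch: IntListener, the received list and the 3-byte leftover array are made up for the example, and it assumes the buffer has already been flipped when hostDataBuffered() is called.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener for a host that sends a plain stream of 4-byte ints.
final class IntListener {
    // An int is 4 bytes, so at most 3 bytes can be stranded between reads.
    private final byte[] leftover = new byte[3];
    private int leftoverLength = 0;

    // Every int parsed so far, kept only so the example has something to show.
    final List<Integer> received = new ArrayList<>();

    // Put the stranded bytes back at the front of the freshly cleared buffer.
    void hostDataFound(ByteBuffer inputBuffer) {
        inputBuffer.put(leftover, 0, leftoverLength);
        leftoverLength = 0;
    }

    // Parse as many ints as possible, then save whatever is left.
    void hostDataBuffered(ByteBuffer inputBuffer) {
        try {
            while (true) {
                received.add(inputBuffer.getInt());
            }
        } catch (BufferUnderflowException e) {
            // getInt() throws without consuming anything when fewer than
            // 4 bytes remain, so remaining() is exactly the leftover count.
            leftoverLength = inputBuffer.remaining();
            inputBuffer.get(leftover, 0, leftoverLength);
        }
    }
}
```

One caveat: if a message is made of several fields, an underflow halfway through means the earlier fields have already been consumed. Calling mark() before each message and reset() in the catch clause should cover that, though I haven't tried it here.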
Hopefully this is the optimal solution.
I haven’t yet modified the output subsystem to take ByteBuffers instead of arrays. I don’t yet know how feasible this would be. When I come to a decision and make the necessary modifications (if any), I’ll post the updated package and documentation.