Spinoff project: Advanced low-level Network API

Yes, Yet Another Network Library, but thanks for taking the time to read this post :wink:

I often found myself needing a network API that was based on NIO, had decent request-response support, and allowed non-blocking packets over the same channel. Basically multi-threaded support, where multiple threads can request-and-wait-for-response on the same socket-channel.

After a bit of head-scratching I decided to give it a go. The idea was to support the above features, but first I needed a basic framework to handle NIO easily: being able to guarantee that a sent packet of N bytes is received as a packet of N bytes at the other side, with minimal overhead (currently 1-4 bytes per packet, depending on the number of bytes to be sent). Trivial stuff so far; this is the base framework and can be found in the craterstudio.network package.

In the sync layer (the layer above the basic framework), the advanced features are implemented. You can do a request->response from any thread. Other network frameworks tend to abstract away the whole I/O, but for this to be useful to other people, it had to stay close to the metal. So we are still sending ByteBuffers and receiving ByteBuffers; only the actual client<->server communication and synchronization are hidden. It can be found in the craterstudio.network.sync package. I could write quite a lot about it, but a tiny code example will probably explain it better. :slight_smile:

Network API: base example code
Network API: sync example code

The javadoc should be self-explanatory after reading the example code.

Network API: javadoc

If you are interested, please give it a try!

Network API: libs

(compiled with Java 1.5)

I think it needs some clarification.

The API was designed for a situation where the client needs to stay synchronized with the state of the server. The client also sends updates to the server, which get ‘multicasted’ (just TCP) to all connected clients (after validation, of course). This applies to most game servers.

So we have a relatively dumb server that processes all incoming packets in one thread, in order. It sends updates to all clients whenever one client changes the state of the server. So server-side we have a trivial situation.

Client-side is a whole different story. You want to send an update and wait for the server’s response, but meanwhile the server could have sent update packets that really need to be processed before the response (for which we block) is processed; otherwise we can get horribly out of sync with the server (processing updates in the wrong order, likely resulting in corrupt data). So another thread should handle all packets that are not the ‘response’ to the ‘request’ we were waiting for. This thread must be synchronized with the blocking thread, as we do not want to process incoming packets concurrently (which would also likely result in corrupt data). Last but not least, this multi-threaded setup must process the packets in exactly the right order.

When more than one thread is blocking for a response, and the server also sends ‘randomly’ timed updates, complexity spirals out of control. Quick hacks won’t get things working correctly anymore.

Let’s look at the following case:


Line 0: SyncPacket responseA = requestResponse(ByteBuffer requestA);
Line 1:     ByteBuffer dataA = responseA.data();
Line 2:     ...;
Line 3: responseA.done();
Line 4: SyncPacket responseB = requestResponse(ByteBuffer requestB);
Line 5:     ByteBuffer dataB = responseB.data();
Line 6:     ...;
Line 7: responseB.done();

The server:
Event 1. received the first request, sends a response
Event 2. received another packet (from another client) and multicasts an update
Event 3. received the second request, sends a response

The client must be able to handle the update (Event 2) between line 3 and line 4, because the update on the server happened before request #2, so the client has to process it in the same order. Things start to get difficult: imagine N threads blocking for a response while the server also sends updates around. I hope the problem is clear, proving the usefulness of my framework: it is all handled under the covers, in order and synchronized (never more than one thread processing a packet). The protocol handles up to 255 concurrent requests per socket-channel, which should be more than enough.
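For illustration, the 255-request limit suggests a one-byte request ID per packet. A sketch of how responses could be routed back to the blocked thread while async updates are handed out in arrival order (class and method names are mine, not the library’s; this is just my reading of the description):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demultiplexer: each blocking request claims a 1-byte ID;
// the single reader thread routes a response to the matching waiter and
// hands everything else, in arrival order, to the async-packet queue.
public final class ResponseRouter
{
    @SuppressWarnings("unchecked")
    private final BlockingQueue<ByteBuffer>[] pending = new BlockingQueue[256];
    private final BlockingQueue<ByteBuffer> updates = new ArrayBlockingQueue<ByteBuffer>(1024);
    private final AtomicInteger nextId = new AtomicInteger();

    // called by a requesting thread: register, send, then block for the reply
    public ByteBuffer requestResponse(ByteBuffer request) throws InterruptedException
    {
        int id = nextId.getAndIncrement() & 0xFF; // wraps around at 255
        BlockingQueue<ByteBuffer> slot = new ArrayBlockingQueue<ByteBuffer>(1);
        pending[id] = slot;
        send(id, request);     // prepend the ID and write to the channel
        return slot.take();    // block until the reader routes the reply here
    }

    // called only by the single reader thread, in packet-arrival order
    public void onPacket(int id, boolean isResponse, ByteBuffer packet) throws InterruptedException
    {
        if (isResponse && pending[id] != null)
        {
            pending[id].put(packet); // wake exactly the thread that asked
            pending[id] = null;
        }
        else
        {
            updates.put(packet);     // async updates keep their order
        }
    }

    // called by the update-processing thread
    public ByteBuffer takeAsyncPacket() throws InterruptedException
    {
        return updates.take();
    }

    private void send(int id, ByteBuffer request) { /* write to the channel */ }
}
```

Note that in-order processing still requires that responses and updates are consumed by threads that never run handlers concurrently, which is exactly what the sync layer guarantees.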

If you don’t need the advanced features, the base framework is also very handy, abstracting NIO nicely while keeping you in control.

Hi,
Thanks for giving us this code; the low-level ByteBuffer send/receive looks like what I’ve been looking for.

I’d give this library a go if I hadn’t already committed myself to endolf & kev’s New Dawn NIO library. Something you might consider adding though: as an alternative to listeners, add a simple receive() method that returns a ByteBuffer if a whole message is available, or null otherwise, and throws an exception if the connection is dead.

Cheers,
Keith

It was like that in the original API. Then I decided to make it event-driven, resulting in:

ClientListener:

public void clientReadable(Client client)
{
   ByteBuffer packet = client.receive(); // may be null
}

I thought the null-check was just annoying, so I only fired an event when a full packet was received:

ClientListener:

public void clientReceivedPacket(Client client, ByteBuffer packet)
{
   // packet is never null
}

The Client.receive() method is still there, but it is called automatically when the socket-channel becomes readable (OP_READ) and has package-scope access. Making this method public would break the sync-layer code, as it relies on the ClientListener and would miss packets if user code intercepted them.

Maybe I could add another layer (on top of the base layer, not the sync layer) that queues the received packets.

ByteBuffer packet = DirectLayer.receive(Client client); // returns first received packet, queued from ClientListener.clientReceivedPacket
ByteBuffer packet = DirectLayer.sent(Client client); // returns first sent packet, queued from ClientListener.clientSentPacket

But that would be a layer enabling a feature on top of a layer that already supports it but hides it. Very easy to code, but it smells a bit :wink:

I’ll take a look at the New Dawn API.

Regarding the New Dawn API, it looks very much like my base-layer, yet closer to the metal. It has a fixed overhead of 4 bytes (2x short) per packet, and supports packets of ‘only’ 32k.

My API has these overheads:

[table]
[tr][td]Packets up to 63 bytes[/td][td]1 byte overhead[/td][/tr]
[tr][td]Packets up to 16,383 bytes[/td][td]2 bytes overhead[/td][/tr]
[tr][td]Packets up to 4,194,303 bytes[/td][td]3 bytes overhead[/td][/tr]
[tr][td]Packets up to 1,073,741,823 bytes[/td][td]4 bytes overhead[/td][/tr]
[/table]
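Those thresholds (63, 16383, 4194303, 1073741823) are consistent with a header that stores the byte-count of the length in the top 2 bits of the first header byte, leaving 6 + 8×(n−1) bits for the length itself. This is my reconstruction from the numbers, not necessarily the actual wire format:

```java
import java.nio.ByteBuffer;

// Hypothetical length-prefix codec matching the 1..4 byte overhead table:
// the top 2 bits of the first byte hold (headerBytes - 1), the remaining
// 6 + 8*(headerBytes - 1) bits hold the payload length, big-endian.
public final class LengthPrefix
{
    public static void writeLength(ByteBuffer out, int len)
    {
        int extra; // header bytes beyond the first
        if (len <= 63) extra = 0;
        else if (len <= 16383) extra = 1;
        else if (len <= 4194303) extra = 2;
        else if (len <= 1073741823) extra = 3;
        else throw new IllegalArgumentException("packet too large: " + len);

        // first byte: 2-bit size marker + the most significant length bits
        out.put((byte) ((extra << 6) | (len >>> (8 * extra))));
        for (int i = extra - 1; i >= 0; i--)
            out.put((byte) (len >>> (8 * i))); // remaining bytes, big-endian
    }

    public static int readLength(ByteBuffer in)
    {
        int first = in.get() & 0xFF;
        int extra = first >>> 6;  // how many more header bytes follow
        int len = first & 0x3F;   // low 6 bits of the first byte
        for (int i = 0; i < extra; i++)
            len = (len << 8) | (in.get() & 0xFF);
        return len;
    }
}
```

So a 300-byte packet costs 2 header bytes, and anything up to 63 bytes costs just 1.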

Further there aren’t really any advantages to my base layer, besides not having the ‘bug’ you found in the New Dawn API :slight_smile:

For me it was all about the sync layer; the base layer is pretty basic.

Thanks for looking into that. Why is the limit 32k? That’s a bit of a problem for me since my packets are 20k, and could reach 30k+.

I’m just looking for a base layer since I’m putting my SuperSerializable streams on top, which sound like they do something similar to your sync layer. I’ve actually modified the New Dawn code into just two classes, ByteArrayConnection and ByteArrayConnectionServer, which, as their names suggest, just send and receive byte arrays without all the other stuff.

Are you using ByteBuffers for their direct memory mapping? I’m curious, since with my massive 20k+ byte arrays being created and disposed, garbage-collector ticks can be high, but I think I read somewhere that memory-mapped NIO ByteBuffers avoid the GC entirely. Is that right?

In the New Dawn API the limit is 32k because the packet length is encoded as a signed short. As easy as that.

For networking, packet counts (per second) are normally low enough not to cause any GC issues. ByteBuffers are objects too, and in case they are direct buffers, they are even heavier (minimally ~4k (page size) per newly allocated ByteBuffer). It would be better to create a giant ByteBuffer every once in a while and slice() it into chunks, so that you don’t have that excessive memory usage and GC pressure.

Currently all new ByteBuffers created by the framework are heap ByteBuffers (wrapping a byte[]), but I will soon make that controllable through a ByteBufferFactory design.

An efficient direct-ByteBuffer factory would be something like this (note that the free-space check must compare against the untouched tail of the buffer, not remaining(), which would measure the previously handed-out slice):

static ByteBuffer massive = null;

public static final ByteBuffer create(int size)
{
   // space left behind the last handed-out slice
   if (massive == null || massive.capacity() - massive.limit() < size)
   {
      massive = ByteBuffer.allocateDirect(4 * 1024 * 1024);
      massive.position(0);
      massive.limit(0);
   }

   // advance the window [position..limit] past the previous slice
   massive.position(massive.limit());
   massive.limit(massive.position() + size);

   return massive.slice();
}

You need to do a data copy if the number of elements requested is larger than what is remaining, because if you’re just initialising a new buffer, the old contents will be lost. Also, you might want to do ByteBuffer.allocateDirect(massive.capacity() + size) instead of blindly initialising it to a fixed 4 * 1024 * 1024.

DP

Interesting… I remember karmagfa raised this problem in the network forum > http://www.java-gaming.org/forums/index.php?topic=12457.0

Ring buffers were said to be the solution, but nobody came up with how to make one. Sounds like they’d be ideal, since then you wouldn’t be periodically feeding the GC massive ByteBuffers that would have been promoted out of the eden GC space.

Of course not. We are only creating new ByteBuffers; because of the nature of Java, the old contents will never be lost until the GC decides so. We’re not trying to store data in the massive buffer for later use. We only create tiny buffers from it, which are used to read from a channel, passed to the events and processed by the user. It doesn’t matter whether we lose the reference to the massive buffer, as we’re never going to use it anyway; it’s private to the factory.

HTH

I’ve built an RPC (Remote Procedure Call, RMI-like) “layer” on it.

“layer” was a bit of a poor name for what they did, so they were renamed to Contexts.

NetworkContext
|-> SyncContext
|-> RpcContext

By building on top of the sync layer, we still have (up to 255) concurrent blockings on the same channel, so blocking on multiple method-invocation results is supported. The sync layer adds 1 byte of overhead per packet, and the RpcContext adds 3 more, regardless of the length of the package name + class name + method name, enabling access to up to 32k remote methods. Check out the new javadocs for more information!

Async invocation

rpc.invokeAndForget(String methodPath, Client client, ByteBuffer packet)

Request/response invocation

SyncPacket result = rpc.invokeAndReturn(String methodPath, Client client, ByteBuffer packet)

Example:


// remotely
rpc.addAccessibleClass(RpcMath.class);
rpc.addAccessibleClass(YourOwnStuff.class);



// locally
SyncPacket sp = rpc.invokeAndReturn("RpcMath.add()", client, bb);
{
   ByteBuffer data = sp.data();
   System.out.println("sum=" + data.getInt());
}
sp.done();

rpc.invokeAndForget("RpcMath.print()", client, cc);

Invokes on other side of the connection:


package whatever.package;

public class RpcMath
{
   public static final ByteBuffer add(SyncPacket p)
   {
      ByteBuffer data = p.data();
      int a = data.getInt();
      int b = data.getInt();

      ByteBuffer bb = ByteBufferFactory.create(4);
      bb.putInt(a + b);
      bb.flip();

      return bb;
   }

   public static final void print(SyncPacket p)
   {
      ByteBuffer data = p.data();
      int a = data.getInt();
      int b = data.getInt();

      System.out.println("print: a=" + a + ", b=" + b);
   }
}

Judging by the massive and enthusiastic responses ahem ;), everybody has already rolled their own network framework, or can’t see how extremely handy this framework is.

Either way, I think I’m not going to post updates of the released JAR until somebody actually shows some interest.

Yeah, sorry, these things need a LOT of testing to be reliable and worth using, and I’ve already got two that have had years of testing, so … not really interested in a 3rd party one unless its really spectacular and/or being used by a lot of people already :slight_smile:

I thought my framework was pretty spectacular :slight_smile:

How many libraries can say you can do concurrent blocking from any thread, over the same socket-channel :slight_smile: and where the blocking threads are all synced so that they ‘fire’ in the exact order the server sent the packets. With just 1 byte of overhead.

I just can’t stop rambling about it :slight_smile:


Thread A
{
   ByteBuffer result1 = sync.requestResponse(client, ByteBuffer request1);
   ByteBuffer result2 = sync.requestResponse(client, ByteBuffer request2);
}

Thread B
{
   ByteBuffer result1 = sync.requestResponse(client, ByteBuffer request1);
   sync.sendAsyncPacket(client, ByteBuffer something);
   ByteBuffer result2 = sync.requestResponse(client, ByteBuffer request2);
}

Thread C
{
   while(true)
   {
        ByteBuffer async = sync.takeAsyncPacket();
   }
}

The only things missing are exactly those you mentioned: nobody is using it, and it lacks extensive testing, but hey, there are only 4 classes that actually do something :slight_smile:

Well, I think it sounds pretty cool from what I can understand of it. Unfortunately I don’t have a use for a network API right now, nor the time to facilitate the completion of such a project.

If it makes you feel better, I bookmarked this page and I promise (if I remember at the time) to go back and use your API if I ever see a potential need for it. Maybe I’ll try working with it next week, we have spring break then.

I can tell you put some good effort into this and hate it see it go unrecognized. :slight_smile: I sincerely hope someone else sees this who can find a good use for it.

Perhaps you could write a short demo application showing what it does? People like demos. :slight_smile: (and they’re easy to learn from, especially for complete newbies in a field who may overlook something seemingly obvious)

Yeah, like visualizing … bytes ;D

So I tried… here it is: Network API: webstartable Demo

The server is fairly stupid and just echoes what it received from the client. The blocking-behaviour should be pretty clear.

I think I get what your ‘sync’ layer API does now: it lets you do network communication without having to worry about threading.

This really is great; fool-proof threading flexibility is valuable. Especially if you’re sending client mouse & key events to the server (in the AWT event thread) and also sending stuff from the game thread. I overlooked this threading problem just last week in my own game (like all threading problems, it’s obvious now but it wasn’t then).

If I do understand your API properly then developers looking into networking will appreciate what you’ve done. Threading problems in network code are the worst kind because you can never be sure if it was indeed a threading problem, a NIO network library bug, or an asymmetry in the order you wrote and then read from your DataIn/OutputStreams.

It was originally meant as a framework that allowed the server to send more than the client asked for. One thread could be blocking for ages, waiting for a response, while other threads would be sending and receiving packets. When I saw the potential of it (or at least thought I did), I decided to make it a more general tool. The sync layer was meant to be the final destination, but the RPC layer (RMI, but with static methods) was so trivial to put on top of it, and would make networking so easy (no more non-OO, error-prone switch statements on the processing side), that it became the final destination (for now ;)). I don’t even use the sync layer directly anymore, only RPC, at the cost of an extra 3 bytes per packet (soon reduced to 2).

The above demo would look exactly the same for RPC, only then each block would be a method-invocation.

I’m just curious, but what does the RPC layer do and how?

Check out Reply#10 in this thread for the “what” part of your question.

As for the “how”: the local client is given a “method-path” and checks whether it already has a key for it. If not, it requests a key from the remote client, which will return one (or a descriptive error if something is set up incorrectly). Then it uses that key (1 short / 2 bytes) to invoke a method on the other side. If you invoked a method with a return type (ByteBuffer), that will be sent back over the line. If the return type is void, the invocation is non-blocking. This is not configured during invocation, but during the retrieval of the key.
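As I read that description, the client-side key lookup could be cached roughly like this (hypothetical names; the real classes live in the craterstudio packages, and the key-fetch here is only a stand-in for the actual round-trip over the sync layer):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client-side cache for the method-path -> 2-byte key mapping
// described above: the first invocation of a path costs one request/response
// round-trip to fetch the key; every later invocation reuses the cached short.
public final class MethodKeyCache
{
    private final Map<String, Short> keys = new ConcurrentHashMap<String, Short>();

    public short keyFor(String methodPath)
    {
        Short key = keys.get(methodPath);
        if (key == null)
        {
            key = Short.valueOf(fetchKeyFromRemote(methodPath));
            keys.put(methodPath, key); // subsequent calls skip the round-trip
        }
        return key.shortValue();
    }

    // stand-in for the real key request sent to the remote side
    private short fetchKeyFromRemote(String methodPath)
    {
        return (short) (methodPath.hashCode() & 0x7FFF); // placeholder only
    }
}
```

This also explains why the per-packet cost is independent of the length of packagename+classname+methodname: the string only crosses the wire once, during key retrieval.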

ByteBuffer bb = rpc.invokeAndReturn("RpcMath.add()", client, ByteBuffer); // blocking (request->response packet in sync layer)

rpc.invokeAndForget("RpcMath.print()", client, ByteBuffer); // non-blocking (async packet in sync layer)