Java.nio UDP

I've successfully written a class to handle an arbitrary number of TCP connections using java.nio, but I haven't found any good information regarding UDP sockets. Can anyone help with some links, perhaps? Or even better, some code?

I recently had a look and saw DatagramChannel. Is that the way to go? I would guess so; please correct me if I'm wrong so that I don't start coding away on it :slight_smile:

Yep… DatagramChannel is UDP.

I just had a need for this info as well, and while I found DatagramChannel, what I can't find is a Datagram (UDP) version of ServerSocketChannel, i.e. no NIO-based UDP class with an accept() method.

How do you use NIO and UDP on the server side?

I think you’re meant to use

DatagramChannel.open()

Then use the socket() method to obtain the associated DatagramSocket and bind it to the server address. Then go on to use the channel operations as you would normally.

You can connect() the socket if you're only ever going to use it to talk to one peer; otherwise you've got security-check overhead each time a packet is sent.
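e.g. something like this (just a sketch; the host name and port numbers are made up):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

public class ConnectSketch {
    public static void main(String[] args) throws Exception {
        // Unconnected: each send()/receive() can talk to a different peer,
        // but (with a security manager installed) sends may be checked each time.
        DatagramChannel any = DatagramChannel.open();
        any.socket().bind(new InetSocketAddress(4444));
        any.send(ByteBuffer.wrap("hello".getBytes()),
                 new InetSocketAddress("peer.example.com", 5555));

        // Connected: locked to a single peer, checked once at connect() time;
        // you can then use read()/write() instead of receive()/send().
        DatagramChannel one = DatagramChannel.open();
        one.connect(new InetSocketAddress("peer.example.com", 5555));
        one.write(ByteBuffer.wrap("hello".getBytes()));

        one.close();
        any.close();
    }
}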

Kev

I saw that… I didn’t understand it… how would I block waiting for a connection to the socket?

I started with something like this:


InetSocketAddress isa = new InetSocketAddress(InetAddress.getLocalHost(), PORT);
DatagramChannel udpChannel = DatagramChannel.open();
udpChannel.socket().bind(isa);    // bind the channel's socket to the local address

But then I was stuck with how to wait for a connection. For my TCP connection I use the accept() method of ServerSocketChannel.

Hmm… it looks like maybe I’m just totally confused about how UDP works. I’m looking at some examples and they just try to read data… which seems to wait for a connection. Is there no concept of a single connection used for multiple packets with UDP on the server side? The client side seems to work that way…

I’m totally new to UDP - Help!

With UDP there is no client and server as with TCP. You don't connect to the other machine. The connect() method in DatagramChannel does not create a real connection to another machine; it just locks in the address that you can send to. Because the client doesn't connect to the server, the server doesn't accept either. It just reads the packets it receives and checks the address to know where each one came from. You have to keep track of which address represents which sender.

Since there is no client/server, both endpoints of the communication work in the same way: they both send packets to addresses, and they both receive packets.

http://www.district86.k12.il.us/central/activities/computerclub/Tutorials/Winsock/Index.htm
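In NIO terms, a minimal blocking server loop looks something like this (just a sketch: the port and buffer size are arbitrary, and a real server would key per-client state off the sender address):

import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

public class UdpEchoServer {
    public static void main(String[] args) throws Exception {
        // One channel serves all peers; there is no accept() step.
        DatagramChannel channel = DatagramChannel.open();
        channel.socket().bind(new InetSocketAddress(4444));

        ByteBuffer buf = ByteBuffer.allocate(1500);   // roughly one Ethernet MTU
        while (true) {
            buf.clear();
            // Blocks until a datagram arrives; the return value tells you
            // who sent it, and that is your only notion of a "connection".
            SocketAddress sender = channel.receive(buf);
            buf.flip();
            System.out.println(buf.remaining() + " bytes from " + sender);
            channel.send(buf, sender);                // echo it straight back
        }
    }
}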

Thanks. That confirms what I inferred from the small bits of code I could find. I’ll check out the link.

UDP is not reliable like TCP in transporting data, as no connection is established and no action is taken if packets get lost (at this layer). Since establishing the connection in TCP is an expensive operation, UDP is very useful for answering short requests where it isn't as important if the answer gets lost.

For example, UDP is used when you do a lookup on a DNS server. If either packet gets lost then the person looking up the DNS server simply tries again (most software would do this automatically).

Another use would be setting up a way of “pinging” (not ICMP) a server to see if it’s alive.

If you need reliable data transmission, like for playing a networked game, then TCP is the way to go - but if you want a way for the game client to query a game server to see if it's full (for example) then UDP may be more appropriate.
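That kind of fire-and-retry query is straightforward with NIO; here's a sketch (the host, port, payload, timeout and retry count are all made up):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class UdpQuery {
    public static void main(String[] args) throws Exception {
        DatagramChannel ch = DatagramChannel.open();
        ch.configureBlocking(false);
        ch.connect(new InetSocketAddress("game.example.com", 4444));

        Selector selector = Selector.open();
        ch.register(selector, SelectionKey.OP_READ);

        ByteBuffer request = ByteBuffer.wrap("is-server-full?".getBytes());
        ByteBuffer reply = ByteBuffer.allocate(512);

        for (int attempt = 0; attempt < 3; attempt++) {
            ch.write(request.duplicate());        // (re)send the query
            if (selector.select(2000) > 0) {      // wait up to 2 s for an answer
                selector.selectedKeys().clear();
                ch.read(reply);
                reply.flip();
                System.out.println("got " + reply.remaining() + " byte reply");
                return;
            }
        }
        System.out.println("no reply after 3 attempts; giving up");
    }
}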

Hope that helps,

Will.

I know that UDP is not reliable and that means the packets may be lost or arrive out of order, I just wasn’t aware that there was no concept of a ‘connection’ with UDP. I think I have it all straightened out now.

If I need a connection concept I have to make it myself based on the sender address that I can get from the packet. I assume that it is relatively safe to assume (even with firewalls involved) that packets from the same address and port are likely part of the same transmission; that if my program were running on two different machines behind a firewall, the senders' port numbers would be different; and that if I reply to the same address and port, the reply will likely get back to the sender.
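In other words, something like this (just a sketch; Session is a placeholder for whatever per-client state I end up needing):

import java.net.SocketAddress;
import java.util.HashMap;
import java.util.Map;

class SessionTable {
    private final Map<SocketAddress, Session> sessions =
            new HashMap<SocketAddress, Session>();

    // Look up (or create) the "connection" for whoever sent this packet.
    Session sessionFor(SocketAddress sender) {
        Session s = sessions.get(sender);
        if (s == null) {                 // first packet from this address/port
            s = new Session(sender);
            sessions.put(sender, s);
        }
        return s;
    }
}

class Session {
    final SocketAddress peer;            // reply to this address/port
    Session(SocketAddress peer) { this.peer = peer; }
    // ... sequence numbers, timers, transfer state, etc. go here
}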

This isn’t for realtime game data. I’m building a general file transport (think downloading levels/models from the server) and I want to avoid the latency issues involved with acknowledging TCP packets. I have another method for dealing with retransmission and acknowledgement. If it doesn’t work I may end up using FEC.

In my experience with mobile phone operators’ firewalls / network address translators (NATs), a UDP address/port mapping in the NAT may be discarded if no packets are seen for about a minute. If new packets come after that, they get a new address/port mapping. Therefore if your application is going to have such long delays, make sure that the client behind the NAT sends occasional ‘keep-alive’ UDP messages (that the server receives and ignores). In case that doesn’t work (e.g. the keep-alive message gets lost), you can put a ‘session-ID’ at the start of each of your UDP messages, then the server can recognize the ‘UDP connection’ even if the address/port mapping has been changed. (The server should keep track of the most recent sender address/port for each UDP connection, so it can send packets back to the client).
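For example, the client side might look like this (a sketch; the 4-byte session-ID layout, the 30-second interval, and the host/port are made-up examples):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

public class KeepAliveClient {
    public static void main(String[] args) throws Exception {
        int sessionId = 0x12345678;          // agreed with the server beforehand
        DatagramChannel ch = DatagramChannel.open();
        ch.connect(new InetSocketAddress("server.example.com", 4444));

        while (true) {
            // Every message starts with the session ID; a keep-alive is just
            // the ID with no payload, which the server receives and ignores.
            ByteBuffer keepAlive = ByteBuffer.allocate(4);
            keepAlive.putInt(sessionId);
            keepAlive.flip();
            ch.write(keepAlive);
            Thread.sleep(30000);             // well inside the ~1 minute NAT timeout
        }
    }
}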

[quote] I just wasn’t aware that there was no concept of a ‘connection’ with UDP.
[/quote]
I think we need to make this thread:

http://www.java-gaming.org/cgi-bin/JGNetForums/YaBB.cgi?board=Networking;action=display;num=1046872611;start=15

sticky. And / or get a prominent summary of UDP and TCP. For anyone who’s not read it, it does contain some good summaries of UDP and TCP.

Then why are you interested in UDP? I’m assuming you’re trying to build some fantabulous new transport protocol that is more efficient than the standard TCP; if so, can you not just use TCP Vegas?

The general approach should always be to use TCP unless you have a really good reason not to (FYI for anyone without the time to read that other thread…). Really good reasons typically include:

- I’m only sending 30 bytes of data at a time
- I need very very fast round-trip times for messages, e.g. bullets and player positions in an FPS
- This has to work in less than 0.2k/s bandwidth per player, and the overhead of TCP headers is making that difficult to achieve
- If data is more than a few tens of milliseconds late, it’s useless and I’ll delete it anyway (e.g. in IP telephony)

Of course there is NO EXTRA LATENCY (this isn’t quite true, but when talking about file transfer it’s effectively true) with TCP packets if you need all the data; hence all file-transfer protocols use TCP (in general; there are of course exceptions :). e.g. use of UDP is one of the reasons that NFS is useless in the wild). The only situation (other than a need to try and beat Vegas) I can think of off the top of my head where you would not want to use TCP is if your file contents were changing from second to second; that would be pretty interesting.

Somewhere like the wiki perhaps…?

Searches for TCP Vegas yielded only a Linux implementation. Based on the info found in some of the other links, TCP Vegas would not likely yield the same return as a non-TCP-based transfer.

Everything I have read thus far indicates that latency issues slow TCP transfers.


http://www.onionnetworks.com/

[quote]with TCP packets if you need all the data; hence all file-transfer protocols use TCP (in general; there are of course exceptions :). e.g. use of UDP is one of the reasons that NFS is useless in the wild). The only situation (other than a need to try and beat Vegas) I can think of off the top of my head where you would not want to use TCP is if your file contents were changing from second to second; that would be pretty interesting.
[/quote]
Note the exceptions provided in the two links above. There is clear evidence that TCP file transfers are much slower than can be achieved with UDP, unless you have VERY good network conditions. There is also the possibility to use a multicast file distribution mechanism if I used FEC.

According to http://www.internettrafficreport.com/main.htm the packet loss will often be high enough to make a significant difference.

It was only a guess at something that might help :slight_smile: ; I appreciate that Vegas implementations are thin on the ground :(.

Looking into the links you cite, it sounds like Vegas would suit you down to the ground, except where you want to do caching. In general you can do almost all the caching tricks with UDP or TCP.

OK, I’m eagerly trying to see this, but… the second link contains absolutely nothing on this subject (perhaps you could let me know where on that site there is info? I’ve been rooting around, and all I can find is info on their multi-stage cache system, rather ccNUMA-esque…)

The first link seems to be a parallel to Vegas, although again it offers the vast majority of its benefits through multi-stage caching instead. The description of the “meta content technology” is quite scary - it shows either a devastating lack of understanding of how TCP works, or that the company knowingly lies to its clients; not exactly giving a great impression :(.

The only new concepts I can see in the links above are the use of caching. If they cache at the packet level, they could obviate the need for re-transfers - but only at the cost of wasting large amounts of bandwidth by transmitting everything twice in the first place, rather than transmitting once and then re-transmitting what was lost. Otherwise, everything they’re doing is do-able with TCP (and, note, has been done with TCP for many years by people such as GoZilla etc - although GZ never took it all that far IIRC).

Other than that they offer nothing new (apparently - if I’ve missed something else here, please let me know). The first link only claims to improve on TCP’s windowing - and that is PRECISELY what Vegas is all about!

Fundamentally you are transferring data. If you need that data, you have to wait for it to arrive. If it doesn’t arrive, you need to re-transmit it. You can’t magically obviate the need for it.

All the claims of “increase speed by 3 to 300 times” on the links you point to are to do with direct improvements because of caching, not because of the transport layer. I’m having to guess here (since I can’t find details of the companies’ algorithms) but a 100x improvement in speed is quite typical from a system of geographically distributed caches (i.e. independent of transport layer). There are techniques for increasing raw speed compared to TCP, but only with major tradeoffs (like doubling the bandwidth usage as mentioned above). The only techniques that increase speed without tradeoff are things like Vegas, which is just a much better implementation of TCP than the standard implementations.

Again, I can’t seem to find anything on that page that talks about UDP vs TCP for file transfer. Am I being stupid?

[quote]the second link contains absolutely nothing on this subject (perhaps you could let me know where on that site there is info? I’ve been routing around, and all I can find is info on their multi-stage cache system, rather ccNUMA-esque…)
[/quote]
Yeah - it stinks… I included it because it uses similar technology to the first link. The caching aspect is not something I’m interested in… just the idea of a completely ack-less transmission, because the FECs can make the probability of recovering the data quite high even when packets are lost.

[quote]The first link seems to be a parallel to Vegas, although again it offers the vast majority of it’s benefits through multi-stage caching instead.
[/quote]
Not at all. It uses UDP. It does not play with any of the parameters of the transport layer. And, it does not use multi-stage caching. (Assuming I understand what you mean by that.)

[quote]The description of the “meta content technology” is quite scary - it shows either a devastating lack of understanding of how TCP works, or that the company knowingly lies to its clients; not exactly giving a great impression :(.
[/quote]
I don’t understand why you say this. They are a well-respected company that is delivering a real product that produces real results. Those guys know their stuff; they have written various papers and IETF RFCs.

[quote]The only new concepts I can see in the links above are the use of caching. If they cache at the packet level, they could obviate the need for re-transfers - but only at the cost of wasting large amounts of bandwidth by transmitting everything twice in the first place, rather than transmitting once and then re-transmitting what was lost.
[/quote]
Poke around on their site a bit more. You seem to have totally missed what they do. Caching is not a significant part of it. They do not ever need to resend lost data: by adding a slight overhead they completely eliminate the need for retransmission of any sort. The receiver simply waits till it has received enough encoded packets, and then it can rebuild the original data. Typically this means it only needs to receive about as much data as there was in the original file. If it missed a few blocks in the middle it doesn’t matter; it just keeps receiving new data until it has enough. The sender can generate an infinite stream of data such that if you receive ANY X packets from that stream then you can basically reconstruct the data. The receiver only needs to send a single message to say “ok I have enough”.
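To make that concrete, here’s a toy of the same idea (only a sketch, and certainly not their actual algorithm: this naive random-linear code decodes by Gaussian elimination, where real fountain codes use much more efficient encodings). The sender emits an endless stream of packets, each the XOR of a random subset of the K source blocks; the receiver can rebuild the file from roughly any K linearly independent packets, regardless of which ones were lost:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Random;

public class ToyFountain {
    static final int K = 4;           // number of source blocks
    static final int BLOCK = 8;       // bytes per block

    // An encoded packet: which source blocks were XORed together, plus the XOR.
    static final class Packet {
        final BitSet coeff; final byte[] data;
        Packet(BitSet c, byte[] d) { coeff = c; data = d; }
    }

    static Packet encode(byte[][] blocks, Random rng) {
        BitSet coeff = new BitSet(K);
        while (coeff.isEmpty())                        // never send an empty combination
            for (int i = 0; i < K; i++) if (rng.nextBoolean()) coeff.set(i);
        byte[] data = new byte[BLOCK];
        for (int i = coeff.nextSetBit(0); i >= 0; i = coeff.nextSetBit(i + 1))
            xorInto(data, blocks[i]);
        return new Packet(coeff, data);
    }

    // Gaussian elimination over GF(2); returns the blocks, or null if the
    // received packets don't yet span all K dimensions.
    static byte[][] decode(List<Packet> received) {
        BitSet[] rows = new BitSet[K];                 // rows[i] has leading bit i
        byte[][] data = new byte[K][];
        int rank = 0;
        for (Packet p : received) {
            BitSet c = (BitSet) p.coeff.clone();
            byte[] d = p.data.clone();
            for (int i = c.nextSetBit(0); i >= 0; i = c.nextSetBit(i + 1))
                if (rows[i] != null) { c.xor(rows[i]); xorInto(d, data[i]); }
            if (!c.isEmpty()) {                        // new independent equation
                rows[c.nextSetBit(0)] = c;
                data[c.nextSetBit(0)] = d;
                rank++;
            }
        }
        if (rank < K) return null;
        for (int i = K - 1; i >= 0; i--)               // back-substitute
            for (int j = rows[i].nextSetBit(i + 1); j >= 0; j = rows[i].nextSetBit(j + 1)) {
                rows[i].clear(j);
                xorInto(data[i], data[j]);
            }
        return data;
    }

    static void xorInto(byte[] dst, byte[] src) {
        for (int b = 0; b < dst.length; b++) dst[b] ^= src[b];
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        byte[][] blocks = new byte[K][BLOCK];
        for (byte[] b : blocks) rng.nextBytes(b);

        List<Packet> arrived = new ArrayList<Packet>();
        byte[][] out = null;
        int sent = 0;
        while (out == null) {                          // until "ok I have enough"
            Packet p = encode(blocks, rng);
            sent++;
            if (rng.nextInt(100) < 20) continue;       // simulate 20% packet loss
            arrived.add(p);
            out = decode(arrived);
        }
        System.out.println("sent " + sent + ", received " + arrived.size()
                + ", recovered: " + Arrays.deepEquals(blocks, out));
    }
}

In a quick run the receiver typically needs only a little more than K packets, no matter which ones the simulated loss eats.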

Here’s my basic understanding of what is happening:
Because there is no ack or retransmission, the general packet throughput can be significantly higher. TCP packets need to be saved until an ack is received, or there is a timeout and they are resent. This happens at many stops along the route, and since TCP can’t deliver data out of order, lost packets can hold up the pipe for a moment, with a significant impact on the transmission rate. The UDP method has higher throughput because it simply doesn’t need to care whether a packet made it through or not.

[quote]Am I being stupid?
[/quote]
Sort of :slight_smile: Actually I sent you straight to the FAQ with that first link, and you should probably poke around on the main site for a bit to see what they are really about. I have some technical papers about their technology that make things clearer, maybe they are available from their web site.

do you qualify under blah’s really good reasons to use UDP?

So your process is:
TCP is too slow, so you choose UDP.
UDP doesn’t have retransmission of lost packets, so you add that functionality yourself
UDP doesn’t know about the order of packets so you add that functionality yourself
UDP doesn’t do x so you have to add that functionality yourself

By the end of that process it sounds like you will have a re-implementation of the TCP protocol using UDP… Will it still be as fast as UDP? I doubt it. Will it be faster than TCP? Possibly, if you do a really good job of it, but why reinvent the wheel?

Will.

OK, re-reading their spiel, I start to see something similar to CD-ROM encoding, whereby all the data is merged into all the other data (using some funky group theory). This results in every packet containing some of the data of N other packets (where N can be chosen by the protocol designer - larger numbers basically just mean you have to encode with larger groups). Although I may still be missing the point :slight_smile: (I’m currently at home with flu, so not quite top notch at the moment)…

You’re right that this info is there, but I got lulled by the dumbed down marketing BS :(.

I think, from my current understanding, that their “Meta-content technology” is group theory with bells and whistles. I’ve found a big PDF of theirs, and will try and read through it today. In the meantime, I’ll assume it is group theory, because that’s interesting even if it’s wrong :slight_smile:

OK, assuming what I’ve just written above is correct, then again this actually doesn’t do much good without packet-level synchronization between a compute-intensive server and the client. With this synchronization, the server can use a technique similar to Quake3’s net protocol, which keeps track of the packets that the client is known to have received, and alters what is sent based on that knowledge.

This way, the server is actually sending a different encoded file every time a packet is received. The CPU usage is going to be quite high, because it’s like re-compressing the entire remaining file every packet; by today’s CPU standards, that’s very little, but for a large number of simultaneous streams it could (I have no idea; I’ve never benchmarked group-encoding schemes! I mention it merely as an interesting side-effect) get large.

Without that synchronization, the sender is sending LOTS of useless data. In fact, without the synch, the file takes slightly longer to send than with a good TCP implementation, but slightly less time than a standard (poor) TCP imp, because good TCP sends very very little more than the entire file, whereas this technique could send quite a lot more, depending on where the dropped packets are; towards the end, if e.g. you drop the last packet, you will on average have to wait many packets (proportional to the size of the file, and to the number of packets merged into each packet) to reconstitute that data, because each packet contains data for the whole file.

Bear in mind that TCP only has to receive “about as much data as there was in the original file”! It all depends on how good the algorithm is which reacts to dropped packets. Note that network cards have large (typically tens of kb) buffers so that no packet is ever sent in vain - every packet received is NOT resent, and is kept until needed. On extremely high bandwidth connections, it’s possible to saturate the buffer, but usually you just buy a “server” network card for this - e.g. the Compaq NIC’s which come with four ports on each card, and can aggregate bandwidth from all four ports into a single stream.

If it’s the group-theory encoding described above, then WITHOUT synchronization, X here is significantly greater than the number of packets that a perfect TCP link would use. I’m afraid I can’t remember the figures for this, and I can’t find a relevant paper right now :(. My vague memories of information theory and the like would suggest you’d be looking at 10%-500% overhead, depending upon how close you wanted to get to “ANY X packets”, how many bytes are in your file, and how many bytes you can send in each packet.

If you are using synchronization, then obviously you have lots of ACK-style packets flying from client to server.

However, ACK packets are NOT usually a source of slowdown in TCP, nor should they be in this protocol, except in the situation where the upstream traffic is congested, but the downstream is not (unusual but theoretically possible).

I don’t think that’s related. As it happens, TCP packets should never be queued along the route, in theory. Each packet is free to take an entirely different route through the network, and hence in general it’s not even possible to queue them.

By “parallel to Vegas” I did not mean “it is TCP”, I meant it served a similar role - improving on the “how to react to dropped packets” algorithm.

Note that whatever it does, it’s not going to achieve the claimed hundreds-of-times increases in speed without using caching - that’s impossible except in pathologically bad networks; probably not even then. TCP’s inefficiency - even at its greatest - doesn’t leave all that much room for improvement. I suspect this is a large reason why Vegas hasn’t caught on: it’s considerably better, but not so much so that everyone HAS to have it.

“Since Fountain Pool enables clients to receive data from multiple Transporter Fountains, it is, in effect, a type of load sharing.”

i.e. multi-stage caching. They have servers sitting at various places in your network, and you are getting the same data from multiple servers at once, each of which is getting that data from presumably a single source file on a single server elsewhere.

You have a cascade of caches, and you can fetch from the local cache at full local link speeds. Fetching across the entire network would force you to receive at the speed of the slowest link in the chain.

This, of course, is the main feature of the P2P file-sharing systems, although they take the multi-stage caching to an extreme, in return for extremely high bandwidth utilization.

Because parts of their site read just like many crap sites from companies that have managed to register some patents (that will later prove indefensible, or just plain impossible) and really don’t have a clue what they’re doing; but do have good marketing!

e.g.:

“As an example, consider a transfer of a 1 GB file with the average Internet packet loss rate (2%) and global average RTT (just over 200 msec). In this specific case, TCP would take more than 7 hours to transfer the file. With Digital Fountain, on a fully utilized T1 link, the transfer of the 1 GB file will take about 1.5 hours, and on a fully utilized T3 link the transfer time will be a little over 3 minutes. That translates to a 3X to over 100X improvement.”

WTF? Only a marketing droid or an idiot would use an example with 200msec delay - I get less than that to almost everywhere in the world except Asia and Australia from my 56k modems. Everyone with the most basic grasp of mathematics knows there is absolutely no point in taking a global average of latency for measuring protocol performance. The modal average would at least make sense and have some value here, although if they wanted to do it properly (convincingly) they’d cite the 1 SD range, and do 3 figures for min/mid/max.

In addition, TCP is alleged to take “more than 7 hours” to transfer a file at an unknown bandwidth. They quote figures for their tech on T1 and T3, and compare them to this 7 hours to give a “3X to over 100X improvement”. Huh? Perhaps they were measuring from a 56kbps modem? If not, why not quote figures for TCP on T1 and T3 - it’s deeply unconvincing that they only quote one figure, suggesting that they probably did test at a different bandwidth altogether. This is the kind of statement you frequently see from companies claiming to have invented faster-than-light communications and other such dross.
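For what it’s worth, the raw line-rate arithmetic reproduces their quoted figures almost exactly (assuming 1 GB = 8e9 bits and nominal T1/T3 line rates):

public class TransferTimes {
    public static void main(String[] args) {
        double bits = 8e9;          // 1 GB file
        double t1 = 1.544e6;        // T1 line rate in bits/s
        double t3 = 44.736e6;       // T3 line rate in bits/s
        System.out.printf("T1: %.2f hours%n", bits / t1 / 3600);   // ~1.44 hours
        System.out.printf("T3: %.2f minutes%n", bits / t3 / 60);   // ~2.98 minutes
    }
}

In other words, the quoted Digital Fountain times are just the time to push 1 GB through a fully utilized pipe with no protocol overhead at all, while the 7-hour TCP figure must come from somewhere else entirely.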

Also, speaking as someone who actually has transferred 1GB files via high speed connections over the internet before - it takes a heck of a lot less than 7 hours. Obviously, if you take it from somewhere local - e.g. SuperJANET or the now defunct IBM GN - then you can do international transfers at phenomenal rates. But I’ve done it inter-continental, over public networks, and got much better than they quote.

I’m afraid I can’t find the original quote about TCP that was simply wrong and incited me to the original statement - but I can’t find the page it was on, either. Perhaps they’ve changed the site in the last 24 hours, I dunno ???.