I've successfully written a class to handle an arbitrary number of TCP connections using java.nio, but I haven't found any good information regarding UDP sockets. Can anyone help with some links perhaps? Or even better, some code?
I recently had a look and saw DatagramChannel - is that the way to go? I would guess so, but please correct me so that I don't start coding away on that.
I just had a need for this info as well, and while I found DatagramChannel, what I can't find is a Datagram (UDP) version of ServerSocketChannel, i.e. no NIO-based UDP stuff with an accept() method.
Open a DatagramChannel, then use its socket() method to obtain the associated DatagramSocket and bind it to the server address. Then go on to use the channel operations as you would normally.
You can connect() the socket if you're only ever going to use the server socket for a single peer; otherwise you've got security-check overhead each time a packet is sent.
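Something like this is the usual shape of it (just a rough, untested sketch - the port number, buffer size and class name are arbitrary placeholders):

[code]
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class UdpNioSketch {
    public static void main(String[] args) throws Exception {
        // Open the channel and bind its underlying DatagramSocket to the server port.
        DatagramChannel channel = DatagramChannel.open();
        channel.socket().bind(new InetSocketAddress(4445)); // arbitrary example port
        channel.configureBlocking(false);

        // Register with a Selector exactly as you would a SocketChannel, minus OP_ACCEPT.
        Selector selector = Selector.open();
        channel.register(selector, SelectionKey.OP_READ);

        // Optional: if this socket only ever talks to one peer, connect() it so the
        // per-packet security check happens once rather than on every send/receive.
        // channel.connect(new InetSocketAddress("example.org", 4445));

        ByteBuffer buf = ByteBuffer.allocate(1500); // roughly one Ethernet MTU
        while (selector.select() > 0) {
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isReadable()) {
                    buf.clear();
                    // receive() hands back the sender's address - there is no accept() step.
                    SocketAddress sender = channel.receive(buf);
                    buf.flip();
                    // ... handle the datagram, reply with channel.send(replyBuf, sender) ...
                }
            }
            selector.selectedKeys().clear();
        }
    }
}
[/code]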
Hmm... it looks like maybe I'm just totally confused about how UDP works. I'm looking at some examples and they just try to read data... which seems to wait for a connection. Is there no concept of a single connection used for multiple packets with UDP on the server side? The client side seems to work that way...
With UDP there is no client and server as with TCP. You don't connect to the other machine. The connect() method on DatagramChannel does not create a real connection with another machine; it just locks in the address that you can send to. Because the client doesn't connect to the server, the server doesn't accept either. It just reads the packets it receives and checks the address to know where each one came from. You have to keep track of which address represents which sender.
Since there is no client/server, both endpoints of the communication work in the same way: they both send packets to addresses, and they both receive packets.
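To make that concrete, here is a rough sketch of the "keep track of senders yourself" idea (the Peer class and its fields are invented for illustration, not part of any library):

[code]
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.util.HashMap;
import java.util.Map;

class Peer {
    long lastSeenMillis;   // when we last heard from this address/port
    // ... whatever per-sender state your protocol needs ...
}

class PeerTracker {
    private final Map<SocketAddress, Peer> peers = new HashMap<SocketAddress, Peer>();

    // Call this for every datagram read off the channel.
    void handle(DatagramChannel channel, ByteBuffer data, SocketAddress from) {
        Peer peer = peers.get(from);
        if (peer == null) {
            // First packet from this address/port - treat it as a "new connection".
            peer = new Peer();
            peers.put(from, peer);
        }
        peer.lastSeenMillis = System.currentTimeMillis();
        // ... process data; replies go back with channel.send(reply, from) ...
    }
}
[/code]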
UDP is not reliable like TCP in transporting data, as no connection is established and no action is taken if packets get lost (at this layer). Since establishing the connection in TCP is an expensive operation, UDP is very useful for answering short requests where it isn't as important if the answer gets lost.
For example, UDP is used when you do a lookup on a DNS server. If either packet gets lost, the client doing the lookup simply tries again (most software does this automatically).
Another use would be setting up a way of "pinging" (not ICMP) a server to see if it's alive.
If you need reliable data transmission, like for playing a networked game, then TCP is the way to go - but if you want a way for the game client to query a game server to see if it's full (for example), then UDP may be more appropriate.
I know that UDP is not reliable, and that means the packets may be lost or arrive out of order; I just wasn't aware that there was no concept of a "connection" with UDP. I think I have it all straightened out now.
If I need a connection concept I have to make it myself, based on the sender address that I can get from the packet. I assume it is relatively safe to assume (even with firewalls involved) that packets from the same address and port are likely part of the same transmission, that if my program were running on two different machines behind a firewall the senders' port numbers would be different, and that if I reply to the same address and port the reply will likely get back to the sender.
This isn't for realtime game data. I'm building a general file transport (think downloading levels/models from the server), and I want to avoid the latency issues involved with acknowledging TCP packets. I have another method for dealing with retransmission and acknowledgement. If it doesn't work I may end up using FEC.
In my experience with mobile phone operators' firewalls / network address translators (NATs), a UDP address/port mapping in the NAT may be discarded if no packets are seen for about a minute. If new packets come after that, they get a new address/port mapping. Therefore if your application is going to have such long delays, make sure that the client behind the NAT sends occasional "keep-alive" UDP messages (that the server receives and ignores). In case that doesn't work (e.g. the keep-alive message gets lost), you can put a "session-ID" at the start of each of your UDP messages; then the server can recognize the "UDP connection" even if the address/port mapping has been changed. (The server should keep track of the most recent sender address/port for each UDP connection, so it can send packets back to the client.)
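A sketch of that session-ID idea, assuming a 4-byte ID at the front of every datagram (the class and method names are just placeholders; the exact layout is up to you):

[code]
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

class SessionTable {
    // Most recent sender address seen for each session ID, so replies
    // still reach the client after the NAT mapping changes.
    private final Map<Integer, SocketAddress> lastAddress = new HashMap<Integer, SocketAddress>();

    // Client side: prepend the session ID to the payload before sending.
    // A keep-alive is simply this with an empty payload, sent well inside the NAT timeout.
    static ByteBuffer wrap(int sessionId, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length);
        buf.putInt(sessionId);
        buf.put(payload);
        buf.flip();
        return buf;
    }

    // Server side: read the ID back out and remember where this packet came from,
    // even if the NAT has re-mapped the client's address/port since last time.
    int unwrap(ByteBuffer received, SocketAddress from) {
        int sessionId = received.getInt(); // the remaining bytes are the payload
        lastAddress.put(sessionId, from);
        return sessionId;
    }

    // Where to send the next packet for this session.
    SocketAddress addressFor(int sessionId) {
        return lastAddress.get(sessionId);
    }
}
[/code]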
That other thread should probably be made sticky, and/or get a prominent summary of UDP and TCP. For anyone who's not read it, it does contain some good summaries of UDP and TCP.
Then why are you interested in UDP? I'm assuming you're trying to build some fantabulous new transport protocol that is more efficient than the standard TCP; if so, can you not just use TCP Vegas?
The general approach should always be to use TCP unless you have a really good reason not to (FYI for anyone without the time to read that other thread...). Really good reasons typically include:
- I'm only sending 30 bytes of data at a time
- I need very, very fast round-trip times for messages - e.g. bullets and player positions in an FPS
- This has to work in less than 0.2k/s bandwidth per player, and the overhead of TCP headers is making that difficult to achieve
- If data is more than a few tens of milliseconds late, it's useless and I'll delete it anyway (e.g. in IP telephony)
Of course there is NO EXTRA LATENCY (this isn't quite true, but when talking about file transfer it's effectively true) with TCP packets if you need all the data, hence all file-transfer protocols use TCP (in general; there are of course exceptions :) - e.g. use of UDP is one of the reasons that NFS is useless in the wild). The only situation (other than a need to try and beat Vegas) I can think of off the top of my head where you would not want to use TCP is if your file contents were changing from second to second; that would be pretty interesting.
Searches for TCP Vegas yielded only a Linux implementation. Based on the info found in some of the other links, TCP Vegas would not likely yield the same return as a non-TCP-based transfer.
Everything I have read thus far indicates that latency issues slow TCP transfers.
[quote]with TCP packets if you need all the data, hence all file-transfer protocols use TCP (in general; there are of course exceptions :) - e.g. use of UDP is one of the reasons that NFS is useless in the wild). The only situation (other than a need to try and beat Vegas) I can think of off the top of my head where you would not want to use TCP is if your file contents were changing from second to second; that would be pretty interesting.
[/quote]
Note the exceptions provided in the two links above. There is clear evidence that TCP file transfers are much slower than what can be achieved with UDP, unless you have VERY good network conditions. There is also the possibility of using a multicast file-distribution mechanism if I use FEC.
It was only a guess at something that might help; I appreciate that Vegas implementations are thin on the ground :(.
Looking into the links you cite, it sounds like Vegas would suit you down to the ground, except where you want to do caching. In general you can do almost all the caching tricks with UDP or TCP.
OK, I'm eagerly trying to see this, but... the second link contains absolutely nothing on this subject (perhaps you could let me know where on that site there is info? I've been routing around, and all I can find is info on their multi-stage cache system, rather ccNUMA-esque...)
The first link seems to be a parallel to Vegas, although again it offers the vast majority of its benefits through multi-stage caching instead. The description of the "meta content technology" is quite scary - it shows either a devastating lack of understanding of how TCP works, or that the company knowingly lies to its clients; not exactly giving a great impression :(.
The only new concept I can see in the links above is the use of caching. If they cache at the packet level, they could obviate the need for re-transfers - but only at the cost of wasting large amounts of bandwidth by transmitting everything twice in the first place, rather than transmitting once and then re-transmitting what was lost. Otherwise, everything they're doing is do-able with TCP (and, note, has been done with TCP for many years by people such as GoZilla etc. - although GZ never took it all that far IIRC).
Other than that they offer nothing new (apparently - if I've missed something else here, please let me know). The first link only claims to improve on TCP's windowing - and that is PRECISELY what Vegas is all about!
Fundamentally you are transferring data. If you need that data, you have to wait for it to arrive. If it doesn't arrive, you need to re-transmit it. You can't magically obviate the need for it.
All the claims of "increase speed by 3 to 300 times" on the links you point to are to do with direct improvements because of caching, not because of the transport layer. I'm having to guess here (since I can't find details of the companies' algorithms), but a 100x improvement in speed is quite typical from a system of geographically distributed caches (i.e. independent of transport layer). There are techniques for increasing raw speed compared to TCP, but only with major tradeoffs (like doubling the bandwidth usage as mentioned above). The only techniques that increase speed without tradeoff are things like Vegas, which is just a much better implementation of TCP than the standard implementations.
Again, I can't seem to find anything on that page that talks about UDP vs TCP for file transfer. Am I being stupid?
[quote]the second link contains absolutely nothing on this subject (perhaps you could let me know where on that site there is info? I've been routing around, and all I can find is info on their multi-stage cache system, rather ccNUMA-esque...)
[/quote]
Yeah - it stinks... I included it because it uses similar technology to the first link. The caching aspect is not something I'm interested in... just the idea of a completely ack-less transmission, because the FECs can make the probability of recovering the data quite high even when packets are lost.
[quote]The first link seems to be a parallel to Vegas, although again it offers the vast majority of its benefits through multi-stage caching instead.
[/quote]
Not at all. It uses UDP. It does not play with any of the parameters of the transport layer. And, it does not use multi-stage caching. (Assuming I understand what you mean by that.)
[quote]The description of the "meta content technology" is quite scary - it shows either a devastating lack of understanding of how TCP works, or that the company knowingly lies to its clients; not exactly giving a great impression :(.
[/quote]
I don't understand why you say this. They are a well-respected company that is delivering a real product that produces real results. Those guys know their stuff; they have written various papers and IETF RFCs.
[quote]The only new concept I can see in the links above is the use of caching. If they cache at the packet level, they could obviate the need for re-transfers - but only at the cost of wasting large amounts of bandwidth by transmitting everything twice in the first place, rather than transmitting once and then re-transmitting what was lost.
[/quote]
Poke around on their site a bit more. You seem to have totally missed what they do. Caching is not a significant part of it. They never need to resend lost data: by adding a slight overhead they completely eliminate the need for retransmission of any sort. The receiver simply waits until it has received enough encoded packets to rebuild the original data. Typically this means it only needs to receive about as much data as there was in the original file. If it missed a few blocks in the middle it doesn't matter; it just keeps receiving new data until it has enough. The sender can generate an effectively infinite stream of data such that if you receive ANY X packets from that stream then you can basically reconstruct the data. The receiver only needs to send a single message to say "OK, I have enough".
Here's my basic understanding of what is happening:
Because there is no ack or retransmission, the general packet throughput can be significantly higher. TCP packets need to be saved until an ack is received, or there is a timeout and they are resent. This happens at many stops along the route, and since TCP can't deliver data out of order, lost packets can hold up the pipe for a moment, so there is a significant impact on the transmission rate. The UDP method has higher throughput because it simply doesn't need to care whether a packet made it through or not.
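This isn't their actual encoding (theirs is far more sophisticated), but the FEC flavour of "no retransmission ever" can be shown with the simplest scheme I can think of - one XOR parity packet per group, which lets the receiver rebuild any single lost packet in that group without asking for anything again. Purely illustrative:

[code]
// Toy FEC: for a group of N equal-length data packets, also send an (N+1)th packet
// that is the XOR of them all. If exactly one packet of the group is lost, XOR-ing
// the packets that did arrive reproduces the missing one - no retransmission needed.
class XorParity {

    // Build the parity packet for a group of equal-length data packets.
    static byte[] parity(byte[][] group) {
        byte[] p = new byte[group[0].length];
        for (byte[] packet : group) {
            for (int i = 0; i < p.length; i++) {
                p[i] ^= packet[i];
            }
        }
        return p;
    }

    // 'received' holds the packets that arrived (data packets plus the parity packet),
    // with null in the one slot that was lost.
    static byte[] recoverMissing(byte[][] received) {
        byte[] missing = null;
        for (byte[] packet : received) {
            if (packet == null) continue;
            if (missing == null) {
                missing = packet.clone();
            } else {
                for (int i = 0; i < missing.length; i++) {
                    missing[i] ^= packet[i];
                }
            }
        }
        return missing; // equals the lost packet, because x ^ x == 0
    }
}
[/code]

Their codes go much further than one recoverable loss per group (that's nowhere near "receive ANY X packets"), but it shows how redundancy can replace acks.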
[quote]Am I being stupid?
[/quote]
Sort of. Actually, I sent you straight to the FAQ with that first link, and you should probably poke around on the main site for a bit to see what they are really about. I have some technical papers about their technology that make things clearer; maybe they are available from their web site.
Do you qualify under blah's really good reasons to use UDP?
So your process is:
TCP is too slow, so you choose UDP.
UDP doesn't have retransmission of lost packets, so you add that functionality yourself.
UDP doesn't know about the order of packets, so you add that functionality yourself.
UDP doesn't do x, so you have to add that functionality yourself.
By the end of that process it sounds like you will have a re-implementation of the TCP protocol using UDP... Will it still be as fast as UDP? I doubt it. Will it be faster than TCP? Possibly, if you do a really good job of it, but why reinvent the wheel?
OK, re-reading their spiel, I start to see something similar to CD-ROM encoding, whereby all the data is merged into all the other data (using some funky group theory). This results in every packet containing some of the data of N other packets (where N can be chosen by the protocol designer - larger numbers basically just mean you have to encode with larger groups). Although I may still be missing the point (I'm currently at home with flu, so not quite top notch at the moment)...
You're right that this info is there, but I got lulled by the dumbed-down marketing BS :(.
I think, from my current understanding, that their "Meta-content technology" is group theory with bells and whistles. I've found a big PDF of theirs, and will try to read through it today. In the meantime, I'll assume it is group theory, because that's interesting even if it's wrong.
OK, assuming what I've just written above is correct, then again this actually doesn't do much good without packet-level synchronization between a compute-intensive server and the client. With this synchronization, the server can use a technique similar to Quake3's net protocol, which keeps track of the packets that the client is known to have received, and alters what is sent based on that knowledge.
This way, the server is actually sending a different encoded file every time a packet is received. The CPU usage is going to be quite high, because it's like re-compressing the entire remaining file every packet; by today's CPU standards that's very little, but for a large number of simultaneous streams it could get large (I have no idea; I've never benchmarked group-encoding schemes! I mention it merely as an interesting side-effect).
Without that synchronization, the sender is sending LOTS of useless data. In fact, without the synch, the file takes slightly longer to send than with a good TCP implementation, but slightly less time than a standard (poor) TCP imp, because good TCP sends very very little more than the entire file, whereas this technique could send quite a lot more, depending on where the dropped packets are; towards the end, if e.g. you drop the last packet, you will on average have to wait many packets (proportional to the size of the file, and to the number of packets merged into each packet) to reconstitute that data, because each packet contains data for the whole file.
Bear in mind that TCP only has to receive "about as much data as there was in the original file"! It all depends on how good the algorithm is that reacts to dropped packets. Note that network cards have large (typically tens of KB) buffers so that no packet is ever sent in vain - every packet received is NOT resent, and is kept until needed. On extremely high-bandwidth connections it's possible to saturate the buffer, but usually you just buy a "server" network card for this - e.g. the Compaq NICs which come with four ports on each card, and can aggregate bandwidth from all four ports into a single stream.
If it's the group-theory encoding described above, then WITHOUT synchronization, X here is significantly greater than the number of packets that a perfect TCP link would use. I'm afraid I can't remember the figures for this, and I can't find a relevant paper right now :(. My vague memories of information theory and the like would suggest you'd be looking at 10%-500% overhead, depending upon how close you wanted to get to "ANY X packets", how many bytes are in your file, and how many bytes you can send in each packet.
If you are using synchronization, then obviously you have lots of ACK-style packets flying from client to server.
However, ACK packets are NOT usually a source of slowdown in TCP, nor should they be in this protocol, except in the situation where the upstream traffic is congested, but the downstream is not (unusual but theoretically possible).
I don't think that's related. As it happens, TCP packets should never be queued along the route, in theory. Each packet is free to take an entirely different route through the network, and hence in general it's not even possible to queue them.
By "parallel to Vegas" I did not mean "it is TCP"; I meant it serves a similar role - improving on the "how to react to dropped packets" algorithm.
Note that whatever it does, it's not going to achieve the claimed hundreds-of-times increases in speed without using caching - that's impossible except in pathologically bad networks; probably not even then. TCP's inefficiency - even at its greatest - doesn't leave all that much room for improvement. I suspect this is a large reason why Vegas hasn't caught on - it's considerably better, but not so much so that everyone HAS to have it.
"Since Fountain Pool enables clients to receive data from multiple Transporter Fountains, it is, in effect, a type of load sharing."
i.e. multi-stage caching. They have servers sitting at various places in your network, and you are getting the same data from multiple servers at once, each of which is getting that data from presumably a single source file on a single server elsewhere.
You have a cascade of caches, and you can fetch from the local cache at full local link speeds. Fetching across the entire network would force you to receive at the speed of the slowest link in the chain.
This, of course, is the main feature of the P2P file-sharing systems, although they take the multi-stage caching to an extreme, in return for extremely high bandwidth utilization.
Because parts of their site read just like many crap sites from companies that have managed to register some patents (that will later prove indefensible, or just plain impossible) and really don't have a clue what they're doing; but do have good marketing!
e.g.:
"As an example, consider a transfer of a 1 GB file with the average Internet packet loss rate (2%) and global average RTT (just over 200 msec). In this specific case, TCP would take more than 7 hours to transfer the file. With Digital Fountain, on a fully utilized T1 link, the transfer of the 1 GB file will take about 1.5 hours, and on a fully utilized T3 link the transfer time will be a little over 3 minutes. That translates to a 3X to over 100X improvement."
WTF? Only a marketing droid or an idiot would use an example with 200 msec delay - I get less than that to almost everywhere in the world except Asia and Australia from my 56k modems. Everyone with the most basic grasp of mathematics knows there is absolutely no point in taking a global average of latency for measuring protocol performance. The modal average would at least make sense and have some value here, although if they wanted to do it properly (convincingly) they'd cite the 1 SD range, and give three figures for min/mid/max.
In addition, TCP is alleged to take "more than 7 hours" to transfer a file at an unknown bandwidth. They quote figures for their tech on T1 and T3, and compare them to this 7 hours to give a "3X to over 100X improvement". Huh? Perhaps they were measuring from a 56kbps modem? If not, why not quote figures for TCP on T1 and T3? It's deeply unconvincing that they only quote one figure, suggesting that they probably did test at a different bandwidth altogether. This is the kind of statement you frequently see from companies claiming to have invented faster-than-light communications and other such dross.
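For what it's worth, here's my own back-of-the-envelope check using the nominal T1/T3 line rates - their quoted times look like nothing more than those links running flat out, which makes the comparison against an unspecified-bandwidth TCP figure even less meaningful:

[code]
// Rough sanity check of the quoted figures: a 1 GB file at the raw line rate of each link.
public class TransferTimes {
    public static void main(String[] args) {
        double fileBits = 1e9 * 8;   // 1 GB expressed in bits
        double t1 = 1.544e6;         // nominal T1 rate, bits per second
        double t3 = 44.736e6;        // nominal T3 rate, bits per second
        System.out.printf("T1: %.2f hours%n", fileBits / t1 / 3600);  // ~1.44 hours
        System.out.printf("T3: %.1f minutes%n", fileBits / t3 / 60);  // ~3 minutes
    }
}
[/code]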
Also, speaking as someone who actually has transferred 1 GB files via high-speed connections over the internet before - it takes a heck of a lot less than 7 hours. Obviously, if you take it from somewhere local - e.g. SuperJANET or the now-defunct IBM GN - then you can do international transfers at phenomenal rates. But I've done it inter-continental, over public networks, and got much better than they quote.
I'm afraid I can't find the original quote about TCP that was simply wrong and incited me to the original statement - I can't find the page it was on, either. Perhaps they've changed the site in the last 24 hours, I dunno ???