Non-NIO question

I’ll ask, but I believe the answer is that that is NOT a valid server design. Even at the native level you aren’t assured of an error condition being returned immediately from the send(). Which is to say, returning from handing a packet to the OS does not guarantee that the connection will complete.

I believe that is a misunderstanding of the term “reliable”.

JK

I’m not contesting you’re right here, but IMHO java should either say “we don’t offer blocking IO” or should actually make it block. I find it really hard to accept that you could do a TCP transfer, have it fail, and java not throw any exception. What’s the frigging point of using TCP if your API is going to partially (and non-deterministically) disable the “guaranteed” aspect? How many seconds do you have to loop for, flushing, until you can feel safe you’ve received any IOExceptions that may have occurred? And why the heck isn’t this documented?

As it stands, it seems that the apparent option in the java API’s to do “non-blocking” or “blocking” I/O doesn’t really exist. Instead you can choose “non-blocking” or “non-blocking, with some processes simplified, and some features missing (e.g. Buffers)”.

PS Could you please kick whoever wrote the API docs for io.* for me - this is not mentioned anywhere AFAICS in the API, let alone in the obvious places where it should be - e.g. OutputStream (and if someone can find it somewhere else in the API, please shout). Implementing counter-intuitive API’s is unfortunate but hey it’s free - not bothering to document their behaviour is completely unacceptable.
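To illustrate exactly what I’m complaining about, here’s a loopback-only sketch (class name and details are mine, not anything from the thread): the peer closes its end, yet write()/flush() keep returning normally, and the IOException only surfaces on a later write.

```java
import java.io.*;
import java.net.*;

public class LateFailureDemo {
    public static void main(String[] args) throws Exception {
        // Loopback server that accepts one connection and immediately closes it.
        ServerSocket server = new ServerSocket(0);
        Socket client = new Socket("127.0.0.1", server.getLocalPort());
        server.accept().close();           // the peer is now gone

        OutputStream out = client.getOutputStream();
        int successfulWrites = 0;
        IOException failure = null;
        try {
            for (int i = 0; i < 100; i++) {
                out.write(new byte[1024]); // returns normally, no exception...
                out.flush();               // ...even though the peer can never read it
                successfulWrites++;
                Thread.sleep(10);          // give the peer's RST time to arrive
            }
        } catch (IOException e) {
            failure = e;                   // the error only surfaces on a LATER write
        }
        System.out.println("firstWriteSucceeded=" + (successfulWrites >= 1));
        System.out.println("failureSurfacedLater=" + (failure != null));
        client.close();
        server.close();
    }
}
```

So the first write “succeeds” even though the connection is already dead, and you only find out some unspecified number of writes later.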

[quote]Which is to say, returning from handing a packet to the OS does not guarantee that the connection will complete.

I believe that is a misunderstanding of the term “reliable”.

JK
[/quote]
It has been a very long time since I did any network programming in C, and I never did much. I could be completely wrong here, but I would have thought I’d remember if TCP’s guaranteed delivery were not actually accessible with standard networking API’s…I don’t see that java has an excuse to be any different. Or am I misunderstanding what you’re saying here?

If you are offering reliability as a feature (which, if you claim to be supplying TCP, you are), the application needs to know if that reliability has not been achieved. How can anyone write a serious network application where they never know if the data has been sent or not? That would be crazy! It would mean implementing your own ACK scheme on top of TCP (which is one of the things TCP is supposed to avoid!).

…Unless TCP doesn’t actually offer reliability at all, and I’ve been misunderstanding it all these years :(.

[quote]I’m not contesting you’re right here, but IMHO java should either say “we don’t offer blocking IO” or should actually make it block…
[/quote]
Well, I think you are misinterpreting the meaning of blocking in this case. The API docs don’t actually say “This whole API is blocking.” They say that IF the underlying channel for the input stream is non-blocking, then this API will throw an exception.
It makes a similar statement about the output stream.
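For instance, here’s a minimal loopback sketch (class name and setup are mine) of that documented behaviour: the stream view of a channel that has been configured non-blocking throws IllegalBlockingModeException instead of reading.

```java
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.channels.*;

public class BlockingModeDemo {
    public static void main(String[] args) throws Exception {
        // Set up a connected loopback channel pair.
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        SocketChannel ch = SocketChannel.open(server.getLocalAddress());
        SocketChannel accepted = server.accept();

        ch.configureBlocking(false);          // underlying channel is now non-blocking
        InputStream in = ch.socket().getInputStream();
        String caught = "none";
        try {
            in.read();                        // stream API on a non-blocking channel
        } catch (IllegalBlockingModeException e) {
            caught = e.getClass().getSimpleName();
        }
        System.out.println("caught=" + caught);
        accepted.close();
        ch.close();
        server.close();
    }
}
```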

We cannot make promises about how your OS and TCP/IP stack handles things, only what we do with what they tell us.

This whole thread has been about the non-NIO API’s throughout, although it sounds like you might now be talking about blocking-mode NIO? Although, of course, the same questions need to be answered for NIO’s blocking modes, they may already be answered in API docs etc - I don’t know, I don’t use that mode.

If you’re still talking about IO, then please point out to me where in the API docs it says what you’re referring to? As I said, I couldn’t see it, having looked in the “obvious” places (e.g. java.io.OutputStream, 1.4.x docs; I may not have been looking at the absolute latest version, though…).

Taking all your statements, and all the info you’ve provided, together, AFAICS there is no functionality that lets a java programmer write a real TCP-based network application using io.*, unless by using some incredibly ugly, hacky workarounds that aren’t even guaranteed to work (c.f. previous posts).

This is basic stuff - “were my packets transmitted or not?” - without which you cannot do TCP programming (it’s part of the protocol, part of the spec, IIRC?).

Waving hands in the air and saying “well, it’s all up to your OS; maybe they were, maybe they weren’t” is not in any way an excuse for not providing this information / guarantee / option. At the very least the API should say “although this uses TCP, and looks just like TCP, it is in fact something slightly different that doesn’t allow you, the common java programmer, the option of using TCP”.

Your last statement just sounds exactly like the old fun in 1.0.x where we were only allowed access to one mouse button, allegedly because the OS would only guarantee one. Go that way for long, and java becomes a “toy” programming language useless for real work.

Totally incorrect. From the man pages:

" Send(), sendto(), and sendmsg() are used to transmit a message to another
socket. Send() may be used only when the socket is in a connected state,
while sendto() and sendmsg() may be used at any time.

 The address of the target is given by to with tolen specifying its size.
 The length of the message is given by len.  If the message is too long to
 pass atomically through the underlying protocol, the error EMSGSIZE is
 returned, and the message is not transmitted.

 No indication of failure to deliver is implicit in a send().  Locally
 detected errors are indicated by a return value of -1. "

java.net on Win32 or any flavor of Unix operates EXACTLY the way Berkeley sockets are defined, because that’s what it is: a wrapper on them.

Next?

[quote]If you’re still talking about IO, then please point out to me where in the API docs it says what you’re referring to? As I said, I couldn’t see it, having looked in the “obvious” places (e.g. java.io.OutputStream, 1.4.x docs; I may not have been looking at the absolute latest version, though…).
[/quote]
I am talking about the fundamentals of Sockets. How they work. I gave you the Unix man page. That should settle it.

You seem to have come up with your own idea of what reliable “should mean.” That’s all well and good, but it’s not TCP/IP.

The reliability guarantee of TCP/IP means that a packet will be delivered, in order, or you will eventually get an error. Saying that the error has to be immediate on send, though, would turn the entire internet into one giant synchronous app and reduce the running of all the computers on it down to the slowest response time of any communication they do.

The guys who invented TCP/IP were a whole lot smarter than that.

The difference between NIO and java.net, which is what seems to have thoroughly confused you, is equally simple. When you read from a java.net socket, it blocks until data arrives, and that is the ONLY way to find out if there is data available. This is why it is called “blocking.”

NIO gives you a select() call with which to find out which sockets have data BEFORE you try to read from them. That’s the fundamental difference and why NIO is called “non-blocking”.
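To make that concrete, here’s a minimal loopback sketch (names are mine) of the NIO style: register the socket with a Selector, and select() tells you it is readable BEFORE you ever call read().

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;

public class SelectDemo {
    public static void main(String[] args) throws Exception {
        // Connected loopback channel pair.
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        SocketChannel writer = SocketChannel.open(server.getLocalAddress());
        SocketChannel reader = server.accept();

        reader.configureBlocking(false);            // non-blocking side
        Selector selector = Selector.open();
        reader.register(selector, SelectionKey.OP_READ);

        writer.write(ByteBuffer.wrap("ping".getBytes()));

        selector.select();                          // returns once a key is ready
        for (SelectionKey key : selector.selectedKeys()) {
            if (key.isReadable()) {                 // we KNOW data is waiting...
                ByteBuffer buf = ByteBuffer.allocate(64);
                int n = ((SocketChannel) key.channel()).read(buf);
                System.out.println("gotBytes=" + (n > 0)); // ...so read() never blocks
            }
        }
        selector.close();
        writer.close();
        reader.close();
        server.close();
    }
}
```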

In EITHER case, when you write data out a socket, it first goes into a buffer in the sending computer. The only time this blocks is if the buffer is full. Doing anything else would again make the entire computer potentially stop and wait on the ethernet card, which you do not want.

This is all really, really basic operating system I/O stuff.

If that was meant to clarify the C situation, thanks. I happily admitted that I might have been using TCP for years without realising a fundamental problem with the API’s I was using. However…

I’ve just re-read the TCP RFC 793, where it is “suggested” that OS developers writing a TCP API should offer precisely the information I’ve described (via an API function called “STATUS”), and which you say is not available on Win32 nor Unix. Because of the design of the RFC, nothing is mandated about the OS API.

(This would explain my “confusion”; I always use RFC’s as my primary source for networking stuff, until/unless the implementation I’m using proves different from the RFC.)

So, as far as Java is concerned, developers have a reasonable expectation that java has access to a STATUS function - unless they happen to know platform-specific info AND they happen to know EXACTLY how the JVM implements networking.

[quote]Totally incorrect. from the Man pages:
[/quote]
I’m sorry, but this is the kind of rubbish that makes java networking so unnecessarily hard: you can’t say “it’s not documented, but you should know how we implemented this by using your psychic powers”. WHEN the API docs for java start to reference the man pages for unix, THEN I will care. Up until then, it’s accurate but irrelevant.

Java presents an API. Java != Unix. If the API doc doesn’t explicitly state “this uses Berkeley sockets, c.f. XXX for documentation” then the docs are wrong, period. Just because it happens to wrap that is irrelevant - java programmers don’t KNOW this unless the API states it.

I sometimes wish Sun would make up its mind about API docs - either Sun intends to document, or it doesn’t.

Well, we disagree on the definition of TCP/IP - I work on the basis that RFC’s are authoritative, I’m afraid, not the man pages of a particular OS’s implementation.

As pointed out, the RFC does not mandate the availability of STATUS, and even states “and could be excluded without adverse effect” (although, given the scenarios described in this thread, this is only true AFAICS if this functionality is offered elsewhere in the particular implementation).

Unfortunately, 793 seems to predate the use of well-defined MUST and SHOULD in RFC’s; almost every RFC contains many “non-mandatory” parts, but usually every implementation is encouraged to implement the entire spec, unless there is a good reason not to do so.

There’s enough room here to drive a bus through, and I still don’t claim anything you’ve said is wrong, per se, but I don’t agree that my assumptions are as unreasonable as you believe :(.

Huh ???

Of course you want “the computer to stop and wait on the ethernet card” - that’s the main advantage of the concept of synchronous I/O! It’s synchronous!

Yes, if you are using Windows 3.1, or some non-multi-threaded OS, this is going to be unacceptable, but this is one of the things threads are so useful for!

This thread contains several examples of situations where this is not only the desired behaviour, but actually the required behaviour in order to achieve particular applications…

Please give the quote. I would like to see if it really says that information on the last sent packet is available immediately after the send call returns, since that is your claim.

AFAIK the JDK APIs represent the Berkeley Unix Socket API, which is the standard on which most other OS’s TCP/IP APIs are based.

Until NIO we were missing the select() functionality that Berkeley had; that’s why the additions came through NIO.

To my knowledge, Berkeley sockets never were “synchronous” in that sense. If you have a man page reference that shows otherwise, I would definitely like to see it.

And the Java java.net API’s don’t even claim “synchronicity” to my knowledge, only that they do certain things in response to the underlying channels being configured in certain ways. I gave those quotes above; if you can find other, more apropos quotes from the JDK API docs, please by all means post them.

In terms of “certain applications require…”: since Berkeley sockets don’t support such behavior, such applications are theoretical at best, as there was never any way to implement them.

Oh, and speaking of RFCs, I REALLY want to see your sources, since I went digging last weekend and could find nothing on the net that describes the behavior you claim. In fact, what I did find was all so complex in terms of how TCP/IP handles failure, retries and timeouts that there was no single concrete description I could clip and quote, or I would have.

But AFAICT if it worked as you say, then all of that wouldn’t be necessary.

Part of the problem here is a difference in the interpretation of “socket operation completing.”

We can agree that, in blocking mode, a Socket will not return until “the requested socket operation is complete.”

The fundamental disagreement here seems to be whether, on a write, that means just the app side is complete -- that the data has been transferred to the system for output -- or whether it means the data has been received on the other end.

In practice, I am pretty damn sure that all the BSD socket programming I ever did in TCP/IP did not require such a handshake per packet. We did ALL our game networking at TEN in TCP/IP, and I KNOW game loops did not stop when we had a latency spike.

But I am looking for an authoritative definition. If you can find one that says otherwise, I’d love to see it.

Since you wanted an RFC reference, this seems pretty clear and authoritative:

==================================

RFC1180, Page 24

TCP is a sliding window protocol with time-out and retransmits.
Outgoing data must be acknowledged by the far-end TCP.
Acknowledgements can be piggybacked on data. Both receiving ends can
flow control the far end, thus preventing a buffer overrun.

As with all sliding window protocols, the protocol has a window size.
The window size determines the amount of data that can be transmitted
before an acknowledgement is required. For TCP, this amount is not a
number of TCP segments but a number of bytes.

=====================================

So it seems very clear to me that acknowledgement of individual writes is NOT required by the protocol. The guarantee is that after a certain number of bytes, those bytes must be acknowledged, with that number of bytes set by a transmission window.

Are we done yet?

Sorry, I’ve confused the issue. I was only saying that I had previously believed that to be the case for JAVA in particular, but…

…I then made the point that the TCP spec contains a suggested set of information that should be exposed by an API, but which Java does NOT expose; without this info, it is not possible to do things which IMHO you really ought to be able to do in TCP [c.f. below], because it was designed with such use in mind, as far as I had been aware (but I’ll need to re-read RFC’s etc to check the extent to which these application designs were intended…).

I was then suggesting that, having worked with the TCP RFC before, I’d probably assumed java behaved the way I had because this was the ONLY way that java could fully support all TCP-based application designs, since it doesn’t have a “STATUS” command. If you search the 793 for “STATUS” you should find the paras outlining the suggested info. About 3 pages earlier it is at pains to explain that it is NOT mandating this.

The TCP spec suggested info includes “bytes acknowledged”. With this information, the API user can accurately and definitively discover whether or not all their SENT packets have been RECEIVED (…which is the core functionality demanded by the use-cases cited previously; I had naively assumed old IO in java provided this functionality automatically, since its APIs are dumbed down compared to what the RFC suggests. Whilst I’ve been accepting this is wrong, I’ve been struggling to accept that it’s impossible to achieve in java. If everything you’ve said is true, it is indeed impossible, which IMHO is limiting enough that it MUST be documented in the API docs).
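For the record, this is the kind of application-level ACK scheme I mean having to build yourself on top of TCP (a loopback sketch; the length-prefix framing and names are mine, nothing java.net provides): the sender only *knows* the data arrived once the receiver explicitly says so.

```java
import java.io.*;
import java.net.*;

public class AppLevelAck {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0);

        // Receiver: read a length-prefixed payload, then send a one-byte app-level ACK.
        Thread receiver = new Thread(() -> {
            try (Socket s = server.accept()) {
                DataInputStream in = new DataInputStream(s.getInputStream());
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload);
                s.getOutputStream().write(1);   // "I really received it"
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        receiver.start();

        try (Socket s = new Socket("127.0.0.1", server.getLocalPort())) {
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            byte[] msg = "hello".getBytes();
            out.writeInt(msg.length);
            out.write(msg);
            out.flush();                        // returns; proves nothing about delivery
            // Block until the peer's ACK arrives: only NOW do we know it was received.
            int ack = s.getInputStream().read();
            System.out.println("acknowledged=" + (ack == 1));
        }
        receiver.join();
        server.close();
    }
}
```

Which is exactly the per-message handshake TCP was supposed to save me from writing.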

[quote]Since you wanted an RFC reference, this seems pretty clear and authoritative:

==================================

RFC1180, Page 24

TCP is a sliding window protocol with time-out and retransmits.
Outgoing data must be acknowledged by the far-end TCP.
Acknowledgements can be piggybacked on data. Both receiving ends can
flow control the far end, thus preventing a buffer overrun.

As with all sliding window protocols, the protocol has a window size.
The window size determines the amount of data that can be transmitted
before an acknowledgement is required. For TCP, this amount is not a
number of TCP segments but a number of bytes.

=====================================

So it seems very clear to me that acknowledgement of individual writes is NOT required by the protocol. The guarantee is that after a certain number of bytes, those bytes must be acknowledged, with that number of bytes set by a transmission window.
[/quote]
You are correct that “acknowledgement of individual writes is NOT required”, but that does not mean that the protocol is NOT AWARE of how many bytes have been acknowledged (which is the functionality I’ve been referring to throughout, albeit perhaps badly explained on my part).

The text you’ve quoted refers to the “transmission-in-progress” pseudo-state (i.e. my ease-of-explanation definition meaning “what happens in the middle of a transfer” :)). My core concern is that once I’ve finished transmitting a set of bytes, I want to know when/whether they’ve been successfully transmitted.

The question is, should I be allowed to know this only when I close the TCP connection? The evidence you’ve offered for java (and for Berkeley, since I’m taking your word for it there) explicitly says I have to close the connection to know that my data was sent.

That sucks.

TCP is a very old protocol, so it’s easy to imagine that it was designed for something as low performance as opening and closing a separate stream for each application-level chunk of data (i.e. no pipelining); pipelining made a relatively late move into various RFC’s (e.g. HTTP). I’d always worked on the assumption that TCP supports pipelining; the RFC I quoted explicitly “suggests” that implementors “could” provide a set of info that happens to include everything necessary for API-users to implement correct pipelining with minimal effort, by reusing data that TCP is already keeping track of.

As if! ;D No, seriously, I really want to explain this to you sufficiently that you understand my perspective here. If you think this is better done by email or phone…? Basically, I want you to understand why what I’m suggesting is reasonable to assume, and reasonable to expect. It’s sufficiently unlikely that java’s old IO sucks THIS much that I basically want an informed second opinion. I’m actually secretly hoping that there’s some other way of achieving this “reasonable” functionality in java, and that I just haven’t spotted it yet.

Unfortunately, there’s only one person I know who knows everything I do about networking in java, and a lot more on top, and he’s hard to get hold of :(.

I have some niggling memories in the back of my head of rants by academics against the IO models of Java pre-NIO, and I’m wondering now if we’ve come across what they were referring to (I didn’t pay much attention at the time; I wasn’t that interested in networking back then). My vague memory is of hearing them as part of various papers/articles/lectures/pages (can’t remember which :() outlining “serious fundamental problems” with java; I think it’s the same place I first discovered some of the major (but extremely subtle) JVM bugs where e.g. Sun’s 1.1.x reference JVM didn’t adhere to the JLS in “10 different ways”. But it was sooooo long ago… :(

I am missing your point; maybe there is something I’m still not grasping.

The quote makes it clear that, for instance, if I send a single byte, an acknowledgement is NOT required by the protocol.

Therefore, by the protocol, no, you CANNOT know that that byte has been received until the handshake for connection closing, OR until you send enough additional data to fill the window and require an ACK.

Now there are ALSO timeouts in the protocol. My guess is that after a set amount of time, if no new data arrives, then it probably ACKs anyway. Those timeouts, and the window itself, may even be negotiable and settable under Unix by ioctls.

But by default, acknowledgement of that single-byte write certainly is not immediate on doing the write, nor is that acknowledgement required in order to write again.