File transfer through a socket stream

I’m trying to find the best possible way to transfer a file from the server to the client. I am hoping to use this method:

  • Client sends request for a file ( in the Server dir )
  • Server locates the file
  • File is buffered and sent through a stream one section ( ~3000 bytes ) at a time
  • Client reads the stream and writes the file

My question is: what is the best existing way to do the sending - Reader -> Stream (server), Stream -> Writer (client)? An ObjectInputStream over a socket stream? I’m very confused about this issue, and if someone would be kind enough to shed some light on it, I would be very thankful. :slight_smile:

[quote]I’m trying to find the best possible way to transfer a file from the server to the client. I am hoping to use this method:

  • Client sends request for a file ( in the Server dir )
  • Server locates the file
  • File is buffered and sent through a stream one section ( ~3000 bytes ) at a time
  • Client reads the stream and writes the file

My question is: what is the best existing way to do the sending - Reader -> Stream (server), Stream -> Writer (client)? An ObjectInputStream over a socket stream? I’m very confused about this issue, and if someone would be kind enough to shed some light on it, I would be very thankful. :slight_smile:
[/quote]
First questions you need to ask yourself (and answer for us :)) are:

  • What are your typical file sizes?
  • …smallest?
  • …biggest?
  • What’s the context - why are you sending these files, and what other processes are going on at the same time (is the server heavily loaded doing lots of CPU work, for instance?) - or is the server ONLY sending files?
  • When you say “what is the BEST way…” do you want the FASTEST, or the EASIEST to program, or the MOST EFFICIENT (i.e. takes the smallest amount of CPU time etc.)…?

3k is a strange number to choose (even roughly). 1k, 15K or 30K would make a lot of sense (each for different reasons - depends on what you are trying to do), so I’m wondering if there’s any particular reason why you said roughly 3k ?

And in several good ways of sending files in Java, you don’t actually get to choose how many bytes to send at once anyway…

All right :slight_smile:

The typical file sizes can vary quite a bit - anything from 1 KB to 5 MB ( that’s about the range ).

There won’t be any processes other than the file transfer happening at the same time; I don’t see a need to thread the Server/Client at this time.

By “best” I mean the least error-prone, most logical and efficient way to send the files.

Also, I chose 3000 bytes out of my hat. I’m not aware of what chunk sizes make transfer most efficient - that’s why I’m asking :slight_smile:

Elaborate on this please :slight_smile:

[quote]All right :slight_smile:

The typical file sizes can vary quite a bit - anything from 1 KB to 5 MB ( that’s about the range ).

There won’t be any processes other than the file transfer happening at the same time; I don’t see a need to thread the Server/Client at this time.

By “best” I mean the least error-prone, most logical and efficient way to send the files.
[/quote]
Without any special needs going on, I’d probably just do something like:


// Setup:
File f = new File( "/whereever/thingy" );
FileInputStream fis = new FileInputStream( f );
FileChannel rc = fis.getChannel();
long fileLength = rc.size(); // length of file in bytes
long position = 0;

...
// Loop (runs once per writable callback):

if( position < fileLength )
{
      // transferTo returns the number of bytes actually sent
      position += rc.transferTo( position, fileLength-position, wbc );
}
else
{
      out( "file transferred to network Channel = "+wbc );
}

…where wbc is a SocketChannel returned from a ServerSocketChannel via:


SocketChannel wbc = ((ServerSocketChannel) key.channel()).accept();

The loop section that writes the file to the wbc would be in the middle of the non-blocking network select, thus:


while( true )
{
      try
      {
            selector.select();
            Set keys = selector.selectedKeys();

            Iterator i = keys.iterator();
            while( i.hasNext() )
            {
                  SelectionKey key = (SelectionKey) i.next();
                  i.remove();

                  if( key.isWritable() )
                  {
                        try
                        {
                              processWritableKey( key );
                        }
                        catch( IOException e )
                        {
                              key.cancel();
                              closeChannel( key.channel() );
                        }
                  }
                  else
                  {
                        out( "Unexpected: key \""+key+"\" is not writable; this Selector should only contain Writable keys...." );
                  }
            }
      }
      catch( IOException e )
      {
            // select() itself can fail; handle/log as appropriate
            out( "select() failed: "+e );
      }
}

…and the “processWritableKey” method runs the loop that writes the file to the channel. Obviously, you have to keep track of state etc., but you only asked “what’s the best way of doing this”, so I’m assuming you’re OK to fill in the blanks yourself.
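For the “fill in the blanks” part, here’s a minimal sketch of the state-tracking idea, assuming a hypothetical per-transfer state object you’d attach to each SelectionKey (e.g. via key.attach) - the `FileTransfer` name is my invention, not from the post:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical per-transfer state: which file, and how far we've got.
class FileTransfer
{
      final FileChannel rc;
      final long fileLength;
      long position = 0;

      FileTransfer( File f ) throws IOException
      {
            this.rc = new FileInputStream( f ).getChannel();
            this.fileLength = rc.size();
      }

      // One "writable" callback: push as much as the channel will take.
      // transferTo returns the number of bytes actually sent, which is
      // how we advance the position. Returns true once the file is done.
      boolean pushTo( WritableByteChannel wbc ) throws IOException
      {
            if( position < fileLength )
            {
                  position += rc.transferTo( position, fileLength - position, wbc );
            }
            return position >= fileLength;
      }
}
```

processWritableKey would then just pull the FileTransfer off the key, call pushTo, and cancel/close the key when it returns true.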

That’s a good combination of “logical” and “efficient”. If you wanted ultra-efficient, you could add bells and whistles, but with rapidly diminishing returns… If you don’t like NIO, or don’t want to learn it, then you’ll be back to methods that are inherently inefficient, and so you have to work harder to try and squeeze out performance.

Generally, I trust methods like “transferTo” to work - unless I have special needs in a given situation (like “this has GOT to be REALLY REALLY fast!!!” ;)). Quoting from the API docs:

 "This method is potentially much more efficient than a simple loop that reads from this channel and writes to the target channel. Many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them."

Well, when you’re using non-blocking I/O, you should just say “send all of it” - and it’ll come back and say “I actually sent 31,245 bytes”. So you say “go back and finish it off, you lazy £%£ !”. You don’t get to choose. Note, however, that for REALLY high performance, when you want to start doing fancy tricks like variable QoS (google for it) - where you want to give some clients faster service than others - you can tell NIO “transfer NO MORE THAN X bytes”. But it’s optional, and effectively pointless in normal usage.
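In code terms, that “transfer NO MORE THAN X bytes” cap is just the count argument of FileChannel.transferTo - e.g. a hypothetical throttled helper (the names are mine, purely for illustration):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

class ThrottledSend
{
      // Hand the channel at most maxChunk bytes per call - a crude way
      // to give some clients faster service than others. Returns the
      // new position (transferTo may still send fewer bytes than asked).
      static long sendChunk( FileChannel rc, long position, long fileLength,
                             long maxChunk, WritableByteChannel wbc )
            throws IOException
      {
            long count = Math.min( maxChunk, fileLength - position );
            return position + rc.transferTo( position, count, wbc );
      }
}
```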

When you’re using Buffered streams in 1.3.x and below, you often adopt a similar approach and just use “readLine()” which doesn’t read any specific amount of data - it handles the buffering for you. Sometimes this is fine, sometimes it’s stupid - all depends on circumstance. In theory, bearing in mind that you are already forsaking the higher performance of NIO, using the built-in buffering could be your best bet - because a JVM might have some clever tricks going on with doing native buffering. I wouldn’t bet on it though.

OTOH, I’ve done basic “read from fileinputstream into byte[], write from byte[] to Socket.getOutputStream()” before with variable sizes of byte array and noticed little or no difference for changing sizes. That was probably an artifact of the buffering being done beneath the covers.
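For reference, that pre-NIO approach is just the classic copy loop; a minimal sketch (the buffer size is arbitrary, per the discussion):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy
{
      // Classic pre-NIO file send: read into a byte[], write out what
      // was read (read() may return fewer bytes than the buffer holds).
      static void copy( InputStream in, OutputStream out, int bufSize )
            throws IOException
      {
            byte[] buf = new byte[ bufSize ];
            int n;
            while( ( n = in.read( buf ) ) != -1 )
            {
                  out.write( buf, 0, n );
            }
            out.flush();
      }
}
```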

1k == significantly less than the MTU of a standard TCP/IP link. So, you have a good chance that the entire 1k of data gets transmitted in a single packet. 1501 bytes could be really stupid, because it means that you are very likely to fragment your data into “one big packet, one tiny packet, one big packet, one tiny packet”. Assuming the presence of a good OS and network-card, you’d hope that together they stuffed your data into a large buffer on the NIC, and only sent out consistently sized packets. However, there are an awful lot of utterly crap network cards (both real and virtual) lurking around…

15k == enough less than 16k that you have a reasonably good chance of fitting inside the hardware send buffer of a typical network card (although YMMV - I still use some cards with buffers as small as 4k).

30k == again, aiming to fit inside network-card buffers. This is for slightly better network cards - and also happens to be about right to fit inside the SEND BUFFER SIZE reported by the 1.4 JVM on most machines I use…but until I find a detailed description of precisely WHAT that buffer includes (is it something reported by the OS? by the NIC hardware driver? what?), I’d “Assume nothing, and just suck it and see…”.

Those are just off the top of my head - I make no recommendations for any of them. Wait until you profile and find that you’re not achieving as high a throughput as you’d expect… (unless you’re using UDP, in which case how much data you send per packet is much more important - because the OS and the network card will NOT make any changes for you - so you should be planning it right from the start).

wow… all I can say is I really appreciate the time and effort you have put into explaining all this :slight_smile:

Although I haven’t looked at it in depth, NIO provides memory-mapped files for just these sorts of applications. Used properly, they allow you to route the data from file to socket within a modern OS and save major copying overhead.
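Roughly, the memory-mapped route would look like this (a sketch only - the class name is mine, and this is not a claim that it beats transferTo):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

class MappedSend
{
      // Map the whole file read-only and hand the buffer to the channel;
      // the OS pages the file in as the buffer is consumed, so the data
      // need never be copied through a Java-side byte[].
      static void send( File f, WritableByteChannel wbc ) throws IOException
      {
            FileChannel fc = new FileInputStream( f ).getChannel();
            try
            {
                  MappedByteBuffer map =
                        fc.map( FileChannel.MapMode.READ_ONLY, 0, fc.size() );
                  while( map.hasRemaining() )
                  {
                        wbc.write( map );
                  }
            }
            finally
            {
                  fc.close();
            }
      }
}
```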

[quote]Although I havent looked at it in depth, NIO provides memory mapped files for just such types of applications. Used properly they will allow you to route the data from file to socket within a modern OS and save major copying over-head.
[/quote]
IIRC this is considerably slower than not mem-mapping the file, even on the server VM (where it’s only a little bit slower).

And if you’re supposed to mem-map for this, then WTF is the

FileChannel.transferTo(…)

method for? I don’t have access to javadocs at the mo (on a VERY slow dialup away from home), but IIRC that method is expressly for this purpose?

You’re deeper into the API than I’ve gone.

What I CAN tell you is that this is precisely what memory-mapped files are for. On an OS that supports proper memory mapping, it means you never have to copy the data out of the system’s buffer - you can transfer it directly within the OS.

Likely the call you’re asking about uses memory mapping, on Solaris at least.

JK