Selector not blocking on select

Jannick · February 25, 2004, 7:22pm

Hi, I finally got around to playing with some NIO code again, but of course things had to stop working before I could get anywhere :-/ To make it easier to follow, I’ve put together some code containing only the basics, in which the problem still occurs.

If I connect to my code using e.g. telnet, everything works fine. If I then make a connection from Macromedia Flash and go on to start up the Flash script (which means closing the connection and quickly opening a new one), my selector suddenly starts to return immediately on select, returning 0 selection keys. Sometimes this condition even occurs the first time I connect from Flash. My original code connected to another server (and went bananas when the connections were dropped quickly), but the behavior has been similar with this test code.

Once the selector has started returning immediately, it will continue to do so until a new connection is made, after which it goes back to normal blocking operation. Another thing worth noting is that if I don’t write anything to the socket, then it’s not possible to trigger this behavior.

Hope someone here can help me figure it out, cause I’m pretty much out of ideas atm =) I’ll post the code in separate posts to make it easier to follow.

Jannick · February 25, 2004, 7:23pm

ConnectionTenant: A “controller” for each connection, just a separate output buffer in this example:


import java.nio.*;
import java.nio.channels.*;

public class ConnectionTenant{

    private SelectionKey sk;
    private ByteBuffer outBuffer;
    private ConnectionWorker cw;

    public ConnectionTenant(ConnectionWorker cw){
      this.cw = cw;

      String welcome = "Hej med dig";
      outBuffer = ByteBuffer.allocate(500);

      byte[] outArr = welcome.getBytes();
      outBuffer.put(outArr);
    }


    public void setSelectionKey(SelectionKey sk){
      this.sk = sk;
      cw.addWriteInterest(sk);
    }
    
    
    public boolean hasData(){
      return (outBuffer.position() > 0);
    }

    public ByteBuffer getOutputBuffer(){
      outBuffer.flip();
      return outBuffer;
    }

    public void onWritePerformed(){
      outBuffer.compact();
    }

    public boolean isRequestingShutdown(){
      return false;
    }

    public void onShutdown(){
      System.out.println("[ConnectionTenant:onShutdown()]");
    }
    

}
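
The tenant above relies on the usual ByteBuffer cycle: put() fills the buffer, flip() switches it to draining, and compact() keeps whatever was not written and switches back to filling. A minimal sketch of that cycle follows; the class name BufferCycleDemo is made up for illustration, and the 4-byte partial "write" is simulated with get() rather than a real channel.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the posted code.
class BufferCycleDemo {

    /** Returns {position, limit} of the buffer after put -> flip -> partial drain -> compact. */
    static int[] run() {
        ByteBuffer out = ByteBuffer.allocate(500);

        // Filling: the 11 bytes of the welcome text move position to 11.
        out.put("Hej med dig".getBytes(StandardCharsets.US_ASCII));

        // Draining: flip() sets limit = 11 and position = 0.
        out.flip();

        // Simulate a channel that only accepted 4 of the 11 bytes.
        byte[] sent = new byte[4];
        out.get(sent);

        // compact() copies the 7 unsent bytes to the front and returns to
        // filling mode: position = 7, limit = capacity.
        out.compact();

        return new int[] {out.position(), out.limit()};
    }

    public static void main(String[] args) {
        int[] state = run();
        System.out.println("position=" + state[0] + " limit=" + state[1]);
    }
}
```

With this cycle, hasData() testing position() > 0 is only meaningful while the buffer is in filling mode, i.e. before getOutputBuffer() flips it or after onWritePerformed() compacts it.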

Jannick · February 25, 2004, 7:24pm

ConnectionWorker: The selector object


import java.nio.*;
import java.nio.channels.*;
import java.io.*;
import java.util.*;

public class ConnectionWorker implements Runnable{

    private FIFOQueue newConnections = new FIFOQueue(10);
    private FIFOQueue writeRequests = new FIFOQueue(10);
    private Selector selector;
    private ByteBuffer readBuffer;
    private Thread internalThread;
    
    public ConnectionWorker() throws IOException{
      selector = Selector.open();
      readBuffer = ByteBuffer.allocateDirect(1000);

      internalThread = new Thread(this);
      internalThread.start();
    }


    public synchronized void addSocketChannel(SocketChannel sc){
      newConnections.addObject(sc);
      selector.wakeup();
    }

    public synchronized void addWriteInterest(SelectionKey sk){
      writeRequests.addObject(sk);
      selector.wakeup();
    }

    private void closeChannel(SocketChannel sc){
      if (sc == null) return;

      SelectionKey sk = sc.keyFor(selector);
      if (sk != null){
          if (sk.attachment() != null){
            ((ConnectionTenant)sk.attachment()).onShutdown();
          }
          sk.cancel();
      }

      try{
          sc.socket().close();
      }catch (IOException e){
          e.printStackTrace();
      }

      try{
          sc.close();
      }catch (Throwable t){
          t.printStackTrace();
      }

    }

    private void processNewConnections(){
      for (Object o = newConnections.getObjectNow(); o != null; o = newConnections.getObjectNow()){
          SocketChannel sc = (SocketChannel)o;

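          // Skip channels that are no longer connected; a "return" here would
          // leave the loop with connections still waiting in the queue.
          if(!sc.isConnected()){
            continue;
          }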
          

          try{
            System.out.println("[ConnectionWorker] Registering new connection");

            sc.configureBlocking(false);
            ConnectionTenant ct = new ConnectionTenant(this);
            SelectionKey sk = sc.register(selector, SelectionKey.OP_READ, ct);
            ct.setSelectionKey(sk);

          }catch (Exception e){
            System.out.println("Error caught in processNewConnections(). Printing stack trace: ");
            e.printStackTrace();
            
            closeChannel(sc);
          }
      }

    }//end processNewConnections()


    private void processWriteInterests(){
      for (Object o = writeRequests.getObjectNow(); o != null; o = writeRequests.getObjectNow()){
          SelectionKey sk = (SelectionKey)o;
          sk.interestOps(sk.interestOps() | SelectionKey.OP_WRITE);
      }

    }//end processWriteInterests()
    


    

    public void run(){
      int nSelKeys = 0;
      System.out.println("[ConnectionWorker] Starting service");

      while (true){
          processNewConnections();
          processWriteInterests();
          
          try{
            nSelKeys = selector.select();

          }catch (Exception e){
            e.printStackTrace();
            return;
          }

          System.out.println("[ConnectionWorker] Selected " + nSelKeys + " keys");
          if (nSelKeys == 0) continue;
          
          Set keys = selector.selectedKeys();
          Iterator i = keys.iterator();

          while (i.hasNext()){
            SelectionKey sk = (SelectionKey)i.next();
            i.remove();

            processKey(sk);
          }
      }

    }//end run()


    private void processKey(SelectionKey sk){
      if (!sk.isValid()){
          System.out.println("[DEBUG] Invalid key");
          return;
      }
      
      if (sk.isReadable()){
          if (!processRead(sk)) return;
      }

      if (sk.isWritable()){
          processWrite(sk);
      }
      
    }//end processKey()


    private boolean processRead(SelectionKey sk){
      readBuffer.clear();
      ReadableByteChannel rbc = (ReadableByteChannel)sk.channel();
      int numBytesRead = 0;
      
      try{
          numBytesRead = rbc.read(readBuffer);

      }catch (Exception e){
          e.printStackTrace();
          closeChannel((SocketChannel)sk.channel());
          return false;
      }
      

      if (numBytesRead < 0){
          closeChannel((SocketChannel)sk.channel());
          return false;
      }

      readBuffer.flip();
      //Discard data in demo


      return true;
    }//end processRead()



    private void processWrite(SelectionKey sk){
      ConnectionTenant ct = (ConnectionTenant)sk.attachment();
      
      if(ct.hasData()){
          WritableByteChannel wbc = (WritableByteChannel)sk.channel();
          
          try{
            ByteBuffer writeBuffer = ct.getOutputBuffer();
            wbc.write(writeBuffer);
            
            ct.onWritePerformed();

          }catch (Exception e){
            e.printStackTrace();
            
            closeChannel((SocketChannel)sk.channel());
          }


      }else{
          sk.interestOps(sk.interestOps() & (~SelectionKey.OP_WRITE));

          if (ct.isRequestingShutdown()){
            closeChannel((SocketChannel)sk.channel());
          }
      }

    }//end processWrite()

}
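
processRead() above treats a negative read count as a disconnect. A minimal sketch of that signal follows, using a java.nio.channels.Pipe as a stand-in for a socket so that it runs without a network; the class name EndOfStreamDemo is made up, and the assumption is that a closed writer shows up on the reading side much like a closed peer does on a SocketChannel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class, not part of the posted code.
class EndOfStreamDemo {

    /** Returns {number of keys selected, result of read()} after the writer closes its end. */
    static int[] run() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            pipe.source().register(selector, SelectionKey.OP_READ);

            // Close the writing end without ever sending data.
            pipe.sink().close();

            // The closed writer is reported through OP_READ ...
            int selected = selector.select(1000);

            // ... and the read itself reports end-of-stream.
            int read = pipe.source().read(ByteBuffer.allocate(16));

            pipe.source().close();
            return new int[] {selected, read};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("selected=" + result[0] + " read=" + result[1]);
    }
}
```

If the key is not cancelled and the channel not closed after such a read, the channel stays readable and the selector keeps reporting it on every pass.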

Jannick · February 25, 2004, 7:25pm

Should be irrelevant, but here’s the ConnectionListener:

import java.util.*;
import java.net.*;
import java.io.*;
import java.nio.*;
import java.nio.channels.*;


public class ConnectionListener implements Runnable{


    private Selector selector;
    private Thread internalThread;
    private ConnectionWorker cw;
    private String host;
    private int port;

    public ConnectionListener(String host, int port) throws IOException{
      this.host = host;
      this.port = port;

      selector = Selector.open();
      
      ServerSocketChannel ssc = ServerSocketChannel.open();
      ssc.configureBlocking(false);
      ssc.socket().bind(new InetSocketAddress(host, port));
      ssc.register(selector, SelectionKey.OP_ACCEPT);
      
      cw = new ConnectionWorker();

      internalThread = new Thread(this);
      internalThread.start();

    }



    public void run(){
      int nSelKeys = 0;
      System.out.println("[ConnectionListener] Starting service");

      while (true){
          
          try{
            nSelKeys = selector.select();

          }catch (Exception e){
            e.printStackTrace();
            return;
          }
          
          if (nSelKeys == 0) continue;
          
          Set keys = selector.selectedKeys();
          Iterator i = keys.iterator();

          while(i.hasNext()){
            SelectionKey sk = (SelectionKey)i.next();
            i.remove();

            processKey(sk);
          }

      }

    }//end run()



    private void processKey(SelectionKey sk){
      if(!sk.isValid()){
          System.out.println("[ConnectionListener] Invalid key");
          return;
      }

      try{
          ServerSocketChannel ssc = (ServerSocketChannel)sk.channel();
          SocketChannel sc = ssc.accept();

          // accept() on a non-blocking channel returns null when no connection is pending
          if (sc != null) cw.addSocketChannel(sc);

      }catch (Exception e){
          e.printStackTrace();
          return;
      }

    }//end processKey()



    public static void main(String[] args) throws Exception{
      if(args.length < 2){
          System.out.println("Syntax: ConnectionListener <host> <port>");
          System.exit(0);
      }

      int port = 5000;
      String host = args[0];
      
      try{
          port = Integer.parseInt(args[1]);
      }catch (NumberFormatException nfe){
          // Not a number: keep the default port (5000)
      }

      System.out.println("host: " + host);
      System.out.println("port: " + port);

      ConnectionListener cl = new ConnectionListener(host, port);
    }





}

blahblahblahh · February 26, 2004, 8:43am

Firstly, are you using 1.4.2? If not, no-one is likely to care. 1.4.0 and 1.4.1 do not work with NIO: they have too many major bugs, many with no workaround. They are also very platform-dependent, so only someone with your exact OS can help you.

Assuming you’re using 1.4.2 or above, I’m afraid that’s far too much code to wade through without any comments. The execution path is definitely non-obvious, so working out what you’re doing, and when, is a difficult task for anyone other than you to do quickly. This is why I spent 30 seconds looking at your code and moved on (having learnt nothing at all in 30 seconds), assuming someone else with more time would have a look for you.

Since no-one else has replied… If you can put together all the lines - in sequence - that handle or interact with your Selector, I’ll have another look. This should be about 20-40 lines of code. No methods, please, and comments every few lines to say what you’re about to do in the next X lines (e.g. every 2-5 lines is usually about right for selector interaction) would help immensely. Often it’s possible to spot the problem just by reading the comments.

Jannick · February 26, 2004, 10:43am

I’m using 1.4.2 and have tried it on both Windows 2000 and Windows XP. I’ll just get the important stuff typed out for the selector loop.

Jannick · February 26, 2004, 11:03am

The basic principle is that only the selector thread performs selector-related operations. New connections are added to a queue, and the selector thread then registers them as part of the select loop:

  
//Executed for all new connections 
//ConnectionTenant is a buffer holding object
sc.configureBlocking(false); 
ConnectionTenant ct = new ConnectionTenant(this); 
SelectionKey sk = sc.register(selector, SelectionKey.OP_READ, ct); 
ct.setSelectionKey(sk);

Similarly, if the connection tenant wants to write, it adds a request to a queue; as part of the select loop, the thread then modifies interestOps with this:

 //Executed for each connection that wants to write
sk.interestOps(sk.interestOps() | SelectionKey.OP_WRITE);
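
The interest set is a plain int bit mask, so switching write interest on and off comes down to OR-ing the OP_WRITE bit in and AND-ing its complement out. A minimal sketch of just that arithmetic, with no selector involved; the class and method names are made up for illustration.

```java
import java.nio.channels.SelectionKey;

// Hypothetical demo class, not part of the posted code.
class InterestOpsDemo {

    /** Adds OP_WRITE to an interest set and leaves every other bit untouched. */
    static int addWrite(int ops) {
        return ops | SelectionKey.OP_WRITE;
    }

    /** Removes OP_WRITE from an interest set and leaves every other bit untouched. */
    static int removeWrite(int ops) {
        return ops & ~SelectionKey.OP_WRITE;
    }

    public static void main(String[] args) {
        int ops = SelectionKey.OP_READ;
        ops = addWrite(ops);
        System.out.println("read+write: " + ops);
        ops = removeWrite(ops);
        System.out.println("read only:  " + ops);
    }
}
```

Both operations are idempotent, so it does no harm if a key is queued for write interest more than once.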

The SelectionKeys returned by selector.select() are checked for ready operations by this code:

 if (sk.isReadable()){ 
     if (!processRead(sk)) return; 
 } 
 
 if (sk.isWritable()){ 
     processWrite(sk); 
 }

If it’s readable, the reading is done with this:

//Using a shared direct ByteBuffer for reading in the available data. In this example the data is never passed on/used
readBuffer.clear(); 
ReadableByteChannel rbc = (ReadableByteChannel)sk.channel(); 
//try/catch omitted here; a negative count means the connection was closed
int numBytesRead = rbc.read(readBuffer);

And if it’s writable:

//ConnectionTenant holds a write buffer for the connection. Get the buffer and write from it. ConnectionTenant takes care of preparing the buffer to be written before returning it.
ConnectionTenant ct = (ConnectionTenant)sk.attachment(); 

if(ct.hasData()){ 
    WritableByteChannel wbc = (WritableByteChannel)sk.channel(); 

    ByteBuffer writeBuffer = ct.getOutputBuffer(); 
    wbc.write(writeBuffer); 

    ct.onWritePerformed();
}

The initial handling of the Set returned by the select operation is done by this code:

//If the selector returned any keys, iterate through the key set and hand each of them off for processing
if (nSelKeys == 0) continue; 

Set keys = selector.selectedKeys(); 
Iterator i = keys.iterator(); 

while (i.hasNext()){ 
    SelectionKey sk = (SelectionKey)i.next(); 
    i.remove(); 

    processKey(sk); 
}

Hope that makes it a bit clearer.

blahblahblahh · February 26, 2004, 11:33am

[quote]my selector suddenly starts to return immediately on select, returning 0 selection keys. Sometimes this condition even occurs the first time I connect from Flash. My original code connected to another server (and went bananas when the connections were dropped quickly), but the behavior has been similar with this test code.
[/quote]
The only way it will return 0 selectionkeys is if you called wakeup() (you’re not using interrupt() AFAICS).

Try checking your code that calls wakeup() and see how it could end up being invoked more often than you expect…?

EDIT: Here’s a guess: you’re calling wakeup() once too many times for each time you intend to call it. If you check the API docs you’ll see that they “stack up” - so that if you call it twice when a select is blocking, the first one will wake up the selector, and the second one will be queued so that the next “select” returns as soon as it’s called.
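
How a wakeup() issued outside a select carries over can be checked in isolation. The Selector.wakeup() documentation says that if no selection is in progress, the next one returns immediately, and that invoking wakeup() more than once between two successive selections has the same effect as invoking it once. A minimal sketch that times this on an empty selector follows; the class name WakeupDemo and the timeouts are arbitrary choices, and the behavior described is that of a current JVM, not necessarily of 1.4.x.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

// Hypothetical demo class, not part of the posted code.
class WakeupDemo {

    /** Returns {keys from 1st select, millis it took, keys from 2nd select, millis it took}. */
    static long[] run() {
        try (Selector selector = Selector.open()) {
            // Two wakeups while nobody is selecting.
            selector.wakeup();
            selector.wakeup();

            long t0 = System.nanoTime();
            int first = selector.select(2000);  // consumes the pending wakeup, returns at once
            long t1 = System.nanoTime();
            int second = selector.select(300);  // nothing pending any more: waits for the timeout
            long t2 = System.nanoTime();

            return new long[] {first, (t1 - t0) / 1_000_000, second, (t2 - t1) / 1_000_000};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = run();
        System.out.println("first select:  " + r[0] + " keys after " + r[1] + " ms");
        System.out.println("second select: " + r[2] + " keys after " + r[3] + " ms");
    }
}
```

On that reading, a stray wakeup() accounts for one immediate return with 0 keys per call, but not for a selector that keeps returning immediately on every pass.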

Jannick · February 26, 2004, 11:36am

I’ve tried a System.out.println("…") before both wakeups, and it’s never called. The selector just returns immediately on .select() until a new connection is added.

It keeps returning on select (producing a 4 MB logfile in 10-15 seconds), so it’s not because wakeups are queued.

cknoll · February 26, 2004, 1:25pm

Question: if you put non-blocking channels in a selector, will they automatically appear ‘ready’ when you do a select? I thought the point of selectors was you register blocking channels and when one of them ‘unblocks’ it will notify the selector? Or something like that…

I’m referring to this block of code:


 //Executed for all new connections  
//ConnectionTenant is a buffer holding object
sc.configureBlocking(false);  
ConnectionTenant ct = new ConnectionTenant(this);  
SelectionKey sk = sc.register(selector, SelectionKey.OP_READ, ct);  
ct.setSelectionKey(sk);

-Chris

blahblahblahh · February 26, 2004, 2:47pm

[quote]I’ve tried a System.out.println("…") before both wakeups, and it’s never called. The selector just returns immediately on .select() until a new connection is added.
[/quote]
Assuming there are no keys, and if you can comment out the wakeups, and there are no other wakeups or interrupts in another part of your code (do a search/replace for them), AND if you aren’t silently handling an exception, then you probably have a bug. C.f. the API docs for info on what select does, and assuming the above it looks to me (I just re-read it to be sure, but maybe I’ve missed something so check yourself) like the contract is being broken.

Cut out as much code as possible whilst preserving the problem, and get ready to file a bug report…but paste here if you can get it down real small (inline most methods, and aim for under 60 LOC if you can) and maybe some subtle mistake will become obvious :).

blahblahblahh · February 26, 2004, 2:49pm

[quote]Question: if you put non-blocking channels in a selector, will they automatically appear ‘ready’ when you do a select? I thought the point of selectors was you register blocking channels and when one of them ‘unblocks’ it will notify the selector? Or something like that…
[/quote]
They automatically appear as “ready” for whichever of the operations you said you wanted to be notified of IFF that operation has data (or something; nb there is the undocumented notify-of-disconnect that counts as “data” here) ready.

But they don’t disappear unless you manually remove them.

But he says nothing is appearing as ready - there are no selection keys in the selector’s set.
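
The point about keys not disappearing can be shown with one readable channel. select() returns the number of keys whose ready sets were updated, so a key left in the selected-key set is not counted again, even though the call still comes back straight away. A minimal sketch follows, using a java.nio.channels.Pipe so that no network is needed; the class name SelectedKeysDemo is made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class, not part of the posted code.
class SelectedKeysDemo {

    /** Returns the results of three successive selects on one readable channel. */
    static int[] run() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            pipe.source().register(selector, SelectionKey.OP_READ);

            // Make the source readable, and never read the byte.
            pipe.sink().write(ByteBuffer.wrap(new byte[] {42}));

            // 1st pass: the key is added to the selected-key set and counted.
            int first = selector.select(1000);

            // 2nd pass: the key was not removed, its ready set does not change,
            // so it is not counted again.
            int second = selector.selectNow();

            // 3rd pass: once the key is removed, the still-unread byte
            // gets the channel reported again.
            selector.selectedKeys().clear();
            int third = selector.select(1000);

            pipe.sink().close();
            pipe.source().close();
            return new int[] {first, second, third};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + ", " + r[1] + ", " + r[2]);
    }
}
```

The posted loop does call i.remove() for every key, so this is probably not what is happening here, but it is one well-defined way to get a select() that returns 0 without blocking.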

Jannick · February 26, 2004, 4:16pm

I’ve tried inlining as much as possible, and made a little discovery. If I register the connection for OP_WRITE right from the beginning, then I can’t reproduce the problem, so it seems it could be related to the sk.interestOps(sk.interestOps() | SelectionKey.OP_WRITE) call. I’ll get some more tests done on this before posting more code; I can’t really get it down to 60 LOC so far.

Another little note is that it seems much easier to trigger this condition when I run the Flash client on another computer, so there might be a timing factor involved.

cknoll · February 26, 2004, 4:19pm

Ok, well, even if that is the case, I’m not sure if you want to set these sockets up as non-blocking because that’s what the selector is for: to block only when all channels in the selector would block. So change the line of code to this:


 //ConnectionTenant is a buffer holding object
sc.configureBlocking(true);

and see if that gives you the desired result. If it fixes the problem but what blahablhbalhahalahaha says is correct, then that’s probably an issue that needs to be reported.

-Chris

Jannick · February 26, 2004, 4:22pm

If I leave them in blocking mode (the default for new connections), then I’ll end up with blocking read and write calls, and that pretty much kills the idea.
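
There is also a harder constraint: SelectableChannel.register() is documented to throw IllegalBlockingModeException when the channel is in blocking mode, so a blocking channel cannot be put into a selector at all. A minimal sketch that tries both modes on an unconnected SocketChannel; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of the posted code.
class BlockingModeDemo {

    /** Returns true if register() refused a channel configured with the given blocking mode. */
    static boolean registerRejected(boolean blocking) {
        try (Selector selector = Selector.open();
             SocketChannel sc = SocketChannel.open()) {
            sc.configureBlocking(blocking);
            try {
                sc.register(selector, SelectionKey.OP_READ);
                return false; // registration was accepted
            } catch (IllegalBlockingModeException e) {
                return true;  // blocking channels cannot be registered
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking channel rejected:     " + registerRejected(true));
        System.out.println("non-blocking channel rejected: " + registerRejected(false));
    }
}
```

So configureBlocking(false) before register() is required, not optional; the selector does the waiting, and the channel operations themselves must never wait.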

Jannick · February 26, 2004, 5:07pm

Okay I’ve figured out what causes the problem, dunno if you would classify it as a bug.

In my original test code I made a ConnectionTenant object for each connection, which would hold an internal write buffer. The tenant would prepare some welcome/acknowledge data in a ByteBuffer, and on setSelectionKey(SelectionKey sk) it would call back to add itself to a “wantToWrite” queue. The effective result of this was that the selector’s own thread called back into a method that called selector.wakeup().

In the current version of my code I don’t even write (and don’t have to read either; I just kept the code for it to handle disconnects):

          //Register the new connections in queue
          for (Object o = newConnections.getObjectNow(); o != null; o = newConnections.getObjectNow()){
            SocketChannel sc = (SocketChannel)o;
            if(!sc.isConnected()) continue;
            
            try{
                System.out.println("[ConnectionWorker] Registering new connection");
                sc.configureBlocking(false);
                SelectionKey sk = sc.register(selector, SelectionKey.OP_READ);
                System.out.println("INNER WAKEUP");
                //selector.wakeup();

            }catch (Throwable tr){
                tr.printStackTrace();
                closeChannel(sc);
            }
          }

The code is placed before selector.select() in the select loop. If I comment out selector.wakeup(), everything runs as normal, but if I leave it in, the selector will start to return immediately on select. The wakeup in the code snippet is only called once per new connection, so it’s not as if it queues up a new “wakeup request” on each loop iteration. I’ve made a test where wakeup() in the select loop is called just once, but the selector keeps returning from select without blocking until a new connection is made.

An important note is that if I connect to the server with telnet from my workstation (where I also run the server), I can’t reproduce the bug. The first time (most of the time; sometimes it takes 2-3 connections) I connect with telnet from my “test machine” (which stands right next to the workstation, both connected to the same switch), it will start its non-blocking select. I don’t know enough about low-level IO to say anything based on this, but it obviously matters that it’s a “real” network connection.
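
One way to sidestep the self-wakeup is to let the queueing method skip wakeup() when it is called from the selector thread itself, since that thread cannot be blocked in select() at that moment. A minimal sketch under that assumption follows. PendingRegistrations and its methods are hypothetical names, java.util.concurrent.ConcurrentLinkedQueue (Java 5+) stands in for the FIFOQueue class, whose source was not posted, and demo() is only there to exercise the helper.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper, not part of the posted code.
class PendingRegistrations {
    private final Selector selector;
    private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();
    private volatile Thread selectorThread;

    PendingRegistrations(Selector selector) {
        this.selector = selector;
    }

    /** Call once from the thread that runs the select loop. */
    void bindToCurrentThread() {
        selectorThread = Thread.currentThread();
    }

    /** Queues a channel for registration; returns true if wakeup() was issued. */
    boolean enqueue(SocketChannel sc) {
        pending.add(sc);
        if (Thread.currentThread() == selectorThread) {
            return false; // the select loop is running this very call, so it is not blocked
        }
        selector.wakeup();
        return true;
    }

    /** Registers everything queued so far; meant to run in the select loop before select(). */
    int registerPending() {
        int registered = 0;
        for (SocketChannel sc = pending.poll(); sc != null; sc = pending.poll()) {
            try {
                sc.configureBlocking(false);
                sc.register(selector, SelectionKey.OP_READ);
                registered++;
            } catch (IOException e) {
                try { sc.close(); } catch (IOException ignored) { }
            }
        }
        return registered;
    }

    /** Self-check: returns {wakeup issued from own thread, from another thread, channels registered}. */
    static int[] demo() {
        try (Selector selector = Selector.open();
             SocketChannel a = SocketChannel.open();
             SocketChannel b = SocketChannel.open()) {
            PendingRegistrations q = new PendingRegistrations(selector);
            q.bindToCurrentThread();
            boolean own = q.enqueue(a);
            boolean[] other = new boolean[1];
            Thread t = new Thread(() -> other[0] = q.enqueue(b));
            t.start();
            t.join();
            int registered = q.registerPending();
            return new int[] {own ? 1 : 0, other[0] ? 1 : 0, registered};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

This only avoids the trigger described above; it does not explain why one extra wakeup() would keep a selector from blocking afterwards.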

Jannick · February 26, 2004, 5:11pm

Here’s my current test code; I got it down somewhat in length.

addSocketChannel(SocketChannel sc)… is called by the ConnectionListener and is the only place any other thread calls any method in this object.


import java.io.*;
import java.nio.*;
import java.nio.channels.*;
import java.util.*;

public class ConnectionWorker implements Runnable{

    private FIFOQueue newConnections = new FIFOQueue(10);
    private Selector selector;
    private Thread internalThread;
    private ByteBuffer readBuffer;

    public ConnectionWorker() throws IOException{
      selector = Selector.open();
      readBuffer = ByteBuffer.allocateDirect(1000);

      internalThread = new Thread(this);
      internalThread.start();
    }


    public void addSocketChannel(SocketChannel sc){
      newConnections.addObject(sc);
      System.out.println("Calling wakeup()!! new socket added to queue");
      selector.wakeup();
    }


    private void closeChannel(SocketChannel sc){
      if (sc == null) return;

      SelectionKey sk = sc.keyFor(selector);
      if (sk != null) sk.cancel();

      try{
          sc.close();
      }catch (Throwable t){
          t.printStackTrace();
      }

    }


    public void run(){
      int nSelKeys = 0;
      System.out.println("[ConnectionWorker] Starting service");

      while (true){
          //Register the new connections in queue
          for (Object o = newConnections.getObjectNow(); o != null; o = newConnections.getObjectNow()){
            SocketChannel sc = (SocketChannel)o;
            if(!sc.isConnected()) continue;
            
            try{
                System.out.println("[ConnectionWorker] Registering new connection");
                sc.configureBlocking(false);
                SelectionKey sk = sc.register(selector, SelectionKey.OP_READ);
                System.out.println("INNER WAKEUP");
                //selector.wakeup();

            }catch (Throwable tr){
                tr.printStackTrace();
                closeChannel(sc);
            }
          }

          try{
            nSelKeys = selector.select();

          }catch (Exception e){
            e.printStackTrace();
            return;
          }

          System.out.println("[ConnectionWorker] Selected " + nSelKeys + " keys");
          if (nSelKeys == 0) continue;
          
          Set keys = selector.selectedKeys();
          Iterator i = keys.iterator();

          while (i.hasNext()){
            try{
                SelectionKey sk = (SelectionKey)i.next();
                i.remove();


                if (sk.isReadable()){
                  //Read so that it won't keep returning the key
                  //in case the test client sends data
                  readBuffer.clear();
                  int readBytes = ((SocketChannel)sk.channel()).read(readBuffer);
                  if (readBytes < 0) closeChannel((SocketChannel)sk.channel());
                }

            }catch (Throwable t){
                t.printStackTrace();
                continue; //next key
            }   

          }//end while (i.hasNext())

      }

    }//end run()

}

blahblahblahh · February 26, 2004, 5:11pm

[quote]Ok, well, even if that is the case, I’m not sure if you want to set these sockets up as non-blocking because that’s what the selector is for: to block only when all channels in the selector would block. So change the line of code to this:
[/quote]
!!! What do you see as the point of the method call configureBlocking() then, if it’s not to enable non-blocking mode?

blahblahblahh · February 26, 2004, 5:28pm

[quote]Okay I’ve figured out what causes the problem, dunno if you would classify it as a bug.

…Effective result of this was that the selectors own thread called back into a method that called selector.wakeup.
[/quote]
So my guess was pretty close?

If you look at the way the API is designed this problem makes sense somewhat… The design (although Sun doesn’t really explain this in the docs, you can find treatises on the different ways OS’s implement asynch elsewhere) is that once something happens that the Selector notices, it assumes that thing is always happening, and doesn’t listen out for it to stop.

Hence the warnings to remember to remove keys from the set, else your channels will always appear to be readable/writable/etc. from as soon as they first become so.

If the select exited with a key state-change, you can reset it by removing the keys. If it exited with a wakeup with no keys, you can’t do anything to change its internal state, so it carries on in the same state forever, returning an empty key set.

I suspect there is a naive FSM inside which saves its previous state and monitors if the state has been changed. If this is even close to true, it’s very sad because it means the authors didn’t have a good set of unit tests (this happens very occasionally with particular parts of the standard libs, where a group of bugs appear that show the author of a particular class was not doing much unit testing).

I believe this is definitely a bug, because I’m pretty sure that this wasn’t the intended behaviour. I suggest you log a bug-report, and put in the suggested action/workaround fields something like:

1. You could change it to automatically reset its status if it leaves a select with no keys, so that it won’t immediately do the same thing on the next select call (i.e. fix the bug) - this is the preferred option
2. You could add a method .reset() to Selector which does the same thing manually, so that if someone calls wakeup and gets back an empty set they can at least force it to go back to blocking - this is in case there is some reason why number 1 is undesirable. It also has the benefit of being backwards compatible.

If there’s a reason why this isn’t a bug, they’ll probably tell you!

(don’t forget to include your handy shortened code; with the code snippet they’re much more likely to accept the bug, assuming you give them enough info to reproduce it!)

cknoll · February 26, 2004, 11:48pm

I’m saying that I don’t see the point in enabling non-blocking mode on a socket when using channel selectors. Why would you?

-Chris