puppygames.net:25000 is rock solid too.
I’m pretty certain the trouble is with your ISP, BT (it used to be crappy, back in the day) or your home network, fwiw.
The last unknown is your mystery socketFactory that may be configured… oddly.
Hmm, I’m on PlusNet (and have been the entire time I’ve lived here). I wonder what they could be doing wrong.
I’ll just check I get similar results from java-gaming…
Would you mind altering the server code so that it doesn’t close the socket?
Cas
Well, I can’t let the server run out of file handles. That can screw up the OS quite badly, as every single service/process will start to fail in spectacular ways. So… what can I do for you instead? A ping/pong-like service?
Don’t worry, I’m closing the sockets immediately from the client end and I’ll only run the test for a few seconds… should survive?
(Or stick in a max count and then abort the process after say 1000)
Cas
Deployed:
[icode]Server.java[/icode]
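import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

public class Server {
    public static void main(String[] args) {
        ServerSocket ss = null;
        // Accepted sockets are deliberately left open; only the oldest are
        // closed once more than 50 are held, so file handles can't run out.
        List<Socket> open = new ArrayList<>();

        while (true) {
            // (Re)bind the listening socket if we don't have one.
            if (ss == null) {
                try {
                    ss = new ServerSocket(25000);
                } catch (IOException e) {
                    e.printStackTrace();
                    sleep(1000);
                    continue;
                }
            }

            try {
                Socket s = ss.accept();
                open.add(s);
                System.out.println("accepted[" + s + "] / " + open.size());
            } catch (IOException e) {
                e.printStackTrace();
                // Close the broken listener before rebinding, otherwise
                // port 25000 may still be held and the rebind fails.
                try {
                    ss.close();
                } catch (IOException ignored) {
                    // nothing more we can do
                }
                ss = null;
                sleep(1000);
            }

            // Drop the oldest connections first.
            while (open.size() > 50) {
                try {
                    open.remove(0).close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    private static void sleep(int ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException exc) {
            // meh
        }
    }
}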
Hmm, well that worked perfectly too on JGO. I’ll try that on puppygames.
Cas
Linode… given they dropped the price of their low-end VPS from $20 to $10 / month, there’s no reason not to make the switch. Dirt cheap, can’t break it (easily). It just rocks. :point:
Think I may migrate to Linode at some point in the near future.
Running that exact code on puppygames.net:25000 now - and it fails almost instantly here with connect timed out etc. How about when you try connecting puppygames?
Cas
As said before, puppygames.net:25000 was (and to this very moment is!) equally stable for me, just some hefty (atlantic ocean induced) latency.
Why are you using a SocketFactory, and how is it configured?
Right then… so what does this tell us.
Firstly, that it’s not puppygames.net: it works fine for you
Secondly, that it’s not my server code: your code has the same problems for me and also works fine for you
Thirdly, that it’s not my machine (also verified problem exists with laptop too btw): I can run against JGO and the client is fine
Fourthly, that it’s not the client code: as I can run against JGO without problems
Fifthly, that it’s not my ISP: as I can run against JGO without problems
Sixthly, that the port number makes no difference: happens to port 80 as well
Seventhly, that the rate makes no difference: happens at full pelt or at 1 every 10 seconds
It seems to only occur between my computer and puppygames.net.
Sorta running out of options here.
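For anyone who wants to reproduce this, here is a minimal sketch of the kind of client probe described in the list above. It is not the client actually used in this thread (that code was never posted). The class name `ConnectProbe`, its parameters and the 5 second connect timeout are all assumptions made for illustration. It opens a connection, closes it straight away, and counts how many attempts fail, with an optional pause so the rate can be varied.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical test client: not the code used in the thread.
class ConnectProbe {

    /**
     * Opens and immediately closes {@code attempts} TCP connections, one after
     * another, and returns how many of them failed (refused, timed out, ...).
     */
    static int probe(String host, int port, int attempts, int timeoutMs, long pauseMs)
            throws InterruptedException {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            long start = System.nanoTime();
            // try-with-resources closes the socket right away, success or not
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), timeoutMs);
                System.out.println(i + ": connected in " + elapsedMs(start) + " ms");
            } catch (IOException e) {
                failures++;
                System.out.println(i + ": FAILED after " + elapsedMs(start) + " ms: " + e);
            }
            if (pauseMs > 0) {
                Thread.sleep(pauseMs);
            }
        }
        return failures;
    }

    private static long elapsedMs(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length < 2) {
            System.out.println("usage: java ConnectProbe <host> <port> [attempts] [pauseMs]");
            return;
        }
        int attempts = args.length > 2 ? Integer.parseInt(args[2]) : 100;
        long pauseMs = args.length > 3 ? Long.parseLong(args[3]) : 0;
        // 5000 ms connect timeout is an arbitrary choice for this sketch
        int failures = probe(args[0], Integer.parseInt(args[1]), attempts, 5000, pauseMs);
        System.out.println(failures + " of " + attempts + " attempts failed");
    }
}
```

Running it as `java ConnectProbe puppygames.net 25000 100 0` and again with a pause of `10000` would cover the "full pelt" and "1 every 10 seconds" cases, and swapping the host or port covers the other comparisons.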
Cas
Just the default socket factory.
Also, your “server” code runs fine on JGO, and exhibits exactly the same behaviour (i.e. failure) when run on puppygames.net, and it does no socket configuration.
Cas
[quote=“princec,post:20,topic:51918”]
Your ISP routes (and mangles?) your traffic.
[quote=“Riven”]
This is possible and one of the few tentative options left… but that would mean that it’s only doing this between me and puppygames.net but not me and JGO.
Cas
Indeed. It’s not uncommon for ISPs to route badly. I get mails from people who can ping the IP address of java-gaming.org, but cannot connect to port 80. A few weeks later they can, and all is well. Then it starts all over again… sometimes they use proxies to get in, to work around their crappy ISPs.
Here’s the tracert:
Tracing route to puppygames.net [184.106.147.224]
over a maximum of 30 hops:
1 5 ms 2 ms 2 ms srp527w [192.168.15.1]
2 * * * Request timed out.
3 30 ms 30 ms 28 ms lo0-central10.pcl-ag03.plus.net [195.166.128.184]
4 26 ms 29 ms 27 ms link-a-central10.pcl-gw01.plus.net [212.159.2.168]
5 26 ms 29 ms 29 ms xe-10-2-0.pcl-cr01.plus.net [212.159.0.200]
6 29 ms 28 ms 30 ms xe-11-2-0.edge3.London2.Level3.net [212.187.201.213]
7 124 ms 127 ms 136 ms ae-210-3610.edge1.Chicago2.Level3.net [4.69.158.229]
8 125 ms 123 ms 124 ms ae-210-3610.edge1.Chicago2.Level3.net [4.69.158.229]
9 125 ms 123 ms 123 ms 4.71.248.54
10 * * * Request timed out.
11 124 ms 124 ms 123 ms czi1-tunnel4.ord1.rackspace.net [50.56.6.163]
12 127 ms 127 ms 126 ms core1-CoreB.ord1.rackspace.net [184.106.126.129]
13 124 ms 124 ms 124 ms aggr301a-3-core1.ord1.rackspace.net [173.203.0.177]
14 126 ms 123 ms 123 ms 184-106-147-224.static.cloud-ips.com [184.106.147.224]
Not sure why I’m getting those timeouts.
(For comparison, JGO:)
1 4 ms 3 ms 5 ms srp527w [192.168.15.1]
2 * * * Request timed out.
3 70 ms 36 ms 34 ms lo0-central10.pcl-ag03.plus.net [195.166.128.184]
4 29 ms 34 ms 28 ms link-b-central10.pcl-gw02.plus.net [212.159.2.170]
5 26 ms 30 ms 28 ms xe-10-2-0.pcl-cr02.plus.net [212.159.0.202]
6 26 ms 31 ms 32 ms ae1.ptw-cr02.plus.net [195.166.129.2]
7 * * * Request timed out.
8 30 ms 29 ms 29 ms 217.20.44.193
9 31 ms 29 ms 29 ms 212.111.33.234
10 27 ms 29 ms 29 ms li732-171.members.linode.com [85.159.215.171]
Cas
1 <1 ms <1 ms <1 ms 192.168.1.1
2 20 ms 20 ms 28 ms ............ ORLY!
3 25 ms 25 ms 25 ms ............ ORLY!
4 25 ms 25 ms 25 ms ae3.cr1-asd8.nl.euro.net [194.134.161.215]
5 34 ms 26 ms 26 ms ae0.br1-asd8.nl.euro.net [194.134.161.171]
6 26 ms 26 ms 26 ms er1.ams1.nl.above.net [80.249.208.122]
7 26 ms 27 ms 26 ms ae8.cr1.ams5.nl.above.net [64.125.30.205]
8 112 ms 112 ms 129 ms xe-0-2-0.cr2.lga5.us.above.net [64.125.27.185]
9 129 ms 139 ms 139 ms ae6.cr2.ord2.us.above.net [64.125.24.30]
10 123 ms 124 ms 124 ms ae10.mpr1.ord11.us.above.net [64.125.24.110]
11 123 ms 124 ms 124 ms ae4.mpr1.ord5.us.above.net [64.125.24.94]
12 125 ms 125 ms 124 ms 208.185.125.6.IPYX-076520-ZYO.above.net [208.185.125.6]
13 124 ms 134 ms 124 ms 10.25.0.65
14 127 ms 127 ms 127 ms czi1-tunnel4.ord1.rackspace.net [50.56.6.163]
15 125 ms 124 ms 125 ms core1-CoreB.ord1.rackspace.net [184.106.126.129]
16 124 ms 124 ms 124 ms aggr301a-3-core1.ord1.rackspace.net [173.203.0.177]
17 127 ms 128 ms 127 ms 184-106-147-224.static.cloud-ips.com [184.106.147.224]
1 <1 ms <1 ms <1 ms 192.168.1.1
2 22 ms 20 ms 19 ms ............ ORLY!
3 26 ms 25 ms 31 ms ............ ORLY!
4 26 ms 25 ms 25 ms ae3.cr1-asd8.nl.euro.net [194.134.161.215]
5 27 ms 31 ms 25 ms ae0.br1-asd8.nl.euro.net [194.134.161.171]
6 26 ms 25 ms 26 ms er1.ams1.nl.above.net [80.249.208.122]
7 26 ms 26 ms 26 ms ae14.cr1.ams10.nl.above.net [64.125.21.77]
8 31 ms 42 ms 31 ms ae9.mpr3.lhr3.uk.above.net [64.125.28.242]
9 31 ms 30 ms 31 ms ae6.mpr2.lhr3.uk.above.net [64.125.21.22]
10 31 ms 31 ms 31 ms 94.31.35.186.t01461-01.above.net [94.31.35.186]
11 34 ms 32 ms 31 ms 212.111.33.234
12 39 ms 31 ms 32 ms li732-171.members.linode.com [85.159.215.171]
Right, so… the only difference I can see here is that I have to go via Level3.
Cas
So… once you’ve established a TCP connection… is it stable? If so, just make N connections on N threads, keep the first one that connects, and close the other N-1 sockets.
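A rough sketch of that N-connections idea, assuming plain blocking sockets (no SSL) and a fixed thread pool. `ConnectRace` and `connectAny` are hypothetical names, not anything from the actual client. Each thread tries to connect; the first one to succeed hands its socket back, and any later successes close themselves.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: race n connection attempts, keep exactly one socket.
class ConnectRace {

    static Socket connectAny(InetSocketAddress addr, int n, int timeoutMs)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        CompletionService<Socket> done = new ExecutorCompletionService<>(pool);
        AtomicBoolean claimed = new AtomicBoolean(false);

        Callable<Socket> attempt = () -> {
            Socket s = new Socket();
            try {
                s.connect(addr, timeoutMs);
            } catch (IOException e) {
                s.close();
                throw e;
            }
            if (claimed.compareAndSet(false, true)) {
                return s;          // first to connect wins
            }
            s.close();             // somebody else already won: close this one
            return null;
        };

        for (int i = 0; i < n; i++) {
            done.submit(attempt);
        }
        try {
            IOException last = null;
            for (int i = 0; i < n; i++) {
                try {
                    Socket s = done.take().get();
                    if (s != null) {
                        return s;  // the remaining attempts clean up after themselves
                    }
                } catch (ExecutionException e) {
                    last = new IOException("connect attempt failed: " + e.getCause(), e.getCause());
                }
            }
            throw last != null ? last : new IOException("no attempt succeeded");
        } finally {
            pool.shutdown();       // lets in-flight attempts finish, then frees the threads
        }
    }
}
```

The cost is N times the connection attempts hitting the server, so this only makes sense for small N and only if connections really are stable once established.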
I haven’t got as far as testing the stability of the connections yet, but if you remember the protocol we devised, it only transmits a few bytes, reads a small response, and then shuts down, in order to handle thousands of “simultaneous” clients, so stability isn’t really an issue.
I can of course work around it by simply retrying until I get a connection - which is actually what I will really do - but what is bugging me is that it fails at all at this stage, most unexpectedly. It doesn’t bode well for stability. But if it’s genuinely just a crazy quirk of my route from home to the server, there’s nothing I’ll be able to do about it anyway and continually retrying will “patch” over the deficiency. It just sucks to not know why it’s failing and this sort of random crap is exactly why network programming is so pointlessly difficult :emo:
Cas
A (few?) months ago you said you rewrote everything to SSL, and as short-lived connections are truly not a good idea with SSL, given the incredible overhead of the handshake, I presumed you rewrote the protocol to persistent connections.
Anyway, network I/O is hard, and I should know: I make the ‘big’ bucks in this general area. If your low level code looks clean, you’re doing it wrong. Put those (self-adjusting) retry-loops behind abstraction layers and you’d be relatively fine.
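As a sketch of what such a retry loop behind an abstraction layer might look like: the caller just asks for a connected socket, and the retrying is hidden. "Self-adjusting" is interpreted here as exponential backoff with a cap, which is an assumption, as are the class name `RetryingConnector` and all of its parameters.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical wrapper: hides the retry loop from the calling code.
class RetryingConnector {
    private final String host;
    private final int port;
    private final int connectTimeoutMs;
    private final int maxAttempts;
    private final long initialDelayMs;
    private final long maxDelayMs;

    RetryingConnector(String host, int port, int connectTimeoutMs,
                      int maxAttempts, long initialDelayMs, long maxDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.host = host;
        this.port = port;
        this.connectTimeoutMs = connectTimeoutMs;
        this.maxAttempts = maxAttempts;
        this.initialDelayMs = initialDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    /** Returns a connected socket, or throws once every attempt has failed. */
    Socket connect() throws IOException, InterruptedException {
        long delayMs = initialDelayMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Socket s = new Socket();
            try {
                // Resolve the host on every attempt in case DNS has changed
                s.connect(new InetSocketAddress(host, port), connectTimeoutMs);
                return s;
            } catch (IOException e) {
                last = e;
                closeQuietly(s);
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delayMs);
                // Back off: double the wait each time, up to the cap
                delayMs = Math.min(delayMs * 2, maxDelayMs);
            }
        }
        throw new IOException("gave up after " + maxAttempts + " attempts", last);
    }

    private static void closeQuietly(Socket s) {
        try {
            s.close();
        } catch (IOException ignored) {
            // nothing useful to do with a failure to close a failed socket
        }
    }
}
```

Calling code then shrinks to something like `new RetryingConnector("puppygames.net", 25000, 5000, 10, 100, 5000).connect()`, and the ugliness stays in one place.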