One of our (server-side) apps runs perfectly for anything up to a couple of weeks, and then will suddenly start consuming 100% of the CPU. The problem is completely intermittent. The first occurrence, a couple of months ago, was during peak usage; the next came a week or so later at the lowest point of usage during the day, another followed 2 days after that, and then the app ran fine for another 3 weeks.
After the latest occurrence, top shows 3 threads that are causing the problem, but as far as I know, there’s no way to map a PID to a Java thread. The stack dump produced by kill -3 shows me nothing obvious, although perhaps I don’t know what I should be looking for.
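One approach that may apply here, sketched under assumptions: on a HotSpot JVM running on Linux, each thread header in the kill -3 dump usually carries an `nid=0x...` field, which is the native thread ID in hexadecimal. If that holds for this JVM, converting the decimal ID that top reports to hex and searching the dump for it should point at the Java thread. The class name `NidLookup`, its methods, and the sample dump lines below are hypothetical and only illustrate the conversion and lookup; the real dump format may differ by JVM vendor and version.

```java
import java.util.ArrayList;
import java.util.List;

public class NidLookup {

    // Convert the decimal thread ID shown by top into the hex form
    // assumed to appear in the dump's nid= field.
    static String toNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    // Return every dump line containing exactly nid=<hex of tid>.
    static List<String> findThreads(String dump, long tid) {
        String needle = "nid=" + toNid(tid);
        List<String> hits = new ArrayList<>();
        for (String line : dump.split("\n")) {
            // Compare whole tokens so 0x3039 does not also match 0x30391.
            for (String token : line.trim().split("\\s+")) {
                if (token.equals(needle)) {
                    hits.add(line);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up thread headers in the style of a HotSpot thread dump.
        String dump =
              "\"worker-1\" prio=5 tid=0x0805c2a8 nid=0x3039 runnable\n"
            + "\"worker-2\" prio=5 tid=0x0805d1b0 nid=0x303a waiting on condition\n";

        long idFromTop = 12345; // hypothetical ID taken from top
        System.out.println(idFromTop + " -> nid=" + toNid(idFromTop));
        for (String line : findThreads(dump, idFromTop)) {
            System.out.println(line);
        }
    }
}
```

Taking two or three dumps a few seconds apart and checking whether the matched threads stay in the same stack frames is one way to tell a tight loop from ordinary work.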
The worst part about this problem is that I’ve been completely unable to reproduce it during testing. Frustrated hair pulling, ahoy!
Anyone have any thoughts/advice…?
Thx
J