Possibly, except for a (&%£ license incompatibility.
I’m not sure I could achieve quite what I want without the boilerplate that Janino provides anyway.
Ah, I see you like playing with fire!
The Janino ClassLoader already has the ability to provide a ProtectionDomain for classes compiled through it. That would appear to be the way to do this, but I’ve never tried.
Exactly. The timer issue is that you then need a second monitoring thread, which potentially brings in thread switching issues - in particular one thing I’m looking at doing at the moment is live coding audio DSP. Extra threads could be an issue there.
The bigger thing is what to do with a thread stuck in a loop. Call stop() on it? That would have to be very carefully thought out if anything’s shared. Would have thought that would be even worse for you - could have more serious effects on a server!
Hmm, interesting problem about the thread stopping. I guess Thread.interrupt won’t work in an infinite loop.
I’ve never used it but how about byte code injection? That seems to be a way of adding arbitrary code over the top of someone else’s code. Maybe you could inject some code that checks a flag at the start of every loop to see if the thread should die??
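As a rough illustration of the flag idea: Thread.interrupt() only sets a status flag, so a busy loop stops only if something in the loop looks at that flag. The sketch below writes the check by hand where an injector would place it; the class and method names are made up for the example.

```java
// Hypothetical demo class; the isInterrupted() test stands in for the
// check that bytecode injection would add at the start of every loop.
final class InterruptDemo {

    /** Returns true if the worker stopped within the timeout after interrupt(). */
    static boolean interruptStopsPollingLoop(long timeoutMillis) {
        Thread worker = new Thread(() -> {
            long spins = 0;
            // Without this test the loop would spin forever: interrupt()
            // does not forcibly stop a thread that never checks its status.
            while (!Thread.currentThread().isInterrupted()) {
                spins++;
            }
        });
        worker.setDaemon(true); // never keep the JVM alive for the demo
        worker.start();
        worker.interrupt();     // sets the flag; the loop notices on its next pass
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

Calling InterruptDemo.interruptStopsPollingLoop(2000) should report that the worker exited, which is the behaviour an injected check would give to a user-written loop.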
Another thing, why is it an issue for you if your user breaks his own program by making an infinite loop?
On preventing run-away scripts. Another option is to do code weaving and have the script itself bail if it is taking too long. If you’ve never goofed with compilers this might be too much of a time commitment. Luckily with janino (or the javac framework) you can manipulate the AST instead of using asm. The upside is that you could do other code inspection and weaving. Say disallowing most calls of ‘new’ or whatever else.
Humm…I’m not familiar with the ProtectionDomain class and glancing at the source it looks like it would take longer for me to figure out than just writing a classloader. If this notion isn’t clear…shout out.
Simply that it isn’t very user friendly! Praxis LIVE is a graphical patcher / dataflow environment where you can add fragments of code (Java, GLSL, etc.) to the processing graph at runtime. The idea is that you can incrementally change and back out code as it runs. Entering an infinite loop would completely stall the media pipeline, forcing a restart and losing some changes. If someone writes a while(true) loop without thinking, they get what they deserve, but as you implied, some infinite loops are more subtle.
I may consider a protected environment, probably as Roquen suggests using AST manipulation rather than byte-code manipulation, if I can get it to run without losing too much performance. It’s not a high-priority though.
Slowly is unfortunately the word though.
In the current sources I was playing with yesterday this is marked deprecated, and is also a no-op. JavaDoc comment says “Auxiliary classes never really worked… don’t use them.”
I suppose a small plug for Starsector (formerly Starfarer) would not be entirely inappropriate here, given its use of Janino in its scripting & extensive modding support.
I don’t know if Alex (the game’s creator) frequents these forums, but he could undoubtedly cast some valuable light upon the usability & security of using Janino in this scenario.
With serverside 3rd party code, you have to not only worry about cpu-cycle consuming runaway scripts. Memory consuming scripts are much more dangerous, because you basically can’t gracefully recover from code like this:
String s = "";
while(true) {
    s += " ";
}
as random, unrelated, critical code will start throwing OutOfMemoryErrors - it will very likely pull your service down when the GC panics, and maybe even the entire server, due to excessive swapping.
With ASM, you can rewrite bytecode to intercept every new and anewarray instruction, and manage the allocation count your script is allowed to reach. It’s not easy to make this watertight, because almost all the JRE classes expose methods that do allocations behind the scenes (like StringBuilder.append, as per the above example).
I’d propose the 3rd party code to run in a separate, bolted down JVM, which can be nuked from orbit when it misbehaves. Not very practical, but what ya gonna do. With a bit of trickery, you could use MappedByteBuffers to share a read-only view of your business objects. (if they are backed by buffers - where are structs when you need 'em)
@Riven’s comment: This is related to disallowing almost all ‘new’ invocations in user scripts (the almost part is to allow things like boxing). Have all instances come from SDK calls. Disallow StringBuilder and StringBuffer in the script classloader. In fact disallow pretty much every class that isn’t from the SDK (the SDK’s classloader can handle arbitrary classes) and required for basic operation.
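A minimal sketch of such a restricting classloader, assuming an exact-name allow list is enough. AllowListClassLoader is a made-up name, and a real version would also need to define the script classes themselves and think about reflection.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical script classloader: only names on the allow list may be
// resolved through it. The SDK's own classes are loaded by the parent
// loader, so they can still use whatever they like internally.
final class AllowListClassLoader extends ClassLoader {

    private final Set<String> allowed;

    AllowListClassLoader(ClassLoader parent, Set<String> allowed) {
        super(parent);
        this.allowed = new HashSet<>(allowed);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        // Script classes need at least java.lang.Object on the list.
        if (!allowed.contains(name)) {
            throw new ClassNotFoundException("Not permitted in scripts: " + name);
        }
        return super.loadClass(name, resolve);
    }
}
```

With an allow list of java.lang.Object and java.lang.String, a request for java.lang.StringBuilder through this loader fails with ClassNotFoundException, while String resolves to the normal class.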
On code weaving: I was initially thinking of weaving in a method call (which checks for timeout and tosses an exception if needed) at the entry of all user-defined methods (at least to start with…you can get clever later) and at the top of loops. But if you’re inspecting for user-defined loops…you could just disallow them and only allow iterating on SDK-provided collections.
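To make the weaving idea concrete, here is roughly what the result could look like at source level. ScriptGuard, ScriptTimeoutException and WovenExample are invented names for the sketch; the weaver itself (the AST visitor) is not shown.

```java
// Hypothetical exception thrown when a script overruns its time budget.
final class ScriptTimeoutException extends RuntimeException {
    ScriptTimeoutException(String message) {
        super(message);
    }
}

// Hypothetical guard that woven code calls into.
final class ScriptGuard {
    // Per-thread {startNanos, budgetNanos}; by default effectively unlimited.
    private static final ThreadLocal<long[]> LIMIT =
            ThreadLocal.withInitial(() -> new long[] {System.nanoTime(), Long.MAX_VALUE});

    /** Called by the host just before it runs a script on this thread. */
    static void begin(long budgetNanos) {
        LIMIT.set(new long[] {System.nanoTime(), budgetNanos});
    }

    /** The call that would be woven in at method entry and loop tops. */
    static void check() {
        long[] limit = LIMIT.get();
        if (System.nanoTime() - limit[0] > limit[1]) {
            throw new ScriptTimeoutException("script exceeded its time budget");
        }
    }
}

final class WovenExample {
    // The user wrote: while (true) { counter++; }
    static long runAway() {
        ScriptGuard.check();      // woven in at method entry
        long counter = 0;
        while (true) {
            ScriptGuard.check();  // woven in at the top of the loop
            counter++;
        }
    }
}
```

The host would call ScriptGuard.begin(...) with the allowed budget and catch ScriptTimeoutException around the script invocation. No monitoring thread is needed, at the price of a clock read per loop iteration.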
That still doesn’t quite cover: [icode]while(true) list.add(null);[/icode]
You could create a callback prior to every new, newarray, anewarray, multianewarray, invoke*, goto, goto_w, if*, if_*, jsr, ret and athrow though. That way you can be reasonably sure you can intercept a runaway script, assuming that it can only create a limited amount of instances as you limit the number of instructions that can be executed, without being at the mercy of the OS thread scheduler.
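A sketch of what those callbacks might call into, assuming one budget object per script run. ScriptBudget and BudgetExceededException are hypothetical names, and how the rewritten bytecode reaches the budget (a static field, a thread-local) is left out.

```java
// Hypothetical exception raised once a script has used up its budget.
final class BudgetExceededException extends RuntimeException {
    BudgetExceededException(String message) {
        super(message);
    }
}

// Hypothetical per-run budget. The rewritten bytecode would call onBranch()
// before each jump/invoke/athrow and onAllocate() before each
// new/newarray/anewarray/multianewarray.
final class ScriptBudget {
    private long branchesLeft;
    private long allocationsLeft;

    ScriptBudget(long maxBranches, long maxAllocations) {
        this.branchesLeft = maxBranches;
        this.allocationsLeft = maxAllocations;
    }

    void onBranch() {
        if (--branchesLeft < 0) {
            throw new BudgetExceededException("instruction budget exhausted");
        }
    }

    void onAllocate() {
        if (--allocationsLeft < 0) {
            throw new BudgetExceededException("allocation budget exhausted");
        }
    }
}
```

Because the limit is a count rather than a wall-clock time, the cut-off does not depend on how the OS schedules the thread.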
The goal of scripting is to allow building of behaviors and modification of game state data and not general computing. I think what I’m suggesting has a fair amount of merit and is more flexible than what you’ll see in most commercial games that provide an end-user scripting language. You’re letting the janino (or javac) folks do the heavy lifting of converting source to AST and lowering to bytecodes. The JVM folks are taking care of converting bytecodes to native. You only have to write a custom classloader which only exposes permitted classes to the scripter, and an AST visitor to handle any weaving and subsetting of java. This should be fairly easy to keep up-to-date with any changes the janino/javac folks make. On loops, it seems a reasonable thing to disallow them, esp. if the scripts are running server side and the SDK provides some iterators such as [icode]entitiesWithinRadius(…)[/icode].
If you have these basics locked-down and working rock solid you could get fancy and add in work-arounds for some of these limitations by providing SDK calls.
I downloaded eclipse and the AST plugin,
followed the instructions to get AST working in my program independently of eclipse: http://www.programcreek.com/2011/01/a-complete-standalone-example-of-astparser/
Managed to analyse some code using the org.eclipse.jdt.core.dom.ASTVisitor class. There are heaps of ‘visit…’ methods which can be used to detect things, the javadocs are here:
If I analyse this simple file:
package eclipseast;
public class TestFile {
    public TestFile(){
        // a constructor
    }
    public boolean someMethod(){
        TestFile testFile = new TestFile();
        while (true){
            Object obj = new Object();
            break;
        }
        String str = "hi";
        return str.startsWith("hi there");
    }
}
It produces this output:
Line 1: ASTNode of type CompilationUnit
Line 1: ASTNode of type PackageDeclaration
Line 1: ASTNode of type SimpleName named 'eclipseast'
Line 2: ASTNode of type TypeDeclaration
Line 2: ASTNode of type Modifier
Line 2: ASTNode of type SimpleName named 'TestFile'
Line 3: ASTNode of type MethodDeclaration
Line 3: ASTNode of type Modifier
Line 3: ASTNode of type SimpleName named 'TestFile'
Line 3: ASTNode of type Block
Line 6: ASTNode of type MethodDeclaration
Line 6: ASTNode of type Modifier
Line 6: ASTNode of type PrimitiveType named 'boolean'
Line 6: ASTNode of type SimpleName named 'someMethod'
Line 6: ASTNode of type Block
Line 7: ASTNode of type VariableDeclarationStatement
Line 7: ASTNode of type SimpleType named 'TestFile'
Line 7: ASTNode of type SimpleName named 'TestFile'
Line 7: ASTNode of type VariableDeclarationFragment named 'testFile'
Line 7: ASTNode of type SimpleName named 'testFile'
Line 7: ASTNode of type ClassInstanceCreation named 'TestFile'
Line 7: ASTNode of type SimpleType named 'TestFile'
Line 7: ASTNode of type SimpleName named 'TestFile'
Line 8: ASTNode of type WhileStatement
Line 8: ASTNode of type BooleanLiteral
Line 8: ASTNode of type Block
Line 9: ASTNode of type VariableDeclarationStatement
Line 9: ASTNode of type SimpleType named 'Object'
Line 9: ASTNode of type SimpleName named 'Object'
Line 9: ASTNode of type VariableDeclarationFragment named 'obj'
Line 9: ASTNode of type SimpleName named 'obj'
Line 9: ASTNode of type ClassInstanceCreation named 'Object'
Line 9: ASTNode of type SimpleType named 'Object'
Line 9: ASTNode of type SimpleName named 'Object'
Line 10: ASTNode of type BreakStatement
Line 12: ASTNode of type VariableDeclarationStatement
Line 12: ASTNode of type SimpleType named 'String'
Line 12: ASTNode of type SimpleName named 'String'
Line 12: ASTNode of type VariableDeclarationFragment named 'str'
Line 12: ASTNode of type SimpleName named 'str'
Line 12: ASTNode of type StringLiteral
Line 13: ASTNode of type ReturnStatement
Line 13: ASTNode of type MethodInvocation named 'startsWith'
Line 13: ASTNode of type SimpleName named 'str'
Line 13: ASTNode of type SimpleName named 'startsWith'
Line 13: ASTNode of type StringLiteral
I think it’s feasible for me to weave time-checking methods into the start of all loop iterations (WhileStatement, ForStatement, …), method calls (MethodInvocation) and others which allows me to terminate the 3rd party script in case it’s carrying on for too long.
I could restrict access to all methods and classes except some that will be allowed, for example anything to do with game state or the basic classes like String, ArrayList and others. This can be done by querying the MethodInvocation or StringLiteral or other relevant ASTNode about what class of object is being created/having its method called.
But about controlling the problem of memory over-allocation, I don’t think that I can easily restrict it by looking at the source code using this AST method. I mean, how can I know that String[] stringArray = new String[(int) Math.pow(100000, 10000)]; is going to make a humongous String array just by looking at the source? I’m wondering if you guys know another method of monitoring memory allocation, perhaps using ASM and bytecode?
With ASM you can alter the bytecode to first DUP the value at the top of the stack and pass it to another method to verify the array length is ‘sane’ prior to executing the NEWARRAY/ANEWARRAY opcode. That way your sanity checks happen at runtime - as that’s the only time malicious input can be detected.
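For illustration, this is the kind of helper the injected call could target, together with the source-level equivalent of the rewritten allocation. ArrayGuard, its limit and the method names are assumptions for the sketch; the ASM code that inserts the DUP and the static call is not shown.

```java
// Hypothetical target of the injected call.
final class ArrayGuard {
    // Assumed limit; pick whatever counts as 'sane' for your scripts.
    static final int MAX_LENGTH = 1 << 16;

    /** Receives the DUP'ed length just before NEWARRAY/ANEWARRAY runs. */
    static void checkLength(int length) {
        if (length > MAX_LENGTH) {
            throw new IllegalStateException(
                    "array length " + length + " exceeds limit " + MAX_LENGTH);
        }
        // Negative lengths are left to the JVM (NegativeArraySizeException).
    }

    // Source-level equivalent of the rewritten bytecode for: new String[n]
    static String[] guardedNewStringArray(int n) {
        checkLength(n);       // DUP + INVOKESTATIC ArrayGuard.checkLength(I)V
        return new String[n]; // the original ANEWARRAY
    }
}
```

A length computed as (int) Math.pow(100000, 10000) saturates to Integer.MAX_VALUE and is rejected before any memory is requested.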
As for Roquen’s Iterable strategy, that provides a loophole where you can nest enhanced-for loops into oblivion as a means to create a near-infinite loop, which can be used to stage the aforementioned attacks, while all by themselves creating tonnes of Iterator instances that are strongly reachable.
Bytecode manipulation through ASM is just as easy as messing about with AST, whilst being way more powerful and effective, and less restrictive to the scripter. Let them have their custom loops.
Seems to be a minor bit of miscommunication. By AST I meant whatever AST/Visitor functionality the chosen compiler provides and not any specific library…so janino has this in it, so does javac and (yes) eclipse.
I’m suggesting that the user not be allowed to allocate memory for themselves…ever. They can only get an instance of something from code you provide. That’s the only way you can ensure a bound on allocations. Related to that is why I’m suggesting you not give them access to any standard java classes, beyond those pretty much required: like primitive wrappers and String. Every class you allow them direct access to needs to be inspected for any potential holes and you need to redo that work any time you upgrade the JVM. Recall what I said above…you don’t need to provide a general computation framework…just the things they need to be able to define behaviors and modify accessible game states.
It doesn’t matter what the script does…if it takes too long it’s forced to stop. No code analysis is needed.
Good point. Back to weaving in checks at the top of loops. The reason I thought it would be good to avoid this is so the server could run fewer serializing instructions (reading the counter).
In the cases we’re currently talking about the choice doesn’t really matter too much.