Casting Objects

Are Object casts (to a subclass) expensive in terms of processor time?

No. On my machine (P4 3.06GHz) a cast from Object to Integer appears to take about one clock cycle. It is difficult to measure this reliably, which is really more evidence that you should not worry about it.

[quote]a cast from Object to Integer appears to take about one clock cycle.
[/quote]
One clock cycle of what clock is that? Certainly not the CPU’s clock I suppose…

Yes, the CPU clock cycle. In other words, the casts appear to cost on the order of 0.3 ns each.
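For reference, the arithmetic behind that figure can be sketched as below. The class and method names are made up for illustration; only the 3.06 GHz clock and the 0.3 ns estimate come from the posts above.

```java
public class CycleTime {
    // Length of one clock cycle in nanoseconds, for a clock speed given in GHz.
    static double cycleNanos(double clockGHz) {
        return 1.0 / clockGHz;
    }

    // Convert a measured per-operation time (ns) into an approximate cycle count.
    static double cycles(double nanosPerOp, double clockGHz) {
        return nanosPerOp * clockGHz;
    }

    public static void main(String[] args) {
        double ghz = 3.06; // the P4 clock speed quoted in this thread
        System.out.println("one cycle in ns: " + cycleNanos(ghz));
        System.out.println("cycles in 0.3 ns: " + cycles(0.3, ghz));
    }
}
```

So one cycle at 3.06 GHz is roughly a third of a nanosecond, and 0.3 ns is a little under one cycle.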

How can that be and how did you measure that?
The last time I checked, no single CPU instruction took as little as 1 clock cycle, so how can a Java type cast cost so little? Or maybe something has changed here (I haven’t done any asm in years)?

(BTW. I know it’s a bit off topic nitpicking as you’re right this casting is not costly at all :))

I think on modern processors with fancy pipeline architectures the THROUGHPUT can be one instruction per clock or more. The actual instruction takes several cycles… but because of the pipeline there are several instructions in progress at any one time and (at least) one instruction FINISHES on each clock cycle. This of course is the optimal case; many things can happen to stall the pipeline.

That said, I find it hard to believe that a cast can happen so quickly. Surely it takes several instructions to do the cast, given that a check needs to be made on the validity of the cast to decide whether or not to throw an exception. Plus possibly some pointer/table adjustment to gain access to the new methods that are available after the cast. One instruction just seems far too cheap. Although I recall reading that the cost of a cast has gone down from what it used to be with older VMs.

A cast can actually be free if you already know that the object extends the given class. If that is not the case, what is left is just a check whether the superclass of the object at a given depth is equal to a compile-time constant, something like

cmp [eax+depth_offset], constant
jne exception

With the branch defaulting to not-taken and eax pointing to the object’s class. A simple cmp and a non-taken jne can be resolved in a single clock as far as I remember.
You may need one more instruction to get the pointer to the object’s class, but that will be free if you call some method of the object afterwards, as you will have to resolve it anyway.

1 clock seems a reasonable estimate, with maybe one or two more in more complicated cases. Not much to worry about.

Of course, if we are talking about interface casts, the situation looks quite different…
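At the source level, the two cases described above (a cast the compiler already knows is safe, and a cast that needs a runtime check) look roughly like this. It is only a sketch; the class and method names are invented for the example.

```java
public class CastKinds {
    // Upcast: the compiler can prove it is safe, so no runtime check is needed.
    static Number widen(Integer i) {
        return i;
    }

    // Downcast: needs a runtime type check and throws ClassCastException on failure.
    static Integer narrow(Object o) {
        return (Integer) o;
    }

    // Returns true if the downcast to Integer is rejected at run time.
    static boolean narrowFails(Object o) {
        try {
            narrow(o);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object boxed = Integer.valueOf(42);
        System.out.println(widen(42));         // upcast, nothing to check
        System.out.println(narrow(boxed));     // checked downcast that succeeds
        System.out.println(narrowFails("42")); // checked downcast that fails
    }
}
```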

Ah, of course, the pipeline.
Never had to deal with that on a 68k ;D
I read a few articles the other day about how to prevent stalling the pipeline, but the resulting code was so utterly unreadable, and it all complicated things so much, that I didn’t bother to get into asm on recent CPUs again :stuck_out_tongue:

It doesn’t even matter whether the branch is taken or not these days provided the processor correctly predicted which route would be followed. The branch prediction on casting should be good if the objects being cast are all the same type.

[quote]…what is left is just a check whether the superclass of the object at a given depth is equal to a compile-time constant, something like

cmp [eax+depth_offset], constant
jne exception
[/quote]
Cool. Now that I think about it, I guess it really can be that simple! You can’t get much cheaper than that… except for the ‘free’ case where the cast is known to be safe of course.
Hmm… Is depth_offset a constant though? Would it not depend on the actual type of the object? I guess it will always be a fixed distance from the root Object… but the size of such a lookup table would depend on the actual object type, wouldn’t it?

class A extends Object {}
class B extends A {}
class C extends B {}

Checking whether a reference of type A is a valid B or C would look into the table, but if the object is really a B and we try to cast it to a C, do we go one step too far into the table?

So, it’s essentially super-cheap.

I was asking because I’ll be doing quite a lot of it at times in some portions of my main game loop, and I was wondering whether or not a no-cast solution would yield performance benefits worth chasing. Probably not, in this case.

Of course, my game code is still in the early stages, so I haven’t got to the optimization step yet, but I do appreciate the comments for future reference.

I don’t think there is much you can do about casting in your main loop if you want to use any sort of Container, well apart from reverting to arrays everywhere.

This should be sorted as of generics in Java 1.5, though…

Kev
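To make the container-versus-array trade-off concrete, here is a small sketch. Sprite is a hypothetical game object invented for this example, and the raw List stands in for pre-1.5 container usage.

```java
import java.util.ArrayList;
import java.util.List;

public class ContainerVsArray {
    // Hypothetical game object, used only for this illustration.
    static class Sprite {
        final int x;
        Sprite(int x) { this.x = x; }
    }

    // Pre-generics style: the list hands back Object, so every element needs a cast.
    @SuppressWarnings("rawtypes")
    static int sumFromList(List sprites) {
        int sum = 0;
        for (int i = 0; i < sprites.size(); i++) {
            Sprite s = (Sprite) sprites.get(i);
            sum += s.x;
        }
        return sum;
    }

    // Array style: the element type is known statically, so no cast appears.
    static int sumFromArray(Sprite[] sprites) {
        int sum = 0;
        for (int i = 0; i < sprites.length; i++) {
            sum += sprites[i].x;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Object> list = new ArrayList<Object>();
        Sprite[] array = new Sprite[3];
        for (int i = 0; i < 3; i++) {
            Sprite s = new Sprite(i + 1);
            list.add(s);
            array[i] = s;
        }
        System.out.println(sumFromList(list));
        System.out.println(sumFromArray(array));
    }
}
```

Both methods compute the same sum; the only difference is where the type information lives.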

This is true (about 1.5). However, I’m writing a code library at the moment for use in my game (the part that will be called many times in the main loops of my projects), and I’d like this code to be around for quite a while.

So, I don’t think I’ll be worrying about the casts. But then again, do I really care about making the generics change to my library when 1.5 comes out? I don’t think so. Doing so would make it incompatible with any previous JDK.

Then again, what’s the point? Why should I take the time to make tedious changes to the library if the performance difference is, in this case, negligible?

You just need to make such a lookup table constant in size, even for classes which are not deep in the inheritance tree. This means umpteen wasted bytes per class, but IMHO it is an acceptable tradeoff for the speedup. A few years ago I did some profiling checks:
http://nwn-j3d.sourceforge.net/misc/stattable.html
Depth 5 seems to be a very good choice: as the depth is known at JIT compile time, for depth 6+ the JVM can just use the normal, iterate-through-superclasses check.
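A rough model of that fixed-size table, written in plain Java, might look like the following. Klass, DISPLAY_SIZE and isSubclassOf are names made up for this sketch; it illustrates the idea discussed here and is not a description of any particular VM.

```java
public class SuperclassDisplay {
    // Table size taken from the "depth 5" suggestion above; an assumption of this sketch.
    static final int DISPLAY_SIZE = 5;

    // Hypothetical stand-in for a VM's per-class metadata.
    static final class Klass {
        final String name;
        final Klass parent;
        final int depth; // the root class has depth 0
        final Klass[] display = new Klass[DISPLAY_SIZE]; // same size for every class

        Klass(String name, Klass parent) {
            this.name = name;
            this.parent = parent;
            this.depth = (parent == null) ? 0 : parent.depth + 1;
            // Copy the ancestors that fit, then add this class itself if it fits.
            if (parent != null) {
                System.arraycopy(parent.display, 0, display, 0, DISPLAY_SIZE);
            }
            if (depth < DISPLAY_SIZE) {
                display[depth] = this;
            }
        }

        // Models the check behind "(Target) obj", where this is obj's actual class.
        boolean isSubclassOf(Klass target) {
            if (target.depth < DISPLAY_SIZE) {
                // Fast path: one load and one compare. The slot index depends only
                // on the target class, which is known when the cast is compiled.
                return display[target.depth] == target;
            }
            // Slow path for deep targets: walk up the superclass chain.
            for (Klass k = this; k != null; k = k.parent) {
                if (k == target) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Klass object = new Klass("Object", null);
        Klass a = new Klass("A", object);
        Klass b = new Klass("B", a);
        Klass c = new Klass("C", b);
        System.out.println(b.isSubclassOf(a)); // B is an A
        System.out.println(b.isSubclassOf(c)); // B is not a C: that slot of B's table is empty
    }
}
```

Because every table has the same length, casting a B to a C never reads past the end of B's table; it just finds an empty slot where C would be.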

Agreed, it will be a lot of hassle…

But if generics are implemented like templates are in C++, it will only mean that you can’t compile your source on older JDKs; it won’t cause a problem running it, since the generic will just have generated an appropriate class… although I don’t know the details of how generics will be implemented.

As to why to move to generics: unless you’re writing type-safe wrappers for all your containers (which, tbh, I try to), you’ll be exposing an Object as the contained element in quite a few places. This is quite bad from a code robustness point of view. Moving to generics in this case will most likely make the code far more resilient to accidental coding problems…

Kev
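For what it’s worth, generics as they eventually shipped in Java 5 are implemented by type erasure rather than C++-style templates: no separate class is generated per type argument, and the compiler inserts the cast for you. The sketch below (class and method names invented for the example) shows both effects, so generics remove the cast from the source for type safety but not from the compiled code.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureCast {
    // True if both parameterizations share one runtime class (no per-type class is generated).
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    // Returns true if reading from the generic list throws ClassCastException,
    // which shows that a cast to Integer still happens at the call site.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean hiddenCastFails() {
        List<Integer> ints = new ArrayList<Integer>();
        List raw = ints;            // the same list, viewed without its type parameter
        raw.add("not an Integer");  // accepted at run time because of erasure
        try {
            int value = ints.get(0).intValue(); // compiler-inserted cast to Integer
            return value != value;  // never reached in practice
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(hiddenCastFails());
    }
}
```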

I also wondered how expensive a typecast is. The following code displays:
~ 200 ms WITHOUT typecast
~ 1000 ms WITH typecast

My interpretation is: the typecast takes four times longer to execute than method invocation + integer divide (at least in this case). I’m not sure if that’s the correct interpretation, though.



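public class Test {

	// Number of loop iterations; each iteration makes five calls.
	private static final int COUNT = 50000000;
	// Same runtime type, but object2 is only known as Object and needs a cast.
	private static Test object1 = new Test();
	private static Object object2 = new Test();

	public static void main(String[] args) {

		// Time the loop that calls the method directly.
		long startTime = System.currentTimeMillis();
		test();
		long diff = System.currentTimeMillis() - startTime;
		System.out.println("Without typecast: " + diff);

		// Time the loop that casts before every call.
		startTime = System.currentTimeMillis();
		test2();
		diff = System.currentTimeMillis() - startTime;
		System.out.println("With typecast: " + diff);
	}

	// Five direct calls per iteration, no cast.
	private static void test()
	{
		for (int i=0; i<COUNT; i++)
		{
			object1.testMethod();
			object1.testMethod();
			object1.testMethod();
			object1.testMethod();
			object1.testMethod();
		}
	}

	// Five calls per iteration, each preceded by a cast from Object to Test.
	private static void test2()
	{
		for (int i=0; i<COUNT; i++)
		{
			((Test) object2).testMethod();
			((Test) object2).testMethod();
			((Test) object2).testMethod();
			((Test) object2).testMethod();
			((Test) object2).testMethod();
		}
	}

	// Note: 10 / 3 is a compile-time constant, so javac folds it to 3
	// and no division is actually executed here.
	public void testMethod()
	{
		int i = 10 / 3;
	}
}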
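A somewhat more careful variant of the test above is sketched below, on the assumption that the main pitfalls are the lack of warm-up, the coarse millisecond timer, and a method body with no observable effect. CastBench, bump and the round counts are invented for this sketch. It is still a hand-rolled microbenchmark, so the JIT may hoist or remove the check entirely and the numbers should be read with suspicion.

```java
public class CastBench {
    static final int COUNT = 50_000_000;

    int counter;

    // Has a visible side effect and a used result, so the call cannot simply be discarded.
    int bump() {
        return ++counter;
    }

    // Calls through a reference that already has the right static type.
    static long runTyped(CastBench target, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += target.bump();
        }
        return sum;
    }

    // Calls through an Object reference, casting on every iteration.
    static long runCast(Object target, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += ((CastBench) target).bump();
        }
        return sum;
    }

    public static void main(String[] args) {
        CastBench typed = new CastBench();
        Object untyped = new CastBench();

        // Warm-up rounds so both loops are JIT-compiled before anything is timed.
        for (int round = 0; round < 5; round++) {
            runTyped(typed, COUNT / 10);
            runCast(untyped, COUNT / 10);
        }

        // Several timed rounds; look at the spread, not just one number.
        for (int round = 0; round < 3; round++) {
            long t0 = System.nanoTime();
            long a = runTyped(typed, COUNT);
            long t1 = System.nanoTime();
            long b = runCast(untyped, COUNT);
            long t2 = System.nanoTime();
            System.out.println("without cast: " + (t1 - t0) / 1_000_000 + " ms"
                    + ", with cast: " + (t2 - t1) / 1_000_000 + " ms"
                    + " (checksums " + a + ", " + b + ")");
        }
    }
}
```

Printing the checksums keeps the results live; a dedicated benchmarking harness would be the more trustworthy route if the numbers really matter.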

The point being that you should only use type casts where you have to? Not that surprising.

Kev

Doesn’t the server VM do some clever stuff to optimise casts away completely?

Cas :slight_smile:

Well, you never have to. You can always make some crazy not-very-OO construction (or at least replace collections with arrays). The point was to find out whether it’s worth it.

And my point was that whether you have to or not isn’t just dependent on performance.

Kev