Question about genericity in the class ManagedReference

karmaGfa · April 1, 2007, 12:04pm

Hello,

I still have 2 questions about the design of the SGS App API:

I would like to know why the genericity in the class ManagedReference was not designed like that of a Collection (i.e. ManagedReference<E extends ManagedObject>),
and why there is a need to specify a class in the functions get() and getForUpdate(). Was it just to be able to provide the wanted type as the return value, or does it help to request the object from the database?

Regards,
Vincent

Jeff · April 1, 2007, 9:29pm

On 1, it’s a really fine and subtle point of semantics. While it would seem the obvious thing to do (and something we did in the EA stack), it turns out to be subtly, arguably semantically wrong. If you want the whole argument in depth, you’ll need to get it from Seth or Tim.

On 2, it’s so the return type is cast to the correct type, yes. At that stage in the game you have more information, as the object has been retrieved, and the casting becomes more semantically correct.

Hoyle · April 1, 2007, 10:16pm

I was just about to ask a similar question . . .

I ended up writing a wrapper around ManagedReference that could track the type it was referencing. I didn’t like the fact that, because of cut-and-paste errors, I might try to get the wrong type out of a managed reference. The reference itself really didn’t seem to have a clue as to what it was referring to. Using this new class, I can specify its type and even bind it to the DataManager within the constructor.


import java.io.Serializable;
import java.math.BigInteger;

import com.sun.sgs.app.AppContext;
import com.sun.sgs.app.ManagedObject;
import com.sun.sgs.app.ManagedReference;


public class GenericManagedReference<X extends ManagedObject> implements ManagedReference, Serializable {

	private final static long serialVersionUID = 1L;

	private ManagedReference ref;
	
	private Class<X> clazz;
	
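	@SuppressWarnings("unchecked")
	public GenericManagedReference(final X obj) {
		ref = AppContext.getDataManager().createReference(obj);
		// getClass() returns Class<? extends ManagedObject>, so this cast is unchecked
		clazz = (Class<X>)obj.getClass();
	}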

	public <T> T get(Class<T> arg0) {
		return ref.get(arg0);
	}

	public <T> T getForUpdate(Class<T> arg0) {
		return ref.getForUpdate(arg0);
	}
	
	public X get() {
		return ref.get(clazz);
	}
	
	public X getForUpdate() {
		return ref.getForUpdate(clazz);
	}	

	public BigInteger getId() {
		return ref.getId();
	}
}

Jeff · April 2, 2007, 12:11am

Well, it doesn’t really. All managed references at the end of the day are simply an index into the database.

It’s sort of the ManagedObject equivalent of the base Object reference.

Hoyle · April 2, 2007, 3:53am

Yeah, I understand what you’re saying. Although in my code I’d prefer to have references that are specific to a particular class instead of making everything an Object. I’ve used this GenericManagedReference everywhere in my code and it has helped catch a lot of would-be runtime errors.

I think



GenericManagedReference<Foo> fooRef = new GenericManagedReference<Foo>(new Foo());

// ... do some stuff here ...

Foo foo = fooRef.get();

// Note that Bar bar = fooRef.get() will not even compile.

is a lot safer and easier to read than



ManagedReference unknownReference = AppContext.getDataManager().createReference(new Foo());

// ... do some stuff here ...

// Providing the class twice seems odd?  
// Seems that 
//   <X> X ManagedReference.get(Class<X> value)
// Could just be
//   <X> X ManagedReference.get(); 
Foo foo = unknownReference.get(Foo.class);

especially considering that I’ve been known to do this a bit . . .



ManagedReference unknownReference = AppContext.getDataManager().createReference(new Foo());

// ... do some stuff here ...

// Since I don't know what this is a reference to, a quick typo will give me a runtime error
Bar bar = unknownReference.get(Bar.class);

karmaGfa · April 2, 2007, 1:47pm

From my point of view, to do

Foo foo = ref.get(Foo.class);

instead of

Foo foo = (Foo) ref.get();

is not particularly helpful. The result is the same, so it is as if there was no genericity at all in the class.

If there is no really useful genericity in the class, I would prefer that there is at least the one I proposed, since it leaves the user the choice of whether to use it or not.

I am really curious and would like to understand where it is semantically wrong to use the design of the collections here.

Deep respects,
Vincent

Jeff · April 2, 2007, 2:11pm

For that deep answer we need to get Tim or Seth to weigh in.

Edit: Just a note. Personally, I find this


ref.get(Foo.class).bar()

A lot neater and easier to read than:


((Foo)ref.get()).bar()

It’s an idiom that appears a lot in my own SGS app code.

YMMV

Jeff · April 2, 2007, 3:00pm

Okay,

I had a talk with Tim. It was actually a fairly late decision in the API re-design and I think I understand it now, but it’s a pretty subtle point, so I hope I can explain it clearly.

The issue is this. ManagedReferences are truly typeless, as I mentioned above, until the ManagedObject they reference gets instantiated by the system. In other words, at run-time.

Generics are a 100% compile-time construct. They check type at compile time. For this reason, in a sense, a generically typed ManagedReference would be “lying to you.” Although it might pass the compile-time check, there is no assurance it would not actually fail at run-time, when the ManagedObject returned to you from the get() proved to be of some other type.

So it gives no assurances, but causes extra confusion because it looks like it does.

That’s basically the reason.
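
The “lying” described above can be sketched in a few lines. This is only an illustration under stated assumptions: `TypedRef`, `Foo`, `Bar` and the map-based store are hypothetical stand-ins, not SGS classes. It shows that a reference declared as `TypedRef<Foo>` will happily hand back a `Bar`, because the type parameter is erased and the cast inside `get()` is never checked.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: TypedRef, Foo, Bar and the map-based store are
// hypothetical stand-ins, not SGS classes.
class ErasureDemo {
    static class Foo {}
    static class Bar {}

    /** A reference that is really just an index, plus a compile-time-only type parameter. */
    static final class TypedRef<T> {
        private final Map<Integer, Object> store;
        private final int id;

        TypedRef(Map<Integer, Object> store, int id) {
            this.store = store;
            this.id = id;
        }

        @SuppressWarnings("unchecked")
        T get() {
            // T is erased, so this cast is unchecked and cannot fail here.
            return (T) store.get(id);
        }
    }

    /** True if a TypedRef<Foo> hands back a Bar without any exception. */
    static boolean typedRefCanLie() {
        Map<Integer, Object> store = new HashMap<Integer, Object>();
        store.put(1, new Bar());                          // the store really holds a Bar
        TypedRef<Foo> ref = new TypedRef<Foo>(store, 1);  // the reference claims Foo
        Object value = ref.get();                         // compiles and runs without complaint
        return value instanceof Bar;
    }

    /** True if the generic parameter leaves no trace on the runtime class. */
    static boolean sameRuntimeClass() {
        Map<Integer, Object> store = new HashMap<Integer, Object>();
        TypedRef<Foo> fooRef = new TypedRef<Foo>(store, 1);
        TypedRef<Bar> barRef = new TypedRef<Bar>(store, 2);
        return fooRef.getClass() == barRef.getClass();
    }
}
```

Both methods return true: the compiler accepts the mismatch, and at run time there is nothing left of `<Foo>` to check against.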

karmaGfa · April 2, 2007, 4:58pm

… but … that’s the same with the ArrayList class, and people are still happy to use generics on them.

Isn’t there a way to guarantee (at least for the user of SGS App) that a ManagedReference really references the type of data specified in its generic type? For example, by adding the genericity to the following function in the interface DataManager:

<T extends ManagedObject> ManagedReference<T> createReference(T object);

… and it seems that this function is the only one that creates a ManagedReference.

Now, how could a ManagedReference reference an object whose type is incompatible with its generic type?

I also want to point out that, with or without genericity in the class, the cast exceptions happen in the same places and the result for the user is the same.

floersh · April 2, 2007, 5:20pm

That’s simple. In the multi-stack, objects can be created on various servers potentially running various versions of your app (or at least at upgrade time)… And thus one set of code could put one type of object in the store, and then another node of the cluster could retrieve that object using, say, your new code, and the type cast or the code would fail… There is no guarantee in a distributed system, and I think they wanted to imply that in their API…

karmaGfa · April 2, 2007, 5:32pm

What you said is true.

But, as I said, with or without generics on the class, the class cast exception will come at about the same place (inside the get function for the current version, and at the usage, pretty shortly after the get function, for the version that has the generic on the class).

Jeff · April 2, 2007, 5:46pm

No, it really isn’t.

The array list has to be created with a generic type. That then gates what can go into that list. The result is you are assured (barring certain VERY grody and deliberate reflection tricks) of what is in that list at run time.

By contrast, the Object Store always contains a mix of types, and all a ManagedReference contains is an index number. Furthermore, the ObjectStore itself has no knowledge, even at run-time, of the types of the objects within it until they are instantiated. (I realize that’s an implementation detail, but that’s how it’s implemented today.)

As I said, it’s a very subtle point.

karmaGfa wrote: “Isn’t there a way to guarantee (at least for the user of SGS App) that a ManagedReference really references the type of data specified in its generic type? For example, by adding the genericity to the following function in the interface DataManager:
<T extends ManagedObject> ManagedReference<T> createReference(T object);
… and it seems that this function is the only one that creates a ManagedReference.”

Well, that’s a detail of the current API and not necessarily true going forward. (In fact, there is already a break in this assumption in the 9.1 release, which adds a way to get the ID and recreate a ManagedReference from that.)

In the case of your ArrayList, the proper operation is assured by the syntax of the language. Here you are depending on an undocumented feature of the API (namely, that createReference() is the only way to get a ManagedReference).
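
A rough sketch of why an ID-based lookup breaks the typed-reference assumption. Everything here is hypothetical (`ToyStore`, `ToyRef` and `refForId` are made-up names, not the SGS DataManager or the 9.1 API): once a reference can be rebuilt from a bare ID, there is no `T` to carry over, so the only place left to check the type is a `get(Class)` call.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: ToyStore and ToyRef are hypothetical, not the SGS
// DataManager or ManagedReference.
class IdLookupDemo {
    static class Foo {}
    static class Bar {}

    /** A store that maps IDs to objects of any type. */
    static final class ToyStore {
        private final Map<BigInteger, Object> objects = new HashMap<BigInteger, Object>();
        private BigInteger nextId = BigInteger.ZERO;

        BigInteger put(Object obj) {
            BigInteger id = nextId;
            nextId = nextId.add(BigInteger.ONE);
            objects.put(id, obj);
            return id;
        }

        /** Rebuilds a reference from a bare ID; no type information is available here. */
        ToyRef refForId(BigInteger id) {
            return new ToyRef(this, id);
        }
    }

    /** An untyped reference: an ID plus a type check at retrieval time. */
    static final class ToyRef {
        private final ToyStore store;
        private final BigInteger id;

        ToyRef(ToyStore store, BigInteger id) {
            this.store = store;
            this.id = id;
        }

        <T> T get(Class<T> type) {
            // Class.cast is a real run-time check; it throws ClassCastException on mismatch.
            return type.cast(store.objects.get(id));
        }
    }

    /** True if asking for the wrong type fails inside get() itself. */
    static boolean wrongTypeFailsAtGet() {
        ToyStore store = new ToyStore();
        BigInteger id = store.put(new Foo());
        ToyRef ref = store.refForId(id);   // the ID alone says nothing about Foo
        ref.get(Foo.class);                // succeeds
        try {
            ref.get(Bar.class);
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```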

Yes, but the assumptions of the user are not. When it’s not typed, the exception for a mismatched type is expected; when it is typed, it leaves you scratching your head asking “how did that happen?”

sethp · April 2, 2007, 7:32pm

Right, and that’s part of the problem here. For folks following along at home: if createReference() takes an Object, and calling get() on the reference requires a type as a parameter, then we can explicitly cast the object and throw an exception if the type is wrong. If, on the other hand, createReference() takes a T, and calling get() simply returns something of type T, then there is no cast until you actually invoke the referenced object. This means that you may not see the exception where you expect to.
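
The difference in where the exception shows up can be sketched as follows. Both reference classes are hypothetical stand-ins (`CheckedRef` for the class-on-get() design, `UncheckedRef<T>` for the typed design), not SGS code. With the checked design the mismatch is reported inside `get()`; with the typed design the value travels through a `List<Foo>` untouched and only fails when it is first used as a `Foo`.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: CheckedRef and UncheckedRef are hypothetical
// stand-ins for the two designs discussed, not SGS classes.
class FailureSiteDemo {
    static class Foo { String name() { return "foo"; } }
    static class Bar {}

    /** Design A: untyped reference, class supplied on get(). */
    static final class CheckedRef {
        private final Object target;
        CheckedRef(Object target) { this.target = target; }
        <T> T get(Class<T> type) { return type.cast(target); }  // checked here
    }

    /** Design B: typed reference, no class on get(). */
    static final class UncheckedRef<T> {
        private final Object target;
        UncheckedRef(Object target) { this.target = target; }
        @SuppressWarnings("unchecked")
        T get() { return (T) target; }                          // not checked here
    }

    /** Design A: reports the step at which the mismatch is detected. */
    static String checkedFailureSite() {
        CheckedRef ref = new CheckedRef(new Bar());
        try {
            ref.get(Foo.class);
        } catch (ClassCastException e) {
            return "get";
        }
        return "none";
    }

    /** Design B: reports the step at which the mismatch is detected. */
    static String uncheckedFailureSite() {
        UncheckedRef<Foo> ref = new UncheckedRef<Foo>(new Bar());
        List<Foo> cache = new ArrayList<Foo>();
        try {
            cache.add(ref.get());   // no cast is needed, so nothing fails yet
        } catch (ClassCastException e) {
            return "get";
        }
        try {
            cache.get(0).name();    // first real use as a Foo
        } catch (ClassCastException e) {
            return "use";
        }
        return "none";
    }
}
```

In a real application the "use" step could be far from the `get()` call, which is the debuggability concern raised above.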

There is an alternative. We could make createReference() take both a T and a Class instance, where the class is the same as the type of T. In my opinion, that’s pretty terrible. It also means that the class information needs to be included with each serialized copy of the object, and needs to be meaningful in all classloaders. Given this, then yes, ManagedReference could safely be templated.
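
A minimal sketch of that alternative, assuming a hypothetical `TokenRef<T>` and a map standing in for the object store (neither is SGS code). The reference records a `Class<T>` when it is created and checks against it on every `get()`, so a typed reference can no longer return the wrong type silently. The cost is the extra field that has to travel with every reference.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: TokenRef and this createReference are hypothetical,
// not the SGS DataManager API.
class TokenRefDemo {
    static class Foo {}
    static class Bar {}

    /** A typed reference that also carries a run-time class token. */
    static final class TokenRef<T> {
        private final Map<Integer, Object> store;
        private final int id;
        private final Class<T> type;   // would have to be stored with the reference

        TokenRef(Map<Integer, Object> store, int id, Class<T> type) {
            this.store = store;
            this.id = id;
            this.type = type;
        }

        T get() {
            // Checked against the recorded class, so a mismatch fails right here.
            return type.cast(store.get(id));
        }
    }

    /** Stores the object and returns a reference that remembers its class. */
    static <T> TokenRef<T> createReference(Map<Integer, Object> store, int id,
                                           T obj, Class<T> type) {
        store.put(id, obj);
        return new TokenRef<T>(store, id, type);
    }

    /** True if a later type mismatch in the store is detected inside get(). */
    static boolean mismatchDetectedAtGet() {
        Map<Integer, Object> store = new HashMap<Integer, Object>();
        TokenRef<Foo> ref = createReference(store, 1, new Foo(), Foo.class);
        Foo ok = ref.get();          // matches the token, so this succeeds
        store.put(1, new Bar());     // simulate other code replacing the stored object
        try {
            ref.get();
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```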

So, there are several options, none of which (in my opinion) is all that attractive. Use no templating on ManagedReference and provide a Class on get(), template ManagedReference with no Class parameter such that calling get() may not actually return the right type, or template ManagedReference and also include a Class parameter when the reference is created. For that matter, ManagedReference could be templated and we could require the Class on get(), but that’s just all kinds of unattractive. After banging our heads against this for a long time, we decided to go with the approach that provides (what we think is) the best tradeoff between safety, performance, debuggability, and ease of coding. No doubt that last point is endlessly debatable.

floersh raises exactly the right issues for why we assume that the datatypes might not match. In this kind of distributed system, if you ever want to upgrade your code, core code, etc., then this case can arise. We want to give developers the tools to handle this as clearly as possible, which includes making it easy to track down exactly where type mis-matches happen.

seth

karmaGfa · April 3, 2007, 12:02pm

Thank you for the explanation.