Berkeley DB Java vs. JNI

rizenine · March 27, 2007, 10:53pm

I see SGS uses Berkeley DB Java JNI, why that over the pure Java version? License or performance?

Jeff · March 28, 2007, 2:57am

If you mean Berkeley DB over our previous use of Derby, it was the choice of our DB expert. I think it was a combination of performance and familiarity.

We did run into problems with the EA stack not handling violent shutdowns nicely. The 0.9 release seems much more stable and reliable in this regard. Now whether that’s a problem with Derby or with us not using it right, I’d hesitate to state an opinion, as we never really debugged it terribly deeply.

aldacron · March 28, 2007, 3:01am

I think he’s referring to the Java version of Berkeley DB. I was curious about this, too. Why use JNI with the C version when there is a pure Java version? Sleepycat was taken over by Oracle, but the license hasn’t changed.

Jeff · March 28, 2007, 4:15am

I didn’t know they had done a pure Java version.

It has to either be a different product or a total rewrite.

Anyway I’ll ask Tim, DB expert extraordinaire, and let you know what I find out.

rizenine · March 28, 2007, 5:42am

Ya, I was referring to the Berkeley DB pure Java version. Seems like just using the pure Java version would make things easier, not that they’re difficult now.

endolf · March 28, 2007, 6:36am

Tell that to all the Linux users

Now it’s all set up it’s fine, but it was a pain to find a new enough bdb version for various distros. Especially the enterprise distros, as they update much more slowly.

Endolf

rizenine · March 28, 2007, 2:13pm

I am a Linux user, but I’m using Ubuntu 7. It just seems that either there should be the option to use other DBs, or keeping it pure Java would ease usage. I’ve never used bdb so I can’t speak to its performance, but the whole thing is in Java so I don’t completely understand what performance you would get out of using JNI.

Jeff · March 28, 2007, 2:23pm

Well, “using JNI” isn’t the issue. If there is a performance issue it would be between the two versions.

One thing that you can’t really do from Java is block-oriented disk IO. When I took DB theory (way back when) this was how databases typically did secondary storage access for speed reasons. Course that was when dinosaurs roamed the computer rooms (can anyone say “Univac”?)

JK

Edit: Been thinking about the last bit. As I said, that was when dinosaurs roamed the earth. In these days of on-disk memory caches it’s likely unimportant.
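
For anyone wondering what the closest plain-Java equivalent looks like, here is a minimal sketch that reads a file at block-aligned offsets with `java.nio.channels.FileChannel`. The class name `BlockReader`, the helper `writeTemp`, and the 4096-byte block size are assumptions made up for this illustration; none of it comes from SGS or BDB. Opened this way, the channel still goes through the OS file cache, so this is not the raw, cache-bypassing block IO described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class BlockReader {
    // Assumed block size for illustration; real databases pick this to match the filesystem.
    static final int BLOCK_SIZE = 4096;

    // Reads the block with the given number; the buffer is shorter than
    // BLOCK_SIZE if the file ends inside (or before) that block.
    static ByteBuffer readBlock(Path file, long blockNumber) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(BLOCK_SIZE);
            long start = blockNumber * BLOCK_SIZE;
            // Positional reads may return fewer bytes than requested, so loop.
            while (buf.hasRemaining()) {
                int n = ch.read(buf, start + buf.position());
                if (n < 0) {
                    break; // end of file
                }
            }
            buf.flip();
            return buf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes the bytes to a temp file for the demo.
    static Path writeTemp(byte[] data) {
        try {
            Path p = Files.createTempFile("blocks", ".dat");
            Files.write(p, data);
            p.toFile().deleteOnExit();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[BLOCK_SIZE * 2 + 100];
        data[BLOCK_SIZE] = 42; // first byte of block 1
        Path file = writeTemp(data);
        ByteBuffer second = readBlock(file, 1);
        System.out.println(second.remaining() + " bytes, first byte " + second.get(0));
    }
}
```

Whether this gets anywhere near the native library's performance is exactly the open question in this thread; the sketch only shows that block-sized positional reads are expressible in pure Java.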

sethp · March 28, 2007, 3:49pm

The decision was mostly a licensing one. Sun has a wide license for the native version of BDB, so that’s where we started.

Many of us would like to see a pure-Java version of the system, if only to make setup and basic development easier, but no one here has had the time yet to make it happen. When we publish the code, maybe someone will write a Java DataManager for us?

seth
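
To make the idea of a swappable pure-Java backend concrete, here is a rough sketch of a storage interface with an in-memory implementation behind it. Everything here is hypothetical: `KeyValueStore`, `InMemoryStore`, and `StoreDemo` are names invented for this example and are not the SGS DataManager API, which has not been published at the time of this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class StoreDemo {
    // Stores a string under the key and reads it back through the interface only.
    static String roundTrip(KeyValueStore store, String key, String value) {
        store.put(key, value.getBytes(StandardCharsets.UTF_8));
        return new String(store.get(key), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Callers depend on the interface, so a JNI-backed or pure-Java
        // implementation could be dropped in without touching this code.
        KeyValueStore store = new InMemoryStore();
        System.out.println(roundTrip(store, "player:1", "hello"));
    }
}

// Hypothetical interface for illustration; not the actual SGS API.
interface KeyValueStore {
    byte[] get(String key);          // returns null if the key is absent
    void put(String key, byte[] value);
}

// Pure-Java backend with nothing native to install. It is not persistent;
// it only stands in for a real implementation.
class InMemoryStore implements KeyValueStore {
    private final Map<String, byte[]> data = new ConcurrentHashMap<>();

    public byte[] get(String key) {
        byte[] v = data.get(key);
        return v == null ? null : v.clone(); // defensive copy
    }

    public void put(String key, byte[] value) {
        data.put(key, value.clone()); // defensive copy
    }
}
```

A real replacement would also need transactions and durability, which this sketch leaves out entirely.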

tigeba · March 28, 2007, 4:36pm

One thing to keep in mind (this may now be outdated info) is that when using BDB there can be issues if you are accessing your store over a networked volume (NFS, SMB, etc). At least this was true for previous versions of BDB. I only remember this because Subversion originally used BDB for its repositories, and its documentation carried this warning.

rizenine · March 28, 2007, 5:01pm

That’s true, it’s not out there yet. I was looking at Mina before finding Darkstar so maybe I can do some good, if I decide to use Darkstar for my project.

sethp · March 28, 2007, 6:31pm

My understanding is that this used to be the case, but that recent versions of BDB have added some support for remote filesystems. The main issue has to do with how shared memory is mapped. I am not a database person, so my knowledge here is somewhat limited, but there’s more detail at:

http://www.oracle.com/technology/documentation/berkeley-db/db/ref/env/remote.html

seth