I’ve read in the Java performance tuning section at java.sun.com that there’s an increase in performance when char arrays are used. Does anyone feel that you should substitute char arrays for all your Strings whenever possible? If so, how would you use them, given that they won’t expand? I’m thinking System.arraycopy might be helpful there. Just wanted to see what everyone’s opinion was. Thanks.
Without seeing any numbers, I would guess that if there is a performance gain, it’ll be both negligible and platform/runtime dependent. The extra hoops you’d need to jump through to work with char arrays instead of Strings would certainly immediately nullify any gain… in most situations.
(That said, Strings don’t expand either! Strings are immutable, so any time you see one change, you’re actually seeing a new String being allocated. If you’re frequently concatenating Strings for instance, it might be good for performance to avoid them, but I’d switch to StringBuffers instead - I certainly wouldn’t start using char arrays.)
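To make the immutability point concrete, here is a minimal sketch. The class and method names are made up for illustration; only String.concat and StringBuffer.append are real API.

```java
class ImmutabilityDemo {
    // "Changing" a String really allocates a new one; the original is untouched.
    static String originalAfterConcat() {
        String original = "abc";
        String extended = original.concat("def"); // new String "abcdef"
        // 'original' still refers to the unchanged "abc".
        return original;
    }

    // A StringBuffer is mutated in place: append() returns the same object.
    static boolean appendReturnsSameBuffer() {
        StringBuffer sb = new StringBuffer("abc");
        StringBuffer result = sb.append("def");
        return result == sb && sb.toString().equals("abcdef");
    }

    public static void main(String[] args) {
        System.out.println(originalAfterConcat());
        System.out.println(appendReturnsSameBuffer());
    }
}
```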
And now the boiler-plate comments on optimization:
[*]First get it working, then get it working fast.
[*]Premature optimization is the root of all evil.
[*]Don’t guess at bottlenecks, use a profiler instead.
thanks for the help. Good thoughts
I’m guessing that String may even be faster when you are going to compare strings (see java.lang.String.intern() in the source to see why).
And the String is internally represented as a character array anyway.
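A small sketch of the intern() point, with made-up class and method names. Two Strings with the same contents can be distinct objects, but their interned forms are the same reference, so a reference check is enough to compare them. As far as I know, String.equals() in the JDK sources also starts with a reference check, which would make comparing identical references very cheap, but treat that as an implementation detail rather than a guarantee.

```java
class InternDemo {
    static boolean internedStringsShareOneReference() {
        String a = new String("hello"); // explicitly allocate two distinct objects
        String b = new String("hello");

        boolean distinctObjects = (a != b);              // different references
        boolean sameContents = a.equals(b);              // same characters
        boolean sameInterned = (a.intern() == b.intern()); // one canonical instance

        return distinctObjects && sameContents && sameInterned;
    }

    public static void main(String[] args) {
        System.out.println(internedStringsShareOneReference());
    }
}
```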
Just use String and listen to charlie
Erik
Since Strings pretty much are char arrays, what would cause the performance gain? Arrays are still objects, and good luck writing a Java program where the String class never gets loaded.
Actually, in many/most cases, the compiler will swap a StringBuffer in for you. So you can still use the convenient notation of Strings without worry.
[quote]Actually, in many/most cases, the compiler will swap a StringBuffer in for you. So you can still use the convenient notation of Strings without worry.
[/quote]
You have to be very careful about that and really know what you are doing. In most cases where a StringBuffer should be used you will get very poor performance without using it. E.g. you will construct many useless intermediate String objects from short-lived behind-the-scenes StringBuffers.
In general whenever you are appending to a string you want a StringBuffer. If you just need a one shot String made from a concatenation of existing strings then the StringBuffer that is created behind the scenes by the compiler could be good enough.
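A rough sketch of the appending case, with hypothetical method names. Both methods produce the same result; the first relies on += inside a loop, which conceptually builds a fresh String on every pass, while the second reuses one explicit StringBuffer. How much the difference matters depends on the compiler and VM, so measure before rewriting anything.

```java
class ConcatInLoop {
    // Each += yields a new String object, copying everything built so far.
    static String withPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
            s += ',';
        }
        return s;
    }

    // One buffer, appended to in place, converted to a String once at the end.
    static String withBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus(5));
        System.out.println(withBuffer(5));
    }
}
```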
Yeah, the compiler will nicely optimize single-statement use (String s = "Are you " + playerName + "?";) but I don’t think it does anything when the concatenations are much further separated. It certainly can’t help if the concatenations happen inside different methods.
EDIT: Which is basically what Mr Palmer said…
String and StringBuffer (and new friend CharSequence which you should really look at too) are a great big poke in the eye to one of the very tenets of object-oriented programming. In order to understand how to best use them and why… you have to look at the source code, thus violating encapsulation. Tsk.
Cas
I seem to remember it also throwing in a StringBuffer for loops that build Strings, but I could be wrong. I just generally use Strings as the syntax is cleaner, and if after I’m done it could use an optimization I’ll switch to StringBuffer if the compiler didn’t already.
[quote]String and StringBuffer (and new friend CharSequence which you should really look at too) are a great big poke in the eye to one of the very tenets of object-oriented programming. In order to understand how to best use them and why… you have to look at the source code, thus violating encapsulation. Tsk.
[/quote]
Actually (starting a war here) I think the poke should go in the eye of OO programming. OO tends to gloss over the fact that to write efficient code you sometimes, perhaps often, NEED to know the basics of the implementation.
E.g. you have the java.util.Collection interface… but you have to choose an implementation to actually create a collection. E.g. maybe you want a Set; if you require it to be sorted you would be better off using the SortedSet interface from the get-go… then you’ve narrowed it enough that there is only one implementation in the JDK to choose: TreeSet.
BUT, let’s say you don’t care about order or anything. Do you use a HashSet, an ArrayList, or a TreeSet or … You need to know how you will be using it and then you have to pick an implementation that suits how you will use it… If you will be calling the ‘contains(obj)’ method frequently a List will suck… a TreeSet will be better, and a HashSet better yet. Without knowing the implementation it is difficult or impossible to optimize without guessing.
What if you need an ordered list… you pick the List interface of course, but should it be ArrayList or LinkedList… The classes in the JDK are named specifically to advertise the implementation and the JavaDocs go further to confirm the implementation… it’s done that way because you NEED to know.
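As a sketch of the contains() comparison (class and method names are made up), the three collections below answer the same question with the same result but do very different work to get there. The cost notes in the comments are the usual documented behaviour of these classes, not measurements.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ContainsDemo {
    // Returns true if all three implementations agree on membership of 'probe'.
    static boolean allAgree(int n, Integer probe) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        Set<Integer> hash = new HashSet<>(list);
        Set<Integer> tree = new TreeSet<>(list);

        boolean inList = list.contains(probe); // linear scan over the elements
        boolean inTree = tree.contains(probe); // tree search, about log(n) comparisons
        boolean inHash = hash.contains(probe); // hash lookup, roughly constant time

        return inList == inTree && inTree == inHash;
    }

    static boolean listContains(int n, Integer probe) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list.contains(probe);
    }

    public static void main(String[] args) {
        System.out.println(allAgree(1000, 999));
        System.out.println(allAgree(1000, 1000));
    }
}
```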
Note that I’m not bashing OO techniques, I’m just pointing out that there are some obvious times when knowledge of the implementation is required to write efficient code. Understanding the implementation of String and StringBuffer is one of those things you need to know.
I would recommend not blindly using char[]'s everywhere in place of Strings. But a char[] can very much outperform a String… in some situations. This usually is related to when you might need to perform multiple mutations on a String or when you might need to recursively go through a String, where the char[] loop can handle many things simultaneously. You can hope that Hotspot will catch and clean this for you, but the more complex the code the less likely it will happen.
For example the following snippet would be a nightmare to do any other way and is insanely fast.
Doing this same thing in Strings is really yucky and doing this with java.util.regex is really slow. Microbenchmarking this shows a String version to be about 20x slower and the regex version to be about 100x slower.
While I would not advocate always using char[]'s, it does make good sense to use char[]'s where you understand the difference and the performance is important.
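The snippet referred to above isn't reproduced in this thread, so the following is only a hypothetical illustration of the general idea, not the poster's code and not what the quoted benchmark figures were measured on. It does two mutations (collapsing runs of whitespace and upper-casing) in a single pass over one char[], writing the result back into the same array.

```java
class CharArrayPass {
    // Collapses each run of whitespace to one space and upper-cases everything else,
    // in one pass, reusing the same array for input and output.
    static String normalize(String input) {
        char[] buf = input.toCharArray();
        int out = 0;                  // next write position; never passes the read position
        boolean lastWasSpace = false;
        for (int in = 0; in < buf.length; in++) {
            char c = buf[in];
            if (Character.isWhitespace(c)) {
                if (!lastWasSpace) {
                    buf[out++] = ' ';
                }
                lastWasSpace = true;
            } else {
                buf[out++] = Character.toUpperCase(c);
                lastWasSpace = false;
            }
        }
        return new String(buf, 0, out);
    }

    public static void main(String[] args) {
        System.out.println(normalize("a  b\t\tc"));
    }
}
```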
[quote]Since String’s pretty much are char arrays, what would cause the performance gain?
[/quote]
I myself was surprised when I profiled a project (using LWJGL) several months ago. I was getting steadily slower framerates. It turned out that String.toCharArray() was taking huge amounts of time relative to any other method.
I can only guess at two things: either it’s the need to allocate and fill the new char array, or there’s a conversion from String to array that requires some non-trivial work.
In reality it was a simple fix - I like the clean String syntax, so I left it with Strings but cached the .toCharArray() result and avoided the overhead.
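A sketch of that kind of caching, assuming a small hypothetical wrapper class (CachedChars is made up here, it is not from the project described). toCharArray() returns a newly allocated copy on every call, so calling it once per frame adds up; the wrapper makes the copy once and hands out the same array afterwards. The catch is that the cached array is shared, so callers must not modify it.

```java
class CachedChars {
    private final String text;
    private char[] chars; // filled on first use, then reused

    CachedChars(String text) {
        this.text = text;
    }

    // Returns the same array on every call; treat it as read-only.
    char[] chars() {
        if (chars == null) {
            chars = text.toCharArray(); // the only allocation and copy
        }
        return chars;
    }

    // Shows why the cache helps: each plain toCharArray() call makes a new array.
    static boolean toCharArrayCopiesEveryTime(String s) {
        return s.toCharArray() != s.toCharArray();
    }

    public static void main(String[] args) {
        CachedChars cached = new CachedChars("hello");
        System.out.println(cached.chars() == cached.chars());
        System.out.println(toCharArrayCopiesEveryTime("hello"));
    }
}
```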
So, erm, I guess that the only concrete advice you can get from that is to profile first and not trust your initial instincts over what’s fast and what’s not…
from StringBuffer Api:
String buffers are used by the compiler to implement the binary string concatenation operator +. For example, the code:
x = "a" + 4 + "c"
is compiled to the equivalent of:
x = new StringBuffer().append("a").append(4).append("c").toString()
It should be worth a try to use StringBuffers for everything internally and only call toString() when you have to output stuff, but I’m no stringing expert.
edit: you didn’t really tell us what you want to do with your strings; the above should be true if you do a lot of appends.
Wow, that really is in the StringBuffer docs. I thought/hoped in a case like that the compiler would just turn that into “a4c” and throw it in the constant pool. I thought the StringBuffer was reserved for when the concatenation isn’t known until run time.
I’d be surprised if it wasn’t optimized away. I expect the docs are either out of date or just a utopian view of things.
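For what it’s worth, this can be checked without reading bytecode. As I understand the language spec, a concatenation made up only of literals is a compile-time constant expression, and constant Strings are interned, so it ends up being the very same object as the literal "a4c". When one operand is only known at run time, the result is a newly created String. A small sketch with made-up names:

```java
class ConstantFolding {
    // All operands are literals, so the whole expression is a compile-time constant.
    static boolean literalConcatIsSameObjectAsLiteral() {
        String x = "a" + 4 + "c";
        return x == "a4c"; // reference comparison on purpose
    }

    // 'n' is only known at run time, so the concatenation happens at run time.
    static boolean runtimeConcatIsNewObject(int n) {
        String y = "a" + n + "c";
        return y.equals("a4c") && y != "a4c"; // same contents, different object (for n == 4)
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsSameObjectAsLiteral());
        System.out.println(runtimeConcatIsNewObject(4));
    }
}
```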
[quote]String and StringBuffer (and new friend CharSequence
[/quote]
StringBuffer implements the CharSequence interface. But to you all: have you looked at the source code of the StringBuffer class?
Most of the methods are synchronized, but most of the time we don’t need them to be. Create a FastStringBuffer clone by removing all the synchronized keywords from the methods.
Or wait for JDK 1.5, where we have StringBuilder as the unsynchronized version of StringBuffer.
This is similar to the Vector/ArrayList pair: there is no point using the Vector class in a Java 2 environment anymore.
We wrote a FastStringBuffer a while ago and use it extensively in all of our performance-intensive code. To get around the unsynchronized nature of the class we only use it as a method-local construct, and it seems to get the job done. It’s surprising the difference it makes over StringBuffer.
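For anyone curious what such a class might look like, here is a minimal sketch. It is not the class described above; the name FastStringBufferSketch, the initial capacity of 16, and the growth rule are all assumptions made for illustration. It also shows how a char array "expands" in practice: allocate a bigger one and move the contents over with System.arraycopy. Nothing here is synchronized, so it is only safe when a single thread uses it, e.g. as a local variable. On JDK 1.5 and later, java.lang.StringBuilder covers the same ground.

```java
class FastStringBufferSketch {
    private char[] value = new char[16]; // assumed initial capacity
    private int count;                   // number of chars actually in use

    FastStringBufferSketch append(String s) {
        if (s == null) {
            s = "null"; // same convention as StringBuffer
        }
        int len = s.length();
        ensureCapacity(count + len);
        s.getChars(0, len, value, count); // copy the String's chars into our array
        count += len;
        return this;
    }

    FastStringBufferSketch append(char c) {
        ensureCapacity(count + 1);
        value[count++] = c;
        return this;
    }

    // Grows the backing array when needed; arrays cannot be resized in place.
    private void ensureCapacity(int minimum) {
        if (minimum > value.length) {
            int newCapacity = Math.max(minimum, value.length * 2 + 2);
            char[] bigger = new char[newCapacity];
            System.arraycopy(value, 0, bigger, 0, count);
            value = bigger;
        }
    }

    int length() {
        return count;
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}
```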
We have all promised to go back one day and implement CharSequence, but so far no one has needed it.