Monday, March 18, 2013

Java Question

The project that I am currently working on has an interesting issue.  We just started using a tool called YourKit Java Profiler, and it shows that we have 26 MB of duplicate copies of the empty string "".  My first reaction was shock: I thought all compilers were smart enough to make identical immutable strings (such as string constants) reference a single copy.  But if YourKit Java Profiler is to be believed, that is NOT happening.  I asked the question here, and the answers that I received indicated that Java does create only a single copy of a string constant -- but that the problem might be that:

String str = i + ""

which is a very common Java construct for converting an integer to a string, is turning into something like:

String str = new StringBuilder("").append(i).toString();  

The implication is that the new StringBuilder("") is producing a distinct object each time, because a StringBuilder is a mutable buffer rather than an immutable string constant.  The solution is to use the slightly less convenient

String.valueOf(i)

to produce the string version of i instead.  Does this seem like a plausible explanation of how we end up with 26 MB of "" copies?
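To make the comparison concrete, here is a minimal sketch of the two conversions side by side, plus a check of how identical string literals behave.  The class and method names (IntToStringDemo, viaConcat, viaValueOf) are made up for illustration and are not from our code base.  It does not prove where the 26 MB comes from; how the compiler translates the concatenation depends on the Java version, so this only shows that both forms produce the same text and that literals are shared while explicitly constructed strings are not.

```java
public class IntToStringDemo {

    // The concatenation idiom from the post (hypothetical helper name).
    static String viaConcat(int i) {
        return i + "";
    }

    // The explicit conversion suggested as the replacement (hypothetical helper name).
    static String viaValueOf(int i) {
        return String.valueOf(i);
    }

    public static void main(String[] args) {
        int i = 42;
        String a = viaConcat(i);
        String b = viaValueOf(i);

        // Both approaches yield strings with the same contents.
        System.out.println(a.equals(b)); // prints "true"

        // Identical string literals are interned, so both names refer to one object.
        String e1 = "";
        String e2 = "";
        System.out.println(e1 == e2); // prints "true"

        // An explicitly constructed String is a separate object with equal contents.
        String e3 = new String("");
        System.out.println(e1 == e3);      // prints "false"
        System.out.println(e1.equals(e3)); // prints "true"
    }
}
```

If the profiler really is counting distinct empty-string objects, they would have to come from code paths like the last case, where a new object is created at run time, rather than from the "" literal itself.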

5 comments:

  1. I'm not a Java guy, but... shouldn't those StringBuilders fall out of scope and get GC'd, and thus not matter in the long run?

    (I'm a VB.NET guy, primarily, and always work with Option Strict turned on, so all my integer -> string conversions are "int.ToString" or the like.

    The idea of adding a "" to it and making the compiler infer that I want a type conversion makes me twitchy...)

  2. i + "" is an affront to my notion of how to do things, but it is quick and easy, and that's why it is widely done in this steaming pile that I am trying to clean up.

  3. I'm a C# guy (and C# is just a ripoff of Java), and int.ToString is the way to go. Never say i + "". Can that explain the copies? Well, that wouldn't happen in C#.

  4. Try the StringBuilder version, but instead of a literal "" in the first call, replace it with a static final String set to "".

    Or drop the "" in the constructor completely. It shouldn't be necessary.

  5. That's what Java does with string concatenation: fire up the StringBuilder class.

    You probably have so many of them because you have plenty of memory and the garbage collector hasn't gotten around to removing them.

    The Integer.toString(i) approach is preferred but it's sort of annoying that the compiler/JIT doesn't optimize the concatenation idiom.
