Java is an awesome language because you get to ignore hard stuff like memory allocation.  Write once, run anywhere.  Sweet, where do I sign up?

The privilege of not having to manage memory comes at a cost: you aren't allowed to question how the JVM works.  Move along, coder.  Keep making those objects.  Don't ask how much memory things take up. In fact, to keep you from getting curious, we're not even going to have a sizeof function.

How My Complacency Made Me Fail

When you're coding in Java, it's easy to buy into this mentality.  You never really have to worry about how much space anything takes up, and if you get an OutOfMemoryError, just give the JVM more memory.  Problem solved.

But there are times when you need to be conscious of how the JVM actually works.  For example, when you're trying to squeeze every bit of performance out of the crappiest of machines, such the small Amazon EC2 instances (where Persai is hosted).

Here's a real world example of how I got burned:

Persai does a lot of work with high-dimensionality, sparse vectors.  To save space, we compact the vectors.  Since most of the values in the vectors are zero, we simply do not store them.  What we store amounts to a list of the nonzero element indices and the corresponding values.  This is our basic data structure:

class sparseNode {
    public int index;
    public double value;
So a vector is an array of sparseNode objects. Sounds easy enough, and for a while, it was.  That is until I was tasked with storing as many of these in memory as I could.

Where's the fail here?  How big is an int primitive in Java? 32 bits, right?  Sort of.  The Java specification says that an implementation must provide 32 bits of workable space for the programmer using an int, but makes no mention of how how the virtual machine must store this variable.

In Sun's HotSpot JVM, object storage is aligned to the nearest 64-bit boundary.  On top of this, every object has a 2-word header in memory.  The JVM's word size is usually the platform's native pointer size.  Alright, two words for the object header, one word for the int, two words for the double.  That's 5 words: 160 bits.  Because of the alignment, this object will occupy 192 bits of memory.  Effectively, the int value is taking 64 bits!  In an array of these things, I've wasted N times 32 bits.  Figure, a typical vector is about 200 elements long, so that's 800 bytes out the window for each one.

This fun fact would have been good to know when I was doing my initial back of the envelope calculation of how many vectors I can fit in a gigabyte of memory.

Yes, I know I should be complaining about the same thing when using C structs.  But you know what?  When you learn C, you are introduced to many harsh realities.  When you learn Java, you are introduced to XML.  They protect you from the hard things.  Live and learn, I guess.

The Fix

After knowing this, it was easy to rescue myself.  Java primitive arrays fall to the same alignment issue, but in our case, they can help solve the problem.  Instead of representing a vector as an array of objects, we'll represent a vector as an object of arrays.

class sparseVector {
  public int[] indicies;
  public double[] values;
This way, we're going to lose at most 4 bytes per vector with the alignment of the int[] array.  This sure beats the ~800 byte loss with the other solution.