Tuesday, 1 October 2013

Time Series Map

Recently, I needed a time-series data where a ticker information was needed to be stored every millisecond. My initial thought was I could create a map which would have millisecond as the key and data as the value.

Map<Long, Tick> tickMap;

I soon realized also needed to remove old entries from the map as the goes on to keep memory consumption in control. Again I thought I could evict expiring map entries from map after a put operation or use guava cache builder. Although that would work, I would cause creating/removing a lot of objects and hence risking garbage collector runs.
I remembered Disruptor usage from my Betfair times, the magic Ring Buffer solution which eliminates GC runs by reusing early initialized objects. Ring buffer is basically an array list with a predefined capacity. Instead of adding/removing entries from the ring, entries are updated to reflect the last added object. Locating an entry’s position can be determined by a bitwise operation of index mask and entry’s sequence value.
Disruptor/Ring Buffer

The sequence number is basically an always increasing pointer, yielding the next long number and index mask is simply size of buffer minus one. With that information in my hand, I can build a ring buffer to store 24 hour volatile data without driving GC crazy. All i need is defining a ring buffer with a power of 2 size, above 60 x 60 x 24x1000 milliseconds.  The reason why power of 2 is required is for performance reason. Computers like binary systems and it seems an index locating function using a bitwise operation is cheaper than a modulus operation. Java Hashmap internally uses same logic for determining the size of the internal buckets.

No comments:

Post a Comment