Tuesday, 10 January 2012

Clojure and Software Transactional Memory

I have added Clojure to my interest list recently and started to spend some time to understand the language and how it works. I started off with The Joy Clojure book which is quite good to understand intrinsic of the language at the very beginning. I find Clojure very suitable for algorithmic operations with built-in support of structural sharing which eliminates the need for full copy on write. Structural sharing is very well presented in slides through 32 and 34. In Lisp dialects, it is quite easy to play with with map, list, vector, tree and set data structures. for example, adding an element to a list is constant time operation.

Clojure also takes a different route which is called “software transactional memory” to support concurrency. Although STM/TM concurrency controlling is not a new mechanism, Clojure is one of the languages supporting it at the language level. Clojure supports STM mechanism with built-in Refs. It also provides a mechanism to manage retry times with :min-history and :max-history keywords.

When I first saw Clojure’s STM mechanism, I worried about performance characteristics since Clojure’s STM mechanism retries a conflicted transaction a number of time to commit it. In fact I wasn’t alone to worry about this issue. My Google search took me to a few years old paper claiming STM as a toy. In contrast to the old paper, there is also a paper defending STM can outperform sequential code and another one by Oracle stating STM can be beneficial for some applications. In fact there are a number of papers defending STM and making an analogy between STM and Java GC mechanism. Even STM approach to concurrency is also welcomed by Scala community.

My Clojure STM interest even took me further to Chapel parallel programming language which seems to be a unique approach to parallel programming and software transactional memory for distributed systems (Distributed STM) implementations using Chapel. I was also impressed by Avout which is a Clojure library for managing state in distributed environments.

In conclusion, S/TM and Clojure’s STM implementation seems mature enough to run under heavy concurrent conditions.