Saturday, 23 June 2012

Concurrent Ternary Search Tree

I was recently asked for implementing string processing and searching feature. I decided to use the ternary search data structure, as it consumes less memory than hash-based structures. Moreover hash-based data structures lookup performance heavily depends on how hash function is implemented. On the other side a ternary searh tree lookup performance would be satisfying assuming string insertions will be random.
Below is a simple lock-free TST implementation with concurrency support. In this implementation a node's character is assigned if and only if the node is null. This is achieved thanks to CAS instruction.

import java.util.concurrent.atomic.AtomicReference;

/**
 * Ternary Search Tree with concurrency support
 *
 * @see <a href ="http://www.drdobbs.com/database/184410528">drdobbs</a>
 * @see <a href ="http://en.wikipedia.org/wiki/Ternary_search_tree" >Ternary search tree</a>
 */
public class TstNode {
    private final char ch;
    private volatile boolean terminated = false;
    private AtomicReference<TstNode> left = new AtomicReference();
    private AtomicReference<TstNode> right = new AtomicReference();
    private AtomicReference<TstNode> center = new AtomicReference();

    public TstNode(char ch) {
        this.ch = ch;
    }

    public boolean isTerminated() {
        return terminated;
    }


    public TstNode search(String word) {
        TstNode node;
        node = this;
        int i = 0;
        while (null != node) {
            char ch = word.charAt(i);
            if (ch < node.ch)
                node = node.left.get();      
           else if (ch == node.ch) {
                i++;
                if (word.length() == i) {
                    return (node.terminated) ? node : null;
                }
                node = node.center.get();
            } else
                node = node.right.get();
        }
        //not found
        return null;
    }


    public TstNode insert(String word) {
        int i = 0;        char ch;
        TstNode node = this;
        while (i < word.length()) {
            ch = word.charAt(i);
            if (ch < node.ch) {
                node = insertLeftNode(ch, node);
            } else if (ch > node.ch) {
                node = insertRightNode(ch, node);
            } else {
                ++i;
                if (i == word.length()) {
                    node.terminated = true;
                    break;
                }
                node = insertCenterNode(word, i, node);
            }
        }
        return node;
    }

    private TstNode insertCenterNode(String word, int i, TstNode node) {
        node.center.compareAndSet(null, 

new TstNode(word.charAt(i)));
        node = node.center.get();
        return node;
    }

    private TstNode insertRightNode(char ch, TstNode node) {
        node.right.compareAndSet(null, new TstNode(ch));
        node = node.right.get();
        return node;
    }

    private TstNode insertLeftNode(char ch, TstNode node) {
        node.left.compareAndSet(null, new TstNode(ch));
        node = node.left.get();
        return node;
    }
}

Monday, 30 April 2012

Decorator Pattern & Checked Exception Wrapping

Have you ever come across a situation that checked exceptions made your code looking ugly and hard writing unit tests?
I’ve recently spent some time on a REST service acting like a man in the middle between two distributed systems. The service was interacting between two systems. Due to nature of distributed unreliable systems there has to be exception checking for every communication function amongst systems. Obviously checked exceptions over time has made rest service class look nasty and writing unit tests started to take more time. I’ve realized that in the rest service, I was just doing some logging, returning meaningful error codes in exception checking blocks, so I decided to wrap checked exceptions with a decorator so that service class would look better.
To give a high level brief, I have an interface defining some functions with a checked exception declaration.

1. Interface definition
public interface ExecutionManager {
  public Boolean checkJobExists(String job) throws ExecutionException;
  public void createNewJob(String job) throws ExecutionException;
  public void updateNext(String jobName, Integer nexId) throws ExecutionException;
}

2. Rest Service

An implementation of ExecutionManager is used throughout REST service class where all checked exceptions are wrapped with custom runtime WebException

@Path("execute")
@Singleton
public class ExecutionImpl  {
@Override
    public JAXBElement<BuildStatus> buildStatus(UriInfo ui,
                                                String packName,
                                                String packVersion,
                                                String packType,
                                                String buildId) {
// code blocks at here
        ExecutionManager mng;
        try {
            mng = this.buildManagement();
            exists = mng.checkIobExists(job);
        } catch (ExecutionException e) {
            throw new WebException(Response.Status.INTERNAL_SERVER_ERROR, 
WebExceptionBean.ErrorCode._EXECUTION_ERROR, 
"Unable to check job is on execution server", e);
        }

// code blocks at here
       
        try {
            ExecutionStatus jbs = mng.getBuildStatus(job, requestedBuild);
            return objectFactory.createBuildStatusResponse(jbs);
        } catch (ExecutionException e) {
            throw new WebException(Response.Status.INTERNAL_SERVER_ERROR, 
WebExceptionBean.ErrorCode._EXECUTION_ERROR, 
"Unable to retrieve build details from execution server", e);
        }
    }
}


3. Decorating checked exceptions

I’ve created a new decorator class implementing the Executionmanager and accepting it as the constructor parameter. With decorator pattern, I am able to remove throws declaration and wrap checked ExecutionExpception, removal of throws declaration is allowed, as the decorator is not introducing a new exception  to be checked.

public class ExceptionWrappingDecorator implements ExecutionManager {
    ExecutionManager manager;
    public ExceptionWrappingDecorator(ExecutionManager manager) {
        this.manager = manager;
    }

@Override
   public Boolean checkJobExists(String jobName) { // removed throws declation
       try {
           return this.manager.checkJobExists(jobName);
       } catch (ExecutionException e) {
           throw new WebException(Response.Status.INTERNAL_SERVER_ERROR,
               WebExceptionBean.ErrorCode.TEST_PACK_EXECUTION_ERROR,
               "Unable to check job is on execution server", e);
       }
   }

@Override
public void createNewJob(String jobName) {  //removed throws declation
    try {
        this.manager.createNewJob(jobName);
    } catch (TestPackExecutionException e) {
        throw new TesWebException(Response.Status.INTERNAL_SERVER_ERROR,
            TesWebExceptionBean.ErrorCode.TEST_PACK_EXECUTION_ERROR,
            "Unable to create new test pack job on execution server", e);
    }
}

@Override
public void updateNext(String jobName, Integer nextBuildId) {
        try {
            this.manager.updateNextBuildId(jobName, nextBuildId);
        } catch (ExecutionException e) {
            throw new WebException(Response.Status.INTERNAL_SERVER_ERROR,
                WebExceptionBean.ErrorCode.EXECUTION_ERROR, e,
                "Unable to update set nextbuildId -> %s of job -> %s on execution server", nextBuildId, jobName);
        }
}


4. Improved new REST service class

Rest service class looks cleaner with decorator usage, obviously main purpose of the service class is dealing with requests instead of being drown in try-catch blocks. By creating exception decorator, side effect codes are moved away from service class while not losing any information carried in the exceptions.

@Path("execute")
@Singleton
public class ExecutionImpl  {  
@Override
public JAXBElement<BuildStatus> buildStatus(UriInfo ui,
                                            String packName,
                                            String packVersion,
                                            String packType,
                                            String buildId) {

// code blocks at here 

   ExceptionWrappingDecorator mng=ExecutionHelper.getExceptionWrappingDecorator();
    boolean exists = mng.checkJobExists(jobName);

// code blocks at here

  ExecutionStatus jbs = mng.getBuildStatus(jobName, requestedBuild);
}


5. Simplified Unit Tests for Exception Decorator

Decorator pattern has also helped refactoring unit tests and increasing the test coverage.

public class ExceptionWrappingManagerTest {
    @Mock
    ExecutionManager tpmng;
    @Mock
    ExecutionException ex;
    ExceptionWrappingDecorator mng;
    String job = "error-job";

    public ExceptionWrappingManagerTest() {
        MockitoAnnotations.initMocks(this);
        mng = new ExceptionWrappingDecorator(tpmng);
    }

    @Test(expected = WebException.class)
    public void testCheckJobExists() throws Exception {
        Mockito.when(tpmng.checkIfJobExists(job)).thenThrow(ex);
        mng.checkIfJobExists(job);
    }

    @Test(expected = WebException.class)
    public void testCreateNewJob() throws Exception {
        Mockito.doThrow(ex).when(tpmng).createNewJob(job);
        mng.createNewJob(job);
    }

    @Test(expected = TesWebException.class)
    public void testUpdateNextBuildId() throws Exception {
        Integer id = 1;
        Mockito.doThrow(ex).when(tpmng).updateNextBuildId(job, id);
        mng.updateNextBuildId(job, id);
    }

}

Saturday, 28 April 2012

Java Visual VM & Visual GC

I’ve always found hard to memorize java GC parameters. Java GC tuning has become a major issue after systems with huge memory and multiple cpu’s has emerged. I believe that Java GC management sounds like an unknown grey area for most developers, This conception has grown due to increasing configuration options. Although there is java performance portal and complete dedicated page for GC tuning, it’s still not easy to visualize dozens of GC terms instantly.

For those finding GC terms difficult and want see what actually these terms (perm/young/old generations, survivor spaces, etc ) means, there is a visual gc plugin which can be installed to visual vm shipping with jdk 1.6. Basically Visual VM looks like an improved version of jconsole tool with an add-in architecture.

Below I uploaded a visual gc picture of a java app running locally. Thanks to visual gc plugin, it’s quite easy to monitor and visualize memory consumption of java applications.

java visual gc
Java Visual GC



After seeing the picture, it’s time to revisit some GC parameters as;

--XX: NewRatio        –> Old / (Eden + S0 +S1)
--XX: SurvivorRatio  –> Eden / S0
--XX: MaxPermSize  –> Perm

where S0 equals S1 and they are called From and To survivor spaces, while Eden + S0 + S1 altogether define young generation size.

Recalling that a java app's min and max memory sizes are defined by -Xms and -Xmx, Jvm will split an app's memory into Perm, Old, Eden, S0,S1 regions according to ratios above.

Sunday, 19 February 2012

Integration tests, how much are they needed?

I few days ago I was required to verify a custom Jenkins plugin was developed correctly and would be running as expected. The plugin in question was covered with proper unit tests and part of a big solution project including  some other modules using the plugin to achieve a job build task.  As it was a plugin, unit tests were not guaranteeing that it was ready to be used for functional requirements test.The question is how you can be confident with your module that it is ready for functional requirements tests?

It looks like today’s agile software development practitioners are generally focusing only on unit tests and omitting integration-tests, Maybe this routine is due to the fact that it’s generally difficult to test a module without isolating/mocking some external dependencies, that’s why a code base passing unit tests is generally treated ready for functional testing stage. In reality there is a certain need to verify that a module as a whole is tested before going further in the development pipeline. I think instead of omitting the integration-test phase all together, how much integration tests coverage  is needed can be asked. In most cases number of integration test cases would not surpass unit tests. Depending on project factors, usability and risk assessment, a percentage of integration-test coverage can be set for projects.

I would rather prefer  deferring heavy cost/slow tests to the end of development pipeline as much as possible, this would give more time to achieve a certain level of unit and integration tests coverage, which will ensure to check against a number of common pitfalls.
Ideally, if a TDD style development is followed  with concise user stories and clearly highlighted  feature requests, the unit test level coverage of the project can easily hit above 90% and integration test coverage could be set to a lesser target while functional tests coverage would be the least, only verifying high level user stories.


CD Flow

Tuesday, 10 January 2012

Clojure and Software Transactional Memory


I have added Clojure to my interest list recently and started to spend some time to understand the language and how it works. I started off with The Joy Clojure book which is quite good to understand intrinsic of the language at the very beginning. I find Clojure very suitable for algorithmic operations with built-in support of structural sharing which eliminates the need for full copy on write. Structural sharing is very well presented in slides through 32 and 34. In Lisp dialects, it is quite easy to play with with map, list, vector, tree and set data structures. for example, adding an element to a list is constant time operation.

Clojure also takes a different route which is called “software transactional memory” to support concurrency. Although STM/TM concurrency controlling is not a new mechanism, Clojure is one of the languages supporting it at the language level. Clojure supports STM mechanism with built-in Refs. It also provides a mechanism to manage retry times with :min-history and :max-history keywords.

When I first saw Clojure’s STM mechanism, I worried about performance characteristics since Clojure’s STM mechanism retries a conflicted transaction a number of time to commit it. In fact I wasn’t alone to worry about this issue. My Google search took me to a few years old paper claiming STM as a toy. In contrast to the old paper, there is also a paper defending STM can outperform sequential code and another one by Oracle stating STM can be beneficial for some applications. In fact there are a number of papers defending STM and making an analogy between STM and Java GC mechanism. Even STM approach to concurrency is also welcomed by Scala community.

My Clojure STM interest even took me further to Chapel parallel programming language which seems to be a unique approach to parallel programming and software transactional memory for distributed systems (Distributed STM) implementations using Chapel. I was also impressed by Avout which is a Clojure library for managing state in distributed environments.

In conclusion, S/TM and Clojure’s STM implementation seems mature enough to run under heavy concurrent conditions.