
Java actor framework “Kilim”: A solution without a problem?

Some time ago I read a posting from a workmate that linked to an IBM developerWorks article about the actor framework “Kilim”, and he seemed to be quite fond of it. Since I know this guy is a very skilled technician, since actors are one of the bigger hypes in software technology nowadays (with languages like Scala openly supporting that concurrency model), and since the IBM developerWorks articles are usually quite good, I instantly dived into it, only to be left perplexed.

Let me start with one of the commonly repeated claims regarding actors, which in this particular article reads like this:

The actor model is a different way of modeling concurrent processes. Rather than threads interacting via shared memory with locks, the actor model leverages “actors” that pass asynchronous messages using mailboxes. A mailbox, in this case, is just like one in real life — messages can be stored and retrieved for processing by other actors. Rather than sharing variables in memory, the mailbox effectively separates distinct processes from each other.

First I noticed something fundamentally wrong here: actors do interact via shared memory. It’s the mailbox, which is both shared with the outside world and synchronized. Well, to be fair, a mailbox does not need to be synchronized; it may well be implemented using a lock-free algorithm. If you look at Kilim’s implementation, however, you will find a lot of synchronization over a shared array there.
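To make that concrete, here is a minimal mailbox sketch of my own (a hypothetical illustration, not Kilim’s actual code): the internal queue is plain shared memory, and every access runs under the mailbox’s monitor.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical minimal mailbox, for illustration only: the queue is shared
// memory, and every access is guarded by the mailbox's intrinsic lock.
public class SimpleMailbox<T> {
    private final Queue<T> messages = new ArrayDeque<T>();

    public synchronized void put(T message) {
        messages.add(message);
        notifyAll(); // wake up any actor blocked in take()
    }

    public synchronized T take() throws InterruptedException {
        while (messages.isEmpty()) {
            wait(); // block until another actor posts a message
        }
        return messages.poll();
    }
}
```

However you dress it up, the actors on both ends of this mailbox are communicating through the very shared, synchronized memory the claim says they avoid.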

And there was another thing that bugged me:

There are no locks and synchronized blocks in the actor model, so the issues that arise from them — like deadlocks and the nefarious lost-update problem — aren’t a problem.

As a user of actors I don’t need the synchronized keyword, true, and in fact you don’t find any synchronization in the user code. But seriously: Do we need a framework for that?

Translating the example to plain Java SE

What else is an actor than a (lightweight) thread, and what else is a mailbox than a thread-safe container? Let me translate the example code from the article to use only classes from the Java Standard Edition.
The “calculation type message” (I’ll try to use the same terms as in the article) is exactly the same as the original:

import java.math.BigDecimal;

public class Calculation{
 private BigDecimal dividend;
 private BigDecimal divisor;
 private BigDecimal answer;
 public Calculation(BigDecimal dividend,BigDecimal divisor){
  this.dividend=dividend;
  this.divisor=divisor;
 }
 public BigDecimal getDividend(){
  return dividend;
 }
 public BigDecimal getDivisor(){
  return divisor;
 }
 public void setAnswer(BigDecimal ans){
  this.answer=ans;
 }
 public BigDecimal getAnswer(){
  return answer;
 }
 public String printAnswer(){
  return "The answer of " + dividend + " divided by " + divisor + " is " + answer;
 }
}

The DeferredDivision class, which is our first actor, looks almost the same as in the article. Instead of a mailbox class it uses a BlockingQueue, it implements Runnable instead of extending the framework class “Task”, and it handles InterruptedException:

import java.math.BigDecimal;
import java.math.MathContext;
import java.util.Date;
import java.util.Random;
import java.util.concurrent.BlockingQueue;

public class DeferredDivision implements Runnable{
 private BlockingQueue<Calculation> mailbox;
 public DeferredDivision(BlockingQueue<Calculation> mailbox){
  this.mailbox=mailbox;
 }
 public void run(){
  try{
   Random numberGenerator=new Random(new Date().getTime());
   MathContext context=new MathContext(8);
   while(true){
    System.out.println("I need to know the answer of something");
    mailbox.add( // no block; throws IllegalStateException when full
     new Calculation(new BigDecimal(numberGenerator.nextDouble(),context),
      new BigDecimal(numberGenerator.nextDouble(),context)));
    Thread.sleep(1000);
    Calculation answer=mailbox.poll(); // no block
    if(answer != null && answer.getAnswer() != null){
     System.out.println("Answer is: " + answer.printAnswer());
    }
   }
  }catch(InterruptedException e){
   Thread.currentThread().interrupt(); // restore the interrupt status and exit
  }
 }
}

The Calculator class, which is the second actor, is not very different either. The modifications to the original code are exactly the same as in the DeferredDivision:

import java.math.MathContext;
import java.util.concurrent.BlockingQueue;

public class Calculator implements Runnable{
 private BlockingQueue<Calculation> mailbox;
 public Calculator(BlockingQueue<Calculation> mailbox){
  this.mailbox=mailbox;
 }
 public void run(){
  try{
   while(true){
    Calculation calc=mailbox.take(); // blocks
    if(calc.getAnswer() == null){
     calc.setAnswer(calc.getDividend().divide(calc.getDivisor(),new MathContext(8)));
     System.out.println("Calculator determined answer");
     mailbox.add(calc); // no block; throws IllegalStateException when full
    }
   }
  }catch(InterruptedException e){
   Thread.currentThread().interrupt(); // restore the interrupt status and exit
  }
 }
}

We don’t need the Ant script for Kilim’s Weaver here, because we don’t need to weave anything. So we don’t need to understand and maintain any XML definition either, which is great. Instead we can continue with the runner, which looks as simple as the original from the article, but I can give it its own maximum number of messages. I chose 300, the same value that is hard-coded in the Kilim framework:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class CalculationCooperation{
 public static void main(String[] args){
  BlockingQueue<Calculation> sharedMailbox
   =new ArrayBlockingQueue<Calculation>(300);
  Runnable deferred=new DeferredDivision(sharedMailbox);
  Runnable calculator=new Calculator(sharedMailbox);
  new Thread(deferred).start();
  new Thread(calculator).start();
 }
}

Voilà! This code not only looks very similar to the original, it also behaves very similarly. There are just two differences: it doesn’t use lightweight threads, and there’s a difference in the way it fails. It fails? Wait! I was told that…

When an example backfires on a bold claim

There are no locks and synchronized blocks in the actor model, so the issues that arise from them — like deadlocks and the nefarious lost-update problem — aren’t a problem.

The funny thing is: That particular article itself is probably the best proof that it’s wrong. The code that the author presents is free of deadlocks, sure, but ironically it contains a huge lost-update problem, as does my no-framework translation.

The mailbox from the original will silently reject new messages (returning “false”) when it is full, while the BlockingQueue in my version, posted to via “add”, will throw an IllegalStateException in the same situation. So this is the first obvious flaw in the code from the article and its translation: if the mailbox is full, we either lose updates or we crash. Since I am a great fan of “fail fast” programming, I’d prefer the exception, because it enables me to quickly analyze what went wrong and take countermeasures; I don’t simply lose something without notice. Fortunately the fix for both of these problems is easy: use a blocking “put” that waits for space in the mailbox.
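The difference in failure modes is easy to demonstrate with a capacity-one queue (a quick standalone sketch; the queue size and strings are arbitrary):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FullQueueDemo {
    public static void main(String[] args) {
        BlockingQueue<String> mailbox = new ArrayBlockingQueue<String>(1);
        mailbox.add("first"); // fills the queue

        // offer() silently rejects, like Kilim's putnb
        System.out.println(mailbox.offer("second")); // prints: false

        // add() fails fast when the queue is full
        try {
            mailbox.add("second");
        } catch (IllegalStateException e) {
            System.out.println("add threw: " + e.getMessage()); // prints: add threw: Queue full
        }
        // mailbox.put("second") would simply block until space is freed
    }
}
```

Three methods, three failure modes: silent loss, a loud exception, or waiting, and only the last two are acceptable in my book.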

There is a second bug, which is not that obvious. It’s also shared by the original code and the translation. Just follow the program: the DeferredDivision posts a Calculation, waits a second and takes a calculation from the mailbox. If the calculation has been solved it prints the result, else it just starts over. It reads well, it looks good and it’s wrong. Imagine this situation:
The DeferredDivision posts a Calculation, but before the Calculator can take it from the mailbox (because it’s still busy), the DeferredDivision wakes up again, takes the unsolved calculation from the mailbox, and discards it. Now the Calculator wakes up, can’t find that calculation and blocks until the DeferredDivision posts another one. As a result the calculation has been lost.
Fortunately the second problem can also be fixed: one could use a blocking “take”, so the program doesn’t progress until there is a result in the mailbox/queue. Or one could repost the unsolved calculation and progress. Still, the second solution may reintroduce the first problem: if you use a putnb (non-blocking) with Kilim, you could lose the new and reposted calculations as soon as the mailbox is filled (so you’d need to evaluate the return value of the putnb call), while in my alternative implementation a non-blocking “add” would throw an IllegalStateException indicating that the queue is full (an “offer” would silently return false, just like putnb). So you’d better use a blocking “put” here as well. As a result the worst-case scenario would be a constantly filled queue and threads waiting to do something, which is annoying but normal under heavy load.
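Putting both fixes together might look like this (a sketch of one possible repair, not the article’s code, with a condensed stand-in for the Calculation class): blocking put() so nothing is dropped when the mailbox is full, and reposting unsolved calculations so no work is lost when the producer grabs its own message back.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FixedCooperation {
    // Minimal stand-in for the article's Calculation class.
    static class Calculation {
        final BigDecimal dividend, divisor;
        BigDecimal answer;
        Calculation(BigDecimal dividend, BigDecimal divisor) {
            this.dividend = dividend;
            this.divisor = divisor;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final BlockingQueue<Calculation> mailbox =
            new ArrayBlockingQueue<Calculation>(300);

        Thread calculator = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        Calculation calc = mailbox.take();  // blocks until work arrives
                        if (calc.answer == null) {
                            calc.answer = calc.dividend.divide(calc.divisor, new MathContext(8));
                        }
                        mailbox.put(calc);                  // blocks when full, never drops
                    }
                } catch (InterruptedException e) {
                    // interrupted: exit the actor loop
                }
            }
        });
        calculator.start();

        mailbox.put(new Calculation(new BigDecimal("1"), new BigDecimal("3")));
        while (true) {
            Calculation c = mailbox.take();                 // blocks until something is there
            if (c.answer != null) {                         // solved: print and stop
                System.out.println(c.dividend + " / " + c.divisor + " = " + c.answer);
                break;
            }
            mailbox.put(c);                                 // unsolved: repost and try again
        }
        calculator.interrupt();
    }
}
```

Running it prints “1 / 3 = 0.33333333” and terminates; under heavy load the worst case is threads blocked on a full queue, exactly the scenario described above.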

So what is the take-home message of this blog? Without the need for any actor abstraction framework I have recreated this example with nearly the same code. And, as I’ve shown, the actor model has not saved the author of the article from introducing a lost-update problem on two occasions in just about 20 lines of code. In fact it is equally easy to recreate the spaghetti-eating (dining) philosophers problem with actors (the forks being messages in the shared mailboxes). So does that actor framework really make our lives as programmers simpler?
And what’s left besides lightweight threads? What are lightweight threads really worth anyway, facing multicore chips like AMD’s and Intel’s current consumer lines or even Sun/Oracle’s UltraSPARC T2 chips, which eat entire thread pools for breakfast? Could they even hurt performance, given that these CPUs base their internal scheduling partly on the assumption that threads are independent instruction streams?
To be honest, I am not convinced that this framework solves any problem for me, I’m terribly sorry. Or is it just that I’m missing something important here?

  1. Tronje Krop
    August 28, 2011 at 16:01

    Actually you really missed an important point: Kilim just uses a single thread for execution. This might not seem very important for two actors, but if you have thousands of them, your alternative solution will not work.

    • August 28, 2011 at 19:20

      Tronje, I don’t think I’m missing the point here; probably you were missing mine. As I have written, Kilim is using “lightweight threads”, which is just another flavor of green threads, continuations, etc., or generally speaking having a single hardware thread manage multiple software threads. As I said elsewhere: Kilim tries to re-implement something that operating systems have been tuned for over the years: thread scheduling (or “fibers” in Kilim’s terms). The Java runtime may already use so-called “green threads”, which is really the same thing and exists for compatibility with platforms that do not support hardware threads. Still, modern JRE implementations chose to use system threads on platforms where they are available, and they had pretty good reasons for that. So I’m not really missing the point; it’s rather that I’m questioning (see the last paragraph) whether it is really useful to do that.
      As an example:
      A SPARC T3-4 server (4 sockets with one 16-core UltraSPARC T3 each) can process up to 512 threads simultaneously. When you consider that most of those threads are fighting over the same cache, RAM, CPU, storage and network devices, you can imagine that a lot of them will actually sit there waiting for data. Generally speaking (though it depends on your application’s profile), it is pretty efficient to run anywhere from 150% to 200% of the number of threads that a CPU can manage simultaneously for a heavily loaded application. Looking at the T3-4 server, that would be up to 1024 software threads, or even more if threads are likely to sit idle.
      Even “traditional” and cheaper x86 servers, like the 4-socket, 10-core-Xeon-equipped Dell PowerEdge R910, can process up to 80 threads simultaneously, which means that you can easily throw up to 160 heavily busy threads at it at any time, and even more if most threads are idle. That is still pretty impressive.
      To get a glimpse of how many threads the Windows operating system can handle, there is an interesting read on Mark Russinovich’s blog [1]. Those numbers are far larger than anything common hardware can manage simultaneously.

      Now what the hardware knows better than any software framework is when to switch busy threads. Machines like the T3 or the Itanium processor manage to switch contexts very quickly [2,3]. For example, they switch threads to hide cache misses, which the Java runtime knows nothing about, or eliminate the cost by doing a round robin over the threads in flight, like the T3. So since context switches in hardware have become almost free, the slowest thing about hardware threads is creating them. To avoid that thread-creation overhead, application servers push their work units into thread pools. Now the actor model can be looked at as a large network of long-running threads that wait, process and repeat, so thread creation is hardly a problem. And what else is Kilim than a giant thread pool for actors? So knowing that operating systems can easily handle thousands of threads, and that modern servers can handle hundreds of them simultaneously and switch between them very quickly, what’s the point in “pooling” them when, thanks to the actor model, you have hardly any thread-creation overhead? I’ve shown that the actor model from the programmer’s perspective does not require this framework either. So which problem is it trying to solve?
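      A quick experiment supports that claim (the count of 1000 is an arbitrary choice for this sketch): plain OS threads in Java scale to the thousands without any framework, each one idling cheaply, much like an actor blocked on its mailbox.

```java
import java.util.concurrent.CountDownLatch;

public class ManyThreadsDemo {
    public static void main(String[] args) throws InterruptedException {
        final int actors = 1000;                       // arbitrary "actor" count
        final CountDownLatch start = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(actors);
        for (int i = 0; i < actors; i++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        start.await();                 // idle, like a blocked mailbox read
                        done.countDown();
                    } catch (InterruptedException e) {
                        // interrupted: just exit
                    }
                }
            }).start();
        }
        start.countDown();                             // wake all "actors" at once
        done.await();
        System.out.println(actors + " threads ran to completion");
    }
}
```

      On any reasonably current JVM this starts, parks and finishes a thousand system threads without drama, which is exactly my point about the operating system's scheduler.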

      Apart from that, I find it pretty arguable whether you really need “thousands” of actors in common software. Even huge application servers are usually better off scaling across multiple machines, for stability and security reasons alone. So you’d probably be better off sacrificing some performance and going for a full-blown clustering framework, be it JBoss’ built-in solution, a framework like GigaSpaces or whatever else you might prefer, and running that on appropriate hardware.

      [1] http://blogs.technet.com/b/markrussinovich/archive/2009/07/08/3261309.aspx
      [2] http://staff.science.uva.nl/~jesshope/Downloads/Niagara2.pdf
      [3] http://www.dig64.org/about/Itanium2_white_paper_public.pdf

