Posts Tagged ‘c++’

Porting Soar to Java or: How I Learned to Stop Worrying and Love Spaghetti (Part 1)

November 25th, 2008 No comments

I have recently just finished up a port of the Soar kernel from C/C++ to Java. The existing implementation is about 40,000 lines of C and C++ code accumulated over more than 15 years of development, mostly by grad students at the University of Michigan.   For such a long span of time, that’s not really that much code, but unlike a lot of “modern” object-oriented systems, every line does something important. There’s very little getter/setter boilerplate.

I won’t get into why I wanted to port it to Java here. That’s another story. Instead focus on the software process.

From the start, I had some basic principles to follow. First, this was to be a port, not a reimplementation from scratch. Although there is a manual, wiki, and forthcoming book on the Soar architecture, there is nothing that I would call a specification for the language or run-time. That is to say, as is so often the case, the code is the spec. And there’s enough code of a sufficiently intricate nature that there’s minimal hope of generating a spec for Soar in any reasonable amount of time. It’s intractable. Besides, I wanted to use jsoar in the near future and with two children under two, I’m not spending my spare time writing a spec.

Second, because there is already a community of researchers working on the kernel, I wanted the port to be as lossless as possible. If the entire structure of the kernel changes out from under them, the chances of adoption are low.

Third, I wanted to actually test the Soar kernel. Historically, testing on the kernel has been minimal aside from manually running a few test cases to make sure nothing crashes. Since no one wants to write unit tests for a big legacy system, I might as well write some while I’m porting.

Finally, I wanted to be fairly conservative with refactoring during the port. In taking a bottom up approach to the port, I never knew when a  refactoring choice would come back to haunt me.  Furthermore, one argument for Java is that it’s easy (or easier) to refactor, so I might as well refactor at the end when I have a full view of the system. I’ll post more on how this has turned out later.

So, from these principles, I started out from Rivendell for Mordor. The upshot of the first tenet is kind of the funniest. My approach to the port came down to the following procedure:

  1. Open a cpp file, say rete.cpp
  2. Gasp at the majesty of a 9000 line cpp file
  3. Create class Rete in Java
  4. One by one, paste functions from C++ into Java, changing pointer arrows to dots, ec, etc

Union Busting

That last step was a doozy. The first few days were essentially building up data structures in Java and required the most actual though. The Soar kernel makes heavy use of the old tagged union pattern. This generally maps pretty nicely to a type hierarchy with polymorphism, but there were a few stumbling blocks… In particular, there are a few Soar data structures that are “transmogrified” at run-time, i.e. their type is changed. Nice. In these cases, I settled for an “expanded union” where all of the unused data fields were null. This is the closest I could get in Java without some major refactoring of the code… and since I was taking a mostly bottom up approach to the port, I was never sure when a refactoring decision would bite me at some later point.

Here’s an example of such a beast. You can see my assertions desperately trying to ensure that the “union” was indeed behaving like a union.

Function Pointers

The rete algorithm implementation in Soar makes heavy use of tables of function pointers. Since Java lacks function pointers, I took the fairly tedious route of defining an interface and implementing it once for each function pointer.  Basically, I ended up with a bunch of inner classes, one for each function pointer. This worked fine.

The interesting part came when the port was finished and I started performance testing. It turns out that method calls from a nested class to a containing class are actually pretty slow when called in an inner loop due to an additional level of method calls generated by the compiler. So, in the end, for many function pointer tables, I ended up reverting to simple switch statements which gave me something like a 20% speedup on some tests.


Enumerations are one area where I really did some refactoring. I kept the names the same, but I almost always converted enum or #define constant lists to Java enumerations. The ordinal() method provides access to the original integer value, if needed, but I found that with EnumSet and EnumMap, this was rarely necessary. The improved typesafety and overall debuggability didn’t hurt either.


Another obvious area of difficulty is what to do with macros, especially “clever” macros. For example, here’s the insert_at_head_of_dll() macro from the C Soar kernel:

/* This macro cannot be easily converted to an inline function.
   Some additional changes are required.
#define insert_at_head_of_dll(header,item,next_field_name,prev_field_name) { \
  ((item)->next_field_name) = (header) ; \
  ((item)->prev_field_name) = NIL ; \
  if (header) ((header)->prev_field_name) = (item) ; \
  (header) = (item) ; }

That’s fun, isn’t it?  I kept the comment (ca. 2003) for extra effect. This is a “generic” procedure for inserting an item at the head of a linked list.  You’ve got the list header, the item and then two special parameters. What do they do? They name the next and previous fields in the item structure used for the linked list. Why not fix the names so you can just define a C++ template? Or pass in the address of the next and previous links? There are probably a few reasons. One would be performance. When this code was written (early 90s) they were squeezing every bit of performance they could out of the code. More interesting though is why the names aren’t fixed. That’s easily answered with a look at the working memory element (WME) structure from the kernel:

typedef struct wme_struct {
  /* WARNING:  The next three fields (id,attr,value) MUST be consecutive--
     the rete code relies on this! */
  Symbol *id;
  Symbol *attr;
  Symbol *value;
  Bool acceptable;
  unsigned long timetag;
  unsigned long reference_count;
  struct wme_struct *rete_next, *rete_prev; /* used for dll of wmes in rete */
  struct right_mem_struct *right_mems;      /* used for dll of rm's it's in */
  struct token_struct *tokens;              /* dll of tokens in rete */
  struct wme_struct *next, *prev;           /* (see above) */
  struct preference_struct *preference;     /* pref. supporting it, or NIL */
  struct output_link_struct *output_link;   /* for top-state output commands */
  tc_number grounds_tc;                     /* for chunker use only */
  tc_number potentials_tc, locals_tc;
  struct preference_struct *chunker_bt_pref;

  /* REW: begin 09.15.96 */
  struct gds_struct *gds;
  struct wme_struct *gds_next, *gds_prev; /* used for dll of wmes in gds */
  /* REW: end   09.15.96 */

} wme;

Just this one datastructure is potentially part of 1, 2, 3 linked lists (see the X_next and X_prev fields?) and holds the head for 1, 2, 3, 4 linked lists. So, how can this be improved? If I can’t improve it in C, I doubt I can implement it in Java. Well, the next and previous pointers can at least be pushed down into a structure:

template <typename T> struct list_member
   T next;
   T prev;

which will allow us to do something like this:

typedef struct wme_struct {
  // ... snip ...
  list_member<struct wme_struct*> rete_next_prev; /* used for dll of wmes in rete */
  struct right_mem_struct *right_mems;      /* used for dll of rm's it's in */
  struct token_struct *tokens;              /* dll of tokens in rete */
  list_member<struct wme_struct> next_prev;           /* (see above) */

  // ... snip ...

  struct wme_struct *gds_next, *gds_prev; /* used for dll of wmes in gds */
} wme;

Now we have actual object we can work with and write a single insert_at_head_of_dll() function, not macro. And it will be typesafe… but, we’re porting to Java here, and my number one goal is to port this thing, and my number two goal is to avoid manually expanding that linked list management code over and over again in Java.

My solution is encapsulated in two generic classes, ListHead and AsListItem. The former represents the head of a list, the latter an entry in the linked list equivalent to list_member<T> above. AsListItem came about when I was starting to play around with Ruby and it’s approach to mixin classes (like “acts_as_taggable”). It seemed clever at the time, but now it just seems stupid.

Anyway, these classes allowed objects to be a part of multiple lists at a time while eliminating the need for duplicate code, but there was one more issue…

As it turns out, the objects that use these classes the most also happen to be the most frequently created, short-lived objects in Soar, in particular rete WMEs and tokens. ListHead and AsListMember object were being created like crazy, reducing performance. So over the course of a few days, I slowly reverted the most performance-critical lists to raw linked list manipulation code, just duplicating the list maintenance. D’oh. It improved performance by around 40% though, so I can’t complain too much.

This seems like enough for now. In a later installment, I’ll talk about testing, getting Java to run as fast as C, and why refactoring the Soar kernel still seems hard even after its been moved to Java.


Here are some random links about porting C/C++ to Java:

  • Jazillian – this guy’s put some serious thought into the issue. Plus, he occasionally stirs the pot on the ANTLR mailing list which is always fun.
  • Point Legacy C++ code Java: Tales from a Trading Desk – the answer is here. The point about functional tests is important. Lucky for me, some functional tests already existed for Soar and I had help from a Soar “guru”
Categories: c++, java, soar, software engineering Tags: , , ,