Remedial Scala: XML

January 21st, 2009

Tonight I finally got to the XML chapter of Programming in Scala. I was excited because I had seen a few examples on the web and wanted to try something that wasn’t JDOM or JAXB or whatever. So, I read the chapter and it was … short. The authors even admit in the conclusion that it barely scratches the surface, and I have to agree :)

Anyway, my exercise for Scala’s XML support was adding an XML encoding of rules to rooscaloo. My plan is to eventually write a parser with parser combinators, but that’s not until chapter 31.

Overall, Scala’s XML support is a dream compared to anything I’ve done in Java before. I began by creating some case classes for my AST:

  case class RuleAST(name : String,
                     fireCode : String, unfireCode : String,
                     conditions : ConditionAST*)

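  // Base type for conditions; a case class needs a parameter list and
  // should not be extended by other case classes, so use a sealed class.
  sealed abstract class ConditionAST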

  case class ObjectConditionAST(typeName : String,
                                binding : String,
                                testCode : String) extends ConditionAST

  case class NotAST(children : ConditionAST*) extends ConditionAST

Really simple stuff. Here’s what a rule looks like in XML:

    <rule name="bob-is-not-a-cat-lover">
      <if>
        <object type="Person" binding="p"> == "bob" </object>
        <not>
            <object type="Cat" binding="c"> c.owner == p </object>
        </not>
      </if>
      <then>println("bob does not fancy cats")</then>
      <unfire>println("bob now owns a cat!")</unfire>
    </rule>

(I couldn’t think of a better name for rule retraction than unfire…)

And here’s the satisfyingly compact code to convert an XML node to a RuleAST object:

  def parseRule(node: scala.xml.Node) : RuleAST =
    RuleAST((node \ "@name").text,
            (node \ "then").text,
            (node \ "unfire").text,
            parseConditionList(node \ "if" \ "_") : _*)

  def parseConditionList(xml : scala.xml.NodeSeq) : Seq[ConditionAST] =
    for(e <- xml)
      yield e match {
        case <object>{ _* }</object> => parseObjectCondition(e)
        case <not>{ _* }</not> => parseNot(e)
      }

  def parseObjectCondition(xml : scala.xml.Node) : ObjectConditionAST =
    ObjectConditionAST((xml \ "@type").text, (xml \ "@binding").text, xml.text)

  def parseNot(xml : scala.xml.Node) : NotAST =
    NotAST(parseConditionList(xml \ "_") : _*)

This could actually be much more compact, but I chose to break it up into methods so I could test each construct individually. I’m fairly certain there’s an even better way to do this, but this represents 45 minutes of work including unit tests, so I’m still pretty happy. One thing I discovered that wasn’t covered in the book was the use of the “_” wildcard to get all sub-elements of a node. For example, above I use:

   node \ "if" \ "_"

to get the list of all child elements of the <if> node.
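
To make the wildcard a bit more concrete, here is a minimal sketch. It assumes the scala-xml library is on the classpath (it ships separately from the standard library in newer Scala versions), and the sample document and the WildcardDemo name are made up for illustration:

```scala
import scala.xml.{Elem, NodeSeq}

object WildcardDemo {
  // Hypothetical sample rule, shaped like the rule XML earlier in the post.
  val rule: Elem =
    <rule name="demo">
      <if>
        <object type="Person" binding="p">p.age == 30</object>
        <not>
          <object type="Cat" binding="c">c.owner == p</object>
        </not>
      </if>
      <then>println("fired")</then>
    </rule>

  // \ looks only at direct children; "_" matches any element name,
  // so whitespace text nodes between the elements are skipped.
  val conditions: NodeSeq = rule \ "if" \ "_"

  // \\ searches all descendants, so it also finds the nested <object>.
  val allObjects: NodeSeq = rule \\ "object"

  def main(args: Array[String]): Unit = {
    println(conditions.map(_.label).mkString(", "))
    println(allObjects.map(n => (n \ "@binding").text).mkString(", "))
  }
}
```

The difference between \ and \\ matters here: the wildcard with a single backslash stops at the top-level conditions, which is what lets parseNot recurse into the nested ones on its own.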

The match statement in parseConditionList is a little irritating to me and I feel like there’s probably a better way to do that.
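
One possible alternative, sketched here rather than tested against rooscaloo, is to match on the element’s label instead of using XML patterns. The Cond, Obj and Not types below are simplified stand-ins for the AST classes above, and trimming the test code is my own assumption:

```scala
import scala.xml.NodeSeq

object LabelDispatch {
  // Simplified stand-ins for the AST classes in the post.
  sealed trait Cond
  case class Obj(typeName: String, binding: String, test: String) extends Cond
  case class Not(children: Cond*) extends Cond

  // Dispatch on the element name; unknown elements fail loudly instead of
  // surfacing later as a MatchError.
  def parseConditions(xml: NodeSeq): Seq[Cond] =
    xml.map { e =>
      e.label match {
        case "object" =>
          Obj((e \ "@type").text, (e \ "@binding").text, e.text.trim)
        case "not" =>
          Not(parseConditions(e \ "_"): _*)
        case other =>
          sys.error("unknown condition element: " + other)
      }
    }
}
```

Whether this reads better than the XML patterns is a matter of taste; the main gain is an explicit error for elements the parser does not know about.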

Just like the book, I’ve obviously only started to scratch the surface of Scala’s XML support. Since I also just read the chapter on extractors, it seems like it might be interesting to define extractors to automatically convert XML to an AST. Of course, there’s also XPath, transforms, etc.
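
As a rough idea of what the extractor approach might look like, here is a small sketch. ObjectXml, ObjectCond and toCond are hypothetical names chosen for this example, and it again assumes scala-xml is available:

```scala
import scala.xml.Node

object ExtractorSketch {
  // Stand-in for ObjectConditionAST.
  case class ObjectCond(typeName: String, binding: String, testCode: String)

  // Hypothetical extractor: succeeds only on <object> elements and
  // exposes the two attributes plus the (trimmed) body text.
  object ObjectXml {
    def unapply(n: Node): Option[(String, String, String)] =
      if (n.label == "object")
        Some(((n \ "@type").text, (n \ "@binding").text, n.text.trim))
      else None
  }

  // With the extractor in place, the conversion is an ordinary pattern match.
  def toCond(n: Node): Option[ObjectCond] = n match {
    case ObjectXml(tpe, binding, code) => Some(ObjectCond(tpe, binding, code))
    case _                             => None
  }
}
```

The attraction is that the attribute lookups live in one place, and the code that builds the AST no longer mentions XML at all.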

Maybe tomorrow night. It’s time to sleep.

p.s. here are some more Scala XML resources I’ve found useful:
