Magnolia: JCX Unmarshaller

- Dec 21, 2017 3:15:31 AM

Magnolia JCX is a module which attempts to support JAXB annotations for accessing Jackrabbit repositories. JAXB maps Java classes to XML, and the tree shape of XML is a natural equivalent to the hierarchical data stored in a Jackrabbit repository. It therefore seemed logical to reuse these annotations to extract the data from the repository.

Here's a little video which may provide some insight (~ 18min):


You can add this module using the following maven coordinates:


The old way

The following code snippet shows a typical scenario when retrieving data from a JCR tree:

public class Teaser extends RenderingModelImpl<RenderableDefinition> {

    private String     headline;
    private int        x;
    private int        y;
    private String     ctaLink;
    private boolean    showError;
    private LinkModel  link;

    // just imagine all the getters&setters here

    public String execute() {
        headline  = PropertyUtil.getString  ( getNode(), "headline"  );
        x         = PropertyUtil.getLong    ( getNode(), "x"         );
        y         = PropertyUtil.getLong    ( getNode(), "y"         );
        ctaLink   = PropertyUtil.getString  ( getNode(), "ctaLink"   );
        showError = PropertyUtil.getBoolean ( getNode(), "showError" );
        // the link is stored as a subnode, so it has to be unpacked by hand
        if( getNode().hasNode( "link" ) ) {
            Node linkNode = getNode().getNode( "link" );
            link = new LinkModel();
            link.setTitle ( PropertyUtil.getString( linkNode, "title" ) );
            link.setLink  ( PropertyUtil.getString( linkNode, "ref"   ) );
        }
        return super.execute();
    }

} /* ENDCLASS */

This is a pretty straightforward example, as it only loads the JCR properties of the content node into the class instance. You might think that it would be more sensible to access the content directly within the template. That is true here, but I provide this snippet just as an example. Most of the time you are doing something with the loaded data, so you have to go through a model class anyway.

Apart from that, it's common to use sub-structures (or types) such as the link shown here. Dealing with them solely within the template obfuscates the template and destroys its readability. Nevertheless, extracting the information as shown above still works, so what's the rationale behind JCX?

If you have a close look at the provided example you will notice that it's quite redundant:

  • The class itself provides the information of the fields that should be extracted.
  • This information is duplicated through the assignments within the execute function.


Let's say you're renaming a property. You could rename just the string literal and leave everything else as it was. But this kind of discrepancy is confusing and a common source of errors. So usually you try to keep everything consistent, which means making multiple edits just to rename a single property. And when it comes to sub-structures or sub-types, you are forced to build the structural knowledge into your loading process (in reality I expect that you're experienced enough to use some sort of helper or service to load these sub-structures; this opens up another source of errors, as it requires the corresponding dialog definitions to be consistent). In my experience, a lot of issues are caused by such trivial changes.


The new way

Along comes my JCX module. Using JAXB annotations, you essentially only write the data fields themselves. The JcxUnmarshaller provides the functionality necessary to process any Java class and initialize its JAXB-annotated fields from a standard JCR Node. To keep things as simple as possible, I always use a base class which provides the desired functionality, so here it is:

public abstract class AbstractComponentModel<D extends ConfiguredTemplateDefinition> extends RenderingModelImpl<D> {
  // both fields are infrastructure and must be skipped by the JcxUnmarshaller
  @XmlTransient
  private JcxUnmarshaller                                                          unmarshaller;
  @XmlTransient
  private BiFunction<Node, AbstractComponentModel<D>, AbstractComponentModel<D>>   loader;

  public AbstractComponentModel( Node content, D definition, RenderingModel<?> parent ) {
    super( content, definition, parent );
    unmarshaller = Components.getComponent( JcxUnmarshaller.class );
    // request the loader for the concrete subclass
    loader       = (BiFunction<Node, AbstractComponentModel<D>, AbstractComponentModel<D>>) unmarshaller.createLoader( getClass() );
  }

  public String execute() {
    // fill the annotated fields of this instance from the content node
    loader.apply( getNode(), this );
    return super.execute();
  }
} /* ENDCLASS */

As you can see, this class is pretty simple. Its only fields, unmarshaller and loader, are marked as @XmlTransient, which means that they will be ignored by the JcxUnmarshaller.

The JcxUnmarshaller is a singleton which manages the unmarshalling functions on a per-class basis. This is what happens in the constructor of AbstractComponentModel, which requests a loader for the concrete class in use. The code analysis therefore happens only once per class.
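To illustrate the "analyse once, reuse afterwards" idea, here is a minimal sketch. It is not the JCX implementation: LoaderRegistry, the @Prop annotation and TeaserData are hypothetical names, @Prop merely stands in for a JAXB annotation, and a plain Map stands in for the JCR node so that the example runs without Magnolia or Jackrabbit.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

// Hypothetical stand-in for a JAXB annotation such as @XmlAttribute.
@Retention(RetentionPolicy.RUNTIME)
@interface Prop {}

// Hypothetical registry; NOT the real JcxUnmarshaller.
final class LoaderRegistry {

    // One loader per class, built lazily on the first request.
    private final Map<Class<?>, BiFunction<Map<String, Object>, Object, Object>> loaders =
            new ConcurrentHashMap<>();

    // Counts how often a class was analysed (for demonstration only).
    final AtomicInteger analyses = new AtomicInteger();

    BiFunction<Map<String, Object>, Object, Object> createLoader(Class<?> type) {
        return loaders.computeIfAbsent(type, this::analyse);
    }

    // The reflective analysis runs here, once per class.
    private BiFunction<Map<String, Object>, Object, Object> analyse(Class<?> type) {
        analyses.incrementAndGet();
        List<Field> fields = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Prop.class)) {
                field.setAccessible(true);
                fields.add(field);
            }
        }
        // The returned loader only copies values; it does no further analysis.
        return (node, target) -> {
            for (Field field : fields) {
                Object value = node.get(field.getName());
                if (value != null) {
                    try {
                        field.set(target, value);
                    } catch (IllegalAccessException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
            return target;
        };
    }
}

// Example model: only the annotated fields are loaded.
final class TeaserData {
    @Prop String headline;
    @Prop int x;
    String ignored;
}
```

Requesting a loader for TeaserData twice returns the same cached function, so the reflective part is paid for only on the first request. How JCX actually caches its functions may well differ.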

The JcxUnmarshaller provides the following kind of functions:

  • Loader - Loads the current node into a provided instance.
  • Creator - Creates an initialized instance from a provided node.
  • SubLoader - Loads a subnode into a provided instance.
  • SubCreator - Creates an initialized instance from a subnode.
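The four kinds can be pictured as plain functions that build on each other. The following sketch only illustrates the shapes described in the list. The Loader shape (BiFunction from node and instance to instance) matches the base class shown earlier; everything else, including ToyNode, LinkData, LoaderKinds and the way subnodes are addressed by name, is an assumption made for this example and not the JCX API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

// Toy stand-in for a JCR node: named properties plus named child nodes.
final class ToyNode {
    final Map<String, Object> properties = new HashMap<>();
    final Map<String, ToyNode> children = new HashMap<>();
}

// Hypothetical sub-structure, comparable to the link of the teaser.
final class LinkData {
    String title;
    String ref;
}

final class LoaderKinds {

    // Loader: fills an existing instance from the given node.
    static final BiFunction<ToyNode, LinkData, LinkData> LINK_LOADER = (node, link) -> {
        link.title = (String) node.properties.get("title");
        link.ref = (String) node.properties.get("ref");
        return link;
    };

    // Creator: instantiates first, then delegates to a loader.
    static <T> Function<ToyNode, T> creator(Supplier<T> factory, BiFunction<ToyNode, T, T> loader) {
        return node -> loader.apply(node, factory.get());
    }

    // SubLoader: resolves a named child node, then delegates to a loader.
    // A missing child leaves the instance unchanged.
    static <T> BiFunction<ToyNode, T, T> subLoader(String childName, BiFunction<ToyNode, T, T> loader) {
        return (parent, target) -> {
            ToyNode child = parent.children.get(childName);
            return child == null ? target : loader.apply(child, target);
        };
    }

    // SubCreator: resolves a named child node, then delegates to a creator.
    // A missing child yields null.
    static <T> Function<ToyNode, T> subCreator(String childName, Function<ToyNode, T> creator) {
        return parent -> {
            ToyNode child = parent.children.get(childName);
            return child == null ? null : creator.apply(child);
        };
    }
}
```

With these helpers, `LoaderKinds.subCreator("link", LoaderKinds.creator(LinkData::new, LoaderKinds.LINK_LOADER))` yields a function that turns a teaser node into its LinkData, which is the part the old execute method had to spell out by hand.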


In our example we're creating a simple Loader which is called within the execute method, using the associated content node as the data source. Each model inheriting from this class will automatically make use of this functionality. So let's look at our example class again:

public class Teaser extends AbstractComponentModel<RenderableDefinition> {

  private String      headline;
  private int         x;
  private int         y;
  private String      ctaLink;
  private boolean     showError;
  private LinkModel   link;  
} /* ENDCLASS */

So what happened here? With JCX you essentially only declare the fields and the necessary annotations, which means adding @XmlAttribute or @XmlElement (you can discover more possibilities in the project documentation).

The model is easier to read and easier to change. You don't need to think about the sub-structure of LinkModel, as the class itself provides the JAXB annotations necessary to load its content.


The current state

This module essentially began as an ad-hoc implementation, so there was definitely no planning involved. That said, I'm using its functionality quite intensively and I suspect that most of the bigger issues are gone. Furthermore, you can check out the test code, which covers a wide variety of use cases for structured data. However, you should be aware that this module is still in a prototype state and thus has some flaws that need to be addressed:

  • There is no test suite against the JAXB standard itself that would properly document what is working and, especially, which annotations aren't supported yet.
  • If there's an error during the unmarshalling process, it's not always easy to identify the location which causes trouble, since a lot of @FunctionalInterface types are involved in the process. I will add some contextual state later which makes it possible to deliver helpful information, but for now you need to dig into the code if there's an issue.


The future

The progress of this module mainly depends on my available time, but I've got several work packages in mind to improve it, and I'm pretty sure that these future changes will be very much appreciated:

  • Where there's an Unmarshaller, there should be a Marshaller! I think this is an obvious statement. In the long run this feature would allow importing and exporting XML files which correspond to the data structures, so they are easily readable in contrast to the generic format used by the current XML/YAML import/export.
    • A working XML import/export would imply corresponding support for YAML and JSON, as there are several libraries out there which can do that out of the box.
    • Readable XML files can be used to prepare test cases without much hassle.
  • One of my goals is to get rid of the RenderingModel. Instead of these models I want to use POJOs everywhere, which would greatly improve reusability and testability.
  • In the end it's time to rethink the dialog definitions, as they always need to be in sync with the data extraction code. If we consistently annotated our model classes with JAXB annotations, we would essentially have datatype declarations. There's no reason not to use this information and generate the corresponding dialog definition on the fly. It might be necessary to add some dialog-specific annotations, but having a single source for everything would be worth the effort (at least that's what I believe).


