Maven code generation – ftw!

Have you ever wanted to generate some of the code you use in a project from a data file?

Recently, I’ve been working on a class that provides configuration options to the application. The options come in from a file or are hard-coded, or simply come from defaults, but it’s nice to be able to access configuration options from a single class with type-safe ‘getters’ for each option. The trouble starts when the set becomes larger than, say, a dozen or so options. At that point the class starts to become a maintenance nightmare. Such classes usually access the property name and type a dozen times or more within the class definition. This is the perfect chance to try out template-based code generation.
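To make that repetition concrete, here is a minimal hand-written sketch with a single option. The class name, the "someSection.someProperty" key format, and the use of java.util.Properties as the source are all assumptions made for illustration, not the real class. Even so, the option's name and type already show up in five places:

```java
import java.util.Properties;

// Hypothetical hand-written configuration class with one Integer option.
// The numbered comments mark each place the option's name or type recurs.
final class HandWrittenConfiguration {
    // 1: the lookup key (the "section.name" format is an assumption)
    static final String SOME_PROPERTY_KEY = "someSection.someProperty";

    // 2: the default value
    static final Integer SOME_PROPERTY_DEFAULT = 1;

    // 3: the field
    private final Integer someProperty;

    HandWrittenConfiguration(Properties source) {
        // 4: reading and parsing, falling back to the default when absent
        String raw = source.getProperty(SOME_PROPERTY_KEY);
        this.someProperty = (raw == null)
                ? SOME_PROPERTY_DEFAULT
                : Integer.valueOf(raw.trim());
    }

    // 5: the type-safe getter
    Integer getSomeProperty() {
        return someProperty;
    }
}
```

Multiply those five spots by a few dozen options and every rename or type change becomes a hunt through the whole file.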

Maven

My first thought was to look for a maven plugin that provided such template-based, data-driven code generation. Hmmm – there is the replacer plugin, but the data source is the maven pom file. Well, ok, but I had in mind a more formal definition of the dictionary. I didn’t really want to bury my properties in some obscure pom file in the middle of a multi-module project. In fact, it would be nice if there existed some template-based file generator that would accept data from multiple sources, and even from multiple types of sources – xml, json, csv, whatever.

I’ve had some limited experience using Apache Velocity for this purpose in a past life. The trouble is my experience with it really was limited. I wasn’t the author of the system – just the consumer – all the work of integrating the Velocity engine into the maven build life cycle had already been done by someone else and I never bothered to see how they did it. A quick discussion with a friend who still worked there enlightened me to the fact that it took a LOT of custom code to get it working right.

Now it seems to me that this is not an uncommon thing to want to do – someone must have worked out all the kinks by now in a nicely integrated solution, right?

Freemarker

I looked at several code generation packages and finally landed on Apache Freemarker (the Apache foundation seems to be the final resting place for a lot of projects – many of which completely overlap in functionality – once their originators tire of them). Freemarker sports a powerful data-driven template language so it was a perfect fit for my needs, but – honestly – the main reason I went with it is the fmpp (freemarker pre-processor) project, which is a command-line front-end for the Freemarker template engine. While I could have used the Freemarker engine the same way – it’s also a jar – the engine requires data to be fed to it through its API, and the exec-maven-plugin is just not that sophisticated. No, I needed a command-line tool that would allow me to specify a data file as a command-line argument to the jar.

fmpp

The fmpp command-line is also very powerful and functionally complete. Additionally, it’s written in Java and, thus, comes packaged in a jar file which can be executed by the exec-maven-plugin’s “java” goal:

<plugins>
  ...
  <plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>exec-maven-plugin</artifactId>
    <version>1.6.0</version>
    <executions>
      <execution>
        <phase>generate-sources</phase>
        <goals>
          <goal>java</goal>
        </goals>
      </execution>
    </executions>
    <configuration>
      <includePluginDependencies>true</includePluginDependencies>
      <mainClass>fmpp.tools.CommandLine</mainClass>
      <sourceRoot>${basedir}/target/generated-sources/</sourceRoot>
      <arguments>
        <argument>-C</argument>
        <argument>${basedir}/src/main/templates/config.fmpp</argument>
        <argument>-S</argument>
        <argument>${basedir}/src/main/templates/com/</argument>
        <argument>-O</argument>
        <argument>${basedir}/target/generated-sources/com/</argument>
      </arguments>
    </configuration>
    <dependencies>
      <dependency>
        <groupId>net.sourceforge.fmpp</groupId>
        <artifactId>fmpp</artifactId>
        <version>0.9.15</version>
      </dependency>
      <dependency>
        <groupId>org.freemarker</groupId>
        <artifactId>freemarker</artifactId>
        <version>2.3.23</version>
      </dependency>
    </dependencies>
  </plugin>
  ...
</plugins>

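Three aspects of this snippet deserve attention: the “phase” element in the execution, the “includePluginDependencies” flag, and the explicit freemarker dependency: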

  1. I’m sure it’s obvious to every other maven user out there, but it’s never been obvious to me how you get the exec plugin to run as part of your build. There appear to be several ways of doing this, but the most succinct, and simplest, in my opinion, is to specify a life cycle phase in an execution. In my case, I wanted to generate source code to be used in the build process, so the “generate-sources” phase seemed appropriate.
  2. Being able to add a “dependencies” section to a plugin made sense to me. What did not make sense was having to explicitly tell the plugin to add those dependencies to the plugin’s class path. I mean – what would be the point of adding dependencies to the exec plugin unless you wanted those dependencies to be accessible by the plugin when you told it to run something?! Well, this is apparently not obvious to the authors of the plugin, because unless you set the “includePluginDependencies” tag to true, your attempts to execute any java code in the fmpp package will be in vain.
  3. Finally, I needed to upgrade the Freemarker engine to a newer version than the one that last shipped with fmpp. Adding that as a separate dependency allowed my specified version to override the one defined in the fmpp pom.

The fmpp command-line is driven by a combination of command-line arguments and a configuration file. You may use all command-line arguments, or all configuration file options, or any mix of the two. I chose to use a mix. My configuration file is minimal, just specifying the data source and where to find it, as well as what to do with the file names during translation:

removeExtensions: ftl
dataRoot: .
data: {
  properties: json(data.json, UTF-8)
}

The first line tells fmpp to remove the ftl extension from the templates as it generates source files. A template file might be named Configuration.java.ftl, while the corresponding generated source file would be Configuration.java.

The second line says where to look for the data file, relative to the configuration file (they’re in the same directory).

The third option – “data:” – tells fmpp where to look for data sources. I have one data source, a json file named data.json, and it uses UTF-8 text encoding. The term “properties” is the name of the root-level variable under which templates see the data loaded from the json file. Because the top level of the file is an array, templates can iterate over it as a list. Here’s an example of some json data:

[
  {
    "type": "Integer",
    "section": "someSection",
    "name": "someProperty",
    "defaultValue": "1"
  },
  ...
]

With this data set, you can generate a java class from a template like this:

package com.example;

import ...

/**
 * NOTE: Generated code !! DO NOT EDIT !!
 * Reference: http://freemarker.org
 * See template: ${pp.sourceFile}
 */
public class Configuration implements Cloneable {
<#list properties as property>
    private final ${property.type} ${property.name};
</#list>
...

The possibilities are endless, really. The Freemarker template language is very powerful, allowing you to transform the input data in any way you can think of.
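As a concrete illustration, for the single sample entry in the json data above, the list loop in that template would emit exactly one field declaration. The sketch below shows roughly what the generated class could look like. Only the class declaration and the field come from the template shown here; the constructor and getter are hypothetical, since the rest of the template is elided. The package line, imports and header comment are left out to keep it short:

```java
// Sketch of generated output for the one sample entry in data.json.
// The field line is what "private final ${property.type} ${property.name};"
// expands to. The constructor and getter are hypothetical additions.
// Declared package-private here so the sketch compiles in any file.
class Configuration implements Cloneable {
    private final Integer someProperty;

    Configuration(Integer someProperty) {
        this.someProperty = someProperty;
    }

    Integer getSomeProperty() {
        return someProperty;
    }
}
```

Adding a second entry to the json file would add a second field (and, in this sketch, a second constructor argument and getter) without anyone touching the Java source by hand.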

IntelliJ

I use an IDE – I love my IDE – it does half my work for me. I remember a math teacher from middle school who was upset at the thought of tools that might cause you to forget some of the basics. But, let’s face it, software developers have so much to remember these days, it’s nice when a tool can take some of the load off of us.

One thing we can agree on about maven is that directory structure matters. Maven is all about using reasonable defaults for the 10,000 configuration options in a build, allowing you to override the few that need to change for your special needs. It is, therefore, important to choose wisely where template files go and where generated sources are placed.

Referring back up to the pom file plugin definition, note that the fmpp “-O” command line option specifies where to place generated files. And maven has a place for them – in the target/generated-sources directory. IntelliJ will recognize these files as sources and index them properly. Additionally, since IntelliJ recognizes the generated-sources directory as the proper maven location for generated sources, it will also warn you when you try to edit one of these files.

The fmpp “-S” option specifies where the source templates can be found. These I placed in the src/main/templates directory. Now, the templates subdirectory of src/main is not a standard maven location but, not being able to find a better location, I figured it made sense to place them in a package directory structure that mirrors the one under src/main/java. If you know of a better location for such templates, please feel free to comment.