Wednesday, May 30, 2012

Scala and Avro

It's possible to use Avro in a Protobuf-esque manner: you feed an Avro message schema file (.avsc) to the Avro compiler, which generates Java wrappers for your messages. The easiest way to do this is via the sbt plugin for Avro. Here's how.
  1. Open your project/plugins.sbt file (manually create the folder and file if they don't exist).
  2. Include the Avro plugin as part of your sbt configuration by adding the following lines to that file:
    resolvers += "cavorite" at "http://files.cavorite.com/maven/"
    
    addSbtPlugin("com.cavorite" % "sbt-avro" % "0.1")
    
  3. Restart sbt, or type 'reload'. The plugin should now be ready to use. It is worth reading the plugin's documentation to learn about its default settings.
  4. Next, create a directory 'src/main/avro' to hold your first Avro Schema file (.avsc).
  5. As a test, create a file DummyAvroClass.avsc in that folder with some experimental contents. For example:
    {
        "namespace": "dummy.avro",
        "type": "record",
        "name": "DummyAvroClass",
        "fields": [
            {"name": "header", "type":
                {
                  "namespace": "dummy.avro",
                  "type": "record",
                  "name": "HeaderAvro",
                  "fields": [
                      {"name": "something", "type": "string"},
                      {"name": "somethingElse", "type": "string"}
                  ]
                }
            },
            {"name": "anotherField", "type": "string"}
        ]
    }
    NOTE: as demonstrated above, avsc files require nested definitions: a record used as a field type (here HeaderAvro) is defined inline at the point where it is used. In this post, Doug Cutting gives a nice summary of avsc limitations and more powerful alternatives.
  6. Add the following line to your build.sbt. This library is needed to compile the Avro generated Java classes:
    libraryDependencies += "org.apache.avro" % "avro" % "1.6.3"
    
    Then run the sbt commands 'reload' and 'update'.
  7. (Optional) If you're using IntelliJ and have the sbt IntelliJ plugin installed, now's the time to run 'gen-idea'.
  8. Now you can compile, run or test your project. Of course, you haven't yet made use of Avro - we'll save that for the next post.
  9. (Optional) As an example of how you can change the default Avro plugin settings, let's alter the folder into which the Java classes are generated by adding the following lines to your build.sbt (I sometimes do this so my IDE picks them up which is great for debugging and learning purposes):
    seq( sbtavro.SbtAvro.avroSettings : _*)
    
    // E.g. put the generated source where IntelliJ can see it: 'src/main/java' instead of 'target/src_managed'.
    javaSource in sbtavro.SbtAvro.avroConfig <<= (sourceDirectory in Compile)(_ / "java")
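
Before relying on the generated classes, it can be handy to check that the schema from step 5 actually parses. The following is only a minimal sketch, assuming the Avro library from step 6 is on the classpath. It uses Avro's generic API rather than the generated Java wrappers, the object name SchemaCheck and the field values are made up for illustration, and the schema is inlined as a string so the example stands alone.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}

// Hypothetical helper, not part of the plugin or of Avro itself.
object SchemaCheck {

  // Same contents as DummyAvroClass.avsc from step 5, inlined for the sketch.
  val schemaJson: String =
    """{
      |  "namespace": "dummy.avro",
      |  "type": "record",
      |  "name": "DummyAvroClass",
      |  "fields": [
      |    {"name": "header", "type": {
      |      "namespace": "dummy.avro",
      |      "type": "record",
      |      "name": "HeaderAvro",
      |      "fields": [
      |        {"name": "something", "type": "string"},
      |        {"name": "somethingElse", "type": "string"}
      |      ]
      |    }},
      |    {"name": "anotherField", "type": "string"}
      |  ]
      |}""".stripMargin

  // Parsing throws an exception if the schema is malformed.
  def parseSchema(): Schema = new Schema.Parser().parse(schemaJson)

  // Build a record by field name, without any generated classes.
  def buildRecord(): GenericRecord = {
    val schema = parseSchema()

    // The nested HeaderAvro schema is reachable through the "header" field.
    val header = new GenericData.Record(schema.getField("header").schema())
    header.put("something", "a")      // placeholder values
    header.put("somethingElse", "b")

    val record = new GenericData.Record(schema)
    record.put("header", header)
    record.put("anotherField", "c")
    record
  }

  def main(args: Array[String]): Unit = {
    println(parseSchema().getFullName)
    println(buildRecord())
  }
}
```

If parsing fails here, the plugin's code generation will most likely fail on the same file, so this is a quick way to narrow down schema mistakes.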
    
