December 29, 2016

Adding microbenchmarking to your build process

Introduction

As an industry, we are adopting more transparent and more predictable build processes in order to reduce the risks in building software.  One of the core principles of Continuous Delivery is to gather feedback via feedback loops.  At Dev9, we have adopted a "first to know" principle that aligns with this CD principle: we (the dev team) want to be the first to know when there is a failure, a degradation of performance or any result not consistent with the business objectives.

Maven and other build tools have provided developers a standardized tool and ecosystem in which to establish and communicate feedback.  While unit tests, functional tests, build acceptance tests, database migration, performance testing and code analysis tools have become mainstays in a development pipeline, benchmarking has largely remained outside of the process.  This could be due to the lack of open-source, low-cost tooling or lightweight libraries that add minimal complexity.

Existing tools often compound complexity by requiring an outside tool to be integrated with the runtime artifact, and the tests are frequently not saved in the same source repository, or even stored in a source repository at all.  Local developers cannot run the benchmarks without significant effort, so the tests lose their value quickly.  Adding to these problems, benchmarking is not typically taught in classes and is often implemented without the isolation required to gather credible results.  This makes blogs or posts about benchmark results a ripe target for trolls.

With all that said, it is still very important to put some sort of benchmark coverage around critical areas of your codebase.  Building up historical knowledge about critical sections of code can help influence optimization efforts, inform the team about technical debt, alert when a performance-threshold change has been committed and compare previous or new versions of algorithms.  The question should now be: how do I find and easily add benchmarking to my new or existing project?  In this blog, we will focus on Java projects (1.7+).  The sample code will utilize Maven, though Gradle works very similarly.  I make a few recommendations throughout the blog and they are based on experience from past projects.

Introducing JMH

There are many strong choices when looking to benchmark Java-based code, but most of them have drawbacks that include license fees, additional tooling, bytecode manipulation and/or Java agents, tests outlined in non-Java code and highly complex configuration settings.  I like to have tests as close to the code under test as possible to reduce brittleness, improve cohesion and reduce coupling.  I consider most of the benchmarking solutions I have previously used to be too cumbersome to work with: the code to run the tests is either not isolated enough (literally integrated in the code) or contained in a secondary solution far from the source.

The purpose of this blog is to demonstrate how to add a lightweight benchmarking tool to your build pipeline, so I will not go into detail about how to use JMH; the following blogs are excellent sources to learn:


Benchmarking Modes

There are a small number of items I want to point out with respect to the modes and scoring, as they play an important role in how the base configuration is set up.  At a basic level, JMH has two main types of measurement: throughput and time-based.

Throughput Measuring

Throughput is the number of operations that can be completed per unit of time.  JMH maintains a collection of successful and failed operations as the framework increases the amount of load on the test.  Note: ensure the method under test is well isolated and that dependencies like test-object creation are handled outside of the method, in a pre-test setup method.  With throughput, the higher the value the better, as it indicates that more operations can be run per unit of time.
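To make this concrete, here is a minimal sketch of a throughput benchmark; the class name and values are illustrative, not taken from the sample project:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class ConcatenationBenchmark {

    private String left;
    private String right;

    // test-object creation happens here, outside of the measured method
    @Setup
    public void setup() {
        left = "hello";
        right = "world";
    }

    // JMH reports how many times this method body completes per second;
    // returning the result keeps the JIT from eliminating the work
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.SECONDS)
    public String concatenate() {
        return left + right;
    }
}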

Time-Based Measuring

Time-based measuring is the counterpart to throughput.  The goal of time-based measuring is to identify how long a particular operation takes to run.

AverageTime
The most common time-based measurement is "AverageTime", which calculates the average time of the operation.  JMH will also produce a "Score Error" to help determine confidence in the produced score.  The "Score Error" is typically 1/2 of the confidence interval and indicates how widely the results deviated from the average time.  The lower the result the better, as it indicates a lower average time to run per operation.

SampleTime
SampleTime is similar to AverageTime, but JMH attempts to push more load and look for failures, producing a matrix of failure percentages.  As with AverageTime, lower numbers are better, and the percentages are useful for determining how comfortable you are with failures as throughput and run length increase.

SingleShotTime
The last and least commonly used mode is SingleShotTime.  This mode is literally a single run and can be useful for cold-testing a method or testing your tests.  SingleShotTime could be useful if passed in as a parameter when running benchmarking tests, reducing the time required to run them (though this diminishes the value of the tests and may make them deadweight).  As with the rest of the time-based measurements, the lower the value the better.
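The mode (or modes) for a method is selected with the @BenchmarkMode annotation.  Here is a minimal sketch showing the three time-based modes applied to a single method (the method itself is illustrative):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;

public class TimeBasedBenchmark {

    // AverageTime reports mean time per operation, SampleTime reports a
    // percentile distribution and SingleShotTime runs the method once, cold
    @Benchmark
    @BenchmarkMode({Mode.AverageTime, Mode.SampleTime, Mode.SingleShotTime})
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public double parseDouble() {
        return Double.parseDouble("123.456");
    }
}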

Adding JMH to a Java Project

Goal: This section will show how to create a repeatable harness that allows new tests to be added with minimal overhead or duplication of code.  Note, the dependencies are in the "test" scope to avoid JMH being added to the final artifact.  I have created a GitHub repository that uses JMH while working on Protobuf alternative to REST for Microservices.  The code can be found here: https://github.com/mike-ensor/protobuf-serialization

1) Start by adding the dependencies to the project:
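(The original gist is not reproduced here; the snippet below is a minimal sketch assuming the standard org.openjdk.jmh coordinates, with the version left as a property to fill in.)

<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>${jmh.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <!-- annotation processor that generates the benchmark infrastructure -->
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>${jmh.version}</version>
    <scope>test</scope>
</dependency>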


2) JMH recommends that benchmark tests and the artifact be packaged in the same uber jar.  There are several ways to build an uber jar: explicitly, using the "shade" plugin for Maven, or implicitly, using Spring Boot, Dropwizard or another framework with similar results.  For the purposes of this blog post, I have used a Spring Boot application.

3) Add a test harness with a main entry class and global configuration.  In this step, create an entry point in the test area of your project (indicated with #1).  The intention is to avoid benchmarking code being packaged with the main artifact.



3.1) Add the BenchmarkBase file (indicated above with #2).  This file will serve as the entry point for the benchmark tests and contains all of the global configuration for the tests.  The class I have written looks for a "benchmark.properties" file containing configuration properties (indicated above in #3).  JMH has an option to output file results and this configuration is set up for JSON.  The results are used in conjunction with your continuous integration tool and can (should) be stored for historical usage.
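A sketch of what such an entry point can look like follows; the property names are illustrative assumptions, but the JMH Runner, OptionsBuilder and JSON result format are the pieces the harness relies on:

import java.io.InputStream;
import java.util.Properties;

import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkBase {

    public static void main(String[] args) throws Exception {
        // global configuration lives in benchmark.properties on the test classpath
        Properties props = new Properties();
        try (InputStream in = BenchmarkBase.class.getResourceAsStream("/benchmark.properties")) {
            if (in != null) {
                props.load(in);
            }
        }

        Options options = new OptionsBuilder()
                // run every class whose name matches the include pattern
                .include(props.getProperty("benchmark.include", ".*Benchmark.*"))
                .forks(Integer.parseInt(props.getProperty("benchmark.forks", "1")))
                // JSON output feeds the CI tool and can be stored for history
                .resultFormat(ResultFormatType.JSON)
                .result(props.getProperty("benchmark.results", "benchmark-results.json"))
                .build();

        new Runner(options).run();
    }
}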

This code segment is the base harness and entry point into the benchmark process run by Maven (set up in step 5 below).  At this point, the project should be able to run a benchmark test, so let's add a test case.

4) Create a class to benchmark an operation.  Keep in mind, benchmark tests will run against the entirety of the method body, including logging, file reading, external resources, etc.  Be aware of what you want to benchmark, and reduce or remove dependencies in order to isolate your subject code and ensure higher confidence in the results.  In this example, the configuration set up during step 3.1 controls how the benchmarks are run and where the results are written.
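The gist itself is not embedded here; below is a hedged sketch in the same spirit, measuring Jackson serialization of the Recipe model (the Recipe constructor is an illustrative assumption):

import com.fasterxml.jackson.databind.ObjectMapper;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class JsonSerializationBenchmark {

    private ObjectMapper mapper;
    private Recipe recipe;

    // expensive setup (mapper construction, object-graph creation) is kept
    // out of the measured method so only the conversion cost is captured
    @Setup
    public void setup() {
        mapper = new ObjectMapper();
        recipe = new Recipe("Pancakes");
    }

    @Benchmark
    public String serializeToJson() throws Exception {
        return mapper.writeValueAsString(recipe);
    }
}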

Caption:  A sample benchmark test case in the spirit of the one extracted from Protobuf Serialization

All of your *Benchmark*.java test classes will now run when you execute the test jar, but this is often not ideal: the process is not segregated, and having some control over when and how the benchmarks are run is important to keeping build times down.  Let's build a Maven profile to control when the benchmarks are run and, potentially, start the application.  Note, I have included the Maven integration-test start/stop of the application server in this blog post purely for demonstration.  I would caution against starting or stopping the application server, as you might be incurring the costs of resource fetching (REST calls), which would not be very isolated.

5) The concept is to create a Maven profile to run all of the benchmark tests in isolation (i.e., no unit or functional tests).  This allows the benchmark tests to be run in parallel with the rest of the build pipeline.  Note that the code uses the "exec" plugin and runs the uber jar while pointing at the fully qualified name of the main class.  Additionally, the execution classpath is limited to the "test" scope to avoid putting benchmark code into the final artifacts.
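A profile along these lines might look like the following sketch (plugin version omitted; the main-class name matches the harness from step 3 and is an assumption of this sketch):

<profile>
    <id>benchmark</id>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>run-benchmarks</id>
                        <phase>integration-test</phase>
                        <goals>
                            <goal>java</goal>
                        </goals>
                        <configuration>
                            <!-- limit the classpath to "test" so benchmark code stays out of the final artifact -->
                            <classpathScope>test</classpathScope>
                            <mainClass>com.example.benchmark.BenchmarkBase</mainClass>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</profile>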

This code segment shows an example Maven profile to run just the benchmark tests.

6) The last, optional item is to create a runnable build step in your Continuous Integration build pipeline.  In order to run your benchmark tests in isolation, you or your CI server can run:
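Assuming the profile is named "benchmark" as in the sketch from step 5, the invocation would be along the lines of (-DskipTests keeps unit tests out of this run):

mvn clean verify -Pbenchmark -DskipTests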


Conclusion

If you are using a Java-based project, JMH is relatively easy to add to your project and pipeline.  The benefits of a historical ledger relating to critical areas of your project can be very useful in keeping the quality bar high.  Adding JMH to your pipeline also adheres to the Continuous Delivery principles, including feedback loops, automation, repeatability and continuous improvement.  Consider adding a JMH harness and a few tests to the critical areas of your solution.

December 26, 2016

Protobuf alternative to REST for Microservices

Introduction

A few months ago a colleague and long-time friend of mine published an intriguing blog on a few of the less-discussed costs associated with implementing microservices.  The blog post made several important points on performance when designing and consuming microservices.  There is an overhead to using a remote service beyond the obvious network latency due to routing and distance.  The blog describes how there is a cost attributed to serialization of JSON, and therefore a microservice should do meaningful work to overcome the costs of serialization.  While this is a generally accepted guideline for microservices, it is often overlooked, and a concrete reminder helps to illustrate the point.  The second point of interest is the cost associated with the bandwidth size of JSON-based RESTful API responses.  One potential pitfall of having a more substantive endpoint is that the payload of a response can degrade performance and quickly consume thread pools and overload the network.

These two main points made me think about alternatives, and I decided to create an experiment to see if there were benefits from using Google Protocol Buffers (aka "Protobuf" for short) over JSON in RESTful API calls.  I set out to show this by first highlighting performance differences between converting JSON into POJOs using Jackson versus converting Protobuf messages into and out of a data model.  I decided to create a sufficiently complex data model that utilized nested objects, lists and primitives while trying to keep the model simple to understand; therefore I ended up with a Recipe domain model that I would probably not use in a serious cooking application, but that serves the purpose of this experiment.

Test #1:  Measure Costs of Serialization and Deserialization

The first challenge I encountered was how to work effectively with Protobuf messages.  After spending some time reading through sparse documentation that focused on elementary demonstrations of Protobuf messages, I finally decided on a method for converting messages into and out of my domain model.  These statements about using Protobufs are opinionated, and someone who uses them often may disagree, but my experience was not smooth and I found messages to be rigid and more difficult to work with than I expected.

The second challenge came when I wanted to measure the "performance" of both marshaling JSON and serializing Protobufs.  I spent some time learning JMH and designed a plan for how to test both methods.  Using JMH, I designed a series of tests that allowed me to populate my POJO model, then constructed a method that converted into and out of each of the technologies.  I isolated the conversion of the objects in order to capture just the costs associated with conversion.

Test #1: Results

My results were not surprising, as I expected Protobuf to be more efficient.  I measured the average time to marshal an object into JSON at 876.754 ns/operation (±43.222 ns) versus 148.160 ns/operation (±6.922 ns) for Protobuf, showing that converting an equivalent object into Protobuf was nearly 6 times faster than into JSON.

Reversing JSON and Protobuf messages into a POJO yielded slower results that were closer together, but Protobuf still outperformed JSON un-marshaling.  Converting a JSON string into the domain object took on average 2037.075 ns/operation (±121.997 ns), while converting a Protobuf message to an object took on average 844.382 ns/operation (±41.852 ns), nearly 2.4 times faster than JSON.

[Chart: JSON vs. Protobuf serialize/deserialize times, in microseconds]


Run the samples yourself using the GitHub project created for this post: https://github.com/mike-ensor/protobuf-serialization


Test #2: Bandwidth differences

I did not find a straightforward way to capture bandwidth using traditional Java-based tools, so I decided to set up a service on AWS and communicate with the API using JSON and Protobuf requests.  I then captured the traffic using Wireshark and calculated the total number of bytes sent for these requests.  I included the headers and payload in the calculation, since both JSON and Protobuf require Accept and Content-Type mime-type headers.

Test #2: Results

The total size of the JSON request was 789 bytes versus 518 bytes for Protobuf.  While the JSON request was roughly 50% greater in size than the Protobuf request, no optimization had been applied to either request.  The JSON was minified but not compressed.  Using compression can be detrimental to the overall performance of the solution depending on the payload size: if the payload is too small, the cost of compressing and decompressing will overcome the benefits of a smaller payload.  This is very similar to the costs associated with marshaling JSON for small payloads, as found in Jeremy's blog.


Conclusion

After completing a project to help determine the overall benefits of using Protobuf over JSON, I have come to the conclusion that Protobuf is a legitimate option for increasing the performance associated with message passing, but only when performance is absolutely critical and the development team's maturity level is high enough to understand the high costs of working with Protobufs.  And those costs are very real: developers lose access to human-readable messages, which are often useful during debugging.  Additionally, Protobufs are messages, not objects, and therefore come with more structure and rigor, which I found complicated by the inflexibility of using only primitives and enums; updating messages also requires the developer to mark new fields as "optional" for backwards compatibility.  Lastly, there is limited documentation on Protocol Buffers beyond the basic "hello world" applications.

February 2, 2014

AES-256 Encryption with Java and JCEKS

Overview

Security has become a great topic of discussion in the last few years, due in part to the release of documents by Edward Snowden and the explosion of hacking against online commerce stores like JC Penny, Sony and Target. While this post will not give you all of the tools to prevent the use of illegally sourced data, it will provide a starting point for building a set of tools and tactics that help prevent the use of your data by other parties.

This post will show how to adopt AES encryption for strings in a Java environment. It will cover creating AES keys and storing them in a keystore in the JCEKS format. A working example of the code in this blog is located at https://github.com/mike-ensor/aes-256-encryption-utility

It is recommended to read each section in order, because each section builds off of the previous one; however, you might want to quickly jump to a particular section.

  • Setup - Setup and create keys with keytool
  • Encrypt - Encrypt messages using byte[] keys
  • Decrypt - Decrypt messages using same IV and key from encryption
  • Obtain Keys from Keystore - Obtain keys from keystore via an alias

What is JCEKS?

JCEKS stands for Java Cryptography Extension KeyStore and it is an alternative keystore format for the Java platform. Storing keys in a KeyStore can be a measure to prevent your encryption keys from being exposed. Java KeyStores securely contain individual certificates and keys that can be referenced by an alias for use in a Java program. Java KeyStores are often created using the "keytool" provided with the Java JDK.

NOTE: It is strongly recommended to create a complex passcode for KeyStores to keep the contents secure. The KeyStore is a file that is considered to be public, but it is advisable to not give easy access to the file.

Setup

All encryption is governed by the laws of each country, which often restrict the strength of the encryption. One example is that in the United States, all encryption over 128-bit is restricted if the data travels outside of the border. By default, the Java JCE implements a strength policy to comply with these rules. If stronger encryption is preferred, and it adheres to the laws of the country, then the JCE needs to be given access to the stronger encryption policy. Very plainly put, if you are planning on using AES 256-bit encryption, you must install the Unlimited Strength Jurisdiction Policy Files. Without the policies in place, 256-bit encryption is not possible.

Installation of JCE Unlimited Strength Policy

This post focuses on the keys rather than the installation and setup of the JCE. The installation is rather simple, with explicit instructions found here (NOTE: these are for JDK 7; if using a different JDK, search for the appropriate JCE policy files).

Keystore Setup

When using the keytool, manipulating a keystore is simple. A keystore is created either together with a new key or during an import of an existing keystore. In order to create a new key and keystore, simply type:

keytool -genseckey -keystore aes-keystore.jck -storetype jceks -storepass mystorepass -keyalg AES -keysize 256 -alias jceksaes -keypass mykeypass 

Important Flags

Here are explanations for the keytool parameters used in the example above:

Keystore Parameters

genseckey
Generates a SecretKey. This flag indicates the creation of a symmetric key, which will become our AES key
keystore
Location of the keystore. If the keystore does not exist, the tool will create a new store. Paths can be relative or absolute but must be local
storetype
The type of store (JCEKS, PKCS12, etc.). JCEKS is used to store symmetric keys (such as AES) not contained within a certificate
storepass
Password for the keystore. It is highly recommended to create a strong passphrase for the keystore

Key Parameters

keyalg
Algorithm used to create the key (AES, DES, etc.)
keysize
Size of the key (128, 192, 256, etc.)
alias
Alias used to reference the newly created key
keypass
Password protecting the use of the key

Encrypt

As it pertains to data in Java, and at the most basic level, encryption is an algorithmic process used to programmatically obfuscate data in a reversible way, where both parties have information pertaining to the data and to how the algorithm is used. In Java encryption, this involves the use of a Cipher. A Cipher object in the JCE is a generic entry point into the encryption provider, typically selected by the algorithm. This example uses the default Java provider, but would also work with Bouncy Castle.

Generating a Cipher object

Obtaining an instance of Cipher is rather easy and the same process is required for both encryption and decryption. (NOTE: Encryption and Decryption require the same algorithm but do not require the same object instance)

Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");

Once we have an instance of the Cipher, we can encrypt and decrypt data according to the algorithm. Often the algorithm will require additional pieces of information in order to encrypt/decrypt data. In this example, we will need to pass the algorithm the bytes containing the key and an initialization vector (explained below).

Initialization

In order to use the Cipher, we must first initialize it. This step is necessary so we can provide additional information to the algorithm, like the AES key and the initialization vector (aka IV).

cipher.init(Cipher.ENCRYPT_MODE, secretKeySpecification, initialVector);

Parameters

The SecretKeySpec (referenced here by the variable secretKeySpecification) is an object containing a reference to the bytes forming the AES key. The AES key is nothing more than a byte array of a specific size (256 bits, or 32 bytes, for AES-256) that is generated by the keytool (see above).

Alternative Parameters

There are multiple methods to create keys, such as hashing a salt, username and password (or similar). Such a method would utilize, for example, a SHA-1 hash of the concatenated strings, convert it to bytes and then truncate the result to the desired size. This post will not show the generation of a key using this method, or the use of a PBE key derived from a password and salt. The password and/or salt usage for the keys is handled by the keytool using the inputs provided during the creation of new keys.

Initialization Vector

The AES algorithm (in CBC mode) also requires a second parameter called the initialization vector (IV). The IV is used to randomize the encrypted message, so that encrypting the same message twice does not produce the same ciphertext. The IV is considered a publicly shareable piece of information, but again, it is not recommended to openly share it (for example, it wouldn't be wise to post it on your company's website). When encrypting a message, it is not uncommon to prepend the message with the IV, since the IV will be a set/known size based on the algorithm. NOTE: the AES algorithm will output the same result if using the same IV, key and message. It is recommended that the IV be randomly created for each encryption.
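A minimal sketch of creating a random IV with SecureRandom (16 bytes matches the AES block size; the variable name matches the initialization call above):

import java.security.SecureRandom;
import javax.crypto.spec.IvParameterSpec;

// generate a fresh, random 16-byte IV (the AES block size) for each encryption
SecureRandom random = new SecureRandom();
byte[] ivBytes = new byte[16];
random.nextBytes(ivBytes);
IvParameterSpec initialVector = new IvParameterSpec(ivBytes);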

With the newly initialized Cipher, encrypting a message is simple:

byte[] encryptedMessageInBytes = cipher.doFinal(message.getBytes("UTF-8"));
String base64EncodedEncryptedMsg = BaseEncoding.base64().encode(encryptedMessageInBytes);
String base32EncodedEncryptedMsg = BaseEncoding.base32().encode(encryptedMessageInBytes);

Encoding Results

Byte arrays are difficult to visualize since they often do not form characters in any charset. The best recommendation is to represent the bytes in HEX (base-16), Base32 or Base64 format. If the message will be passed via a URL or POST parameter, be sure to use a web-safe Base64 encoding. The Google Guava library provides an excellent BaseEncoding utility. NOTE: Remember to decode the encoded message before decrypting.

Decrypt

Decrypting a message is almost the reverse of the encryption process, with a few exceptions. Unlike encryption, where a random IV can be generated, decryption requires the exact IV that was used during encryption to be passed in as a parameter.

Decryption

When decrypting, obtain a cipher object with the same process used for encryption. The Cipher object will need to utilize the exact same algorithm, including the mode and padding selections. Once the code has obtained a reference to a Cipher object, the next step is to initialize the cipher for decryption and pass in a reference to the key and the initialization vector.

// key is the same byte[] key used in encryption
SecretKeySpec secretKeySpecification = new SecretKeySpec(key, "AES");
cipher.init(Cipher.DECRYPT_MODE, secretKeySpecification, initialVector);

NOTE: The key is stored in the keystore and obtained by the use of an alias. See below for details on obtaining keys from a keystore.

Once the cipher has been given the key and IV, and initialized for decryption, it is ready to perform the decryption.

byte[] encryptedTextBytes = BaseEncoding.base64().decode(message);
byte[] decryptedTextBytes = cipher.doFinal(encryptedTextBytes);
String origMessage = new String(decryptedTextBytes, "UTF-8");

Strategies to keep IV

The IV used to encrypt the message is essential to decrypting it, so a question is raised: how do the IV and the message stay together? One solution is to Base-encode the IV (see above) and prepend it to the encrypted and encoded message: Base64UrlSafe(myIv) + delimiter + Base64UrlSafe(encryptedMessage). Other possible solutions might be contextual, such as including one attribute in an XML file for the IV and another for the alias of the key used.
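As a sketch of the prepend-with-delimiter approach (reusing Guava's BaseEncoding from above; the ":" delimiter is an arbitrary choice that cannot appear in base64url output):

// package: urlsafe(iv) + delimiter + urlsafe(ciphertext)
String packaged = BaseEncoding.base64Url().encode(ivBytes)
        + ":" + BaseEncoding.base64Url().encode(encryptedMessageInBytes);

// unpackage: split once on the delimiter, then decode each piece before decrypting
String[] parts = packaged.split(":", 2);
byte[] iv = BaseEncoding.base64Url().decode(parts[0]);
byte[] ciphertext = BaseEncoding.base64Url().decode(parts[1]);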

Obtain Key from Keystore

The beginning of this post showed how easy it is to create new AES-256 keys referenced by an alias inside of a keystore database. The post then continued with how to encrypt and decrypt a message given a key, but has not yet shown how to obtain a reference to a key in a keystore.

Solution

// for clarity, ignoring exceptions and failures
InputStream keystoreStream = new FileInputStream(keystoreLocation);

KeyStore keystore = KeyStore.getInstance("JCEKS");
keystore.load(keystoreStream, keystorePass.toCharArray());

if (!keystore.containsAlias(alias)) {
    throw new RuntimeException("Alias for key not found");
}

Key key = keystore.getKey(alias, keyPass.toCharArray());

Parameters

keystoreLocation
String - Path to the local keystore file
keystorePass
String - Password for the keystore itself, set with -storepass when creating the keystore with keytool (see above)
keyPass
String - Password protecting the use of the key, set with -keypass when creating the key with keytool (see above)
alias
String - Alias used when creating the new key with keytool (see above)

Conclusion

This post has shown how to encrypt and decrypt string-based messages using the AES-256 encryption algorithm. The keys used to encrypt and decrypt these messages are held inside of a JCEKS-formatted KeyStore database created using the JDK-provided "keytool" utility. The examples in this post should be considered a solid start to encrypting/decrypting with symmetric keys such as AES. They should not be considered the only line of defense; key rotation, for example, is a method to mitigate risks in the event of a data breach. If an intruder obtains data and manages to hack a single key, the data contained in multiple files should have been encrypted using several keys, thus reducing the risk of total exposure.

All of the examples in this blog post have been condensed into a simple tool allowing for the viewing of keys inside of a keystore, an operation that is not supported out of the box by the JDK keytool. Each aspect of the steps and topics outlined in this post is available at: https://github.com/mike-ensor/aes-256-encryption-utility. NOTE: The examples, sample code and any references are to be used at the sole implementer's risk; there is no implied warranty or liability, and you assume all risks.

January 22, 2013

How to Publish Maven Site Docs to BitBucket or GitHub Pages

Introduction

In this post we will utilize GitHub's and/or BitBucket's static web page hosting capabilities to publish our project's Maven 3 site documentation. Each of the two SCM providers offers a slightly different solution to host static pages. The approach spelled out in this post would also be a viable solution to "back up" your site documentation in a supported SCM like Git or SVN. This solution does not directly cover site documentation deployment handled by the maven-site-plugin and the Wagon library (scp, WebDAV or FTP).

There is one main project hosted on GitHub that contains the full solution. The project URL is https://github.com/mike-ensor/clickconcepts-master-pom/. The POM has been pushed to Maven Central and will continue to be updated and maintained.

<parent>
    <groupId>com.clickconcepts.project</groupId>
    <artifactId>master-site-pom</artifactId>
    <version>0.16</version>
</parent>

GitHub Pages

GitHub hosts static pages by using a special branch, "gh-pages", available to each GitHub project. This special branch can host any HTML and local resources like JavaScript, images and CSS. No server-side code is executed.

To navigate to your static pages, the URL structure is as follows:

http://<some-username>.github.com/<project-name>

An example from the project I am using in this blog post: http://mike-ensor.github.com/clickconcepts-master-pom/, where the first URL segment is the username and the second is the project name.

GitHub also allows you to create a base static hosted site for your username by creating a repository named <username>.github.com. The contents would be all of your HTML and associated static resources. Unlike the BitBucket solution, this is not required in order to post documentation for your project.

There is a GitHub Site plugin that publishes site documentation via GitHub's object API, but it is outside the scope of this blog post because it does not provide a single solution for both GitHub and BitBucket projects using Maven 3.

BitBucket

BitBucket provides a similar service to GitHub in that it hosts static HTML pages and their associated static resources. However, there is one large difference in how those pages are stored. Unlike GitHub, BitBucket requires you to create a new repository with a name fitting the convention below. The files live on the master branch, and each project will need to be a directory off of the root.

mikeensor.bitbucket.org/
    some-project/
        index.html
        ...
        css/
        img/
    some-other-project/
        index.html
        ...
        css/
        img/
    index.html
    .git
    .gitignore

The naming convention is as follows:

<username>.bitbucket.org
An example of a BitBucket static pages repository for me would be: http://mikeensor.bitbucket.org/. The structure does not require that you create an index.html page at the root of the repository, but it would be advisable to do so in order to avoid 404s.

Generating Site Documentation

Maven provides the ability to publish documentation for your project via the maven-site-plugin. This plugin can be difficult to use because its many configuration options are oftentimes not well documented. There are many blog posts that can help you write your documentation, including my post on Maven site documentation. I did not mention there how to use "xdoc", "apt" or other templating technologies to create documentation pages, but never fear, I have provided examples in my GitHub project.

Putting it all Together

The Maven SCM Publish Plugin (http://maven.apache.org/plugins/maven-scm-publish-plugin/) publishes site documentation to a supported SCM. In our case, we are going to use Git through BitBucket or GitHub. The plugin does allow you to publish multi-module site documentation through various properties, but the scope of this blog post is limited to single/mono-module projects, as the multi-module process is a bit painful.

Take a moment to look at the POM file located in the clickconcepts-master-pom project. This master POM is rather comprehensive, and site documentation is only one portion of it, but we will focus on that portion. There are a few things to point out here: first, the scm-publish plugin, and the idiosyncrasies encountered when implementing the plugin.

In order to create the site documentation, the "site" plugin must first be run. This is accomplished by running site:site. The plugin will generate the documentation into the "target/site" folder by default.

The SCM Publish Plugin, by default, looks for the site documents in "target/staging"; this location is controlled by the content parameter. As you can see, there is a mismatch between folders. NOTE: My first approach was to run the site:stage command, which is supposed to put the site documents into the "target/staging" folder. This is not entirely correct: the site plugin combines this with the distributionManagement.site.url property to stage the documents, but the behavior is very strange and not well documented.

In order to get the site plugin's output and the SCM Publish Plugin's expected location to match up, use the content property and set it to the location of the site plugin's output (<siteOutputDirectory>).
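A trimmed sketch of that wiring follows (content, scmBranch and pubScmUrl are the plugin's documented configuration parameters; the property names follow the master POM's conventions shown below):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-scm-publish-plugin</artifactId>
    <configuration>
        <!-- point the plugin at the site plugin's output instead of target/staging -->
        <content>${scm-publish.siteDocOuputDirectory}</content>
        <scmBranch>${scm-publish.scmBranch}</scmBranch>
        <pubScmUrl>${scm-publish.pubScmUrl}</pubScmUrl>
    </configuration>
</plugin>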

If you are using GitHub, no modification to the siteOutputDirectory is needed; however, if you are using BitBucket, you will need to modify the property to add a directory layer into the site documentation generation (see above for the differences between GitHub and BitBucket pages). The second property tells the SCM Publish Plugin to look at the root "site" folder, so that when the files are copied into the repository, the project folder becomes the containing folder. The properties look like:

<siteOutputDirectory>${project.build.directory}/site/${project.artifactId}</siteOutputDirectory>
<scm-publish.siteDocOuputDirectory>${project.build.directory}/site</scm-publish.siteDocOuputDirectory>

Next we will take a look at the custom properties defined in the master POM and used by the SCM Publish Plugin above. Each project using the master POM will need to define several properties that are used by the plugins during site publishing. Fill in the variables with your own settings.

BitBucket

<!-- Override Site Documentation SCM publishing parameters -->
<properties>
...
...
<scm-publish.scmBranch>master</scm-publish.scmBranch>
<scm-publish.pubScmUrl>scm:git:git@bitbucket.org:mikeensor/mikeensor.bitbucket.org.git</scm-publish.pubScmUrl>

<!-- Location of where "site" documentation is output; This is for BitBucket only!!! -->
<siteOutputDirectory>${project.build.directory}/site/${project.artifactId}</siteOutputDirectory>
<scm-publish.siteDocOuputDirectory>${project.build.directory}/site</scm-publish.siteDocOuputDirectory>

<!-- Overwrite from Parent Pom  -->
<changelog.fileUri>${changelog.bitbucket.fileUri}</changelog.fileUri>
<changelog.revision.fileUri>${changelog.revision.bitbucket.fileUri}</changelog.revision.fileUri>
...
...
</properties>

GitHub

<!-- Override Site Documentation SCM publishing parameters -->
<properties>
...
...
<scm-publish.scmBranch>gh-pages</scm-publish.scmBranch>
<scm-publish.pubScmUrl>scm:git:git@github.com:mikeensor/clickconcepts-master-pom.git</scm-publish.pubScmUrl>

<!-- Location of where "site" documentation is output -->
<scm-publish.siteDocOuputDirectory>${project.build.directory}/site</scm-publish.siteDocOuputDirectory>

<!-- Overwrite from Parent Pom  -->
<changelog.fileUri>${changelog.github.fileUri}</changelog.fileUri>
<changelog.revision.fileUri>${changelog.revision.github.fileUri}</changelog.revision.fileUri>
...
...
</properties>
NOTE: The changelog parameters are required in order to use the master POM; they are not directly related to publishing site docs to GitHub or BitBucket.

How to Generate

If you are using the master POM (or have abstracted out the Site Plugin and the SCM Publish Plugin), then generating and publishing the documentation is simple:

mvn clean site:site scm-publish:publish-scm
mvn clean site:site scm-publish:publish-scm -Dscmpublish.dryRun=true

Gotchas

In the SCM Publish Plugin documentation's "tips", the authors recommend creating a location in which to check out the repository so that the repo is not cloned each time. There is a risk here: if a git repository already exists in the folder, the plugin will overwrite it with the new site documentation. I discovered this by publishing two different projects and having my root repository wiped out by documentation from the second project. There are ways to mitigate this by adding another folder layer, but make sure you test often!

Another gotcha is to use -Dscmpublish.dryRun=true to test out the site documentation process without making the SCM commit and push.

Project and Documentation URLs

Here is a list of the fully working projects used to create this blog post:

November 19, 2012

How to test a Custom Exception using custom FEST assertions

Introduction

This is part three of my series of posts on assertion testing using FEST, JUnit and custom exceptions. The first post covered the basics of assertions, and the follow-up covered testing custom exceptions using JUnit and JUnit's ExpectedException class. At this point you should have a custom runtime exception class and want to test it using the fluent API provided by FEST.

Custom Assertions with FEST

This blog post will not go into the details of creating a custom assertion, but the solution posted in the GitHub project does contain one. In addition, please refer to the official FEST site docs.

Building off of the last post, you will see that ExpectedException is a great improvement over the @Test(expected) and try/catch techniques; however, the ExpectedException object can still be improved by adding a fluent-style API backed by the FEST Assertion project. So, how do you do it? Let's get right to the solution!

Expected Exceptions with FEST and JUnit @Rule

Now that we have an understanding of FEST assertions, JUnit's @Rule functionality and ClickConcept's @ExpectedFailure, we can combine the first two to provide fluent-style expected-exception behavior while testing the assertion class itself using the @ExpectedFailure annotation.

Testing your custom exception with FEST

Let's begin by creating a new @Rule object, ExpectedCustomException, built on a base class that implements TestRule. When creating the class, we expose construction through a simple factory method that returns a new instance. The default none() factory returns a base implementation whose functionality is muted in all cases where exceptions are not expected.

We can start with the code, but I should explain that in order to build your own custom fluent API for FEST, you must re-create the API for the base exception assertion. The fluent API you create will be in addition to the FEST exception assertion class. Help with the fluent API was derived from several blogs, the most informative being http://www.unquietcode.com/blog/2011/programming/using-generics-to-build-fluent-apis-in-java/.

NOTE: AbstractExpectedException encapsulates the base API for FEST's ExceptionAssertion. The code for this is found at the GitHub site: https://github.com/mike-ensor/fest-backed-expected-exception


public class ExpectedCustomException extends AbstractExpectedException<ExpectedCustomException> {

    private Integer code;

    public static ExpectedCustomException none() {
        return new ExpectedCustomException();
    }

    /**
     * Checks to see if the CustomException has the specified code
     *
     * @param code int
     * @return AbstractExpectedException
     */
    public AbstractExpectedException hasCode(int code) {
        // method telling class that a custom exception is being asked for
        markExpectedException();
        this.code = code;
        return this;
    }

    @Override
    protected void checkAssertions(Exception e) {
        // check parent's exceptions
        super.checkAssertions(e);

        if (getCode() != null) {
            // FEST Custom Assert object
            CustomExceptionAssert.assertThat(e).hasCode(code);
        }
    }

    private Integer getCode() {
        return code;
    }

}

Analysis

In this example, my CustomException exposes a "code" stored when the exception was created. In order to test this, my custom ExpectedException object must look for the proper code on the CustomException object, in a fluent manner.

Here is an example test case showing how to use your new fluent-API custom exception test. Take note of the third test case to see the fluent API in use! (NOTE: Full test cases are available in my GitHub project.)

public class CustomExceptionTest {

    @Rule
    public ExpectedCustomException exception =
            ExpectedCustomException.none();

    @Rule
    public ExpectedTestFailureWatcher expectedTestFailureWatcher =
            ExpectedTestFailureWatcher.instance();

    @Test
    public void hasCode_worksAsExpected() {
        exception.hasCode(123);
        throw new CustomException("Message", 123);
    }

    @Test
    @ExpectedFailure
    public void getCode_fails() {
        exception.hasCode(456);
        throw new CustomException("Message", 123);
    }

    @Test
    @ExpectedFailure
    public void getMessageAndCode_codeFailsFirst() {
        exception.hasCode(456).hasMessage("Message");
        throw new CustomException("Message", 123);
    }

}

Summary

Thank you to those of you who have read through this little series covering assertions, how to test exceptions (both the exception flow and custom exceptions) and, finally, testing your custom exceptions using FEST assertions. Please come back to my blog in the near future, where I will have a REST API checklist to look over when architecting your next REST API.

Those who are reading this blog are most likely a small subset of the software development community, but if you are not, and you find the idea of a fluent API really cool (as I do), please check out the FEST assertion library. If you are new to test-driven development, please take up the practice and try applying it to your code immediately. If all developers used TDD as a general practice, the level of quality would grow worldwide!

September 25, 2012

Testing Custom Exceptions w/ JUnit's ExpectedException and @Rule


Exception Testing

Why test exception flows? Just like with all of your code, test coverage writes a contract between your code and the business functionality that the code is supposed to produce, leaving you with living documentation of the code along with the added ability to stress the functionality early and often. I won't go into the many benefits of testing; instead I will focus on just exception testing.

There are many ways to test an exception flow thrown from a piece of code. Let's say that you have a guarded method that requires an argument to be not null. How would you test that condition? How do you keep JUnit from reporting a failure when the exception is thrown? This blog covers a few different methods, culminating with JUnit's ExpectedException implemented with JUnit's @Rule functionality.


The "old" way

In the not-so-distant past, the process to test an exception required a dense amount of boilerplate code in which you would start a try/catch block, report a failure if your code did not produce the expected behavior, and then catch the exception, looking for the specific type. Here is an example:

public class MyObjTest {

    @Test
    public void getNameWithNullValue() {

        try {
            MyObj obj = new MyObj();
            obj.setName(null);

            fail("This should have thrown an exception");

        } catch (IllegalArgumentException e) {
            assertEquals("Name must not be null", e.getMessage());
        }
    }
}

As you can see from this old example, many of the lines in the test case exist only to make up for the lack of built-in functionality for testing exception handling. One good point to make for the try/catch method is the ability to test the specific message and any custom fields on the expected exception. We will explore this a bit further down with JUnit's ExpectedException and the @Rule annotation.


JUnit adds expected exceptions

JUnit responded to users' need for exception handling by adding a @Test annotation field, "expected". The intention is that the entire test case will pass if the type of exception thrown matches the exception class given in the annotation.

public class MyObjTest {

    @Test(expected = IllegalArgumentException.class)
    public void getNameWithNullValue() {
        MyObj obj = new MyObj();
        obj.setName(null);
    }
}

As you can see from the newer example, there is quite a bit less boilerplate code and the test is very concise; however, there are a few flaws. The main flaw is that the test condition is too broad. Suppose you have two variables in a signature and both cannot be null: how do you know which variable the IllegalArgumentException was thrown for? What happens when you have extended a Throwable and need to check for the presence of a field? Keep these questions in mind as you read further; solutions will follow.


JUnit @Rule and ExpectedException

If you look at the previous example, you might see that you are expecting an IllegalArgumentException to be thrown, but what if you have a custom exception? What if you want to make sure that the message contains a specific error code or message? This is where JUnit really excelled, by providing a @Rule object specifically tailored to exception testing. If you are unfamiliar with JUnit @Rule, read the docs here.


ExpectedException

JUnit provides a class, ExpectedException, intended to be used as a @Rule. ExpectedException allows your test to declare that an exception is expected and gives you some basic built-in functionality to clearly express the expected behavior. Unlike the @Test(expected) annotation feature, the ExpectedException class allows you to test for specific error messages and custom fields via the Hamcrest matchers library.

An example of JUnit's ExpectedException

import org.junit.rules.ExpectedException;

public class MyObjTest {

    @Rule
    public ExpectedException thrown = ExpectedException.none();

    @Test
    public void getNameWithNullValue() {
        thrown.expect(IllegalArgumentException.class);
        thrown.expectMessage("Name must not be null");

        MyObj obj = new MyObj();
        obj.setName(null);
    }
}

As I alluded to above, the framework allows you to test for specific messages, ensuring that the exception being thrown is the one the test is specifically looking for. This is very helpful when the nullability of multiple arguments is in question.


Custom Fields

Arguably the most useful feature of the ExpectedException framework is the ability to use Hamcrest matchers to test your custom/extended exceptions. For example, suppose you have a custom/extended exception that is thrown in a method, and the exception contains an "errorCode" field. How do you test that functionality without introducing the boilerplate code from the try/catch block listed above? How about a custom Matcher!

This code is available at: https://github.com/mike-ensor/custom-exception-testing


Solution: First the test case

import org.junit.rules.ExpectedException;

public class MyObjTest {

    @Rule
    public ExpectedException thrown = ExpectedException.none();

    @Test
    public void someMethodThatThrowsCustomException() {
        thrown.expect(CustomException.class);
        thrown.expect(CustomMatcher.hasCode("110501"));

        MyObj obj = new MyObj();
        obj.methodThatThrowsCustomException();
    }
}

Solution: Custom matcher

import com.thepixlounge.exceptions.CustomException;
import org.hamcrest.Description;
import org.hamcrest.TypeSafeMatcher;

public class CustomMatcher extends TypeSafeMatcher<CustomException> {

    public static CustomMatcher hasCode(String item) {
        return new CustomMatcher(item);
    }

    private String foundErrorCode;
    private final String expectedErrorCode;

    private CustomMatcher(String expectedErrorCode) {
        this.expectedErrorCode = expectedErrorCode;
    }

    @Override
    protected boolean matchesSafely(final CustomException exception) {
        foundErrorCode = exception.getErrorCode();
        return foundErrorCode.equalsIgnoreCase(expectedErrorCode);
    }

    @Override
    public void describeTo(Description description) {
        description.appendValue(foundErrorCode)
                .appendText(" was found instead of ")
                .appendValue(expectedErrorCode);
    }
}

NOTE: Please visit https://github.com/mike-ensor/custom-exception-testing to get a copy of a working Hamcrest Matcher, JUnit @Rule and ExpectedException.

And there you have it: a quick overview of different ways to test exceptions thrown by your code, along with the ability to test for specific messages and fields from within custom exception classes. Please be specific with your test cases and try to target the exact case you have set up for your test. Remember, tests can save you from introducing side-effect bugs!

September 18, 2012

Brief Overview of Java Assertions

What are asserts?

An assertion is a predicate (a true–false statement) placed in a program to indicate that the developer thinks that the predicate is always true at that place. [wikipedia]

Traditional asserts

Testing traditionally started with Java's built-in assert keyword.

An example of the keyword assert:

assert <condition> : <message>;

The assert keyword has a few drawbacks, including stopping test execution at the first failure and encouraging lengthy, hard-to-read assert statements.

Second generation assertions

Along came JUnit's assert framework. Building on the idea behind the assert keyword, JUnit provided developers the ability to be more descriptive in their testing statements.

An example of JUnit's asserts:
// asserts that the condition must be true
assertTrue("This should be true", "abc".equalsIgnoreCase("ABC"));
// asserts that the object must not be null
assertNotNull(new MyObject());
//...
assertFalse(false == true);
//
assertNull(null);
// etc...

While JUnit's asserts offer some improvements in readability and usability over the basic assert keyword, they share some of the same drawbacks: many developers just use the "assertTrue()", "assertEquals()" and "assertFalse()" methods, still producing very cryptic assertion statements.

Third generation assertions

In an effort to guide developers into writing test assertions that are more readable and usable, the Hamcrest library was created, switching the philosophy from many assert functions to just one basic function. The fundamental thought is that the assert is always the same, while the conditions change. Hamcrest was built using the concepts of BDD (behavior-driven development), where the test assertion reads closer to a sentence.

An example of Hamcrest assertions

@Test
public void showoffSomeHamcrestAssertsAndMatchers() {
    // asserts that string "abc" is "ABC" ignoring case
    assertThat("abc", is(equalToIgnoringCase("ABC")));
    assertThat(myObject.getFirst(), is("Mike!"));
    assertThat(myObject.getAddress(), is(notNullValue()));
}

Hamcrest is a great improvement on top of the JUnit framework, providing a flexible and readable testing platform. Hamcrest + JUnit is a comprehensive testing framework, and when combined with Mockito (or another mocking framework) can provide a very descriptive and thorough unit testing solution. One of the drawbacks to using Hamcrest is that, while descriptive, multiple assertions must often be made to ensure that a test case has been covered. Many TDD purists agree that a test case should contain one and only one assertion, but how is this possible with a complex object? (Purists will say refactoring, but oftentimes this is not feasible.)

Fourth generation frameworks

And finally we come to the present with the latest assertion frameworks. The FEST Assertion framework takes off where Hamcrest stopped, providing a fluent-style assertion framework that gives the developer the ability to test objects with a single assertion.

An example of FEST assertions

@Test
public void getAddressOnStreet() {
    List<Address> addresses = addressDAO.liveOnStreet("main");
    assertThat(addresses).hasSize(10).contains(address1, address2);
    assertThat(stringObj).hasSize(7).isEqualToIgnoringCase("abcdefg");
}

As you can see, FEST assertions provide a cleaner approach. FEST is very extensible and easy to use.

Conclusion

Assertions are a key tool in a professional developer's toolbox with which to stress-test their code. This post is part of a series culminating in a fluent-style ExpectedException mechanism, backed by FEST, that adds better clarity to your test cases when exceptions are involved. Feel free to read the lead-up to this post: Allowing known failing JUnit tests to pass test cases.

NOTE: This blog provides an overview of assertion frameworks. It should be noted that there are various other second- and third-generation testing frameworks that strive to provide better clarity and usability for testing, including notables such as TestNG and JTest.