January 22, 2013

How to Publish Maven Site Docs to BitBucket or GitHub Pages

Introduction

In this post we will use GitHub's and/or BitBucket's static web page hosting capabilities to publish our project's Maven 3 site documentation. Each of the two SCM providers offers a slightly different solution for hosting static pages. The approach spelled out in this post would also be a viable way to "back up" your site documentation in a supported SCM like Git or SVN. This post does not cover the site deployment already provided by the maven-site-plugin and the Wagon library (scp, WebDAV or FTP).

There is one main project hosted on GitHub that I have posted with the full solution. The project URL is https://github.com/mike-ensor/clickconcepts-master-pom/. The POM has been pushed to Maven Central and will continue to be updated and maintained.

<parent>
    <groupId>com.clickconcepts.project</groupId>
    <artifactId>master-site-pom</artifactId>
    <version>0.16</version>
</parent>

GitHub Pages

GitHub hosts static pages by using a special branch "gh-pages" available to each GitHub project. This special branch can host any HTML and local resources like JavaScript, images and CSS. There is no server side development.

To navigate to your static pages, the URL structure is as follows:

http://<some-username>.github.com/<project-name>
An example, using the project from this blog post: http://mike-ensor.github.com/clickconcepts-master-pom/ where mike-ensor is the username and clickconcepts-master-pom is the project name.

GitHub also allows you to create a base static site for your username by creating a repository named <username>.github.com. The contents would be all of your HTML and associated static resources. Unlike the BitBucket solution, this is not required to post documentation for your project.

There is a GitHub Site plugin that publishes site documentation via GitHub's object API, but it is outside the scope of this blog post because it does not provide a single solution for both GitHub and BitBucket projects using Maven 3.

BitBucket

BitBucket provides a similar service to GitHub in that it hosts static HTML pages and their associated static resources. However, there is one large difference in how those pages are stored. Unlike GitHub, BitBucket requires you to create a new repository whose name follows a specific convention. The files are located on the master branch, and each project needs to be a directory off of the root.

mikeensor.bitbucket.org/
     /some-project
      +index.html
      +...
          /css
          /img
     /some-other-project
      +index.html
      +...
          /css
          /img
index.html
.git
.gitignore

The naming convention is as follows:

<username>.bitbucket.org
An example of a BitBucket static pages repository for me would be: http://mikeensor.bitbucket.org/. The structure does not require an index.html page at the root of the repository, but creating one is advisable to avoid 404s.

Generating Site Documentation

Maven provides the ability to post documentation for your project by using the maven-site-plugin. This plugin is difficult to use due to the many configuration options that oftentimes are not well documented. There are many blog posts that can help you write your documentation including my post on maven site documentation. I did not mention how to use "xdoc", "apt" or other templating technologies to create documentation pages, but not to fear, I have provided this in my GitHub project.

Putting it all Together

The Maven SCM Publish plugin (http://maven.apache.org/plugins/maven-scm-publish-plugin/) publishes site documentation to a supported SCM. In our case, we are going to use Git through BitBucket or GitHub. The plugin does allow you to publish multi-module site documentation through its various properties, but that process is a bit painful, so this blog post covers only single-module projects.

Take a moment to look at the POM file located in the clickconcepts-master-pom project. This master POM is rather comprehensive and the site documentation is only one portion of the project, but we will focus on the site documentation. There are a few things to point out here, starting with the scm-publish plugin and the idiosyncrasies of implementing it.

In order to create the site documentation, the "site" plugin must first be run. This is accomplished by running site:site. The plugin will generate the documentation into the "target/site" folder by default.

The SCM Publish Plugin, by default, looks for the site documents in "target/staging"; this location is controlled by the content parameter. As you can see, there is a mismatch between folders. NOTE: My first approach was to run the site:stage command, which is supposed to put the site documents into the "target/staging" folder. This is not entirely correct: the site plugin combines the staging folder with the distributionManagement.site.url property to stage the documents, but the behavior is very strange and not well documented.

In order to get the site plugin's site documents and the SCM Publish's location to match up, use the content property and set that to the location of the Site Plugin output (<siteOutputDirectory>).
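
As a rough sketch of that alignment, the plugin configuration might look like the following. This is an illustration rather than a copy of the master POM: the plugin version is deliberately left out (pin one in a real build), and ${project.reporting.outputDirectory} is used on the assumption that the Site Plugin output location has not been changed from its default of "target/site".

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-scm-publish-plugin</artifactId>
      <!-- Version intentionally omitted in this sketch; pin one in a real build -->
      <configuration>
        <!-- Publish what site:site generated (target/site by default)
             instead of the plugin's default of target/staging -->
        <content>${project.reporting.outputDirectory}</content>
      </configuration>
    </plugin>
  </plugins>
</build>
```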

If you are using GitHub, no modification to siteOutputDirectory is needed. If you are using BitBucket, however, you will need to modify the property to add a directory layer to the generated site documentation (see above for the differences between GitHub and BitBucket pages). The second property tells the SCM Publish Plugin to look at the root "site" folder so that when the files are copied into the repository, the project folder will be the containing folder. The properties will look like:

<siteOutputDirectory>${project.build.directory}/site/${project.artifactId}</siteOutputDirectory>
<scm-publish.siteDocOuputDirectory>${project.build.directory}/site</scm-publish.siteDocOuputDirectory>

Next we will take a look at the custom properties defined in the master POM and used by the SCM Publish Plugin above. Each project that uses the Master POM will need to define several properties, which the plugins read during site publishing. Fill in the variables with your own settings.

BitBucket

<!-- Override Site Documentation SCM publishing parameters -->
<properties>
...
...
<scm-publish.scmBranch>master</scm-publish.scmBranch>
<scm-publish.pubScmUrl>scm:git:git@bitbucket.org:mikeensor/mikeensor.bitbucket.org.git</scm-publish.pubScmUrl>

<!-- Location of where "site" documentation is output; This is for BitBucket only!!! -->
<siteOutputDirectory>${project.build.directory}/site/${project.artifactId}</siteOutputDirectory>
<scm-publish.siteDocOuputDirectory>${project.build.directory}/site</scm-publish.siteDocOuputDirectory>

<!-- Overwrite from Parent Pom  -->
<changelog.fileUri>${changelog.bitbucket.fileUri}</changelog.fileUri>
<changelog.revision.fileUri>${changelog.revision.bitbucket.fileUri}</changelog.revision.fileUri>
...
...
</properties>

GitHub

<!-- Override Site Documentation SCM publishing parameters -->
<properties>
...
...
<scm-publish.scmBranch>gh-pages</scm-publish.scmBranch>
<scm-publish.pubScmUrl>scm:git:git@github.com:mike-ensor/clickconcepts-master-pom.git</scm-publish.pubScmUrl>

<!-- Location of where "site" documentation is output -->
<scm-publish.siteDocOuputDirectory>${project.build.directory}/site</scm-publish.siteDocOuputDirectory>

<!-- Overwrite from Parent Pom  -->
<changelog.fileUri>${changelog.github.fileUri}</changelog.fileUri>
<changelog.revision.fileUri>${changelog.revision.github.fileUri}</changelog.revision.fileUri>
...
...
</properties>
NOTE: changelog parameters are required to use the Master POM and are not directly related to publishing site docs to GitHub or BitBucket
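
For readers who have abstracted the plugins into their own parent POM, the custom properties above could be wired into the SCM Publish Plugin roughly as follows. This is an assumption about the wiring, not a copy of the master POM. The pubScmUrl, scmBranch and content elements are the plugin's own parameters, while the scm-publish.* names are the custom properties shown above.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-scm-publish-plugin</artifactId>
  <!-- Version intentionally omitted in this sketch; pin one in a real build -->
  <configuration>
    <!-- Repository that receives the site documentation -->
    <pubScmUrl>${scm-publish.pubScmUrl}</pubScmUrl>
    <!-- gh-pages for GitHub, master for BitBucket -->
    <scmBranch>${scm-publish.scmBranch}</scmBranch>
    <!-- Folder whose contents are copied into the repository -->
    <content>${scm-publish.siteDocOuputDirectory}</content>
  </configuration>
</plugin>
```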

How to Generate

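If you are using the Master POM (or have abstracted out the Site Plugin and the SCM Publish Plugin), then generating and publishing the documentation is simple. The first command below publishes the site; the second runs the same steps as a dry run.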

mvn clean site:site scm-publish:publish-scm
mvn clean site:site scm-publish:publish-scm -Dscmpublish.dryRun=true

Gotchas

In the SCM Publish Plugin documentation's "tips" they recommend creating a location to place the repository so that the repo is not cloned each time. There is a risk here in that if there is a git repository already in the folder, the plugin will overwrite the repository with the new site documentation. This was discovered by publishing two different projects and having my root repository wiped out by documentation from the second project. There are ways to mitigate this by adding in another folder layer, but make sure you test often!
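
One way to add that extra folder layer is to give each project its own checkout location through the plugin's checkoutDirectory parameter. The sketch below is only an illustration: the folder under ${user.home} is a hypothetical path chosen for this example, and whether it fully protects an existing repository depends on your setup, so verify it against a throwaway repository first.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-scm-publish-plugin</artifactId>
  <configuration>
    <!-- Hypothetical path: a dedicated cache folder with one sub-folder per
         project, so two projects never share the same working copy -->
    <checkoutDirectory>${user.home}/.scm-publish-checkouts/${project.artifactId}</checkoutDirectory>
  </configuration>
</plugin>
```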

Finally, use -Dscmpublish.dryRun=true to test out the site documentation process without making the SCM commit and push.
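
If typing the flag each time becomes tedious, the same setting could be kept in a Maven profile. This is a minimal sketch: the profile id site-dry-run is a made-up name for illustration, and dryRun is the plugin parameter behind the scmpublish.dryRun property.

```xml
<profiles>
  <profile>
    <!-- Hypothetical profile id; choose any name you like -->
    <id>site-dry-run</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-scm-publish-plugin</artifactId>
          <configuration>
            <!-- Run the publish steps without the SCM commit and push -->
            <dryRun>true</dryRun>
          </configuration>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

With such a profile in place, the dry run would be started with: mvn clean site:site scm-publish:publish-scm -Psite-dry-run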

Project and Documentation URLs

Here is a list of the fully working projects used to create this blog post: