Thursday, February 27, 2014

Custom reporting of VersionOne data in Clojure: The implementation

This article dives into the implementation of a custom reporting site that produces charts and tables. Previously, I described the custom reports here. If you haven't seen them yet please take a look at the charts in the prior post as they give context to the code presented in this article.

Data aggregations and transformation

The custom reports are essentially extracting tabular data through aggregation and transformation of queried data. Clojure really shines for this kind of work! It is a real pleasure to be able to define concise programmatic transforms in code.

Consider how to calculate the churn for a team sprint. I'll define churn to mean counts of added and removed stories for a team sprint. More specifically, I don't want to look at the stories just at the start and end of the sprint, but also what got added or removed daily. This way I catch stories that are added then removed mid sprint, which otherwise would not show up. Of course there are other definitions you can use, such as, every addition or removal during a single day. I choose to exclude stories added and removed in the same day, as they are usually user error that did not create churn.

With this definition in mind, the way to calculate churn is to get the set of story numbers for each day of the sprint and figure out from the differences in the sets when a story was added or removed.

Another way of thinking about this is creating a sequence of removed stories by mapping difference over a sequence of sets of stories per day adjacent to the sequence of set of stories per next day. The sequence of set of stories per tomorrow is really the sequence of stories per day missing the first element. The sequence is concisely expressed as:

(map difference story-sets (rest story-sets))

For a more detailed treatment of this technique, please take a look here.

The implementation of churn uses maps instead of sets. The reason is that there is another more detailed report which shows the actual stories that were added and removed, so it is convenient to define collect and churn-data to work in this way so they can be used in both places. Here it is:

(defn- map-difference [a b]
  (reduce dissoc a (keys b)))

(defn- collect [as bs]
  (reduce merge (map map-difference as bs)))

(defn- churn-data
  [team sprint]
  (let [span (sprint-span sprint)
        begin (time/plus (span "BeginDate") (time/days 1))
        end (span "EndDate")]
    (for-sprint team sprint begin end stories-on)))

(defn churn
  "The count of stories added to and removed from a sprint"
  [team sprint]
  (let [stories (churn-data team sprint)
        added (collect (rest stories) stories)
        removed (collect stories (rest stories))]
    [sprint (count added) (count removed)]))

To me this is beautiful code that concisely defines how to process data into the report that I want. This sort of work is a perfect fit with Clojure. You can find the complete listing of the code backing 30 charts and tables in src/vone/services.clj. These real world examples of data transformations demonstrate that Clojure provides concise mechanisms to do data crunching.

Services

The open items report shows a summary of stories not yet closed per project. It is a very simple report. Let's follow the plumbing of what happens from the service query all the way out to the final rendered chart. The service is directly querying story details from VersionOne and only adding links to the story page and custom history page. Here is the function that defines it in src/vone/services.clj:

(defn openItems
  [project]
  (cons ["Team" "Story" "Title" "Points" "Status" "Priority" "Sprint" "Links"]
        (map (fn [[a b c d e f g]] [a b c d e f g (links b)])
             (request-rows "/Data/PrimaryWorkitem"
                           {:sel "Team.Name,Number,Name,Estimate,Status.Name,Priority.Name,Timebox.Name"
                            :where (str "Scope.Name='" project
                                        "';AssetState='Active';Status.Name!='Accepted'")
                            :sort "Team.Name,Number"}
                           "None"))))

request-rows constructs a url to query VersionOne and tabulates the results. As you can see, I am requesting a set of interesting fields from the PrimaryWorkitems for a particular Scope of work that are active and not accepted.

If you have downloaded the code, you can execute lein ring server to start up a webserver and navigate to http://localhost:3000/json/openItems/TC5.0.1 to see the output of this function:

[["Team","Story","Title","Points","Status","Priority","Sprint","Links"],["TC Akvelon","D-21200","CSX Regression: U segmented moves complete upon hitching",2.0,"Ready for QA","Critical","TC1403","<a href=\"#/history/D-21200\" target=\"_blank\">[hist]</a><a href=\"http://www3.v1host.com/Tideworks/VersionOne/rest-1.v1/assetdetail.v1?Number=D-21200\" target=\"_blank\">[v1]</a>"],["TC Akvelon","D-21340","CSX Regression: Unlocking or locking a 40 foot container to stack 318 needs to work correctly (Split)",0.5,"Ready for QA","High","TC1403","<a href=\"#/history/D-21340\" target=\"_blank\">[hist]</a><a href=\"http://www3.v1host.com/Tideworks/VersionOne/rest-1.v1/assetdetail.v1?Number=D-21340\" target=\"_blank\">[v1]</a>"],["TC Akvelon","D-21408","CSX Regression: Segmentation keeps changing after an unhitch to unplanned location",2.33,"None","High","TC1403","<a href=\"#/history/D-21408\" target=\"_blank\">[hist]</a><a href=\"http://www3.v1host.com/Tideworks/VersionOne/rest-1.v1/assetdetail.v1?Number=D-21408\" target=\"_blank\">[v1]</a>"],["TC Sharks","D-21448","CSX COL: TCSC required reboot due to chassis assigned to multiple moves",2.33,"In Development","High","TC1403","<a href=\"#/history/D-21448\" target=\"_blank\">[hist]</a><a href=\"http://www3.v1host.com/Tideworks/VersionOne/rest-1.v1/assetdetail.v1?Number=D-21448\" target=\"_blank\">[v1]</a>"]]

The output is tabular data - a sequence of rows with the first row containing header titles. This is the raw data which will populate a table or a chart in the report.

Next navigate to http://localhost:3000/csv/openItems/TC5.0.1 and you will receive a csv file containing the data. Public functions in the vone.services namespace are exposed as json, csv and datasource (for charts).

Routing

Services are defined as public functions in the vone.services namespace, and pages are defined in the vone.pages namespace as public functions. This allows the routes to be constructed automatically. Here is the code that configures the page routes:

(defn page-routes
  "Returns routes for pages defined in a namespace as public functions"
  [page-ns]
  (for [[page-name page] (functions page-ns)]
    (GET (str "/" page-name) request (page))))

As you can see, it makes a Compojure HTTP route with the name of the function to handle the request. Page functions themselves look like this:

(defn about []
  (html
   [:div "Custom Reporting on VersionOne data.  Please select a report from the navbar above."]))

I chose to define my pages in hiccup. The HTML served is static. AngularJS takes care of dynamic UI/server interaction. If you prefer working directly with HTML you can do that by dropping a file in resources/public. Routing pages in this way is not very exciting, but now we get to routing services where things get more interesting. http://localhost:3000/csv/openItems/TC5.0.1 calls the openItems function with one string argument and formats the result as csv.

Report pages are collections of charts. Each chart has a data source implemented on the server as a service. Services are set up by calling path-routes which is very similar in concept to page-routes. path-routes constructs routes which validate and parse URL arguments to match and call the Clojure function.

I found generated routes for services to be very convenient because:
  • Every public function I add to the services namespace can be called from a browser
  • Services can be accessed in multiple formats: json/csv/datasource
  • Service calls have their arguments checked and return helpful error messages if the signatures do not match, type hints are used to validate expected types 
  • Webservice documentation is trivial since they are plain old functions. I use standard Clojure docgen tools like codox to describe them.
I liked it so much that I spun it off into a tiny library called RouteGen and used it in other projects. :)

Charting the data

Notice that openItems is in camel case instead of the Clojure preferred hyphenation open-items. This was done to allow the visualization appropriate for this service to be looked up by convention in a JavaScript object, which does not allow hyphenated symbols.

resources/public/js/charts.js contains an options object which specifies:

  openItems: {
    visualization: "Table",
    title: "Open Items",
    allowHtml: true,
    height: null
  },

The visualization value must be a Google Visualization class, the other properties are chart options.

In the page we insert the chart element as a div with a special attribute:

<div chart="openItems"/>

chart is an AngularJS directive; a way of creating a custom element. The directive is defined in JavaScript inside resources/public/js/charts.js:

directive('chart', function(options, $log) {
  return function(scope, elem, attrs) {
    var chart, query, o = {};
    angular.extend(o, options.general);
    angular.extend(o, options[attrs.chart]);
    angular.extend(o, _.pick(attrs, 'height', 'width'));
    elem[0].innerHTML = "Loading " + o.title + "...";
    chart = new google.visualization[o.visualization](elem[0]);
    query = function() {
      if (!scope.args) {
        return;
      }
      var url = 'datasource/' + attrs.chart + '/' + scope.args;
      $log.info("Quering " + url);
      new google.visualization.Query(url)
      .send(function (response) {
        if (response.isError()) {
          google.visualization.errors
          .addErrorFromQueryResponse(
            elem[0], response);
        } else {
          chart.draw(response.getDataTable(), o);
        }
      });
    }
    scope.$watch("args", query, true);
  };
})

All it does is create a visualization, then feed it the data from querying a service with the same name as the chart.

Options are merged from a set of general options, chart specific options, and element attributes. This allows us to override any aspect of the visualization in html. An alternative to this approach is to provide the visualization specification along with the data, which feels more unified. I preferred the separation of data from visualization options and stuck with that. We can override the default AreaChart visualization for cumulative flow with ColumnChart, resulting in a different chart of the same data:

<div chart="cumulative" visualization="ColumnChart"/>

In practice, I didn't find any scenarios where I wanted multiple variations on visualizations in the custom reports. It might be handy later to add an expandable table to each chart so that the user can see and fetch the underlying data.

AngularJS takes care of binding the arguments to use for the service request in scope. To understand this better let's look at the page definition for open items:


(defn projectopenitems []
  (html
   [:h1 "Open items: {{today}}"]
   select-buttons
   [:label {:ng-repeat "project in projects"
            :style "margin: 0px 20px;"}
    [:input {:type "checkbox" :ng-model "project.enabled"} "{{project.name}}"]]
   [:div.report
    [:ul.list-unstyled
     [:li {:ng-repeat "project in projects | filter:{enabled:true}"
           :ng-init "args = project.name"}
      [:br]
      [:br]
      [:h2 "{{args}}"]
      [:div {:chart "openItems"}]]]]))

This page binds a checkbox for each project, then for each checkbox which is currently enabled it creates an "openItems" chart which will call the openItems service with a single argument - the project name. The yellow highlighted section is where scope.args is set to be project.name. The url constructed as 'datasource/' + attrs.chart + '/' + scope.args will be 'datasource/openItems/TC5.0.1' if there is a TC5.0.1 project that is checked. The datasource format returned is tabular JSON data with some extra sugar to make it plug into the chart seamlessly.

The list of projects itself is populated from a service call. Projects, teams and sprints are loaded on the first view so that they can be selected in drop downs or checkboxes quickly when navigating to page fragments that require them.

Consuming the VersionOne Data API

Time to look into what request-rows in the openItems service actually does. request-rows constructs a URL to make a request from the VersionOne Data API, collapses the XML response into maps, then unmaps the fields into rows in the order you requested them.

Let's examine the URL construction part first. There is a debug function as-url which demonstrates what is happening, it takes the inputs you would normally pass to request-rows, and returns the URL that would be queried and processed:

(as-url "/Data/PrimaryWorkitem"
        {:sel "Team.Name,Number,Name,Estimate,Status.Name,Priority.Name,Timebox.Name"
         :where (str "Scope.Name='TC5.0.1';AssetState='Active';Status.Name!='Accepted'")
         :sort "Team.Name,Number"})

=> "http://www3.v1host.com/Tideworks/VersionOne/rest-1.v1/Data/PrimaryWorkitem?sel=Team.Name,Number,Name,Estimate,Status.Name,Priority.Name,Timebox.Name&where=Scope.Name='TC5.0.1';AssetState='Active';Status.Name!='Accepted'&sort=Team.Name,Number"


If you are logged into VersionOne and go to this URL, you will see the results of the query as XML. I found it very easy and convenient to be able to edit the URL in my browser and play with queries until I got the data I was looking for. Top marks to VersionOne for exposing their data in an extremely accessible way!

As you can see, the URL was constructed by pre-pending the base URL to the asset being queried, and appending the sel/where/sort clauses. But why bother you ask? Why not just use the string I was able to build in my browser? Well, the results of the query come back with the fields I requested, but they are not guaranteed to have the fields in the same order I requested them in. All the fields are labeled, so this is no problem so long as I know which fields I requested. To save some repetition, the sel clause needs to be identifiable, fields stripped out, and extracted in my desired order from the result of the query. In retrospect, this could have been done with a regex on the string instead of specifying it separately. I call this process of constructing a row of values from an unordered set of field:value pairs 'unmapping fields', by which I mean producing tabular data that matches the select clause.

A more important reason why it is good to have the arguments passed explicitly to the request is that it delegates URL encoding to the HTTP request, which is a much better place to deal with spaces and other characters that might appear in the arguments list than trying to do it on a per URL basis.

The code relevant code is in src/vone/version_one_request.clj

I found request-rows fulfilled most of my needs, except for a few rare cases where I used a raw request to preserve the field names in a map because I specifically wanted to use the data in further processing by name.

Overall I found working with the VersionOne data API very easy and powerful, I especially loved that the full history of changes to the data over time was there to access. My only complaint was the lack of group-by, which has since been added in their new JSON API. I am eager to try it out as the XML collapsing will disappear, and the group-by should significantly improve some queries by reducing the amount of data I need to pull back.

Layout

Bootstrap is great. I was really pleased with the look it gave to the site, and the responsive grid layout system.

Adding new queries

Now you have seen how it all works. This section will review the steps to add a new chart. Perhaps you are a VersionOne user with some query in mind. To implement it you need to:
  • Construct the URL that gets useful data from VersionOne in your browser by reading the VersionOne  data API documentation to know what assets you are looking for and what the query syntax is
  • Create myfunction in src/vone/services.clj which requests rows using this query and does any data transformation you require, returning tabular data with a first row consisting of header titles
  • Add a div chart="myfunction" to a page in src/vone/pages.clj or in resources/public
  • Configure the visualization desired in resources/public/js/charts.js
  • Then treat yourself to a chocolate - your done!

Conclusion

This was a really fun project to work on because it combined a personal interest in the data with pleasing technologies that delivered great results with minimal effort on my part. Clojure excelled at data transformation and standing up an intermediary service host. Retrieving the data was easy thanks to a great API from VersionOne. Displaying the information required very little code thanks to AngularJS and Google Visualizations.

4 comments:

Ferec Cetin said...

Hello. Can I run this on windows? If yes, can you tell me how?

Thanks.

timothypratley said...

Hi Ferec,

Yes you can run this on Windows (or any platform that has Java) very easily.
(1) Clone or download the repository from https://github.com/timothypratley/vone.
(2) Install the build tool https://github.com/technomancy/leiningen#windows.
(3) Create a resources/vone.properties file containing:
username = username
password = password
base-url = http://www3.v1host.com/MyCompany/VersionOne/rest-1.v1
(4) start the webserver by running lein ring server. A browser will open pointing to http://localhost:3000 to display the reports.

Please contact me if you have any difficulty, questions, feedback, or need pointers on customizing for your data.

Ferec Cetin said...

Hello Timothy.
In windows, I tried to run this in cmd prompt:
lein ring server

But then comes this:
C:\V1Report\.lein\bin>lein ring server
'ring' is not a task. See 'lein help'.

Is there anything wrong in my installation? Here are the files under lein\bin folder:

configure-leiningen-installer.exe
curl-ca-bundle.crt
curl.exe
file-assoc-in-0.1.0-standalone.jar
lein.bat
license.txt
unins000.dat
unins000.exe


timothypratley said...

Hi Ferec,

Please run the command from the vone root directory (there is a project.clj file that defines this task).