David Dickson

Compass Re-architecture

Early this year at Compass we decided to significantly redesign our core product with a focus on making it self-service. This juncture was the perfect opportunity for us to also evaluate the technical stack we were using. This evaluation led to the specific changes outlined below.

                              New Stack    Old Stack
Database                      PostgreSQL   MongoDB
Inter-Process Communication   RabbitMQ     RabbitMQ
Web Backend                   Sinatra      Rails
Web Frontend                  AngularJS    AngularJS

It is now six months since our re-architecture, and I wanted to discuss the outcomes of our move.

MongoDB to PostgreSQL

The problem we were finding with Mongo was that our data model was becoming a bit of a mess. As a reporting and benchmarking tool, it is essential that Compass maintains a well-structured database, and one of MongoDB's headline features (schemaless storage) was proving to be a significant pain point.

Having our storage engine enforce our schema has certainly been a positive change. We often run algorithms over very flat data structures, and as a result we found a relational data model actually mapped effectively to our problem space. Although there are certainly domains where persisting documents can increase productivity, this simply wasn't the case for us.

Compass is made up of both software engineers and data scientists. Much of a data scientist's job is spent cleaning and analyzing data. Moving to PostgreSQL has made it significantly easier for our data science team to explore critical data. The improved consistency, coupled with their comfort with SQL, has led to faster turnaround in problem understanding and feature specification.

As PostgreSQL comes equipped with some NoSQL features, our move really has felt like the best of both worlds. With PostgreSQL you can get document database-like capabilities using the JSON data type, and key/value storage with the HSTORE module. As a result, we have been able to make use of these features where appropriate.
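As a rough sketch of what this hybrid looks like, the query below mixes a structured column with a field pulled out of a JSON document. To keep the snippet runnable without a PostgreSQL server, it uses Python's built-in SQLite driver and its json_extract function, which plays the same role as PostgreSQL's JSON operators (e.g. attrs->>'industry'); the table and column names here are made up for illustration.

```python
# Sketch: structured columns alongside semi-structured JSON attributes.
# SQLite's json_extract stands in for PostgreSQL's JSON operators so the
# example runs with no server; requires an SQLite build with the JSON
# functions (standard in modern Python distributions).
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, attrs TEXT)")
conn.execute(
    "INSERT INTO companies (name, attrs) VALUES (?, ?)",
    ("Acme", json.dumps({"industry": "SaaS", "employees": 42})),
)

# A fixed column and a JSON field queried side by side
row = conn.execute(
    "SELECT name, json_extract(attrs, '$.industry') FROM companies"
).fetchone()
print(row)  # ('Acme', 'SaaS')
```

The schema stays strict where the data is well understood, while the attrs document absorbs the fields that vary.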

Rails to Sinatra

The biggest problem with our existing Rails application was that it lacked a clear separation of concerns. At times the web layer was implemented as simple RESTful endpoints; at other times it was rendering complex views. With our move to Sinatra we wanted to address this inconsistency and have a clear separation between the client (our Angular SPA) and our web API layer. Furthermore, when building a single-page application, much of what Rails provides is overkill.

Since moving to Sinatra we have found it excellent for building out a web API layer. Its flexibility gave us the control we wanted over which libraries we use in building our application. Furthermore, its lightweight nature and simple method for specifying routes has given the team a RESTful focus, encouraging the separation between web and client we were striving for.
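To illustrate the style rather than Sinatra's actual API, here is a generic sketch of the micro-framework idea in Python: a route is nothing more than a verb and a path mapped to a handler function, with no further ceremony. All names are illustrative.

```python
# Minimal sketch of micro-framework routing: a table of (verb, path) keys
# mapped to handler functions. Not Sinatra's API - just the shape of it.
routes = {}

def get(path):
    """Register a handler for GET requests on the given path."""
    def register(handler):
        routes[("GET", path)] = handler
        return handler
    return register

@get("/reports")
def list_reports():
    return {"status": 200, "body": ["q1", "q2"]}

def dispatch(verb, path):
    """Look up the handler for a request, or return a 404 response."""
    handler = routes.get((verb, path))
    return handler() if handler else {"status": 404, "body": None}

print(dispatch("GET", "/reports")["status"])   # 200
print(dispatch("GET", "/missing")["status"])   # 404
```

Because each route is just a function, the web layer stays a thin, uniform mapping from URLs to responses, which is the separation described above.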

One interesting positive side effect we didn't anticipate is that leveraging a micro framework has helped our junior developers better understand web semantics. With less 'configuration magic', ramp-up time for our junior developers may be longer, but it has forced them to gain a deeper understanding of how web applications are built.

Conclusion

Six months on from our architectural changes, the team remains busy building out new features for Compass. Software is forever changing, and I am sure our stack will see significant evolutions in the years to come. But as it stands, the team remains happy with the architectural decisions made and is excited about evolving the platform. Moving to Sinatra gave us the clear separation between web and client we were looking for, and PostgreSQL the tighter structure and subsequent boost in productivity we needed.

Unhealthy Attachment

One thing I notice with junior programmers is that they often get unnecessarily attached to their work. I am not suggesting that you shouldn't take pride in your work. There is no doubt that pride in one's work is fundamental for any professional. Rather, I try to instill in junior developers that programming is a path of continual learning, and the sooner you can let go and enjoy that process, the quicker you will grow.

Unhealthy attachment to their code can result in negative behaviors. For example, an introverted perfectionist may beat themselves up when you point out potential improvements in their work. An overly strong personality may get defensive over their choices and ultimately block external input. There is no doubt that insecurity at work in any profession can result in negative reactions, so the problem isn't limited to programming. However, the reason I think it is so important for programmers to transcend their insecurities is that you will be learning for your entire career.

When I was a junior developer I was fortunate to have an excellent senior engineer help me take steps to lose my insecurities. As we pair programmed he would highlight the simple fact that as programmers we simply "write some code and then delete some code." For some reason this simple reduction of our work resonated with me. Further, having someone I looked up to professionally validate my skills did wonders for my confidence.

Building a blog engine in Node

I recently got the urge to mess around with some new technologies. To give myself some direction, I decided to build my new blog with a new stack. The technologies I explored were:

Node

The Node ecosystem is very active, which made it easy to find tutorials and examples to get me started. The npm package manager made it simple to pull in libraries to accelerate my development. To build my blog I chose Express as my web application framework. For editing I used Sublime Text, and for debugging, node-inspector.

Whilst programming in Node, there were times when I missed the security I feel with a statically typed language. One thing is for sure, though: working in Node is doing wonders for my JavaScript skills.

Redis

I really enjoyed working with Redis. Redis is an in-memory key-value store, though it is probably better described as a data structure server. What I most enjoyed about working with Redis is how it changed the way I thought about modelling my application's data.

Rather than thinking of data as tables and relations, or objects and graphs, Redis forces you to think in hashes, lists and sets. I found this focus on basic data structures at the persistence level refreshing, as it influenced me to build a flat and simple model. One other appealing thing about Redis is that it is quick. Below are the results of the redis-benchmark utility run on my MacBook Pro:

====== PING_INLINE ======
  10000 requests completed in 0.12 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
100.00% <= 0 milliseconds
85470.09 requests per second
 
====== PING_BULK ======
  10000 requests completed in 0.14 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
99.99% <= 1 milliseconds
100.00% <= 1 milliseconds
74074.07 requests per second
 
====== SET ======
  10000 requests completed in 0.16 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
100.00% <= 0 milliseconds
64516.13 requests per second
 
====== GET ======
  10000 requests completed in 0.11 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
100.00% <= 0 milliseconds
88495.58 requests per second
 
====== INCR ======
  10000 requests completed in 0.16 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
99.51% <= 1 milliseconds
99.87% <= 2 milliseconds
99.93% <= 3 milliseconds
99.96% <= 4 milliseconds
100.00% <= 4 milliseconds
64102.56 requests per second
 
====== LPUSH ======
  10000 requests completed in 0.15 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
99.38% <= 1 milliseconds
99.84% <= 2 milliseconds
100.00% <= 2 milliseconds
66666.66 requests per second
 
====== LPOP ======
  10000 requests completed in 0.13 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
99.88% <= 1 milliseconds
99.96% <= 2 milliseconds
100.00% <= 2 milliseconds
75757.58 requests per second
 
====== SADD ======
  10000 requests completed in 0.15 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
100.00% <= 0 milliseconds
68027.21 requests per second
 
====== SPOP ======
  10000 requests completed in 0.14 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
99.51% <= 1 milliseconds
100.00% <= 1 milliseconds
68965.52 requests per second
 
====== LPUSH (needed to benchmark LRANGE) ======
  10000 requests completed in 0.14 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
99.77% <= 1 milliseconds
99.86% <= 2 milliseconds
99.91% <= 3 milliseconds
99.97% <= 4 milliseconds
100.00% <= 4 milliseconds
71428.57 requests per second
 
====== LRANGE_100 (first 100 elements) ======
  10000 requests completed in 0.47 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
37.77% <= 1 milliseconds
99.66% <= 2 milliseconds
99.97% <= 3 milliseconds
100.00% <= 3 milliseconds
21276.60 requests per second
 
====== LRANGE_300 (first 300 elements) ======
  10000 requests completed in 1.11 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
0.10% <= 1 milliseconds
10.91% <= 2 milliseconds
70.31% <= 3 milliseconds
92.80% <= 4 milliseconds
99.53% <= 5 milliseconds
99.80% <= 6 milliseconds
99.91% <= 7 milliseconds
100.00% <= 7 milliseconds
8976.66 requests per second
 
====== LRANGE_500 (first 450 elements) ======
  10000 requests completed in 1.49 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
0.05% <= 1 milliseconds
0.96% <= 2 milliseconds
23.64% <= 3 milliseconds
66.67% <= 4 milliseconds
91.83% <= 5 milliseconds
97.22% <= 6 milliseconds
99.28% <= 7 milliseconds
100.00% <= 7 milliseconds
6706.91 requests per second
 
====== LRANGE_600 (first 600 elements) ======
  10000 requests completed in 1.93 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
0.02% <= 1 milliseconds
0.04% <= 2 milliseconds
2.60% <= 3 milliseconds
22.09% <= 4 milliseconds
63.34% <= 5 milliseconds
88.36% <= 6 milliseconds
96.23% <= 7 milliseconds
99.21% <= 8 milliseconds
99.64% <= 9 milliseconds
99.83% <= 10 milliseconds
99.90% <= 11 milliseconds
99.98% <= 12 milliseconds
100.00% <= 12 milliseconds
5175.98 requests per second
 
====== MSET (10 keys) ======
  10000 requests completed in 0.22 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 
53.98% <= 1 milliseconds
99.95% <= 2 milliseconds
99.97% <= 3 milliseconds
100.00% <= 4 milliseconds
46082.95 requests per second
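
To make the modelling style concrete, here is a sketch of how a blog might be laid out in hashes, lists and sets. Plain Python structures stand in for the corresponding Redis commands (noted in the comments), and the key names are made up for illustration.

```python
# Sketch of Redis-style data modelling: hashes, lists and sets instead of
# tables and joins. Python structures mimic the Redis commands in comments.

posts = {}    # post:<id> -> hash of fields   (Redis: HSET post:1 title ...)
recent = []   # list of post ids, newest first (Redis: LPUSH recent_posts 1)
tags = {}     # tag -> set of post ids         (Redis: SADD tag:redis 1)

def add_post(post_id, title, post_tags):
    posts[f"post:{post_id}"] = {"title": title}
    recent.insert(0, post_id)                    # LPUSH: prepend newest
    for tag in post_tags:
        tags.setdefault(tag, set()).add(post_id)  # SADD: de-duplicated membership

add_post(1, "Building a blog engine in Node", ["node", "redis"])
add_post(2, "Unhealthy Attachment", ["career"])

print(recent)          # [2, 1] - like LRANGE recent_posts 0 -1
print(tags["redis"])   # {1}
```

There is no query planner to lean on: every access path (recent posts, posts by tag) is a structure you maintain yourself, which is exactly what pushes the model toward being flat and simple.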

Heroku

Having deployed to both AppHarbor and Azure, I was excited to see how Heroku did its thing. Heroku was a pleasure to work with. It really is as simple as creating an account and running git push. Heroku automatically detected that I was deploying a Node application, and my site was up. Pulling in Redis as an add-on was equally effortless. Another thing I really enjoyed about Heroku is its focus on interacting with the platform via the command line.

Rewriting parts of an Expression Tree in F#

Background

I am in the fortunate situation of working on a project where we use F# scripts to administer a production application. It is a very compelling approach, as it allows maintenance tasks to be carried out through domain objects rather than by modifying repositories directly. This ensures that when data maintenance is performed, all appropriate business rules are applied and data integrity is maintained.

For its persistence mechanism the application makes use of LINQ to SQL. Within the scripts we build ad hoc queries using F# quotations that are subsequently translated into LINQ expressions to query our repository. For example, we make use of the F# PowerPack's ToLinqExpression in the function below to convert a quotation into a consumable LINQ expression:

let toLinq (exp : Expr<'a -> 'b>) =
    // Convert the quotation to a LINQ expression, then unwrap the inner
    // lambda so we can re-wrap it as an Expression<Func<'a, 'b>>
    let linq = exp.ToLinqExpression()
    let call = linq :?> MethodCallExpression
    let lambda = call.Arguments.[0] :?> LambdaExpression
    Expression.Lambda<Func<'a, 'b>>(lambda.Body, lambda.Parameters)

Recently, when I upgraded the scripts to use the latest version of the F# PowerPack, I found that previously working queries now caused our application to throw exceptions stating that LINQ to SQL had no supported translation to SQL. My investigation found that the cause of these errors was that the latest version of ToLinqExpression had been modified such that when a quotation made use of =, >, >=, <, <= or <>, the created BinaryExpression used a different method for its comparison. So, for example, if I have a binding like:

let example = toLinq (<@fun u -> u.FirstName = "Bob"@> : Expr<Account->bool>)

When converted to a LINQ expression, the implementing method for the binary operation u.FirstName = "Bob" is set to GenericEqualityIntrinsic, whereas in the previous version it was set to String.op_Equality. As a result, the LINQ to SQL provider could no longer translate the expression tree into SQL.

Rewriting the expression tree

Having identified the issue, I decided to come up with a function to rewrite the problematic parts of the returned expression tree. Below is what I came up with.

let rec reWriteExpression (exp:Expression) =
 
    let makeBinary e left right = Expression.MakeBinary(e, left, right) :> Expression 
 
    // Rebuild comparison nodes via MakeBinary without a MethodInfo, so that
    // LINQ selects the standard operator method (e.g. String.op_Equality)
    // instead of the untranslatable GenericEqualityIntrinsic
    let modifyMethod (exp:BinaryExpression) = 
        match exp.NodeType with
        | ExpressionType.Equal
        | ExpressionType.GreaterThan
        | ExpressionType.GreaterThanOrEqual
        | ExpressionType.LessThan
        | ExpressionType.LessThanOrEqual
        | ExpressionType.NotEqual -> makeBinary exp.NodeType exp.Left exp.Right
        | _ -> makeBinary exp.NodeType (reWriteExpression exp.Left) (reWriteExpression exp.Right)
 
    // Recurse into lambdas and binary nodes; leave everything else untouched
    match exp with
    | :? LambdaExpression as l -> 
        Expression.Lambda(l.Type, reWriteExpression l.Body, l.Parameters) :> Expression
    | :? BinaryExpression as b -> modifyMethod b
    | _ -> exp

For me, the ease with which I could write this rewrite highlighted two very powerful features of functional programming: pattern matching and recursion. Pattern matching is simply awesome. The simple syntax in modifyMethod for testing specific expression types makes the program's intent both clear and succinct. You can do all sorts of things with pattern matching; however, I won't cover those details in this post. Similarly, routines on trees are often most easily expressed using recursion, which is natural because a tree is a recursive data type.

Considering the Command Pattern in F#

The command pattern describes a way to represent actions in an application. It is used to encapsulate actions so that they can be invoked at some later point. We often see collections of commands that specify the steps of a process or the operations a user can choose from. Below we will look at an object-oriented implementation of this pattern, followed by a functional alternative.

If we consider this pattern from a functional perspective, command objects are being used as functions; indeed, they are equivalent to functions. Thus it is arguable that the command pattern is more readily adopted in object-oriented languages that lack support for higher-order functions.

Object Oriented Example

Let’s look at a simple example. Say we are tasked with developing the software to control a robot via remote control. The following shows how we could implement this using the object-oriented command pattern.

type Robot(position, rotate) =     
    let mutable (x,y) = position
    let mutable rotation = rotate
    member this.Move(distance) =
      x <- x + (distance * sin (System.Math.PI/180.0 * rotation))
      y <- y + (distance * cos (System.Math.PI/180.0 * rotation))
    member this.Rotate(angle) = 
        let newRotation = 
            let nr = rotation + angle
            match nr with
            | n when n < 360.0 -> nr
            | _ -> nr - 360.0
        rotation <- newRotation
 
type ICommand = 
    abstract Execute : unit -> unit
 
type Move(robot:Robot, distance) =
    interface ICommand with 
        member this.Execute() = robot.Move(distance)
 
type Rotate(robot:Robot, rotation) = 
    interface ICommand with
        member this.Execute() = robot.Rotate(rotation)
 
type RemoteControl(cmds: ICommand list) =
    let commands = cmds
    member this.RunCommands() = commands |> List.iter(fun c -> c.Execute())

Functional Solution

Below shows how we could implement the robot example drawing on functional constructs:

type Robot = {Position : float * float; Rotation : float}
 
let move distance robot = 
    let x2 = distance * sin (System.Math.PI/180.0 * robot.Rotation)        
    let y2 = distance * cos (System.Math.PI/180.0 * robot.Rotation)
    {robot with Position = (robot.Position |> fst) + x2, (robot.Position |> snd) + y2 }
 
let rotate angle robot = 
    let newRotation = 
        let nr = robot.Rotation + angle
        match nr with
        | n when n < 360.0 -> nr
        | _ -> nr - 360.0
    {robot with Rotation = newRotation}
 
let remoteControl commands robot = 
    commands |> List.fold(fun w c -> c w) robot

You will notice that the functional solution defines no command interface. This is because F# supports higher-order functions. An object-oriented way to understand a function is to think of it as an interface with a single method. Thus, in a functional language like F#, the single-method command interface becomes unnecessary.

Another difference is that in the functional solution the data (i.e. the location of the robot) is independent of the functions that transform it (move and rotate). This supports one of the principles of functional programming: referential transparency. That is, for a given set of inputs, a referentially transparent expression will always return the same result, which makes it deterministic by definition. In support of this principle, the functional solution above makes its data immutable, and rather than modifying arguments, the processing functions return new values.
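As a small illustration of the principle (in Python rather than F#, with illustrative names), the rotate function below returns a new robot instead of mutating its argument, so the same inputs always produce the same output:

```python
# Referential transparency sketch: rotate returns a fresh Robot rather than
# mutating the one it was given, mirroring the F# record-update style above.
from dataclasses import dataclass, replace

@dataclass(frozen=True)          # frozen makes mutation a runtime error
class Robot:
    position: tuple
    rotation: float

def rotate(angle, robot):
    # Same wrap-around behaviour as the F# version, expressed with modulo
    return replace(robot, rotation=(robot.rotation + angle) % 360.0)

r1 = Robot((0.0, 0.0), 350.0)
r2 = rotate(20.0, r1)
print(r1.rotation, r2.rotation)  # 350.0 10.0 - the original is untouched
```

Because rotate(20.0, r1) depends only on its arguments, it can be called any number of times, in any order, with the same result, which is exactly what makes command sequences like the remoteControl fold easy to reason about.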

Conclusion

The command pattern is used to represent some behaviour which you know needs to be done. In a functional language the need for this object-oriented pattern is diminished, because these behaviours can be encapsulated and passed around as functions. Design patterns are an interesting topic, as they provide solutions to commonly occurring problems in software design. However, I really enjoy it when a language or framework can appropriately abstract away the boilerplate. In fact, I think Paul Graham sums it up best:

"When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I'm using abstractions that aren't powerful enough - often that I'm generating by hand the expansions of some macro that I need to write."