How many test failures are acceptable?

Continuous Delivery is getting a lot of mileage at the moment. It seems to be an idea whose time has come. A survey last year claimed that 66% of companies had a “Strategy for Continuous Delivery”. I am not sure that I believe that figure, but it does suggest that CD is “cool”. I suppose it is inevitable that such a popular, widespread idea will be misinterpreted in some places. Two such misinterpretations seem fairly common to me.

The first is that Continuous Delivery is really just about automating deployment of your software. If you have written some scripts or bought a tool to deploy your system, you are doing Continuous Delivery – wrong!

The second is that automated testing is an optional part of the process: that getting your release cycle down to a month is a big step forward (which it is for some organisations) and that this means you are doing CD, despite the fact that your primary go-live testing is still manual – wrong again!

I see CD as a holistic process. Our aim is to minimise the gap between having an idea and getting working software that expresses that idea into the hands of our users, so that we can learn from their experience. When I work on a project my aim is always to minimise that cycle time. This has all sorts of implications, and affects pretty much every aspect of your development process, not to mention your business strategy. Central to this is the need to automate, in order to reduce the cycle time.

The most crucial part of that automation, and the most valuable, is your testing. The aim of a CD process is to make software development more empirical. We want to carry out experiments that give us new understanding when they fail, and a higher level of confidence in our assumptions when they don’t. The principal expression of these experiments is as automated tests.

The best projects that I have worked on have taken this approach very seriously. We tested every aspect of our system – every aspect! That is not to say that our testing was exhaustive – you can never test everything – but it was extensive.

So what does such a testing strategy look like?

The deployment pipeline is an automated version of your software release process. Its aim is to provide a channel to production that validates our decision to release. Unfortunately we can never prove that our code is good; we can only prove that it is bad, when a test fails. This is the idea of falsifiability, which we learn from science. I can never prove the theory that “all swans are white”, but as soon as I see a black swan I know that the theory is wrong.

Karl Popper proposed the idea of falsifiability in his book “The Logic of Scientific Discovery” in 1934. Since then it has become pretty much the defining characteristic of science. If a statement can be falsified through experimental evidence it is a scientific theory; if it cannot, it is a guess.

So, back to software. Falsifiability should be a cornerstone of our testing strategy. We want tests that will definitively pass or fail, and when they fail we want that to mean that we should not release our system, because we now know that it has a problem.
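To make that concrete, here is a minimal sketch of the kind of test I mean, written in Java with JUnit. The Warehouse class and its methods are invented purely for illustration; the point is that the test is deterministic and falsifiable – it either passes or it fails, and a failure tells us something specific is wrong.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// A deliberately simple, deterministic test: no randomness, no shared state,
// no external systems. It either passes or it fails, and a failure means the
// behaviour is wrong - a falsifiable claim about the system.
public class StockReservationTest {

    // A minimal, invented system under test, included only so the example compiles.
    static class Warehouse {
        private int stock;

        Warehouse(int initialStock) {
            this.stock = initialStock;
        }

        void reserve(int quantity) {
            if (quantity > stock) {
                throw new IllegalStateException("insufficient stock");
            }
            stock -= quantity;
        }

        int stockRemaining() {
            return stock;
        }
    }

    @Test
    public void reservingStockReducesWhatRemains() {
        Warehouse warehouse = new Warehouse(10);

        warehouse.reserve(3);

        assertEquals(7, warehouse.stockRemaining());
    }
}
```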

I am sometimes asked, “What percentage of tests do you think should be passing before we release?”. I suspect people take me for an optimistic fool when I answer “100%”. What is the point of having tests that tell us that our software is not good enough, and then ignoring what they tell us?

In the real world this is difficult for some kinds of tests in some kinds of system. There have been times when I have relaxed this absolute rule. However, there are only two reasons why tests may be failing and it still makes sense to release:

1) The tests are correctly failing and showing a problem, but this is a problem that we are prepared to live with in production.
2) The tests or the system under test (SUT) are flaky (non-deterministic), and so we don’t really know what state we are in.

In my experience, perhaps surprisingly, the second case is the more common. This is a pretty serious problem, because we no longer really know what state our system is in.

Tests that we wave through with “Oh, that one is always failing” are subversive. First, they acclimatise us to accepting a failing status as normal; once one failure is tolerated, it becomes much harder to notice when a new, genuine failure appears.

It is vital to any Continuous Integration process, let alone a Continuous Delivery process, that we optimise to keep the code in a releasable state. Fixing any failing test should take precedence over any other work. Sometimes this is expensive! Sometimes we have a nasty intermittent test that is extremely hard to figure out. Nevertheless, it must be figured out. The intermittency is telling us something very important: either our test is flaky, or the SUT is flaky. Either one is bad, and you won’t know which it is until you have found the problem and fixed it.
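When I am faced with an intermittent test, one crude but useful first step is simply to hammer it: run the same check many times and see whether it ever disagrees with itself. Here is a hypothetical sketch of that idea – the names are invented, and a real investigation would also capture logs, seeds and timings to track the non-determinism down.

```java
import java.util.concurrent.Callable;

// A crude diagnostic for intermittency: run the same check many times and
// see whether it ever disagrees with itself. If it sometimes passes and
// sometimes fails, either the test or the SUT is non-deterministic.
public class FlakinessProbe {

    public static void probe(String name, Callable<Boolean> check, int runs) throws Exception {
        int passes = 0;
        for (int i = 0; i < runs; i++) {
            if (check.call()) {
                passes++;
            }
        }
        if (passes == runs) {
            System.out.println(name + ": passed " + runs + "/" + runs + " - deterministic pass");
        } else if (passes == 0) {
            System.out.println(name + ": failed " + runs + "/" + runs + " - deterministic failure");
        } else {
            System.out.println(name + ": passed " + passes + "/" + runs + " - NON-DETERMINISTIC, investigate");
        }
    }

    public static void main(String[] args) throws Exception {
        // An artificial, timing-sensitive check used only to demonstrate the probe.
        probe("timing-sensitive check", () -> System.nanoTime() % 10 != 0, 1000);
    }
}
```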

If you have a flaky system, with flaky tests and lots of bugs in production, this may sound hard to achieve, but the approach reinforces itself. To get your tests to be deterministic, your code needs to be deterministic, and if you achieve that your bug count will fall!
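One of the most common sources of non-determinism is code that reads the system clock (or a random number generator) directly. A small sketch of the usual remedy, using an invented Quote class: inject the time source, so that tests can pin it down and the behaviour becomes repeatable.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Sketch: code that depends on time takes a Clock instead of calling
// Instant.now() directly, so tests can supply a fixed clock and become
// deterministic. The Quote class and isExpired method are invented for illustration.
public class Quote {
    private final Instant expiry;
    private final Clock clock;

    public Quote(Instant expiry, Clock clock) {
        this.expiry = expiry;
        this.clock = clock;
    }

    public boolean isExpired() {
        return Instant.now(clock).isAfter(expiry);
    }

    public static void main(String[] args) {
        // A fixed clock: the outcome never depends on when the test happens to run.
        Clock fixed = Clock.fixed(Instant.parse("2015-01-01T12:00:00Z"), ZoneOffset.UTC);
        Quote quote = new Quote(Instant.parse("2015-01-01T11:00:00Z"), fixed);
        System.out.println(quote.isExpired()); // always true
    }
}
```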

I read a good article recently on the adoption of Continuous Delivery at PaddyPower (http://www.infoq.com/articles/cd-benefits-challenges), in which the author, Lianping Chen, claims: “Product quality has improved significantly. The number of open bugs for the applications has decreased by more than 90 percent.” This may sound surprising if you have not seen what Continuous Delivery looks like when you take it seriously, but it is completely in line with my experience. This kind of effect only happens when you start being aggressive in your denial of failure – a single test failure must mean “not good enough!”

So take a hard line with your automated tests: test everything, and ensure that a single failure means that your system is not fit to release.

Posted in Acceptance Testing, Agile Development, Continuous Delivery, TDD

Incremental Design – Part II

In my earlier blog post on incremental design I suggested that we need to allow for failure. So how do we limit the impact of failure, how do we tell when our design choices don’t work and how do we organise our world in a way that allows us to build upon what we learn from our experiments?

It may surprise you that I begin this post on design with a section on team structures…

Team Structure and Architecture

Dev teams work best in small groups; team sizes of up to about 8 people seem optimal. So organising development groups into many small teams of 8 or fewer people is a common and effective strategy. Conway’s law tells us that this has implications for the design of our systems. If we want our teams to be autonomous, self-organising and self-directing, but to comprise fewer than 8 people, then how we can allocate work to those teams is seriously constrained. The problem is that there is another really important requirement for the continuous flow of valuable software: we want each of these teams to deliver end-to-end user value without having to wait for another team to complete its work – we want to eliminate cross-team dependencies for any given feature.

I think that this is one of the strengths of the microservice architectures that people are currently excited about. Different flavours of this idea have been around for a long time, but implementing a system as a collection of loosely-coupled, independent services is a pretty good approach. This approach is also reflected in the team structure of companies like Amazon, who structure their entire development effort as small, independent (“two-pizza”) teams, each responsible for a specific function of the business.

A good mental model for this kind of architecture is to imagine an organisation without any computers. Each department has its own office, is wholly responsible for the parts of the business that it looks after and decides for itself how best to accomplish that. Within each office they can work however they like; that is wholly their decision and their responsibility. Communication between departments is in the form of memos. Now, replace the office with a service and the memos with asynchronous messaging and you have a microservice architecture.
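To make the analogy a little more concrete, here is a toy, self-contained sketch of the “offices and memos” picture. The department names are invented, and a real system would use a message broker rather than an in-process queue, but the shape is the same: each department owns its own state, and the only way to influence it is to send it a message.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A toy model of the "offices and memos" picture: the billing department owns
// its own state, and the only way in is an asynchronous message on its inbox.
public class OfficesAndMemos {

    static class BillingDepartment implements Runnable {
        private final BlockingQueue<String> inbox;
        private int invoicesRaised = 0;

        BillingDepartment(BlockingQueue<String> inbox) {
            this.inbox = inbox;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    String memo = inbox.take();           // wait for the next memo
                    if ("SHUTDOWN".equals(memo)) return;
                    invoicesRaised++;                     // private state, never shared
                    System.out.println("Billing raised invoice for: " + memo
                            + " (total " + invoicesRaised + ")");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> billingInbox = new LinkedBlockingQueue<>();
        Thread billing = new Thread(new BillingDepartment(billingInbox));
        billing.start();

        // The "sales office" never calls billing directly; it just sends memos.
        billingInbox.put("order-1001");
        billingInbox.put("order-1002");
        billingInbox.put("SHUTDOWN");
        billing.join();
    }
}
```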

A great tool for helping to design such an organisation, and so encourage this kind of architecture is the idea of Bounded Contexts, from Eric Evans’ book “Domain Driven Design”. A Bounded Context is the scope within which a domain model applies. Any complex problem domain will be composed of various Bounded Contexts, within which a domain model will have a consistent meaning. This idea effectively gives you a coherent scope within your problem domain that sensibly maps to business value.

Bounded Contexts are a great tool to help you organise your development efforts. Look for the Bounded Contexts in your problem domain and use that model to organise your teams. This is rarely a team per service, or even per context, but it is a powerful approach to deciding how to group contexts and services when allocating them to teams, ensuring that teams can autonomously deliver end-to-end value to the business.
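As a tiny, hypothetical illustration of why the boundary matters: the same business word – “order” – can mean quite different things in different contexts, so each Bounded Context keeps its own model rather than sharing one bloated class.

```java
// Hypothetical sketch: the same business word, "order", modelled differently
// in two Bounded Contexts. Neither model knows about, or depends on, the other.
public class BoundedContextsExample {

    // In the Sales context an order is about the customer and the price.
    static class SalesOrder {
        final String customerId;
        final long totalPriceInPence;

        SalesOrder(String customerId, long totalPriceInPence) {
            this.customerId = customerId;
            this.totalPriceInPence = totalPriceInPence;
        }
    }

    // In the Shipping context an order is about where the parcel goes and how heavy it is.
    static class ShippingOrder {
        final String deliveryAddress;
        final int weightInGrams;

        ShippingOrder(String deliveryAddress, int weightInGrams) {
            this.deliveryAddress = deliveryAddress;
            this.weightInGrams = weightInGrams;
        }
    }

    public static void main(String[] args) {
        SalesOrder sale = new SalesOrder("customer-42", 2599);
        ShippingOrder shipment = new ShippingOrder("1 High Street, London", 1200);
        System.out.println("Sales cares about " + sale.totalPriceInPence + "p; "
                + "Shipping cares about " + shipment.weightInGrams + "g");
    }
}
```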

Maintaining Models

A common concern of people new to iterative design is the loss of a “big picture”. I learned a trick years ago for maintaining a big-picture, architectural view. I like to maintain what I call a “Whiteboard Model” of the system that I am working on. The nature of the diagram doesn’t matter much, but its existence does. The “Whiteboard Model” is a high-level, abstract picture of the organisation of the system. It is high-level enough that any of the senior members of the team should be able to recreate it, on a whiteboard, from memory, in a few minutes. That puts a limit on the level of detail that it can contain.

More detailed descriptions of the model should, in a CD context, exist as automated tests. These tests should assert anything and everything important about your system: that the system performs correctly from a functional perspective; that it exhibits the desired performance characteristics; that it is secure, scalable, resilient and conforms to your architectural constraints – whatever matters in your application. These are our executable specifications of the behaviour of the system.
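As one deliberately simplified illustration, a performance characteristic can be asserted just like functional behaviour. The priceQuote method below is a stand-in, and a real performance test would warm up the JVM, measure many iterations and set a generous budget so that the result stays deterministic – but the point is that the specification is executable.

```java
import static org.junit.Assert.assertTrue;

import org.junit.Test;

// A simplified sketch of a non-functional, executable specification:
// "pricing a quote must complete within its latency budget". The priceQuote
// method is a stand-in for real work; a serious performance test would warm
// up and measure many iterations rather than a single call.
public class PricingLatencyTest {

    private long priceQuote() {
        long result = 0;
        for (int i = 0; i < 10_000; i++) {
            result += i;   // stand-in for real pricing work
        }
        return result;
    }

    @Test
    public void pricingAQuoteStaysWithinItsLatencyBudget() {
        long start = System.nanoTime();
        priceQuote();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;

        assertTrue("Pricing took " + elapsedMillis + "ms, budget is 50ms", elapsedMillis <= 50);
    }
}
```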

In a big system, the Whiteboard Model probably identifies the principal services that collaborate to form the system. If it is too detailed it won’t serve its purpose. This model needs to be in the heads of the development team. It needs to be a working tool, not an academic exercise in system design.

I also like creating little throwaway models for tricky parts of the system that I am working on. These can be simple notes on bits of scrap paper that last a few minutes to explore ideas with your pair, or they can be longer-lived models that help you to organise a collection of stories that, over time, build up a more complex set of behaviours. I once worked as part of a team that designed and built a complex, high-performance, reliable messaging system using these incremental techniques. We created a model of the scenarios that we thought our reliable messaging system would need to cope with. It took several of us several hours around a whiteboard to come up with the first version. It looked a bit like a couple of robots, so it was ever after referred to as the “Dancing Robot Model”. Often in stand-ups you would hear comments like “I am working on Robot 1’s left leg today” 😉

Over the years I think that I have seen two common failure modes with modelling. The first is over-modelling. For me, trying to express the detail of an algorithm, or all of the methods and attributes of a class, is a waste of time and effort. Code is a much more efficient, much more readable model of structure at this level of detail. Models should be high-level abstractions of ideas that inform decision making; striving for detail or some level of formal completeness is a big mistake.

The second failure is a lack of modelling. I think that the widespread adoption of agile approaches to development has made this failure mode even more common. I often interview developers and commonly ask them to describe a system that they have worked on. It is surprising how many of them describe systems with no obvious organising principles, certainly nothing like a whiteboard model.

There is a third failure, but it is so heinous that I shudder to mention it – lack of tests! Of course, once our models exist in code, they should be validated by automated tests!

The Quality of Design

Personally I believe that an iterative approach is a more natural approach to problem solving. The key is to think about the qualities of good design. Good designs have a clean separation of concerns, are modular, abstract the problem and are loosely coupled. All of these properties are distinct from the technology. This is just as true of the design of a database schema or an Ant script as it is of code written in an OO or functional language. These things matter more than the technology; they are more fundamental.

One of the reasons why these things are so important is that they allow us to make mistakes. When you write code that has a clean separation of concerns, it is easier to change your mind when you learn something new. If the code that places orders or manages accounts is confused with the code that stores them in a database, communicates them across a network or presents them in a user interface, then it will be tough to change things when your assumptions change.
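Here is a small, hypothetical sketch of what that separation looks like in code: the order-placing logic talks to an interface, so decisions about how orders are stored (or transmitted, or displayed) can change without touching the business rule. The class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of separation of concerns: the business rule for placing
// an order knows nothing about databases, networks or user interfaces.
public class SeparationOfConcerns {

    interface OrderRepository {
        void save(String orderId, int quantity);
    }

    static class OrderPlacement {
        private final OrderRepository repository;

        OrderPlacement(OrderRepository repository) {
            this.repository = repository;
        }

        void placeOrder(String orderId, int quantity) {
            if (quantity <= 0) {
                throw new IllegalArgumentException("quantity must be positive");
            }
            repository.save(orderId, quantity);   // how it is stored is someone else's concern
        }
    }

    // Today an in-memory store; tomorrow a relational database or a message -
    // the OrderPlacement class never changes.
    static class InMemoryOrderRepository implements OrderRepository {
        final List<String> saved = new ArrayList<>();

        @Override
        public void save(String orderId, int quantity) {
            saved.add(orderId + " x" + quantity);
        }
    }

    public static void main(String[] args) {
        InMemoryOrderRepository store = new InMemoryOrderRepository();
        new OrderPlacement(store).placeOrder("order-7", 2);
        System.out.println(store.saved);
    }
}
```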

I once worked on a project to implement a high-performance financial exchange. We had a pretty clean codebase, with very good separation of concerns. In my time there we changed the messaging system by which services communicated several times, growing it from a trivial first version that sent XML over HTTP into a world-class, reliable, asynchronous binary messaging system. We started with XML over HTTP not because we thought it would ever suffice, but because I already had some code that did this from a pet project of mine. It was poor in terms of performance, but it had the separation of concerns that we needed: transport, messaging, persistence and logic were all independent and pluggable. So we could start with something simple and ready to go, to get the bare bones of our services in place and communicating. We then evolved the messaging system as we needed to, replacing the transport at one point, the message protocol at another, and so on.

The messaging wasn’t the only thing that evolved; every aspect of the system evolved over time, at any given point in its life doing just enough to fulfil all of its requirements. You can’t make this kind of evolutionary change without good-quality automated testing, but when you have that insurance, any aspect of the system is amenable to change. As well as the messaging, we replaced our relational database vendor and we changed the technology that rendered our UI several times. At one point we even made a dramatic change in the presentation of our system’s UI, moving from a tabular, data-entry-style user interface to a graph-based, point-and-click trading interface. All this while keeping to our regular release schedule, releasing other new features in parallel and, vitally, keeping all the tests running and passing.

Worrying about good separation of concerns, modularity and isolation is important.

The secret to incremental design… it is simply good design!

Posted in Agile Development, Continuous Delivery, Software Architecture, Software Design

Is Continuous Delivery Riskier?

I read an article in CIO magazine today, “Three Views on Continuous Delivery”. Of course the idea of such an article is to present differing viewpoints, and I have no need, or even desire, for the world to see everything from my perspective.
However, the contrary view, expressed by Jonathan Mitchell, seemed to miss one of the most important reasons that we do Continuous Delivery. He assumes that CD means increased risk. Sorry Jonathan, but I think that this is a poorly informed misunderstanding of what CD is really about. From my perspective, certainly in the industries where I have worked of late, the reverse is true: CD is very focussed on reducing risk.
Here is my response to the article:
If Continuous Delivery “strikes fear into the heart of CIOs” they are missing some important points.
Continuous Delivery is the process that allows organisations to “maintain the reliability of the complex, kaleidoscope of interrelated software in their estate”; it doesn’t increase risk, it reduces it significantly. There is quite a lot of data to support this argument.
Amazon adopted a Continuous Deployment strategy a few years ago; they currently release into production once every 11.6 seconds. Since adopting this approach they have seen a 75% reduction in outages triggered by deployment and a 90% reduction in outage minutes. (Continuous Deployment is subtly different from Continuous Delivery in that releases are automatically pushed into production when all tests pass; in Continuous Delivery the release is a human decision.)
I think that it is understandable that this idea makes Jonathan Mitchell nervous; it does seem counter-intuitive to release more frequently. However, what really happens when you do this is that it shines a light on all the places where your process and technology are weak, and helps you to strengthen them.
Continuous Delivery is a high-discipline approach. It requires more rigour not less. Continuous Delivery requires significant cultural changes within the development teams that adopt it and beyond. It is not a simple process to adopt, but the rewards are enormous. Continuous Delivery changes the economics of software development.
Continuous Delivery is the process that is chosen by some of the most successful companies in the world. This is not an accident. In my opinion Continuous Delivery finally delivers on the promise of software development. What businesses want from software is to quickly and efficiently put new ideas before their users. Continuous Delivery is built on that premise.
Posted in Agile Development, Continuous Delivery, DevOps

Incremental Design – Part I

Continuous Delivery is all about making small changes. Work flows more easily, planning is simpler, error detection is easier and the time from idea to value is reduced when we make changes in small increments. But how do you solve big problems in small pieces? How do you maintain a coherent design when each change is small?

There are many facets to this problem. The organisations that are best at it tend to take a very broad view of design and think about things like how teams are structured, how work is defined and how design works when delivering a flow of many small features.

The traditional approach – “let’s think very hard and design everything in detail before we start” – seems, on the face of it, sensible, but it is not. Big-up-front design has a long history of not working very well, principally because at the point when you do the design you know the least about the problem that you ever will.

For a while we tried something else…

“We’re Agile We Don’t Need Your Stinking Design”(1)

In the early days of agile adoption there were some very naive things said about software design. This was largely a reaction to the failures of big-up-front design. Some of the advice used to remind me of the “President of the Universe” from Douglas Adams’ “The Restaurant at the End of the Universe”. In this excellent book, the advanced civilisations of the universe had decided that the desire to become a politician should exclude anyone from becoming one. They had taken this to its logical conclusion and decided that the most senior person should have no pre-conceived ideas about anything. So, the President of the Universe lived in a shack on a beach and started every day making no assumptions at all. On waking he would wonder at the big hot thing in the sky, muse on the nature of existence and so on. This left no time at all for any decision making – ideal for a politician, not quite so good for a software developer.

Some people advised a similar approach to design. We should dump our assumptions and previous experience, and instead go straight to code and not think about design at all. If you weren’t sure what to do next, write a test!

Now there are few bigger fans of TDD than me, but I have always thought that this fear of design was stupid.

My own view is that design is all that we, as software developers, do. There is no distinction between coding and design: coding *is* design, but it is not all that there is to it. Software development is an intriguing, difficult exercise in design and we should use all of the tools and experience at our disposal to accomplish it. We should also value good design and strive for elegance and simplicity in all that we do.

A More Experimental Approach

If you are building a large complex system, or indeed any system, design is important, but we have learned through painful experience that trying to do all of the design up-front doesn’t work. Does this mean that we have reached an impasse?

Not really! Agile development practices are all about being experimental. So we should be experimental in our approach to design. If we want to be experimental what will that take?

Perhaps the most important thing is to allow for failure. By definition not all experiments succeed.

This need to allow for failure has several implications:

  • If a design choice fails, how do we limit the impact of this failure?
  • If we expect some of our design choices not to work, how do we tell?
  • How do we organise our work to allow us to build upon what we have learned from our experiments?

I am going to suggest approaches to dealing with each of these in subsequent blog posts. As a teaser, here are some of the factors that I think are important in enabling a more incremental approach to design:

Team structure, Business-Alignment, Modelling and Models, Bounded Context, Isolation, Reactive Architectures, Loose-Coupling and Separation of concerns – there may be more 😉

(1) A quote from Andrew Phillips of XebiaLabs – with tongue firmly in cheek.

Posted in Agile Development, Continuous Delivery, Software Architecture, Software Design

New Job, New Business

I have very recently given up my job at KCG (formerly Getco) Ltd. KCG have been a very good employer and I thank them for the opportunities that they gave me.
The reason that I left is, I hope, understandable. I have finally, after long nagging from my friends and relatives, decided to try independence. This is a very exciting time for me, and I thank my friends and relatives for giving me the push that I needed ;-)
I have set up my own consulting company, called “Continuous Delivery Ltd.” – what else? I intend to offer advice to companies travelling down, or embarking on, the complex road to Continuous Delivery. I also plan to work on some software that I think is missing from the Continuous Delivery tool-set – I have to feed my coding habit somehow (stay tuned).
I hope that this will, if anything, give me a bit more time for writing blog entries here. I have a LOT to say that I haven’t got around to yet.
If you are curious, you can read more about my new venture at my company website http://www.continuous-delivery.co.uk/
Naturally, if you feel that my services can be of any help, please get in touch.
Posted in Personal News

Cargo-cult DevOps

My next blog post in the XebiaLabs “CD Master Series” is now available.
DevOps is a very successful meme in our industry. Most organisations these days seem to be saying that they aspire to it, though they don’t necessarily know what it is.
I confess that I have a slight problem with DevOps. Don’t get me wrong, DevOps and Continuous Delivery share some fundamental values. I see the people that promote DevOps as allies in the cause of making software development better. We are on the same team.
At its most simple level I don’t like the name ‘DevOps.’ It implies that fixing that one problem, the traditional barrier between Dev and Ops, is enough to achieve software nirvana. Fixing the relationship between Dev and Ops is not a silver bullet…
Posted in Agile Development, Continuous Delivery, DevOps

The Reactive Manifesto

Over the past couple of months I have been helping out some friends to update the Reactive Manifesto.
There are several reasons why I agreed to help. First, I was asked to by my old friend Martin Thompson. The most important reason, though, is that I think this is an important idea.
The Reactive Manifesto starts from a simple thought. 21st Century problems are not well-served by 20th Century assumptions of software architecture. The game is moving on!
There are lots of reasons for this: the problems that we are asked to tackle are growing in scale, and sometimes in complexity too; the demands of our users are changing; the hardware environment has changed, and continues to change; and the rate of change in our best businesses is increasing.
Talk to any of my friends and they will, no doubt, tell you that I am a bore on the topic of software design – as well as on several other subjects ;-)
I think that we, as an industry, don’t spend enough time thinking about the design of our solutions. Too often we start our projects by saying “I have my language installed, my web server, I have Spring, Hibernate, Ruby-on-Rails, <insert your favourite framework here> and my database ready to go – now, what is the problem?”. We have become lazy and look for cookie-cutter solutions. We then proceed to write code in straight lines – poor abstraction, little modelling, rotten separation of concerns. Where is the fun in any of that?
I get genuine pleasure from creating solutions to problems, but I don’t get pleasure from just any old solution. Code that only does the job is not enough for me. I want to do the job with as few instructions, and as little duplication, as possible. I want the systems that I write to be efficient, readable, testable, flexible, easy to maintain, high-quality – dare I say elegant!
I have been lucky enough to work on a few systems that looked like this. Do you know what? When we achieve those things we are also more efficient and more cost-effective as developers. The software that we create is more efficient too: it runs faster, does more with fewer instructions and is more flexible. This is not over-engineering; this is professionalism.
Interestingly, there are sometimes similarities in the coarse-grained architecture of such systems, at least in the larger ones that I have worked on. They are loosely coupled, based on services that implement specific bounded contexts within the problem domain and that communicate with each other only via asynchronous messaging. These systems almost never look like the standard, out-of-the-box three-layer architecture built on top of a relational database, although pieces of them may use some of the standard technologies, including an RDBMS.
The hardware environment in which our software executes is changing. The difference in the cost per byte between RAM and disk is reducing. The capacity of RAM is increasing dramatically. Distributed programming is the norm now, and the relative performance of some of our hardware infrastructure has changed (e.g. the network is now faster than disk). Large-scale non-volatile RAM is on the horizon. All this means that the assumptions that underpinned the ‘standard approach’ have changed. The old assumptions match neither the hardware environment nor the problems that we are solving.
The Reactive Manifesto is about discarding some of those assumptions: about modelling the problems in our problem domain more effectively, and writing code that is easier to test, more efficient to run, easier to distribute and dramatically more flexible in use.
Take a look at the Reactive Manifesto. If you think we are right, please sign it – more than 8,000 other people already have. If you think we are wrong, tell us where.
Most importantly of all, please don’t assume that the same old way of doing things is the best approach to every problem.
Posted in High Performance Computing, LMAX, Microservices, Reactive Systems, Software Architecture, Software Design

Strategies for effective Acceptance Testing – Part II

The second part of my blog post on effective Acceptance Testing is now available on the XebiaLabs website…
In my last blog post I described the characteristics of good Acceptance Tests and how I tend to use a Domain-Specific-Language-based approach to defining a language in which to specify my Acceptance Test cases. This time I’d like to describe each of these desirable characteristics in a bit more detail and to reinforce why DSLs help.
To refresh your memory, here are the desirable characteristics:
  • Relevance – A good Acceptance test should assert behaviour from the perspective of some user of the System Under Test (SUT).
  • Reliability / Repeatability – The test should give consistent, repeatable results.
  • Isolation – The test should work in isolation and not depend, or be affected by, the results of other tests or other systems.
  • Ease of Development – We want to create lots of tests, so they should be as easy as possible to write.
  • Ease of Maintenance – When we change code that breaks tests, we want to home in on the problem and fix it quickly.
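To give a flavour of what the DSL-based approach can look like, here is a small, self-contained sketch in Java. The TradingDsl class and its method names are invented for illustration; in a real test the DSL would hide the machinery that drives the System Under Test, while the test itself reads in the language of the problem domain.

```java
import static org.junit.Assert.assertTrue;

import org.junit.Test;

// An illustrative, invented trading DSL: the test reads in domain language,
// while the DSL class hides how the System Under Test is driven (here just an
// in-memory stub so the example is self-contained and runnable).
public class PlaceOrderAcceptanceTest {

    static class TradingDsl {
        private long placedQuantity;
        private long filledQuantity;

        void placeOrder(String instrument, long quantity, double price) {
            placedQuantity = quantity;                 // a real DSL would drive the SUT here
        }

        void haveCounterpartySell(String instrument, long quantity, double price) {
            filledQuantity = Math.min(placedQuantity, quantity);
        }

        boolean orderWasFilled(long quantity) {
            return filledQuantity == quantity;
        }
    }

    @Test
    public void aMatchingSellOrderFillsTheBuyOrder() {
        TradingDsl trading = new TradingDsl();

        trading.placeOrder("ACME", 10, 250.0);
        trading.haveCounterpartySell("ACME", 10, 250.0);

        assertTrue(trading.orderWasFilled(10));
    }
}
```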
Posted in Acceptance Testing, Continuous Delivery, TDD

Sorry to any real readers…

A few weeks ago I switched on the feature in WordPress that allows users to sign-up for notifications when I write a new post.
If you are a real person who signed up, I am very sorry but I am going to turn that feature off again.
Since turning it on I have been constantly spammed with bogus sign-ups from clearly made-up email accounts. I am not sure what the intent of the attack is, or what these people hope to gain from it, but it is chewing up storage and cycles to no advantage.
If you want notifications, you can follow me on Twitter at @davefarley77 and I will tweet when there is a new post.
Very sorry!
Posted in Blog Housekeeping

Strategies for effective Acceptance Testing

My second guest blog post for XebiaLabs is the first of two parts. It is on the topic of “Strategies for Effective Acceptance Testing”.
“Automated testing is at the heart of any good Continuous Delivery process and I see automated Acceptance Testing as being one of the foundations of any effective testing strategy.
In my book ‘Continuous Delivery’ we defined Acceptance Testing as asserting that the code ‘did what the business wanted it to do’. The distinction that we made is between that and unit-test-based TDD, which is really focused on asserting that the code does what the developer thinks it should. Both of these perspectives are important, and complement one another, but for the rest of this post I want to talk about Acceptance Testing.
Good Acceptance Tests are hard to get right, but there are a few tricks that make it easier…”
Posted in Acceptance Testing, Continuous Delivery, External Post, TDD