Why Is Test Automation the Backbone of Continuous Delivery?

The path to continuous delivery leads through automation

Software testing and verification needs a careful and diligent process of impersonating an end user, trying various usages and input scenarios, comparing and asserting expected behaviours. Directly, the words “careful and diligent” invoke the idea of letting a computer program do the job. Automating certain programmable aspects of your test suite thus can help software delivery massively. In most of the projects that I have worked on, there were aspects of testing which could be automated, and then there were some that couldn’t. Nonetheless, my teams could rely heavily on our automation suite when we had one, and expend our energies manually testing aspects of the application we could not cover with automated functional tests. Also, automating tests helped us immensely to meet customer demands for quick changes, and subsequently reaching a stage where every build, even ones with very small changes, went out tested and verified from our stable. As Jez Humble rightly says in his excellent blog about continuous delivery, automated tests “take delivery teams beyond basic continuous integration” and on to the path of continuous delivery. In fact, I believe they are of such paramount importance, that to prepare yourself for continuous delivery, you must invest in automation. In this text, I explain why I believe so.

How much does it cost to make one small change to production?

As the complexity of software grows, the amount of effort verifying changes, as well as features already built, grows at least linearly. This means that testing time is directly proportional to the number of test cases needed to verify correctness. Thus, adding new features means that testing either increases the time it takes a team to deliver software from the time development is complete, or it adds cost of delivery if the team adds more testers to cover the increased work (assuming all testing tasks are independent of each other). A lot of teams — and I have worked with some — tackle this by keeping a pool of testers working on “regression” suites throughout the length of a release, determining whether new changes break already built functionality. This is not only costly, its ineffective, slow and error prone.

Automating test scenarios where you can lets you cut the time/money it takes to verify if a user’s interaction with the application works as designed. At this point, let us assume that a reasonable number of your test scenarios can be automated — say 50% — as this is often the lowest bound in software projects. If your team can and does automate this set to a certain number of repeatable tests, it frees up people to concentrate more on immediate changes. Also, let’s suppose that it takes as much as three hours to run your tests (it should take as little as possible — less than 20 minutes even). This directly impacts the amount of time it takes to push a build out to customers. By increasing the number of automated tests, and also investing in getting the test-run time down, your agility and ability to respond increases massively, while also reducing the cost. I explain this with some very simple numbers (taking an average case) below:

Team A

1. Number of scenarios to test: 500 and growing.

2. Time to setup environment for a build: 10 minutes.

3. Time to test one scenario: 10 minutes.

4. Number of testers on your team: 5.

5. Assume that there are no blockers.

If you were to have no automated tests, the amount of time it would take to test one single check-in (in minutes), is:

10 + (500*10)/5 = 1010 minutes.

This is close to two working days (standard eight hours each). Not only is this costly, it means that developers get feedback two days later. This kind of a setup further encourages
mini-waterfalls in your iteration.

Team B

Same as Team A, but we’ve automated 50% (250 test cases) of our suite. Also, assume that running these 250 test cases takes a whopping three hours to complete.

Now, the amount of time it would take to test one single check-in (in minutes), is:

task 1 (manual): 10 + (250*10)/5 = 510 minutes.

task 2 (automated): 10 + 180 minutes.

This is close to one working day. This is not ideal, but just to prove the fact about reduced cost, we turned around the build one day earlier. We halved the cost of testing. We also covered 50% of our cases in three hours.

Now to a more ideal and (yet) achievable case:

Team C

Same as Team B, but we threw in some good hardware to run the tests faster (say 20 minutes), and automated a good 80% of our tests (10% cannot be automated and 10% is new
functionality).

Now, the amount of time it would take to test one single check-in (in minutes), is:

task 1 (manual): 10 + (100*10)/5 = 210 minutes.

task 2 (automated): 10 + 20 minutes = 30 minutes.

So in effect, we cover 80% of our tests in 30 minutes, and overall take 3.5 hours to turn around a build. Moreover, we’ve increased the probability of finding a blocker earlier (by covering the vast bulk of our cases in 30 minutes), meaning that we can suspend further manual testing if we need to. Our costs are lower, we get feedback faster. This changes the game quite a bit, doesn’t it?

Impossibility of verification on time

Team A that I mentioned above would need 50 testers to
certify a build in under two hours. That cost is, not surprisingly, unattractive to customers. In most cases, without automation, it is almost impossible to turn around a build from development to delivery within a day. I say almost impossible, as this would prove to be extremely costly in cases where it is. So, assuming that my team doesn’t automate and hasn’t got an infinite amount of money, every time a developer on the team checks in one line of code, our time to verify a build completely increases by hours and days. This discourages a manager from scheduling running these tests every time on every build, which consequently decreases the quality of coverage for builds, and ups the amount of time bugs stay in the system. It also, in some cases I have experienced, disincentivizes frequent checking in of code, which is not healthy.

Early and often feedback

One of the most important aspects of automation is the quick feedback that a team gets from a build process. Every check-in is tested without prejudice, and the team gets a report card as soon as it can. Getting quicker feedback means that less code gets built on top of buggy code, which in turn increases the credibility of the software. To extend the example of teams A, B and C above:

For Team A: The probability of finding a blocker on day one is 1/2. Which basically means that there is a good risk of finding a bug on the second day of testing, which completely lays the first days of work to waste. That blocker would need to be fixed, and all the tests re-verified. In the worst case, a bug is found after two days of an inclement line of code getting checked-in.

For Team B: The worst case is that you find a blocker in the last few hours of the day. This is still much better than for Team A. Better still, as 50% of test cases are automated, the chance of finding a blocker within three hours is very high (50%). This quick feedback lets you find and fix issues faster, and therefore respond to customer requests very quickly.

For Team C: The best case of all three. The worst-case scenario is that Team C will know after three hours if they checked-in a blocker. As 80% of test cases are automated, by 20 minutes, they would know that they made a mistake. They have come a long way from where Team A is — 20 minutes is way better than two days!

Opportunity cost

Economists use an apt term – opportunity cost – to define what is lost if one choice amongst many is taken. The opportunity cost of re-verifying tedious test cases build after build is the loss of time spent on exploratory testing. More often than not, a bug leads to many, but by concentrating on manual scenarios, and while catching up to do so, testers hardly find any time to create new scenarios and follow up on issues. Not only this, it is a given that by concentrating on regression tests all the time, testers spend proportionately less time on newer features, where there is a higher probability of bugs to be found. By automating as much as possible, a team can free up testers to be more creative and explore an application from the “human angle” and thus increase the depth of coverage and quality. On projects I have worked on, whenever we have had automated tests aiding manual testing, I have noticed better and more in-depth testing which, has resulted in better quality.

Another disadvantage to manual testing is that it involves tedious re-verification of the same cases day after day. Even if managers are creative and distribute tests to different people every day, the cycle inadvertently repeats after a short period of time. Testers have less time to be creative, and therefore their jobs less gratifying. Testers are creative beings and their forte is to act as end-users and find new ways to test and break an application, not in repeating a set process time after time. Without automation, the opportunity cost in terms of keeping and satisfying the best testers around is enormous.

Error prone human behavior

Believe it or not, even the best of us are prone to making mistakes doing our day to day jobs. Given how good or bad we are it, the probability of making a mistake while working is higher or lower, but mostly a number greater than zero. It is important to keep this risk in mind while ascertaining the quality of a build. Indeed, human errors lead to a majority of bugs in software applications — errors that may occur during development and/or testing.

Computers are extremely efficient at doing repetitive tasks. They are diligent and careful, which makes automation a risk mitigation strategy.

Tests as executable documentation

Test scenarios provide an excellent source of knowledge about the state of an application. Manual test results provide a good view of what an application can do for an end user, and also tell the development team about quirky components in their code. There are two components to documenting test results – showing what an application can do and, upon failures, documenting what fails and how, so its easy to manage application abnormalities. If testers are diligent and make sure they keep their documentation up to date (another overhead for them), it is possible to know the state of play through a glance at test results. The amount of work increases drastically with failures, as testers then need to document each step, take screenshots, maybe even videos of crash situations. Adding the time spent on these increases the cost of making changes; in fact, in a way, the added cost disincentivizes documenting the state with every release.

With automated tests, and by choosing the right tools, the process of documenting the state of an application becomes a very low-cost affair. Automated testing tools provide a very good way of executing tests, collating results in categories, and publishing results to a web page, and also let you visualize test result data to monitor progress and get relevant feedback from the tests. With tools like Twist, Concordian, Cucumber and the lot, it becomes really easy to show your test results, even authoring, to your customers, and this reduces the losses in translation, with the added benefit of the customer getting more involved in the application’s development. For failures, a multitude of testing tools automate the process of taking screenshots, even videos, to document failures and errors in a more meaningful way. Results could be mailed to people, much better served as RSS feeds per build to interested parties.

Technology facing tests

Testing non-functional aspects of an application – like testing application performance upon a user action, testing latency over a network and its effect on an end-user’s interaction with the application, etc. — have traditionally been partially automated (although, very early during my work life, I have sat with a stop watch in hand to test performance — low-fi but effective!) It is easy to take advantage of automated tests and reuse them to test such non-functional aspects. For example, running an automated functional test over a number of times can tell you the average performance of an action on your web-page. The model is easy to set up: put a number of your automated functional tests inside a chosen framework that lets you set up and probe non-functional properties while the tests are run. Testing and monitoring aspects like role-based security, effects of latency, query performance, etc., can all be automated by reusing an existing set of automated tests — an added benefit.

Conclusion

On your journey to Continuous Delivery, you have to take many steps, both small and large. My understanding and suggestion would be to start small, with a good investment in a robust automation suite, give it your best people, cultivate habits in your team that respect tests and results, build this backbone first, and then off you go. Have a smooth ride!
This article originally appeared in blog.ranjansakalley.com.

TAG: continuous delivery Ranjan Sakalley Test Automation

Ranjan Sakalley is a lead developer & software architect with ThoughtWorks who “likes writing code and working with great people”. In his career he has worn varied hats, and in particular enjoys being an agile coach and project manager.His interests include software architecture, leading teams to deliver better, being a hands-on lead, C#, Java, Ruby, javascript, Agile, XP, TDD, Story analysis, and Continuous Delivery, among others.