If you whish to discover a way how to turn negative vibes between testers and developers into something positive – here is a great solution for that. The thing I like to introduce is quite old but even today in our brave new DevOps world an evergreen.
Many years ago in the world wide web I stumbled over a PDF called Bug Fix Bingo. A nice funny game for IT professionals. This little funny game originally was invent by the software testing firm K. J. Ross & Associates. Unfortunately the original site disappeared long agoso I decided to conserve this great idea in this blog post.
I can recommend this game also for folks they are not so deep into testing, but have to participate in a lot of IT meetings. Just print the file, bring some copies to your next meeting and enjoy whats gonna happen. I did it several times. Beside the fun we had it changed something. So let’s have a look into the concept and rules.
Bug Fix Bingo is based on a traditional Bingo just with a few adaptions. Everyone can join the game easily without a big preparation, because its really simple. Instead of numbers the Bingo uses statements from developers in defect review meetings to mark off squares.
Rules:
Bingo squares are marked off when a developer makes the matching statement during bug fix sessions.
Testers must call “Bingo” immediately upon completing a line of 5 squares either horizontally, vertically or diagonally.
Statements that arise as result of a bug that later becomes “deferred”, “as designed”, or “not to fixed” should be classified as not marked.
Bugs that are not reported in an incident report can not be used.
Statements should also be recorded against the bug in the defect tracking system for later confirmation.
Any tester marks off all 25 statements should be awarded 2 weeks stress leave immediately.
Any developer found using all 25 statements should be seconded into the test group for a period of no less than 6 months for re-education.
“It works on my machine.”
“Where were you when the program blew up?”
“Why do you want to do it in that way?”
“You can’t use that version on your system.”
“Even thought it doesn’t work, how does it feel.”
“Did you check for a virus on your system?”
“Somebody must have changed my code.”
“It works, but it hasn’t been tested.”
“THIS can’t be the source of that module in weeks!”
“I can’t test anything!”
“It’s just some unlucky coincidence.”
“You must have the wrong version.”
“I haven’t touched that module in weeks.”
“There is something funky in your data.”
“What did you type in wrong to get it to crash?”
“It must be a hardware problem.”
“How is that possible?”
“It worked yesterday.”
“It’s never done that before.”
“That’s weird …”
“That’s scheduled to be fixed in the next release.”
“Yes, we knew that would happen.”
“Maybe we just don’t support that platform.”
“It’s a feature. We just haven’t updated the specs.”
“Surly nobody is going to use the program like that.”
Incidentally, developers have a game like this too. They score points every time a QA person tries to raise a defect on functionality that is working as specified.
For your pleasure I place the original file of the Bug Fix Bingo here, so you can download and print it out for your next meeting. Comments about your experiences playing this game are very welcomed and feel free to share if you like this post. Stay updated and subscribe my newsletter.
By experience, most of us know how difficult it is to express what we mean talking about quality. Why is that so? There exist many different views on quality and every one of them has its importance. What has to be defined for our project is something that fits its needs and works with the budget. Trying to reach perfectionism can be counterproductive if a project is to be terminated successfully. We will start based on a research paper written by B. W. Boehm in 1976 called “Quantitative evaluation of software quality.” Boehm highlights the different aspects of software quality and the right context. Let’s have a look more deeply into this topic.
When we discuss quality, we should focus on three topics: code structure, implementation correctness, and maintainability. Many managers just care about the first two aspects, but not about maintenance. This is dangerous because enterprises will not invest in individual development just to use the application for only a few years. Depending on the complexity of the application the price for creation could reach hundreds of thousands of dollars. Then it’s understandable that the expected business value of such activities is often highly estimated. A lifetime of 10 years and more in production is very typical. To keep the benefits, adaptions will be mandatory. That implies also a strong focus on maintenance. Clean code doesn’t mean your application can simply change. A very easily understandable article that touches on this topic is written by Dan Abramov. Before we go further on how maintenance could be defined we will discuss the first point: the structure.
Scaffolding Your Project
An often underestimated aspect in development divisions is a missing standard for project structures. A fixed definition of where files have to be placed helps team members find points of interests quickly. Such a meta-structure for Java projects is defined by the build tool Maven. More than a decade ago, companies tested Maven and readily adopted the tool to their established folder structure used in the projects. This resulted in heavy maintenance tasks, given the reason that more and more infrastructure tools for software development were being used. Those tools operate on the standard that Maven defines, meaning that every customization affects the success of integrating new tools or exchanging an existing tool for another.
Another aspect to look at is the company-wide defined META architecture. When possible, every project should follow the same META architecture. This will reduce the time it takes a new developer to join an existing team and catch up with its productivity. This META architecture has to be open for adoptions which can be reached by two simple steps:
Don’t be concerned with too many details;
Follow the KISS (Keep it simple, stupid.) principle.
A classical pattern that violates the KISS principle is when standards heavily got customized. A very good example of the effects of strong customization is described by George Schlossnagle in his book “Advanced PHP Programming.” In chapter 21 he explains the problems created for the team when adopting the original PHP core and not following the recommended way via extensions. This resulted in the effect that every update of the PHP version had to be manually manipulated to include its own development adaptations to the core. In conjunction, structure, architecture, and KISS already define three quality gates, which are easy to implement.
The open-source project TP-CORE, hosted on GitHub, concerns itself with the afore-mentioned structure, architecture, and KISS. There you can find their approach on how to put it in practice. This small Java library rigidly defined the Maven convention with his directory structure. For fast compatibility detection, releases are defined by semantic versioning. The layer structure was chosen as its architecture and is fully described here. Examination of their main architectural decisions concludes as follows:
Each layer is defined by his own package and the files following also a strict rule. No special PRE or POST-fix is used. The functionality Logger, for example, is declared by an interface called Logger and the corresponding implementation LogbackLogger. The API interfaces can detect in the package “business” and the implementation classes located in the package “application.” Naming like ILogger and LoggerImpl should be avoided. Imagine a project that was started 10 years ago and the LoggerImpl was based on Log4J. Now a new requirement arises, and the log level needs to be updated during run time. To solve this challenge, the Log4J library could be replaced with Logback. Now it is understandable why it is a good idea to name the implementation class like the interface, combined with the implementation detail: it makes maintenance much easier! Equal conventions can also be found within the Java standard API. The interface List is implemented by an ArrayList. Obviously, again the interface is not labeled as something like IList and the implementation not as ListImpl .
Summarizing this short paragraph, a full measurement rule set was defined to describe our understanding of structural quality. By experience, this description should be short. If other people can easily comprehend your intentions, they willingly accept your guidance, deferring to your knowledge. In addition, the architect will be much faster in detecting rule violations.
Measure Your Success
The most difficult part is to keep a clean code. Some advice is not bad per se, but in the context of your project, may not prove as useful. In my opinion, the most important rule would be to always activate the compiler warning, no matter which programming language you use! All compiler warnings will have to be resolved when a release is prepared. Companies dealing with critical software, like NASA, strictly apply this rule in their projects resulting in utter success.
Coding conventions about naming, line length, and API documentation, like JavaDoc, can be simply defined and observed by tools like Checkstyle. This process can run fully automated during your build. Be careful; even if the code checkers pass without warnings, this does not mean that everything is working optimally. JavaDoc, for example, is problematic. With an automated Checkstyle, it can be assured that this API documentation exists, although we have no idea about the quality of those descriptions.
There should be no need to discuss the benefits of testing in this case; let us rather take a walkthrough of test coverage. The industry standard of 85% of covered code in test cases should be followed because coverage at less than 85% will not reach the complex parts of your application. 100% coverage just burns down your budget fast without resulting in higher benefits. A prime example of this is the TP-CORE project, whose test coverage is mostly between 92% to 95%. This was done to see real possibilities.
As already explained, the business layer contains just interfaces, defining the API. This layer is explicitly excluded from the coverage checks. Another package is called internal and it contains hidden implementations, like the SAX DocumentHandler. Because of the dependencies the DocumentHandler is bound to, it is very difficult to test this class directly, even with Mocks. This is unproblematic given that the purpose of this class is only for internal usage. In addition, the class is implicitly tested by the implementation using the DocumentHandler. To reach higher coverage, it also could be an option to exclude all internal implementations from checks. But it is always a good idea to observe the implicit coverage of those classes to detect aspects you may be unaware of.
Besides the low-level unit tests, automated acceptance tests should also be run. Paying close attention to these points may avoid a variety of problems. But never trust those fully automated checks blindly! Regularly repeated manual code inspections will always be mandatory, especially when working with external vendors. In our talk at JCON 2019, we demonstrated how simply test coverage could be faked. To detect other vulnerabilities you can additionally run checkers like SpotBugs and others more.
Tests don’t indicate that an application is free of failures, but they indicate a defined behavior for implemented functionality.
For a while now, SCM suites like GitLab or Microsoft Azure support pull requests, introduced long ago in GitHub. Those workflows are nothing new; IBM Synergy used to apply the same technique. A Build Manager was responsible to merge the developers’ changes into the codebase. In a rapid manner, all the revisions performed by the developer are just added into the repository by the Build Manager, who does not hold a sufficiently profound knowledge to decide about the implementation quality. It was the usual practice to simply secure that the build is not broken and always the compile produce an artifact.
Enterprises have discovered this as a new strategy to handle pull requests. Now, managers often make the decision to use pull requests as a quality gate. In my personal experience, this slows down productivity because it takes time until the changes are available in the codebase. Understanding of the branch and merge mechanism helps you to decide for a simpler branch model, like release branch lines. On those branches tools like SonarQube operate to observe the overall quality goal.
If a project needs an orchestrated build, with a defined order how artifacts have to create, you have a strong hint for a refactoring.
The coupling between classes and modules is often underestimated. It is very difficult to have an automated visualization for the bindings of modules. You will find out very fast the effect it has when a light coupling is violated because of an increment of complexity in your build logic.
Repeat Your Success
Rest assured, changes will happen! It is a challenge to keep your application open for adjustments. Several of the previous recommendations have implicit effects on future maintenance. A good source quality simplifies the endeavor of being prepared. But there is no guarantee. In the worst cases the end of the product lifecycle, EOL is reached, when mandatory improvements or changes cannot be realized anymore because of an eroded code base, for example.
As already mentioned, light coupling brings with it numerous benefits with respect to maintenance and reutilization. To reach this goal is not that difficult as it might look. In the first place, try to avoid as much as possible the inclusion of third-party libraries. Just to check if a String is empty or NULL it is unnecessary to depend on an external library. These few lines are fast done by oneself. A second important point to be considered in relation to external libraries: “Only one library to solve a problem.” If your project deals with JSON then decide one one implementation and don’t incorporate various artifacts. These two points heavily impact on security: a third-party artifact we can avoid using will not be able to cause any security leaks.
After the decision is taken for an external implementation, try to cover the usage in your project by applying design patterns like proxy, facade, or wrapper. This allows for a replacement more easily because the code changes are not spread around the whole codebase. You don’t need to change everything at once if you follow the advice on how to name the implementation class and provide an interface. Even though a SCM is designed for collaboration, there are limitations when more than one person is editing the same file. Using a design pattern to hide information allows you an iterative update of your changes.
Conclusion
As we have seen: a nonfunctional requirement is not that difficult to describe. With a short checklist, you can clearly define the important aspects for your project. It is not necessary to check all points for every code commit in the repository, this would with all probability just elevate costs and doesn’t result in higher benefits. Running a full check around a day before the release represents an effective solution to keep quality in an agile context and will help recognizing where optimization is necessary. Points of Interests (POI) to secure quality are the revisions in the code base for a release. This gives you a comparable statistic and helps increasing estimations.
Of course, in this short article, it is almost impossible to cover all aspects regarding quality. We hope our explanation helps you to link theory by examples to best practice. In conclusion, this should be your main takeaway: a high level of automation within your infrastructure, like continuous integration, is extremely helpful, but doesn’t prevent you from manual code reviews and audits.
Checklist
Follow common standards
KISS – keep it simple, stupid!
Equal directory structure for different projects
Simple META architecture, which can reuse as much as possible in other projects
Defined and follow coding styles
If a release got prepared – no compiler warnings are accepted
Have test coverage up to 85%
Avoid third-party libraries as much as possible
Don’t support more than one technology for a specific problem (e. g., JSON)
Most of the developer community know what a unit test is, even they don’t write them. But there is still hope. The situation is changing. More and more projects hosted on GitHub contain unit tests.
In a standard set-up for Java projects like NetBeans, Maven, and JUnit, it is not that difficult to produce your first test code. Besides, this approach is used in Test Driven Development (TDD) and exists in other technologies like Behavioral Driven Development (BDD), also known as acceptance tests, which is what we will focus on in this article.
Difference Between Unit and Acceptance Tests
The easiest way to become familiar with this topic is to look at a simple comparison between unit and acceptance tests. In this context, unit tests are very low level. They execute a function and compare the output with an expected result. Some people think differently about it, but in our example, the only one responsible for a unit test is the developer.
Keep in mind that the test code is placed in the project and always gets executed when the build is running. This provides quick feedback as to whether or not something went wrong. As long the test doesn’t cover too many aspects, we are able to identify the problem quickly and provide a solution. The design principle of those tests follows the AAA paradigm. Define a precondition (Arrange), execute the invariant (Act), and check the postconditions (Assume). We will come back to this approach a little later.
When we check the test coverage with tools like JaCoCo and cover more than 85 percent of our code with test cases, we can expect good quality. During the increasing test coverage, we specify our test cases more precisely and are able to identify some optimizations. This can be removing or inverting conditions because during the tests we find out, it is almost impossible to reach those sections. Of course, the topic is a bit more complicated, but those details could be discussed in another article.
Acceptance test are same classified like unit tests. They belong to the family of regression tests. This means we want to observe if changes we made on the code have no effects on already worked functionality. In other words, we want to secure that nothing already is working got broken, because of some side effects of our changes. The tool of our choice is JGiven [1]. Before we look at some examples, first, we need to touch on a bit of theory.
JGiven In-Depth
The test cases we define in JGiven is called a scenario. A scenario is a collection of four classes, the scenario itself, the Given displayed as given (Arrange), the Action displayed as when (Act) and Outcome displayed as then (Assume).
In most projects, especially when there is a huge amount of scenarios and the execution consumes a lot of time, acceptance tests got organized in a separate project. With a build job on your CI server, you can execute those tests once a day to get fast feedback and to react early if something is broken. The code example we demonstrate contains everything in one project on GitHub [2] because it is just a small library and a separation would just over-engineer the project. Usually, the one responsible for acceptance tests is the test center, not the developer.
The sample project TP-CORE is organized by a layered architecture. For our example, we picked out the functionality for sending e-mails. The basic functionality to compose an e-mail is realized in the application layer and has a test coverage of up to 90 percent. The functionality to send the e-mail is defined in the service layer.
In our architecture, we decided that the service layer is our center of attention to defining acceptance tests. Here, we want to see if our requirement to send an e-mail is working well. Supporting this layer with our own unit tests is not that efficient because, in commercial projects, it just produces costs without winning benefits. Also, having also unit tests means we have to do double the work because our JGiven tests already demonstrate and prove that our function is well working. For those reasons, it makes no sense to generate test coverage for the test scenarios of the acceptance test.
Let’s start with a practice example. At first, we need to include our acceptance test framework into our Maven build. In case you prefer Gradle, you can use the same GAV parameters to define the dependencies in your build script.
As you can see in listing 1, JGiven works well together with JUnit. An integration to TestNG also exists , you just need to replace the artifactId for jgiven-testng. To enable the HTML reports, you need to configure the Maven plugin in the build lifecycle, like it is shown in Listing 2.
The report of our scenarios in the TP-CORE project is shown in image 1. As we can see, the output is very descriptive and human-readable. This result will be explained by following some naming conventions for our methods and classes, which will be explained in detail below. First, let’s discuss what we can see in our test scenario. We defined five preconditions:
The configuration for the SMPT server is readable
The SMTP server is available
The mail has a recipient
The mail has attachments
The mail is full composed
If all these conditions are true, the action will send a single e-mail got performed. Afterward, after the SMTP server is checked, we see that the mail has arrived. For the SMTP service, we use the small Java library greenmail [3] to emulate an SMTP server. Now it is understandable why it is advantageous for acceptance tests if they are written by other people. This increases the quality as early on conceptional inconsistencies appear. Because as long as the tester with the available implementations cannot map the required scenario, the requirement is not fully implemented.
Producing Descriptive Scenarios
Now is the a good time to dive deeper into the implementation details of our send e-mail test scenario. Our object under test is the class MailClientService. The corresponding test class is MailClientScenarioTest, defined in the test packages. The scenario class definition is shown in listing 3.
@RunWith(JUnitPlatform.class)
public class MailClientScenarioTest
extends ScenarioTest<MailServiceGiven, MailServiceAction, MailServiceOutcome> {
// do something
}
Listing 3: Acceptance Test Scenario for JGiven.
As we can see, we execute the test framework with JUnit5. In the ScenarioTest, we can see the three classes: Given, Action, and Outcome in a special naming convention. It is also possible to reuse already defined classes, but be careful with such practices. This can cost some side effects. Before we now implement the test method, we need to define the execution steps. The procedure for the three classes are equivalent.
@RunWith(JUnitPlatform.class)
public class MailServiceGivenextends Stage<MailServiceGiven> {
public MailServiceGiven email_has_recipient(MailClient client) {
try {
assertEquals(1, client.getRecipentList().size());
} catch (Exception ex) {
System.err.println(ex.getMessage);
}
return self();
}
}
@RunWith(JUnitPlatform.class)
public class MailServiceActionextends Stage<MailServiceAction> {
public MailServiceAction send_email(MailClient client) {
MailClientService service = new MailClientService();
try {
assertEquals(1, client.getRecipentList().size());
service.sendEmail(client);
} catch (Exception ex) {
System.err.println(ex.getMessage);
}
return self();
}
}
@RunWith(JUnitPlatform.class)
public class MailServiceOutcomeextends Stage<MailServiceOutcome> {
public MailServiceOutcome email_is_arrived(MimeMessage msg) {
try {
Address adr = msg.getAllRecipients()[0];
assertEquals("JGiven Test E-Mail", msg.getSubject());
assertEquals("noreply@sample.org", msg.getSender().toString());
assertEquals("otto@sample.org", adr.toString());
assertNotNull(msg.getSize());
} catch (Exception ex) {
System.err.println(ex.getMessage);
}
return self();
}
}
Listing 4: Implementing the AAA Principle for Behavioral Driven Development.
Now, we completed the cycle and we can see how the test steps got stuck together. JGiven supports a bigger vocabulary to fit more necessities. To explore the full possibilities, please consult the documentation.
Lessons Learned
In this short workshop, we passed all the important details to start with automated acceptance tests. Besides JGiven exist other frameworks, like Concordion or FitNesse fighting for usage. Our choice for JGiven was its helpful documentation, simple integration into Maven builds and JUnit tests, and the descriptive human-readable reports.
As negative point, which could people keep away from JGiven, could be the detail that you need to describe the tests in the Java programming language. That means the test engineer needs to be able to develop in Java, if they want to use JGiven. Besides this small detail, our experience with JGiven is absolutely positive.