Preventing SQL Injections in Java with JPA and Hibernate

When we have a look at OWASP’s top 10 vulnerabilities [1], SQL Injections still hold a prominent position. In this short article, we discuss several options for avoiding SQL Injections.

Whenever applications have to deal with databases, there are always high security concerns: if an attacker gets the possibility to hijack the database layer of your application, he can choose between several options. Stealing the data of the stored users to flood them with spam is not the worst scenario that could happen. Even more problematic would be if stored payment information got abused. Another possibility of an SQL Injection attack is to get illegal access to restricted pay content and/or services. As we can see, there are many reasons to care about (web) application security.

To find well-working preventions against SQL Injections, we first need to understand how an SQL Injection attack works and which points we need to pay attention to. In short: every user interaction whose input is processed unfiltered in an SQL query is a possible target for an attack. The input can be manipulated in a manner that the submitted SQL query contains a different logic than the original one. Listing 1 will give you a good idea about what could be possible.

SELECT Username, Password, Role FROM Users
   WHERE Username = 'John Doe' AND Password = 'S3cr3t';
SELECT Username, Password, Role FROM Users
   WHERE Username = 'John Doe'; --' AND Password='S3cr3t';

Listing 1: Simple SQL Injection

The first statement in Listing 1 shows the original query. If the input for the variables Username and Password is not filtered, we have a security flaw. The second query injects for the variable Username a string containing the username John Doe, extended by the characters '; --. This bypasses the password check and, in this case, grants access to the login. The '; sequence closes the WHERE clause, and after -- all following characters are treated as a comment. Theoretically, it is possible to place any valid SQL code between these two character sequences.

Of course, my plan is not to spread ideas about which SQL commands could cause the worst consequences for the victim. With this simple example, I assume the message is clear. We need to protect each UI input variable in our application against user manipulation, even if it is not used directly for database queries. To detect those variables, it is always a good idea to validate all existing input forms. But modern applications mostly have more than just a few input forms. For this reason, I also recommend keeping an eye on your REST endpoints; often their parameters are also connected with SQL queries.

For this reason, input validation in general should be part of the security concept. Annotations from the Bean Validation [2] specification are very powerful for this purpose. For example, @NotNull, as an annotation on a data field of the domain object, ensures that the object can only be persisted if that field is not null. To use the Bean Validation annotations in your Java project, you just need to include a small library.

<dependency> 
    <groupId>org.hibernate.validator</groupId>
    <artifactId>hibernate-validator</artifactId>
    <version>${version}</version>
</dependency>

Listing 2: Maven Dependency for Bean Validation
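To illustrate the usage, a minimal sketch of a domain class with validation constraints could look like the following. The class and field names are assumptions for this example, and depending on the Hibernate Validator version the annotations come from the javax.validation or jakarta.validation package.

import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.Size;

public class Account {

    // persisting fails validation if the username is null or too short
    @NotNull
    @Size(min = 3, max = 64)
    private String username;

    @NotNull
    private String password;

    // getters and setters omitted for brevity
}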

Perhaps it could be necessary to validate more complex data structures. With regular expressions, you have another powerful tool in your hands. But be careful: it is not that easy to write correctly working RegEx. Let’s have a look at a short example.

public static final String RGB_COLOR = "#[0-9a-fA-F]{3}([0-9a-fA-F]{3})?";

public boolean validate(String content, String regEx) {
    // true when the complete input matches the given pattern
    return content.matches(regEx);
}

validate("#000", RGB_COLOR);

Listing 3: Validation by Regular Expression in Java

The RegEx to detect the correct RGB color schema is quite simple. Valid inputs are #ffF or #000000. The allowed range for the characters is 0-9 and the letters A to F, case insensitive. When you develop your own RegEx, you always need to check the boundaries very carefully. A good example is the 24-hour time format. Typical mistakes are invalid entries like 23:60 or 24:00. The validate method compares the input string with the RegEx. If the pattern matches the input, the method returns true. If you want to get more ideas about validators in Java, you can also check my GitHub repository [3].
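As a sketch, a pattern for the 24-hour time format that rejects exactly those invalid entries could look like this, reusing the validate method from Listing 3:

// matches 00:00 up to 23:59 and rejects entries like 23:60 or 24:00
public static final String TIME_24H = "([01][0-9]|2[0-3]):[0-5][0-9]";

validate("23:59", TIME_24H); // true
validate("24:00", TIME_24H); // false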

To summarize, our first idea to secure user input against abuse is to filter out all problematic character sequences, like -- and so on. This intention of creating a blocking list is not that bad, but it still has some limitations. First, the complexity of the application increases, because blocking single characters like --, ; and ' can sometimes cause unwanted side effects. Also, an application-wide default restriction of characters can sometimes cause problems. Imagine there is a text area for a blog system or something similar.

This means we need another powerful concept to filter the input in a manner that our SQL query cannot be manipulated. To reach this goal, the SQL standard has a very good solution we can use. SQL parameters are variables inside an SQL query that are interpreted as content and not as part of the statement. This way, even large free texts can be accepted without blocking dangerous characters. Let’s have a look at how this works on a PostgreSQL [4] database.

PREPARE find_user (text) AS
    SELECT * FROM login WHERE name = $1;
EXECUTE find_user('John Doe');

Listing 4: Defining Parameters in PostgreSQL
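Outside of an OR mapper, plain JDBC offers the same protection through prepared statements. The following is a minimal sketch, assuming an existing Connection and the Users table from Listing 1:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public boolean userExists(Connection connection, String userInput) throws SQLException {
    // the ? placeholder is bound by the driver and treated strictly as data
    String sql = "SELECT Username FROM Users WHERE Username = ?";
    try (PreparedStatement statement = connection.prepareStatement(sql)) {
        statement.setString(1, userInput);
        try (ResultSet result = statement.executeQuery()) {
            return result.next();
        }
    }
}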

In case you are using the OR mapper Hibernate, there is an even more elegant way via the Java Persistence API (JPA).

String myUserInput;
 
@PersistenceContext
public EntityManager mainEntityManagerFactory;

CriteriaBuilder builder =
    mainEntityManagerFactory.getCriteriaBuilder();

CriteriaQuery<DomainObject> query =
    builder.createQuery(DomainObject.class);

// create Criteria
Root<DomainObject> root =
    query.from(DomainObject.class);

//Criteria SQL Parameters
ParameterExpression<String> paramKey =
    builder.parameter(String.class);

query.where(builder.equal(root.get("name"), paramKey));

// wire queries together with parameters
TypedQuery<DomainObject> result =
    mainEntityManagerFactory.createQuery(query);

result.setParameter(paramKey, myUserInput);
DomainObject entry = result.getSingleResult();

Listing 5: Hibernate JPA SQL Parameter Usage

Listing 5 shows a full example of Hibernate using JPA with the Criteria API. The variable for the user input is declared in the first line. The comments in the listing explain how it works. As you can see, this is no rocket science. The solution has some other nice benefits besides improving web application security. First of all, no plain SQL is used. This ensures that every database management system supported by Hibernate can be secured by this code.

The usage may look a bit more complex than a simple query, but the benefit for your application is enormous. On the other hand, there are of course some extra lines of code. But they are not that difficult to understand.


A brief overview of Java frameworks

When you look up the word framework in Merriam-Webster, you find the following explanations:

  • a basic conceptional structure
  • a skeletal, openwork, or structural frame

You might think that libraries and frameworks are the same thing. But this is not correct. Your source code calls the functionality of a library directly. When you use a framework, it is exactly the opposite: the framework calls specific functions of your business logic. This concept is also known as Inversion of Control (IoC).

For web applications we can distinguish between client-side and server-side frameworks. The difference is that the client usually runs in a web browser, which means the available programming languages are limited to JavaScript. Depending on the web server, we are able to choose between different programming languages; the most popular languages for the internet are PHP and Java. All web languages have one thing in common: they produce HTML as output, which can be displayed in a web browser.

In this article I created a short overview of the most common Java frameworks, which can also be used in desktop applications. If you wish to have a fast introduction to Java server applications, you can check out my article about Java EE and Jakarta.

If you plan to use one or more of the discussed frameworks in your Java application, you just need to include them as a Maven or Gradle dependency.

  • JUnit, TestNG: TDD unit testing
  • Mockito: TDD mocking objects
  • JGiven, Cucumber: BDD acceptance testing
  • Hibernate, iBatis, Eclipse Link: JPA O/R mapper
  • Spring Framework, Google Guice: Dependency Injection
  • PrimeFaces, BootsFaces, ButterFaces: JSF user interfaces
  • ControlsFX, BootstrapFX: JavaFX user interfaces
  • Hazelcast, Apache Kafka: Event Stream Processing
  • SLF4J, Logback, Log4J: Logging
  • FF4j: Feature Flags

Before I continue, I wish to tell you that these frameworks are made to help you solve problems in your daily business as a developer. Every problem has multiple solutions. For this reason, it is more important to learn the concepts behind the frameworks instead of just how to use a specific framework. During the last two decades of programming, I have seen the rise and fall of plenty of frameworks. Examples of frameworks almost nobody remembers today are Google Web Toolkit and JBoss Seam.

The most used framework in Java for writing and executing unit tests is JUnit. An often used alternative to JUnit is TestNG. Both solutions work quite similarly. The basic idea is to execute a function with defined parameters and compare the output with an expected result. When the output matches the expectation, the test passes. JUnit and TestNG support the Test Driven Development (TDD) paradigm.
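A minimal JUnit 5 sketch of this idea, with a hypothetical Calculator class, could look like this:

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class CalculatorTest {

    @Test
    void addShouldReturnSumOfBothParameters() {
        // execute the function with defined parameters ...
        int result = new Calculator().add(2, 3);
        // ... and compare the output with the expected result
        assertEquals(5, result);
    }
}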

If you need to emulate in your test case the behavior of an external system that is not available at the moment your tests are running, then Mockito is your best friend. Mockito works perfectly together with JUnit and TestNG.
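A small sketch of how such an emulation looks with Mockito and JUnit; the MailService interface, the Registration class and their methods are hypothetical:

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
import org.junit.jupiter.api.Test;

class RegistrationTest {

    @Test
    void registrationShouldSendWelcomeMail() {
        // emulate the external mail system that is not available during the test
        MailService mailService = mock(MailService.class);
        when(mailService.send("john@sample.org")).thenReturn(true);

        new Registration(mailService).register("john@sample.org");

        // check that the business logic really talked to the emulated system
        verify(mailService).send("john@sample.org");
    }
}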

Behavior Driven Development (BDD) is an evolution of unit testing where you are able to define the circumstances under which the customer will accept the integrated functionality. The category of BDD integration tests is called acceptance tests. Integration tests are not a replacement for unit tests; they are an extension of them. The frameworks JGiven and Cucumber are very similar; both are, like Mockito, extensions of the unit test frameworks JUnit and TestNG.

For dealing with relational databases in Java, we can choose between several persistence frameworks. Those frameworks allow you to define your database structure in Java objects without writing a single line of SQL. The mapping between Java objects and database tables happens in the background. Another great benefit of using an O/R mapper like Hibernate, iBatis or Eclipse Link is the possibility to replace the underlying database server. But this achievement is not as easy to reach as it seems in the beginning.
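As a rough sketch, such a mapping between a Java object and a database table looks like this. The entity and column names are assumptions, and depending on the JPA version the package prefix is javax.persistence or jakarta.persistence:

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import jakarta.persistence.Table;

// maps to a CUSTOMER table without writing any SQL by hand
@Entity
@Table(name = "CUSTOMER")
public class Customer {

    @Id
    @GeneratedValue
    private Long id;

    @Column(name = "FULL_NAME")
    private String fullName;

    // getters and setters omitted
}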

In the next section I introduce a technique that was first introduced by the Spring Framework: Dependency Injection (DI). This allows loose coupling between modules and an easier replacement of components without recompiling. The solution from Google for DI is called Guice, and Java Enterprise brings its own standard named CDI.
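A tiny sketch of the principle with the standard @Inject annotation; the PaymentService interface and its charge method are hypothetical, and depending on the platform version the annotation comes from javax.inject or jakarta.inject:

import jakarta.inject.Inject;

public class CheckoutProcess {

    private final PaymentService paymentService;

    // the container decides which PaymentService implementation gets injected,
    // so the implementation can be replaced without recompiling this class
    @Inject
    public CheckoutProcess(PaymentService paymentService) {
        this.paymentService = paymentService;
    }

    public void pay(String orderId) {
        paymentService.charge(orderId);
    }
}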

Graphical User Interfaces (GUIs) are another category of frameworks. Which framework is useful depends on the chosen technology, like JavaFX or JSF. Most of the provided controls are similar. Common libraries for JSF GUI components are PrimeFaces, BootsFaces or ButterFaces. OmniFaces is a framework that provides standardized solutions for common JSF problems, like caching and so on. Collections of JavaFX controls can be found in ControlsFX and BootstrapFX.

If you have to deal with Event Stream Processing (ESP), you should have a look at Hazelcast or Apache Kafka. ESP means that the system reacts to constantly generated data. An event is a reference to a single data point, which can be persisted in a database, and the stream represents the output of the events.

In December 2021, an often used technology came out of the shadows because of a critical vulnerability in Log4J. Log4J, together with the Simple Logging Facade for Java (SLF4J), is one of the most used dependencies in the software industry, so you can imagine how critical this information was. Now you can also imagine what an important role logging plays in software development. Another logging framework is Logback, which is the one I use.

Another very helpful dependency for professional software development is FF4J. It allows you to define feature toggles, also known as feature flags, to enable and disable functionality of a software program by configuration.

This list could be much longer. I just tried to focus on the most used ones that are relevant for Java programmers. Feel free to leave a comment to suggest something I may have forgotten.

How to reduce the size of a PDF document

When you own a big collection of PDF files, the used storage space can grow quite large. Sometimes I own PDF documents with more than 100 MB. Well, nowadays such storage capacities are not a big issue. But if you want to back up those files to other media like USB pen drives or a DVD, it would be great to reduce the file size of your PDF collection.

Long ago I worked with a little script that allowed me to reduce the file size of a PDF document significantly. This script called an interactive tool named PDF Sam with some command line parameters. Unfortunately, many years ago this option of the software PDF Sam became commercial, so I needed a new solution.

Before I come to my approach, I will discuss some basic information about what happens in the background. First of all, when your PDF blows up to a huge file, the reason is usually the included graphics. If you scan your handwritten notes to save them in one single archive, you should be aware that every scan is an image file. By default, the PDF processor already optimizes those files. This is why the file size almost doesn’t get reduced when you try to compress a PDF with a tool like zip.

Scanned images can be optimized with a graphic tool like Gimp before including them in a PDF document. Actions you can perform are reducing the image quality and increasing the contrast. Especially for scanned handwritten notes these steps are important. If the contrast is very low and you plan to print those documents, it could happen that they are not readable. Another problem in this case is that you can’t apply a text search over the document. A solution to this problem is the usage of an OCR tool to transform text in images back to real text.

Let’s briefly summarize the previous thoughts. When we try to reduce the file size of a PDF, we need to reduce the quality of the included images. This can be done by reducing the amount of dots per inch (dpi). Be aware that after the compression the images still need to be readable. As long as you do not plan a high quality print like a magazine or a book, nothing will be affected.

When we want to reduce plenty of PDF files in a short time, we can’t do all those actions by hand. Instead, we need an automated solution. To reach this goal, it is important that the tool we use supports the command line. Then we can create a simple batch job to perform the task without any manual work.

We have several options to optimize the images inside a PDF. Whether it is a good idea to perform all of them depends on the purpose of the usage.

  1. change the image file to the PNG format
  2. reduce the graphic dimensions to the real printable area
  3. reduce the DPI
  4. change the image color profile to gray-scale

As an Ubuntu Linux user, I already have everything I need. And now comes the part where I explain my well working solution.

Ghostscript

GPL Ghostscript is used for PostScript/PDF preview and printing. Usually as a back-end to a program such as ghostview, it can display PostScript and PDF documents in an X11 environment.

If you don’t have Ghostscript installed on your system, you can do this very quickly.

sudo apt-get update 
sudo apt-get -y install ghostscript

Before you execute any script or command, make sure you do not overwrite the existing files with the output. If something goes wrong, you lose all originals for trying other options. Before you start to experiment, back up your files or generate the compressed PDFs in a separate folder.

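A Ghostscript call for this purpose looks roughly like the following sketch; the quality profile (-dPDFSETTINGS=/ebook) and the file and folder names are assumptions you may want to adjust:

# compress a single document, downsampling images to 150 dpi
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
   -dNOPAUSE -dQUIET -dBATCH -r150 \
   -sOutputFile=compressed/document.pdf document.pdf

# batch variant: process every PDF of the current directory into ./compressed
for f in *.pdf; do
  gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -r150 \
     -sOutputFile="compressed/$f" "$f"
done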

The important parameter is -r150, which reduces the output resolution to 150 dpi. In the manual you can check for more parameters to compress the result even further. You can place the command in a script surrounded by a FOR loop to fetch all PDF files in a directory and write the reduced versions to another directory, as sketched above.

I used the command on an original file with 260 MB and 640 pages. After the operation was done, the size was reduced to around 36 MB. The shrunken file is almost 7 times smaller than the original. A huge difference. As you can see in the screenshot, the quality of the pictures is almost identical.

As an alternative, in case you don’t want to get closer to the command line, there is a free online PDF compression tool available in German and English.

PDF Workbench

Linux systems have many powerful tools for dealing with PDF documents. For example, the LibreOffice suite has a button that generates a proper PDF file for every document. But sometimes you wish to create a PDF from the printing dialog of any other application on your system. With the CUPS PDF printer driver, you enable this functionality on your system.

sudo apt-get install printer-driver-cups-pdf 

As I already explained, OCR allows you to extract text from graphics to make a document searchable. When you need to work with this type of software, be aware that the results are good, but you can’t avoid mistakes. Even when you perform OCR on a scanned book page, you will find several mistakes. OCRFeeder is a free and very powerful solution for Linux systems.

Another powerful helper is the tool PDF Arranger, which allows you to add pages to or remove pages from an existing PDF. You are also able to change the order of the pages.




Java Enterprise in brief


If you plan to get in touch with Java Enterprise, it may be a bit overwhelming and confusing in the beginning. But don’t worry, it’s not as bad as it seems. To start with it, you just need to know some basics about the ideas and concepts.

First of all, Java EE is neither a tool nor a compiler that you download and use in the same manner as the Java Development Kit (JDK), also known as Software Development Kit (SDK). Java Enterprise is a set of specifications. Those specifications are supported by an API, and the API has a reference implementation. The reference implementation is a bundle you can download, and it’s called an application server.

Since Java EE 8, the Eclipse Foundation maintains Java Enterprise. Oracle and the Eclipse Foundation were not able to find a common agreement for the usage of the Java trademark, which is owned by Oracle. The short version of this story is that the Eclipse Foundation renamed Java EE to Jakarta EE. This also has an impact on old projects, because the package paths were changed in Jakarta EE 9 from javax to jakarta. Jakarta EE 9.1 upgraded all components from JDK 8 to JDK 11.

If you want to start developing Jakarta Enterprise [1] applications, you need some prerequisites. First, you have to choose the right version of the JDK. The JDK already contains the runtime environment, the Java Virtual Machine (JVM), in the same version as the JDK; you don’t need to install the JVM separately. A good choice for a proper JDK is always the latest LTS version. The Java 17 JDK was released in 2021 and has support for 3 years, until 2024. Here you can find some information about the Java release cycle.

If you wish to avoid the Oracle license restrictions, you can switch to a free open source implementation of the JDK. One of the most famous freely available variants of the JDK is the OpenJDK distribution from Adoptium [2]. Another interesting implementation is GraalVM [3], which is built on top of the OpenJDK. The enterprise edition of GraalVM can speed up your application by a factor of 1.3. For production systems, a commercial license of the enterprise edition is necessary. GraalVM also includes its own compiler.

  Version       Year   JSR       Servlet   Tomcat   Java SE
  J2EE 1.2      1999
  J2EE 1.3      2001   JSR 58
  J2EE 1.4      2003   JSR 151
  Java EE 5     2006   JSR 244
  Java EE 6     2009   JSR 316
  Java EE 7     2013   JSR 342
  Java EE 8     2017   JSR 366
  Jakarta 8     2019             4.0       9.0      8
  Jakarta 9     2020             5.0       10.0     8 & 11
  Jakarta 9.1   2021             5.0       10.0     11
  Jakarta 10    2022             6.0       11.0     11
  Jakarta 11    2023   under development

The table above is not complete, but the most important versions are listed. Feel free to send me a message if you have additional information that is missing in this overview.

You need to be aware that the Jakarta EE specification requires a certain Java SDK, and the application server may need another JDK as its runtime. Both Java versions don’t have to be equal.

Dependencies (Maven):

<dependency>
    <groupId>jakarta.platform</groupId>
    <artifactId>jakarta.jakartaee-api</artifactId>
    <version>${version}</version>
    <scope>provided</scope>
</dependency>

<dependency>
    <groupId>org.eclipse.microprofile</groupId>
    <artifactId>microprofile</artifactId> 
    <version>${version}</version>
    <type>pom</type>
    <scope>provided</scope>
</dependency>

In the next step you have to choose the Jakarta EE environment implementation, which means deciding on an application server. It’s very important that the application server you choose can operate on the JVM version you have installed on your system. The reason is quite simple: the application server itself is implemented in Java. If you plan to develop a Servlet project, it’s not necessary to operate a full application server; a simple Servlet container like Apache Tomcat (Catalina) or Jetty contains everything that is required.

Jakarta Enterprise reference implementations are, for example: Payara (a fork of GlassFish), WildFly (formerly known as JBoss), Apache Geronimo, Apache TomEE, Apache Tomcat, Jetty and so on.

Maybe you have heard about MicroProfile [4]. Don’t get confused about it; it’s not as difficult as it seems in the beginning. In general, you can understand MicroProfile as a subset of Jakarta EE for running microservices, extended by some technologies to trace, observe and monitor the status of the service. Version 5 was released in December 2021 and is fully compatible with Jakarta EE 9.


Core Technologies

Plain Old Java Beans

POJOs are simplified Java objects without any business logic. This type of Java bean only contains attributes and their corresponding getters and setters (a minimal sketch follows the list below). POJOs do not:

  • Extend pre-specified classes: e. g. public class Test extends javax.servlet.http.HttpServlet is not considered to be a POJO class.
  • Contain pre-specified annotations: e. g. @javax.persistence.Entity public class Test is not a POJO class.
  • Implement pre-specified interfaces: e. g. public class Test implements javax.ejb.EntityBean is not considered to be a POJO class.
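A minimal POJO sketch:

public class Person {

    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}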

(Jakarta) Enterprise Java Beans

An EJB component, or enterprise bean, is a body of code that has fields and methods to implement modules of business logic. You can think of an enterprise bean as a building block that can be used alone or with other enterprise beans to execute business logic on the Java EE server.

Enterprise beans are either (stateless or stateful) session beans or message-driven beans. Stateless means that when the client finishes executing, the session bean and its data are gone. A message-driven bean combines features of a session bean and a message listener, allowing a business component to receive (JMS) messages asynchronously.
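A sketch of a stateless session bean; the class name and the business method are hypothetical, and depending on the platform version the annotation comes from javax.ejb or jakarta.ejb:

import jakarta.ejb.Stateless;

@Stateless
public class OrderService {

    // business logic executed on the server; no conversational state is kept
    public double calculateTotal(double net, double taxRate) {
        return net * (1 + taxRate);
    }
}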

(Jakarta) Servlet

Java Servlet technology lets you define HTTP-specific Servlet classes. A Servlet class extends the capabilities of servers that host applications accessed by way of a request-response programming model. Although Servlets can respond to any type of request, they are commonly used to extend the applications hosted by web servers.
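A small Servlet sketch following the request-response model; the URL pattern and the class name are assumptions, and older platforms use the javax.servlet packages instead of jakarta.servlet:

import java.io.IOException;
import jakarta.servlet.annotation.WebServlet;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

@WebServlet("/hello")
public class HelloServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // answer the HTTP GET request with a small HTML fragment
        response.setContentType("text/html");
        response.getWriter().println("<h1>Hello from a Servlet</h1>");
    }
}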

(Jakarta) Server Pages

JSP is a UI technology that lets you put snippets of Servlet code directly into a text-based document. JSP files are transformed by the compiler into a Java Servlet.

(Jakarta) Server Pages Standard Tag Library

The JSTL encapsulates core functionality common to many JSP applications. Instead of mixing tags from numerous vendors in your JSP applications, you use a single, standard set of tags. JSTL has iterator and conditional tags for handling flow control, tags for manipulating XML documents, internationalization tags, tags for accessing databases using SQL, and tags for commonly used functions.

(Jakarta) Server Faces

JSF technology is a user interface framework for building web applications. JSF was introduced to solve the problem of JSP, where program logic and layout were extremely mixed up.

(Jakarta) Managed Beans

Managed Beans, lightweight container-managed objects (POJOs) with minimal requirements, support a small set of basic services, such as resource injection, lifecycle callbacks, and interceptors. Managed Beans represent a generalization of the managed beans specified by Java Server Faces technology and can be used anywhere in a Java EE application, not just in web modules.

(Jakarta) Persistence API

The JPA is a Java standards-based solution for persistence. Persistence uses an object/relational mapping approach to bridge the gap between an object-oriented model and a relational database. The Java Persistence API can also be used in Java SE applications outside of the Java EE environment. Hibernate and EclipseLink are reference implementations for JPA.

(Jakarta) Transaction API

The JTA provides a standard interface for demarcating transactions. The Java EE architecture provides a default auto commit to handle transaction commits and rollbacks. An auto commit means that any other applications that are viewing data will see the updated data after each database read or write operation. However, if your application performs two separate database access operations that depend on each other, you will want to use the JTA API to demarcate where the entire transaction, including both operations, begins, rolls back, and commits.

(Jakarta) API for RESTful Web Services

The JAX-RS defines APIs for the development of web services built according to the Representational State Transfer (REST) architectural style. A JAX-RS application is a web application that consists of classes packaged as a servlet in a WAR file along with required libraries.
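A sketch of such a REST resource class; the path, class name and returned payload are assumptions, and older platforms use the javax.ws.rs packages instead of jakarta.ws.rs:

import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

@Path("/status")
public class StatusResource {

    // answers GET requests on <context-root>/status with a small JSON document
    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public String status() {
        return "{\"status\":\"UP\"}";
    }
}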

(Jakarta) Dependency Injection for Java

Dependency Injection for Java defines a standard set of annotations (and one interface) for use on injectable classes, as known from Google Guice or the Spring Framework. In the Java EE platform, CDI provides support for Dependency Injection. Specifically, you can use injection points only in a CDI-enabled application.

(Jakarta) Contexts & Dependency Injection for Java EE

CDI defines a set of contextual services, provided by Java EE containers, that make it easy for developers to use enterprise beans along with Java Server Faces technology in web applications. Designed for use with stateful objects, CDI also has many broader uses, allowing developers a great deal of flexibility to integrate different kinds of components in a loosely coupled but typesafe way.

(Jakarta) Bean Validation

The Bean Validation specification defines a metadata model and API for validating data in Java Beans components. Instead of distributing validation of data over several layers, such as the browser and the server side, you can define the validation constraints in one place and share them across the different layers.

(Jakarta) Message Service API

JMS API is a messaging standard that allows Java EE application components to create, send, receive, and read messages. It enables distributed communication that is loosely coupled, reliable, and asynchronous.

(Jakarta) EE Connector Architecture

The Java EE Connector Architecture is used by tools vendors and system integrators to create resource adapters that support access to enterprise information systems that can be plugged in to any Java EE product. A resource adapter is a software component that allows Java EE application components to access and interact with the underlying resource manager of the EIS. Because a resource adapter is specific to its resource manager, a different resource adapter typically exists for each type of database or enterprise information system.

The Java EE Connector Architecture also provides a performance-oriented, secure, scalable, and message-based transactional integration of Java EE platform–based web services with existing EISs that can be either synchronous or asynchronous. Existing applications and EISs integrated through the Java EE Connector Architecture into the Java EE platform can be exposed as XML-based web services by using JAX-WS and Java EE component models. Thus JAX-WS and the Java EE Connector Architecture are complementary technologies for enterprise application integration (EAI) and end-to-end business integration.

(Jakarta) Mail API

Java EE applications use the JavaMail API to send email notifications. The JavaMail API has two parts:

  • An application-level interface used by the application components to send mail
  • A service provider interface

The Java EE platform includes the JavaMail API with a service provider that allows application components to send Internet mail.

(Jakarta) Authorization Contract for Containers

The JACC specification defines a contract between a Java EE application server and an authorization policy provider. All Java EE containers support this contract. The JACC specification defines java.security.Permission classes that satisfy the Java EE authorization model. The specification defines the binding of container-access decisions to operations on instances of these permission classes. It defines the semantics of policy providers that use the new permission classes to address the authorization requirements of the Java EE platform, including the definition and use of roles.

(Jakarta) Authentication Service Provider Interface for Containers

The JASPIC specification defines a service provider interface (SPI) by which authentication providers that implement message authentication mechanisms may be integrated in client or server message-processing containers or runtimes. Authentication providers integrated through this interface operate on network messages provided to them by their calling containers. The authentication providers transform outgoing messages so that the source of each message can be authenticated by the receiving container, and the recipient of the message can be authenticated by the message sender. Authentication providers authenticate each incoming message and return to their calling containers the identity established as a result of the message authentication.

(Jakarta) EE Security API

The purpose of the Java EE Security API specification is to modernize and simplify the security APIs by simultaneously establishing common approaches and mechanisms and removing the more complex APIs from the developer view where possible. Java EE Security introduces the following APIs:

  • SecurityContext interface: Provides a common, uniform access point that enables an application to test aspects of caller data and grant or deny access to resources.
  • HttpAuthenticationMechanism interface: Authenticates callers of a web application, and is specified only for use in the servlet container.
  • IdentityStore interface: Provides an abstraction of an identity store and that can be used to authenticate users and retrieve caller groups.

(Jakarta) Java API for WebSocket

WebSocket is an application protocol that provides full-duplex communications between two peers over TCP. The Java API for WebSocket enables Java EE applications to create endpoints using annotations that specify the configuration parameters of the endpoint and designate its lifecycle callback methods.

(Jakarta) Java API for JSON Processing

The JSON-P enables Java EE applications to parse, transform, and query JSON data using the object model or the streaming model.

JavaScript Object Notation (JSON) is a text-based data exchange format derived from JavaScript that is used in web services and other connected applications.
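A short sketch of the object model API; the "username" property is hypothetical, and older platforms use javax.json instead of jakarta.json:

import java.io.StringReader;
import jakarta.json.Json;
import jakarta.json.JsonObject;
import jakarta.json.JsonReader;

public String readUsername(String json) {
    // parse a JSON document and query a single property
    try (JsonReader reader = Json.createReader(new StringReader(json))) {
        JsonObject object = reader.readObject();
        return object.getString("username");
    }
}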

(Jakarta) Java API for JSON Binding

The JSON-B provides a binding layer for converting Java objects to and from JSON messages. JSON-B also supports the ability to customize the default mapping process used in this binding layer through the use of Java annotations for a given field, JavaBean property, type or package, or by providing an implementation of a property naming strategy. JSON-B is introduced in the Java EE 8 platform.

(Jakarta) Concurrency Utilities for Java EE

Concurrency Utilities for Java EE is a standard API for providing asynchronous capabilities to Java EE application components through the following types of objects: managed executor service, managed scheduled executor service, managed thread factory, and context service.

(Jakarta) Batch Applications for the Java Platform

Batch jobs are tasks that can be executed without user interaction. The Batch Applications for the Java Platform specification is a batch framework that provides support for creating and running batch jobs in Java applications. The batch framework consists of a batch runtime, a job specification language based on XML, a Java API to interact with the batch runtime, and a Java API to implement batch artifacts.



Notice: I try to keep this post up to date, but mistakes could happen. Please feel free to drop me a message if you detect some mistakes or if you have some suggestions. If you like this article, it would be great to leave a thumbs up and share it with friends and colleagues.

The Bug Fix Bingo

If you wish to discover a way to turn negative vibes between testers and developers into something positive, here is a great solution for that. The thing I’d like to introduce is quite old, but even today, in our brave new DevOps world, it is an evergreen.

Many years ago I stumbled over a PDF on the world wide web called Bug Fix Bingo, a nice, funny game for IT professionals. This little game was originally invented by the software testing firm K. J. Ross & Associates. Unfortunately, the original site disappeared long ago, so I decided to preserve this great idea in this blog post.

I can also recommend this game for folks who are not so deep into testing but have to participate in a lot of IT meetings. Just print the file, bring some copies to your next meeting and enjoy what’s going to happen. I did it several times. Besides the fun we had, it changed something. So let’s have a look at the concept and rules.

Bug Fix Bingo is based on traditional Bingo, just with a few adaptations. Everyone can join the game easily without big preparation, because it’s really simple. Instead of numbers, the Bingo card uses statements from developers in defect review meetings to mark off squares.

Rules:

  1. Bingo squares are marked off when a developer makes the matching statement during bug fix sessions.
  2. Testers must call “Bingo” immediately upon completing a line of 5 squares either horizontally, vertically or diagonally.
  3. Statements that arise as a result of a bug that later becomes “deferred”, “as designed”, or “not to be fixed” should be classified as not marked.
  4. Bugs that are not reported in an incident report can not be used.
  5. Statements should also be recorded against the bug in the defect tracking system for later confirmation.
  6. Any tester who marks off all 25 statements should be awarded 2 weeks of stress leave immediately.
  7. Any developer found using all 25 statements should be seconded into the test group for a period of no less than 6 months for re-education.
“It works on my machine.” | “Where were you when the program blew up?” | “Why do you want to do it in that way?” | “You can’t use that version on your system.” | “Even though it doesn’t work, how does it feel?”
“Did you check for a virus on your system?” | “Somebody must have changed my code.” | “It works, but it hasn’t been tested.” | “THIS can’t be the source of THAT.” | “I can’t test anything!”
“It’s just some unlucky coincidence.” | “You must have the wrong version.” | “I haven’t touched that module in weeks.” | “There is something funky in your data.” | “What did you type in wrong to get it to crash?”
“It must be a hardware problem.” | “How is that possible?” | “It worked yesterday.” | “It’s never done that before.” | “That’s weird …”
“That’s scheduled to be fixed in the next release.” | “Yes, we knew that would happen.” | “Maybe we just don’t support that platform.” | “It’s a feature. We just haven’t updated the specs.” | “Surely nobody is going to use the program like that.”

The Bug Fix Bingo game card

Incidentally, developers have a game like this too. They score points every time a QA person tries to raise a defect on functionality that is working as specified.


API 4 Future

Many ideas are excellent on paper. However, people often lack the knowledge of how to implement brilliant concepts into their everyday work. This short workshop aims to bridge the gap between theory and practice and demonstrates the steps needed to achieve a stable API in the long term.

(c) 2021 Marco Schulz, Java PRO Ausgabe 1, S.31-34

When developing commercial software, many people involved often don’t realize that the application will be in use for a long time. Since our world is constantly changing, it’s easy to foresee that the application will require major and minor changes over the years. The project becomes a real challenge when the application to be extended is not isolated, but communicates with other system components. This means that in most cases, the users of the application also have to be adapted. A single stone quickly becomes an avalanche. With good avalanche protection, the situation can still be controlled. However, this is only possible if you consider that the measures described below are solely intended for prevention. But once the violence has been unleashed, there is little that can be done to stop it. So let’s first clarify what an API is.

A Matter of Negotiation

A software project consists of various components, each with its own specialized tasks. The most important are source code, configuration, and persistence. We’ll be focusing primarily on the source code area. I’m not revealing anything new when I say that implementations should always be against interfaces. This foundation is already taught in the introduction to object-oriented programming. In my daily work, however, I often see that many developers aren’t always fully aware of the importance of developing against interfaces, even though this is common practice when using the Java Standard API. The classic example of this is:

List<String> collection = new ArrayList<>();

This short line uses the List interface, which is implemented as an ArrayList. Here we can also see that there is no suffix in the form of an “I” to identify the interface. The corresponding implementation also does not have “Impl” in its name. That’s a good thing! Especially with the implementation class, various solutions may be desired. In such cases, it is important to clearly label them and keep them easily distinguishable by name. ListImpl and ListImpl2 are understandably not as easy to distinguish as ArrayList and LinkedList. This also clears up the first point of a stringent and meaningful naming convention.

In the next step, we’ll focus on the program parts that we don’t want to expose to consumers of the application, as they are helper classes. Part of the solution lies in the structure of how the packages are organized. A very practical approach is:

  • my.package.path.business: Contains all interfaces
  • my.package.path.application: Contains the interface implementations
  • my.package.path.application.helper: Contains internal helper classes

This simple architecture alone signals to other programmers that it’s not a good idea to use classes from the helper package. Starting with Java 9, there are even more far-reaching restrictions prohibiting the use of internal helper classes. Modularization, which was introduced in Java 9 with the Jigsaw project [1], allows packages to be hidden from view in the module-info.java module descriptor.
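A sketch of such a module descriptor for the package layout above; the module name is an assumption:

module my.package.path {
    // only the interfaces are visible to consumers of this module
    exports my.package.path.business;
    // my.package.path.application and its helper sub-package are not exported and stay hidden
}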

Separatists and their Escape from the Crowd

A closer look at most specifications reveals that many interfaces have been outsourced to their own libraries. From a technological perspective, based on the previous example, this would mean that the business package, which contains the interfaces, is outsourced to its own library. The separation of API and the associated implementation fundamentally makes it easier to interchange implementations. It also allows a client to exert greater influence over the implementation of their project with their contractual partner, as the developer receives the API pre-built by the client. As great as the idea is, a few rules must be observed to ensure it actually works as originally intended.

Example 1: JDBC. We know that Java Database Connectivity is a standard for connecting various database systems to an application. Aside from the problems associated with using native SQL, MySQL JDBC drivers cannot simply be replaced by PostgreSQL or Oracle. After all, every manufacturer deviates more or less from the standard in their implementation and also provides exclusive functionality of their own product via the driver. If you decide to make extensive use of these additional features in your own project, the easy interchangeability is over.

Example 2: XML. Here, you have the choice between several standards. It’s clear, of course, that the APIs of SAX, DOM, and StAX are incompatible. For example, if you want to switch from DOM to event-based SAX for better performance, this can potentially result in extensive code changes.

Example 3: PDF. Last but not least, I have a scenario for a standard that doesn’t have a standard. The Portable Document Format itself is a standard for how document files are structured, but when it comes to implementing usable program libraries for their own applications, each manufacturer has its own ideas.

These three small examples illustrate the common problems that must be overcome in daily project work. A small rule can have a big impact: only use third-party libraries when absolutely necessary. After all, every dependency used also poses a potential security risk. It’s also not necessary to include a library of just a few MB to save the three lines required to check a string for null and empty values.

Model Boys

If you’ve decided on an external library, it’s always beneficial to do the initial work and encapsulate the functionality in a separate class, which you can then use extensively. In my personal project TP-CORE on GitHub [2], I’ve done this in several places. The logger encapsulates the functionality of SLF4J and Logback. As with the PdfRenderer, the method signatures are independent of the libraries used and can therefore be more easily exchanged via a central location. To encapsulate external libraries in your own application as much as possible, the following design patterns are available: wrapper, facade, and proxy.

Wrapper: also called the adapter pattern, belongs to the group of structural patterns. The wrapper couples one interface to another, incompatible one.

Facade: is also a structural pattern and bundles several interfaces into a simplified interface.

Proxy: also called a representative, also belongs to the category of structural patterns. Proxies are a generalization of a complex interface. They can be understood as complementary to the facade, which combines multiple interfaces into a single one.

It is certainly important in theory to separate these different scenarios in order to describe them correctly. In practice, however, it is not critical if hybrid forms of the design patterns presented here are used to encapsulate external functionality. For anyone interested in exploring design patterns in more depth, we recommend the book “Design Patterns from Head to Toe” [3].
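As a simplified sketch, such an encapsulation of the logging functionality could look like the following class; it is hypothetical and not the TP-CORE implementation:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// single place in the application that touches the external logging API
public class ApplicationLogger {

    private final Logger logger;

    public ApplicationLogger(Class<?> clazz) {
        this.logger = LoggerFactory.getLogger(clazz);
    }

    public void info(String message) {
        logger.info(message);
    }

    public void error(String message, Throwable cause) {
        logger.error(message, cause);
    }
}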

Class Reunion

Another step toward a stable API is detailed documentation. Based on the interfaces discussed so far, there’s a small library that allows methods to be annotated based on the API version. In addition to status and version information, the intended consumers of classes can be listed using the consumers attribute. To add API Guardian to your project, you only need to add a few lines to the POM and replace the ${version} property with the current version.

 <dependency>
    <groupId>org.apiguardian</groupId>
    <artifactId>apiguardian-api</artifactId>
    <version>${version}</version>
 </dependency>

Marking up methods and classes is just as easy. The @API annotation has the attributes status, since, and consumers. The following values are possible for status (a short usage sketch follows the list below):

  • DEPRECATED: Deprecated, should not be used any further.
  • EXPERIMENTAL: Indicates new features for which the developer would like feedback. Use with caution, as changes can always occur.
  • INTERNAL: For internal use only, may be discontinued without warning.
  • STABLE: Backward-compatible feature that remains unchanged for the existing major version.
  • MAINTAINED: Ensures backward stability for the future major release.
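A usage sketch on a hypothetical interface; the status, since, and consumer package pattern are example values:

import org.apiguardian.api.API;
import static org.apiguardian.api.API.Status.STABLE;

public interface MailClient {

    // marks this method as a stable part of the public API since version 1.0
    @API(status = STABLE, since = "1.0", consumers = "org.europa.together.application.*")
    boolean sendEmail(String recipient, String subject, String body);
}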

Now that all interfaces have been enriched with this useful meta information, the question arises where the added value can be found. I simply refer you to Figure 1, which demonstrates everyday work.

Figure 1: Suggestion in Netbeans with @API annotation in the JavaDoc

For service-based RESTful APIs, there is another tool called Swagger [4]. This also follows the approach of creating API documentation from annotations. However, Swagger itself scans Java web service annotations instead of introducing its own. It is also quite easy to use. All that is required is to integrate the swagger-maven-plugin and specify the packages in which the web services reside in the configuration. Subsequently, a description is created in the form of a JSON file for each build, from which Swagger UI then generates executable documentation. Swagger UI itself is available as a Docker image on DockerHub [5].

<plugin>
   <groupId>io.swagger.core.v3</groupId>
   <artifactId>swagger-maven-plugin</artifactId>
   <version>${version}</version>
   <configuration>
      <outputFileName>swagger</outputFileName>
      <outputFormat>JSON</outputFormat>
      <resourcePackages>
          <package>org.europa.together.service</package>
      </resourcePackages>
      <outputPath>${project.build.directory}</outputPath>
   </configuration>
</plugin>
Figure 2: Swagger UI documentation of the TP-ACL RESTful API.

Versioning is an important aspect for APIs. Using semantic versioning, a lot can be gleaned from the version number. Regarding an API, the major segment is significant. This first digit indicates API changes that are incompatible with each other. Such incompatibility includes the removal of classes or methods. However, changing existing signatures or the return value of a method also requires adjustments from consumers as part of a migration. It’s always a good idea to bundle work that causes incompatibilities and publish it less frequently. This demonstrates project stability.

Versioning is also recommended for Web APIs. This is best done via the URL by including a version number. So far, I’ve had good experiences with only incrementing the version when incompatibilities occur.

Relationship Stress

The great advantage of a RESTful service, being able to get along well with “everyone,” is also its greatest curse. This means that a great deal of care must be taken, as many clients are being served. Since the interface is a collection of URIs, our focus is on the implementation details. For this, I’ll use an example from my TP-ACL project, which is also available on GitHub.

RolesDO role = rolesDAO.find(roleName);
if (role != null) {
    // serialize only when the role actually exists
    String json = rolesDAO.serializeAsJson(role);
    response = Response.status(Response.Status.OK)
            .type(MediaType.APPLICATION_JSON)
            .entity(json)
            .encoding("UTF-8")
            .build();
} else {
    response = Response.status(Response.Status.NOT_FOUND).build();
}

This is a short excerpt from the try block of the fetchRole method found in the RoleService class. The GET request returns a 404 error code if a role is not found. You probably already know what I’m getting at.

When implementing the individual actions GET, PUT, DELETE, etc. of a resource such as a role, it’s not enough to simply implement the so-called happy path. The possible outcomes of such an action should already be considered during the design phase. For the implementation of a consumer (client), it makes a significant difference whether a request that cannot be completed with a 200 failed because the resource does not exist (404) or because access was denied (403). Here, I’d like to allude to the telling Windows message about the unexpected error.

Conclusion

When we talk about an API, we mean an interface that can be used by other programs. A major version change indicates to API consumers that there is an incompatibility with the previous version. This may require adjustments. It is completely irrelevant what type of API it is or whether the application uses it publicly or internally via the fetchRole method. The resulting consequences are identical. For this reason, you should carefully consider the externally visible areas of your application.

Work that leads to API incompatibility should be bundled by release management and, if possible, released no more than once per year. This also demonstrates the importance of regular code inspections for consistent quality.



Tooltime: SCM-Manager

If you and your team are dealing with tools like Git or Subversion, you may need an administrative layer where you are able to manage user access and repositories in a comfortable way, because source control management systems (SCM) don’t bring this functionality out of the box.

Perhaps you are already familiar with popular management solutions like GitHub, GitBlit or GitLab. The main reason for their success is their huge functionality. And of course, if you plan to create your own build and deploy pipeline with an automation server like Jenkins you will need to host your own repository manager too.

As great as the usage of GitLab and other solutions is, there is also a little bitter taste:

  • The administration is very complicated and requires some experience.
  • The hardware resources required to operate those programs with good performance are not exactly small.

To overcome all these hurdles, I will introduce a new star in the toolmakers’ sky: SCM-Manager [1]. Fast, compact, extendable and simple are the main attributes I would use to describe it.

Kick Starter: Installation

Let’s have a quick look at how easy the installation is. For fast results, you can use the official Docker container [2]. All it takes is a short command:

docker run --name scm --restart=always \
-p 8080:8080 -p 2222:2222 \
-v /home/<user>/scmManager:/var/lib/scm \
scmmanager/scm-manager:2.22.0

First, we create a container named scm based on the SCM-Manager image 2.22.0. Then, we tell the container to always restart when the host operating system is rebooted. Also, we open the ports 2222 and 8080 to make the service accessible. The last step is to mount a directory inside the container, where all configuration data and repositories are stored.

Another option to get the SCM-Manager running on a Linux server like Ubuntu is by using apt. The listing below shows how to do the installation.

echo 'deb [arch=all] https://packages.scm-manager.org/repository/apt-v2-releases/ stable main' | sudo tee /etc/apt/sources.list.d/scm-manager.list  
sudo apt-key adv --recv-keys --keyserver hkps://keys.openpgp.org 0x975922F193B07D6E 
sudo apt-get update 
sudo apt-get install scm-server

SCM-Manager can also be installed on systems like Windows or Apple. You can find information about the installation on other systems on the download page [3]. When you perform an installation, you will find a log entry with a startup token in the console.

Startup token in the command line.

After this you can open your browser and type localhost:8080, where you can finish the installation by creating the initial administration account. In this form, you need to paste the startup token from the command line, as shown in image 2. After you have submitted the initialization form, you get redirected to the login. That’s all, and it’s done in less than 5 minutes.

Initialization screen.

For fully scripted, unattended installations, there is also a way to bypass the initialization form by using the system property scm.initalPassword. This creates a user named scmadmin with the given password.

In older versions of SCM-Manager, the default login account was scmadmin with the password scmadmin. This old way is quite convenient, but if the administrator doesn’t disable this account after the installation, there is a high security risk. This security improvement is new since version 2.21.

Before we discover more about the administration together, let’s first get some details about SCM-Manager in general. SCM-Manager is open source under the MIT license, which allows commercial usage. The code is available on GitHub. The project started as research work. Since version 2, the company Cloudogu has owned the codebase and manages the future development. This construct allows offering professional enterprise support for companies. Another nice detail is that SCM-Manager is made in Germany.

Pimp Me Up: Plugins

One of the most exciting details of using SCM-Manager is that there is a simple possibility to extend the minimal installation with plugins to add more useful functions. But be careful, because the more plugins are installed, the more resources SCM-Manager needs. Every development team has different priorities and necessities; for this reason, I’m always a fan of customizing applications to my needs.

Installed Plugins.

The plugin installation section is reachable via the Administration tab. If you can’t see this entry, you don’t have administration privileges. In the menu on the right side, you find the entry Plugins. The plugin menu is divided into two sections: installed and available. For a better overview, the plugins are organized by categories like Administration, Authorization, or Workflow. The short description for each plugin is very precise and gives a good impression of what it does.

Some of the preinstalled plugins, like those in the category Source Code Management for the supported repository types Git, Subversion, and Mercurial, can’t be uninstalled.

Some of my favorite plugins are located in the authorization section:

  • Path Write Protection, Branch Write Protection, and,
  • Tag Protection.

Those features are the most convenient ones for build and configuration managers. The usage is as simple as the installation. Let’s have a look at how it works and what it’s necessary for.

Gate Keeper: Special Permissions

Imagine your team deals, for example, with a Java/Maven project. Perhaps there is a rule that only selected people should be allowed to change the content of the pom.xml build logic. This can be achieved with the Path Write Protection plugin. Once it is installed, navigate to the code repository and select the entry Settings in the menu on the right side. Then click on the option Path Permissions and activate the checkbox.

Configuring Path permissions.

As you can see in image 4, I created a rule that only the user Elmar Dott is able to modify the pom.xml. The opposite is to exclude (deny) the user. If the file or a path expression doesn’t exist, the rule cannot be created. Another important detail is that this permission covers all existing branches. For easier administration, existing users can be organized into groups.

In the same way, you are able to protect branches against unwanted changes. A scenario where you could need this option is when your team uses many branches or the git-flow branch model. For example, personal developer branches could be writable only by the developer who owns the branch, or the release branch where the CI/CD pipeline runs could be writable only by the Configuration Management team members.

Let’s move on to another interesting feature: the review plugin. This plugin enables pull requests for your repositories. After installing the review plugin, a new entry called Pull Requests appears in the menu of your repositories.

Divide and Conquer: Pull Requests

Used in the right way, pull requests [4] are a very powerful workflow. During my career, I have often seen pull requests misused, which led to drastically reduced productivity. For this reason, I would like to go deeper into this topic.

Originally, pull requests were designed for open source projects to ensure code quality. Another name for this paradigm is dictatorship workflow [5]. Every developer submits his changes to a repository and the repository owner decides which revision will be integrated into the codebase.

If you host your project sources on GitHub, strangers can’t just contribute to your project; they first have to fork the repository into their own GitHub space. After they commit some revisions to this forked repository, they can create a pull request to the original repository. As the repository owner, you can then decide whether to accept the pull request.

The SCM tool IBM Synergy had a similar strategy almost 20 years ago. Its usage was so complicated that many companies decided to move to other solutions. These days, it looks like history is repeating itself.

The reason why I’m skeptical about pull requests is very pragmatic. I have often observed in projects that the manager doesn’t trust the developers. He then decides to implement the pull request workflow and makes the lead developer or the architect accept the pull requests. These people are usually too busy and can’t really check all the details of every single pull request. Hence, their solution is to simply merge each pull request into the codebase and check whether the CI pipeline still works. This way, pull requests are just a waste of time.

There is another way in which pull requests can really improve the code quality in a project: when they are used as a code review tool. How this works will be the subject of another article. For now, we leave pull requests and move to the next topic: the creation of repositories.

Treasure Chest: Repository Management

The SCM-Manager combines three different source control management repository types: Git, Subversion (SVN), and Mercurial. You could think that nobody uses Subversion anymore, but keep in mind that many companies have to deal with legacy projects managed with SVN. A migration from those projects to other technologies may be too risky or simply expensive. Therefore, it is great to have a solution that can manage more than one repository type.

If you are a Configuration Manager and have to deal with SVN, keep in mind that some things are a bit different. Subversion organizes branches and tags in directories. An SVN repository usually gets initialized with the folders:

  • trunk — like the master branch in Git.
  • branches — forks of trunk revisions where code changes can be committed independently.
  • tags — like branches without new code revisions.

In Git you don’t need this folder structure, because branches are organized in a completely different way. Git (and Mercurial), compared to Subversion, is a distributed Source Control Management system; branches are loosely coupled and can easily be deleted when they become obsolete. For now, I don’t want to get lost in the basics of Source Control Management, so let’s jump to the next interesting SCM-Manager plugins.

Uncover Secrets

If a readme.md file is located in the root folder of your project, you could be interested in the readme plugin. Once this plugin is activated and you navigate into your repository, the readme.md file is rendered as HTML and displayed.

The rendered readme.md of a repository.

If you wish to have a readable visualization of the repository’s activities, the activity plugin could be interesting for you. It creates a navigation entry in the header menu called Activity. There you can see all commit log entries and you can enter into a detailed view of the selected revision.

The activity view.

This view also contains a compare and history browser, just like clients such as TortoiseGit do.

The Repository Manager includes many more interesting details for the daily work. There is even a code editor, which allows you to modify files directly in the SCM-Manager user interface.

Next, we will have a short walk through the user management and user roles.

Staffing Office: User and Group Management

Creating new users is, like almost every activity in the SCM-Manager, a simple task. Just switch to the Users tab and press the create user button. Once you have filled out the form and saved it, you will be brought back to the Users overview.

Creating a new user

Here you can already see the newly created user. After this step, you will need to administrate the user’s permissions, because as of now the account doesn’t have any privileges. To change that, just click on the name of the newly created user. On the user’s detail page, select the menu entry Settings on the right side, then choose the entry named Permissions. Here you can select from all available permissions the ones you need for the created account. Once this is done and you have saved your changes, you can log out and log in with the new user to verify that everything worked.

If you need to manage a large number of users, it’s a good idea to organize them into groups. In that case, after a new user is created, the permissions inside the user settings are not touched and stay empty. Group permissions can be managed through the Groups entry in the header navigation. Create a new group and select Permissions from the menu on the right. This configuration form is the same as the one in the user management. If you wish to add existing users to a group, switch to the General section. In the text field Members, you can search for an existing user. Once the right one is selected, press the Add Member button. After that, submit the form; all changes are saved and the new permissions are applied.

To have full flexibility, users can be added to several groups (roles). If you plan to manage the SCM-Manager users by group permissions, be careful not to combine too many groups, because users could inherit rights you didn’t intend to give them. Currently, there is no compact overview showing in which groups a user is listed and which permissions are inherited from those groups. I’m quite sure this detail will be improved in a future version of the SCM-Manager.

Besides the internal SCM-Manager user management, there are plugins that allow you to connect the application to LDAP.

Lessons Learned

If you dared to wish for a simpler life in the DevOps world, maybe your wish came true: the SCM-Manager could be your best friend. The application offers a lot of functionality that I briefly described here, but there are even more advanced features that I haven’t mentioned in this short introduction. It is possible to create scripts and execute them against the SCM-Manager API, and a plugin for the Jenkins automation server is available. Other infrastructure tools, like Jira, Timescale, or Prometheus metrics gathering, also integrate with the SCM-Manager.

I hope this little article whets your appetite for this exciting tool and that you enjoy trying it out.

Resources


Version Number Anti-Patterns

After the Gang of Four (GoF), Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, published the book Design Patterns: Elements of Reusable Object-Oriented Software, describing problems and solutions as patterns became popular in almost every field of software development. Likewise, describing don’ts as anti-patterns became equally popular.

In publications that discussed these concepts, we find helpful recommendations for software design, project management, configuration management, and much more. In this article, I will share my experience dealing with version numbers for software artifacts.

Most of us are already familiar with a method called semantic versioning, a powerful and easy-to-learn rule set for how version numbers have to be structured and how the segments should increase.

Version numbering example:

  • Major: Incompatible API changes.
  • Minor: Add new functionality.
  • Patch: Bugfixes and corrections.
  • Label: SNAPSHOT marking the “under development” status.

An incompatible API change occurs when an externally accessible function or class is deleted or renamed. Another possibility is a change in the signature of a method, meaning the return type or parameters have been changed from the original implementation. In these scenarios, it’s necessary to increase the Major segment of the version number. Such changes present a high risk for API consumers because they need to adapt their own code.
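
To make this more tangible, here is a small hypothetical example (the interface and method names are invented for illustration): adding a parameter to a published method breaks every existing caller, so the next release has to jump, for instance, from 1.3.2 to 2.0.0.

import java.util.Locale;

// Hypothetical public API of a library to illustrate a breaking change.
// In version 1.3.2 the interface looked like this:
//
//     public interface ReportRenderer {
//         String render(String reportName);
//     }
//
// In version 2.0.0 a parameter was added. Every consumer has to adapt its
// calls, therefore the Major segment of the version number must increase.
public interface ReportRenderer {
    String render(String reportName, Locale locale);
}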

When dealing with version numbers, it’s also important to know that 1.0.0 and 1.0 are equal. This affects the requirement that the versions of a software release have to be unique; otherwise, it’s impossible to distinguish between artifacts. Several times in my professional experience, I was involved in projects without a well-defined process for creating version numbers. The effect was that the team responsible for securing the quality of the artifact got confused about which artifact version they were currently dealing with.

The biggest mistake I ever saw was storing the version of an artifact in a database together with other configuration entries. The correct procedure is to place the version inside the artifact in a way that nobody can change it from the outside after a release. The trap you could fall into is the process of updating the version after a release or installation.

Maybe you have a checklist for all manual activities during a release. But what happens after a release is installed in a testing stage and, for some reason, another version of the application has to be installed? Do you remember to change the version number manually? How do you find out which version is installed, or whether the information in the database is incorrect?

Detecting the correct version in this situation is a very difficult challenge. For that reason, the requirement is to keep the version inside the application. In the next step, we will discuss a secure and simple automated approach to this problem.

Our precondition is a simple Java library built with Maven. By default, the version number of the artifact is written down in the POM. After the build process, our artifact is created and named, for example, artifact-1.0.jar. As long as we don’t rename the artifact, we have a proper way to distinguish the versions. Even after a rename, a simple trick helps: unpack the JAR and look into the META-INF folder, where Maven places a pom.properties file containing the correct value.

Hardcoding the version in a property or class file would also work, as long as you don’t forget to update it every time. Branching and merging in SCM systems like Git may need your special attention to always keep the correct version in your codebase.

Another solution is to use Maven and its token replacement mechanism. Before you rush to try it out in your IDE, keep in mind that Maven uses two different folders: sources and resources. Token replacement in the sources will not work properly: after a first run, your variable is replaced by a fixed number and is gone, so a second run will fail. To prepare your code for token replacement, you first need to configure the resource filtering in the Maven build:

<build>
   <resources>
      <resource>
         <directory>src/main/resources/</directory>
         <filtering>true</filtering>
      </resource> 
   </resources>
   <testResources>
      <testResource>
         <directory>src/test/resources/</directory>
         <filtering>true</filtering>
      </testResource>
   </testResources>
</build>

After this step, you can use the ${project.version} property from the POM. Create a file named version.property in the resources directory whose content is just one line: version=${project.version}. After a build, you will find the version.property file in your artifact with the same version number you used in your POM. Now you can write a function that reads the file and exposes this property. You could store the result in a constant for use in your program. That’s all you have to do!

Example: https://github.com/ElmarDott/TP-CORE/blob/master/src/main/java/org/europa/together/utils/Constraints.java
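
For illustration, a minimal sketch of such a reader could look like the following (the class name VersionReader and the UNKNOWN fallback are chosen freely; the only assumption is that the filtered version.property file described above ends up on the classpath):

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public final class VersionReader {

    private VersionReader() { }

    // Reads the filtered version.property file from the classpath of the artifact
    // and returns the value of the 'version' key, or UNKNOWN if it can't be read.
    public static String getVersion() {
        Properties properties = new Properties();
        try (InputStream stream =
                VersionReader.class.getResourceAsStream("/version.property")) {
            if (stream == null) {
                return "UNKNOWN";
            }
            properties.load(stream);
        } catch (IOException ex) {
            return "UNKNOWN";
        }
        return properties.getProperty("version", "UNKNOWN");
    }
}

The result can then be assigned once to a constant and used everywhere in the program.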

Non-Functional Requirements: Quality

From experience, most of us know how difficult it is to express what we mean when talking about quality. Why is that? There are many different views on quality, and every one of them has its importance. What has to be defined for our project is something that fits its needs and works within the budget. Striving for perfection can be counterproductive if a project is to be completed successfully. We will start from a research paper written by B. W. Boehm in 1976 called “Quantitative evaluation of software quality.” Boehm highlights the different aspects of software quality and their proper context. Let’s look more deeply into this topic.

When we discuss quality, we should focus on three topics: code structure, implementation correctness, and maintainability. Many managers just care about the first two aspects, but not about maintenance. This is dangerous because enterprises will not invest in individual development just to use the application for only a few years. Depending on the complexity of the application, the price of creation can reach hundreds of thousands of dollars, so it’s understandable that the expected business value of such activities is estimated to be high. A lifetime of 10 years and more in production is very typical. To keep the benefits, adaptations will be mandatory, which implies a strong focus on maintenance. Clean code doesn’t automatically mean your application is easy to change. A very accessible article that touches on this topic was written by Dan Abramov. Before we go further into how maintenance could be defined, we will discuss the first point: the structure.

Scaffolding Your Project

An often underestimated aspect in development divisions is a missing standard for project structures. A fixed definition of where files have to be placed helps team members find points of interest quickly. Such a meta-structure for Java projects is defined by the build tool Maven. More than a decade ago, companies tested Maven and often adapted the tool to the folder structure already established in their projects. This resulted in heavy maintenance tasks as more and more infrastructure tools for software development came into use. Those tools operate on the standard that Maven defines, meaning that every customization hampers the integration of new tools or the exchange of an existing tool for another.

Another aspect to look at is the company-wide defined META architecture. When possible, every project should follow the same META architecture. This will reduce the time it takes a new developer to join an existing team and catch up with its productivity. This META architecture has to remain open for adaptations, which can be achieved by two simple steps:

  1. Don’t be concerned with too many details;
  2. Follow the KISS (Keep it simple, stupid.) principle.

A classical anti-pattern that violates the KISS principle is heavily customizing standards. A very good example of the effects of strong customization is described by George Schlossnagle in his book “Advanced PHP Programming.” In chapter 21, he explains the problems created for the team when they adapted the original PHP core instead of following the recommended way via extensions. As a consequence, every update of the PHP version had to be manually adjusted to include their own adaptations to the core. Taken together, structure, architecture, and KISS already define three quality gates that are easy to implement.

The open-source project TP-CORE, hosted on GitHub, applies the aforementioned points of structure, architecture, and KISS, and shows one approach to putting them into practice. This small Java library strictly follows the Maven convention with its directory structure. For fast compatibility detection, releases are defined by semantic versioning. A layered structure was chosen as its architecture and is fully described here. The main architectural decisions can be summarized as follows:

Each layer is defined by its own package, and the files follow a strict naming rule as well. No special prefix or suffix is used. The functionality Logger, for example, is declared by an interface called Logger and the corresponding implementation LogbackLogger. The API interfaces can be found in the package “business” and the implementation classes in the package “application.” Naming like ILogger and LoggerImpl should be avoided. Imagine a project that was started 10 years ago where LoggerImpl was based on Log4J. Now a new requirement arises, and the log level needs to be changed at run time. To solve this challenge, the Log4J library could be replaced with Logback. Now it is understandable why it is a good idea to name the implementation class after the interface, combined with the implementation detail: it makes maintenance much easier! The same convention can also be found within the Java standard API: the interface List is implemented by ArrayList, and again the interface is not labeled IList nor the implementation ListImpl.
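
A stripped-down sketch of this convention could look like the following two files (the package names org.sample.* and the log method are invented for illustration; the real TP-CORE classes differ):

// File business/Logger.java - the API layer only exposes the interface.
package org.sample.business;

public interface Logger {
    void log(String message);
}

// File application/LogbackLogger.java - the implementation carries the
// technical detail (Logback) in its name instead of a generic Impl suffix.
package org.sample.application;

import org.sample.business.Logger;

public class LogbackLogger implements Logger {

    @Override
    public void log(String message) {
        // Placeholder: a real implementation would delegate to the Logback API.
        System.out.println(message);
    }
}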

To summarize this short paragraph: a compact measurement rule set was defined to describe our understanding of structural quality. From experience, this description should be short. If other people can easily comprehend your intentions, they will willingly accept your guidance. In addition, the architect will be much faster in detecting rule violations.

Measure Your Success

The most difficult part is keeping the code clean. Some advice is not bad per se but, in the context of your project, may not prove useful. In my opinion, the most important rule is to always activate the compiler warnings, no matter which programming language you use! All compiler warnings have to be resolved when a release is prepared. Organizations dealing with critical software, like NASA, strictly apply this rule in their projects with great success.

Coding conventions about naming, line length, and API documentation, like JavaDoc, can be simply defined and checked by tools like Checkstyle. This process can run fully automated during your build. But be careful: even if the code checkers pass without warnings, this does not mean that everything is optimal. JavaDoc, for example, is problematic: with an automated Checkstyle check, it can be assured that the API documentation exists, but we have no idea about the quality of those descriptions.

There should be no need to discuss the benefits of testing here; let us rather take a walkthrough of test coverage. The industry standard of 85% code coverage by test cases should be followed, because with less than 85% coverage the complex parts of your application are usually not reached, while 100% coverage just burns your budget without resulting in higher benefits. A prime example is the TP-CORE project, whose test coverage is mostly between 92% and 95%; this was done to explore what is realistically possible.

As already explained, the business layer contains just interfaces, defining the API. This layer is explicitly excluded from the coverage checks. Another package is called internal and it contains hidden implementations, like the SAX DocumentHandler. Because of the dependencies the DocumentHandler is bound to, it is very difficult to test this class directly, even with mocks. This is unproblematic given that the purpose of this class is only internal usage. In addition, the class is implicitly tested by the implementations that use the DocumentHandler. To reach higher coverage, it could also be an option to exclude all internal implementations from the checks. But it is always a good idea to observe the implicit coverage of those classes to detect aspects you may be unaware of.

Besides the low-level unit tests, automated acceptance tests should also be run. Paying close attention to these points may avoid a variety of problems. But never trust those fully automated checks blindly! Regularly repeated manual code inspections will always be mandatory, especially when working with external vendors. In our talk at JCON 2019, we demonstrated how easily test coverage can be faked. To detect other vulnerabilities, you can additionally run checkers like SpotBugs and similar tools.

Tests don’t indicate that an application is free of failures, but they indicate a defined behavior for implemented functionality.

For a while now, SCM suites like GitLab or Microsoft Azure have supported pull requests, introduced long ago by GitHub. These workflows are nothing new; IBM Synergy applied the same technique. A Build Manager was responsible for merging the developers’ changes into the codebase. In practice, all the revisions delivered by the developers were just added to the repository by the Build Manager, who did not have sufficiently profound knowledge to judge the implementation quality. The usual practice was simply to ensure that the build was not broken and that the compiler always produced an artifact.

Enterprises have rediscovered this as a strategy for handling pull requests. Managers now often decide to use pull requests as a quality gate. In my personal experience, this slows down productivity because it takes time until the changes arrive in the codebase. Understanding the branch and merge mechanism helps you decide on a simpler branch model, like release branch lines. On those branches, tools like SonarQube can operate to observe the overall quality goal.

If a project needs an orchestrated build, with a defined order in which artifacts have to be created, this is a strong hint that refactoring is needed.

The coupling between classes and modules is often underestimated. It is very difficult to get an automated visualization of the bindings between modules. You will notice very quickly when loose coupling is violated, because the complexity of your build logic increases.

Repeat Your Success

Rest assured, changes will happen! It is a challenge to keep your application open for adjustments. Several of the previous recommendations have implicit effects on future maintenance. Good source quality makes being prepared easier, but there is no guarantee. In the worst case, the end of the product lifecycle (EOL) is reached when mandatory improvements or changes can no longer be realized, for example because of an eroded codebase.

As already mentioned, loose coupling brings numerous benefits with respect to maintenance and reuse. Reaching this goal is not as difficult as it might look. First of all, try to avoid including third-party libraries as much as possible. Just to check whether a String is empty or null, it is unnecessary to depend on an external library; these few lines are quickly written yourself. A second important point regarding external libraries is: only one library to solve a problem. If your project deals with JSON, then decide on one implementation and don’t incorporate several artifacts. These two points also heavily impact security: a third-party artifact we can avoid using cannot cause any security leaks.
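
For the mentioned String check, a hypothetical helper of a few lines is all that is needed instead of an additional dependency:

public final class StringUtils {

    private StringUtils() { }

    // true when the value is null, empty, or contains only whitespace
    public static boolean isEmpty(String value) {
        return value == null || value.trim().isEmpty();
    }
}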

Once the decision for an external implementation has been made, try to cover its usage in your project by applying design patterns like proxy, facade, or wrapper. This makes a replacement easier because the code changes are not spread across the whole codebase. You don’t need to change everything at once if you follow the advice on naming the implementation class and providing an interface. Even though an SCM is designed for collaboration, there are limitations when more than one person edits the same file. Using a design pattern to hide information allows you to apply your changes iteratively.
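
As a sketch of this idea, the following hypothetical facade hides a concrete JSON library behind an own interface (here Jackson is assumed as the chosen dependency), so only one class has to change if the library is ever replaced:

// File JsonFacade.java - the rest of the codebase depends only on this interface.
public interface JsonFacade {
    String toJson(Object value);
}

// File JacksonJsonFacade.java - the only place where the third-party library appears.
public class JacksonJsonFacade implements JsonFacade {

    private final com.fasterxml.jackson.databind.ObjectMapper mapper =
            new com.fasterxml.jackson.databind.ObjectMapper();

    @Override
    public String toJson(Object value) {
        try {
            return mapper.writeValueAsString(value);
        } catch (com.fasterxml.jackson.core.JsonProcessingException ex) {
            throw new IllegalStateException("JSON serialization failed", ex);
        }
    }
}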

Conclusion

As we have seen, a nonfunctional requirement is not that difficult to describe. With a short checklist, you can clearly define the important aspects for your project. It is not necessary to check all points for every commit in the repository; this would in all probability just raise costs without resulting in higher benefits. Running a full check around a day before the release is an effective way to keep quality in an agile context and helps to recognize where optimization is necessary. The points of interest (POI) for securing quality are the revisions in the codebase for a release. This gives you a comparable statistic and helps to improve estimations.

Of course, in this short article, it is almost impossible to cover all aspects regarding quality. We hope our explanations help you to link theory to best practice through examples. In conclusion, this should be your main takeaway: a high level of automation within your infrastructure, like continuous integration, is extremely helpful, but it doesn’t free you from manual code reviews and audits.

Checklist

  • Follow common standards
  • KISS – keep it simple, stupid!
  • Use the same directory structure across projects
  • Keep the META architecture simple, so it can be reused as much as possible in other projects
  • Define and follow coding styles
  • When a release is prepared, no compiler warnings are accepted
  • Aim for a test coverage of at least 85%
  • Avoid third-party libraries as much as possible
  • Don’t support more than one technology for a specific problem (e.g., JSON)
  • Cover third-party code with a design pattern
  • Avoid strong object/module coupling

Acceptance Tests in Java With JGiven

Most of the developer community knows what a unit test is, even if they don’t write them. But there is still hope: the situation is changing. More and more projects hosted on GitHub contain unit tests.

With a standard setup for Java projects, like NetBeans, Maven, and JUnit, it is not difficult to produce your first test code. This approach is the foundation of Test-Driven Development (TDD) and also exists in related techniques like Behavior-Driven Development (BDD), also known as acceptance testing, which is what we will focus on in this article.

Difference Between Unit and Acceptance Tests

The easiest way to become familiar with this topic is to look at a simple comparison between unit and acceptance tests. In this context, unit tests are very low level. They execute a function and compare the output with an expected result. Some people think differently about it, but in our example, the only one responsible for a unit test is the developer.

Keep in mind that the test code is placed in the project and always gets executed when the build is running. This provides quick feedback as to whether or not something went wrong. As long as a test doesn’t cover too many aspects, we are able to identify the problem quickly and provide a solution. The design principle of those tests follows the AAA paradigm: define a precondition (Arrange), execute the invariant (Act), and check the postconditions (Assert). We will come back to this approach a little later.
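
A minimal JUnit 5 sketch illustrates the three steps (the tiny Account class is invented and inlined just to keep the example self-contained):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

public class AccountTest {

    // Hypothetical class under test, inlined for the sake of the example.
    static class Account {
        private int balance;
        void deposit(int amount) { balance += amount; }
        int getBalance() { return balance; }
    }

    @Test
    void depositIncreasesBalance() {
        // Arrange: define the precondition
        Account account = new Account();

        // Act: execute the invariant
        account.deposit(100);

        // Assert: check the postcondition
        assertEquals(100, account.getBalance());
    }
}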

When we check the test coverage with tools like JaCoCo and cover more than 85 percent of our code with test cases, we can expect good quality. While increasing the test coverage, we specify our test cases more precisely and are able to identify some optimizations. This can mean removing or inverting conditions because, during testing, we find out that some sections are almost impossible to reach. Of course, the topic is a bit more complicated, but those details could be discussed in another article.

Acceptance tests are classified in the same way as unit tests: they belong to the family of regression tests. This means we want to observe whether changes we made to the code have any effect on functionality that already worked. In other words, we want to ensure that nothing that was already working gets broken by side effects of our changes. The tool of our choice is JGiven [1]. Before we look at some examples, we first need to touch on a bit of theory.

JGiven In-Depth

A test case we define in JGiven is called a scenario. A scenario is a collection of four classes: the scenario itself, the Given stage displayed as given (Arrange), the Action stage displayed as when (Act), and the Outcome stage displayed as then (Assert).

In most projects, especially when there is a huge number of scenarios and their execution consumes a lot of time, acceptance tests are organized in a separate project. With a build job on your CI server, you can execute those tests once a day to get fast feedback and react early if something is broken. The code example we demonstrate keeps everything in one project on GitHub [2] because it is just a small library and a separation would over-engineer the project. Usually, the party responsible for acceptance tests is the test center, not the developer.

The sample project TP-CORE is organized as a layered architecture. For our example, we picked the functionality for sending e-mails. The basic functionality for composing an e-mail is realized in the application layer and has a test coverage of up to 90 percent. The functionality for sending the e-mail is defined in the service layer.

In our architecture, we decided that the service layer is the focal point for defining acceptance tests. Here, we want to see whether our requirement to send an e-mail works well. Supporting this layer with additional unit tests is not very efficient because, in commercial projects, it just produces costs without gaining benefits. Having unit tests as well would mean doing double the work, because our JGiven tests already demonstrate and prove that the function works. For those reasons, it makes no sense to generate extra test coverage for the scenarios of the acceptance test.

Let’s start with a practical example. First, we need to include our acceptance test framework in our Maven build. In case you prefer Gradle, you can use the same GAV parameters to define the dependency in your build script.

<dependency>
   <groupId>com.tngtech.jgiven</groupId>
   <artifactId>jgiven-junit</artifactId>
   <version>0.18.2</version>
   <scope>test</scope>
</dependency>

Listing 1: Dependency for Maven.

As you can see in Listing 1, JGiven works well together with JUnit. An integration for TestNG also exists; you just need to replace the artifactId with jgiven-testng. To enable the HTML reports, you need to configure the Maven plugin in the build lifecycle, as shown in Listing 2.

<build> 
   <plugins> 
      <plugin>
         <groupId>com.tngtech.jgiven</groupId>
         <artifactId>jgiven-maven-plugin</artifactId>
         <version>0.18.2</version>
         <executions>
            <execution>
               <goals>
                  <goal>report</goal>
               </goals>
            </execution>
         </executions>
         <configuration>
            <format>html</format>
         </configuration>
      </plugin>
   </plugins>
</build>

Listing 2: Maven Plugin Configuration for JGiven.

The report of our scenarios in the TP-CORE project is shown in image 1. As we can see, the output is very descriptive and human-readable. This result is achieved by following some naming conventions for our methods and classes, which will be explained in detail below. First, let’s discuss what we can see in our test scenario. We defined five preconditions:

  1. The configuration for the SMTP server is readable
  2. The SMTP server is available
  3. The mail has a recipient
  4. The mail has attachments
  5. The mail is fully composed

If all these preconditions are fulfilled, the action send a single e-mail is performed. Afterward, the SMTP server is checked to verify that the e-mail has arrived. For the SMTP service, we use the small Java library greenmail [3] to emulate an SMTP server. Now it is understandable why it is advantageous for acceptance tests to be written by other people: this increases quality, because conceptual inconsistencies show up early. As long as the tester cannot map the required scenario with the available steps, the requirement is not fully implemented.

Producing Descriptive Scenarios

Now is a good time to dive deeper into the implementation details of our send e-mail test scenario. Our object under test is the class MailClientService. The corresponding test class is MailClientScenarioTest, defined in the test packages. The scenario class definition is shown in Listing 3.

@RunWith(JUnitPlatform.class) 
public class MailClientScenarioTest
       extends ScenarioTest<MailServiceGiven, MailServiceAction, MailServiceOutcome> {
    // do something 
}

Listing 3: Acceptance Test Scenario for JGiven.

As we can see, we execute the test framework with JUnit 5. In the ScenarioTest, we can see the three stage classes Given, Action, and Outcome following a special naming convention. It is also possible to reuse already defined stage classes, but be careful with such practices, as this can cause side effects. Before we implement the test method, we need to define the execution steps. The procedure is equivalent for all three classes.

@RunWith(JUnitPlatform.class) 
public class MailServiceGiven 
       extends Stage<MailServiceGiven> {

    public MailServiceGiven email_has_recipient(MailClient client) {
        try { 
            assertEquals(1, client.getRecipentList().size());
        } catch (Exception ex) {
            System.err.println(ex.getMessage());
        }
        return self(); 
    } 
} 

@RunWith(JUnitPlatform.class)
public class MailServiceAction
       extends Stage<MailServiceAction> {

    public MailServiceAction send_email(MailClient client) {
        MailClientService service = new MailClientService();
        try {
            assertEquals(1, client.getRecipentList().size());
            service.sendEmail(client);
        } catch (Exception ex) { 
            System.err.println(ex.getMessage());
        }
        return self();
    }
}

@RunWith(JUnitPlatform.class)
public class MailServiceOutcome 
       extends Stage<MailServiceOutcome> {

    public MailServiceOutcome email_is_arrived(MimeMessage msg) { 
         try {
             Address adr = msg.getAllRecipients()[0];
             assertEquals("JGiven Test E-Mail", msg.getSubject());
             assertEquals("noreply@sample.org", msg.getSender().toString());
             assertEquals("otto@sample.org", adr.toString());
             assertNotNull(msg.getSize());
         } catch (Exception ex) {
             System.err.println(ex.getMessage());
         }
         return self();
    }
}

Listing 4: Implementing the AAA Principle for Behavioral Driven Development.

Now we have completed the cycle and can see how the test steps are stitched together. JGiven supports a richer vocabulary to cover more needs. To explore the full possibilities, please consult the documentation.
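
To illustrate how the stages from Listing 4 could be stitched together, a test method inside MailClientScenarioTest might look roughly like the following sketch (the helper methods composeTestMail and fetchMailFromGreenMail are hypothetical placeholders for composing the MailClient and fetching the MimeMessage from the greenmail server):

@Test
public void sendSingleEmail() {
    MailClient client = composeTestMail();          // hypothetical helper
    given().email_has_recipient(client);
    when().send_email(client);
    MimeMessage message = fetchMailFromGreenMail(); // hypothetical helper
    then().email_is_arrived(message);
}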

Lessons Learned

In this short workshop, we went through all the important details to get started with automated acceptance tests. Besides JGiven, other frameworks like Concordion or FitNesse compete for adoption. We chose JGiven for its helpful documentation, its simple integration into Maven builds and JUnit tests, and its descriptive, human-readable reports.

One negative point that could keep people away from JGiven is the fact that the tests have to be described in the Java programming language. That means the test engineer needs to be able to develop in Java in order to use JGiven. Apart from this small detail, our experience with JGiven is absolutely positive.