El manejo de excepciones debe ser un conocimiento básico para los desarrolladores de Java. Pero un uso seguro no es tan fácil como parece en un primer momento. También varios libros recomiendan no usar excepciones para evitar problemas de rendimiento. En el medio, nuestro propio código está luchando por las excepciones y tenemos que encontrar el punto exacto de donde proviene el problema. No siempre es una tarea fácil. Porque la posición en la que se detectó la excepción a menudo debe mejorarse para recopilar información relevante. En esta charla comparto mi experiencia sobre cómo tratar en general las excepciones. Explico con ejemplos cómo tratar las excepciones y cuándo es mejor evitar el uso de excepciones. Después de esta presentación, la cantidad de información en sus mensajes de error no aumenta, porque obtendremos la información importante para solucionar el problema donde ocurra.
El flujo del programa no solo se define mediante sentencias if-else. Las excepciones le permiten manejar los problemas antes de que ocurran, si lo hace bien. Aprenda cómo recopilar información sobre las excepciones lanzadas y vea la práctica que debe evitar.
Abstract:In the last decades, many standards were established to increase productivity during Software Lifecycle Management. All these techniques and methodologies promise a higher success rate in software projects which could affirm themselves in the case the involved protagonists are willing to follow the instances recommended. Semantic Versioning, for example, addresses the information leak between functional changes, BugFixes and compatibility of existing and future releases of artifacts. Diving deeper into the daily craftsmanship of software projects enables us to identify the Source Control Management Systems (SCM) as a big treasure box. Much information can be extracted from these repositories, which are currently ignored for project analyzing. Expressions on SCM Commit Messages represent a new formalism that is both human-readable and machine-processable. Such a standard also forms a bridge between the code base and the requirements management and release management, since these activities are identified by a freely expandable vocabulary in the SCM. Another advantage of this strategy is the clear and compact expressiveness for development teams. A very practical aspect of my proposal is the easy applicability of the presented solution in real software development projects. As with the Semantic Versioning methodology already mentioned, there are no additional technical requirements to be met, since commit messages are a fundamental function of SCM systems. This paper discuss the option to improve data collection for controlling software projects and knowledge sharing in collaborative teams.
To cite this article: Marco Schulz. Expressions for Source Control Management Systems. American Journal of Software Engineering and Applications. Vol. 11, No. 2, 2022, pp. 22-30. doi: 10.11648/j.ajsea.20221102.11
Download the PDF: https://www.sciencepg.com/journal/paperinfo?journalid=137&doi=10.11648/j.ajsea.20221102.11
1. Introduction
Thinking about SCM systems we have to keep in mind, that since the first roll out of CVS in the early 1990‘s and today, many things have changed. Searching the free online encyclopedia Wikipedia, presents a page ”Comparison of Version Control Software” which contains an overview of version control software of more than 30 SCM tools. This gives an idea why software companies usually have around three or more different SCM systems in work – of course the real amount depends on how many years they are in business.
The possibility to attach every revision in SCM Systems with a commit message allows the developer to inform other users with a short explanation of his work. This feature is extremely helpful by browsing the history manually in search of special code changes. If these commit messages well structured there exist a possibility to grab automated information of project growth. In this paper on expressions is introduced as solution for structured commit messages which could processed by software and also helps developers to resume their work more efficient.
The list of research on SCM is quite overwhelming and covers multiple aspects. The work of Walter F. Tichy on RCS [2] presents a deep fundamental insight into technical aspects of SCM systems. Abdullah Uz Tansel et al. gives in his research a brief history and builds a bridge to nowadays SCM systems [11]. The paper of Christian Bird et al. describes the ideas why companies deal with various SCM solutions [12]. Many existing papers like the one from Filip Van Rysselberghe and Serge Demeyer already identified SCM repositories as a significant information storage [5], which contains more than a simple history of source code. The approach from Louis Glassy to observe the growth of students in the software development process by using SCM techniques [6] demonstrates another method to grab implicit information from SCM. Alongside the fundamental research in software engineering, there exists a great resource of Blogs, articles and books from people who are directly involved in the topic. They describe experiences and best practice to make the next release come true, as referred towards the web resources in the footnotes. A small selection of related practitioners books is also included in the reference list.
Let us take a closer look at how processes for SCM could be improved. For this reason, section II defines the terminology of this paper and talks in detail about merging and branching strategies. Section III remind some basic knowledge on SCM and gives a simple idea about how complex build and deploy pipelines interact. Following this quick journey, section IV draws a picture about real problems that occur in software development projects and explains possible Points of Interest (POI) inside an SCM repository. These fundamentals allow a definition of the vocabulary we introduce in section V. A real world example will demonstrate in VI the cardinality of the expression and gives ideas about its usage. After all, section VII will reflect and summarize these thoughts. The last section talks about ideas how future work could be continued.
Figure 1: Branch and Merge.
The definitions in this section are based on the English dictionary Merriam Webster with a contextual relation to SCM systems. The term Source Control Management System (SCM) is applied in this paper to describe tools like CVS, Subversion (SVN) or Git. Many other names have appeared over the years in literature for this type of tools. All these terms like Version Control System (VCS) or Revision Control System (RCS) are considered as equal to each other.
Artifact “A USUALLY SIMPLE OBJECT (SUCH AS A TOOL OR ORNAMENT) SHOWING HUMAN WORKMANSHIP OR MODIFICATION AS DISTINGUISHED FROM A NATURAL. OBJECT; “ESPECIALLY: AN OBJECT REMAINING FROM A PARTICULAR PERIOD”. In the context of SCM, an artifact is a binary result of the build process. Artifacts can be libraries, applications and so on.
Repository “A PLACE, ROOM, OR CONTAINER WHERE IS DEPOSITED OR STORED”. In software engineering a repository denotes a managed storage. We can distinguish repositories for source code and for binary artifacts.
Revision “A CHANGE OR A SET OF CHANGES THAT CORRECTS OR IMPROVES SOMETHING”. Each successful commit from a user to the SCM represents a change of the internal state in the SCM. These different states are revisions. Subversion for example increments an internal number after each commit [18]. This unique identifier is called revision number. Git on the other hand manages the revision number smarter and creates SHA-1 Hashes from each commit as an identifier [15]. This brings more flexibility for dealing with branches.
Release “TO GIVE PERMISSION FOR PUBLICATION, PERFORMANCE, EXHIBITION, OR SALE OF; ALSO: TO MAKE AVAILABLE TO THE PUBLIC”. A release defines a set of functional assertions for an artifact. When all functions are implemented, a test procedure is started to exclude as many failures as possible. After the termination of testing and corrections, the artifact gets packed for delivery. To distinguish the different versions of an artifact, it gets labeled by a unique version number. By convention, it is not allowed to have more than one artifact with the same version number.
Tag “A DESCRIPTIVE OR IDENTIFYING EPITHET”. -A Tag is a label to a special revision, like a release, and is used as bookmark.
Trunk “THE CENTRAL PART OF ANYTHING”. A trunk is a common convention and means the main branch, where the current development happens [17]. In Git this branch is called master for the local repository and orgin in the remote repository. Branching and Merging is one of the major feature in SCM systems and also a high sophisticated operation. It is not so unusual that developers and also Configuration Managers struggle with this. The paper of Shaun Phillips et al. contains a developer comment about the dealing with SCM and the pain of merging [10].
“We are a team of four senior developers (by which I mean we’re all over 40 with 20+ years each of development experience) and not one of us has had a positive experience in the past with branching the mainline… The branch is easy – it’s the merge at the end that’s painful.”
This shows that even persons with many years of experience need a detailed explanation of a seemingly trivial procedure. A simple understanding how branches typically have to be used and how they represent the evolution of a real software project is of high relevance for this paper. Figure 1 explains the optimal interaction between branches and the trunk which is described by Chuck Walrad and Darrel Strom as Branch by Release Model [3]. In addition to the context of branching and merging there is a version tree sample graph explained by Yongchang Ren et al. in their paper [8].
In order to give a comprehensive explanation of the process we assume a simple Java library project. As build tool Apache Maven is chosen which is successfully used for years by many different commercial and Open Source projects. Maven defines many standards for the software development process and implements them. Its success feature is a highly efficient dependency management.
The information about the artifact version number is managed in the pom.xml, the Maven build file. For this reason the POM has our special attention. In the context of Maven a versions number is labeled SNAPSHOT while it is still under development. This convention allows in collaborative teams the sharing of non official published artifacts. After removing the label SNAPSHOT the artifact is released. By convention it is not possible to have more than one artifact with the same version number. In section III this topic is discussed in more detail. For the moment it is necessary to know that this convention takes effect in collaborative processes. The correct way to share artifacts is the usage of a Repository Manager. The most common Repository Manager is Sonatype Nexus OSS which is used for Maven Central [19] to deliver dependencies. Nexus will refuse the request if a developer tries to publish an already existing release of an artifact. With this infrastructure it is not necessary to transfer binary artifacts to the SCM. This tool chain is a simple example for a highly complex infrastructure to build and deliver software in large companies.
In figure 1 the development starts with version 1.0-SNAPSHOT. After the release of this version, the development of the next version 1.1-SNAPSHOT continues in trunk. The revision of the released version 1.0 gets branched to fix some bugs. The branch will not be created automatically during the release, rather it gets created when there is a need, for example BugFixes. The branch will be named by its minor version 1.0 to stay flexible for further corrections. After a correct BugFix the changes get merged back to trunk and so on. It is very important to keep in mind, that after a release, no new functionality can be added to the versions 1.0.X, only corrections are allowed.
The merging of failure corrections can lead to complications if there already exist deployed versions. When a bug is detected down to an existing version it will be necessary to fix all following versions and increment their version number as part of the correction. For example if there exist released versions 1.0.2, 1.1.1, 1.2.3 & 2.0.1. and the fix has been done in version 1.0.2 it will have to be renamed 1.0.3 for release. The merge direction is always from the lower to the higher version which means that the version numbers of all following involved artifacts have to be increased. By this it can be assured that only fixes will be exchanged and no functionality is moving form an higher to a lower version within the merging process.
In this model the case of parallel feature development is missing. This happens when a very complex functionality is planned and the implementation cannot be finished in one release cycle. This especially often occurs in agile projects with a short time line between releases. Feature Branches address this requirement as well. The process is a simple extension of the Branch by Release Model. The Feature Branch will be created from the trunk and will be named like the feature. To test compatibility this branch at least needs to be merged from the trunk after each release. A merge can also be performed if the trunk provides important new features – whenever necessary.
A very useful advanced usage of branches is the stash command, that comes as build-in with Git. Indeed this feature is not so common but simple and powerful. Imagine a developer is working on some implementation with the urgency of having to deliver a BugFix for another release. He needs to switch his workspace to this branch but the current work needs to be saved without a direct commit to the trunk. The solution is create a branch and check in the current work and hence switch the branch for the fix. After all is done he will have to switch to the stashed branch, finish the work and merge the result to the trunk. An often observed procedure for developers are simultaneous checkouts of different branches and just switching the IDE workspace. By experience in large companies, this is very time consuming and error prone. By the law of Murphy, the only needed branch is the one not present in a local checkout collection.
To get in touch with branch models more profoundly, the website of the Git SCM [20] presents different branching workflows. Also at [21] exists a very detailed explanation for Git branch and merge best practices.
3. Quick Survey on SCM Basics
As described, there exists a huge amount of Source Control Management solutions. Even just picking out the most popular systems, we are able to identify many differences in detail. These may be the reasons why some tools have become more popular than others. Naturally, all of these systems do the job and are based on common ideas. A very early and fundamental work on SCM systems done by Tichy gives a deep insight about the Theory on how an SCM should be constructed [2]. Today, based on the approach of how things are done, we can classify them. Directory and file based systems, like Microsoft Visual Source Safe, are part of the less effective group of SCM. In commercial environments this group has low relevance because quite often it causes inconsistencies of the repository. This leads us to the category of Client-Server solutions. Client-Server SCM systems have two manifestations: centralizedand distributed. SVN is the most famous representative for centralized solutions. In new projects the choice of the day will very often be Git, a very popular distributed SCM tool. In “Transition from Centralized to Decentralized Version Control Systems” the authors describes why decentralized SCM systems are favored by developers [12]. Interviews of developers have shown the benefits and risks of applicated SCM systems. They deliver a well elaborated explanation why distributed SCM has a higher learning curve. This finding is a important principle for dealing with SCM.
SCM systems are designed to handle plain text files, like those used for source code. After a file has undergone configuration management and had an initial transfer into the repository, the system stores only a delta of the changes for every new transaction. With this requirement the repository is more efficient and needs less disk storage. This implies binary files like office documents should not be stored in SCM repositories because the system cannot calculate a delta and will always store a complete new copy of the file, if it has been changed. A solution for dealing with binaries, like dependencies or third party libraries, are Repository Managers which were introduced in section II.
Figure 2: Changes in the POM, based on Semantic Versioning.
At this point some performance issues for SCM have to be taken in consideration. This is of outstanding importance, because it defines how a repository should be organized. Large projects with a code repository up to 1 GB take a long time for a checkout, even though there is only a small subset of files that are chosen. 20 minutes and more are very common. The reason for this effect is the size of the repository itself. When it contains a lot of files it takes more time to calculate the internal tree. The best solution for a high performance repository is: Only text files and just one independent project or module per repository.
In continuation surges question how files are represented in a SCM. As an example we remember the small Java library project with the Maven build logic. The build logic is represented as an XML file and contains the entry <version>. This entry defines the version number of the artifact and starts with an initialization of 1.0.0-SNAPSHOT. The procedure to increase the version number strictly follows the Semantic Versioning. Figure 2 visualizes several steps between two releases. For each revision a label describes the process and the version number show the value in the POM file. This graphic is an extension with a detailed view of figure 1.
In reality things are never like explained in theory. Initial assumption often create a big dilemma in automation processes when it comes to execution. It is very easy to claim, that in a repository, the entry for version in the POM for releases is unique. For example, it means that there should not exist two revisions with a released version 1.0. But where humans work, mistakes will happen. For this reason we have the option to create tags into the SCM. Every revision in the SCM which represents a deployed release, will be tagged with the correct version number. Deployed releases are defined by a successful transfer of the binary artifact into the Repository Manager for collaborative usage.
4. Scenarios on Real Problems
We should focus our activities on special points in respect to the evolution of software projects. It is not useful to pay attention on each single revision. Let us highlight the Points of Interest (POI) and why they are special. In real projects with collaborative teams, it is quite common that a developer breaks the current build. The good news are: when Continuous Integration (CI) is applied in the process, these kind of problems will be detected very quickly and can be solved at the instance of them appearing [16]. But how a developer is able to break a build? This occurs when the changes get committed into the repository and some files are not included in the commit. A repair can easily and fast be done by adding a new commit with the missing files needed. In this case it is very important to realize that only the one who delivered an incomplete package is able to add the missing parts. Problems arise when this happens on a Friday evening and the person responsible is leaving the office for vacations the next two or tree weeks without checking that everything is in order, causing unnecessary pain in the continuation of the project. These things happen much more often than anyone would expect.
Another effect is called fast shots. These small and often repeated commits typically change only a few lines in just one or two files. This happens when a user for some reason is not able to test his code or settings locally on his own machine. A simple scenario could be the manipulation of the CI Server build output without direct access.
A work flow for developers is the usage of particular commits in order to preserve intermediate steps of the work and allow an easy rollback. This procedure is only applicable in distributed systems or in environments without collaboration. The effect is quit similar. It will produce many revisions inside the SCM, which could get summarized to a single revision.
The Continuous Delivery approach for modern Web Applications is a quite different method compared to the classical release process [14]. This technique requires special strategies like the Feature Toggle Pattern [22] and a highly automated deploy pipeline. Also the usage of the SCM system is very advanced. Each feature is developed in its own branch and the Configuration- or Build Manager creates for each deployment a proper Integration Branch. The biggest challenge in this methodology are fast responses towards urgent problems arising. In the worst case it could be necessary to push out very quickly a new deployment with a full or partial rollback. During deployments database changes are very critical. This aspect could be discussed in a further paper. Databases are not implicitly part of the SCM, but there also exist techniques [23] to keep them under configuration management.
Figure 3: Structure of a commit naming.
As mentioned before, a release R inside an SCM is defined by several commits to the SCM. These commits are identified by the revision r. The lowest amount of revisions between two release is one, but there is no limit concerning to the upper boundary. Special Points of Interests inside an SCM are released revisions which can formally defined by (2).
R := {r 1, r 2, r 3, r n+1,…, r x } (1)
POI:= ∆ Release (R; R + 1) (2)
By this interpretation we are able to develop metrics which show a real project growth and do not just produce an output [13]. The paper of P. Kaur and H. Singh contains a collection of metrics related to their VVCT SCM [9]. An adapted suggestion for possibilities to compare project evolution is:
the amount of BugFix releases in a minor branch,
an count of revisions between two release,
the growth between minor and major release (e.g. Line of Codes),
a direct comparison between the current trunk and a previous release,
two selected releases,
a comparison of an release R and its replacement.
For example the amount of BugFix releases for a minor release allows a conclusion about the quality situation of a project. It is very important to understand the reasons to improve program stability and reduce the number of BugFixes. A classification for changes is described by Swanson [1]. An overview of the project based on these classifications of BugFixes should detect the issues that have to be changed to accomplish high quality.
5. A Vocabulary for SCM Commit Messages
In the early times SCM systems were used for synchronizing source code between developers. Typically users were not paying too much attention to write well formulated explanations about their changes. In many instances they were not leaving any description about what they did. Another extreme was that comments like update build logic frequently appeared in the history. An explanation of everything and nothing without saying what was changed or why. It could either be a version update of an existing library or the addition of a new dependency leading to a heavy time-consuming work in order to identify the points of interest in the commit history. Manual checks between the version with a Diff Tool would be necessary to locate the Line of Code that may have to be changed again. Guidelines have been introduced on how to write a well formulated commit message to solve this problems. A short selection of these guides published on the internet: [24, 25, 26] It was discovered by companies that the approach to apply well formulated descriptions of SCM revisions can improve productivity in teams. By exploring new projects on Source Code Hosting Services like GitHub or Sourceforge the quality of commit messages was increasing in the last years.
Based on these recommendations and the experience gained as of today, a vocabulary should be introduced for writing easier and more efficient commit messages. This simple-to-use standardization could help to visualize the evolution of a project more clearly. By very precise and short explanation of every revision readers do not get flooded with information. This allows analysts to see patterns of process leaks more quickly and increases the team productivity. The usage of a defined structure also allows an automatism to parse the commit messages. The result can generate programmatic presentations of diagrams readable by humans. Naturally this approach is not only limited to SCM. Another usage could be for communication in meetings with strict time limitations, for example in the agile method Scrum.
The vocabulary for SCM Commit Messages follows a defined structure which is shown in figure 3. The composition contains a mandatory first line and includes a FunctionID, label and a short specification. The second and third line is optional and contains the TaskID from the Issue Management System and a description of the more detailed explanation. Our suggestion for the vocabulary covers most SCM work flows. It may will be that some companies need adoptions to implement this solution in their processes. For this reason the definition is flexible and allows extensions.
#INIT – the repository or a release.
repro:documentation / configuration…
archetype:jar / war / ear / pom / zip…
version:<version>
#IMPLEMENT – a functionality.
function:<clazz>
#CHANGE – a functionality.
function:<clazz>
#EXTEND – a functionality.
function:<clazz>
attach:<clazz>
#BUGFIX – a functionality.
priority:critical / medium / low / design
#REVIEW – an implementation.
refactor:<function>
analyze:<quality>migrate:<function>
format:<source>
#RELEASE – an artifact.
version:<version>
#REVERT – a commit.
commit:<id>
#BRANCH – create.
create:<name>
stash:<branch>
#MERGE – from another branch.
from:<branch>
to:<branch>
#CLOSE – a branch.
branch:<name>
As first entry a FunctionID is recommended and not the TaskID of the Issue Management. This decision is based on the experience that functionality could spread in different tasks. In longtime projects it could happen that for some reason the Issue Management System needs to be replaced by another one. Not all projects are connected to Issue Management, especially when they are small or just a prototype. These circumstances proved to be decisive to define the TaskId as optional and move it to the second line. With a FunctionID it is easier to identify parts that should be linked. Sometimes there exist transfers into the repository that cannot be assigned to a dedicated function. These commits are often related to activities of the Build- and Configuration Manager. As best practice an ID should be established which corresponds to these activities. Some examples related to the defined labels are:
[CM-00] INIT;
[CM-10] REVIEW;
[CM-20] BRANCH;
[CM-30] MERGE;
[CM-40] RELEASE;
[CM-50] build management.
The mightiness of this approach is its simplicity and how it can be included in existing projects. The rule set does not contain any additional complexity and the process is quite easy to understand. A short example will demonstrate the usage and a full example is provided in section VI. A change in the POM file to update the version of the test framework could be commented as follows:
[CM-50] #CHANGE ’function:pom’ <QS-23231> {Change version number of the dependency JUnit from 4 to 5.0.2}
6. Release Process
The sample project in section II is not only fictive. The Together Platform (TP) available on GitHub [26] was initiated to study techniques on real conditions. Hence Git is the SCM tool of the choice. As client SmartGit is recommended because of platform independence and it offers plentiful advanced functionality.
For better comprehension of our approach of writing commit expressions we use the TP-CORE project, from initialization of the repository to its first release. No TaskIDs for the revisions exist due to the project not being connected to an Issue Management System. We use an excerpt of TP-CORE to demonstrate the approach because between the initial commit and the first published release 1.0.2 exist over 70 revisions in the repository. The project also contains a set of 12 functions which do not need to be included completely in our sample. Only three functions were selected for demonstration:
CORE-01 Logger;
CORE-02 genericDAO;
CORE-05 ApplicationConfiguration.
This cuts the revisions in half and shows enough complexity avoiding readers falling asleep.
The condition for a first release was the implementation of all 12 functionalities. The overall test coverage has reached more than 85%. Code smells detected with checks by Findbugs, Checkstyle, PMD et cetera have been removed. For an facilitate explanation, we add a revision number before the FunctionID. TP-CORE Commit Messages:
01 [CM-00] #INIT ’archtype:jar’
{Initial the repository for Java JAR library.}
02 [CORE-01] #IMPLEMENT ’function:Logger’
{Application wide standard logger.}
03 [CORE-02] #IMPLEMENT
{Generic Data Access Object Pattern for centralized database access.}
04 [CORE-05] #IMPLEMENT ’function:AppConfigDO’
{Domain Object for application configuration.}
05 [CM-10] #REVIEW ’analyze:quality’
{Formatting, fix Checkstyle hints, JavaDoc & test coverage}
06 [CORE-05] #IMPLEMENT ’function:ConfigurationDAO’
{Add the ConfigurationDAO implementation.}
07 [CORE-05] #EXTEND ’attach:tests’
{Create test cases for Bean Validation.}
08 [CORE-01] #EXTEND ’function:Logger’
{Add new Method to detect the configured LogLevel.}
09 [CORE-05] #EXTEND ’function:AppConfigDO’
{Change Primary Key to UUID and extend tests.}
10 [CORE-05] #CHANGE ’function:AppConfigDO’
{Rename to ConfigurationDO and define table indexes.}
11 [CORE-02] #EXTEND ’function:GenericDAO’
{Add flushTable, countEnties and optimize.}
12 [CORE-05] #EXTEND ’attach:tests’
{Update test cases for application configuration.}
13 [CORE-05] #EXTEND ’function:ConfigurationDAO’
{Update the implementation for ConfigurationDAOImpl.}
14 [CORE-01] #EXTEND ’function:Logger’
{Add method for exception handling.}
15 [CORE-05] #EXTEND ’function:ConfigurationDO’
{Add field mandatory.}
16 [CM-10] #REVIEW ’migrate:JUnit’
{Migrate Test cases from JUnit4 to JUnit5.}
17 [CM-10] #REVIEW ’analyze:quality’
{Fix JavaDoc, Checkstyle & Findbugs.}
18 [CM-50] #EXTEND ’function:POM’
{Update SCM connection to GitHub.}
19 [CM-50] #EXTEND ’attach:APIguards’
{Attach annotation for API documentation.}
20 [CORE-05] #REVIEW ’refactor:ConfigurationDO’
{FindBugs: optimize constructor parameters.}
21 [CORE-02] #BUGFIX ’priority:design’
{Fix FindBugs hint: visible modifier.}
22 [CM-50] #EXTEND ’attach:site’
{Extend MVN site configuration.}
23 [CORE-02] #BUGFIX ’priority:high’
{Fix spring DAO configuration.}
24 [CORE-05] #IMPLEMENT ’function:ConfigurationService’
{Implement basic functionality for
ConfigurationService.}
25 [CM-10] #REVIEW ’analyze:quality’
{Remove all compiler warnings, FindBugs,
Checkstyle & PMD Hits.}
26 [CORE-05] #EXTEND
’attach:ConfigurationService’
{A dd JGiven test scenarios.}
27 [CM-40] #RELEASE ’version:1.0’
{Release artifact to version 1.0}
28 [CM-40] #RELEASE ’version:1.0.1’
{Change POM GroupId to Maven Central conventions.}
29 [CM-00] #INIT ’version:1.1’
{Start implementation of version 1.1.0.}
30 [CM-50] #MERGE ’from:1.0.1’
{Integrate GAV POM changes to trunk.}
31 [CM-40] #RELEASE ’version:1.0.2’
{Include PGP signing.}
32 [CM-20] #CHANGE ’function:Constraints’
{Add Constraints.VERSION to 1.1}
33 [CORE-01] #EXTEND ’function:Logger’
{Default loader for logback.xml configuration files in the application DIR.}
Considering the previous example, we see that a limitation to around 80 – 100 characters for the first line is recommendable. Displaying the history with any client could get very messy if the first line has no size restrictions. The log output of the commit messages does not display the branch and tag operation, a behavior of Git. These revisions do not appear in any history list by browsing GitHub. Revision 28 is a branch based on revision 27. The branch is named as 1.0. Releases are published in consonance with the convention to be labeled, revision 31 tagged as Release 1.0.2. The revisions 28 and 31 are part of branch 1.0.
In this constellation we are able to see an important detail for dealing with branches. A branch will only be created when it is necessary. Usually BugFix branches do not have their own build plans on CI Servers and are managed manually. The primary arguments for this practice are to reduce the administrative overhead for the CI Servers. Companies that orchestrate their applications by web services or modules loose capacities by binding their recourses in this kind of activities.
7. Conclusion
“There is nothing permanent except change.” – Heraclitus
The whole infrastructure of commercial software projects contains a lot of independent fragments which share information over all development cycle. In projects we are overloaded by documentation production processes. The high amount of all this information inhibits profoundly comprehension and handling capabilities. Applications are getting more complex and bigger resulting in the necessity to establish more efficient ways to deal with information accumulation. There exists a giant overhead of managing documents like release notes, release plan, issue management, quality reports, statistics & metrics, documentation, architectural documents and BugFix lists. Typically each tool stores its data in its own structure. This makes changes to other tools, that might fit better, risky and expensive.
Companies know the effect that developers feel uncomfortable having to track their work in Issue Management tools like JIRA resulting in them trying to hide their part of the work flow as much as possible. Tasks will be opened up when they are almost done or already finished. The information on how many project days were spent for a function covers more the expectations and less the reality with the intent that developers can escape a bit from the daily pressure of productivity. Often developers are forced to spend their time with data acquisition for management controlling instead of programming resulting in low cost efficiency of a project and even additional and unplanned costs. Developers dislike this kind of activities because it keeps them away from their actual work: development. This is what makes the simple approach towards human readable and machine processable commit messages attractive and more convenient. The most important fact is that no extra costs are generated applying this method to existing processes.
We are enabled to generate several reports based on real data if SCM repositories can be populated with additional information. Impact assessments could be more efficient and accurate when they are created by facts and not emotionally blended.
Future Work
The idea to make information inside SCM systems more transparent is not just limited to commit messages. Another obvious point for future research is the history command. In the paper of Abram Hindle and Daniel M. German a query language for source control is introduced [7]. The idea of SCM Language could be picked up and transformed applying it to a specific solution. This work would use the Domain Driven Development paradigm to model an own SCM language based on Domain Specific Language (DSL) concepts – leading to the discovery of real world DSL solutions allowing for quick construction of a viable prototype or application based upon certain specifications.
Also a point which boldly comes to mind after reading the paper of Fischer et al., is the inclusion of released information into SCM [4]. This approach should not fully be automated due to its requirement of an advanced knowledge about branching and merging. A small self written extension could be a probable solution. A short tutorial 17 for Git suggests certain possibilities.
Acknowledgements
Special thanks to Joachim Reiter and Harald Kaufmann for spending their time to review this document. Their feedback was very productive.
References
[1] E. Burton Swanson, 1978, The Dimension of Maintenance. [2] Walter F. Tichy, 1985, RCS – A System for Version Control. [3] Chuck Walrad and Darrel Strom, 2002, The Importance of Branching Models in SCM. [4] Michael Fischer, Martin Pinzger, Harald Gall, 2003, Populating a Release History Database from Version Control and Bug Tracking Systems. [5] Filip Van Rysselberghe and Serge Demeyer, 2004, Mining Version Control Systems for FACs (Frequently Applied Changes). [6] Louis Glassy, 2005, Using version control to observe student software development processes. [7] Abram Hindle and Daniel M. German, 2005, SCQL: a formal model and a query language for source control. [8] Yongchang Ren, Tao Xing, Qiang Quan, Ying Zhao, 2010, Software Configuration Management of Version Control Study Based on Baseline. [9] Parminder Kaur and Hardeep Singh, 2011, A Model for Versioning Control Mechanism in Component- Based Systems [10] Shaun Phillips, Jonathan Sillito, Rob Walker, 2011, Branching and merging: an investigation into current version control practices. [11] Abdullah Uz Tansel and Ali Koc, 2011, A Survey of Version Control Systems. [12] Christian Bird et al., 2014, Transition from Centralized to Decentralized Version Control Systems A Case Study on Reasons, Barriers, and Outcomes. [13] Norman E. Fenton and Shari Lawrence Pfieeger, 1997, PWS Publishing Company, Software Metrics – A Rigorous and Practical Approach 2nd Edition, ISBN O·534·95425·1. [14] Jez Humble and David Farley, 2010, Addison-Wesley, Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation, ISBN 0-321-60191-2. [15] Scott Chacon and Ben Straub, 2014, Apress, Pro Git 2nd Edition, ISBN 978-1-4842-0077-3. [16] Mike Clark, 2004, The Pragmatic Bookshelf, Pragmatic Project Automation, ISBN 0-9745140-3-9. [17] Dave Thomas and Andy Hunt, 2003, The Pragmatic Bookshelf, Pragmatic Version Control with CVS, ISBN 0-9745140-0-4. [18] Mike Mason, 2010, The Pragmatic Bookshelf, Pragmatic Guide to Subversion, ISBN 1-934356-61-1. [19] https://search.maven.org [20] https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows [21] https://nvie.com/posts/a-successful-git-branching-model/ [22] https://www.martinfowler.com/articles/feature-toggles.html [23] https://flywaydb.org [24] https://chris.beams.io/posts/git-commit/ [25] http://who-t.blogspot.mx/2009/12/on-commit-messages.html [26] https://github.com/ElmarDott/TP-CORE/
Biography
Marco Schulz, also kown by his online identity Elmar Dott is an independent consultant in the field of large Web Application, generally based on the JavaEE environment. His main working field is Build-, Configuration- & Release-Management as well as software architecture. In addition his interests cover the full software development process and the discovery of possibilities to automate them as much as possible. Over the time of the last ten years he has authored a variety of technical articles for different publishers and speaks on various software development conferences. He is also the aut
In the previous part of the article treasure chest, I described how the database connection for the TP-CORE library got established. Also I gave a insight to the internal structure of the ConfiguartionDO. Now in the second part I explain the ConfiguartionDAO and its corresponding service. With all this knowledge you able to include the application configuration feature of TP-CORE in your own project to build your own configuration registry.
Lets resume in short the architectural design of the TP-CORE library and where the fragments of the features located. TP-CORE is organized as layer architecture as shown in the graphic below.
As you can see there are three relevant packages (layer) we have to pay attention. As first the business layer resides like all other layers in an equal named package. The whole API of TP-CORE is defined by interfaces and stored in the business layer. The implementation of the defined interfaces are placed in the application layer. Domain Objects are simple data classes and placed in the domain layer. Another important pattern is heavily used in the TP-CORE library is the Data Access Object (DAO).
Now the days micro services and RESTful application are state of the art. Especially in TP-CORE the defined services aren’t REST. This design decision is based on the mind that TP-CORE is a dependency and not a standalone service. Maybe in future, after I got more feedback how and where this library is used, I could rethink the current concept. For now we treat TP-CORE as what it is, a library. That implies for the usage in your project, you can replace, overwrite, extend or wrap the basic implementation of the ConfigurationDAO to your special necessities.
To keep the portability of changing the DBMS Hibernate (HBM) is used as JPA implementation and O/R mapper. The Spring configuration for Hibernate uses the EntityManager instead of the Session, to send requests to the DBMS. Since version 5 Hibernate use the JPA 2 standard to formulate queries.
As I already mentioned, the application configuration feature of TP-CORE is implemented as DAO. The domain object and the database connection was topic of the first part of this article. Now I discuss how to give access to the domain object with the ConfigurationDAO and its implementation ConfigurationHbmDAO. The domain object ConfigurationDO or a list of domain objects will be in general the return value of the DAO. Actions like create are void and throw just an exception in the case of a failure. For a better style the return type is defined as Boolean. This simplifies also writing unit tests.
Sometimes it could be necessary to overwrite a basic implementation. A common scenario is a protected delete. For example: a requirement exist that a special entry is protected against a unwanted deletion. The most easy solution is to overwrite the delete whit a statement, refuses every time a request to delete a domain object whit a specific UUID. Only adding a new method like protectedDelete() is not a god idea, because a developer could use by accident the default delete method and the protected objects are not protected anymore. To avoid this problem you should prefer the possibility of overwriting GenericDAO methods.
As default query to fetch an object, the identifier defined as primary key (PK) is used. A simple expression fetching an object is written in the find method of the GenericHbmDAO. In the specialization as ConfigurationHbmDAO are more complex queries formulated. To keep a good design it is important to avoid any native SQL. Listing 1 shows fetch operations.
The readability of these few lines of source is pretty easy. The query we formulated for getAllConfigurationSetEntries() returns a list of ConfigurationDO objects from the same module whit equal version of a configSet. A module is for example the library TP-CORE it self or an ACL and so on. The configSet is a namespace that describes configuration entries they belong together like a bundle and will used in a service like e-mail. The version is related to the service. If in future some changes needed the version number have increase. Lets get a bit closer to see how the e-mail example will work in particular.
We assume that a e-mail service in the module TP-CORE contains the configuration entries: mailer.host, mailer.port, user and password. As first we define the module=core, configSet=email and version=1. If we call now getAllConfigurationSetEntries(core, 1, email); the result is a list of four domain objects with the entries for mailer.host, mailer.port, user and password. If in a newer version of the email service more configuration entries will needed, a new version will defined. It is very important that in the database the already exiting entries for the mail service will be duplicated with the new version number. Of course as effect the registry table will grow continual, but with a stable and well planned development process those changes occur not that often. The TP-CORE library contains an simple SMTP Mailer which is using the ConfigurationDAO. If you wish to investigate the usage by the MailClient real world example you can have a look on the official documentation in the TP-CORE GitHub Wiki.
The benefit of duplicate all existing entries of a service, when the service configuration got changed is that a history is created. In the case of update a whole application it is now possible to compare the entries of a service by version to decide exist changes they take effect to the application. In practical usage this feature is very helpful, but it will not avoid that updates could change our actual configuration by accident. To solve this problem the domain object has two different entries for the configuration value: default and configuration.
The application configuration follows the convention over configuration paradigm. Each service need by definition for all existing configuration entries a fix defined default value. Those default values can’t changed itself but when the value in the ConfigurationDO is set then the defaultValue entry will ignored. If an application have to be updated its also necessary to support a procedure to capture all custom changes of the updated configuration set and restore them in the new service version. The basic functionality (API) for application configuration in TP-CORE release 3.0 is:
The following listing gives you an idea how a implementation in your own service could look like. This snipped is taken from the JavaMailClient and shows how the internal processing of the fetched ConfigurationDO objects are managed.
privatevoidprocessConfiguration(){ListconfigurationEntries=configurationDAO.getAllConfigurationSetEntries("core",1,"email");for(ConfigurationDOentry: configurationEntries){Stringvalue;if(StringUtils.isEmpty(entry.getValue())){ value =<strong>entry.getDefaultValue</strong>();}else{ value =<strong>entry.getValue</strong>();}if(entry.getKey().equals(cryptoTools.calculateHash("mailer.host",HashAlgorithm.SHA256))){configuration.replace("mailer.host", value);}elseif(entry.getKey().equals(cryptoTools.calculateHash("mailer.port",HashAlgorithm.SHA256))){configuration.replace("mailer.port", value);}elseif(entry.getKey().equals(cryptoTools.calculateHash("user",HashAlgorithm.SHA256))){configuration.replace("mailer.user", value);}elseif(entry.getKey().equals(cryptoTools.calculateHash("password",HashAlgorithm.SHA256))){configuration.replace("mailer.password", value);}}}
Java
Another functionality of the application configuration is located in the service layer. The ConfigurationService operates on the module perspective. The current methods resetModuleToDefault() and filterMandatoryFieldsOfConfigSet() already give a good impression what that means.
If you take a look on the MailClientService you detect the method updateDatabaseConfiguration(). May you wonder why this method is not part of the ConfigurationService? Of course this intention in general is not wrong, but in this specific implementation is the update functionality specialized to the MailClient configuration. The basic idea of the configuration layer is to combine several DAO objects to a composed functionality. The orchestration layer is the correct place to combine services together as a complex process.
Resume
The implementation of the application configuration inside the small library TP-CORE allows to define an application wide configuration registry. This works also in the case the application has a distribute architecture like micro services. The usage is quite simple and can easily extended to own needs. The proof that the idea is well working shows the real world usage in the MailClient and FeatureToggle implementation of TP-CORE.
I hope this article was helpful and may you also like to use TP-CORE in your own project. Feel free to do that, because of the Apache 2 license is also no restriction for commercial usage. If you have some suggestions feel free to leave a comment or give a thumbs up.
If you whish to discover a way how to turn negative vibes between testers and developers into something positive – here is a great solution for that. The thing I like to introduce is quite old but even today in our brave new DevOps world an evergreen.
Many years ago in the world wide web I stumbled over a PDF called Bug Fix Bingo. A nice funny game for IT professionals. This little funny game originally was invent by the software testing firm K. J. Ross & Associates. Unfortunately the original site disappeared long ago so I decided to conserve this great idea in this blog post.
I can recommend this game also for folks they are not so deep into testing, but have to participate in a lot of IT meetings. Just print the file, bring some copies to your next meeting and enjoy whats gonna happen. I did it several times. Beside the fun we had it changed something. So let’s have a look into the concept and rules.
Bug Fix Bingo is based on a traditional Bingo just with a few adaptions. Everyone can join the game easily without a big preparation, because its really simple. Instead of numbers the Bingo uses statements from developers in defect review meetings to mark off squares.
Rules:
Bingo squares are marked off when a developer makes the matching statement during bug fix sessions.
Testers must call “Bingo” immediately upon completing a line of 5 squares either horizontally, vertically or diagonally.
Statements that arise as result of a bug that later becomes “deferred”, “as designed”, or “not to fixed” should be classified as not marked.
Bugs that are not reported in an incident report can not be used.
Statements should also be recorded against the bug in the defect tracking system for later confirmation.
Any tester marks off all 25 statements should be awarded 2 weeks stress leave immediately.
Any developer found using all 25 statements should be seconded into the test group for a period of no less than 6 months for re-education.
“It works on my machine.”
“Where were you when the program blew up?”
“Why do you want to do it in that way?”
“You can’t use that version on your system.”
“Even thought it doesn’t work, how does it feel.”
“Did you check for a virus on your system?”
“Somebody must have changed my code.”
“It works, but it hasn’t been tested.”
“THIS can’t be the source of that module in weeks!”
“I can’t test anything!”
“It’s just some unlucky coincidence.”
“You must have the wrong version.”
“I haven’t touched that module in weeks.”
“There is something funky in your data.”
“What did you type in wrong to get it to crash?”
“It must be a hardware problem.”
“How is that possible?”
“It worked yesterday.”
“It’s never done that before.”
“That’s weird …”
“That’s scheduled to be fixed in the next release.”
“Yes, we knew that would happen.”
“Maybe we just don’t support that platform.”
“It’s a feature. We just haven’t updated the specs.”
“Surly nobody is going to use the program like that.”
The BuxFix Bingo Gamecard
Incidentally, developers have a game like this too. They score points every time a QA person tries to raise a defect on functionality that is working as specified.
If you and your team are dealing with tools like Git or Subversion, you may need an administrative layer where you are able to manage user access and repositories in a comfortable way, because source control management systems (SCM) don’t bring this functionality out of the box.
Perhaps you are already familiar with popular management solutions like GitHub, GitBlit or GitLab. The main reason for their success is their huge functionality. And of course, if you plan to create your own build and deploy pipeline with an automation server like Jenkins you will need to host your own repository manager too.
As great as the usage of GitLab and other solutions is, there is also a little bitter taste:
The administration is very complicated and requires some experience.
The minimal requirement of hardware resources to operate those programs with good performance is not that little.
To overcome all these hurdles, I will introduce a new star on the toolmaker’s sky SCM-Manager [1]. Fast, compact, extendable and simple, are the main attributes I would use to describe it.
Kick Starter: Installation
Let’s have a quick look at how easy the installation is. For fast results, you can use the official Docker container [2]. All it takes is a short command:
First, we create a container named scm based on the SCM-Manager image 2.22.0. Then, we tell the container to always restart when the host operating system is rebooted. Also, we open the ports 2222 and 8080 to make the service accessible. The last step is to mount a directory inside the container, where all configuration data and repositories are stored.
Another option to get the SCM-Manager running on a Linux server like Ubuntu is by using apt. The listing below shows how to do the installation.
SCM-Manager can also be installed on systems like Windows or Apple. You can find information about the installations on additional systems on the download page [3]. When you perform an installation, you will find a log entry with a startup token in the console.
Startup token in the command line.
After this you can open your browser and type localhost:8080, where you can finish the installation by creating the initial administration account. In this form, you need to paste the startup token from the command line, as it is shown in image 2. After you submitted the initialization form, you get redirected to the login. That’s all and done in less than 5 minutes.
Initialization screen.
For full scripted untouched installations, there is also a way to bypass the Initialization form by using the system property scm.initalPassword. This creates a user named scmadmin with the given password.
In older versions of the SCM-Manager, the default login account was scmadmin with the password scmadmin. This old way is quite helpful but if the administrator doesn’t disable this account after the installation, there is a high-security risk. This security improvement is new since version 2.21.
Before we discover more together about the administration, let’s first get to some details about the SCM-Manager in general. SCM-Manager is open source under MIT license. This allows commercial usage. The Code is available on GitHub. The project started as research work. Since Version 2 the company Cloudogu took ownership of the codebase and manages the future development. This construct allows the offering of professional enterprise support for companies. Another nice detail is that the SCM-Manager is made in Germany.
Pimp Me Up: Plugins
One of the most exciting details of using the SCM-Manager is, that there is a simple possibility to extend the minimal installation with plugins to add more useful functions. But be careful, because the more plugins are installed, the more resources the SCM-Manager needs to be allocated. Every development team has different priorities and necessities, for this reason, I’m always a fan of customizing applications to my needs.
Installed Plugins.
The plugin installation section is reachable by the Administration tab. If you can’t see this entry you don’t have administration privileges. In the menu on the right side, you find the entry Plugins. The plugin menu is divided into two sections: installed and available. For a better overview, the plugins are organized by categories like Administration, Authorization, or Workflow. The short description for each plugin is very precise and gives a good impression of what they do.
Some of the preinstalled plugins like in the category Source Code Management for supported repository types Git, Subversion, and Mercurial can’t be uninstalled.
Some of my favorite plugins are located in the authorization section:
Those features are the most convenient for Build- and Configuration Managers. The usage is also as simple as the installation. Let’s have a look at how it works and for what it’s necessary.
Gate Keeper: Special Permissions
Imagine, your team deals for example whit a Java/Maven project. Perhaps it exists a rule that only selected people should be allowed to change the content of the pom.xml build logic. This can be achieved with the Path Write Protection Plugin. Once it is installed, navigate to the code repository and select the entry Settings in the menu on the right side. Then click on the option Path Permissions and activate the checkbox.
Configuring Path permissions.
As you can see in image 4, I created a rule that only the user Elmar Dott is able to modify the pom.xml. The opposite permission is exclude (deny) the user. If the file or a path expression doesn’t exist, the rule cannot be created. Another important detail is, that this permission covers all existing branches. For easier administration, existing users can be organized into groups.
In the same way, you are able to protect branches against unwanted changes. A scenario you could need this option is when your team uses massive branches or the git-flow branch model. Also, personal developer branches could have only write permission for the developer who owns the branch or the release branch where the CI /CD pipeline is running has only permissions for the Configuration Management team members.
Let’s move ahead to another interesting feature, the review plugin. This plugin enables pull requests for your repositories. After installing the review plugin, a new bullet point in the menu of your repositories appears, it’s called Pull Requests.
Divide and Conquer: Pull Requests
On the right hand, pull requests [4] are a very powerful workflow. During my career, I often saw the misuse of pull requests, which led to drastically reduced productivity. For this reason, I would like to go deeper into the topic.
Originally, pull requests were designed for open source projects to ensure code quality. Another name for this paradigm is dictatorship workflow [5]. Every developer submits his changes to a repository and the repository owner decides which revision will be integrated into the codebase.
If you host your project sources on GitHub, strangers can’t just collaborate in your project, they first have to fork the repository into their own GitHub space. After they commit some revisions to this forked repository, they can create a pull request to the original repository. As repository owner, you can now decide whether you accept the pull request.
The SCM tool IBM Synergy had a similar strategy almost 20 years ago. The usage got too complicated so that many companies decided to move to other solutions. These days, it looks like history is repeating itself.
The reason why I’m skeptical about using pull requests is very pragmatic. I often observed in projects that the manager doesn’t trust the developers. Then he decides to implement the pull request workflow and makes the lead developer or the architect accept the pull requests. These people are usually too busy and can’t really check all details of each single pull request. Hence, their solution is to simply merge each pull request to the code base and check if the CI pipeline still works. This way, pull requests are just a waste of time.
There is another way how pull requests can really improve the code quality in the project: if they are used as a code review tool. How this is going to work, will fill another article. For now, we leave pull requests and move to the next topic about the creation of repositories.
Treasure Chest: Repository Management
The SCM-Manager combines three different source control management repository types: Git, Subversion (SVN), and Mercurial. You could think that nobody uses Subversion anymore, but keep in mind that many companies have to deal with legacy projects managed with SVN. A migration from those projects to other technologies may be too risky or simply expensive. Therefore, it is great to have a solution that can manage more than one repository type.
If you are Configuration Manager and have to deal with SVN, keep in mind that some things are a bit different. Subversion organizes branches and tags in directories. An SVN repository usually gets initialized with the folders:
trunk — like the master branch in Git.
branches — references to revisions in the trunk were forked code changes can committed.
tags — like branches without new code revisions.
In Git you don’t need this folder structure, because how branches are organized is completely different. Git (and Mercurial) compared to Subversion is a distributed Source Control Management System and branches are lose coupled and can easily be deleted if they are obsolete. As of now, I don’t want to get lost in the basics of Source Control Management and jump to the next interesting SCM-Manager plugins.
Uncover Secrets
If a readme.md file is located in the root folder of your project, you could be interested in the readme plugin. Once this plugin is activated and you navigate into your repository the readme.md file will be rendered in HTML and displayed.
The rendered readme.md of a repository.
If you wish to have a readable visualization of the repository’s activities, the activity plugin could be interesting for you. It creates a navigation entry in the header menu called Activity. There you can see all commit log entries and you can enter into a detailed view of the selected revision.
The activity view.
This view also contains a compare and history browser, just like clients as TortoiseGit does.
The Repository Manager includes many more interesting details for the daily work. There is even a code editor, which allows you to modify files directly in the SCM-Manger user interface.
Next, we will have a short walk through the user management and user roles.
Staffing Office: User and Group Management
Creating new users is like almost every activity of the SCM-Manager a simple thing. Just switch to the Users tab and press the create user button. Once you have filled out the form and saved it, you will be brought back to the Users overview.
Creating a new user
Here you can already see the newly created user. After this step, you will need to administrate the user’s permissions, because as of now it doesn’t have any privileges. To change that just click on the name of the newly created user. On the user’s detail page, you need to select the menu entry Settings on the right side. Now choose the new entry named Permissions. Here you can select from all available permissions the ones you need for the created account. Once this is done and you saved your changes, you can log out and log in with your new user, to see if your activity was a success.
If you need to manage a massive number of users it’s a good idea to organize them into groups. That means after a new user is created the permissions inside the user settings will not be touched and stay empty. Group permissions can be managed through the Groups menu entry in the header navigation. Create a new group and select Permission from the right menu. This configuration form is the same as the one of the user management. If you wish to add existing users to a group switch to the point General. In the text field Members, you can search for an existing user. If the right one is selected you need to press the Add Member button. After this, you need to submit the form and all changes are saved and the new permissions got applied.
To have full flexibility, it is allowed to add users to several groups (roles). If you plan to manage the SCM-Manager users by group permissions, be aware not to combine too many groups because then users could inherit rights you didn’t intend to give them. Currently, there is no compact overview to see in which groups a user is listed and which permissions are inherited by those groups. I’m quite sure in some of the future versions of the SCM-Manager this detail will be improved.
Besides the internal SCM-Manager user management exist some plugins where you are able to connect the application with LDAP.
Lessons Learned
If you dared to wish for a simpler life in the DevOps world, maybe your wish became true. The SCM-Manager could be your best friend. The application offers a lot of functionality that I briefly described here, but there are even more advanced features that I haven’t even mentioned in this short introduction: There is a possibility to create scripts and execute them with the SCM-Manager API. Also, a plugin for the Jenkins automation server is available. Other infrastructure tools like Jira, Timescale, or Prometheus metrics gathering have an integration to the SCM-Manager.
I hope that with this little article I was able to whet your appetite for this exciting tool and I hope you enjoy trying it out.
Many developers have ideas they work on it in their spare time. The most of us are convinced about open source and share their own projects on platforms like GitHub. But what happen after a publication of the source code? If you really want to gain people to use your project you’re not done yet. It’s also a good idea to publish your artifacts for a simple usage.
The most famous storage for binary Java Artifacts is Maven Central. Doesn‘t matter if you use in your projects Ivy, Gradle or Maven as dependency management, all those technologies access to Maven Central. In this talk you will learn how to publish your artifacts with Maven to Sonatype Nexuss OSS. We pass through all steps from creating accounts until the the binaries are available. In between I give some general hints about the usage of repository managers and helpful tricks for a lightweight Release Management.
Learn things about token replacement, executable jar, BOM, Dependency Management, enforcement, reporting and much more in live demonstrations.
After the gang of four (GOF) Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides published the book, Design Patterns: Elements of Reusable Object-Oriented Software, learning how to describe problems and solutions became popular in almost every field in software development. Likewise, learning to describe don’ts and anti-pattern became equally as popular.
In publications that discussed these concepts, we find helpful recommendations for software design, project management, configuration management, and much more. In this article, I will share my experience dealing with version numbers for software artifacts.
Most of us are already familiar with a method called semantic versioning, a powerful and easy-to-learn rule set for how version numbers have to be structured and how the segments should increase.
Version numbering example:
Major: Incompatible API changes.
Minor: Add new functionality.
Patch: Bugfixes and corrections.
Label: SNAPSHOT marking the “under development” status.
An incompatible API Change occurs when an externally accessible function or class was deleted or renamed. Another possibility is a change in the signature of a method. This means the return value or parameters has been changed from its original implementation. In these scenarios, it’s necessary to increase the Major segment of the version number. These changes present a high risk for API consumers because they need to adapt their own code.
When dealing with version numbers, it’s also important to know that 1.0.0 and 1.0 are equal. This has effect to the requirement that versions of a software release have to be unique. If not, it’s impossible to distinguish between artifacts. Several times in my professional experience, I was involved in projects where there was no well-defined processes for creating version numbers. The effect of these circumstances was that the team had to secure the quality of the artifact and got confused with which artifact version they were currently dealing with.
The biggest mistake I ever saw was the storage of the version of an artifact in a database together with other configuration entries. The correct procedure should be: place the version inside the artifact in a way that no one after a release can change from outside. The trap you could fall into is the process of how to update the version after a release or installation.
Maybe you have a checklist for all manual activities during a release. But what happens after a release is installed in a testing stage and for some reason another version of the application has to be installed. Are you still aware of changing the version number manually? How do you find out which version is installed or when the information of the database is incorrect?
Detect the correct version in this situation is a very difficult challenge. For that reason, we have the requirement to keep the version inside of the application. In the next step, we will discuss a secure and simple way on how to solve an automatic approach to this problem.
Our precondition is a simple Java library build with Maven. By default, the version number of the artifact is written down in the POM. After the build process, our artifact is created and named like: artifact-1.0.jar or similar. As long we don’t rename the artifact, we have a proper way to distinguish the versions. Even after a rename with a simple trick of packaging and checking, then, in the META-INF folder, we are able to find the correct value.
If you have the Version hardcoded in a property or class file, it would also work fine, as long you don’t forget to always update it. Maybe the branching and merging in SCM systems like Git could need your special attention to always have the correct version in your codebase.
Another solution is using Maven and the token placement mechanism. Before you run to try it out in your IDE, keep in mind that Maven uses to different folders: sources and resources. The token replacement in sources will not work properly. After a first run, your variable is replaced by a fixed number and gone. A second run will fail. To prepare your code for the token replacement, you need to configure Maven as a first in the build lifecycle:
After this step, you need to know the ${project.version} property form the POM. This allows you to create a file with the name version.property in the resources directory. The content of this file is just one line: version=${project.version}. After a build, you find in your artifact the version.property with the same version number you used in your POM. Now, you can write a function to read the file and use this property. You could store the result in a constant for use in your program. That’s all you have to do!
By experience, most of us know how difficult it is to express what we mean talking about quality. Why is that so? There exist many different views on quality and every one of them has its importance. What has to be defined for our project is something that fits its needs and works with the budget. Trying to reach perfectionism can be counterproductive if a project is to be terminated successfully. We will start based on a research paper written by B. W. Boehm in 1976 called “Quantitative evaluation of software quality.” Boehm highlights the different aspects of software quality and the right context. Let’s have a look more deeply into this topic.
When we discuss quality, we should focus on three topics: code structure, implementation correctness, and maintainability. Many managers just care about the first two aspects, but not about maintenance. This is dangerous because enterprises will not invest in individual development just to use the application for only a few years. Depending on the complexity of the application the price for creation could reach hundreds of thousands of dollars. Then it’s understandable that the expected business value of such activities is often highly estimated. A lifetime of 10 years and more in production is very typical. To keep the benefits, adaptions will be mandatory. That implies also a strong focus on maintenance. Clean code doesn’t mean your application can simply change. A very easily understandable article that touches on this topic is written by Dan Abramov. Before we go further on how maintenance could be defined we will discuss the first point: the structure.
Scaffolding Your Project
An often underestimated aspect in development divisions is a missing standard for project structures. A fixed definition of where files have to be placed helps team members find points of interests quickly. Such a meta-structure for Java projects is defined by the build tool Maven. More than a decade ago, companies tested Maven and readily adopted the tool to their established folder structure used in the projects. This resulted in heavy maintenance tasks, given the reason that more and more infrastructure tools for software development were being used. Those tools operate on the standard that Maven defines, meaning that every customization affects the success of integrating new tools or exchanging an existing tool for another.
Another aspect to look at is the company-wide defined META architecture. When possible, every project should follow the same META architecture. This will reduce the time it takes a new developer to join an existing team and catch up with its productivity. This META architecture has to be open for adoptions which can be reached by two simple steps:
Don’t be concerned with too many details;
Follow the KISS (Keep it simple, stupid.) principle.
A classical pattern that violates the KISS principle is when standards heavily got customized. A very good example of the effects of strong customization is described by George Schlossnagle in his book “Advanced PHP Programming.” In chapter 21 he explains the problems created for the team when adopting the original PHP core and not following the recommended way via extensions. This resulted in the effect that every update of the PHP version had to be manually manipulated to include its own development adaptations to the core. In conjunction, structure, architecture, and KISS already define three quality gates, which are easy to implement.
The open-source project TP-CORE, hosted on GitHub, concerns itself with the afore-mentioned structure, architecture, and KISS. There you can find their approach on how to put it in practice. This small Java library rigidly defined the Maven convention with his directory structure. For fast compatibility detection, releases are defined by semantic versioning. The layer structure was chosen as its architecture and is fully described here. Examination of their main architectural decisions concludes as follows:
Each layer is defined by his own package and the files following also a strict rule. No special PRE or POST-fix is used. The functionality Logger, for example, is declared by an interface called Logger and the corresponding implementation LogbackLogger. The API interfaces can detect in the package “business” and the implementation classes located in the package “application.” Naming like ILogger and LoggerImpl should be avoided. Imagine a project that was started 10 years ago and the LoggerImpl was based on Log4J. Now a new requirement arises, and the log level needs to be updated during run time. To solve this challenge, the Log4J library could be replaced with Logback. Now it is understandable why it is a good idea to name the implementation class like the interface, combined with the implementation detail: it makes maintenance much easier! Equal conventions can also be found within the Java standard API. The interface List is implemented by an ArrayList. Obviously, again the interface is not labeled as something like IList and the implementation not as ListImpl .
Summarizing this short paragraph, a full measurement rule set was defined to describe our understanding of structural quality. By experience, this description should be short. If other people can easily comprehend your intentions, they willingly accept your guidance, deferring to your knowledge. In addition, the architect will be much faster in detecting rule violations.
Measure Your Success
The most difficult part is to keep a clean code. Some advice is not bad per se, but in the context of your project, may not prove as useful. In my opinion, the most important rule would be to always activate the compiler warning, no matter which programming language you use! All compiler warnings will have to be resolved when a release is prepared. Companies dealing with critical software, like NASA, strictly apply this rule in their projects resulting in utter success.
Coding conventions about naming, line length, and API documentation, like JavaDoc, can be simply defined and observed by tools like Checkstyle. This process can run fully automated during your build. Be careful; even if the code checkers pass without warnings, this does not mean that everything is working optimally. JavaDoc, for example, is problematic. With an automated Checkstyle, it can be assured that this API documentation exists, although we have no idea about the quality of those descriptions.
There should be no need to discuss the benefits of testing in this case; let us rather take a walkthrough of test coverage. The industry standard of 85% of covered code in test cases should be followed because coverage at less than 85% will not reach the complex parts of your application. 100% coverage just burns down your budget fast without resulting in higher benefits. A prime example of this is the TP-CORE project, whose test coverage is mostly between 92% to 95%. This was done to see real possibilities.
As already explained, the business layer contains just interfaces, defining the API. This layer is explicitly excluded from the coverage checks. Another package is called internal and it contains hidden implementations, like the SAX DocumentHandler. Because of the dependencies the DocumentHandler is bound to, it is very difficult to test this class directly, even with Mocks. This is unproblematic given that the purpose of this class is only for internal usage. In addition, the class is implicitly tested by the implementation using the DocumentHandler. To reach higher coverage, it also could be an option to exclude all internal implementations from checks. But it is always a good idea to observe the implicit coverage of those classes to detect aspects you may be unaware of.
Besides the low-level unit tests, automated acceptance tests should also be run. Paying close attention to these points may avoid a variety of problems. But never trust those fully automated checks blindly! Regularly repeated manual code inspections will always be mandatory, especially when working with external vendors. In our talk at JCON 2019, we demonstrated how simply test coverage could be faked. To detect other vulnerabilities you can additionally run checkers like SpotBugs and others more.
Tests don’t indicate that an application is free of failures, but they indicate a defined behavior for implemented functionality.
For a while now, SCM suites like GitLab or Microsoft Azure support pull requests, introduced long ago in GitHub. Those workflows are nothing new; IBM Synergy used to apply the same technique. A Build Manager was responsible to merge the developers’ changes into the codebase. In a rapid manner, all the revisions performed by the developer are just added into the repository by the Build Manager, who does not hold a sufficiently profound knowledge to decide about the implementation quality. It was the usual practice to simply secure that the build is not broken and always the compile produce an artifact.
Enterprises have discovered this as a new strategy to handle pull requests. Now, managers often make the decision to use pull requests as a quality gate. In my personal experience, this slows down productivity because it takes time until the changes are available in the codebase. Understanding of the branch and merge mechanism helps you to decide for a simpler branch model, like release branch lines. On those branches tools like SonarQube operate to observe the overall quality goal.
If a project needs an orchestrated build, with a defined order how artifacts have to create, you have a strong hint for a refactoring.
The coupling between classes and modules is often underestimated. It is very difficult to have an automated visualization for the bindings of modules. You will find out very fast the effect it has when a light coupling is violated because of an increment of complexity in your build logic.
Repeat Your Success
Rest assured, changes will happen! It is a challenge to keep your application open for adjustments. Several of the previous recommendations have implicit effects on future maintenance. A good source quality simplifies the endeavor of being prepared. But there is no guarantee. In the worst cases the end of the product lifecycle, EOL is reached, when mandatory improvements or changes cannot be realized anymore because of an eroded code base, for example.
As already mentioned, light coupling brings with it numerous benefits with respect to maintenance and reutilization. To reach this goal is not that difficult as it might look. In the first place, try to avoid as much as possible the inclusion of third-party libraries. Just to check if a String is empty or NULL it is unnecessary to depend on an external library. These few lines are fast done by oneself. A second important point to be considered in relation to external libraries: “Only one library to solve a problem.” If your project deals with JSON then decide one one implementation and don’t incorporate various artifacts. These two points heavily impact on security: a third-party artifact we can avoid using will not be able to cause any security leaks.
After the decision is taken for an external implementation, try to cover the usage in your project by applying design patterns like proxy, facade, or wrapper. This allows for a replacement more easily because the code changes are not spread around the whole codebase. You don’t need to change everything at once if you follow the advice on how to name the implementation class and provide an interface. Even though a SCM is designed for collaboration, there are limitations when more than one person is editing the same file. Using a design pattern to hide information allows you an iterative update of your changes.
Conclusion
As we have seen: a nonfunctional requirement is not that difficult to describe. With a short checklist, you can clearly define the important aspects for your project. It is not necessary to check all points for every code commit in the repository, this would with all probability just elevate costs and doesn’t result in higher benefits. Running a full check around a day before the release represents an effective solution to keep quality in an agile context and will help recognizing where optimization is necessary. Points of Interests (POI) to secure quality are the revisions in the code base for a release. This gives you a comparable statistic and helps increasing estimations.
Of course, in this short article, it is almost impossible to cover all aspects regarding quality. We hope our explanation helps you to link theory by examples to best practice. In conclusion, this should be your main takeaway: a high level of automation within your infrastructure, like continuous integration, is extremely helpful, but doesn’t prevent you from manual code reviews and audits.
Checklist
Follow common standards
KISS – keep it simple, stupid!
Equal directory structure for different projects
Simple META architecture, which can reuse as much as possible in other projects
Defined and follow coding styles
If a release got prepared – no compiler warnings are accepted
Have test coverage up to 85%
Avoid third-party libraries as much as possible
Don’t support more than one technology for a specific problem (e. g., JSON)
Most of the developer community know what a unit test is, even they don’t write them. But there is still hope. The situation is changing. More and more projects hosted on GitHub contain unit tests.
In a standard set-up for Java projects like NetBeans, Maven, and JUnit, it is not that difficult to produce your first test code. Besides, this approach is used in Test Driven Development (TDD) and exists in other technologies like Behavioral Driven Development (BDD), also known as acceptance tests, which is what we will focus on in this article.
Difference Between Unit and Acceptance Tests
The easiest way to become familiar with this topic is to look at a simple comparison between unit and acceptance tests. In this context, unit tests are very low level. They execute a function and compare the output with an expected result. Some people think differently about it, but in our example, the only one responsible for a unit test is the developer.
Keep in mind that the test code is placed in the project and always gets executed when the build is running. This provides quick feedback as to whether or not something went wrong. As long the test doesn’t cover too many aspects, we are able to identify the problem quickly and provide a solution. The design principle of those tests follows the AAA paradigm. Define a precondition (Arrange), execute the invariant (Act), and check the postconditions (Assume). We will come back to this approach a little later.
When we check the test coverage with tools like JaCoCo and cover more than 85 percent of our code with test cases, we can expect good quality. During the increasing test coverage, we specify our test cases more precisely and are able to identify some optimizations. This can be removing or inverting conditions because during the tests we find out, it is almost impossible to reach those sections. Of course, the topic is a bit more complicated, but those details could be discussed in another article.
Acceptance test are same classified like unit tests. They belong to the family of regression tests. This means we want to observe if changes we made on the code have no effects on already worked functionality. In other words, we want to secure that nothing already is working got broken, because of some side effects of our changes. The tool of our choice is JGiven [1]. Before we look at some examples, first, we need to touch on a bit of theory.
JGiven In-Depth
The test cases we define in JGiven is called a scenario. A scenario is a collection of four classes, the scenario itself, the Given displayed as given (Arrange), the Action displayed as when (Act) and Outcome displayed as then (Assume).
In most projects, especially when there is a huge amount of scenarios and the execution consumes a lot of time, acceptance tests got organized in a separate project. With a build job on your CI server, you can execute those tests once a day to get fast feedback and to react early if something is broken. The code example we demonstrate contains everything in one project on GitHub [2] because it is just a small library and a separation would just over-engineer the project. Usually, the one responsible for acceptance tests is the test center, not the developer.
The sample project TP-CORE is organized by a layered architecture. For our example, we picked out the functionality for sending e-mails. The basic functionality to compose an e-mail is realized in the application layer and has a test coverage of up to 90 percent. The functionality to send the e-mail is defined in the service layer.
In our architecture, we decided that the service layer is our center of attention to defining acceptance tests. Here, we want to see if our requirement to send an e-mail is working well. Supporting this layer with our own unit tests is not that efficient because, in commercial projects, it just produces costs without winning benefits. Also, having also unit tests means we have to do double the work because our JGiven tests already demonstrate and prove that our function is well working. For those reasons, it makes no sense to generate test coverage for the test scenarios of the acceptance test.
Let’s start with a practice example. At first, we need to include our acceptance test framework into our Maven build. In case you prefer Gradle, you can use the same GAV parameters to define the dependencies in your build script.
As you can see in listing 1, JGiven works well together with JUnit. An integration to TestNG also exists , you just need to replace the artifactId for jgiven-testng. To enable the HTML reports, you need to configure the Maven plugin in the build lifecycle, like it is shown in Listing 2.
The report of our scenarios in the TP-CORE project is shown in image 1. As we can see, the output is very descriptive and human-readable. This result will be explained by following some naming conventions for our methods and classes, which will be explained in detail below. First, let’s discuss what we can see in our test scenario. We defined five preconditions:
The configuration for the SMPT server is readable
The SMTP server is available
The mail has a recipient
The mail has attachments
The mail is full composed
If all these conditions are true, the action will send a single e-mail got performed. Afterward, after the SMTP server is checked, we see that the mail has arrived. For the SMTP service, we use the small Java library greenmail [3] to emulate an SMTP server. Now it is understandable why it is advantageous for acceptance tests if they are written by other people. This increases the quality as early on conceptional inconsistencies appear. Because as long as the tester with the available implementations cannot map the required scenario, the requirement is not fully implemented.
Producing Descriptive Scenarios
Now is the a good time to dive deeper into the implementation details of our send e-mail test scenario. Our object under test is the class MailClientService. The corresponding test class is MailClientScenarioTest, defined in the test packages. The scenario class definition is shown in listing 3.
@RunWith(JUnitPlatform.class)publicclassMailClientScenarioTestextends<strong>ScenarioTest</strong><MailServiceGiven,MailServiceAction,MailServiceOutcome>{// do something }
Listing 3: Acceptance Test Scenario for JGiven.
As we can see, we execute the test framework with JUnit5. In the ScenarioTest, we can see the three classes: Given, Action, and Outcome in a special naming convention. It is also possible to reuse already defined classes, but be careful with such practices. This can cost some side effects. Before we now implement the test method, we need to define the execution steps. The procedure for the three classes are equivalent.
@RunWith(JUnitPlatform.class)publicclassMailServiceGivenextends<strong>Stage</strong><MailServiceGiven>{publicMailServiceGivenemail_has_recipient(MailClientclient){try{assertEquals(1,client.getRecipentList().size());}catch(Exceptionex){System.err.println(ex.getMessage);}returnself();}}@RunWith(JUnitPlatform.class)publicclassMailServiceActionextends<strong>Stage</strong><MailServiceAction>{publicMailServiceActionsend_email(MailClientclient){MailClientServiceservice=newMailClientService();try{assertEquals(1,client.getRecipentList().size());service.sendEmail(client);}catch(Exceptionex){System.err.println(ex.getMessage);}returnself();}}@RunWith(JUnitPlatform.class)publicclassMailServiceOutcomeextends<strong>Stage</strong><MailServiceOutcome>{publicMailServiceOutcomeemail_is_arrived(MimeMessagemsg){try{Addressadr=msg.getAllRecipients()[0];assertEquals("JGiven Test E-Mail",msg.getSubject());assertEquals("noreply@sample.org",msg.getSender().toString());assertEquals("otto@sample.org",adr.toString());assertNotNull(msg.getSize());}catch(Exceptionex){System.err.println(ex.getMessage);}returnself();}}
Listing 4: Implementing the AAA Principle for Behavioral Driven Development.
Now, we completed the cycle and we can see how the test steps got stuck together. JGiven supports a bigger vocabulary to fit more necessities. To explore the full possibilities, please consult the documentation.
Lessons Learned
In this short workshop, we passed all the important details to start with automated acceptance tests. Besides JGiven exist other frameworks, like Concordion or FitNesse fighting for usage. Our choice for JGiven was its helpful documentation, simple integration into Maven builds and JUnit tests, and the descriptive human-readable reports.
As negative point, which could people keep away from JGiven, could be the detail that you need to describe the tests in the Java programming language. That means the test engineer needs to be able to develop in Java, if they want to use JGiven. Besides this small detail, our experience with JGiven is absolutely positive.
This short tutorial covers the most fundamental steps to use docker in your development tool chain. After we introduced the basic theory, we will learn how to install docker on a Linux OS (Ubuntu Mate). When this is done we have a short walk through to download an image and instantiate the container. The example use the official PHP 7.3 image with an Apache 2 HTTP Server.
[EN] We use cookies to improve your experience on our site. By using our site, you consent to cookies.
[DE] Wir verwenden Cookies, um Ihre Erfahrungen auf unserer Website zu verbessern. Durch die Nutzung unserer Website stimmen Sie Cookies zu.
This website uses cookies
Websites store cookies to enhance functionality and personalise your experience. You can manage your preferences, but blocking some cookies may impact site performance and services.
Essential cookies enable basic functions and are necessary for the proper function of the website.
Name
Description
Duration
Cookie Preferences
This cookie is used to store the user's cookie consent preferences.
30 days
These cookies are needed for adding comments on this website.
Name
Description
Duration
comment_author
Used to track the user across multiple sessions.
Session
comment_author_email
Used to track the user across multiple sessions.
Session
comment_author_url
Used to track the user across multiple sessions.
Session
These cookies are used for managing login functionality on this website.
Name
Description
Duration
wordpress_logged_in
Used to store logged-in users.
Persistent
wordpress_sec
Used to track the user across multiple sessions.
15 days
wordpress_test_cookie
Used to determine if cookies are enabled.
Session
Statistics cookies collect information anonymously. This information helps us understand how visitors use our website.
Matomo is an open-source web analytics platform that provides detailed insights into website traffic and user behavior.