Blockchain – an introduction

The blockchain concept is a fundamental component of various crypto payment methods such as Bitcoin and Ethereum. But what exactly is blockchain technology, and what other applications does this concept have? Essentially, blockchain is structured like a backward-linked list. Each element in the list points to its predecessor. So, what makes blockchain so special?

Blockchain extends the list concept by adding various constraints. One of these constraints is ensuring that no element in the list can be altered or removed. This is relatively easy to achieve using a hash function: the content of each element in the list is encoded into a hash value. A wide range of hash functions is available today, with SHA-512 being a current standard. This algorithm is implemented in the standard library of almost every programming language and is easy to use. Specifically, this means that the SHA-512 hash is generated from all the data in a block. For practical purposes this hash is unique, since collisions are astronomically unlikely, so it serves as an identifier (ID) to locate a block. One entry in each block is a reference to its predecessor: the hash value of the predecessor, i.e., its ID. When implementing a blockchain, it is essential that the hash value of the predecessor is included in the calculation of the hash value of the current block. This detail ensures that elements in the blockchain can only be modified with great difficulty, because manipulating one element requires changing all subsequent elements as well. In a blockchain with a very large number of blocks, such an undertaking entails an enormous computational effort that is very difficult, if not impossible, to accomplish.
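
To make this concrete, here is a minimal sketch in Java of such a chained block. The class and method names are my own and are not taken from any particular implementation; the point is only that the predecessor's hash flows into the hash of the current block and that validity can be verified by recomputing it.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Immutable block: the ID is the SHA-512 hash over the payload plus the
// predecessor's hash, so changing any field also changes the ID.
public final class Block {

    private final String previousHash;
    private final String payload;
    private final String id;

    public Block(String previousHash, String payload) {
        this.previousHash = previousHash;
        this.payload = payload;
        this.id = calculateHash(previousHash, payload);
    }

    public String getId() {
        return id;
    }

    // A block is valid if recomputing the hash over its content still yields the stored ID.
    public boolean isValid() {
        return id.equals(calculateHash(previousHash, payload));
    }

    private static String calculateHash(String previousHash, String payload) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] hash = digest.digest((previousHash + payload).getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }
}

Because the ID covers the previous hash, replacing any earlier block silently invalidates every block that follows it.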

This chaining provides us with a complete transaction history, which also explains why crypto payment methods are not anonymous. The effort required to uniquely identify a transaction participant can be enormous, and it increases exponentially if the participant also uses various obfuscation methods with different wallets that are not linked by other transactions.

Of course, the mechanism just described still has significant weaknesses. Transactions, i.e., newly added blocks, can only be considered verified and secure once enough successors have been appended to the blockchain to make later changes prohibitively difficult. For Bitcoin and similar cryptocurrencies, a transaction is commonly considered secure once about six subsequent blocks (confirmations) exist.

To avoid having a single entity store the transaction history, that is, all the blocks of the blockchain, a decentralized approach comes into play. This means there is no central server acting as an intermediary. Such a central server could be manipulated by its operator; with sufficient computing power, even very large blockchains could be rebuilt. In the context of cryptocurrencies, this is referred to as chain reorganization. This is also the criticism leveled at many cryptocurrencies: apart from Bitcoin, no other decentralized and independent cryptocurrency exists. If the blockchain, with all its contained elements, is made public and every user keeps their own local instance of it, to which they can add elements that are then synchronized with all other instances, then we have a decentralized approach.

The technology for decentralized communication without an intermediary is called peer-to-peer (P2P). P2P networks are particularly vulnerable in their early stages, when there are only a few participants. With a great deal of computing power, one could easily create a large number of so-called zombie peers that influence the network’s behavior. Especially in times when cloud computing, with providers like AWS and Google Cloud Platform, offers virtually unlimited resources for relatively little money, this is a significant problem. This point should not be overlooked, particularly when there is a high financial incentive for fraudsters.

There are also various competing concepts within P2P. To implement a stable and secure blockchain, it is necessary to use only solutions that do not require supporting backbone servers. The goal is to prevent the establishment of a master chain. Therefore, questions must be answered regarding how individual peers can find each other and which protocol they use to synchronize their data. By protocol, we mean a set of rules, a fixed framework for how interaction between peers is regulated. Since this point is already quite extensive, I refer you to my 2022 presentation for an introduction to the topic.

Another feature of blockchain blocks is that their validity can be easily and quickly verified. This simply requires generating the SHA-512 hash of the entire contents of a block; if this hash matches the block’s ID, the block is valid. Time-sensitive or time-critical transactions, such as those relevant to payment systems, can also be supported with minimal effort. No complex time servers are needed as intermediaries: a timestamp is simply added to each block. This timestamp must, however, take into account the location where it is created, i.e., specify the time zone. To obscure the location of the transaction participants, all times from the different time zones can be converted to UTC.

To ensure that the time is correctly set on the system, a time server can be queried for the current time when the software starts, and a correction message can be displayed if there are any discrepancies.

Of course, time-critical transactions are subject to a number of challenges. It must be ensured that a transaction was carried out within a defined time window. This is a problem that so-called real-time systems have to deal with. The double-spending problem also needs to be prevented—that is, the same amount being sent twice to different recipients. In a decentralized network, this requires confirmation of the transaction by multiple participants. Classic race conditions can also pose a problem. Race conditions can be managed by applying the Immutable design pattern to the block elements.

To prevent the blockchain from being disrupted by spam attacks, we need a solution that makes creating a single block expensive. We achieve this by incorporating computing power. The participant creating a block must solve a puzzle that requires a certain amount of computing time. If a spammer wants to flood the network with many blocks, the required computing power increases exorbitantly, making it impossible to generate an unlimited number of blocks in a short time. The number at the heart of this cryptographic puzzle is called a nonce, which stands for “number used only once.” The nonce mechanism in the blockchain is also often referred to as Proof of Work (PoW) and is used in Bitcoin by the miners to verify blocks.

The nonce is a (pseudo)random number for which a hash must be generated. This hash must then meet certain criteria, for example two or three leading zeros. To prevent arbitrary hashes from simply being inserted into the block, the random number that solves the puzzle is stored in the block itself. A nonce that has already been used cannot be used again, as this would circumvent the puzzle. When the hash is generated from the nonce, it must meet the requirements, such as the leading zeros, in order to be accepted.

Since finding a valid nonce becomes increasingly difficult as the number of blocks in a blockchain grows, it is necessary to change the rules for such a nonce cyclically, for example, every 2048 blocks. This also means that the rules for a valid nonce must be assigned to the corresponding blocks. Such a set of rules for the nonce can easily be formulated using a regular expression (regex).
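
As an illustration, the following sketch searches for a nonce whose block hash satisfies a rule expressed as a regular expression. The rule of three leading zeros and the method names are arbitrary assumptions for this example.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.HexFormat;
import java.util.regex.Pattern;

// Minimal proof-of-work sketch: try random nonces until the hash of
// (blockId + nonce) matches the currently valid rule.
public class ProofOfWork {

    private static final SecureRandom RANDOM = new SecureRandom();

    public static long findNonce(String blockId, Pattern rule) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-512");
        while (true) {
            long nonce = RANDOM.nextLong();
            String hash = HexFormat.of().formatHex(
                    digest.digest((blockId + nonce).getBytes(StandardCharsets.UTF_8)));
            if (rule.matcher(hash).find()) {
                return nonce;
            }
        }
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // example rule: the hexadecimal hash must start with three zeros
        Pattern rule = Pattern.compile("^000");
        System.out.println(findNonce("some-block-id", rule));
    }
}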

We’ve now learned a considerable amount about the ruleset for a blockchain. So it’s time to consider performance. If we were to simply store all the individual blocks of the blockchain in a list, we would quickly run out of memory. While it’s possible to store the blocks in a local database, this would negatively impact the blockchain’s speed, even with an embedded solution like SQLite. A simple solution would be to divide the blockchain into equal parts, called chunks. A chunk would have a fixed length of 2048 valid blocks, and the first block of a new chunk would point to the last block of the previous chunk. Each chunk could also contain a central rule for the nonce and store metadata such as minimum and maximum timestamps.

To briefly recap our current understanding of the blockchain ruleset, we’re looking at three different levels. The largest level is the blockchain itself, which contains fundamental metadata and configurations. Such configurations include the hash algorithm used. The second level consists of so-called chunks, which contain a defined set of block elements. As mentioned earlier, chunks also contain metadata and configurations. The smallest element of the blockchain is the block itself, which comprises an ID, the described additional information such as a timestamp and nonce, and the payload. The payload is a general term for any data object that is to be made verifiable by the blockchain. For Bitcoin and other cryptocurrencies, the payload is the information about the amount being transferred from Wallet A (source) to Wallet B (destination).

Blockchain technology is also suitable for many other application scenarios. For example, the hash values of open-source software artifacts could be stored in a blockchain. This would allow users to download binary files from untrusted sources and verify them against the corresponding blockchain. The same principle could be applied to the signatures of antivirus programs. Applications and other documents could also be transmitted securely in governmental settings. The blockchain would function as a kind of “postal stamp.” Accounting, including all receipts for goods and services purchased and sold, is another conceivable application.

Depending on the use case, an extension of the blockchain would be the unique signing of a block by its creator. This would utilize the classic PKI (Public Key Infrastructure) method with public and private keys. The signer stores their public key in the block and creates a signature using their private key via the payload, which is then also stored in the block.
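
A sketch of such a signature using the JDK's standard PKI classes is shown below; the 2048-bit key size and the SHA256withRSA combination are merely common choices for the example, not a requirement of any specific blockchain.

import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import java.util.Base64;

// Sketch: the block creator signs the payload with their private key;
// anyone can verify the signature with the public key stored in the block.
public class BlockSigner {

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair keyPair = generator.generateKeyPair();

        String payload = "transfer 5 units from wallet A to wallet B";

        // sign the payload with the private key
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(keyPair.getPrivate());
        signer.update(payload.getBytes(StandardCharsets.UTF_8));
        byte[] signature = signer.sign();
        System.out.println(Base64.getEncoder().encodeToString(signature));

        // verify it with the public key from the block
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(keyPair.getPublic());
        verifier.update(payload.getBytes(StandardCharsets.UTF_8));
        System.out.println("valid signature: " + verifier.verify(signature));
    }
}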

In the Java world, two freely available blockchain libraries are worth mentioning: BitcoinJ for Bitcoin and Web3j for Ethereum. Of course, it’s possible to create your own universally applicable blockchain implementation using the principles just described. The pitfalls, naturally, lie in the details, some of which I’ve already touched upon in this article. Fundamentally, however, blockchain isn’t rocket science and is quite manageable for experienced developers. Anyone considering trying their hand at their own implementation now has sufficient basic knowledge to delve deeper into the necessary details of the various technologies involved.


Count on me – Java Enums

As an experienced developer, you often think that core Java topics are more for beginners. But that doesn’t necessarily have to be the case. One reason is, of course, the tunnel vision that develops in ‘pros’ over time. You can only counteract this tunnel vision by occasionally revisiting seemingly familiar topics. Unfortunately, enumerations in Java are somewhat neglected and underutilized. One possible reason for this neglect could be the many tutorials on enums available online, which only demonstrate a trivial use case. So, let’s start by looking at where enums are used in Java.

Let’s consider a few simple scenarios we encounter in everyday development. Imagine we need to implement a hash function that can support multiple hash algorithms like MD5 and SHA-1. At the user level, we often have a function that might look something like this: String hash(String text, String algorithm); – a technically correct solution.

Another example is logging. To write a log entry, the method could look like this: void log(String message, String logLevel); and examples in this style can be continued indefinitely.

The problem with this approach is free parameters like logLevel and algorithm: only a very limited set of values is actually valid, yet the String type accepts anything. The risk of developers filling these parameters incorrectly under project pressure is quite high. Typos, variations in capitalization, and so on are potential sources of error. There are certainly a number of such design decisions in the Java API that could be improved.

As is often the case, many roads lead to Rome when it comes to improving the use of the API, i.e., the functions exposed to the outside. One quite common approach is the use of constants. To illustrate the solution a little better, I’ll take the example of the hash functionality. The two hash algorithms MD5 and SHA-1 should be supported in the Hash class.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public class Hash {

    public static final String MD5 = "1";
    public static final String SHA1 = "2";

    /**
     * Creates a hash from text by using the chosen algorithm.
     */
    public String calculateHash(String text, String algorithm) {
        String hash = "";
        if (algorithm.equals(MD5)) {
            hash = digest("MD5", text);
        }
        if (algorithm.equals(SHA1)) {
            hash = digest("SHA-1", text);
        }
        return hash;
    }

    private String digest(String algorithm, String text) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            return HexFormat.of().formatHex(md.digest(text.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            return "";
        }
    }
}

// Call
new Hash().calculateHash("myHashValue", Hash.MD5);

In the Hash class, the two constants MD5 and SHA1 are defined to indicate which values are valid for the parameter algorithm. Developers with some experience recognize this configuration quite quickly and use it correctly. Technically, the solution is absolutely correct. However, even if something is syntactically correct, it doesn’t necessarily mean it’s an optimal solution in the semantic context. The strategy presented here disregards the design principles of object-oriented programming. Although the constants already suggest valid input values for the parameter algorithm, there is no mechanism that enforces their use.

One might argue that this isn’t important as long as the method is called with a valid parameter. But that’s not the case. Every call to the calculateHash method that bypasses the constants will lead to potential problems during refactoring. For example, if you want to change the value "2" to "SHA-1" for consistency, this will lead to errors wherever the constants aren’t used. So, how can we improve this? The preferred solution should be the use of enums.

All enums defined in Java are derived from the class java.lang.Enum [2]. A selection of its methods:

  • name() returns the name of this enum constant exactly as it appears in its enum declaration.
  • ordinal() returns the ordinal number of this enum constant (its position in the enum declaration, with the first constant assigned ordinal number zero).
  • toString() returns the name of this enum constant exactly as it appears in the declaration.
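
A sketch of the simplest possible variant, using the HashAlgorithm name from the later examples:

public enum HashAlgorithm {
    MD5,
    SHA1;
}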

This is the simplest way to define an enum: a plain enumeration of the valid values as constants. In this notation, the constant name and the constant value are identical. When working with enums, it is important to be aware that the naming conventions for constants apply.

The names of variables declared as class constants and of ANSI constants should be written exclusively in uppercase letters, with underscores (“_”) separating the words. (ANSI constants should be avoided to simplify debugging.) Example: static final int GET_THE_CPU = 1;

[1] Oracle Documentation

This property leads us to the first problem with the simplified definition of enums. If, for whatever reason, we want to name a constant SHA-1 instead of SHA1, this results in an error, since “-” is not permitted in Java identifiers and also does not conform to the convention for constants. Therefore, the following construct is most commonly used for defining enums:

public enum HashAlgorithm {

    MD5("MD5"),
    SHA1("SHA-1");

    private String value;

    HashAlgorithm(final String value) {
        this.value = value;
    }

    public String getValue() {
        return this.value;
    }
}

In this example, the constant SHA1 is extended with the string value “SHA-1”. To access this value, it’s necessary to implement a way to retrieve it. This access is provided via the variable value, which has the corresponding getter getValue(). To populate the variable value, we also need a constructor. The following test case demonstrates the different ways to access the entry SHA1 and their corresponding outputs.

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

public class HashAlgorithmTest {

    @Test
    void enumNames() {
        assertEquals("SHA1",
                HashAlgorithm.SHA1.name());
    }

    @Test
    void enumToString() {
        assertEquals("SHA1",
                HashAlgorithm.SHA1.toString());
    }
    
    @Test
    void enumValues() {
        assertEquals("SHA-1",
                HashAlgorithm.SHA1.getValue());
    }


    @Test
    void enumOrdinals() {
        assertEquals(1,
                HashAlgorithm.SHA1.ordinal());
    }
}

First, we access the name of the constant, which in our example is SHA1. We obtain the same result using the toString() method. We retrieve the content of SHA1, i.e., the value SHA-1, by calling our defined getValue() method. Additionally, we can query the position of SHA1 within the enum. In our example, SHA1 is declared in second place, and since ordinals are counted from zero, its ordinal value is 1. Let’s now look at how the implementation of the hash class from Listing 1 changes when we use the enum.

public String calculateHash(String text, HashAlgorithm algorithm) {
    String hash = "";
    switch (algorithm) {
        case MD5:
            // digest() is the private helper introduced in Listing 1
            hash = digest("MD5", text);
            break;
        case SHA1:
            hash = digest("SHA-1", text);
            break;
    }
    return hash;
}

// Call
calculateHash("text2hash", HashAlgorithm.MD5);

In this example, we see a significant simplification of the implementation and also achieve type-safe use of the algorithm parameter, which is now determined by the enum. Certainly, there are some critics who might complain about the amount of code to be written and point out that it could be done much more compactly. Such arguments tend to come from people who have no experience with large projects involving millions of lines of source code or who don’t work in a team with multiple programmers. Object-oriented design isn’t about avoiding every unnecessary character, but about writing readable, fault-tolerant, expressive, and easily modifiable code. We have covered all these attributes in this example using enums.

According to Oracle’s tutorial for Java enums [3], much more complex constructs are possible. Enums behave like Java classes and can also be enriched with logic. The specification primarily provides the compareTo() and equals() methods, which enable comparison and sorting. For the hash algorithm example, this functionality would not add any value. It’s also generally advisable to avoid placing additional functionality or logic within enums, as they should be treated similarly to pure data classes.

The fact that enums are indeed an important topic in Java development is demonstrated by the fact that J. Bloch dedicated an entire chapter of nearly 30 pages to them in his book “Effective Java.”

With this, we have worked through a comprehensive example of the correct use of enums and, above all, learned where this language construct finds meaningful application in Java.



Installing Python programs via PIP on Linux

Many years ago, the scripting language Python, named after the British comedy troupe, replaced the venerable Perl on Linux. This means that every Linux distribution includes a Python interpreter by default. A pretty convenient thing, really. Or so it seems! If it weren’t for the pesky issue of security. But let’s start at the beginning, because this short article is intended for people who want to run software written in Python on Linux, but who don’t know Python or have any programming experience. Therefore, a little background information to help you understand what this is all about.

All current Linux distributions derived from Debian, such as Ubuntu, Mint, and so on, throw a cryptic error when you try to install a Python program. To prevent important system libraries written in Python from being overwritten by the installation of additional programs and causing malfunctions in the operating system, a safeguard has been built in. Unfortunately, as is so often the case, the devil is in the details.

ed@P14s:~$ python3 -m pip install ansible
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.
    
    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.
    
    If you wish to install a non-Debian packaged Python application,
    it may be easiest to use pipx install xyz, which will manage a
    virtual environment for you. Make sure you have pipx installed.
    
    See /usr/share/doc/python3.13/README.venv for more information.

As a solution, a virtual environment will now be set up. Debian 12, and also Debian 13, which was just released in August 2025, use Python version 3. Python 2 and Python 3 are not compatible with each other. This means that programs written in Python 2 will not run in Python 3 without modification.

If you want to install any program in Python, this is done by the so-called package manager. Most programming languages have such a mechanism. The package manager for Python is called PIP. This is where the first complications arise. There are pip, pip3, and pipx. Such naming inconsistencies can also be found with the Python interpreter itself. Version 2 is started on the console with python, and version 3 with python3. Since this article refers to Debian 12 / Debian 13 and its derivatives, we know that at least Python 3 is used. To find out the actual Python version, you can also enter python3 -V in the shell, which shows version 3.13.5 in my case. If you try python or python2, you get an error message that the command could not be found.

Let’s first look at what pip, pip3, and pipx actually mean. PIP itself simply stands for Package Installer for Python [1]. Up to Python 2, PIP was used, and from version 3 onwards, we have PIP3. PIPX [2] is quite special and designed for isolated environments, which is exactly what we need. Therefore, the next step is to install PIPX. We can easily do this using the Linux package manager: sudo apt install pipx. To determine which PIPX version is installed on the system, we use the following command: python3 -m pipx --version, which in my case outputs 1.7.1. This means that I have the original Python 3 installed on my system, along with PIPX.

With this prerequisite, I can now install all kinds of Python modules using PIPX. The basic command is pipx install <module>. To create a practical example, we will now install Ansible. The use of pip and pip3 should be avoided here, as installing packages with them into the system environment leads to the cryptic error mentioned earlier.

Ansible [3] is a program written in Python and migrated to Python 3 starting with version 2.5. Here’s a brief overview of what Ansible itself is. Ansible belongs to the class of configuration management programs and allows for the fully automated provisioning of systems using a script. Provisioning can be performed, for example, as a virtual machine and includes setting hardware resources (RAM, HDD, CPU cores, etc.), installing the operating system, configuring the user, and installing and configuring other programs.

First, we need to install Ansible with pipx install ansible. Once the installation is complete, we can verify its success with pipx list, which in my case produced the following output:

ed@local:~$ pipx list
venvs are in /home/ed/.local/share/pipx/venvs
apps are exposed on your $PATH at /home/ed/.local/bin
manual pages are exposed at /home/ed/.local/share/man
   package ansible 12.1.0, installed using Python 3.13.5
    - ansible-community

The installation isn’t quite finished yet, as the command ansible --version returns an error message. The problem here is related to the Ansible edition. As we can see from the output of pipx list, we have the Community Edition installed. Therefore, the command is ansible-community --version, which currently shows version 12.2.0 for me.

If you prefer to type ansible instead of ansible-community in the console, you can do so using an alias. Setting the alias isn’t entirely straightforward, as parameters need to be passed to it. How to do this will be covered in another article.

Occasionally, Python programs cannot be installed via PIPX. One example is streamdeck-ui [4]. For a long time, Elgato’s StreamDeck hardware could be used under Linux with the Python-based streamdeck-ui. However, there is now an alternative called Boatswain, which is not written in Python and should be used instead. Unfortunately, installing streamdeck-ui results in an error due to its dependency on the pillow library. If you look at the installation script from the streamdeck-ui Git repository, you’ll find a reference to installing it via pip3. When you then reach the point where you execute the command pip3 install --user streamdeck_ui, you’ll receive the error message “externally-managed-environment” that I described at the beginning of this article. Since we’re already using PIPX, creating yet another virtual environment for Python programs isn’t productive, as it only leads to the same error with the pillow library.

As I’m not a Python programmer myself, but I do have some experience with complex dependencies in large Java projects, and I actually found the streamdeck-ui program better than Boatswain, I took a look around the GitHub repository. The first thing I noticed is that the last activity was in spring 2023, making a revival rather unlikely. Nevertheless, let’s take a closer look at the error message to get an idea of how to narrow down the problem when installing other programs.

Fatal error from pip prevented installation. Full pip output in file:
    /home/ed/.local/state/pipx/log/cmd_pip_errors.log

pip seemed to fail to build package:
    'pillow'

A look at the corresponding log file reveals that the dependency on pillow is pinned to a version less than 7 and greater than 6.1, resulting in the use of version 6.2.2. Investigating what pillow actually does, we learn that it is the Python 3 fork of the old Python 2 imaging library PIL, used for image processing, and that it will reach version 12 by the end of 2025. The problem could potentially be resolved by using a more recent version of pillow. However, this would most likely require adjustments to the streamdeck-ui code, as incompatible API changes have almost certainly accumulated since version 6.2.2.

This analysis shows that the chances of getting streamdeck-ui to run with pip3 are just as slim as with pipx. Anyone who gets the idea to downgrade to Python 2 just to get old programs running should try this in a separate, isolated environment, for example using Docker. Python 2 hasn’t received updates or security patches for many years, which is why installing it alongside Python 3 on your operating system is not a good idea.

So we see that the error message described at the beginning isn’t so cryptic after all if you simply use PIPX. If you still can’t get your program to install, a look at the error message will usually tell you that you’re trying to use an outdated and no longer maintained program.



PDF generation with TP-CORE

PDF is arguably the most important interoperable document format between operating systems and devices such as computers, smartphones, and tablets. The most important characteristic of PDF is the immutability of the documents. Of course, there are limits, and nowadays, pages can easily be removed or added to a PDF document using appropriate programs. However, the content of individual pages cannot be easily modified. Where years ago expensive specialized software such as Adobe Acrobat was required to work with PDFs, many free solutions are now available, even for commercial use. For example, the common office suites support exporting documents as PDFs.

Especially in the business world, PDF is an excellent choice for generating invoices or reports. This article focuses on this topic. I describe how simple PDF documents can be used for invoicing, etc. Complex layouts, such as those used for magazines and journals, are not covered in this article.

Technically, the freely available library openPDF is used. The concept is to generate a PDF from valid HTML code contained in a template. Since we’re working in the Java environment, Velocity is the template engine of choice. TP-CORE utilizes all these dependencies in its PdfRenderer functionality, which is implemented as a facade. This is intended to encapsulate the complexity of PDF generation within the functionality and facilitate library replacement.

While I’m not a PDF generation specialist, I’ve gained considerable experience with this functionality over the years, particularly in terms of software project maintainability; I even gave a conference presentation on this topic. Many years ago, when I decided to support PDF in TP-CORE, the only available library was iText, which at that time was freely available in version 5. Then I found myself in a situation similar to that of Linus Torvalds with his original source control management system: iText became commercial, and I needed a new solution. Well, I’m not Linus, who can just whip up Git overnight, so after some waiting I discovered openPDF, a fork of iText 5. Now I had to adapt my existing code accordingly. This took some time, but thanks to my encapsulation it was a manageable task. During this adaptation, however, I uncovered problems that had already been causing me difficulties in my small environment, so I released TP-CORE version 3.0 to achieve functional stability. Anyone using TP-CORE version 2.x will therefore still find iText 5 as the PDF solution. But that’s enough about the development history. Let’s look at how we can currently generate PDFs with TP-CORE 3.1. Here, too, my main goal is to achieve the highest possible compatibility.

Before we can begin, we need to include TP-CORE as a dependency in our Java project. This example demonstrates the use of Maven as the build tool, but it can easily be switched to Gradle.

<dependency>
    <groupId>io.github.together.modules</groupId>
    <artifactId>core</artifactId>
    <version>3.1.0</version>
</dependency>

To generate an invoice, for example, we need several steps. First, we need an HTML template, which we create using Velocity. The template can, of course, contain placeholders for names and addresses, enabling mail merge and batch processing. The following code demonstrates how to work with this.

<h1>HTML to PDF Velocity Template</h1>

<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet.</p>

<p>Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi. Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.</p>

<h2>Lists</h2>
<ul>
    <li><b>bold</b></li>
    <li><i>italic</i></li>
    <li><u>underline</u></li>
</ul>

<ol>
    <li>Item</li>
    <li>Item</li>
</ol>

<h2>Table 100%</h2>
<table border="1" style="width: 100%; border: 1px solid black;">
    <tr>
        <td>Property 1</td>
        <td>$prop_01</td>
        <td>Lorem ipsum dolor sit amet,</td>
    </tr>
    <tr>
        <td>Property 2</td>
        <td>$prop_02</td>
        <td>Lorem ipsum dolor sit amet,</td>
    </tr>
    <tr>
        <td>Property 3</td>
        <td>$prop_03</td>
        <td>Lorem ipsum dolor sit amet,</td>
    </tr>
</table>

<img src="path/to/myImage.png" />

<h2>Links</h2>
<p>here we try a simple <a href="https://together-platform.org/tp-core/">link</a></p>

template.vm

The template contains three properties: $prop_01, $prop_02, and $prop_03, which we need to populate with values. We can do this with a simple HashMap, which we insert using the TemplateRenderer, as the following example shows.

Map<String, String> properties = new HashMap<>();
properties.put("prop_01", "value 01");
properties.put("prop_02", "value 02");
properties.put("prop_03", "value 03");

TemplateRenderer templateEngine = new VelocityRenderer();
String directory = Constraints.SYSTEM_USER_HOME_DIR + "/";
String htmlTemplate = templateEngine
                .loadContentByFileResource(directory, "template.vm", properties);

The TemplateRenderer requires three arguments:

  • directory – The full path where the template is located.
  • template – The name of the Velocity template.
  • properties – The HashMap containing the variables to be replaced.

The result is valid HTML code, from which we can now generate the PDF using the PdfRenderer. A special feature is the constant SYSTEM_USER_HOME_DIR, which points to the user directory of the currently logged-in user and handles differences between operating systems such as Linux and Windows.

PdfRenderer pdf = new OpenPdfRenderer();
pdf.setTitle("Title of my own PDF");
pdf.setAuthor("Elmar Dott");
pdf.setSubject("A little description about the PDF document.");
pdf.setKeywords("pdf, html, openPDF");
pdf.renderDocumentFromHtml(directory + "myOwn.pdf", htmlTemplate);

The code for creating the PDF is clear and quite self-explanatory. It’s also possible to hardcode the HTML. Using the template allows for a separation of code and design, which also makes later adjustments flexible.

The default format is A4 with the dimensions size: 21cm 29.7cm; margin: 20mm 20mm; and is defined as inline CSS. This value can be customized using the setFormat(String format); method. The first value represents the width and the second value the height.
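
A hypothetical call, modeled on the default value quoted above, could look like this:

// same A4 page size as the default, but with a larger margin
pdf.setFormat("size: 21cm 29.7cm; margin: 25mm 20mm;");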

The PDF renderer also allows you to delete individual pages from a document.

PdfRenderer pdf = new OpenPdfRenderer();
PdfDocument original = pdf.loadDocument( new File("document.pdf") );
PdfDocument reduced = pdf.removePage(original, 1, 5);
pdf.writeDocument(reduced, "reduced.pdf");

We can see, therefore, that great emphasis was placed on ease of use when implementing the PDF functionality. Nevertheless, there are many ways to create customized PDF documents. This makes the solution of defining the layout via HTML particularly appealing. Despite the PdfRenderer’s relatively limited functionality, Java’s inheritance mechanism makes it very easy to extend the interface and its implementation with custom solutions. This possibility also exists for all other implementations in TP-CORE.

TP-CORE is also free for commercial use and has no restrictions thanks to the Apache 2 license. The source code can be found at https://git.elmar-dott.com and the documentation, including security and test coverage, can be found at https://together-platform.org/tp-core/.

RTFM – usable documentation

An old master craftsman used to say: “He who writes, remains.” His primary intention was to obtain accurate measurements and weekly reports from his journeymen. He needed this information to issue correct invoices, which was crucial to the success of his business. This analogy can also be readily applied to software development. It wasn’t until Ruby, the programming language developed in Japan by Yukihiro Matsumoto, had English-language documentation that Ruby’s global success began.

We can see, therefore, that documentation can be of considerable importance to the success of a software project. It’s not simply a repository of information within the project where new colleagues can find necessary details. Of course, documentation is a rather tedious subject for developers. It constantly needs to be kept up-to-date, and they often lack the skills to put their own thoughts down on paper in a clear and organized way for others to understand.

I myself first encountered the topic of documentation many years ago through reading the book “Software Engineering” by Johannes Siedersleben. Ed Yourdon was quoted there as saying that before methods like UML, documentation often took the form of a Victorian novella. During my professional life, I’ve also encountered a few such Victorian novellas. The frustrating thing was: after battling through the textual desert—there’s no other way to describe the feeling than as overcoming and struggling—you still didn’t have the information you were looking for. To paraphrase Goethe’s Faust: “So here I stand, poor fool, no wiser than before.”

Here we already see a first criticism of poor documentation: inappropriate length and a lack of information. We must recognize that writing isn’t something everyone is born with. After all, you became a software developer, not a book author. For the concept of “successful documentation,” this means that you shouldn’t force anyone to do it and instead look for team members who have a knack for it. This doesn’t mean, however, that everyone else is exempt from documentation tasks. Their input is essential for quality. Proofreading, pointing out errors, and suggesting additions are all necessary tasks that can easily be shared.

It’s highly advisable to provide occasional rhetorical training for the team or individual team members. The focus should be on precise, concise, and understandable expression. This also involves organizing your thoughts so they can be put down on paper and follow a clear and coherent structure. The resulting improved communication has a very positive impact on development projects.

Up-to-date documentation that is easy to read and contains important information quickly becomes a living document, regardless of the format chosen. This is also a fundamental concept for successful DevOps and agile methodologies. These paradigms rely on good information exchange and address the avoidance of information silos.

One point that really bothers me is the statement: “Our tests are the documentation.” Not all stakeholders can program and are therefore unable to understand the test cases. Furthermore, while tests demonstrate the behavior of functions, they don’t inherently demonstrate their correct usage. Variations of usable solutions are also usually missing. For test cases to have a documentary character, it’s necessary to develop specific tests precisely for this purpose. In my opinion, this approach has two significant advantages. First, the implementation documentation remains up-to-date, because changes will cause the test case to fail. Another positive effect is that the developer becomes aware of how their implementation is being used and can correct a flawed design in a timely manner.

Of course, there are now countless technical solutions that are suitable for different groups of people, depending on their perspective on the system. Issue and bug tracking systems, such as the commercial JIRA or the open-source Redmine, map entire processes. They allow testers to assign identified problems and errors in the software to a specific release version. Project managers can use release management to prioritize fixes, and developers document the implemented fixes. That’s the theory. In practice, I’ve seen in almost every project how the comment function in these systems is misused as a chat to describe the change status. The result is a bug report with countless useless comments, and any real, relevant information is completely missing.

Another widespread technical solution in development projects is the use of enterprise wikis. They enhance simple wikis with navigation and allow the creation of closed spaces where only explicitly authorized user groups receive granular permissions such as read or write. Besides the widely used commercial solution Confluence, there’s also a free alternative called BlueSpice, which is based on MediaWiki. Wikis allow collaborative work on a document, and individual pages can be exported as PDFs in various formats. To ensure that the wiki pages remain usable, it’s important to maintain clean and consistent formatting. Tables should fit their content onto a single A4 page without unwanted line breaks. This improves readability. There are also many instances where bulleted lists are preferable to tables for the sake of clarity.

This brings us to another very sensitive topic: graphics. It’s certainly true that a picture is often worth a thousand words. But not always! When working with graphics, it’s important to be aware that images often require a considerable amount of time to create and can often only be adapted with significant effort. This leads to several conclusions to make life easier. A standard program (format) should be used for creating graphics. Expensive graphics programs like Photoshop and Corel should be avoided. Graphics created for wiki pages should be attached to the wiki page in their original, editable form. A separate repository can also be set up for this purpose to allow reuse in other projects.

If an image offers no added value, it’s best to omit it. Here’s a small example: It’s unnecessary to create a graphic depicting ten stick figures with a role name or person underneath. Here, it is advisable to create a simple list, which is also easier to supplement or adapt.

But you should also avoid overloaded graphics. True to the motto “more is better,” overly detailed information tends to cause confusion and can lead to misinterpretations. A recommended book is “Documenting and Communicating Software Architectures” by Stefan Zörner. In this book, he effectively demonstrates the importance of different perspectives on a system and which groups of people are addressed by a specific viewpoint. I would also like to take this opportunity to share his seven rules for good documentation:

  1. Write from the reader’s perspective.
  2. Avoid unnecessary repetition.
  3. Avoid ambiguity; explain notation if necessary.
  4. Use standards such as UML.
  5. Include the reasons (why).
  6. Keep the documentation up-to-date, but never too up-to-date.
  7. Review the usability.

Anyone tasked with writing the documentation, or ensuring its progress and accuracy, should always be aware that it contains important information and presents it correctly and clearly. Concise and well-organized documentation can be easily adapted and expanded as the project progresses. Adjustments are most successful when the affected area is as cohesive as possible and appears only once. This centralization is achieved through references and hyperlinks, so that changes in the original document are reflected in the references.

Of course, there is much more to say about documentation; it’s the subject of numerous books, but that would go beyond the scope of this article. My main goal was to raise awareness of this topic, as paradigms like Agile and DevOps rely on a good flow of information.


BugChaser – The limits of test coverage

The paradigms now established in software engineering, such as Test-Driven Development (TDD) and Behavior-Driven Development (BDD), along with correspondingly easy-to-use tools, have opened up a new, pragmatic perspective on the topic of software testing. Automated tests are a crucial factor in commercial software projects. Therefore, in this context, a successful testing strategy is one in which test execution proceeds without human intervention.

Test automation forms the basis for achieving stability and reducing risk in critical tasks. Such critical tasks include, in particular, refactoring, maintenance, and bug fixes. All these activities share one common goal: preventing new errors from creeping into the code.

In his 1972 article “The Humble Programmer,” Edsger W. Dijkstra stated the following:

“Program testing can be a very effective way to show the presence of bugs, but is hopelessly inadequate for showing their absence.”

Therefore, simply automating test execution is not sufficient to ensure that changes to the codebase do not have unintended effects on existing functionality. For this reason, the quality of the test cases must be evaluated. Proven tools already exist for this purpose.

Before we delve deeper into the topic, let’s first consider what automated testing actually means. This question is quite easy to answer. Almost every programming language has a corresponding unit test framework. Unit tests call a method with various parameters and compare the return value with an expected value. If both values match, the test is considered passed. Additionally, it can also be checked whether an exception was thrown.

If a method has no return value or does not throw an error, it cannot be tested directly. Methods marked as private, or inner classes, are also not easily testable, as they cannot be called directly. These must be tested indirectly through public methods that call the ‘hidden’ methods.

When dealing with methods marked as private, it is not an option to access and test the functionality they represent using techniques such as the Reflection API. We must be aware that such methods are often also used to encapsulate code fragments to avoid duplication.

public boolean method() {
    boolean success = false;
    List<Integer> collector = new ArrayList<>();
    collector.add(1);
    collector.add(2);
    collector.add(3);

    sortAsc(collector);
    if (collector.getFirst().equals(1)) {
        success = true;
    }
    return success;
}

// private helper: only testable indirectly through the public method above
private void sortAsc(List<Integer> collection) {
    collection.sort((a, b) -> a.compareTo(b));
}

Therefore, to effectively write automated tests, it is necessary to follow a certain coding style. The preceding Listing 1 simply demonstrates what testable code can look like.

Since developers also write the corresponding component tests for their own implementations, the problem of difficult-to-test code is largely eliminated in projects that follow a test-driven approach. The motivation to test now lies with the developer, as this paradigm allows them to determine whether their implementation behaves as intended. However, we must ask ourselves: Is that all we need to do to develop good and stable software?

As we might expect with such questions, the answer is no. An essential tool for evaluating the quality of tests is achieving the highest possible test coverage. A distinction is made between branch and line coverage. To illustrate the difference more clearly, let’s briefly look at the pseudocode in Listing 2.

if (Expression-A OR Expression-B) {
    print('allow');
} else {
    print('decline');
}

Our goal is to execute every line of code if possible. To achieve this, we already need two separate test cases: one for entering the IF branch and one for entering the ELSE branch. However, to achieve 100% branch coverage, we must cover all variations of the IF branch. In this example, that means one test that makes Expression A true, and another test that makes Expression B true. This results in a total of three different test cases.
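
Sketched as JUnit tests against a hypothetical method boolean allow(boolean expressionA, boolean expressionB) that implements the pseudocode above, the three cases could look like this:

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class BranchCoverageTest {

    // hypothetical implementation of the pseudocode from Listing 2
    private boolean allow(boolean expressionA, boolean expressionB) {
        return expressionA || expressionB;
    }

    @Test
    void allowWhenExpressionAIsTrue() {
        assertTrue(allow(true, false));
    }

    @Test
    void allowWhenExpressionBIsTrue() {
        assertTrue(allow(false, true));
    }

    @Test
    void declineWhenBothExpressionsAreFalse() {
        assertFalse(allow(false, false));
    }
}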

The screenshot from the TP-CORE project shows what such test coverage can look like in ‘real-world’ projects.

Of course, this example is very simple, and in real life, there are often constructs where, despite all efforts, it’s impossible to reach all lines or branches. Exceptions from third-party libraries that need to be caught but cannot be triggered under normal circumstances are a typical example.

For this reason, while we strive to achieve the highest possible test coverage and naturally aim for 100%, there are many cases where this is not feasible. However, a test coverage of 90% is quite achievable. The industry standard for commercial projects is 85% test coverage. Based on these observations, we can say that test coverage correlates with the testability of an application. This means that test coverage is a suitable measure of testability.

However, it must also be acknowledged that the test coverage metric has its limitations. Regular expressions and annotations for data validation are just a few simple examples of where test coverage alone is not a sufficient indicator of quality.

Without going too much into the implementation details, let’s imagine we had to write a regular expression to validate input against a correct 24-hour time format. If we don’t keep the correct interval in mind, our regular expression might be incorrect. The correct interval for the 24-hour format is 00:00 – 23:59. Examples of invalid values ​​are 24:00 or 23:60. If we are unaware of this fact, errors can remain hidden in our application despite test cases, only to surface and cause problems when the application is actually used.
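
A possible regular expression for this check, together with the boundary cases just mentioned, is shown in the following sketch:

import java.util.regex.Pattern;

public class TimeValidator {

    // hours 00-23, minutes 00-59
    private static final Pattern TIME_24H = Pattern.compile("^([01][0-9]|2[0-3]):[0-5][0-9]$");

    public static boolean isValidTime(String input) {
        return TIME_24H.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidTime("23:59")); // true
        System.out.println(isValidTime("24:00")); // false
        System.out.println(isValidTime("23:60")); // false
    }
}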

“… In a few cases, participants were unable to think of alternative solutions …”

The question here was whether error correction always represents the optimal solution. Beyond that, it would be necessary to clarify what constitutes an optimal solution in commercial software development projects. The statement that there are cases in which developers only know or understand one way of doing things is very illustrative. This is also reflected in our example of regular expressions (RegEx). Software development is a thought process that cannot be accelerated. Our thinking is determined by our imagination, which in turn is influenced by our experience.

This already points us to another source of errors in test cases. A classic example is an incorrect comparison of collections, such as arrays. The underlying issue is how variables are accessed: by value or by reference. An array variable holds a reference, i.e., it points to the memory location of the array. If you assign an array to a new variable and then compare the two variables, they will always be equal, because you are comparing the array with itself. Such a test case is essentially meaningless. However, as long as the implementation is correct, this faulty test case will never cause any problems.
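
The pitfall can be reproduced in a few lines; the first test below is deliberately meaningless, while the second compares against an independently constructed expectation (JUnit is assumed here):

import static org.junit.jupiter.api.Assertions.assertArrayEquals;
import static org.junit.jupiter.api.Assertions.assertSame;

import org.junit.jupiter.api.Test;

class ArrayComparisonTest {

    @Test
    void meaninglessComparison() {
        int[] result = {3, 1, 2};
        int[] expected = result;        // copies only the reference, not the data
        assertSame(expected, result);   // always passes: the array is compared with itself
    }

    @Test
    void meaningfulComparison() {
        int[] result = {3, 1, 2};
        assertArrayEquals(new int[] {3, 1, 2}, result); // compares the actual content
    }
}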

This realization shows us that blindly striving for complete test coverage is not conducive to quality. Of course, it’s understandable that this metric is highly valued by management. However, we have also been able to demonstrate that one cannot rely on it alone. We therefore see that there is also a need for code inspections and refactorings for test cases. Since it’s impossible to read and understand all the code from beginning to end due to time constraints, it’s important to focus on problematic areas. But how can we find these problem areas? A relatively new technique helps us here. The theoretical work on this is already somewhat older; it just took a while for corresponding implementations to become available.

Cryptography – more than just coincidence

In everyday language, we use the word “coincidence” rather unreflectively. Phrases like, “I happened to be passing by here” or “What a coincidence to meet you here” are familiar to everyone. But what do we mean by that? What we’re actually trying to say is that we didn’t expect the current situation.

Chance is actually a mathematical term that we’ve adopted into everyday language. Chance means something unpredictable, such as the exact location of any electron in an atom at a given moment. While the path I take to reach a particular destination may appear arbitrary, preferences can be derived from probabilities, which then make the choice quite predictable.

Circumstances for such a scenario can be distance, personal well-being (time pressure, discomfort, or boredom), or external circumstances (weather: sunshine, rain). If I’m bored AND the sun is shining, I choose an unknown route for a bit of distraction and curiosity. If I’m short on time AND it’s raining, I choose the shortest route I know, or a route that’s as sheltered as possible. This means that the better you know a person’s habits, the more predictable their decisions are. But predictability contradicts the concept of chance.

It’s nothing new that mathematical terms with very strict definitions are temporarily adopted into our everyday language as a fad. I’d like to briefly address a very popular example, one already cited by Joseph Weizenbaum: the term chaos. In mathematical terms, chaos describes the fact that a very small change in the initial conditions leads, over long distances or time spans, to a completely different result, so that the calculation can’t even be used as an estimate or approximation. A typical application is astronomy: if I point a laser beam from Earth to the moon, a deviation of just a few millimeters causes the beam to miss the moon by kilometers. To explain such facts to a broader audience in popular science, the image was used that a butterfly flapping its wings in Tokyo can cause a storm in Berlin. Unfortunately, there are quite a few pseudoscientists who seize on this image and sell it to their audience as fact. This is, of course, nonsense. The flapping of a butterfly’s wings cannot create a storm on the other side of the globe. Just think of the impact this would have on our world, considering all the birds that take to the air every day.

“Why did the mathematician’s marriage fail? His wife was unpredictable.”

But why is randomness so important in mathematics? Specifically, it’s about the broad topic of cryptography. If we choose combinations for encryption that are easy to guess, the protection is quickly lost. Here’s a small example.

Internet pages are stateless. This means that after a website is accessed and a link is clicked to go to the next page, all information from the previous page is lost. To still be able to provide things like an online shop, a shopping cart, and all the other necessary shopping functions, there is the option of storing data on the server in so-called sessions. This data often includes the user’s login. To distinguish between sessions, they have an identification (ID). The programmer then specifies how this ID is generated. One property of these IDs is that they must be unique; no ID can occur twice.

Now, one might think of using the timestamp, including milliseconds, to generate a hash. The hash prevents anyone from immediately recognizing that the ID is derived from a timestamp. However, a patient attacker would uncover this secret relatively quickly with a little diligence. Added to that is the probability that two users create a session at exactly the same time, which would lead to an error.

One might then come up with the idea of assembling the session ID from various segments such as a timestamp, the username, and other details. Although increasing complexity offers a certain degree of protection, it is not true security. Professionals have methods, with manageable effort, to guess such ‘avoidable’ secrets. The only real protection is the use of cryptographically secure randomness: a segment that cannot be guessed, no matter how much effort is invested.

Before I reveal how we can address the problem, I would like to briefly discuss the typical attack vector and the damage it causes to SessionIDs. If the SessionID has been guessed by an attacker and this session is still active, the hacker can take over this session in their browser. This process is called session hijacking or session riding. The attacker who has managed to take over an active session is logged into an online service as a foreign user with a profile that does not belong to them. This allows them to perform all the actions that a legitimate user can do. It would therefore be possible to place an order in an online shop and have the goods shipped to a different address. This is a situation that must be prevented at all costs.

There are various strategies used to prevent the theft of an active session. Each of these strategies offers a certain level of protection, but the full strength is only achieved by combining the various options, as hackers are constantly evolving and looking for new opportunities. In this short article, we will only consider the aspect of how to generate a cryptographically secure session ID.

Almost all common programming languages have a random() function that generates a random number, though the implementations differ. Unfortunately, the numbers produced this way are not nearly as unpredictable as they would need to be to withstand an attacker. Developers should therefore always avoid this simple random function. Instead, backend languages such as PHP and Java provide cryptographically secure implementations for generating random numbers.

For Java programs, you can use the java.security.SecureRandom class. An important feature of this class is the ability to choose from various cryptographic algorithms [1]. Additionally, the starting value can be specified via a so-called seed. To demonstrate its use, here is a short code snippet:

Abonnement / Subscription

[English] This content is only available to subscribers.

[Deutsch] Diese Inhalte sind nur für Abonnenten verfügbar.
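A minimal sketch of how such a generator could look, using the JDK’s SecureRandom; the class and variable names are illustrative, and the default constructor already selects a cryptographically strong algorithm (a specific one could be requested with SecureRandom.getInstance where available):

import java.security.SecureRandom;
import java.util.Base64;

public class SessionIdGenerator {

    // One shared instance is sufficient; SecureRandom is thread-safe.
    private static final SecureRandom RANDOM = new SecureRandom();

    public static String generateSessionId() {
        // 32 random bytes give 256 bits of unpredictability.
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        // Encode the raw bytes so the ID can be used in cookies and URLs.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        System.out.println(generateSessionId());
    }
}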

As we can see, its use is quite simple and can easily be adapted. Generating randomness is even easier in PHP: simply call the function random_int($min, $max) [2]. Note that both interval boundaries, $min and $max, must be supplied.

Stepping back, we see that the assumption of many people that our world is largely computable is not entirely true. In many areas of the natural sciences there are processes that we cannot calculate. These, in turn, form the basis for generating ‘true’ randomness. Applications that require very strong protection therefore often rely on hardware, for example devices that measure the radioactive decay of a low-radiation isotope.
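On the Java side, such operating-system entropy sources (which may themselves be fed by hardware generators) can be requested explicitly. A minimal sketch, assuming a Java 8 or newer runtime:

import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class StrongRandomDemo {

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Requests the strongest algorithm the platform offers, typically
        // backed by the operating system's entropy pool (e.g. /dev/random).
        SecureRandom strong = SecureRandom.getInstanceStrong();

        byte[] bytes = new byte[16];
        strong.nextBytes(bytes); // may block until enough entropy is available
        for (byte b : bytes) {
            System.out.printf("%02x", b);
        }
        System.out.println();
    }
}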

The fields of cryptography and web application security are, of course, much more extensive. This article is intended to draw attention to the importance of the topic using a fairly simple example, without confusing and ultimately alienating potential readers with complicated mathematics.

Resources

Abonnement / Subscription

[English] This content is only available to subscribers.

[Deutsch] Diese Inhalte sind nur für Abonnenten verfügbar.


Pathfinder

So that we can call console programs anywhere on the system without having to specify their full path, we use the so-called path variable. In this variable we store the directory that contains the executable program, the so-called executable, so that we no longer have to type the full path on the command line. Incidentally, the file extension exe, which is common on Windows, derives from the word executable. Here we also have a significant difference between the two operating systems Windows and Linux. While Windows decides whether a file is a plain ASCII text file or an executable by its file extension, such as exe or txt, Linux uses the file’s meta information to make this distinction. That is why it is rather unusual to use the extensions txt and exe under Linux.

Typical use cases for setting the path variable are programming languages such as Java or tools such as the Maven build tool. If we download Maven from the official homepage, for example, we can unpack the program anywhere on our system. On Linux the location could be /opt/maven, and on Microsoft Windows it could be C:\Program Files\Maven. The installation directory contains a subdirectory bin in which the executable programs are located. The executable for Maven is called mvn, and to output the version under Linux without an entry in the path variable, the command would be: /opt/maven/bin/mvn -v. That is admittedly a bit long. Adding the Maven bin directory to the path shortens the entire command to mvn -v. This mechanism applies to every program that we use as a command in the console.

Before I get to how the path variable can be adjusted under Linux and Windows, I would like to introduce another concept: the system variable. System variables are global variables that are available to us in Bash; the path variable is one of them. Another system variable is HOME, which points to the logged-in user’s home directory. System variables are written in capital letters, with words separated by underscores. For our example of adding the Maven executable to the path, we can also define our own system variable. By convention, M2_HOME is used for Maven and JAVA_HOME for Java. As a best practice, you bind the installation directory to a system variable and then use this self-defined variable to extend the path. This approach is quite typical for system administrators, who simplify their server installations with system variables, because these variables are global and can also be read by automation scripts.
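As a small illustration of that last point, a Java program can read such variables via System.getenv; the variable names used here simply follow the Maven example above:

public class EnvDemo {

    public static void main(String[] args) {
        // Returns null if the variable is not set in the environment.
        String mavenHome = System.getenv("M2_HOME");
        System.out.println("M2_HOME = " + mavenHome);

        // The path variable can be read in exactly the same way.
        System.out.println("PATH = " + System.getenv("PATH"));
    }
}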

The command line, also known as the shell, Bash, console, or terminal, offers an easy way to output the value of a system variable with echo. Using the path variable as an example, we immediately see a difference between Linux and Windows:

Linux: echo $PATH
Windows: echo %PATH%

ed@local:~$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/snap/bin:/home/ed/Programs/maven/bin:/home/ed/.local/share/gem//bin:/home/ed/.local/bin:/usr/share/openjfx/lib

Let’s start with the simplest way to set the path variable. On Linux we just need to edit the hidden .bashrc file in our home directory. At the end of the file we add the following lines and save the content.

export M2_HOME="/opt/maven"
export PATH=$PATH:$M2_HOME/bin

We bind the installation directory to the M2_HOME variable and then extend the path variable with M2_HOME plus the subdirectory containing the executable files. Binding the installation directory to its own variable is also common on Windows systems, as it allows the installation path of an application to be found and adjusted more quickly. After modifying the .bashrc file, the terminal must be restarted (or the file reloaded with source ~/.bashrc) for the changes to take effect. This approach also ensures that the entries are not lost when the computer is restarted.

Under Windows, the challenge is mainly to find the input mask where the system variables can be set. In this article I limit myself to Windows 11; the exact steps may of course change in a future update, and there are slight variations between the individual Windows versions. The setting then applies to both CMD and PowerShell. The following steps show how to access the system settings in Windows 11.

To do this, we right-click on an empty area on the desktop and select the System entry. In the System – About submenu you will find the system settings, which open the System properties popup. In the system settings we press the Environment Variables button to get the final input mask. After making the appropriate adjustments, the console must also be restarted for the changes to take effect.

In this short guide, we learned what system variables are for and how to store them permanently on Linux and Windows. We can quickly check the success of our efforts in the shell by outputting the contents of the variables with echo. And with that, we are one step closer to becoming an IT professional.


PHP Elegant Testing with Laravel

The PHP programming language has been the first choice for many developers in the field of web applications for decades. Since the introduction of object-oriented language features in version 5, PHP has come of age. Large projects can now be implemented with a clean and, above all, maintainable architecture. A striking difference between commercial software development and a hobbyist who maintains a club’s website is the automated verification that the application adheres to its specification. This brings us into the realm of automated software testing.

A key principle of automated software testing is that it verifies, without additional interaction, that the application exhibits a predetermined behavior. Software tests cannot guarantee that an application is error-free, but they do increase quality and reduce the number of potential errors. The most important aspect of automated software testing is that behavior already defined in tests can be quickly verified at any time. This ensures that if developers extend an existing function or optimize its execution speed, the existing functionality is not affected. In short, we have a powerful tool for ensuring that we haven’t broken anything in our code without having to laboriously click through all the options manually each time.

To be fair, it’s also worth mentioning that the automated tests have to be developed, which initially takes time. However, this ‘supposed’ extra effort quickly pays off once the test cases are run multiple times to ensure that the status quo hasn’t changed. Of course, the created test cases also have to be maintained.

If, for example, an error is detected, you first write a test case that replicates the error. The repair is then complete once the test case(s) pass. Changes to the behavior of existing functionality, however, always require a corresponding adaptation of the associated tests. This concept of writing tests in parallel with implementing the functionality is feasible in many programming languages and is called test-driven development. From my own experience, I recommend a test-driven approach even for relatively small projects. Small projects rarely have the complexity of large applications, which also demand a certain amount of testing skill, but they offer the opportunity to develop these skills within a manageable scope.

Test-driven software development is nothing new in PHP either. Sebastian Bergmann’s unit testing framework PHPUnit has been around since 2001. The PEST testing framework, released around 2021, builds on PHPUnit and extends it with a multitude of new features. PEST stands for PHP Elegant Testing and defines itself as a next-generation tool. Since many agencies, especially smaller ones, that develop their software in PHP generally limit themselves to manual testing, I would like to use this short article to demonstrate how easy it is to use PEST. Of course, there is a wealth of literature on the topic of test-driven software development, which focuses on how to optimally organize tests in a project. This knowledge is ideal for developers who have already taken their first steps with testing frameworks. These books teach you how to develop independent, low-maintenance, and high-performance tests with as little effort as possible. However, to get to this point, you first have to overcome the initial hurdle: installing the entire environment.

A typical environment for self-developed web projects is the Laravel framework. When creating a new Laravel web project, you can choose between PHPUnit and PEST. Laravel takes care of all the necessary details. A functioning PHP environment is required as a prerequisite. This can be a Docker container, a native installation, or the XAMPP server environment from Apache Friends. For our short example, I’ll use the PHP CLI on Debian Linux.

sudo apt-get install php-cli php-mbstring php-xml php-pcov

After executing the command in the console, you can test the installation success using the php -v command. The next step is to use a package manager to deploy other PHP libraries for our application. Composer is one such package manager. It can also be quickly deployed to the system with just a few instructions.

php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
php -r "if (hash_file('sha384', 'composer-setup.php') === 'ed0feb545ba87161262f2d45a633e34f591ebb3381f2e0063c345ebea4d228dd0043083717770234ec00c5a9f9593792') { echo 'Installer verified'.PHP_EOL; } else { echo 'Installer corrupt'.PHP_EOL; unlink('composer-setup.php'); exit(1); }"
php composer-setup.php
php -r "unlink('composer-setup.php');"

This downloads the current version of the composer.phar file to the current directory in which the command is executed. The correct hash is also automatically checked. To make Composer globally available via the command line, you can either include the path in the path variable or link composer.phar to a directory whose path is already integrated into Bash. I prefer the latter option and achieve this with:

ln -s $(pwd)/composer.phar $HOME/.local/bin/composer

If everything was executed correctly, composer list should now display the version along with the available commands. If this is the case, we can install the Laravel installer globally into the Composer repository.

composer global require laravel/installer

To call the Laravel installer from Bash, the environment variable COMPOSER_HOME must be set. To find out where Composer created its global repository, simply run the command composer config -g home. The resulting path, which in my case is /home/ed/.config/composer, is then bound to the variable COMPOSER_HOME. We can now run

php $COMPOSER_HOME/vendor/bin/laravel new MyApp

in an empty directory to create a new Laravel project. The corresponding console output looks like this:

ed@P14s:~/Downloads/test$ php $COMPOSER_HOME/vendor/bin/laravel new MyApp

   _                               _
  | |                             | |
  | |     __ _ _ __ __ ___   _____| |
  | |    / _` |  __/ _` \ \ / / _ \ |
  | |___| (_| | | | (_| |\ V /  __/ |
  |______\__,_|_|  \__,_| \_/ \___|_|


 ┌ Which starter kit would you like to install? ────────────────┐
 │ None                                                         │
 └──────────────────────────────────────────────────────────────┘

 ┌ Which testing framework do you prefer? ──────────────────────┐
 │ Pest                                                         │
 └──────────────────────────────────────────────────────────────┘

Creating a "laravel/laravel" project at "./MyApp"
Installing laravel/laravel (v12.4.0)
  - Installing laravel/laravel (v12.4.0): Extracting archive
Created project in /home/ed/Downloads/test/MyApp
Loading composer repositories with package information

The directory structure created this way contains the tests folder, where the test cases are stored, and the phpunit.xml file, which holds the test configuration. Laravel defines two test suites, Unit and Feature, each of which already contains a demo test. To run these demo test cases, we use the artisan command-line tool [1] provided by Laravel and simply enter php artisan test in the project’s root directory.

In order to assess the quality of the test cases, we need to determine the corresponding test coverage. We also obtain the coverage using artisan with the test statement, which is supplemented by the --coverage parameter.

php artisan test --coverage

For the demo test cases provided by Laravel, this prints a short summary of the test results together with the coverage achieved per file.

Unfortunately, artisan’s capabilities for executing test cases are very limited. To utilize PEST’s full functionality, the PEST executor should be used right from the start.

php ./vendor/bin/pest -h

The PEST executor is located at vendor/bin/pest, and the -h parameter displays the help. Beyond this detail, we will focus on the tests folder already mentioned. Initially, two test suites are preconfigured via the phpunit.xml file. The test files themselves should end with the suffix Test, as in the example ExampleTest.php.

Compared to other testing frameworks, PEST attempts to support as many concepts of automated testing as possible. To maintain clarity, each test level should be stored in its own test suite. In addition to classic unit tests, browser tests, stress tests, architecture tests, and even the newly emerging mutation testing are supported. Of course, this article cannot cover all aspects of PEST, and there are now many high-quality tutorials on writing classic unit tests with it. I will therefore limit myself to an overview and a few less common concepts.

Architecture test

The purpose of architectural tests is to provide a simple way to verify whether developers are adhering to the specifications. This includes, among other things, ensuring that classes representing data models are located in a specified directory and may only be accessed via specialized classes.

test('models')
    ->expect('App\Models')
    ->toOnlyBeUsedIn('App\Repositories')
    ->toOnlyUse('Illuminate\Database');

Mutation test

This form of testing is relatively new. The idea is to create so-called mutants by changing, for example, the conditions of the original implementation. If the tests assigned to a mutant still pass instead of failing, this is a strong indication that the test cases are weak and have little meaningfulness.

Original: if(TRUE) → Mutant: if(FALSE)

Stress test

Another term for stress tests is load testing, which focuses specifically on the performance of an application. This allows you to ensure that a web app, for example, can handle a defined number of simultaneous accesses.

Of course, there are many other helpful features available. For example, you can group tests and then run the groups individually.

// definition
pest()->extend(TestCase::class)
->group('feature')
->in('Feature');

// calling
php ./vendor/bin/pest --group=feature

For those who do not work with the Laravel framework but still want to test PHP code with PEST, the PEST framework can also be integrated into an existing application. All you need to do is declare PEST as a development dependency in the Composer project configuration. Then you can initiate the initial test setup in the project’s root directory.

php ./vendor/bin/pest --init

As we’ve seen, the options briefly presented here alone are very powerful. The official PEST documentation is also very detailed and should generally be your first port of call. In this article, I focused primarily on minimizing the entry barriers for test-driven development in PHP. PHP now also offers a wealth of options for implementing commercial software projects very efficiently and reliably.

Resources

Successful validation of ISBN numbers

Developers are regularly faced with the task of checking user input for correctness. A considerable number of standardized data formats now exist that make such validation tasks easy to master. The International Standard Book Number, or ISBN for short, is one such format. The ISBN comes in two versions: a ten-digit and a 13-digit one. The ten-digit version (ISBN-10) was used from 1970 to 2006 and was replaced by the 13-digit version (ISBN-13) in January 2007. Nowadays, many publishers print both versions on their titles. It is common knowledge that a book can be uniquely identified by this number, which of course means that these numbers are unique: no two different books have the same ISBN.

The theoretical background for determining whether a sequence of numbers is correct comes from coding theory. Therefore, if you would like to delve deeper into the mathematical background of error-detecting and error-correcting codes, we recommend the book “Coding Theory” by Ralph Hardo Schulz [1]. It teaches, for example, how error correction works on compact disks (CDs). But don’t worry, we’ll reduce the necessary mathematics to a minimum in this short workshop.

The ISBN is an error-detecting code. Therefore, we can’t automatically correct a detected error. We only know that something is wrong, but we don’t know the specific error. So let’s get a little closer to the matter.

Why exactly 13 digits were agreed upon for ISBN-13 remains speculation; at least the developers weren’t influenced by any superstition. The big secret behind the validation is the calculation of residue classes [2]. The algorithms for ISBN-10 and ISBN-13 are quite similar, so let’s start with the older standard, ISBN-10, which is checked as follows:

1·x1 + 2·x2 + 3·x3 + 4·x4 + 5·x5 + 6·x6 + 7·x7 + 8·x8 + 9·x9 + 10·x10 ≡ 0 (mod 11)

Don’t worry, you don’t have to be a SpaceX rocket engineer to understand the formula above. We’ll lift the veil of confusion with a small example for ISBN 3836278340. This results in the following calculation:

(1*3) + (2*8) + (3*3) + (4*6) + (5*2) + (6*7) + (7*8) + (8*3) + (9*4) + (10*0) = 220
220 modulo 11 = 0

The last digit of the ISBN is the check digit; in our example it is 0. To validate the number, we multiply each digit by its position. The fourth position, for example, is a 6, so we calculate 4 · 6. We repeat this for all ten positions and add the individual results, which gives us the total of 220. The 220 is then reduced with the remainder operation modulo 11. Since 11 fits exactly 20 times into 220, the remainder is zero, and a remainder of zero tells us that we have a valid ISBN-10.

However, there is one special feature to note: sometimes an ISBN-10 ends with an X. The check calculation can produce the value 10, which cannot be represented by a single digit, so the letter X is used in its place. When validating, the X must therefore be treated as the value 10.

As you can see, the algorithm is very simple and can easily be implemented using a simple for loop.

boolean success = false;
int[] isbn = {3, 8, 3, 6, 2, 7, 8, 3, 4, 0}; // digits of the example ISBN-10 3836278340
int sum = 0;

// weight each digit with its position (1 to 10)
for (int i = 0; i < 10; i++) {
    sum += (i + 1) * isbn[i];
}

if (sum % 11 == 0) {
    success = true;
}

To keep the algorithm as simple as possible, each digit of the ISBN-10 is stored in an integer array. With this preparation, we only need to iterate over the array. Note that the array index i starts at 0 while the positions in the formula are counted from 1, which is why the digit is weighted with i + 1. If the sum check with modulo 11 then returns 0, everything is fine.

To properly test the function, two test cases are required. The first test checks whether an ISBN is correctly recognized. The second test checks for so-called false positives. This provokes an expected error with an incorrect ISBN. This can be quickly accomplished by changing any digit of a valid ISBN.
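A hedged sketch of these two test cases in JUnit 5 could look like the following; the class IsbnValidator and its method isValidIsbn10 are assumed names, and a sketch of such a class follows at the end of this article.

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class IsbnValidatorTest {

    @Test
    void detectsValidIsbn10() {
        // the valid ISBN-10 from the example above
        assertTrue(IsbnValidator.isValidIsbn10("3836278340"));
    }

    @Test
    void rejectsManipulatedIsbn10() {
        // one digit changed, so the weighted sum is no longer divisible by 11
        assertFalse(IsbnValidator.isValidIsbn10("3836278341"));
    }
}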

Our ISBN-10 validator still has one minor flaw. Digit sequences that are shorter or longer than ten characters, i.e., that do not conform to the expected format, should be rejected beforehand. The example shows why: the last digit of this ISBN-10 is a 0, so it contributes nothing to the sum. If that trailing digit is omitted and no format check is performed, the error goes undetected. Something that has no effect on the algorithm itself but is very helpful as feedback for user input is to gray out the input field and disable the submit button until a correctly formatted ISBN has been entered.

The algorithm for ISBN-13 is similarly simple.

x1 + 3·x2 + x3 + 3·x4 + x5 + 3·x6 + x7 + 3·x8 + x9 + 3·x10 + x11 + 3·x12 + x13 ≡ 0 (mod 10)

As with ISBN-10, xn represents the digit at the corresponding position of the ISBN-13. Here, too, the partial results are summed and the remainder is checked, this time modulo 10. The main difference is that only the digits at the even positions (2, 4, 6, 8, 10, and 12) are multiplied by 3. As an example, we check the ISBN-13 9783836278348.

9 + (3*7) + 8 + (3*3) + 8 + (3*3) + 6 + (3*2) + 7 + (3*8) + 3 + (3*4) + 8 = 130
130 modulo 10 = 0

The algorithm can also be implemented for the ISBN-13 in a simple for loop.

boolean success = false;
int[] isbn = {9, 7, 8, 3, 8, 3, 6, 2, 7, 8, 3, 4, 8}; // digits of the example ISBN-13
int sum = 0;

for (int i = 0; i < 13; i++) {
    if (i % 2 == 1) {
        // odd array index = even position (2, 4, ..., 12), weighted by 3
        sum += 3 * isbn[i];
    } else {
        sum += isbn[i];
    }
}

if (sum % 10 == 0) {
    success = true;
}

The two code examples for ISBN-10 and ISBN-13 differ primarily in the if condition. The expression i % 2 calculates the remainder of the array index divided by 2. Because the array index starts at 0 while the positions are counted from 1, an odd index corresponds to an even position (2, 4, ..., 12), and exactly these values must be multiplied by 3.

This shows how useful the modulo operation % can be in programming. To keep the implementation as compact as possible, the so-called ternary operator can be used instead of the if-else condition. The expression sum += (i % 2 == 1) ? 3 * isbn[i] : isbn[i]; is much more compact, but also more difficult to read.

Below you will find a fully implemented class for checking ISBNs in the programming languages Java, PHP, and C#.

Abonnement / Subscription

[English] This content is only available to subscribers.

[Deutsch] Diese Inhalte sind nur für Abonnenten verfügbar.
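To make the core approach concrete, here is a minimal Java sketch of such a validator; the class and method names (IsbnValidator, isValidIsbn10, isValidIsbn13) are illustrative and match the test example above.

/**
 * Minimal sketch of an ISBN validator; class and method names are illustrative.
 */
public final class IsbnValidator {

    private IsbnValidator() {
    }

    /** Validates a ten-digit ISBN; the check digit may be 'X' (value 10). */
    public static boolean isValidIsbn10(String isbn) {
        if (isbn == null || !isbn.matches("[0-9]{9}[0-9Xx]")) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 10; i++) {
            char c = isbn.charAt(i);
            int digit = (c == 'X' || c == 'x') ? 10 : c - '0';
            sum += (i + 1) * digit;
        }
        return sum % 11 == 0;
    }

    /** Validates a 13-digit ISBN. */
    public static boolean isValidIsbn13(String isbn) {
        if (isbn == null || !isbn.matches("[0-9]{13}")) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 13; i++) {
            int digit = isbn.charAt(i) - '0';
            sum += (i % 2 == 1) ? 3 * digit : digit;
        }
        return sum % 10 == 0;
    }
}

In this sketch, the regular expressions reject anything that does not match the expected length and character set before the checksum is calculated, which also covers the format problem mentioned earlier.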

While the solutions presented in the examples all share the same core approach, they differ in more than just syntactical details. The Java version, for example, offers a more comprehensive variant that distinguishes more generically between ISBN-10 and ISBN-13. This demonstrates that many roads lead to Rome. It is also intended to show less experienced developers different approaches and encourage them to make their own adaptations. To simplify understanding, the source code has been enriched with comments. PHP, as a dynamically typed language, eliminates the need to convert strings to numbers; instead, a regular expression is used to ensure that only valid characters are processed.

Lessons Learned

As you can see, verifying whether an ISBN is correct isn’t rocket science. The topic of validating user input is, of course, much broader. Other examples include credit card numbers. But regular expressions also provide valuable services in this context.

Resources

  • [1] Ralph-Hardo Schulz, Codierungstheorie: Eine Einführung, 2003, ISBN 978-3-528-16419-5
  • [2] Concept of modular arithmetic on Wikipedia, https://en.wikipedia.org/wiki/Modular_arithmetic