Successful validation of ISBN validation of ISBN numbers

Developers are regularly faced with the task of checking user input for accuracy. A considerable number of standardized data formats now exist that make such validation tasks easy to master. The International Standard Book Number, or ISBN for short, is one such data format. ISBN comes in two versions: a ten-digit and a 13-digit version. From 1970 to 2006, the ten-digit version of the ISBN was used (ISBN-10), which was replaced by the 13-digit version (ISBN-13) in January 2007. Nowadays, it is common practice for many publishers to provide both versions of the ISBN for titles. It is common knowledge that books can be uniquely identified using this number. This, of course, also means that these numbers are unique. No two different books have the same ISBN (Figure 1).

The theoretical background for determining whether a sequence of numbers is correct comes from coding theory. Therefore, if you would like to delve deeper into the mathematical background of error-detecting and error-correcting codes, we recommend the book “Coding Theory” by Ralph Hardo Schulz [1]. It teaches, for example, how error correction works on compact disks (CDs). But don’t worry, we’ll reduce the necessary mathematics to a minimum in this short workshop.

The ISBN is an error-detecting code. Therefore, we can’t automatically correct a detected error. We only know that something is wrong, but we don’t know the specific error. So let’s get a little closer to the matter.

Why exactly 13 digits were agreed upon for ISBN-13 remains speculation. At least the developers weren’t influenced by any superstition. The big secret behind validation is the determination of the residual classes [2]. The algorithms for ISBN-10 and ISBN-13 are quite similar. So let’s start with the older standard, ISBN-10, which is calculated as follows:

1x1 + 2x2 + 3x3 + 4x4 + 5x5 + 6x6 + 7x7 + 8x8 + 9x9 + 10x10 = k modulo 11

Don’t worry, you don’t have to be a SpaceX rocket engineer to understand the formula above. We’ll lift the veil of confusion with a small example for ISBN 3836278340. This results in the following calculation:

(1*3) + (2*8) + (3*3) + (4*6) + (5*2) + (6*7) + (7*8) + (8*3) + (9*4) + (10*0) = 220
220 modulo 11 = 0

The last digit of the ISBN is the check digit. In the example given, this is 0. To obtain this check digit, we multiply each digit by its value. This means that the fourth position is a 6, so we calculate 4 * 6. We repeat this for all positions and add the individual results together. This gives us the amount 220. The 220 is divided by 11 using the remainder operation modulo 11. Since 11 fits exactly 20 times into 220, there is a remainder of zero. The result of 220 modulo 11 is 0 and matches the check digit, which tells us that we have a valid ISBN-10.

However, there is one special feature to note. Sometimes the last digit of the ISBN ends with X. In this case, the X must be replaced with 10.

As you can see, the algorithm is very simple and can easily be implemented using a simple for loop.

boolean success = false;
int[] isbn;
int sum = 0;

for(i=0; i<10; i++) {
sum += i*isbn[i];
}

if(sum%11 == 0) {
success = true;
}

To keep the algorithm as simple as possible, each digit of the ISBN-10 number is stored in an integer array. Based on this preparation, it is only necessary to iterate through the array. If the sum check using the modulo 11 then returns 0, everything is fine.

To properly test the function, two test cases are required. The first test checks whether an ISBN is correctly recognized. The second test checks for so-called false positives. This provokes an expected error with an incorrect ISBN. This can be quickly accomplished by changing any digit of a valid ISBN.

Our ISBN-10 validator still has one minor flaw. Digit sequences that are shorter or longer than 10, i.e., do not conform to the expected format, could be rejected beforehand. The reason for this can be seen in the example: The last digit of the ISBN-10 is a 0 – thus, the character result is also 0. If the last digit is forgotten and a check for the correct format isn’t performed, the error won’t be detected. Something that has no effect on the algorithm, but is very helpful as feedback for user input, is to gray out the input field and disable the submit button until the correct ISBN format has been entered.

The algorithm for ISBN-13 is similarly simple.

x1 + 3x2 + x3 + 3x4 + x5 + 3x6 + x7 + 3x8 + x9 + 3x10 + x11 + 3x12 + x13 = k modulo 10

As with ISBN-10, xn represents the numerical value at the corresponding position in the ISBN-13. Here, too, the partial results are summed and divided by a modulo. The main difference is that only the even-numbered positions—positions 2, 4, 6, 8, 10, and 12—are multiplied by 3, and the result is then divided by modulo 10. As an example, we calculate the ISBN-13: 9783836278348.

9 + (3*7) + 8 + (3*3) + 8 + (3*3) + 6 + (3*2) + 7 + (3*8) + 3 + (3*4) + 8 = 130
130 modulo 10 = 0

The algorithm can also be implemented for the ISBN-13 in a simple for loop.

boolean success = false;
int[] isbn;
int sum = 0;

for(i=0; i<13; i++) {
if(i%2 == 0) {
sum += 3*isbn[i];
} else {
sum += isbn[i];
}
}

if(sum%10 == 0) {
success = true;
}

The two code examples for ISBN-10 and ISBN-13 differ primarily in the if condition. The expression i % 2 calculates the modulo value 2 for the respective iteration. If the result is 0 at this point, it means it is an even number. The corresponding value must then be multiplied by 3.

This shows how useful the modulo operation % can be for programming. To keep the implementation as compact as possible, the so-called triple operator can be used instead of the if-else condition. The expression sum += (i%2) ? isbn[i] : 3 * isbn[3] is much more compact, but also more difficult to understand.

Below you will find a fully implemented class for checking the ISBN in the programming languages: Java, PHP, and C#.

Abonnement / Subscription

[English] This content is only available to subscribers.

[Deutsch] Diese Inhalte sind nur für Abonnenten verfügbar.

While the solutions presented in the examples all share the same core approach, they differ in more than just syntactical details. The Java version, for example, offers a more comprehensive variant that distinguishes more generically between ISBN-10 and ISBN-13. This demonstrates that there are many ways to Rome. It also aims to show less experienced developers different approaches and encourage them to make their own adaptations. To simplify understanding, the source code has been enriched with comments. PHP, as an untyped language, eliminates the need to convert strings to numbers. Instead, a RegEx function is used to ensure that the entered characters are type-safe.

Lessons Learned

As you can see, verifying whether an ISBN is correct isn’t rocket science. The topic of validating user input is, of course, much broader. Other examples include credit card numbers. But regular expressions also provide valuable services in this context.

Ressourcen

  • [1] Ralph-Hardo Schulz, Codierungstheorie: Eine Einführung, 2003, ISBN 978-3-528-16419-5
  • [2] Concept of modular aritmetic on Wikipedia, https://en.wikipedia.org/wiki/Modular_arithmetic

Index & Abbreviations

[A]

[B]

[C]

[D]

[E]

[F]

[G]

[H]

[I]

[J]

[K]

[L]

[M]

[N]

[O]

[P]

[Q]

[R]

[S]

[T]

[U]

[V]

[W]

[Y]

[Z]

[X]

Zurück zum Inhaltsverzeichniss: Apache Maven Master Class

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

[A]

[B]

[C]

[D]

[E]

[F]

[G]

[H]

[I]

[J]

[K]

[L]

[M]

[N]

[O]

[P]

[Q]

[R]

[S]

[T]

[U]

[V]

[W]

[Y]

[Z]

[X]

Zurück zum Inhaltsverzeichniss: Apache Maven Master Class

Cook Book: Maven Source Code Samples

Our Git repository contains an extensive collection of various code examples for Apache Maven projects. Everything is clearly organized by topic.

Back to table of contents: Apache Maven Master Class

  1. Token Replacement
  2. Compiler Warnings
  3. Excecutable JAR Files
  4. Enforcments
  5. Unit & Integration Testing
  6. Multi Module Project (JAR / WAR)
  7. BOM – Bill Of Materials (Dependency Management)
  8. Running ANT Tasks
  9. License Header – Plugin
  10. OWASP
  11. Profiles
  12. Maven Wrapper
  13. Shade Ueber JAR (Plugin)
  14. Java API Documantation (JavaDoc)
  15. Java Sources & Test Case packaging into JARs
  16. Docker
  17. Assemblies
  18. Maven Reporting (site)
  19. Flatten a POM
  20. GPG Signer

Apache Maven Master Class

Apache Maven (Maven for short) was first released on March 30, 2002, as an Apache Top-Level Project under the free Apache 2.0 License. This license also allows free use by companies in a commercial environment without paying license fees.

The word Maven comes from Yiddish and means something like “collector of knowledge.”

Maven is a pure command-line program developed in the Java programming language. It belongs to the category of build tools and is primarily used in Java software development projects. In the official documentation, Maven describes itself as a project management tool, as its functions extend far beyond creating (compiling) binary executable artifacts from source code. Maven can be used to generate quality analyses of program code and API documentation, to name just a few of its diverse applications.

Vorteile


  Online Course (yearly subsciption / 365 days)

Maven Master Class
m 3.47 Milli-Bitcoin

Target groups

This online course is suitable for both beginners with no prior knowledge and experienced experts. Each lesson is self-contained and can be individually selected. Extensive supplementary material explains concepts and is supported by numerous references. This allows you to use the Apache Maven Master Class course as a reference. New content is continually being added to the course. If you choose to become an Apache Maven Master Class member, you will also have full access to exclusive content.

Developer

  • Maven Basics
  • Maven on the Command Line
  • IDE Integration
  • Archetypes: Creating Project Structures
  • Test Integration (TDD & BDD) with Maven
  • Test Containers with Maven
  • Multi-Module Projects for Microservices

Build Manager / DevOps

  • Release Management with Maven
  • Deploy to Maven Central
  • Sonatype Nexus Repository Manager
  • Maven Docker Container
  • Creating Docker Images with Maven
  • Encrypted Passwords
  • Process & Build Optimization

Quality Manager

  • Maven Site – The Reporting Engine
  • Determine and evaluate test coverage
  • Static code analysis
  • Review coding style specifications

In-Person Live Training – Build Management with Apache Maven

JPoint Moscow 2023

Test Driven: from zero to hero

In the software industry, it is a common agreement that the code base has sufficient test automation. Because this is necessary for a stable DevOps process and secure refactoring. But the reality is completely different. Almost every project I joined during my career didn’t have any lines of test code. If we think about the fact that after more than 40 years, 80% of all commercial software projects fail, we should not be surprised. But this doesn’t have to be like this. In this talk, we demonstrate how easy it is to introduce, even in huge projects, a test-driven approach. The technical setup is a standard Java project with Apache Maven and JUnit 5.