Author Archives: Elmar Dott
Conway’s law
During my work as a Configuration Manager / DevOps for large web projects, I have watched companies disregard Conway’s Law and fail miserably. Such failure then often manifested itself in significant budget overruns and missed deadlines.
![](https://elmar-dott.com/wp-content/uploads/phone-communication.jpg)
The internal infrastructure for project collaboration was modeled exactly on the internal organizational structures, and all experience and established standards were ‘bent’ to fit the internal organization. This made the CI/CD pipelines particularly cumbersome and led to long execution times. Adjustments, too, could only be made with a lot of effort. Instead of simplifying existing processes and aligning them with established standards, excuses were made to leave everything as it was. Let’s take a look at what Conway’s Law is and why we should know it.
The American researcher and programmer Melvin E. Conway received his doctorate from Case Western Reserve University in 1961. His areas of expertise are programming languages and compiler design.
In 1967, he submitted his paper “How Do Committees Invent?” to the Harvard Business Review, where it was rejected on the grounds that his thesis was not substantiated. However, Datamation, the largest IT magazine at the time, accepted the article and published it in April 1968. The paper has since been widely cited. Its core statement is:
Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.
Conway, Melvin E. “How do Committees Invent?” 1968, Datamation, vol. 14, num. 4, pp. 28–31
When Fred Brooks cited the essay in his legendary 1975 book, The Mythical Man-Month, he called this key statement Conway’s Law. Brooks recognized the connection between Conway’s Law and management theory. In this regard, we find the following example in the article:
Because the design which occurs first is almost never the best possible, the prevailing system concept may need to change. Therefore, flexibility of organization is important to effective design.
![](https://elmar-dott.com/wp-content/uploads/Mythical_man-month.jpg)
An often-cited example of an “ideal” team size in terms of Conway’s Law is Amazon’s two-pizza rule, which states that individual project teams should have no more members than two pizzas can feed in one meeting. The most important factor to consider in team alignment, however, is the ability to work across teams and not live in silos.
Conway’s Law was not intended as a joke or Zen koan; it is a valid sociological observation. Take a look at the structures of government agencies and their digital implementation. Processes found in large corporations have likewise been replicated in software systems. Such applications are considered so cumbersome and complicated that they find little acceptance among users, who prefer to fall back on alternatives. Unfortunately, it is often impossible to simplify processes in large organizational structures for political reasons.
Among other things, there is a detailed article by Martin Fowler, who deals explicitly with software architectures and elaborates on the importance of the coupling of objects and modules. Communication among the developers plays a substantial role in obtaining the best possible results. Agile software development also took up the importance of communication and made it an essential principle. Especially when distributed teams work on a joint project, the time difference is a limiting factor in team communication, which must then be designed particularly efficiently.
In 2010, Jonny Leroy and Matt Simons coined the term Inverse Conway Maneuver in the article “Dealing with creaky legacy platforms”:
Conway’s Law … can be summarized as “Dysfunctional organizations tend to create dysfunctional applications.” To paraphrase Einstein, you can’t fix a problem from within the same mindset that created it, so it is often worth investigating whether restructuring your organization or team would prevent the new application from displaying all the same structural dysfunctions as the original. In what could be termed an “inverse Conway maneuver,” you may want to begin by breaking down silos that constrain the team’s ability to collaborate effectively.
Since the 2010s, a new architectural style has entered the software industry: so-called microservices, which are created by small agile teams. The most important criterion that distinguishes a microservice from a modular monolith is that a microservice can be seen as an independently viable module or subsystem. On the one hand, this allows the microservice to be reused in other applications. On the other hand, the functional domain is strongly encapsulated, which opens up very high flexibility for adaptations.
However, Conway’s law can be applied to many other areas and is not exclusively limited to the software industry. This is what makes the work so valuable.
The spectre of artificial intelligence
The hype surrounding artificial intelligence has been going on for several years. Currently, companies like OpenAI are causing quite a stir with freely accessible neural networks like ChatGPT. Users are fascinated by the possibilities and some intellectual figures of our time are warning humanity about artificial intelligence. So what is it about the specter of AI? In this article, I explore this question and you are invited to join me on this journey. Let’s go and follow me into the future.
In the spring of 2023, reports about the capabilities of artificial neural networks came thick and fast. This trend is continuing and, in my opinion, will not abate any time soon. In the midst of the emerging gold rush mood, however, some bad news is also doing the rounds. For example, Microsoft announced a massive investment in artificial intelligence. In the spring of 2023 this announcement was underlined by the dismissal of just under 1000 employees and gave rise to the familiar fears that accompany industrialization and automation. Things were less spectacular at Digital Ocean, which laid off its entire content creation and documentation team. Quickly, some people rightly asked whether AI would now make professions like programmer, translator, journalist, editor and so on obsolete. For now, I would like to answer this question with a no. In the medium term, however, changes will occur, as history has already taught us. Something old passes away while new things come into being. So follow me on a little historical excursion.
To do this, we first look at the various stages of industrialization, which originated in England in the second half of the 18th century. Even the meaning of the original Latin term industria, which can be translated as diligence, is extremely interesting. This leads us to Norbert Wiener and his 1964 book God & Golem, Inc. [1]. He publicly pondered whether people who create machines that in turn can create machines are gods. Intuitively, that is not a view I want to subscribe to. But let’s come back to industrialization for the time being.
The introduction of the steam engine and the use of location-independent energy sources such as coal enabled precise mass production. With cheaper automation of production by machines, manual home workplaces were displaced. In exchange, cheaper products were now available in stores. But there were also significant changes in transportation. The railroad allowed for faster, more comfortable and cheaper travel. This catapulted mankind into a globalized world. Because goods could now also travel long distances in a short time without any problems. Today, when we look back at the discussions that took place when the railroad began its triumphal march, we can only smile. After all, some intellectuals of the time argued that speeds in a train of more than 30 kilometers per hour would literally crush the human occupants. A fear that fortunately turned out to be unfounded.
While people in the first industrial revolution could no longer earn an income from working at home, they found an alternative to continue earning a living by working in a factory.
The second industrial revolution is characterized by electrification, which further increased the degree of automation. Machines became less cumbersome and more precise. But new inventions also entered daily life. Fax, telephone and radio spread information at a rapid pace. This brought us into the Information Age and accelerated not only our communication, but also our lives. We created a society that is primarily characterized by the saying “time is money”.
![](https://elmar-dott.com/wp-content/uploads/lost-places-computer-vintage.jpg)
The third industrial revolution blessed mankind with a universal machine, which determined its functionality by the programs (software) running on it. Nowadays, computers support us in a wide range of activities. Modern cash register systems do much more than just spit out the total amount of the purchase made. They log money and flow of goods and allow evaluations for optimization with the collected data. This is a new quality of automation that we have achieved in the last 200 years. With the widespread availability of artificial neural networks, we are now on our way out of this phase, which is why we are currently in the transformation to the fourth industrial revolution. How else do we as humans intend to cope with the constantly growing flood of information?
Even though Industry 4.0 focuses on the networking of machines, this is not a real revolution. The Internet is only a consequence of the previous development to enable communication between machines. We can compare this with the replacement of the steam engine by electric motors. The real innovation was in electric machines that changed our communication. This is now happening in our time through the broad field of artificial intelligence.
In the near future, we will no longer use computers the way we do now. That’s because today’s computers owe their form to the previously limited communication between humans and machines. The keyboard and mouse are actually clumsy input devices. They are slow and prone to error. Voice and gesture control via microphone and camera will replace mouse and keyboard. We will talk to our computers the way we talk to other people. This also means that today’s computer programs will become obsolete. We will no longer have to fill out tedious input masks in graphical user interfaces in order to reach our goal. Gone are the days when I type my articles. I will dictate them, and my computer will display them for me to proofread. Presumably, the profession of speech therapist will then experience a significant upswing.
There will certainly also be plenty of outcry from people who fear the disintegration of human communication. This fear is not at all unfounded. Let’s just look at the development of the German language since the turn of the millennium. It was marked by the emergence of various text messaging services and the optimization of messages by using as many abbreviations as possible. This in turn left parents with nothing but question marks on their foreheads when they tried to decipher their children’s messages. Even though the current trend is away from text messages towards audio messages, this does not mean that our language will stop changing. I myself have observed for years that many people are no longer able to express themselves correctly in writing or to extract content from written texts. In the long run, this could lead to skills such as reading and writing being unlearned. Classical print products such as books and magazines would then become obsolete as well. After all, content can also be produced as video or podcast. Our intellectual abilities will degenerate in the long run.
Since the turn of the millennium, it has become easier and easier for many people to use computers. So first the good news. It will become much easier to use computers in the future because human-machine interaction is becoming more intuitive. In the meantime, we will see more and more major Internet portals shutting down their services because their business model is no longer viable. Here’s a quick example.
As a programmer, I often use the website StackOverflow to find help with problems. The information on this website about programming issues is now so extensive that you can quickly find suitable solutions to your own concerns by searching Google and the like, without having to formulate questions yourself. So far so good. But if you now integrate a neural network like ChatGPT into your programming environment to answer all your questions, the number of visitors to StackOverflow will continuously decrease. This in turn affects the advertising revenue that makes it possible to offer the service free of charge. Initially, this will be compensated by operators of AI systems paying a flat fee for access to the StackOverflow database. However, this will not stop the dwindling number of visitors, which will lead either to a paywall that prevents free use or to the service being discontinued completely. Many offers on the Internet will encounter similar problems, which in the long term will ensure that the Internet as we know it disappears.
Let’s imagine what a future search query for the term ‘industrial revolution’ might look like. I ask my digital assistant: What do you know about the industrial revolution? Instead of searching through a seemingly endless list of thousands of entries for relevant results, I have a short explanation read to me, addressed to me personally and matched to my age and level of education. This immediately raises the question of who judges my level of education, and how.
This is a further downgrading of our abilities. Even if it is perceived as very comfortable in the first moment. If we no longer have the need to focus our attention on one specific thing over a long period of time, it will certainly be difficult for us to think up new things in the future. Our creativity will be reduced to an absolute minimum.
It will also change the way data is stored in the future. Complicated structures that are optimized and stored in databases will be the exception rather than the rule. Rather, I expect independent chunks of data that are concatenated like lists. Let’s look at this together to get a good idea of what I mean.
![](https://elmar-dott.com/wp-content/uploads/artificial-intelligence-coding.jpg)
As a starting point, let’s take Aldous Huxley’s book ‘Brave New World’ from 1932. In addition to the title, the author and the year of publication, we can add English as the language to the meta information. This is then followed by the entire contents of the book including preface and epilogue as plain ASCII text. Generic or changeable things like table of contents or copyright are not included at this stage. With such a chunk, we have defined an atomic datum that can be uniquely identified by a hash value. Since Huxley’s Brave New World was originally written in English, this datum is also an immutable source for all data derived and generated from it.
If Huxley’s work is now translated into German or Spanish, the translation is the first derivation, with a reference to the original. Books may have been translated by different translators in different epochs. The German translation by Herbert E. Herlitschka from 1933 therefore gets a different reference hash than the translation by Eva Walch published in 1978, even though both translations carry the same German title.
If audio books are now produced from the various texts, these audio books are the second derivative of the original text, since they represent an abridged version. A text is also created as an independent version before the recording. The audio track created from the abridged original text has the director as its author and refers to the speaker(s). As in theater, a text can be interpreted and staged by different people. Film adaptations can be treated in the same way.
Books, audio books and films in turn have graphics for the cover. These graphics again represent independent works, which are referenced with the corresponding version of the original.
Quotations from books can also be linked in this way. The same applies to critiques, interpretations, reviews and all kinds of other content variations that refer to an original.
However, such data blocks are not only limited to books, but can also be applied to music scores, lyrics, etc. The decisive factor is that one can start from the original as far as possible. The resulting files are optimized exclusively for software programs, since they do not contain any formatting that is visible to the human eye. Finally, the corresponding hash value about the content of the file is sufficient as file name.
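The chunk model described above can be sketched in a few lines of Java. This is only an illustration under assumptions: the class `Chunk`, its fields, the `derive` method and the choice of SHA-256 as hash function are hypothetical and not taken from an existing system.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical data chunk: meta information plus plain text, identified by a hash. */
final class Chunk {
    final String title;
    final String author;
    final int year;
    final String language;
    final String text;
    final String parentHash; // hash of the source chunk, null for an original work

    Chunk(String title, String author, int year, String language,
          String text, String parentHash) {
        this.title = title;
        this.author = author;
        this.year = year;
        this.language = language;
        this.text = text;
        this.parentHash = parentHash;
    }

    /** SHA-256 (assumed hash function) over meta information, parent reference and text. */
    String hash() {
        String payload = String.join("\n", title, author, String.valueOf(year),
                language, parentHash == null ? "" : parentHash, text);
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(payload.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 is not available", ex);
        }
    }

    /** Creates a derivation (translation, audio script, review) that references this chunk. */
    Chunk derive(String title, String author, int year, String language, String text) {
        return new Chunk(title, author, year, language, text, hash());
    }

    public static void main(String[] args) {
        Chunk original = new Chunk("Brave New World", "Aldous Huxley", 1932, "en",
                "full text of the book", null);
        Chunk german1933 = original.derive("German translation", "Herbert E. Herlitschka",
                1933, "de", "text of the first translation");
        Chunk german1978 = original.derive("German translation", "Eva Walch",
                1978, "de", "text of the second translation");
        // Both translations reference the same original but are identified by different hashes.
        System.out.println(original.hash());
        System.out.println(german1933.parentHash.equals(german1978.parentHash));
        System.out.println(german1933.hash().equals(german1978.hash()));
    }
}
```

Because the parent hash is part of the hashed payload, every derivation is tied to exactly one version of its source, and the hash can serve directly as the file name.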
This is where the vision of the future begins. As authors of our work, we can now use artificial intelligence to automatically create translations, illustrations, audio books and even animations from a book. At this point, I would like to briefly refer to the neural network DeepL [2], which already delivers impressive translations and even improves the original text if handled skillfully. Does DeepL now put translators and editors out of work? I think not! Just like us humans, artificial intelligences are not infallible. They also make mistakes. I rather think that the price for these jobs will drop dramatically in the future, because these people can now do many times more work than before, thanks to their knowledge and excellent tools. This makes the individual service considerably cheaper, but because automation allows more individual services in the same period of time, the provider can compensate for the price reduction.
If we now look at the new possibilities that are open to us, it doesn’t seem to be so problematic for us. So what are people like Elon Musk trying to warn us about?
If we now assume that the entire human knowledge will be digitized by the fourth industrial revolution and that all new knowledge will only be created in digital form, computer algorithms will be free to use suitable computing power to change these chunks of knowledge in such a way that we humans will not notice. A scenario loosely based on Orwell’s Ministry of Truth from the novel 1984. If we unlearn our abilities out of convenience, we also have few possibilities of verification.
If you think this would not be a problem, I would like to refer you to the lecture “Trust no scan” by David Kriesel [3]. What happened? In short, a construction company noticed discrepancies in copies of its construction plans. Different copies of the same original showed different numerical values. That is a fatal problem for the trades executing a construction project, for example when the bricklayer gets different dimensions than the concrete formers. The error was finally traced back to the software in Xerox scanners: its pattern-matching compression could not reliably tell similar-looking characters apart and silently swapped them.
![](https://elmar-dott.com/wp-content/uploads/question-mark.jpg)
The quote from Ted Chiang, “Think of ChatGPT as a blurry jpeg of all the text on the Web.”, should also make us think. For people who only know AI as an application, it is certainly hard to understand what this means. However, it is not as difficult as it seems at first. Due to their structure, neural networks are always only a snapshot, because the internal state of a neural network changes with every input it is trained on. It is the same with us humans. After all, we are only the sum of our experiences. If in the future more and more texts created by an AI are placed on the web unreflected, the AI will form its knowledge from its own derivations. The originals fade over time because, with ever fewer references, they lose weight. If someone flooded the Internet with topics like flat earth and lizard people, programs like ChatGPT would inevitably react to it and let this flow into their texts. These texts could then either be published on the net automatically by the AI itself or be spread by unreflective people. We have thus created a spiral that can only be broken if people have not given up their ability to exercise judgment out of convenience.
So we see that the calls for caution in dealing with AI are not unfounded. Even if I consider scenarios like the one in the 1983 movie WarGames [4] improbable, we should consider very carefully how far we want to go with AI technology. Otherwise we may end up like the sorcerer’s apprentice and find out that we can no longer master it.
References
Date vs. Boolean
When we design data models and their corresponding tables, Boolean sometimes appears as a data type. In general, such flags are not really problematic. But there may be a better solution for the data design. Let me give you a short example of what I mean.
Assume we have to design a simple domain to store articles, like a blog system or any other content management system. Besides the content of the article and the name of the author, we may need a flag which tells the system whether the article is visible to the public, something like published as a Boolean. But there is also a requirement to schedule a date on which the article is published. In most database designs I have seen, these circumstances lead to a Boolean: published and a Date: publishingDate. In my opinion this design is a bit redundant and also error prone. As a quick conclusion, I would advise you to use just Date instead of Boolean from the beginning. The scenario described above can also be transferred to many other domains.
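The idea that a single date can take over the role of the flag can be illustrated with a minimal sketch. The class `Article`, the method names and the sentinel `NOT_SCHEDULED` are assumptions made for this example only, and the epoch is just a stand-in value for "no date set".

```java
import java.util.Date;

/** Hypothetical article model: one date column replaces the Boolean flag. */
final class Article {
    /** Assumed sentinel for "not scheduled yet"; the epoch is only a stand-in value. */
    static final Date NOT_SCHEDULED = new Date(0L);

    private Date publishingDate = NOT_SCHEDULED;

    void schedule(Date when) {
        this.publishingDate = new Date(when.getTime()); // defensive copy, Date is mutable
    }

    /** The former Boolean is derived: a date is set and it has been reached. */
    boolean isPublished(Date now) {
        return !publishingDate.equals(NOT_SCHEDULED) && !publishingDate.after(now);
    }

    public static void main(String[] args) {
        Article article = new Article();
        Date now = new Date();
        System.out.println(article.isPublished(now));        // nothing scheduled yet
        article.schedule(new Date(now.getTime() + 60_000L)); // one minute in the future
        System.out.println(article.isPublished(now));
        article.schedule(now);                                // publish immediately
        System.out.println(article.isPublished(now));
    }
}
```

With this design the flag and the date can never contradict each other, because the flag is computed from the date instead of being stored next to it.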
Now that we have an idea why we should replace the Boolean with the Date data type, we will focus on the details of how to reach this goal.
Standard SQL suggests that replacing one Database Management System (DBMS) with another should not be a big issue. Unfortunately, reality is a bit different. Not all available date types, such as Timestamp, can be recommended. From experience I prefer the simple java.util.Date to avoid future problems and other surprises. The format stored in the database table looks like ‘YYYY-MM-dd HH:mm:ss.0’. Date and time are separated by a single space, and the trailing .0 holds the fractional seconds. The value itself does not carry a time zone, so the offset has to be managed separately. The standard Central European Time zone (CET) has an offset of one hour, written UTC+01:00 in the international format. To define the offset separately I got good results with java.util.TimeZone, which works perfectly together with Date.
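How Date and TimeZone work together can be sketched as follows. The class `OffsetDemo` and its helper methods are hypothetical names for this illustration; only `java.util.Date` and `java.util.TimeZone` come from the text above.

```java
import java.util.Date;
import java.util.TimeZone;

final class OffsetDemo {
    private static final int MILLIS_PER_HOUR = 3_600_000;

    /** Standard offset from UTC in whole hours, without daylight saving time. */
    static int rawOffsetHours(String zoneId) {
        return TimeZone.getTimeZone(zoneId).getRawOffset() / MILLIS_PER_HOUR;
    }

    /** Offset in whole hours that is valid at the given point in time. */
    static int offsetHoursAt(String zoneId, Date instant) {
        return TimeZone.getTimeZone(zoneId).getOffset(instant.getTime()) / MILLIS_PER_HOUR;
    }

    public static void main(String[] args) {
        Date now = new Date(); // milliseconds since 1970-01-01 00:00:00 UTC
        System.out.println("CET standard offset: " + rawOffsetHours("CET"));
        System.out.println("CET offset right now: " + offsetHoursAt("CET", now));
    }
}
```

The two methods differ on purpose: the raw offset ignores daylight saving time, while the offset at a concrete point in time includes it.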
Before we continue, I will show you a little code snippet in Java for the OR mapper Hibernate that shows how you could create those table columns.
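import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import javax.persistence.*; // jakarta.persistence.* in newer Hibernate versions
import org.hibernate.annotations.CreationTimestamp;

@Entity
@Table(name = "ARTICLE")
public class ArticleDO {

    // Primary key and content columns are omitted in this excerpt.

    @CreationTimestamp
    @Column(name = "CREATED")
    @Temporal(TemporalType.TIMESTAMP) // TIMESTAMP keeps the time of day, DATE would cut it off
    private Date created;

    @Column(name = "PUBLISHED")
    @Temporal(TemporalType.TIMESTAMP)
    private Date published;

    @Column(name = "DEFAULT_TIMEZONE")
    private String defaultTimezone;

    //Constructor
    public ArticleDO() {
        // Constraints is a project class that is not shown here.
        TimeZone.setDefault(Constraints.SYSTEM_DEFAULT_TIMEZONE);
        this.defaultTimezone = "UTC+00:00";
        try {
            // Zero value as default: the article is not published yet.
            this.published = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S")
                    .parse("0000-00-00 00:00:00.0");
        } catch (ParseException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public Date getPublished() {
        return published;
    }

    public void setPublished(Date publicationDate) {
        if (publicationDate != null) {
            this.published = publicationDate;
        } else {
            // No date given: publish immediately.
            this.published = new Date();
        }
    }
}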
-- SQL
INSERT INTO ARTICLE (CREATED, PUBLISHED, DEFAULT_TIMEZONE)
VALUES ('1984-04-01 12:00:01.0', '0000-00-00 00:00:00.0', 'UTC+00:00');
Let’s take a closer look at the listing above. First we see the @CreationTimestamp annotation. It means that when the ArticleDO object is created, the variable created is initialized with the current time. This value should never change, because an article can be created only once but changed several times. The time zone is stored in a String. In the constructor you can see how the system time zone could be grabbed, but be careful: this value should not be trusted too much. If you have a user like me who travels a lot, you will see the same system time in all the places I stay, because I usually never change it. As default time zone I define the correct String for UTC-0. I do the same for the variable published. A Date can also be created from a String, which we use to set our default zero value. The setter for published has the option to define a future date or to use the current time in case the article is published immediately. At the end of the listing I demonstrate a simple SQL import for a single record.
But do not rush too fast. We also need to pay some attention to how we deal with the UTC offset, because in huge systems I have several times observed problems that occurred because developers used only default values.
The time zone is in general part of the internationalization concept. To manage the offset adjustments correctly we can choose between different strategies. As in so many other cases, there is no clear right or wrong. Everything depends on the circumstances and necessities of your application. If a website is only used nationwide, like that of a small business, and no time-critical events are involved, everything becomes very easy. In this case it is unproblematic to let the DBMS manage the time zone settings automatically. But keep in mind that there are countries like Mexico with more than one time zone. For an international system with clients spread around the globe, it can be useful to set up every single DBMS in the cluster to UTC-0 and to manage the offset in the application and the connected clients.
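The last strategy, storing everything in UTC and applying the offset in the application, could look roughly like the following sketch. The class `ClientTimeFormatter`, the method `formatFor` and the fixed-offset zone IDs are assumptions chosen for the example.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class ClientTimeFormatter {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    /** Renders a point in time stored in UTC for a client in the given time zone. */
    static String formatFor(Date storedInstant, String clientZoneId) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone(clientZoneId));
        return format.format(storedInstant);
    }

    public static void main(String[] args) {
        Date stored = new Date(0L); // 1970-01-01 00:00:00 in UTC
        System.out.println(formatFor(stored, "UTC"));       // 1970-01-01 00:00:00
        System.out.println(formatFor(stored, "GMT+01:00")); // 1970-01-01 01:00:00
        System.out.println(formatFor(stored, "GMT-06:00")); // 1969-12-31 18:00:00
    }
}
```

The stored value never changes; only its presentation depends on the zone the client reports.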
Another issue we need to overcome is the question of how the date value of a single record should be initialized by default, because null values should be avoided. A full explanation of why returning null is not good programming style is given by books like ‘Effective Java’ and ‘Clean Code’. Dealing with NullPointerExceptions is something I don’t really need. A best practice which works well for me is a default date-time value of ‘0000-00-00 00:00:00.0’. This way I avoid unwanted publishing, and the meaning is very clear for everybody.
As you can see, there are good reasons why Boolean data types should be replaced by Date. In this little article I demonstrated how easily you can deal with Date and time zones in Java and Hibernate. It should also not be a big deal to convert this example to other programming languages and frameworks. If you have a solution of your own, feel free to leave a comment and share this article with your colleagues and friends.
Working with JSON in Java RESTful Services using Jackson
For a long time now, the JavaScript Object Notation [1] has been a lightweight standard that replaces XML for information exchange between heterogeneous systems. Both technologies, XML and JSON, close the gap of returning simple and complex data from a remote method invocation (RMI) when different programming languages are involved. Each of these technologies has its own benefits and disadvantages. A well-designed XML document is human readable but, compared to JSON, needs more payload when it is sent through the network. For almost every programming language there are plenty of implementations to deal with XML and JSON. We don’t need to reinvent the wheel and implement our own solution for handling JSON objects. But choosing the right library is not as easy as it might seem.
The most popular library for JSON in Java projects is the one already named in the title, Jackson [2], because of its huge functionality. Another important point for choosing Jackson over other libraries is that it is also used by the Jersey REST framework [3]. Before we start our journey with the Java frameworks Jersey and Jackson, I would like to share some thoughts about things I have often observed in huge projects during my professional life. For this reason I always proclaim: don’t mix different implementation libraries for the same technology. Doing so is a huge quality and security concern.
The general purpose of using JSON in RESTful applications is to transmit data between a server and a client via HTTP. To achieve that, we need to solve two challenges. First, on the server side, we need to create from a Java object a valid JSON representation which we can send to the client. We call this process serialization. On the client side, we do the second step, which is exactly the opposite of what we did on the server. We call it de-serialization when we create a valid object from a JSON String.
In this article we use Java as the programming language on both the server and the client side to deal with JSON objects. But keep in mind that REST allows you to have different programming languages on the server and on the client. Java is always a good choice to implement your business logic on the server. The client side is often written in JavaScript. PHP, .NET and other programming languages are possible as well.
In the next step we will have a look at the project architecture. All artifacts are organized in one Apache Maven multi-module project. I recommend following this structure in your own projects too. The three artifacts we create are: api, server and client.
- API: contains shared objects which are needed on the server and on the client side, like domain objects and interfaces.
- Server: producer of a RESTful service, depends on API.
- Client: consumer of the RESTful service, depends on API.
Inside these artifacts a layered architecture is applied. This means access to objects from a layer is only allowed in the direction of the underlying layers. In short: from top to bottom. The layer structure is organized by packages. Not every artifact contains every layer, only the ones which are implemented. The following picture gives a better understanding of the whole architecture.
![](https://elmar-dott.com/wp-content/uploads/SE-Architectural-Layer.png)
The first piece of code I’d like to show are the JSON dependencies we need, in the notation for Maven projects.
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>${version}</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-annotations</artifactId>
    <version>${version}</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>${version}</version>
</dependency>
Listing 1
Given the size of this article, I only focus on how the JSON object is used in RESTful applications. It is not a full workshop about RESTful (micro) services. As code base we reuse my open source GitHub project TP-ACL [4], an access control list. For our example I decided to slice the role functionality out of the whole code base.
First we need a Java object which we can serialize to a JSON String. This domain object is the class RolesDO, located in the domain layer inside the API module. The roles object contains a name, a description and a flag that indicates whether a role may be deleted.
@Entity
@Table(name = "ROLES")
public class RolesDO implements Serializable {

    private static final long serialVersionUID = 50L;

    @Id
    @Column(name = "NAME")
    private String name;

    @Column(name = "DESCRIPTION")
    private String description;

    @Column(name = "DELETEABLE")
    private boolean deleteable;

    public RolesDO() {
        this.deleteable = true;
    }

    public RolesDO(final String name) {
        this.name = name;
        this.deleteable = true;
    }

    //Getter & Setter
}
Listing 2
So far so good. As the next step we need to serialize the RolesDO in the server module as a JSON String. We do this in the RolesHbmDAO, which is stored in the implementation layer within the server module. The opposite direction, the de-serialization, is implemented in the same class. But slowly, not everything at once. Let’s first have a look at the code.
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;
// JPA imports (EntityManager, CriteriaBuilder, CriteriaQuery, Root) omitted

public class RolesDAO {

    public transient EntityManager mainEntityManagerFactory;

    // Java object -> JSON String
    public String serializeAsJson(final RolesDO role)
            throws JsonProcessingException {
        ObjectMapper mapper = new ObjectMapper();
        return mapper.writeValueAsString(role);
    }

    // JSON String -> single Java object of the given class
    public RolesDO deserializeJsonAsObject(final String json, final Class<RolesDO> object)
            throws JsonProcessingException {
        ObjectMapper mapper = new ObjectMapper();
        return mapper.readValue(json, object);
    }

    // JSON array -> list of Java objects
    public List<RolesDO> deserializeJsonAsList(final String json)
            throws JsonProcessingException {
        ObjectMapper mapper = new ObjectMapper();
        return mapper.readValue(json, new TypeReference<List<RolesDO>>() {});
    }

    public List<RolesDO> listProtectedRoles() {
        CriteriaBuilder builder = mainEntityManagerFactory.getCriteriaBuilder();
        CriteriaQuery<RolesDO> query = builder.createQuery(RolesDO.class);
        Root<RolesDO> root = query.from(RolesDO.class);
        query.where(builder.isNull(root.get("deactivated")));
        query.orderBy(builder.asc(root.get("name")));
        return mainEntityManagerFactory.createQuery(query).getResultList();
    }
}
Listing 3
The implementation is not difficult to understand, but at this point a first question may appear: why is the de-serialization in the server module and not in the client module? When the client sends JSON to the server module, we need to transform it into a real Java object. Simple as that.
Usually the Data Access Object (DAO) pattern contains all functionality for database operations. We will skip these CRUD (create, read, update and delete) functions. If you would like to know more about how the DAO pattern works, you can check my project TP-CORE [4] on GitHub. We go ahead to the REST service implemented in the class RoleService. Here we just pick out the function fetchRole().
@Service
public class RoleService {

    @Autowired
    private RolesDAO rolesDAO;

    @GET
    @Path("/{role}")
    @Produces({MediaType.APPLICATION_JSON})
    public Response fetchRole(final @PathParam("role") String roleName) {
        Response response = null;
        try {
            RolesDO role = rolesDAO.find(roleName);
            if (role != null) {
                String json = rolesDAO.serializeAsJson(role);
                response = Response.status(Response.Status.OK)
                        .type(MediaType.APPLICATION_JSON)
                        .entity(json)
                        .encoding("UTF-8")
                        .build();
            } else {
                response = Response.status(Response.Status.NOT_FOUND).build();
            }
        } catch (Exception ex) {
            LOGGER.log("ERROR CODE 500 " + ex.getMessage(), LogLevel.DEBUG);
            response = Response.status(Response.Status.INTERNAL_SERVER_ERROR).build();
        }
        return response;
    }
}
Listing 4
The big secret lies in the lines where we stick the things together. First the RolesDO is loaded by the DAO, and in the next line the DAO calls the serializeAsJson() method with the RolesDO as parameter. The result is a JSON representation of the RolesDO. If the role exists and no exception occurs, the service responds with the JSON and is ready for consumption. In case of any problem the service sends an HTTP error code instead of the JSON.
Complex services which combine single services into a process take place in the orchestration layer. At this point we can switch to the client module to learn how the JSON String gets transformed back into a Java domain object. In the client we don’t have the RolesHbmDAO with its deserializeJsonAsObject() method. And of course we also don’t want to create duplicate code, which forbids us to copy and paste the function into the client module.
As the counterpart to fetchRole() on the server side, the client uses getRole(). The purpose of both implementations is identical. The different naming helps to avoid confusion.
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

public class Role {

    private final String API_PATH
            = "/acl/" + Constraints.REST_API_VERSION + "/role";
    private WebTarget target;

    public RolesDO getRole(String role) throws JsonProcessingException {
        Response response = target
                .path(API_PATH).path(role)
                .request()
                .accept(MediaType.APPLICATION_JSON)
                .get(Response.class);
        LOGGER.log("(get) HTTP STATUS CODE: " + response.getStatus(), LogLevel.INFO);
        ObjectMapper mapper = new ObjectMapper();
        return mapper.readValue(response.readEntity(String.class), RolesDO.class);
    }
}
Listing 5
In conclusion, we have now seen that serialization and de-serialization of JSON objects with the Jackson library is not that difficult. In most cases we just need three methods:
- serialize a Java object to a JSON String
- create a Java object from a JSON String
- de-serialize a list of objects inside a JSON String to a Java object collection
I already introduced these three methods in Listing 3 for the DAO. To prevent duplicate code we should separate this functionality into its own Java class. This is known as the design pattern Wrapper [5], also known as Adapter. To reach the best flexibility I implemented the JacksonJsonTools from TP-CORE as a generic class.
package org.europa.together.application;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

public class JacksonJsonTools<T> {

    private static final long serialVersionUID = 15L;

    public String serializeAsJsonObject(final T object)
            throws JsonProcessingException {
        try {
            ObjectMapper mapper = new ObjectMapper();
            return mapper.writeValueAsString(object);
        } catch (JsonProcessingException ex) {
            // report the problem, then pass it on to the caller
            System.err.println(ex.getOriginalMessage());
            throw ex;
        }
    }

    public T deserializeJsonAsObject(final String json, final Class<T> object)
            throws JsonProcessingException {
        try {
            ObjectMapper mapper = new ObjectMapper();
            return mapper.readValue(json, object);
        } catch (JsonProcessingException ex) {
            System.err.println(ex.getOriginalMessage());
            throw ex;
        }
    }

    public List<T> deserializeJsonAsList(final String json)
            throws JsonProcessingException {
        try {
            ObjectMapper mapper = new ObjectMapper();
            // note: T is erased at runtime, so without a concrete element
            // type Jackson maps the list elements to generic Map objects
            return mapper.readValue(json, new TypeReference<List<T>>() {
            });
        } catch (JsonProcessingException ex) {
            System.err.println(ex.getOriginalMessage());
            throw ex;
        }
    }
}
Listing 6
You can find this and many more useful implementations with a very stable API in my project TP-CORE, free to use.
Resources:
- [1] https://www.json.org/json-en.html
- [2] https://github.com/FasterXML/jackson
- [4] https://github.com/ElmarDott/TP-CORE/wiki/%5BCORE-02%5D-generic-Data-Access-Object—DAO
- [5] https://en.wikipedia.org/wiki/Adapter_pattern
Preventing SQL Injections in Java with JPA and Hibernate
![](https://elmar-dott.com/wp-content/uploads/logo-DZone.png)
When we have a look at OWASP’s top 10 vulnerabilities [1], SQL injections still hold a prominent position. In this short article, we discuss several options for avoiding SQL injections.
When applications have to deal with databases, there are always high security concerns. If an attacker gets the possibility to hijack the database layer of your application, he can choose between several options. Stealing the data of the stored users to flood them with spam is not the worst scenario that could happen. Even more problematic would be the abuse of stored payment information. Another possibility of an SQL injection attack is to gain illegal access to restricted paid content and/or services. As we can see, there are many reasons to care about (web) application security.
To find effective preventions against SQL injections, we first need to understand how an SQL injection attack works and which points need our attention. In short: every user interaction that passes its input unfiltered into an SQL query is a possible target for an attack. The input can be manipulated in such a manner that the submitted SQL query contains a different logic than the original. Listing 1 gives you a good idea of what is possible.
SELECT Username, Password, Role FROM Users
  WHERE Username = 'John Doe' AND Password = 'S3cr3t';
SELECT Username, Password, Role FROM Users
  WHERE Username = 'John Doe'; --' AND Password='S3cr3t';
Listing 1: Simple SQL Injection
The first statement in Listing 1 shows the original query. If the input for the variables Username and Password is not filtered, we have a security gap. The second query injects for the variable Username a String with the username John Doe, extended by the characters '; --. This statement bypasses the AND branch and, in this case, gives access to the login. The '; sequence closes the WHERE clause, and with -- all following characters are commented out. Theoretically, it is possible to execute any valid SQL code between both character sequences.
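How such a query comes into existence can be sketched in a few lines of Java. The class NaiveLoginQuery and its build() method are hypothetical and only illustrate the vulnerable pattern of gluing user input into the SQL text; they are not part of any real project.

```java
class NaiveLoginQuery {

    // Hypothetical helper: builds the login query by plain string concatenation.
    // This is the vulnerable pattern and is shown for illustration only.
    static String build(String username, String password) {
        return "SELECT Username, Password, Role FROM Users"
                + " WHERE Username = '" + username + "'"
                + " AND Password = '" + password + "';";
    }

    public static void main(String[] args) {
        // regular input produces the intended query
        System.out.println(build("John Doe", "S3cr3t"));
        // manipulated input closes the statement early and comments out the rest
        System.out.println(build("John Doe'; --", "anything"));
    }
}
```

With the manipulated user name, the password check ends up behind the comment marker, so the database never evaluates it.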
Of course, my plan is not to spread ideas about which SQL commands could cause the worst consequences for the victim. With this simple example, I assume the message is clear. We need to protect each UI input variable in our application against user manipulation, even if it is not used directly for database queries. To detect those variables, it is always a good idea to validate all existing input forms. But modern applications mostly have more than just a few input forms. For this reason, I also recommend keeping an eye on your REST endpoints. Often their parameters are also connected with SQL queries.
For this reason, input validation, in general, should be part of the security concept. Annotations from the Bean Validation [2] specification are very powerful for this purpose. For example, @NotNull, as an annotation for a data field in the domain object, ensures that the object can only be persisted if the variable is not null. To use the Bean Validation annotations in your Java project, you just need to include a small library.
<dependency>
    <groupId>org.hibernate.validator</groupId>
    <artifactId>hibernate-validator</artifactId>
    <version>${version}</version>
</dependency>
Listing 2: Maven Dependency for Bean Validation
Perhaps it is necessary to validate more complex data structures. With regular expressions, you have another powerful tool in your hands. But be careful: it is not that easy to write a correctly working RegEx. Let’s have a look at a short example.
public static final String RGB_COLOR = "#[0-9a-fA-F]{3}([0-9a-fA-F]{3})?";

public boolean validate(String content, String regEx) {
    // matches() tests the whole input against the pattern
    return content.matches(regEx);
}

validate("#000", RGB_COLOR);
Listing 3: Validation by Regular Expression in Java
The RegEx to detect a correct RGB color value is quite simple. Valid inputs are #ffF or #000000. The range for the characters is 0-9 and the letters A to F, case insensitive. When you develop your own RegEx, you always need to check the existing boundaries very well. A good example is the 24-hour time format. Typical mistakes are patterns that accept invalid entries like 23:60 or 24:00. The validate method compares the input string with the RegEx. If the pattern matches the input, the method returns true. If you want to get more ideas about validators in Java, you can also check my GitHub repository [3].
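The boundary problem of the 24-hour format can be made concrete with a small sketch. The class TimeValidator and its pattern are my own illustration and assume the strict HH:mm notation with two digits for the hour.

```java
import java.util.regex.Pattern;

class TimeValidator {

    // Hypothetical pattern: hours 00-23, minutes 00-59, always two digits each.
    static final Pattern TIME_24H = Pattern.compile("([01][0-9]|2[0-3]):[0-5][0-9]");

    static boolean isValid(String input) {
        // matches() requires the whole input to fit the pattern
        return input != null && TIME_24H.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"00:00", "23:59", "23:60", "24:00"}) {
            System.out.println(candidate + " valid: " + isValid(candidate));
        }
    }
}
```

The hour part is split into two alternatives on purpose: a single range like [0-2][0-9] would silently accept 24 to 29.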
In summary, our first idea to secure user input against abuse is to filter out all problematic character sequences, like -- and so on. Well, this intention of creating a blocking list is not that bad, but it still has some limitations. First, the complexity of the application increases, because blocking characters like --, ; and ' can sometimes cause unwanted side effects. Also, an application-wide default limitation of the characters can sometimes cause problems. Imagine there is a text area for a blog system or something similar.
This means we need another powerful concept to filter the input in such a manner that it cannot manipulate our SQL query. To reach this goal, the SQL standard has a great solution we can use. SQL parameters are variables inside an SQL query that are interpreted as content and not as a statement. This allows even large texts that contain dangerous characters to be handled safely. Let’s have a look at how this works on a PostgreSQL [4] database.
-- find_user is an illustrative statement name
PREPARE find_user (text) AS
  SELECT * FROM login WHERE name = $1;
EXECUTE find_user('John Doe');
Listing 4: Defining Parameters in PostgreSQL
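From plain Java, the same idea is usually expressed with a JDBC PreparedStatement. The following is only a minimal sketch: the class LoginLookup is hypothetical, the table login and the column name are taken from Listing 4, and the database connection is assumed to be provided by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class LoginLookup {

    // The ? placeholder is bound as data and is never parsed as SQL.
    static final String SQL = "SELECT name FROM login WHERE name = ?";

    // Returns true if a row with the given name exists.
    static boolean exists(Connection connection, String userInput) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            statement.setString(1, userInput); // bind the user input to the first parameter
            try (ResultSet result = statement.executeQuery()) {
                return result.next();
            }
        }
    }
}
```

Even an input like John Doe'; -- is then compared as a literal name and cannot change the logic of the query.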
If you are using the O/R mapper Hibernate, there is a more elegant way with the Java Persistence API (JPA).
String myUserInput;

@PersistenceContext
public EntityManager mainEntityManagerFactory;

CriteriaBuilder builder =
        mainEntityManagerFactory.getCriteriaBuilder();
CriteriaQuery<DomainObject> query =
        builder.createQuery(DomainObject.class);
// create Criteria
Root<DomainObject> root =
        query.from(DomainObject.class);
// Criteria SQL Parameters
ParameterExpression<String> paramKey =
        builder.parameter(String.class);
query.where(builder.equal(root.get("name"), paramKey));
// wire queries together with parameters
TypedQuery<DomainObject> result =
        mainEntityManagerFactory.createQuery(query);
result.setParameter(paramKey, myUserInput);
DomainObject entry = result.getSingleResult();
Listing 5: Hibernate JPA SQL Parameter Usage
Listing 5 shows a full example of Hibernate using JPA with the Criteria API. The variable for the user input is declared in the first line. The comments in the listing explain how it works. As you can see, this is no rocket science. The solution has some other nice benefits besides improving web application security. First of all, no plain SQL is used. This ensures that each database management system supported by Hibernate can be secured by this code.
The usage may look a bit more complex than a simple query, but the benefit for your application is enormous. Of course, there are some extra lines of code, but they are not that difficult to understand.
Resources
Working with textfiles on the Linux shell
![](https://elmar-dott.com/wp-content/uploads/logo-tux-linux.png)
Linux is becoming more and more popular as an operating system for IT professionals. One of the reasons for this movement is its server solutions. Stability and low resource consumption are some of the important characteristics behind this choice. If you have already played around with a Microsoft server, you will miss the graphical desktop on a Linux server. After logging in to a Linux server, you just see the command prompt waiting for your input.
In this short article I introduce some helpful Linux programs to work with files on the command line. This allows you to gather information, for example from log files. Before I start, I’d like to recommend a simple and powerful editor named joe.
Ctrl + C – Abort the current editing of a file without saving changes
Ctrl + KX – Exit the current editing and save the file
Ctrl + KF – Find text in the current file
Ctrl + V – Paste clipboard into document (CMD + V for Mac)
Ctrl + Y – Delete current line where cursor is
To install joe on a Debian-based Linux distribution you just need to type:
sudo apt-get install joe
1. When you need to find content in a huge text file, GREP will be your best friend. GREP allows you to search for text patterns in files.
grep <pattern> file.log
-n : show the line numbers of matches
-i : case insensitive
-v : invert matches
-E : extended regex
-c : count the matching lines
-l : list the names of files that match the pattern
2. When you need to analyze network packets, NGREP is the tool of your choice.
ngrep -I file.pcap
-d : specify the network interface
-i : case insensitive
-x : print in alternate hexdump
-t : print timestamp
-I : read a pcap file
3. When you need to see the changes between two versions of a file, DIFF will do the job.
diff version1.txt version2.txt
a : add
c : change
d : delete
# : line numbers
< : file 1
> : file 2
4. Sometimes it is necessary to bring the entries of a file into order. SORT is going to help you with this task.
sort file.log
-o : write the result to a file
-r : reverse order
-n : numerical sort
-k : sort by column
-c : check if ordered
-u : sort and remove duplicates
-f : ignore case
-h : human-readable numbers (e.g. 2K, 1G)
5. If you have to replace Strings inside a huge text, like find and replace, you can do that with SED, the stream editor.
sed 's/regex/replace/g' file.log
s : substitute (search and replace)
g : replace all matches in a line
d : delete
w : write to file
-e : execute command
-n : suppress output
6. Parsing fields using delimiters in text files can be done with CUT.
cut -d ":" -f 2 file.log
-d : use the field delimiter
-f : field numbers
-c : specific character positions
7. To filter out repeated lines in a text file, use UNIQ. It only compares adjacent lines, so sort the file first.
uniq file.txt
-c : count the number of occurrences
-d : print duplicates
-i : case insensitive
8. AWK is a programming language designed to manipulate data.
awk '{print $2}' file.log
A brief overview of Java frameworks
When you look up the word framework in Merriam-Webster, you find the following explanations:
- a basic conceptional structure
- a skeletal, openwork, or structural frame
You might think that libraries and frameworks are the same thing. But this is not correct. Your source code calls the functionality of a library directly. When you use a framework, it is exactly the opposite: the framework calls specific functions of your business logic. This concept is also known as Inversion of Control (IoC).
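The difference in call direction can be shown with a tiny sketch. MiniFramework is a made-up class for this illustration and does not belong to any real framework; it only demonstrates that the application registers its logic and the framework decides when to call it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class MiniFramework {

    private final List<Function<String, String>> handlers = new ArrayList<>();

    // Application code registers its business logic with the framework.
    void register(Function<String, String> handler) {
        handlers.add(handler);
    }

    // The framework decides when the registered logic is called.
    List<String> dispatch(String input) {
        List<String> results = new ArrayList<>();
        for (Function<String, String> handler : handlers) {
            results.add(handler.apply(input));
        }
        return results;
    }
}

class IocDemo {
    public static void main(String[] args) {
        // Library style: our code calls the library directly.
        String upper = "hello".toUpperCase();
        System.out.println(upper);

        // Framework style: we hand over our function and get called back.
        MiniFramework framework = new MiniFramework();
        framework.register(text -> "handled: " + text);
        System.out.println(framework.dispatch("request"));
    }
}
```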
For web applications we can distinguish between client-side and server-side frameworks. The difference is that the client usually runs in a web browser, which means the available programming languages are limited to JavaScript. Depending on the web server, we are able to choose between different programming languages. The most popular languages for the internet are PHP and Java. All web languages have one thing in common: they produce HTML as output, which can be displayed in a web browser.
In this article I give a short overview of the most common Java frameworks, which can also be used in desktop applications. If you wish to have a quick introduction to Java server applications, you can check out my article about Java EE and Jakarta.
If you plan to use one or some of the discussed frameworks in your Java application, you just need to include them as a Maven or Gradle dependency.
Before I continue, I wish to tell you that these frameworks are made to help you solve problems in your daily business as a developer. Every problem has multiple solutions. For this reason it is more important to learn the concepts behind the frameworks than just how to use a particular framework. During the last two decades in which I have been programming, I saw the rise and fall of plenty of frameworks. Examples of frameworks that almost nobody remembers today are Google Web Toolkit and JBoss Seam.
The most used framework in Java for writing and executing unit tests is JUnit. An often used alternative to JUnit is TestNG. Both solutions work in much the same way. The basic idea is to execute a function with defined parameters and compare the output with an expected result. When the output fits the expectation, the test passes. JUnit and TestNG support the Test Driven Development (TDD) paradigm.
If you need to emulate in your test case the behavior of an external system that is not available at the moment your tests are running, then Mockito is your best friend. Mockito works perfectly together with JUnit and TestNG.
Behavior Driven Development (BDD) is an evolution of unit tests in which you define the circumstances under which the customer will accept the integrated functionality. BDD integration tests of this category are called acceptance tests. Integration tests are not a replacement for unit tests, they are an extension to them. The frameworks JGiven and Cucumber are also very similar; both are, like Mockito, an extension for the unit test frameworks JUnit and TestNG.
For dealing with relational databases in Java we can choose between several persistence frameworks. Those frameworks allow you to define your database structure in Java objects without writing a single line of SQL. The mapping between Java objects and database tables happens in the background. Another great benefit of using an O/R mapper like Hibernate, iBatis or EclipseLink is the possibility to replace the underlying database server. But this achievement is not as easy to reach as it seems in the beginning.
In the next section I introduce a technique that was first introduced by the Spring Framework: Dependency Injection (DI). It allows loose coupling between modules and an easier replacement of components without a new compile. The solution from Google for DI is called Guice, and Java Enterprise brings its own standard named CDI.
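The core idea of DI works even without a framework. The following sketch uses plain constructor injection; the names MessageSender, EmailSender and Notifier are invented for this example. Containers like Spring, Guice or CDI automate exactly this wiring step.

```java
interface MessageSender {
    String send(String text);
}

class EmailSender implements MessageSender {
    @Override
    public String send(String text) {
        return "email: " + text;
    }
}

class Notifier {

    private final MessageSender sender;

    // The dependency is handed in from outside instead of being created here.
    Notifier(MessageSender sender) {
        this.sender = sender;
    }

    String notifyUser(String text) {
        return sender.send(text);
    }
}

class DiDemo {
    public static void main(String[] args) {
        // production wiring
        System.out.println(new Notifier(new EmailSender()).notifyUser("welcome"));
        // replacement, e.g. for a test, without touching Notifier
        System.out.println(new Notifier(text -> "test: " + text).notifyUser("welcome"));
    }
}
```

Because Notifier only knows the interface, the sender can be exchanged without changing or recompiling Notifier itself.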
Graphical User Interfaces (GUI) are another category of frameworks. Which framework is useful depends on the chosen technology, like JavaFX or JSF. Most of the provided controls are the same. Common libraries for JSF GUI components are PrimeFaces, BootsFaces or ButterFaces. OmniFaces is a framework that offers standardized solutions for JSF problems, like caching and so on. Collections of JavaFX controls can be found in ControlsFX and BootstrapFX.
If you have to deal with Event Stream Processing (ESP), you should have a look at Hazelcast or Apache Kafka. ESP means that the system reacts to constantly generated data. The event refers to each data point, which can be persisted in a database, and the stream represents the output of the events.
In December 2021 an often used technology came out of the shadows because of a critical vulnerability in Log4J. Log4J together with the Simple Logging Facade for Java (SLF4J) is one of the most used dependencies in the software industry, so you can imagine how critical this information was. It also shows what an important role logging plays in software development. Another logging framework is Logback, which I use.
Another very helpful dependency for professional software development is FF4J. It allows you to define feature toggles, also known as feature flags, to enable and disable functionality of a software program by configuration.
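To illustrate the concept behind feature flags, here is a minimal sketch in plain Java. It deliberately does not use the FF4J API; the classes FeatureToggles and Checkout and the flag name new-checkout are assumptions made up for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FeatureToggles {

    private final Map<String, Boolean> flags = new ConcurrentHashMap<>();

    // The initial state comes from configuration, not from code.
    FeatureToggles(Map<String, Boolean> configuration) {
        flags.putAll(configuration);
    }

    // Unknown features are treated as disabled.
    boolean isEnabled(String feature) {
        return flags.getOrDefault(feature, false);
    }

    void set(String feature, boolean enabled) {
        flags.put(feature, enabled);
    }
}

class Checkout {
    // Both code paths are deployed; the flag selects one at runtime.
    static String render(FeatureToggles toggles) {
        return toggles.isEnabled("new-checkout") ? "new checkout" : "classic checkout";
    }
}
```

A real feature flag library adds things like persistent storage and an administration interface on top of this basic switch.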
| Framework | Category |
|---|---|
| JUnit, TestNG | TDD – unit testing |
| Mockito | TDD – mocking objects |
| JGiven, Cucumber | BDD – acceptance testing |
| Hibernate, iBatis, EclipseLink | JPA – O/R mapper |
| Spring Framework, Google Guice | Dependency Injection |
| PrimeFaces, BootsFaces, ButterFaces | JSF user interfaces |
| ControlsFX, BootstrapFX | JavaFX user interfaces |
| Hazelcast, Apache Kafka | Event Stream Processing |
| SLF4J, Logback, Log4J | Logging |
| FF4J | Feature flags |
This list could be much longer. I just tried to focus on the most used ones that are relevant for Java programmers. Feel free to leave a comment to suggest something I may have forgotten. I would appreciate it if you share this article with your friends or colleagues on social media.