Every developer has a more or less pronounced affinity for databases. In my experience, many developers view the database as a necessary evil that is somewhat cumbersome to refactor. Often tools are used that manage the database structure for the developers (e.g., Liquibase or Flyway in the JVM world). Tools and libraries for object-relational mapping make it very easy to persist objects. A few annotations later, the domain model is saved in the database.
All these tools distance typical developers, who “only” want to write their code, from the database. One consequence is that the database sometimes receives little attention during development. For instance, a missing index will slow down searches on the database. This does not show up in a typical test, which does not work with large amounts of data, and so the problem goes unnoticed into production.
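The effect of a missing index can be made visible without large amounts of data by asking the database for its query plan. The following is a minimal sketch using Python's built-in sqlite3 module; the table layout, the index name idx_users_last_name, and the helper query_plan are assumptions made for illustration and do not come from a real system.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each row holds the human-readable plan step.
    return " | ".join(row[3] for row in rows)


# Hypothetical user table, loosely following the shoe shop example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, shoe_size INTEGER)"
)
conn.executemany(
    "INSERT INTO users (first_name, last_name, shoe_size) VALUES (?, ?, ?)",
    [("User%d" % i, "Name%d" % i, 36 + i % 12) for i in range(1000)],
)

lookup = "SELECT id, first_name FROM users WHERE last_name = ?"

# Without an index, the lookup has to scan the whole table.
plan_without_index = query_plan(conn, lookup, ("Name42",))

# With an index on the searched column, the plan becomes an index search.
conn.execute("CREATE INDEX idx_users_last_name ON users (last_name)")
plan_with_index = query_plan(conn, lookup, ("Name42",))

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan and the second a search that uses the index. A check of this kind can run in an ordinary test, because it inspects the plan rather than measuring time.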
Let’s take the fictional case of an online shoe shop. The company requires a service that enables users to log in. A user service is created that contains the typical fields such as ID, first name, last name, address, and password. To offer users shoes that fit, only a selection of shoes in their actual size is supposed to be displayed. The size is entered on the welcome screen. What could be more sensible than to store this data in the already existing user service? Everybody is sure this is the right decision: this is user-related data, and the user service is the right place for it.
Now the shoe shop expands and starts to sell additional types of clothing. Dress size, collar size, and all other related data are now also stored in the user service.
Several teams work in the company, and the code becomes progressively more complex. This is the point at which the monolith is split into domain-based services. The refactoring of the source code works well, and soon the monolith is split into many microservices.
Unfortunately, it turns out that it is still not easy to introduce changes. The team in charge of shoes wants to accept different currencies because of international expansion and has to modify the structure of the billing data to include the address format. During the upgrade the database is locked. In the meantime, no dress size or favorite color can be changed. Moreover, the address data is used in various standard forms of other services and thus cannot be changed without coordination and effort. Therefore, the feature cannot be implemented promptly.
Even though the code is well separated, the teams are indirectly coupled via the database. Renaming columns in the user service database is nearly impossible because nobody knows in detail anymore who is using which columns. Consequently, the teams resort to workarounds. Either fields with names like ‘Userattribute1’ are created, which are then mapped onto the right meaning in the code, or delimiters are introduced into the data, as in ‘#Color: Blue#Size:10’. Nobody except the team involved knows what ‘Userattribute1’ means, and it is difficult to create an index on ‘#Color: #Size’. Database structure and code become progressively harder to read and maintain.
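To illustrate why the delimiter workaround is costly, here is a small sketch, again with Python's sqlite3 module. The column name userattribute1, the dedicated shoe_size column used for comparison, and the helper functions are hypothetical. The sketch shows two things: the meaning of the packed string exists only in the parsing code, and a search inside the packed string cannot narrow down rows through an index the way a lookup on a dedicated column can.

```python
import sqlite3


def parse_packed_attributes(packed):
    """Turn '#Color: Blue#Size:10' into {'Color': 'Blue', 'Size': '10'}."""
    attributes = {}
    for part in packed.split("#"):
        if not part.strip():
            continue  # skip the empty piece before the first '#'
        key, _, value = part.partition(":")
        attributes[key.strip()] = value.strip()
    return attributes


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, "
    "userattribute1 TEXT, "  # generic catch-all column from the workaround
    "shoe_size INTEGER)"     # hypothetical dedicated column for comparison
)
conn.execute("CREATE INDEX idx_attr ON users (userattribute1)")
conn.execute("CREATE INDEX idx_size ON users (shoe_size)")
conn.execute(
    "INSERT INTO users (userattribute1, shoe_size) VALUES (?, ?)",
    ("#Color: Blue#Size:10", 10),
)

# Only code that knows the private convention can interpret the value.
parsed = parse_packed_attributes("#Color: Blue#Size:10")

# Searching inside the packed string needs a pattern with a leading
# wildcard, so every row has to be inspected despite the index.
packed_plan = query_plan(
    conn, "SELECT id FROM users WHERE userattribute1 LIKE ?", ("%#Size:10%",)
)

# The same question against a dedicated column is an index search.
column_plan = query_plan(
    conn, "SELECT id FROM users WHERE shoe_size = ?", (10,)
)

print(parsed)
print(packed_plan)
print(column_plan)
```

Note also that the pattern for size 10 would match a packed value containing ‘#Size:100’, so the workaround invites subtle matching errors in addition to slow queries.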
Every software developer has to think about how data is persisted: not only about the database structures but also about where which data is stored. Is this table, or this database, the place where the data should be located? From a business domain perspective, does this data have connections to other data? In order to remain flexible in the long term, it is worthwhile to consider these questions carefully every time. Typically, databases and tables are not created very often. However, they are a component that is very hard to modify later. Besides, databases and tables are often the origin of hidden dependencies between services. As a general rule, data should be accessed directly in the database by exactly one service. All other services that want to use the data may access it only via the public interface of that service.
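The rule of a single owner per data store can be sketched as follows. This is an illustration under simplifying assumptions, not a reference design: the class names UserService and ShoeCatalogService and their methods are hypothetical, and an in-process method call stands in for what would normally be a network interface between services.

```python
import sqlite3


class UserService:
    """Hypothetical owner of the user data; only this class touches its tables."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE users ("
            "id INTEGER PRIMARY KEY, last_name TEXT, shoe_size INTEGER)"
        )

    def register(self, last_name, shoe_size):
        """Store a new user and return the generated ID."""
        cursor = self._conn.execute(
            "INSERT INTO users (last_name, shoe_size) VALUES (?, ?)",
            (last_name, shoe_size),
        )
        self._conn.commit()
        return cursor.lastrowid

    def get_shoe_size(self, user_id):
        """Public interface: return the user's shoe size, or None if unknown."""
        row = self._conn.execute(
            "SELECT shoe_size FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return None if row is None else row[0]


class ShoeCatalogService:
    """Hypothetical consumer; it never sees the user table or its columns."""

    def __init__(self, user_service, catalog):
        self._users = user_service
        self._catalog = catalog  # list of (model, size) pairs

    def shoes_for(self, user_id):
        """Return the models available in the user's size."""
        size = self._users.get_shoe_size(user_id)
        return [model for model, model_size in self._catalog if model_size == size]


users = UserService()
user_id = users.register("Miller", 42)
catalog = ShoeCatalogService(
    users, [("Runner", 42), ("Loafer", 44), ("Boot", 42)]
)
matching = catalog.shoes_for(user_id)
print(matching)  # ['Runner', 'Boot']
```

Because the catalog depends only on get_shoe_size, the team owning the user service can rename or restructure the shoe_size column and adjust a single query, without coordinating with other teams.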