Agile and domain driven design (part 3): Microservices

by simbo1905

In the last post we imagined we were a programmer on an agile digital service team who had gotten as far as writing some DDD code. It is a rich OO domain model that knows how to obey the business rules (aka enforce the invariants of the contract aggregate). The next question is where should this code run? What provides the surrounding workflow? Which screens, running where, drive this “thing”? Talking it through with the other developers in the team you decided to build and deploy a contracts microservice.

First up we need to beware “CV driven engineering” and ensure that we aren’t just suffering from “microservices envy”. Just because we have identified an aggregate and a root entity that revolves around a contract doesn’t mean that we should be deploying it as its own service. Breaking up a system too early into many independently deployable services increases complexity and overhead. Breaking out separately deployable services can, and should, be deferred for as long as possible. You really shouldn’t be trying microservices unless you are running Kubernetes and a service mesh. In our fictional example we deem that it’s appropriate to run a full-blown service for contracts. This doesn’t really make sense when you have a small team, but it is a great tradeoff when you have multiple teams working on different services who need a degree of autonomy to add features quickly.

Some likely indicators that it is okay to do this would be:

  1. contract is a well-defined bounded context that is “a thing” that looks like it can be evolved, expanded and, critically, released independently without touching other bounded contexts such as deliveries.
  2. You are working on a large platform, not a small project. A digital service that comfortably lives within a single app can and should stay in one app for operational efficiency. If you are running one scrum team there are other ways of compartmentalising your software without the costs, complexity and pitfalls of remote calls to separately deployed services. See the diagram at that last link, which explains this.
  3. There is a group of sub-processes or specialist activities going on within the contract bounded context that doesn’t need to be exposed outside of it.
  4. It looks likely that you will want to enhance the contract functionality, or expand coverage of the functionality, in a way that can happen entirely independently of the downstream delivery fulfilment of contracts.
  5. It looks likely that you may end up with multiple front-end applications that depend upon contract functionality.
  6. You have a number of touch points within the wider flow that will synchronously or asynchronously read or update contracts, but these look to be coarse-grained “business event” driven updates that the users identified during event storming, not small and chatty interactions.
  7. You have a “feature owner” assigned to the bounded context in question. So whilst you only want a single product owner for a digital service, often there are intricate bounded contexts which need a deputy to manage them on behalf of the sheriff.
  8. You are pretty sure it’s not just Conway’s law or office politics trying to enforce an artificial or historic boundary that shouldn’t be there in a bright, shiny, new and joined-up digital service.

The term “micro” in the word micro-service doesn’t mean tiny, which may imply “chatty”. It is more to imply “does one thing and does that one thing well”. Then again, “one thing” on its own may sound a little too small. A name like “independently deployable bounded context with a coarse-grained business-event-orientated API” isn’t as catchy as the name “microservice”. Most organisations practising agile use small mixed-discipline teams with fewer developers in each team than you can count on one hand. AWS famously keeps teams to a size you can feed with two American-sized pizzas. Such a team is going to be able to build and evolve one or a few microservices that cover one or a few DDD aggregates in a few sprints.

The preview of the 2018 book Microservices Patterns by Chris Richardson went as far as to say that a startup should never use microservices but should stick with a monolith. Why? Because there won’t be any stable bounded contexts that reflect stable areas of the business. Complete changes to business models and processes will crush the team under the work of reorganising remote APIs, distributed services and many databases. We will also see in a later post that sagas for business transactions are going to be way harder to completely refactor in response to business process changes than changing some business process code within a monolith. To quote Martin Fowler: “don’t even consider microservices unless you have a system that’s too complex to manage as a monolith”.

Once we are convinced it’s a good idea we are ready to define the API of our contracts service. How? Well, we did the event storming that defined the key events that update contracts. So we can pretty much straight away take a stab at the first set of commands or business-aligned events that the API should support. Start with that. We should at this point be in a good position to know the initial commands and queries we need to support other teams and a micro-frontend that uses contracts.
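As a sketch of where that might land, the event-storming output could translate into a handful of commands and queries along these lines. The names below are hypothetical, invented for illustration rather than taken from the earlier posts:

```java
// Hypothetical sketch of a contracts service API derived from event storming.
// All command, query and field names are illustrative only.
import java.time.LocalDate;
import java.util.List;
import java.util.Optional;

public interface ContractsApi {

    // Commands: coarse-grained business events identified during event storming.
    void draftContract(DraftContract command);
    void agreeContract(AgreeContract command);
    void cancelContract(CancelContract command);

    // Queries: the reads other teams and a micro-frontend are likely to need.
    Optional<ContractSummary> findByContractNumber(String contractNumber);
    List<ContractSummary> searchByCustomer(String customerNumber);

    // Flat, minimalist command documents rather than the internal domain model.
    record DraftContract(String contractNumber, String customerNumber, LocalDate startDate) {}
    record AgreeContract(String contractNumber, LocalDate agreedOn) {}
    record CancelContract(String contractNumber, String reason) {}
    record ContractSummary(String contractNumber, String customerNumber, String status) {}
}
```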

What we probably don’t want is just a CRUD interface. I would say that a purely CRUD interface is an anti-pattern: the name suggests that we are exposing our internal model to our clients to load, update and save. Rather we want a coarse-grained API that speaks the language of the business domain and the business events and intentions of the users. Sure, we do need search and load-by-id methods. When it comes to saving and updating, though, I would aim to pass commands or notifications of business events with the associated data in a fairly flat and minimalist document structure, rather than exposing the model inside the service. That way if we change our database tables we don’t have to refactor our service APIs. Happy days.
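To make that concrete, here is a minimal sketch of the difference between exposing the internal model for update and accepting a flat business-event document. All of the type and field names are invented for illustration:

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;

// Illustrative only: the field names are invented, not the real contract model.
public class ApiStyles {

    // CRUD-style anti-pattern: the client loads, mutates and saves our internal
    // model, so any change to our tables or entity graph ripples into every client.
    record DeliveryDto(String addressLine, String shippingOption) {}
    record ContractDto(String id, String customerId, String status,
                       List<DeliveryDto> deliveries, Map<String, String> internalFlags) {}
    // e.g. PUT /contracts/{id} with the whole ContractDto

    // Coarse-grained alternative: a flat, minimal document that names the business
    // event. The service applies it to the aggregate; the internal model stays hidden.
    record RecordContractAgreed(String contractNumber, LocalDate agreedOn, String agreedBy) {}
    // e.g. POST /contracts/{contractNumber}/commands/agree with a RecordContractAgreed body
}
```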

What about concepts that span bounded contexts? In the sample code a delivery is a pretty skinny record in a contract negotiation. Once a contract is agreed there is a whole host of delivery fulfilment work to be done. If I order something on Amazon then my delivery only needs to be my address and whether I chose same-day, next-day or free shipping. Once I confirm my order I suspect that Amazon checks where the stock is, matches it to a shipping supplier, and so explodes it into more detail. So a logical delivery may be a skinny composite entity in one part of their system but a whole root entity for a large aggregate within a different part of their system.

So whilst logically a delivery is “one thing” there may physically be different schemas in different databases. Why? So that they can completely and independently scale out and upgrade the shopping experience, or completely and independently scale out and upgrade the order fulfilment systems. More significantly, the business could launch a new delivery service and deploy (or roll back) that largely independently of services that are upstream of the fulfilment process. All you need at each service is the business key of an aggregate entity and whatever skinny details are needed in the subdomain covered by the service.
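As a rough sketch, assuming invented names, the same logical delivery might be a skinny value inside the contract aggregate but a full aggregate root in a downstream fulfilment service, joined only by its business key:

```java
import java.time.LocalDate;
import java.util.List;

// Invented names: the same logical delivery as seen from two different services.
public class DeliveryShapes {

    // In the contracts service a delivery is a skinny record inside the contract
    // aggregate: just a business key plus the details the contract actually needs.
    record ContractDelivery(String deliveryNumber, String addressSummary, String shippingOption) {}

    // In a downstream fulfilment service the delivery is the root entity of a
    // much larger aggregate, keyed by the same business key.
    record ParcelLeg(String carrier, String depot, LocalDate expectedOn) {}
    record FulfilmentDelivery(String deliveryNumber,   // shared business key
                              String warehouse,
                              List<ParcelLeg> legs,
                              String trackingReference) {}
}
```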

The 2018 book Microservices Patterns by Chris Richardson has a fairly complete example of a restaurant take-out ordering system. At one point it shows a food order as a “God object”: a relational-database-style object that is entangled in everything. This is then clearly contrasted with a food order as a self-contained aggregate in its own service, holding only the business keys to the other aggregates in other services. The order service caches only some key details, such as the delivery address copied from the customer service. The customer service can hold a richer model of addresses, such as allowing a customer to have multiple registered active addresses and a history of addresses. That address complexity is encapsulated in a separate service from the core food order service.
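In the same spirit, here is my own sketch of that shape (not code from the book, and the names are made up): the order aggregate holds only business keys into the other services, plus a cached copy of the delivery address taken at order time:

```java
import java.util.List;

// My own sketch in the spirit of the book's example, not code taken from it.
public class OrderAggregateSketch {

    record Money(long pence) {}
    record OrderLine(String menuItemId, int quantity, Money price) {}
    record Address(String line1, String city, String postcode) {}

    // The order holds business keys to aggregates owned by other services
    // (customer, restaurant) rather than object references to their models,
    // plus a cached copy of the delivery address captured when the order was placed.
    record FoodOrder(String orderId,
                     String customerId,        // business key into the customer service
                     String restaurantId,      // business key into the restaurant service
                     List<OrderLine> lines,
                     Address deliveryAddress,  // copied, not shared, from the customer service
                     String status) {}
}
```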

This model of autonomous domains is, confusingly, called a “shared nothing” architecture. It means that physically you don’t share the same local database between the two parts of the platform (although they might be two logical databases on the same database cluster). This implies that you have eventual consistency and need to use appropriate design patterns to keep separate parts of the system in sync. Why you would use eventual consistency I will cover in a later post about sagas.

The final part of the picture is the screens. If you have ever had the bad luck to try to upgrade a successful system whose data model is “a bunch of fields collected from a load of screens” then you truly understand why you want to do DDD. The business logic gets smeared around the system in transaction script handlers, driven by screen controllers, driven by screens, that are in constant churn. This is the road to features that need bug fixes nearly every sprint. The cost of change skyrockets and eventually the stakeholders give up on adding new features. If users have a choice they switch to a competitor.

In many systems the screens are aligned to the business workflow rather than the data model. Sure, they are a lens through which you view the data model, but the screens should be optimised to best guide the user through a journey to achieve an outcome. You should record the click flow of users through the system and do A/B testing to investigate how changing or re-ordering screens might improve the efficiency of the system. This implies that screens should be orthogonal to the data model so that they can evolve independently.

You cannot square the circle and have fast summary screens that query many aggregates in many services. Rather you need to create a separate search API with its own optimised data model for aggregated screens. This can be part of an existing service if that makes sense, or in its own service if it is logically its own domain. Management reporting or analytics is a classic example where you use a custom data store; exposing summary views to power users should follow the same pattern. Following the microservices patterns book we can publish domain events when aggregates are updated. The reporting or analytics stores can subscribe to these events to keep themselves up to date. A saga may be necessary to transactionally update a search microservice where consistency is required. Eventual consistency and sagas will be covered in a later post.
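A minimal sketch of that idea follows. No particular messaging library is assumed and the event and publisher types are invented, with an in-memory stand-in where a real system would use a message broker:

```java
import java.time.Instant;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Minimal sketch: no particular messaging library is assumed, all types are invented.
public class DomainEvents {

    // A coarse-grained domain event published whenever a contract aggregate changes.
    record ContractUpdated(String contractNumber, String status, Instant occurredAt) {}

    // A real system would publish to a message broker topic; this in-memory
    // stand-in just shows the shape of the interaction.
    static class EventPublisher {
        private final List<Consumer<ContractUpdated>> subscribers = new CopyOnWriteArrayList<>();
        void subscribe(Consumer<ContractUpdated> subscriber) { subscribers.add(subscriber); }
        void publish(ContractUpdated event) { subscribers.forEach(s -> s.accept(event)); }
    }

    public static void main(String[] args) {
        var publisher = new EventPublisher();
        // The reporting or search store keeps its own read-optimised model up to date
        // by subscribing to events rather than querying the contract database directly.
        publisher.subscribe(e ->
                System.out.println("reindexing contract " + e.contractNumber() + " as " + e.status()));
        publisher.publish(new ContractUpdated("C-1001", "AGREED", Instant.now()));
    }
}
```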

In the simplest case a bunch of screens on a user journey would typically make multiple read queries to microservices to build up a unit of work that is discharged as one or more commands. To process a command a service may need to update multiple aggregates in multiple services. The next post will go into how asynchronous messaging and eventual consistency between microservices improves uptime. Yet this precludes using global transactions. To commit or roll back a change across multiple services you need a series of local transactions applied to each service, which is known as a saga.
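As a hedged sketch of that flow, with all the service interfaces and names invented for illustration, a journey controller might read from a couple of services to build the screens and then discharge the unit of work as a single coarse-grained command:

```java
// Hypothetical sketch of a screen-flow controller: the service interfaces and
// methods are invented to illustrate "read several services, then send one command".
public class AgreeContractJourney {

    interface ContractsQueries  { ContractView findContract(String contractNumber); }
    interface CustomerQueries   { CustomerView findCustomer(String customerNumber); }
    interface ContractsCommands { void agreeContract(String contractNumber, String agreedBy); }

    record ContractView(String contractNumber, String customerNumber, String status) {}
    record CustomerView(String customerNumber, String displayName) {}

    private final ContractsQueries contracts;
    private final CustomerQueries customers;
    private final ContractsCommands commands;

    AgreeContractJourney(ContractsQueries contracts, CustomerQueries customers, ContractsCommands commands) {
        this.contracts = contracts;
        this.customers = customers;
        this.commands = commands;
    }

    // The screens read from several services to build up the unit of work...
    ContractView showReviewScreen(String contractNumber) {
        var contract = contracts.findContract(contractNumber);
        var customer = customers.findCustomer(contract.customerNumber());
        System.out.println("Reviewing " + contract.contractNumber() + " for " + customer.displayName());
        return contract;
    }

    // ...which is finally discharged as a single coarse-grained command.
    void confirm(String contractNumber, String agreedBy) {
        commands.agreeContract(contractNumber, agreedBy);
    }
}
```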