Domain Driven Design: Entities, Value Objects, Aggregates and Roots with JPA (Part 5)

by simbo1905

This is the last article in the series which discusses a sample app that does DDD using JPA. I would hesitate to recommend using JPA for DDD unless you are very familiar with JPA. Why? Some of the issues you can hit using JPA are written up on the Scabl blog on Advanced Enterprise DDD. In my own code and this blog, I explained how to dodge some of the bullets, but you need to be quite aware of the pitfalls of JPA to dodge them all. So why did I write the code?

A key point of the demo was to show a friend who is familiar with relational databases that a join table is an implementation detail that can be hidden within a root entity. The code shows that by using package-private fields, methods, and classes, and by returning unmodifiable lists (or defensive copies), we can force all business logic to go via root entities. Each root entity can then ensure that the aggregate of objects it controls does not get corrupted. Root entities enforce the invariants, that is, the business rules about the valid relationships between the entities that model the problem domain.
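To make that concrete, here is a minimal sketch of the idea rather than the demo app's actual code. The names Order, OrderLine and MAX_LINES, and the "at most three lines" rule, are assumptions invented for illustration. In a real project the root would be a public class and the child a package-private class in the same package; both are left non-public here only so the sketch fits in one file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical child entity: package-private, so code outside the package
// cannot construct or modify it directly.
final class OrderLine {
    final String productName; // package-private field
    final int quantity;       // package-private field

    OrderLine(String productName, int quantity) { // package-private constructor
        this.productName = productName;
        this.quantity = quantity;
    }
}

// Hypothetical root entity: the only way in to the aggregate.
final class Order {
    // Assumed business rule, for illustration only.
    private static final int MAX_LINES = 3;

    private final List<OrderLine> lines = new ArrayList<>();

    // Every change goes through the root, which checks the invariants first.
    public void addLine(String productName, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        if (lines.size() >= MAX_LINES) {
            throw new IllegalStateException("order already has " + MAX_LINES + " lines");
        }
        lines.add(new OrderLine(productName, quantity));
    }

    // Callers get a read-only view, so they cannot bypass addLine().
    public List<OrderLine> getLines() {
        return Collections.unmodifiableList(lines);
    }
}
```

Any attempt to call add() on the list returned by getLines() throws UnsupportedOperationException, so the only route to changing the aggregate is the method on the root that checks the rules.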

Why? Firstly, so that when we are in the thick of writing new functionality we won’t mess up the data. That isn’t really DDD; it’s plain old OO encapsulation. The real win of DDD is that by modeling in code the problem domain that the user understands, you can build high-quality functionality faster. Without an explicit rich domain library that models the business domain and enforces the invariants, you have to maintain the invariants in many places in the codebase. Inconsistencies and subtle logic bugs are then guaranteed.

A rich model that makes the business rules explicit will help you notice when a user contradicts themselves by asking for a new feature that violates the known business rules. That may be a “bug” in the user story that needs to be corrected to reflect the actual business rules. Or it may be a fundamental change in the business rules which needs to be reflected in many places. Better you find that out as soon as it comes up rather than when the system has been live for a while. Subtle inconsistencies in logic across the application can leave you trying to figure out why data in the production database got into an unexpected state. You want to enforce the state consistency logic in only one place which all functionality uses; a rich DDD library lets you do this.

Better yet, the discussions around locking and audit trails in this blog series show that by talking to the root entities a number of concerns can be managed in one place, just below a narrow public API. Once again, this isn’t really DDD; it’s plain old OO encapsulation. Such a concern might be managing a join table or maintaining an audit trail. Such features should not be directly exposed to the outside world. The logic can and should be hidden below a public API that models the problem domain. This means that we can rip out JPA, as it is isolated within the package-private repositories. We can replace it with another Java relational database mapping library such as jOOQ or MyBatis, or even plain JDBC.

We only need a trivial modification, outlined in the Scabl blog, to be able to remove JPA from the demo code. Rather than the “line item” having a “Product product” field that references a product root entity, we would have a “ProductId productId” field to model the association across two aggregates. Each ID value object would be a simple wrapper over the PK of a root entity:

public class ProductId {
    // Wraps the primary key of a Product root entity.
    private final Long productId;
    public ProductId(Long productId) {
        this.productId = productId;
    }
    public Long getProductId() {
        return productId;
    }
}

Then any public methods that load Product root entities should take a ProductId as a parameter. Why? Firstly, because we have to. If we are using JDBC, we need to have a developer explicitly run queries by PK to load root entities. Secondly, because the DDD literature considers it bad practice not to have a developer consciously choose when they load root entities. Why? I am not really sure [1].
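As a rough sketch of what that could look like (again, not the demo app's code): the line item holds only a ProductId, and a repository exposes an explicit lookup by that ID. The ProductRepository interface, its findById method, the InMemoryProductRepository stand-in and the Product fields are all hypothetical names of my own. The ProductId here is repeated with equals and hashCode added so that it can be used as a map key, and the classes are non-public only so the sketch fits in one file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Value object wrapping the PK; equals/hashCode make two IDs with the same PK equal.
final class ProductId {
    private final Long productId;

    ProductId(Long productId) {
        this.productId = Objects.requireNonNull(productId);
    }

    Long getProductId() {
        return productId;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ProductId
                && productId.equals(((ProductId) other).productId);
    }

    @Override
    public int hashCode() {
        return productId.hashCode();
    }
}

// Hypothetical root entity of the product aggregate.
final class Product {
    private final ProductId id;
    private final String name;

    Product(ProductId id, String name) {
        this.id = id;
        this.name = name;
    }

    ProductId getId() { return id; }
    String getName() { return name; }
}

// The line item refers to the other aggregate by ID only, not by object reference.
final class LineItem {
    private final ProductId productId;
    private final int quantity;

    LineItem(ProductId productId, int quantity) {
        this.productId = productId;
        this.quantity = quantity;
    }

    ProductId getProductId() { return productId; }
    int getQuantity() { return quantity; }
}

// Hypothetical repository API: loading a root is always an explicit call.
interface ProductRepository {
    Optional<Product> findById(ProductId id);
}

// In-memory stand-in; a real one might run a JDBC query by PK instead.
final class InMemoryProductRepository implements ProductRepository {
    private final Map<ProductId, Product> store = new HashMap<>();

    void save(Product product) {
        store.put(product.getId(), product);
    }

    @Override
    public Optional<Product> findById(ProductId id) {
        return Optional.ofNullable(store.get(id));
    }
}
```

With this shape, code that needs the product behind a line item has to write repository.findById(lineItem.getProductId()), so the trip to the database is visible at the call site instead of hiding behind a getter.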

What I just described was “fixing” things by avoiding a core feature of JPA: lazy loading via proxies. I agree that lazy loading leads to a lot of headaches as outlined by the excellent Scabl blog. I agree with that blog that loading aggregates by explicitly loading the root entities avoids the headaches. Yet forcing a developer to think about when they need to load data was exactly what JPA was trying to avoid. JPA introduces a lot of complexity and gotchas by trying to give the illusion that as you walk around objects they just magically load from the database. That is not something that you need; it is supposed to be a convenient feature. You can simply avoid the headaches by using a less tricky Java relational mapping framework [2].

The Scabl blog concludes by recommending Scala (for immutable objects) and document databases like MongoDB for root entities. Having used Scala, I can highly recommend it. Having worked with two production systems that went live on Mongo but then migrated to relational databases, I cannot recommend it. In the experience of those teams, the lack of transactions, lack of joins, and eventual consistency made writing sophisticated applications that don’t lose any data harder than using a relational database. Having rolled out Cassandra, I think it’s a great tool if you have a problem that needs that kind of scalability. Yet if you have a general problem, then a general tool like a relational database has a lot of capabilities. Recently PostgreSQL added jsonb support, so you can possibly have your cake and eat it by mixing documents with joins and transactions within a relational world. And the performance that AWS Aurora gets out of re-engineering MySQL for the cloud to run 64 TB databases suggests that relational databases offered as DBaaS have many more years ahead of them.

So if I had a free hand on my next commercial project what would I use? I would probably avoid JPA (and Java) and use a Scala framework that loads immutable objects from a relational database. I would still look to apply microservices and DDD design principles but would use newer tools.


Endnotes:

[1] Probably because developers don’t notice functionality that is making too many trips to the database until there is a performance problem in production. Remodeling the code to fix that can be hard. So the idea seems to be to make it very explicit when folks are loading data so that it’s obvious when it is happening. Then again you would expect professional developers to load test their apps. It is easy to do things like configure a logging JDBC proxy driver to look at the actual SQL being generated and run so that you can fix any crappy queries before you commit your code.

[2] This would seem to be a case where what is a good pattern to one set of framework designers is later highlighted as an anti-pattern once developers frequently abuse it in practice. A very strong example of this is the Microsoft C# ORM “Entity Framework”. The original Windows-only version has lazy loading as a feature. A decade later the framework was rewritten to be open source and cross-platform. As of version 1.1, “Entity Framework Core” doesn’t yet have lazy loading. It is on the roadmap, but the fact that they have a 1.1 version without lazy loading is a very strong indication that it is no longer considered a “killer feature” for an ORM.