toString(), equals() and hashCode() in JPA entities

Many users have generated toString(), equals() and hashCode() methods in their JPA entities.
But most of the time they underestimate the impact that can have.

This blog post is inspired by a chat I had with Gavin King and Vlad Mihalcea.

Preface: I’d like to emphasize that I put a big focus on keeping customer code portable across different JPA vendors. Some ‘Uber trick’ might work in one JPA vendor and totally mess up the others. Each JPA provider is broken in its own very special way. Trust me, I know what I am talking about from both a user and a vendor perspective… The stuff I show here is the least common denominator for JBoss Hibernate, EclipseLink and Apache OpenJPA. Please shout out if you think some of the shown code does not work on one of those JPA containers.

toString()

What’s wrong with most toString() methods in entities?
Well, most of the time developers just use the ‘generate toString’ shortcut of their IDE to create this method. And that means that the generated toString() method usually just reads all the attributes of your entity and prints them.

What happens when you touch an attribute depends to a high degree on which ‘mode’ your JPA provider runs in. In Hibernate you often have the pure class. In that case not much will happen as long as you only read attributes which are not Collections etc. By ‘reading attributes’ I mean this.fieldname and not getters like this.getFieldname(). That is simply because Hibernate does not support lazy loading for any other fields in that mode. However, if you touch a @OneToMany or an @ElementCollection field then you will force lazy loading the first time toString() gets invoked. It might also behave differently if you use the getters instead of reading the attributes.

And if you use EclipseLink, Apache OpenJPA or even Hibernate in byte-code weaving mode, or if you get a javassist proxy from Hibernate (e.g. from em.getReference()), then you are in even deeper trouble, because in that case touching the attributes might trigger lazy loading for any other field as well.
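To make the lazy-loading hazard concrete, here is a minimal plain-Java sketch without any JPA provider involved. LazyList is a hypothetical stand-in for a provider's lazy collection wrapper: where a real provider would run a SELECT, it merely bumps a counter. Customer and its fields are made up for illustration as well.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a provider's lazy collection wrapper.
// A real provider would hit the database where this class bumps a counter.
class LazyList<T> extends AbstractList<T> {
    private final List<T> rows;
    private boolean loaded = false;
    int loadCount = 0;

    LazyList(List<T> rows) { this.rows = new ArrayList<>(rows); }

    private List<T> load() {
        if (!loaded) { loaded = true; loadCount++; }
        return rows;
    }

    @Override public T get(int index) { return load().get(index); }
    @Override public int size() { return load().size(); }
}

class Customer {
    Long id = 42L;
    String name = "Jane";
    LazyList<String> orders = new LazyList<>(Arrays.asList("order-1", "order-2"));

    // Roughly what an IDE generates: every field, including the collection.
    String generatedToString() {
        return "Customer{id=" + id + ", name=" + name + ", orders=" + orders + "}";
    }

    // The id-only variant never touches the lazy collection.
    @Override public String toString() {
        return getClass().getSimpleName() + "-" + id;
    }
}
```

Calling toString() leaves loadCount at 0, while a single call to generatedToString() iterates the collection and pushes it to 1. With a real provider that one log statement would have cost a database round trip.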

I tried to explain how the enhancement or ‘weaving’ works in JPA in a blog post many years ago: https://struberg.wordpress.com/2012/01/08/jpa-enhancement-done-right/ Parts of it might work a tad differently nowadays, but the basic approach should still be the same.

Note that OpenJPA will generate a toString() method for you if the entity class doesn’t have one. In that case we will print the name of the entity and the primary key. And since we know the state of the _loaded fields we will also not force generating a new PK if the entity didn’t already load one from the sequence.
According to Gavin and Vlad, Hibernate doesn’t generate any toString(). I have no clue whether EclipseLink does.

For JPA implementations other than Apache OpenJPA I suggest you provide a toString() which looks like the following:

public String toString() {
    return this.getClass().getSimpleName() + "-" + getId();
}

And not a single attribute more.

equals() and hashCode()

This is where Vlad, Gavin and I really disagree.
My personal opinion is that you should not write your own equals() or hashCode() methods for entities.

Vlad did write a blog post about equals() and hashCode() in the past: https://vladmihalcea.com/2016/06/06/how-to-implement-equals-and-hashcode-using-the-entity-identifier/

As you can see, it’s not exactly easy to write a proper equals() and hashCode() method for JPA entities. Even Vlad’s advanced version does have holes, e.g. if you use em.getReference() or em.merge().
In any case, there is one point on which Gavin, Vlad and I agree: generating equals() and hashCode() with IDEs is total bollocks for JPA entities. It’s always broken to compare *all* fields. You would simply not be able to update your database rows 😉

IF you want to write an equals() method then compare the ids, with a fallback on instance equality. And have hashCode() always return zero as shown in Vlad’s blog.
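A minimal sketch of that id-based variant, written as plain Java so it compiles without a JPA provider. Book, its fields and setId() are hypothetical; in a real entity the class would carry @Entity and the id field @Id, and the provider would assign the id. It is one possible reading of the approach, not a copy of Vlad's code.

```java
// Plain-Java sketch; in a real entity the class would carry @Entity
// and the id field @Id / @GeneratedValue.
class Book {
    private Long id;              // null until a primary key is assigned
    private final String title;

    Book(String title) { this.title = title; }

    Long getId() { return id; }
    String getTitle() { return title; }

    // Stands in for the JPA provider assigning the primary key on persist.
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;               // instance equality first
        if (!(o instanceof Book)) return false;
        Book other = (Book) o;
        // Entities without an id are only equal if they are the same instance.
        return id != null && id.equals(other.getId());
    }

    @Override
    public int hashCode() {
        // Constant on purpose: the hash must not change when the id gets assigned.
        return 0;
    }
}
```

The constant hash code keeps an entity findable in a HashSet even if it was added before it got its id. The price is that every entity lands in the same hash bucket, so large sets degrade towards linear lookups.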

Another way is to generate a UUID in the constructor or the getId() method. But this is pretty performance-intensive and also not very nice to handle on the DB side (large Strings as PK consume a lot more storage in the indexes, on disk and in memory).
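For completeness, the UUID variant could look roughly like the following sketch. Invoice is a hypothetical class, and the JPA annotations (@Entity, @Id) are left out so the snippet compiles on its own.

```java
import java.util.UUID;

// Plain-Java sketch; a real entity would carry @Entity and @Id on the id field.
class Invoice {
    // Assigned when the object is created, so equals() and hashCode() are
    // stable before and after persisting. Not final, because a JPA provider
    // has to overwrite the value when it loads an existing row.
    private String id = UUID.randomUUID().toString();

    String getId() { return id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Invoice)) return false;
        return id.equals(((Invoice) o).getId());
    }

    @Override
    public int hashCode() { return id.hashCode(); }
}
```

Note that every new Invoice() pays for a random UUID, even for instances the provider immediately overwrites while loading, which is part of the performance cost mentioned above.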

Using ‘natural IDs’ for equals()

That sounds promising. And IF you have a really good natural ID then it’s also a good thing. But most of the time you don’t.

So what makes a good naturalId? It must adhere to the following criteria:

  • it must be unique
  • it must not change

Sadly most natural IDs you think of are not unique. The social security number (SSN) in most countries? Hah, not unique! Really, there are duplicates in most countries…
Also often used in examples: the ISBN of a book. Toooo bad that those are not unique either… Sometimes the same ISBN references different books, and sometimes the same book has multiple ISBNs assigned.

What about immutability? Sometimes a customer does not have an SSN yet. Or you simply don’t know it YET. Or you only get to know it further down the application process. So the SSN is null and only gets filled in later. Or you detect a collision with another person and you have to assign one of them a new SSN (that really happens more often than you think!). There is also the case where the same physical person got multiple SSNs (which also happens more frequently than you would expect).

Many tables also simply don’t have a good natural ID. Romain Manni-Bucau came up with the example of a Blog entry. What natural ID does a blog entry have? The date? -> Not unique. The title? -> can get changed later…

Why do you need equals() and hashCode() at all?

This is a good question. And my answer is: “you don’t!”

People usually argue that it’s needed for JPA entities because of fields like:

@OneToMany 
private Set others;

A HashSet internally of course uses equals() and hashCode() but why do you need to provide a custom one? In my opinion the one you implicitly derive from Object.class is perfectly fine. It gives you instance-equality. And since per the JPA specification the EntityManager guarantees that you only get exactly one single entity instance for a row in the database you don’t need more. Doubt it? Then read the JPA specification yourself:

"An EntityManager instance is associated with a persistence context. A persistence context is a set of entity instances in which for any persistent entity identity there is a unique entity instance."

https://docs.oracle.com/javaee/7/api/javax/persistence/EntityManager.html

An exception where instance equality does not work is if you mix managed with detached entity instances. But that is something you should avoid at all costs, as my following examples show.
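The argument can be illustrated with a toy model. ToyPersistenceContext below is a hypothetical few-line imitation of the ‘one instance per entity identity’ guarantee, not how any real EntityManager is implemented, and Student is a made-up entity without custom equals() or hashCode().

```java
import java.util.HashMap;
import java.util.Map;

// Entity without custom equals()/hashCode(): inherits instance equality from Object.
class Student {
    final Long id;
    String name;

    Student(Long id, String name) { this.id = id; this.name = name; }
}

// Toy model of a persistence context: one instance per entity identity.
// Hypothetical illustration only; a real EntityManager does far more.
class ToyPersistenceContext {
    private final Map<Long, Student> managed = new HashMap<>();

    Student find(Long id) {
        // A real provider would load the row on a miss; here one is fabricated.
        return managed.computeIfAbsent(id, key -> new Student(key, "student-" + key));
    }
}
```

Two find(1L) calls on the same context return the very same object, so a HashSet holding it behaves correctly even after name changes. An instance obtained from a second ToyPersistenceContext is a different object with the same id, and that models exactly the managed/detached mix where instance equality stops working.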

Why you shouldn’t store managed and detached entities in the same Collection

Why would you do that? Instead of storing entities in a Set you can always use a Map keyed by the id. In that case you again don’t need any equals() or hashCode() for the whole entity. And even then you might get into trouble.

One example is to have a ‘cache’.
Say you have university management software which has a Course table. Courses get updated only a few times per year and only by some administrative people. But almost every page in the application reads the information. So what could be more reasonable than to simply store the Course in a shared @ApplicationScoped cache as a Map for, say, an hour? Why don’t I use the cache management provided with some JPA containers? Many reasons. First and foremost, they are not portable. They are also really tricky to configure (I’m talking about real production, not a sample app!). And you want to have FULL control over the cache!

So, having a cache is really a great idea, but *please* do not store JPA entities in the cache. At least not as long as they are managed. All is fine as long as you only run it locally, click around in your app and only do unit tests. But under heavy load in production (our app had 5 million page hits per day on average) you will hit the following problem:

The JPA specification does not allow an EntityManager to be used from multiple threads at the same time. As a managed entity is bound to an EntityManager, this limitation also affects the entities themselves.
So when you do the em.find() and later a coursesCache.put(courseId, course), the entity is still in ‘managed’ mode! And under heavy load it *will* happen that another user gets the still managed entity from the cache before it got detached (which happens at the tx commit or request end, depending on your setup). Boooommm it goes…

How can you avoid that? Simply use a view object. Normally the full database entities with all their gory attribute details and sub-tables are not needed on an overview course list anyway. So you’d better use a ‘new’ query:

CourseListVO courseViewItem
  = em.createQuery("SELECT NEW org.myproject.CourseListVO(c.id, c.name, c.,...) " +
      " FROM Course AS c WHERE...", CourseListVO.class)
      .getSingleResult();
cache.put(courseId, courseViewItem);

By using a ‘new Query’ you will get instances which are not managed by the container. And it’s also much faster and consumes less memory btw.
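A cache for such view objects could be sketched as follows. This is a plain-Java illustration under assumptions: the class name CourseCache, the two-field shape of CourseListVO and the explicit nowMillis parameter (used instead of System.currentTimeMillis() to keep the expiry testable) are all made up, and in a CDI application the cache class would additionally be annotated @ApplicationScoped.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Immutable view object: never managed by JPA and safe to share between threads.
final class CourseListVO {
    private final Long id;
    private final String name;

    public CourseListVO(Long id, String name) { this.id = id; this.name = name; }

    Long getId() { return id; }
    String getName() { return name; }
}

// Would be @ApplicationScoped in a CDI application (assumption, omitted here).
class CourseCache {
    private static final long TTL_MILLIS = 60L * 60L * 1000L; // one hour

    private static final class Entry {
        final CourseListVO value;
        final long storedAt;

        Entry(CourseListVO value, long storedAt) {
            this.value = value;
            this.storedAt = storedAt;
        }
    }

    private final Map<Long, Entry> entries = new ConcurrentHashMap<>();

    void put(Long courseId, CourseListVO vo, long nowMillis) {
        entries.put(courseId, new Entry(vo, nowMillis));
    }

    // Returns null on a miss or when the entry is older than the TTL.
    CourseListVO get(Long courseId, long nowMillis) {
        Entry e = entries.get(courseId);
        if (e == null) return null;
        if (nowMillis - e.storedAt >= TTL_MILLIS) {
            entries.remove(courseId, e); // drop only if nobody replaced it meanwhile
            return null;
        }
        return e.value;
    }
}
```

Because the view object is immutable and was never attached to an EntityManager, handing the same instance to many request threads does not run into the threading restriction described above.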

Oh, I’m sure there are things which I have not considered yet…

PS: this is not an easy topic, as you might be able to judge from looking at the people involved. Gavin is the inventor of Hibernate and JPA, Vlad is the current Hibernate maintainer. And I was involved in the DODS DB layer of Lutris Enhydra in the 90s and am a long-time Apache OpenJPA committer (and even the current PMC chair).


About struberg
I'm an Apache Software Foundation member blogging about Java, µC, TheASF, OpenWebBeans, Maven, MyFaces, CODI, GIT, OpenJPA, TomEE, DeltaSpike, ...
