Is there a way to fix the JPA EntityManager?

Using JPA is easy for small projects, but it has well-hidden problems caused by some very basic design decisions. Quite a few of them exist because the EntityManager cannot be made Serializable. Some JPA providers (Hibernate, for example) claim their EntityManager is serializable, but in practice it isn’t!

Is the EntityManager Serializable?

The LazyInitializationException is a pretty nasty beast, as you will know if you have ever worked with EJB-managed EntityManagers. That problem has led lots of people to look for alternatives. Two of the most prominent are JBoss Seam2, if you are working with the JBoss stack, and Apache MyFaces Orchestra for Spring applications.

The basic problems are summed up very well in the Apache MyFaces Orchestra documentation, which is still largely correct:
Apache MyFaces Orchestra Persistence explanation

If you read through the whole page, you will see the TODOs at the very bottom of the page:

TODO: is the persistence-context serializable? Are all persistent objects in the context always serializable?

The simple answer is: NO, not at all! Neither the EntityManager nor the state in the entities is Serializable as per the current JPA specification!

Why is the EntityManager not Serializable

There are a few reasons:

1. Pessimistic Locking

The biggest blocker first: JPA supports not only Optimistic Locking but also Pessimistic Locking. You can declare this either in your persistence.xml or programmatically via the LockModeType parameter that many methods accept.

EntityManager#find(java.lang.Class entityClass, java.lang.Object primaryKey, LockModeType lockMode) 
EntityManager#lock(java.lang.Object entity, LockModeType lockMode) 
...

But if you ever use pessimistic locking (a real hard lock on the database), the lock is bound to the database connection and cannot be ‘transferred’ to another EntityManager without losing it.

2. Id and Version fields are optional

To use the optimistic locking approach, a primary key plus some ‘version’ field must be used in the entity:

 UPDATE tableX SET [somevalues], version = :oldversion + 1 WHERE id = :myId AND version = :oldversion

Obviously this update can only succeed once. Trying to update the row a second time will not find any database entry, because the condition version = :oldversion is no longer true.
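To make the version check concrete, here is a minimal sketch in plain Java. It does not use JPA at all: `VersionedTable` is a hypothetical in-memory stand-in for a table with id, value and version columns, and its `update` method only imitates the WHERE clause of the statement above.

```java
import java.util.HashMap;
import java.util.Map;

class OptimisticLockDemo {

    // Hypothetical in-memory stand-in for a table with (id, value, version) rows.
    static final class VersionedTable {
        private static final class Row {
            final String value;
            final long version;
            Row(String value, long version) {
                this.value = value;
                this.version = version;
            }
        }

        private final Map<Long, Row> rows = new HashMap<>();

        void insert(long id, String value) {
            rows.put(id, new Row(value, 0L)); // new rows start at version 0
        }

        long versionOf(long id) {
            return rows.get(id).version;
        }

        String valueAt(long id) {
            return rows.get(id).value;
        }

        // Imitates: UPDATE ... SET value = ?, version = :old + 1
        //           WHERE id = :id AND version = :old
        // Returns the number of rows updated (0 or 1).
        int update(long id, String newValue, long expectedVersion) {
            Row current = rows.get(id);
            if (current == null || current.version != expectedVersion) {
                return 0; // the WHERE clause matched nothing
            }
            rows.put(id, new Row(newValue, expectedVersion + 1));
            return 1;
        }
    }

    public static void main(String[] args) {
        VersionedTable table = new VersionedTable();
        table.insert(1L, "initial");
        long seenVersion = table.versionOf(1L); // both writers read version 0

        System.out.println(table.update(1L, "first writer", seenVersion));  // prints 1
        System.out.println(table.update(1L, "second writer", seenVersion)); // prints 0
    }
}
```

The second writer gets an update count of 0, which is the signal a JPA provider would turn into an optimistic locking failure.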

When you use optimistic locking in JPA, you effectively always have such a ‘version’ column already. But you are not yet required to declare it in the entity! Thus this information will not be transported if you serialize the entity!

To fully support optimistic locking, those entities will need mandatory @Id and @Version fields.

3. Losing the entity state information

As outlined in a previous blog post every JPA entity will get ‘enhanced’ with some magic code which tracks _loaded and _dirty state information. Those BitFlags will track the parts of the entity which got changed or fetched lazily.
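The effect of losing this state can be imitated without any JPA provider. The following sketch is only a model: `Customer` is a hypothetical hand-written class, and the `transient` BitSet fields stand in for provider-generated state that is not written during default Java serialization. Real enhanced entities are generated by the provider and will differ in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.BitSet;

class DetachedStateDemo {

    // Hypothetical stand-in for an enhanced entity.
    static final class Customer implements Serializable {
        private static final long serialVersionUID = 1L;

        String name;

        // Modeled as transient: this state is not part of the serialized form.
        transient BitSet loaded = new BitSet();
        transient BitSet dirty = new BitSet();

        void setName(String name) {
            this.name = name;
            dirty.set(0); // field index 0 = name
        }
    }

    // Serializes and deserializes the entity, as a session store or cluster would.
    static Customer roundTrip(Customer customer) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(customer);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Customer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Customer customer = new Customer();
        customer.loaded.set(0);
        customer.setName("Alice");

        Customer copy = roundTrip(customer);
        System.out.println(copy.name);          // prints Alice
        System.out.println(copy.dirty == null); // prints true: the tracking state is gone
    }
}
```

The business data survives the round trip, but nothing in the copy says which fields were loaded or changed.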

The problem in this area is mostly caused by the JPA spec which by default prevents the JPA providers from serializing the ‘enhanced entities’ but requires serializing the ‘native’ information. At least that seems to be the common understanding of the following paragraph in the JPA spec:

“Serializing entities and merging those entities back into a persistence context may not be interoperable across vendors when lazy properties or fields and/or relationships are used.
A vendor is required to support the serialization and subsequent deserialization and merging of detached entity instances (which may contain lazy properties or fields and/or relationships that have not been fetched) back into a separate JVM instance of that vendor’s runtime, where both runtime instances have access to the entity classes and any required vendor persistence implementation classes.”

Of course, most JPA providers offer a way to enable serialization of the state fields. In OpenJPA, just add the following magic properties to your persistence.xml:

<property name="openjpa.DetachState" value="loaded(DetachedStateField=true)"/>
<property name="openjpa.Compatibility" value="IgnoreDetachedStateFieldForProxySerialization=true"/>

This will also serialize the _loaded and _dirty BitFlags along with your entity.

The problem with having the EntityManager not Serializable

Well, this one is easy:

  • You cannot store the EntityManager in a Conversation
  • You cannot store the EntityManager in a Session
  • You cannot store the EntityManager in a JSF View
  • No clustering, because clustering means that you need to serialize the state
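All four points come down to the same mechanism: the container has to serialize whatever it stores. A small plain-Java sketch of that, under loud assumptions: `FakeEntityManager` is a hypothetical non-serializable stand-in (not a real EntityManager), and `canReplicate` only imitates what a container does when it passivates or replicates session attributes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

class SessionReplicationDemo {

    // Hypothetical stand-in for an EntityManager: it holds a resource
    // and does not implement java.io.Serializable.
    static final class FakeEntityManager {
        final Object connection = new Object();
    }

    // Tries to serialize the attributes, as a container would have to do
    // before passivating a session or replicating it to another node.
    static boolean canReplicate(HashMap<String, Object> sessionAttributes) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(sessionAttributes);
            return true;
        } catch (NotSerializableException e) {
            return false; // at least one attribute cannot be serialized
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, Object> session = new HashMap<>();
        session.put("user", "alice");
        System.out.println(canReplicate(session)); // prints true

        session.put("em", new FakeEntityManager());
        System.out.println(canReplicate(session)); // prints false
    }
}
```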

What can you do today?

Today the only working solution is the entitymanager-per-request pattern: basically, you create a @RequestScoped EntityManager for each and every request, e.g. via a CDI @Produces method. That also means that you need to manually merge the detached entities on the callback. If you use JSF, that happens in your action.

 

How to fix the JPA EntityManager in the future?

Here are my thoughts about how we can do better in the future. Please note that there is a project called Avaje eBean which is not JPA compliant but has already successfully implemented those ideas.

Provide an OptimisticEntityManager

public interface OptimisticEntityManager extends EntityManager, Serializable

The most important change here is that it implements the java.io.Serializable interface.
This OptimisticEntityManager should throw a NonOptimisticModeException whenever one tries to execute an operation on the EntityManager which requires a non-optimistic LockModeType, or any other operation which creates a lock or otherwise results in non-serializable behaviour.

There should be a way to explicitly request an OptimisticEntityManager, e.g. via

OptimisticEntityManager EntityManagerFactory#createOptimisticEntityManager();
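A rough sketch of how the rejection of pessimistic lock modes could behave. Everything here is hypothetical and deliberately cut down so that it runs without a JPA dependency: the `LockMode` enum borrows a subset of constant names from JPA’s LockModeType, the `OptimisticEntityManager` interface has a single method instead of extending the real EntityManager, and `SimpleOptimisticEntityManager` and the static factory method are invented for illustration only.

```java
import java.io.Serializable;

class OptimisticEntityManagerSketch {

    // Stand-in for LockModeType, with a flag marking the pessimistic modes.
    enum LockMode {
        NONE(false), OPTIMISTIC(false), OPTIMISTIC_FORCE_INCREMENT(false),
        PESSIMISTIC_READ(true), PESSIMISTIC_WRITE(true), PESSIMISTIC_FORCE_INCREMENT(true);

        final boolean pessimistic;

        LockMode(boolean pessimistic) {
            this.pessimistic = pessimistic;
        }
    }

    // The exception proposed in the text, modeled as an unchecked exception.
    static final class NonOptimisticModeException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        NonOptimisticModeException(LockMode mode) {
            super("Lock mode not allowed on an OptimisticEntityManager: " + mode);
        }
    }

    // Reduced version of the proposed interface; the proposal would
    // additionally extend the real EntityManager.
    interface OptimisticEntityManager extends Serializable {
        void lock(Object entity, LockMode mode);
    }

    // Hypothetical implementation that refuses anything needing a database lock.
    static final class SimpleOptimisticEntityManager implements OptimisticEntityManager {
        private static final long serialVersionUID = 1L;

        @Override
        public void lock(Object entity, LockMode mode) {
            if (mode.pessimistic) {
                throw new NonOptimisticModeException(mode);
            }
            // Optimistic modes hold no database lock, so nothing ties
            // this instance to a connection.
        }
    }

    // Stand-in for the proposed factory method.
    static OptimisticEntityManager createOptimisticEntityManager() {
        return new SimpleOptimisticEntityManager();
    }

    public static void main(String[] args) {
        OptimisticEntityManager em = createOptimisticEntityManager();
        em.lock(new Object(), LockMode.OPTIMISTIC); // accepted
        try {
            em.lock(new Object(), LockMode.PESSIMISTIC_WRITE);
        } catch (NonOptimisticModeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast at the call site keeps the guarantee simple: an instance that never takes a hard lock has nothing connection-bound to lose when it is serialized.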

Make @Id and @Version mandatory for those Entities

This will solve the problem with losing the optimistic lock information when serializing.

Define _loaded and _dirty Serialization

A future JPA spec could either clarify that serialization is more important than compatibility between JPA vendors (who uses 2 different JPA providers in the same environment anyway?), or just specify that the 2 BitFlags can be passed along in the serialized entity and how they should behave.

Please tell me what you think! Am I missing something? It’s not an easy move, but so far I think it is doable!

PS: Thanks to Shane Bryzak and Jason Porter for helping me get rid of at least the worst English grammar and wording issues. Hope you folks got the gist regardless of my bad English ;)


About struberg
I'm an ASF member, blogging about Java, µC, TheASF, OpenWebBeans, Maven, MyFaces, CODI, GIT, OpenJPA, TomEE, DeltaSpike, ...

6 Responses to Is there a way to fix the JPA EntityManager?

  1. agoncal says:

    Serializing the EntityManager is crucial for heavily conversational web applications. Spring WebFlow has an (unknown) hack to be able to do it…. and it works well. This way you can have flows and subflows (the entity manager gets persisted in subflows, which makes it possible).

    Here is my +1 for the JPA 2.1 Expert Group to allow the EntityManager to be serializable

  2. struberg says:

    Thanks for the reply!

    Can you give a quick hint about the trick they are using? I have found that most of the tricks in use only work in 95% of the use cases. In the other 5% they throw Exceptions or, even worse, sometimes silently lose state/data.

  3. I cannot decide if EntityManagers and managed Entities belong into a Conversation/Session etc. or not.

    E.g. initially, using JBoss Seam 2 (which enables such features) felt really natural. The default Seam-generated CRUD apps in particular made heavy use of this (Conversational Homes for Wizards with a Conversational EntityManager). But after a while we noticed the heavy CPU and RAM load; one problem was that many EMs and their managed Entities were Conversational. It was easy to develop that way but hard to fix at the end…

    It’s like using an O/R mapper without thinking and making mistakes like adding associations to huge lists because of the business model. The hard work of fixing this comes at the end of the project.

    I’m really not sure this magic works in practice in the long run; we will get many terribly slow or resource-hungry web apps. But at least they were easy to write ;)

  4. struberg says:

    Hi André!

    Well, I fear I didn’t make this clear enough in my original post.

    The current JPA spec actually has 2 problems:

    1. The EntityManager is not Serializable, but much worse
    2. The Entities are not really Serializable either!

    That is because the Entities (by default) will lose all the state information when they get serialized. This can be prevented by using a few magic switches in the persistence.xml, but by default it doesn’t work.

    Of course, entitymanager-per-request always consumes much less memory than storing the EntityManager in the Session!

    • I understood what you described. I’m just not sure if this should be a valid objective.
      Seam provided a hacky solution that enabled parts of your desired approach…using Extended EntityManagers bound to Conversations/Sessions and managed Entities bound to SFSB, EM Cluster-“Replication” via Passivation/Reactivation tricks etc., with many pitfalls.

      That’s not the topic…the problem was that it really was _much_ more resource-intensive than other solutions; it borders on unusable (at least for complex forms and apps).
      Currently we don’t even use “EM in request” patterns and have functions like search(params, fetchPathes) to prevent lazy loading. It’s fast and scalable, but also more cumbersome.

      I also don’t see much value in these long conversations with many-paged wizards…I don’t know what business apps other people are developing, but when would such a transaction really work? I fill in 5 tabs and say: yep, save. In between, associated data has changed (other users), or the user was at lunch and the conversation timed out, etc.

      Complicated topic…I’m really not very settled about this, depends on the use cases.

      cheers André

      • struberg says:

        full ack on the resource front!

        Most times I’m really advocating for the entitymanager-per-request pattern.

        But as always: “use the right tool for the right job!”

        The thing is: sometimes you only have to hack together an intranet app with very limited reach. 500 users max and no public internet, for example. In this case it is sometimes really nice to just store the EM in a conversation.

        Late detection of concurrent changes is something we also need to take care of in the upcoming Apache DeltaSpike JPA module. Guess we could use some additional brainpower. Feel free to add your ideas!

        LieGrue,
        strub
