JPA Enhancement done right

When it comes to handling databases in a high-level language, there are basically 2 different approaches:

a.) Systems which give you access to all the gory details and therefore have a pretty high entry barrier.
b.) Systems which are perfectly easy to use for samples and simple apps, but where you need all the (well hidden) gory details once you use that stuff in real world projects.

Without a doubt, JPA falls more into category b.).

Using JPA

If you look at a classical JPA entity, you will find something similar to our following example:

import javax.persistence.*;

@Entity
public class Customer {
  @Id
  @GeneratedValue
  private Long id;

  @Column(length = 50)
  private String firstName;

  public Long getId() {
    return id;
  }

  public String getFirstName() {
    return firstName;
  }

  public void setFirstName(String firstName) {
    this.firstName = firstName; 
  }
}

Usually the entities get created with the ‘new’ keyword and then stored by calling EntityManager#persist():

Customer c = new Customer();
c.setFirstName("John");
entityManager.persist(c);

Looks easy, doesn’t it?

Enhancement

A lot of the ‘ease’ of JPA actually comes from ‘enhancing’ the classes which are annotated with @Entity. And this enhancement step is usually well hidden deep inside the code.

So let’s first look at what this ‘enhancement’ does for you!

Whenever you touch a JPA entity, it will track the following information for each and every non-transient attribute of the entity:

* _dirty: if you change firstName from ‘John’ to ‘Barry’, this field becomes ‘dirty’. JPA uses this information to decide which fields/entities need to get written to the database.

* _loaded: JPA entities might contain fields which have FetchType.LAZY. They only get loaded from the database when they are accessed for the first time. This is pretty neat if you have a fat @ElementCollection which takes some time to load. Some older JPA implementations relied on if(field == null), but what about calling c.setFirstName(null); to clear out this info in the database? How do you distinguish a field which has not been loaded yet (and is null) from a field which got loaded and was afterwards set to null?
_loaded is used to track this info.
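To make the null problem concrete, here is a minimal hand-written sketch of a per-field loaded flag. It is not what any particular JPA provider generates: the class LazyCustomer, the Supplier standing in for the database read and the isFirstNameLoaded() helper are all assumptions made up for illustration.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an enhanced entity; not real provider output.
class LazyCustomer {
    // Assumption: this Supplier plays the role of the lazy database read.
    private final Supplier<String> firstNameLoader;
    private String firstName;
    // The "_loaded" information for this single field.
    private boolean firstNameLoaded;

    LazyCustomer(Supplier<String> firstNameLoader) {
        this.firstNameLoader = firstNameLoader;
    }

    public String getFirstName() {
        if (!firstNameLoaded) {
            // First access: fetch the value and remember that we did so.
            firstName = firstNameLoader.get();
            firstNameLoaded = true;
        }
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
        // An explicitly set value counts as loaded, even if it is null.
        firstNameLoaded = true;
    }

    boolean isFirstNameLoaded() {
        return firstNameLoaded;
    }
}

class LazyCustomerDemo {
    public static void main(String[] args) {
        LazyCustomer c = new LazyCustomer(() -> "John");
        c.setFirstName(null);
        // The flag says "loaded", so the getter does not fetch "John" again.
        System.out.println(c.isFirstNameLoaded() + " " + c.getFirstName());
    }
}
```

With a plain if(field == null) check, the getter in the demo would have gone back to the loader and silently undone the setFirstName(null) call.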

But where is this _dirty and _loaded info in my entities? Well, this is exactly what the ‘enhancement’ step is doing. It adds this information to the entity class file for you!

Basically, setFirstName in the ‘enhanced’ class will end up looking somewhat like the following (completely transparent for the user!):

public void setFirstName(String firstName) {
  if (safeEquals(this.firstName, firstName)) {
    return;
  }

  //EntityManager must not get accessed in different threads
  assertEntityManagerLocking();

  setDirty(FIRSTNAME_FIELD);
  this.firstName = firstName;
}
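If you want to play with the idea, the snippet above can be turned into something that compiles on its own. The following is only a sketch under a few assumptions: the class name TrackedCustomer, the BitSet holding the dirty flags and the helpers isFirstNameDirty() and clearDirty() are invented here, java.util.Objects.equals takes the place of safeEquals, and the thread check is left out. Real providers generate different code.

```java
import java.util.BitSet;
import java.util.Objects;

// Hypothetical hand-written equivalent of an enhanced entity.
class TrackedCustomer {
    // Assumption: one bit per persistent field, indexed by a constant.
    private static final int FIRSTNAME_FIELD = 0;
    private final BitSet dirty = new BitSet();

    private String firstName;

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        // Null-safe comparison: an unchanged value must not mark the field dirty.
        if (Objects.equals(this.firstName, firstName)) {
            return;
        }
        dirty.set(FIRSTNAME_FIELD);
        this.firstName = firstName;
    }

    boolean isFirstNameDirty() {
        return dirty.get(FIRSTNAME_FIELD);
    }

    // Stands in for what happens after the changes got written to the database.
    void clearDirty() {
        dirty.clear();
    }
}

class TrackedCustomerDemo {
    public static void main(String[] args) {
        TrackedCustomer c = new TrackedCustomer();
        c.setFirstName("John");
        c.clearDirty();
        c.setFirstName("John");   // same value, stays clean
        System.out.println(c.isFirstNameDirty());
        c.setFirstName("Barry");  // real change, becomes dirty
        System.out.println(c.isFirstNameDirty());
    }
}
```

The point of the per-field flag is that a flush only needs to look at the bits instead of comparing every value with the database.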

Technically there are 3 different ways to achieve this:

1.) Using Proxies/Subclassing at runtime

A Proxy is basically a dynamically created Subclass. This is the default mode for Hibernate and has quite a few drawbacks.

1.a.) You can only proxy methods. It’s not possible to intercept field access inside the class itself. This makes it much harder to correctly write EntityListeners and other methods which juggle around with fields.

1.b.) Only property access for classes loaded via a query or EntityManager#find() can get tracked. It’s not possible to do the same for Entities just created via ‘new’ because there is no proxy in this case.

1.c.) Lazy loading is not possible in this case. All fields (including expensive @ElementCollection reads) will be loaded eagerly.

1.d.) Hibernate seems to have a mode where it dynamically replaces Lists stored in a managed entity with its own ProxyLists. That way it can at least do some lazy loading. But doing the dirty check requires comparing each entry in those Lists with the database, and that’s damn slow!
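Drawbacks 1.a.) and 1.b.) are easy to reproduce with a hand-written subclass. This is just an illustration, not Hibernate’s generated proxy: Person, PersonProxy, clearName() and resetDirty() are made-up names, and a real proxy is created at runtime instead of being written by hand.

```java
// A plain entity-like class; clearName() writes the field directly.
class Person {
    private String firstName;

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    // Field access inside the class: no overridable method call is involved.
    public void clearName() {
        this.firstName = null;
    }
}

// Hypothetical stand-in for a runtime-generated subclass proxy.
class PersonProxy extends Person {
    private boolean dirty;

    @Override
    public void setFirstName(String firstName) {
        dirty = true; // the only place where the proxy can hook in
        super.setFirstName(firstName);
    }

    boolean isDirty() {
        return dirty;
    }

    void resetDirty() {
        dirty = false;
    }
}

class PersonProxyDemo {
    public static void main(String[] args) {
        PersonProxy managed = new PersonProxy();
        managed.setFirstName("John");
        managed.resetDirty();
        managed.clearName(); // changes the field behind the proxy's back
        System.out.println(managed.getFirstName() + " dirty=" + managed.isDirty());

        Person fresh = new Person(); // created via 'new': no proxy at all
        System.out.println(fresh instanceof PersonProxy);
    }
}
```

The subclass only sees calls to methods it overrides. The direct field write in clearName() goes unnoticed, and an instance created with ‘new’ never gets the tracking in the first place.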

2.) Using Compilation Enhancement

You can use either Ant or Maven to ‘enhance’ the class files of your entities so that they contain all the additional code which is needed to work properly. This is the preferred way, as it provides the full set of features and also the best performance at runtime. This option comes in 2 different flavours:

2.a.) Build Time Enhancement: The class file gets modified even before it gets packaged into the JAR. I will come back to this a bit later and explain it in detail.

2.b.) Deploy Time Enhancement: The class file gets enhanced by the ‘deployer’ in a JavaEE container. If you upload your WAR/EAR to your server, it gets unpacked and all classes with an @Entity annotation automatically get enhanced by the container immediately afterwards.

3.) Runtime Enhancement (via ClassTransformer)

At first glance the JavaAgent [1] looked like a smart way to perform the enhancement at runtime, but it turned out that there are fundamental problems with this approach. Again there are 2 different flavours of this kind of enhancement:
3.a.) Using the Java5 -javaagent option.
3.b.) Using the Java6 Instrumentation#retransformClasses. This allows already loaded classes to be dropped and replaced with their ‘transformed’ version.
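For readers who have never seen the java.lang.instrument API, the hook used by flavour 3.a.) looks roughly like the sketch below. The interface ClassFileTransformer, its transform method and Instrumentation#addTransformer are part of the JDK. Everything else is an assumption for illustration: the class names, the package prefix com/example/entities/ and the fact that this sketch returns the bytes unchanged instead of rewriting them like a real enhancer would.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical transformer; a real JPA provider rewrites the bytecode here.
class EntityTransformerSketch implements ClassFileTransformer {
    // Class names arrive in internal form, e.g. "com/example/entities/Customer".
    private final String entityPackagePrefix;

    EntityTransformerSketch(String entityPackagePrefix) {
        this.entityPackagePrefix = entityPackagePrefix;
    }

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className == null || !className.startsWith(entityPackagePrefix)) {
            // Returning null tells the JVM to keep the class as it is.
            return null;
        }
        // Placeholder for the actual enhancement: hand back an unchanged copy.
        return classfileBuffer.clone();
    }
}

// Hypothetical agent class, named in the Premain-Class entry of the agent jar's manifest.
class EnhancerAgentSketch {
    // The JVM calls this before main() when started with -javaagent:<agent.jar>
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new EntityTransformerSketch("com/example/entities/"));
    }
}
```

Note that the agent class and everything the transformer needs have to be loadable at the moment premain runs, which is where the class loader trouble starts.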

The major problem with both of the aforementioned mechanisms is that they use the SystemClassLoader to register the ClassTransformer and subsequently also to load the classes needed to perform the transformation!

This has the bad side effect that the SystemClassLoader gets polluted with some impl classes of your persistence provider. But not all of them, and once the transformation is finished, the jar containing the Transformer gets detached from the JVM again. This leaves you with 30% of Hibernate/OpenJPA/EclipseLink lying around in your SystemClassPath while the rest is not available.

Not that bad, one might say, because all those classes are still available in your web application. Too bad that a nice little security mechanism kicks in and destroys all our dreams. The Servlet specification defines that any container must prevent redefining ‘system classes’, that is, all classes which get loaded via the SystemClassLoader. Thus in any spec-compliant servlet container (e.g. Tomcat [2]) you are now bound to the few classes of your JPA provider which are available in your SystemClassLoader. This will result in tons of NoClassDefFoundErrors and many tears…

The Winner: Build Time Enhancement

Technically, only Build Time Enhancement and Deploy Time Enhancement work sufficiently well at all. The reason I don’t use Deploy Time Enhancement is that you are not able to test and debug locally what you have running on your server. All the unit tests are not worth much if you don’t run the code which gets executed in production. Thus Build Time Enhancement is the clear winner for me.

The openjpa-maven-plugin

Quite some time ago we wrote the openjpa-maven-plugin [3], which allows you to do all the enhancement steps while building your project with Maven. This plugin got moved over to the OpenJPA project itself and will first be available with the upcoming OpenJPA-2.2.0 version (expect this to be out this month). A similar tool is available as an Ant task. There is also a Maven plugin for doing the same with Hibernate [4].

IDE pitfalls

Most IDEs nowadays compile the classes on the fly. If you don’t have a special IDE plugin installed which ‘fixes’ the entity classes afterwards, you might get errors reporting that you ‘try to run unenhanced classes’. Usually this happens only the first time or if you change an Entity class.

In this case simply enhance them again on the command line:
$> mvn clean process-classes

Afterwards you can work in your IDE without any further problems.

References:

[1] javaagent: http://docs.oracle.com/javase/6/docs/api/java/lang/instrument/package-summary.html

[2] tomcat WebAppClassLoader: http://svn.apache.org/repos/asf/tomcat/trunk/java/org/apache/catalina/loader/WebappClassLoader.java
see public synchronized Class loadClass(String name, boolean resolve)

// (0.2) Try loading the class with the system class loader, to prevent
// the webapp from overriding J2SE classes

[3] http://mojo.codehaus.org/openjpa-maven-plugin/usage.html

[4] http://mojo.codehaus.org/maven-hibernate3/hibernate3-maven-plugin/

OpenJPA: http://openjpa.apache.org/builds/latest/docs/manual/manual.html#ref_guide_pc_enhance

Hibernate: http://docs.jboss.org/hibernate/core/3.3/reference/en/htm/performance.html#performance-fetching-lazyproperties

About struberg
I'm an Apache Software Foundation member blogging about Java, µC, TheASF, OpenWebBeans, Maven, MyFaces, CODI, GIT, OpenJPA, TomEE, DeltaSpike, ...

21 Responses to JPA Enhancement done right

  1. Have a look at the pom.xml archetype I’m putting together: https://github.com/ljnelson/jpa-archetype

  2. Runtime enhancement does not have to have the drawbacks you mention.

    If the persistence provider implements the javaagent as a totally separate jar, with only the classes required for transformation (and possibly even uses the maven shade plugin to relocate them to a different package) then the drawbacks you mention can be avoided.

  3. struberg says:

    Hi Stu!

    Yup, that might work out. In OpenJPA we have already a separated jar, but we do not yet shade it away.
    What is the current status in Hibernate? The last time I tried it (~2 years ago), I ended up with getting some javax.persistence classes in the SystemClassLoader.

    LieGrue,
    strub

  4. struberg says:

    Laird,
    This seems like a good idea. However, in bigger projects I found it much easier to create a project which can run against different databases (without changing the WAR/EAR) than to run with different JPA providers. You can easily switch the database even dynamically using CDI [1][2][3], but you cannot switch JPA providers that easily. Usually you also need some ‘magic configuration’ for each of the JPA providers to work around spec shortcomings.

    With OpenJPA I usually use the following properties in my persistence.xml:

     
    <!-- disable runtime instrumentation -->
    <property name="openjpa.DynamicEnhancementAgent" value="false"/>
    
    <property name="openjpa.PostLoadOnMerge" value="true" />
    <property name="openjpa.DetachState" value="loaded(DetachedStateField=true)"/>
    <property name="openjpa.Compatibility" value="IgnoreDetachedStateFieldForProxySerialization=true"/>
    
    <property name="openjpa.jdbc.MappingDefaults"
                 value="ForeignKeyDeleteAction=restrict, JoinForeignKeyDeleteAction=restrict"/>
    <!-- use class per table strategy -->
    <property name="openjpa.Sequence" value="class-table(Table=SEQUENCES, Increment=20, InitialValue=10000)"/>
                
    

    LieGrue,
    strub

    [1] https://cwiki.apache.org/EXTCDI/jpa-usage.html#JPAUsage-ConfigurableDataSource%28sincev1.0.2%29
    [2] https://svn.apache.org/repos/asf/myfaces/extensions/cdi/trunk/jee-modules/jpa-module/api/src/main/java/org/apache/myfaces/extensions/cdi/jpa/api/datasource/DataSourceConfig.java
    [3] https://svn.apache.org/repos/asf/myfaces/extensions/cdi/trunk/jee-modules/jpa-module/impl/src/test/java/org/apache/myfaces/extensions/cdi/jpa/test/dbconfig/TestDbConfig.java

  5. markus says:

    Thanks for the great post. It unveils a lot of details which are not widely known. When it’s time to pick your favorite JPA implementation, all the ‘hidden treasures’ come into play. And every provider does its own magic here.

    Rgds,
    Markus

  6. struberg says:

    Markus,
    Thanks for the kind words. These ‘hidden treasures’ became even more important in real life lately, as most JPA providers behave completely differently depending on the version number in your persistence.xml:

    <persistence version="2.0">
    

    will (in non-trivial scenarios) give you completely different behaviour than using

    <persistence version="1.0">
    

    All the stuff mentioned above will make no difference in small samples which often get showcased in tutorials, but they might cause a whole bunch of problems in real world projects.

  7. craigday says:

    There’s also a Maven plugin to do static weaving for EclipseLink. Its available at: http://code.google.com/p/eclipselink-staticweave-maven-plugin/

    Cheers
    Craig

  8. Pingback: Is there a way to fix the JPA EntityManager? « Struberg's Blog

  9. agoncal says:

    For the _loaded property there is the PersistenceUtil interface since JPA 2.0 which has a few isLoaded(Object) methods

  10. Pingback: Using JPA in real projects (part 1) « Struberg's Blog

  11. ams says:

    Does OpenJPA weave the loaded and dirty management into the JPA classes or is it implemented more like a real AspectJ aspect?

    • struberg says:

      That depends!
      The ‘normal’ operation mode is to completely replace the original class and store the PcStateManager directly in the Entity itself. Try it out yourself. Take a JPA Entity, enhance it with build-time enhancement (openjpa-maven-plugin) and then use jad to decompile the class file again. You will end up getting something like that:

      public class MyEntity
          implements AuditedEntity, Serializable, PersistenceCapable, Externalizable
      {
      ...
          private static Class pcPCSuperclass;
          protected transient boolean pcVersionInit;
          protected transient StateManager pcStateManager;
          private transient Object pcDetachedState;
      
      ...
      

      and lots of other changes directly in your class.

      OpenJPA also knows a ‘subclassing mode’ like Hibernate, but that’s not recommended.

  12. mauro says:

    How can I do the enhancement at build time with NetBeans?
    mauro

  13. Chrisco says:

    It’s amazing that JPA with enhancement is almost as awesome as JDO (always has had enhancement) was 7 years ago. The main reason Hibernate users dissed JDO was because of its enhancement yet now people openly admit an enhanced solution performs much better than the proxied/reflection based solutions. Ah, tis a strange world we live in!

    • struberg says:

      I personally like JDO. It’s way more explicit than JPA.
      And in fact, OpenJPA is the direct descendent of Solarmetric KODO, which was one of the first (and best) JDO implementations.

  14. Pingback: The maven jpa-archetipe for Enhance the EntityClass at build-time | mauroprogram's Blog

  15. Raeffray says:

    Hi,
    I have two classes: super class A and class B, which inherits from A. Both are entity classes. After enhancement I saw (by decompiling both of them) that class A implements Externalizable with readExternal/writeExternal and class B doesn’t, and consequently the attributes of class B aren’t serialized. Do you have any clue about it? Thanks and best wishes.

    • struberg says:

      Hi, and sorry for only answering this late.

      This is a bug which I fixed in OpenJPA-2.2.x (don’t remember exactly when). We now generate bytecode which delegates to super() to handle their own stuff.

  16. Nice Article. Thanks so much !

  17. Pingback: toString(), equals() and hashCode() in JPA entities | Struberg's Blog
