The (mostly) unknown story behind javax.ejb.EJBException

Yesterday I blogged about the impact Exceptions have on JavaEE transactions in EJBs.
But there is another funny EJB Exception mechanism waiting to be discovered – the javax.ejb.EJBException.

This handling dates back to the times when EJB was mainly intended as a competitor to NeXT Distributed Objects and Microsoft DCOM. Back in those days it was all about ‘Client-Server Architecture’ and people tried to spread the load to different servers on the network. A single EJB back then needed 4 classes and was inherently remote by default.

Only much later did EJBs get a ‘local’ behaviour as well. And only EJB-3.1 introduced the No-Interface View (NIV), which makes interfaces obsolete and is local-only.
But for a very long time remoting was THE default operation mode of EJBs. So all the behaviour was tailored around this – regardless of whether you are really using remoting or are running in the very same JVM.

The impact of remoting

The tricky situation with remote calls is that you cannot be sure that every class is available on the client.

Imagine a server which uses JPA. This might throw a javax.persistence.EntityNotFoundException. But what if the caller – a Swing EJB client app – doesn’t have any JPA classes on its classpath?
This will end up in a ClassNotFoundException or NoClassDefFoundError because de-serialisation of the EntityNotFoundException will blow up on the client side.

To prevent this from happening the server will in certain cases serialize a javax.ejb.EJBException instead of the originally thrown Exception. The EJBException will contain the original Exception’s stack trace as pure Strings. So you at least have the information about what went wrong in a human readable format.

If you’d like to read up on the details then check out section 9.4 “Client’s View of Exceptions” in the EJB specification.

Switching on the ‘Auto Pilot’

Some containers like e.g. OpenEJB/TomEE use a dual strategy. We have a ‘container’ (ThrowableArtifact) which wraps the original Throwable plus the String interpretation and sends both pieces of information as fallback over the line.

On the client side the de-serialization logic of ThrowableArtifact first tries to de-serialize the original Exception. Whenever this is possible you will get the originally thrown Exception on the client side. If this didn’t work then we use the passed meta information and throw an EJBException containing that information as Strings instead of the original Exception.

The impact on your program?

The main impact for you as a programmer is that you probably not only need to catch the original Exception but also an EJBException. So this really changes the way your code needs to be written.
And of course if you only get the EJBException then you do not exactly know what really went wrong. If you need to react to different Exceptions in different ways then you might try to look it up in the exception message, but you have no type-safe way anymore. In that case it might be better to catch the Exception on the server side and send an explicit @ApplicationException over the line.
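
To illustrate, here is a minimal sketch of what such client code often ends up looking like – the service, exception and helper names (storageService, CustomerNotFoundException, handleMissingCustomer, LOG) are just made up for this example:

try {
    storageService.chargeStorage(2015);
} catch (CustomerNotFoundException cnfe) {
    // the original application Exception - if it could be de-serialised on the client
    handleMissingCustomer(cnfe);
} catch (javax.ejb.EJBException ejbe) {
    // fallback: the container wrapped the server-side Exception;
    // only the message and stack trace Strings are available, no type-safe cause
    LOG.warning("remote call failed: " + ejbe.getMessage());
}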

When do I get the EJBException and when do I get the original one?

I’d be happy to have a good answer myself ;)

My experience so far is that it is not specified well enough when each of them gets thrown. But there are certain coarse-grained groups of container behaviour:

  • Containers with an Auto-Pilot mode: those containers will always try to give you the original Exception and will only give you an EJBException if it is really not technically possible. E.g. TomEE works that way.
  • Containers which use the original Exception for ‘local’ calls and an EJBException for remote calls.
  • Containers which will always give you an EJBException – even for local invocations. I have not seen those for quite some time though. Not sure if this is still state of the art?

Any feedback about which container behaves in which way is welcome. And obviously also if you think there is another category!

Transaction handling in EJBs and JavaEE7 @Transactional

Handling transactions in EJBs is easy, right? Well, in theory it should be. But how does the theory translate into reality once you leave the ivory tower?

I’ll show you a small example. Let’s assume we have 2 infrastructure service EJBs:

@Stateless
public class StorageServiceImpl implements StorageService {
  private @EJB CustomerService customerService;
  private @PersistenceContext EntityManager em;

  public void chargeStorage(int forYear) throws CustomerNotFoundException {
    storeNiceLetterInDb(em);
    Customer c = customerService.getCurrentCustomer(); 
    doSomethingElseInDB(); 
  }
} 

And now for the CustomerService which is an EJB as well:

@Stateless
public class CustomerServiceImpl implements CustomerService {
  public Customer getCurrentCustomer() throws CustomerNotFoundException {
    // do something if there is a current customer
    // otherwise throw a CustomerNotFoundException
  }
}

The Sunshine Case

Let’s first look at what happens if no problems occur at runtime.

In the normal operation mode some e.g. JSF backing bean will call storageService.chargeStorage(2015);. The implicit transaction interceptor will use a TransactionManager (all done in the interceptor which you do not see in your code) to check whether a Transaction is already open. If not it will open a new transaction and remember this fact. The same check will happen in the implicit transaction interceptor for the CustomerService.

When leaving CustomerService#getCurrentCustomer the interceptor will recognize that it didn’t open the transaction and thus will simply return. On the other hand, when leaving StorageService#chargeStorage its interceptor will commit the transaction and close the EntityManager.
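
To make the “whoever opened the transaction commits it” rule a bit more tangible, here is a rough sketch of what such an implicit interceptor does internally. This is NOT actual container code – transactionManager stands for an assumed javax.transaction.TransactionManager, and real containers additionally handle exceptions, suspension, rollbackOnly flags etc.:

public Object aroundInvoke(javax.interceptor.InvocationContext ctx) throws Exception {
    boolean startedHere = false;
    if (transactionManager.getTransaction() == null) { // no transaction open yet?
        transactionManager.begin();                    // then this layer opens one
        startedHere = true;
    }
    try {
        return ctx.proceed();
    } finally {
        if (startedHere) {                             // only the layer which opened the tx
            transactionManager.commit();               // commits (or rolls back) it
        }
    }
}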

Broken?: Handling checked Exceptions

Once we leave the sunny side of the street and hit some problems the whole handling starts to become messy. Let’s look at what happens if a checked CustomerNotFoundException gets thrown in CustomerService#getCurrentCustomer. Most people will now find their first surprise: the database changes done in storeNiceLetterInDb() will get committed into the database.

So we got an Exception but the transaction still got committed? WT*piep*!
Too bad that this is not a bug – the behaviour is exactly as specified in “9.2.1 Application Exceptions” of the EJB specification:

An application exception does not automatically result in marking the transaction for rollback unless the ApplicationException annotation is applied to the exception class and is specified with the rollback element value true…

So this means we could annotate the CustomerNotFoundException with @javax.ejb.ApplicationException(rollback=true) to force a rollback.
And of course we need to do this for ALL checked exceptions if we want to get a rollback.
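
Applied to our example this would look like the following:

// forces a rollback even though it is a checked Exception
@javax.ejb.ApplicationException(rollback = true)
public class CustomerNotFoundException extends Exception {
    // ...
}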

Broken?: Handling unchecked Exceptions

The good news upfront: unchecked Exceptions (RuntimeExceptions) will usually cause a rollback of your transaction (unless annotated with @ApplicationException(rollback=false) of course).

Let’s assume there is some other entity lookup in the code and we get a javax.persistence.EntityNotFoundException if the address of the customer couldn’t be found. This will roll back your transaction.

But what can we do if this is kind of expected and we just want to use a default address in that case? The natural solution would be to simply catch this Exception in the calling method. In our case that would be a try/catch block in StorageServiceImpl#chargeStorage.
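
In code the attempt would look roughly like this – getCurrentAddress() and Address.DEFAULT are made up for the example:

public void chargeStorage(int forYear) throws CustomerNotFoundException {
    storeNiceLetterInDb(em);
    Address address;
    try {
        address = customerService.getCurrentAddress();
    } catch (javax.persistence.EntityNotFoundException enfe) {
        address = Address.DEFAULT; // just fall back to a default address
    }
    doSomethingElseInDB();
}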

That’s a great idea – but it doesn’t work in many containers!

Some containers interpret the spec pretty strictly and do the Exception check on _every_ layer (EJB spec 9.3.6). If the interceptor in the CustomerService detects an Exception then the implicit EJB interceptor will simply roll back the whole transaction and mark it as “rollbackOnly”. Catching this Exception at an outer level doesn’t help a bit. You will not get your changes into the database. And if you try to do even more on the database then you will blow up again with something like “The connection was already marked for rollback”.

And how is that with @javax.transaction.Transactional?

Basically the same as with EJBs. In my opinion this was a missed chance to clean up this behaviour.
You can read this up in chapter 3.6 of the JTA-1.2 specification.

The main difference is how to demarcate rollback vs commit exceptions. You can use the rollbackOn and dontRollbackOn attributes of @Transactional:

@Transactional(rollbackOn={SQLException.class}, dontRollbackOn={SQLWarning.class})

Now what about DeltaSpike @Transactional?

In Apache DeltaSpike @Transactional and its predecessor Apache MyFaces CODI @Transactional we have a much cleaner handling:

Exceptions only get handled on the layer where the transaction got opened. If you catch an Exception along the way then we do not care about it.

Any Exception on the outermost layer will cause a rollback of your transaction. It doesn’t matter if it is a RuntimeException or a checked Exception.

If there was no Exception in the outermost interceptor then we will commit the transaction.

PS: please note that I explicitly used interfaces in my samples. Otherwise you would get NIV (No Interface View) objects which again might behave slightly differently, as they use a totally different proxying technique and default behaviour. But that might be enough material for yet another blog post of its own.
PPS: I also spared you EJBs with TransactionManagementType.BEAN. That one is also pretty much non-portable by design: it forces you to either commit or roll back the tx on every layer, so you effectively cannot nest such EJBs. Some containers work fine while others really enforce this.

The right Scope for JBatch Artifacts

In my recent JavaLand conference talk about JSR-352 JBatch and Apache BatchEE I briefly mentioned that JBatch Artifacts should have a scope of @Dependent (or Prototype scope if you are using Spring). Too bad there was not enough time to dig into the problem in depth so here comes the detailed explanation.

What is a JBatch Artifact

A JBatch job needs a Job Specification Language (JSL) XML file in META-INF/batch-jobs/*.xml. This file describes how your batch job is built up.

Here is a small example of how such a batch JSL file might look:

<job id="mysamplebatch" version="1.0" xmlns="http://xmlns.jcp.org/xml/ns/javaee">
  <step id="mysample-step">
    <listeners>
      <listener ref="batchUserListener" >
      <properties>
        <property name="batchUser" value="#{batchUser}"/>
      </properties>
      </listener>
    </listeners>
    <batchlet ref="myWorkerBatchlet">
      <properties>
        <property name="inputFile" value="#{inputFile}"/>
      </properties>
    </batchlet>
  </step>
</job>

In JSR-352 an Artifact is any piece which is defined in your JBatch JSL file and gets requested by the container. In the sample above these would be batchUserListener and myWorkerBatchlet.

The following types can be referenced as Batch Artifacts from within your JSL:

  • Batchlets
  • ItemReader
  • ItemProcessor
  • ItemWriter
  • JobListener
  • StepListener
  • CheckpointAlgorithm
  • Decider
  • PartitionMapper
  • PartitionReducer
  • PartitionAnalyzer
  • PartitionCollector

The Batch Artifact Lifecycle

The JBatch spec is actually pretty clear what lifecycle needs to get applied on Batch Artifacts:

11.1 Batch Artifact Lifecycle
All batch artifacts are instantiated prior to their use in the scope in which they are declared in the Job XML and are valid for the life of their containing scope. There are three scopes that pertain to artifact lifecycle: job, step, and step-partition.
One artifact per Job XML reference is instantiated. In the case of a partitioned step, one artifact per Job XML reference per partition is instantiated. This means job level artifacts are valid for the life of the job. Step level artifacts are valid for the life of the step. Step level artifacts in a partition are valid for the life of the partition.
No artifact instance may be shared across concurrent scopes. The same instance must be used in the applicable scope for a specific Job XML reference.

The problem is that whenever you use a JavaEE artifact you might only get a proxy. Of course the reference to this proxy gets thrown away correctly, but the instance behind the proxy might survive. Let’s look at how this works internally.

How Batch Artifacts get resolved

A JBatch implementation can provide its own mechanism to load the artifacts. This is needed as it is obviously different whether you use BatchEE with CDI or whether you use Spring Batch (which also implements JSR-352).
In general there are 3 different ways you can reference a Batch Artifact in your JSL:

  1. Via a declaration in an optional META-INF/batch.xml file. See section 10.7.1 of the specification for further information.
  2. Via its fully qualified class name.
    In BatchEE we first try to get the class via BeanManager#getBeans(Type) and BeanManager#getReference. If that doesn’t give us the desired Contextual Reference (CDI proxy) then we simply call ClassLoader#loadClass, create the Batch Artifact with newInstance() and apply injection into this instance.
  3. Via its EL name. More precisely we use BeanManager#getBeans(String elName) plus a call to BeanManager#getReference() as shown above.

We now know what a Batch Artifact is. Whenever you are on a JavaEE server you will most likely end up with a CDI or EJB bean which got resolved via the BeanManager. If you are using Spring-Batch then you will most of the time get a nicely filled Spring bean.
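
A condensed sketch of the EL-name resolution described in point 3 above – this is not the actual BatchEE code, just the BeanManager calls involved, assuming a beanManager reference is at hand:

Set<Bean<?>> beans = beanManager.getBeans("batchUserListener");
Bean<?> bean = beanManager.resolve(beans);
CreationalContext<?> creationalContext = beanManager.createCreationalContext(bean);
Object artifact = beanManager.getReference(bean, bean.getBeanClass(), creationalContext);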

The right Scope for a Batch Artifact

I’ve seen the usage of @javax.ejb.Stateless on Batch Artifacts in many samples. I guess the people writing such samples never used JBatch in real production yet ;) Why so? Well, let’s look at what would happen if we implemented our StepListener as a stateless EJB:

@javax.ejb.Stateless
@javax.inject.Named // this makes it available for EL lookup
public class BatchUserListener implements StepListener {
  @Inject 
  @BatchProperty
  private String batchUser;

  @Override
  public void beforeStep() throws Exception {
     setUserForThread(batchUser);
  }

  @Override
  public void afterStep() throws Exception {
    clearUserForThread();
  }
}

Now let’s assume that the BatchUserListener gets used not only in my sample batch but in 30 other batches of my company (this ‘sample’ is actually taken from a HUGE real world project where we have been using Apache BatchEE for over a year now).

What will happen if e.g. a ‘DocumentImport’ batch runs before my sample batch? The first batch which uses this StepListener will create the instance. At the time the instance gets created by the container (and ONLY at that time) it will also perform all the injection. That means it will look up the ‘batchUser’ parameter and inject it into my @BatchProperty String. Let’s assume this DocumentImport batch uses a ‘documentImportUser’. So that is what gets injected into the ‘batchUser’ variable.

Once the batch step is done the @Stateless instance might be put back into some pool cache. And if I’m rather unlucky then exactly this very instance will later get re-used for mysample-step. But since the listener already exists, no injection will be performed on that instance. Which means that the StepListener STILL contains the ‘documentImportUser’ and not the ‘mySampleUser’ which I explicitly set as parameter of my batch.

The very same issue will also happen for all injected variables which do not use proxies, e.g.:

  • @Inject StepContext
  • @Inject JobContext

TL;DR: The Solution

Use @Dependent scoped beans for your Batch Artifacts and only use another scope if you really know what you are doing.
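
For our listener from above that simply means dropping @Stateless and making it a plain @Dependent CDI bean – a fresh instance (and thus fresh injection) per referencing step:

@javax.enterprise.context.Dependent
@javax.inject.Named // this makes it available for EL lookup
public class BatchUserListener implements StepListener {
  @Inject
  @BatchProperty
  private String batchUser;

  // beforeStep() / afterStep() exactly as before
}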

If you’d like to share state across different items of a Step or a Job then you can also use BatchEE’s @StepScoped and @JobScoped, which are available through a portable BatchEE CDI module.

CDI in EARs

Foreword

I was banging my head against the wall for the last few days when trying to solve a few tricky issues we saw with EAR support over at Apache DeltaSpike. I’m writing this up to clean my mind and to share my knowledge with other EAR-pig-wrestlers…

The EAR ClassLoader dilemma

EARs are a constant pain when it comes to portability amongst servers. This has to do with the fact that JavaEE still doesn’t define any standards for visibility. There is no clear rule about how the ClassLoaders, isolation and visibility have to be set up. There is just a single paragraph (JavaEE7 spec 8.3.1) about which classes you might probably see.

There are 2 standard ClassLoader setups we see frequently in EARs.
For the sake of further discussion we assume an EAR with the following structure:

sample.ear
├── some-ejb.jar
├── lib
│   ├── some-shared.jar
│   └── another-shared.jar
├── war1.war
│   └── WEB-INF
│       ├── classes 
│       └── lib
│           └── war1-lib.jar
└── war2.war
    └── WEB-INF
        ├── classes 
        └── lib
            └── war2-lib.jar

Flat ClassLoader

The whole EAR is served by just a single flat ClassLoader. If you have 2 WARs inside your application then the classes in each of them can see each other. And the classes in the shared EAR lib folder can see them as well. This is e.g. used in JBoss-4 (though not from the beginning). You can also configure most of the other containers to use this setup for your EAR. But nowadays it’s hardly a default anymore (and boy is that good!).

Hierarchic ClassLoader

This is the setup used by most containers these days – but it’s still not mandated by the spec! The container will provide a shared EAR Application ClassLoader which itself only contains the shared EAR libs. This is the parent ClassLoader of all the ClassLoaders in the EAR. Each WAR and ejb-jar inside your EAR will get its own child WebAppClassLoader.

This means war2 doesn’t see any classes or resources from war1 and the other way around. It further means that the shared libs do not see the classes of war1 nor war2, etc!

If you need some caches in your shared libs, then you need to rely on the ThreadContextClassLoader (TCCL) as outlined in JavaEE 7 paragraph 8.2.5. While this section is about “Dynamic Class Loading” it is also valid for caches and for storing other dynamic information in static variables. Otherwise you end up mixing values from war1 and probably re-using them in war2 (where you will get a ClassNotFound exception). Even if your 2 WARs contain the same jar (e.g. commons-lang.jar) the Class instances are different as they come from different ClassLoaders. If you store those in a shared-lib jar then you most probably end up with the (in)famous “Cannot cast class Xxx to class Xxx”.

One common solution to this problem will look something like:

public class MyInfoStore {
  private final Map<ClassLoader, Set<Info>> infoMap = new ConcurrentHashMap<ClassLoader, Set<Info>>();

  public void storeInfo(Info info) {
    ClassLoader tccl = Thread.currentThread().getContextClassLoader(); // probably guarded for SecurityManager

    Set<Info> infos = infoMap.get(tccl);
    if (infos == null) {
      infos = new HashSet<Info>();
      infoMap.put(tccl, infos);
    }
    infos.add(info);
  }
...
}

Of course you must really be careful to clean up this Map during shutdown. In CDI Applications you can use the @Observes BeforeShutdown event to trigger the cleanup.
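
A minimal sketch of such a cleanup. Note that container lifecycle events like BeforeShutdown are delivered to CDI Extensions (registered via META-INF/services/javax.enterprise.inject.spi.Extension); the static MyInfoStore.clear(ClassLoader) accessor is made up for this example:

public class InfoStoreCleanupExtension implements javax.enterprise.inject.spi.Extension {

    // gets called once the container shuts the application down
    public void cleanup(@Observes javax.enterprise.inject.spi.BeforeShutdown beforeShutdown) {
        // drop everything stored for the ClassLoader that is going away
        MyInfoStore.clear(Thread.currentThread().getContextClassLoader());
    }
}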

The impact on us programmers

These various scenarios make it really hard to write any form of portable framework which runs fine inside of EARs. This is not only true for client frameworks but also for container frameworks like CDI and Spring themselves.

Integrating CDI containers in EARs

It is pretty obvious that – mostly due to the lack of a guaranteed default isolation scenario in JavaEE – CDI containers have a hard time finding a nice and portable handling of CDI beans and Extensions in EARs. I got involved in CDI in late 2008 when the name of the spec was still WebBeans. And that name was taken literally – it originally only targeted web applications and not EARs. EAR support only got added roughly (according to some interpretations) shortly before the EE-6 specification got published. So there are multiple reasons why CDI in EARs is not really a first class citizen yet.

To my knowledge there are 3 sane ways a container can integrate CDI in EARs. And of those 3 sane ways, 5 are used in various containers ;)

A.) 1 BeanManager per EAR

The whole EAR is handled via 1 BeanManager. This is the way JBoss WildFly seems to handle things. First the BeanManager gets created and all its CDI Extensions get loaded (TODO research: all or only the ones from the shared libs?).

In reality it’s a bit more complicated. Weld uses a single BeanManager for each and every JAR in your EAR. Don’t ask me why, but that’s what I’ve seen. Still you only get one set of Extensions for your whole EAR. Keep this in mind.

The TCCL always seems to stay the shared EAR ApplicationClassLoader during boot time, even while processing the classes of the WAR files. Thus the TCCL during @Observes ProcessAnnotatedType will be the shared EAR ApplicationClassLoader. But if you access your Extensions (or the information collected by them) later at runtime you might end up with a different TCCL. This is especially true for all servlet requests in your WARs. This makes it really hard to store anything with the usual Map<ClassLoader, Set<Info>> trick.

B.) 1 BeanManager per WAR + 1 ‘shared’ BeanManager

Each WAR fully boots up its own BeanManager. The shared EAR libs also get their own BeanManager to serve non-servlet requests like JMS and remote EJB invocations. Classes in shared EAR libs simply get discovered over and over again for each WAR. Each WAR also has its own set of Extensions (they are usually 1:1 to the BeanManager). This is what I’ve seen in Apache Geronimo, IBM WebSphere and early Apache TomEE containers.

This is a bit tricky when it comes to handling of the @ApplicationScoped context (see CDI-129 for more information) or if you handle EJBs (which are by nature shared across the EAR). WebSphere e.g. seems to solve this by having its own Bean<SomeSharedEarLibClass> instance for each BeanManager (and thus for each WAR), but they share a single ApplicationContext storage and those beans are equals(). Probably they just compare the PassivationCapable#getPassivationId()?

It usually works fairly nicely and it allows the usage of most common programming patterns. It also allows modifying classes from shared libs via Extensions you register in a single WAR. The downside is that you have to scan and process all the shared classes over and over again for each WAR. This obviously slows down the application boot time.

Strictly speaking, this mode also conflicts with the modularity requirements of section 5 of the CDI specification. Overall I’m not very happy with section 5, but it should be mentioned.

C.) Hierarchic BeanManagers

In this case we have a 1:1 relation between a ClassLoader and a BeanManager. The shared EAR libs will get scanned by an EAR-BeanManager. Then each of the WARs gets scanned via its own WebAppBeanManager. In contrast to scenario B. these WebAppBeanManagers will only scan the classes of the local WAR’s WEB-INF/lib and WEB-INF/classes and not the shared EAR libs over again. If you need information from shared EAR lib classes then the WebAppBeanManager simply delegates to its ‘parent’ EAR-BeanManager. Apache TomEE uses this mode since 1.7.x.

There are a few tricks the container vendor needs to apply in this case. E.g. propagating events and Bean lookups to the parent BeanManager, etc. Or suppressing sending CDI Extension events to the parent (we recently learned this the hard way – it is fixed in TomEE-1.7.2 which is to be released soon).

It also has an important impact on CDI Extension programmers: as your Extensions are 1:1 to the BeanManager, we now also have the ClassPath split up into different Extension instances. This works fine for ProcessAnnotatedType but can be tricky in some edge cases. E.g. the DeltaSpike MessageBundleExtension collected info about a certain ProducerBean and stored it in the Extension for later usage in @Observes AfterBeanDiscovery. Too bad that in my case this certain ProducerMethod was in a shared EAR lib and thus got picked up by the Extension instance of the EAR-BeanManager, but the ‘consumer’ (the interface annotated with @MessageBundle) is in one of the WARs. And the WebAppBeanManager of this WAR obviously is not the one scanning the ProducerMethod of the class in the shared EAR lib. Thus the Extension ran into a NullPointerException. This will be fixed in the upcoming Apache DeltaSpike-1.2.2 release.

The Impact on poor CDI Extension programmers

TODO: this is a living document. I’ll add more info and put it up for review.

Explaining Java Inner Classes

The Problem

Today I had a colleague asking me to help her find a NotSerializableException. The code was like the following:

@ApplicationScoped
public class MyBusinessWindowHelper {
  ...
  public Button.ClickListener createExcelExportClickListener(Table t, String fileName) {
    return new Button.ClickListener() {
      @Override
      public void buttonClick(Button.ClickEvent clickEvent) {
        // some action...
      }
    };
  }
}

The Button.ClickListener and Table are Serializable. The inner class also doesn’t store anything else which is not Serializable, so where is the problem?

People who found the answer in under 1 minute are probably among the top 1% of Java programmers. If you don’t know the answer yet (no shame, this is pretty hardcore stuff!) then read on.

The difference between a static and a non-static inner class

Static inner classes

Let’s first discuss static inner classes. They are quite the same as any other class which is lying around on your disk. In fact it makes almost no difference whether you have the class located inside another class or as a top level class. In terms of the generated bytecode there is really no difference.

Non-static inner classes

Non-static inner classes have access to the members of the outer class.

public class OuterClass
{
    private int meaningOfLife = 42;
    
    public class InnerClass {
        public int getMeaningOfLife() {
            return meaningOfLife;
        }
    }
}

As you can see the InnerClass can return the variable of the OuterClass. But how does that work? How does the inner class know about the outer class?

The answer is stunning but simple: You DON’T get what you see!

Let’s look at the bytecode which gets generated:

$> javac OuterClass.java
$> javap -c OuterClass\$InnerClass.class

will give us the following output:

Compiled from "OuterClass.java"
public class OuterClass$InnerClass {
  final OuterClass this$0;

  public OuterClass$InnerClass(OuterClass);
    Code:
       0: aload_0
       1: aload_1
       2: putfield      #1                  // Field this$0:LOuterClass;
       5: aload_0
       6: invokespecial #2                  // Method java/lang/Object."<init>":()V
       9: return

  public int getMeaningOfLife();
    Code:
       0: aload_0
       1: getfield      #1                  // Field this$0:LOuterClass;
       4: invokestatic  #3                  // Method OuterClass.access$000:(LOuterClass;)I
       7: ireturn
}

There are 2 things which pop out:

final OuterClass this$0;

and

public OuterClass$InnerClass(OuterClass);

But hey, where does this come from? We did not have any constructor with an OuterClass parameter!

The answer is stunning but simple: Java did that for us!

Java automatically generates a field holding the this pointer of the outer class in every non-static inner class. And each constructor also gets an additional parameter which is used to initialise this final field. Of course it is now possible for Java to use the generated outer class field (this$0) to access members of the outer class.
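
Written out as (pseudo) Java source, the generated inner class roughly corresponds to the following – it won’t compile as-is because of the $ names, it just shows the idea:

public class OuterClass$InnerClass {
    final OuterClass this$0;               // the captured outer instance

    public OuterClass$InnerClass(OuterClass outer) {
        this.this$0 = outer;
    }

    public int getMeaningOfLife() {
        // the private outer field is reached via the synthetic OuterClass.access$000 helper
        return OuterClass.access$000(this$0);
    }
}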

And what about anonymous classes?

Well, anonymous classes are just inner-classes without any specified name. Of course they will also get the this pointer as parameter.

And this is EXACTLY the reason why my colleague got the NotSerializableException: storing the ClickListener in a View and serialising it over to another cluster node also tried to serialise the outer class (MyBusinessWindowHelper).

Btw, this is not only an issue if you play catch and hide with NotSerializableExceptions but also if you are hunting down mem-leaks!
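
One way out of my colleague’s problem – sketched here under the assumption that the listener really only needs the Table and the file name – is to replace the anonymous class with a static nested class which holds exactly what it needs, so no hidden this$0 reference to the helper gets serialised:

public class MyBusinessWindowHelper {

  public Button.ClickListener createExcelExportClickListener(Table t, String fileName) {
    return new ExcelExportClickListener(t, fileName);
  }

  // static nested class -> no this$0 pointer to MyBusinessWindowHelper
  private static final class ExcelExportClickListener implements Button.ClickListener {
    private final Table table;
    private final String fileName;

    private ExcelExportClickListener(Table table, String fileName) {
      this.table = table;
      this.fileName = fileName;
    }

    @Override
    public void buttonClick(Button.ClickEvent clickEvent) {
      // some action...
    }
  }
}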

A few more thoughts

That also explains why non-static inner classes cannot get proxied (because they do not have a default constructor). Plus it explains why they cannot be used as CDI or EJB beans.

All that is pretty obvious if one knows how the system works internally, isn’t it?

The tricky CDI ServiceProvider pattern

CDI is really cool, and works great. But it’s not the hammer for all your problems!

I’ve seen the following a lot of times and I confess that I also used it in the first place: The CDI ServiceProvider pattern. But first let’s clarify what it is:

The ServiceProvider pattern

Consider you have a bunch of pluggable modules and all of them might add an implementation for a certain interface:

// this is just a marker interface for all my plugins
public interface MyPlugin{} 

The idea is that extensions can later easily implement this marker interface and the system will automatically pick up all the implementations accordingly.

public class PluginA implements MyPlugin {
  ...
}

public class PluginB implements MyPlugin {
  ...
}

The known way

I’ve seen lots of blogs recommending the following solution:

public static <T> List<T> getContextualReferences(Class<T> type, Annotation... qualifiers)
{
    BeanManager beanManager = getBeanManager();

    Set<Bean<?>> beans = beanManager.getBeans(type, qualifiers);

    List<T> result = new ArrayList<T>(beans.size());

    for (Bean<?> bean : beans)
    {
        CreationalContext<?> creationalContext = beanManager.createCreationalContext(bean);
        Object reference = beanManager.getReference(bean, type, creationalContext);
        result.add((T) reference);
    }
    return result;
}

The original implementation can be found in the Apache DeltaSpike BeanProvider (and in Apache MyFaces CODI before that).

This worked really well for some time. But suddenly a few surprises happened.

What’s the problem with getBeans()?

One day we got a bug report in Apache OpenWebBeans that OWB is broken because the pattern shown above doesn’t work: JIRA issue OWB-658

What we found out is that the whole pattern doesn’t work anymore once a single plugin is defined as @Alternative.

I dug into the issue and became aware that this is not an OWB issue – BeanManager#getBeans() is just not made for this usage! So let’s have a look at what the CDI spec says:

11.3.4:
The method BeanManager.getBeans() returns the set of beans which have the given required type and qualifiers and are available for injection

The important part here is the difference between “give me all beans of type X” and “give me all beans which can be used for an InjectionPoint of type X”. Those 2 are fundamentally different, because in the latter case all other beans are no longer candidates for injection once a single @Alternative annotated bean gets spotted. OWB did it just right!

Possible Solution 1

Don’t use @Alternative on any of your plugin classes ;)

That sounds a bit ridiculous, but at least it works.

Possible Solution 2

You could fire an event which collects all your plugins in a CDI-Extension-like way.
First we need a data object to store all the collected plugins.

public class PluginDetection {
  private List<MyPlugin> plugins = new ArrayList<MyPlugin>();

  public void addPlugin(MyPlugin plugin) {
    plugins.add(plugin);
  }

  public List<MyPlugin> getPlugins() {
    return plugins;
  }
}

Then we fire this event around:

private @Inject Event<PluginDetection> pdEvent;

public List<MyPlugin> detectPlugins() {
  PluginDetection pd = new PluginDetection();
  pdEvent.fire(pd);
  return pd.getPlugins();
}

The plugins just need to observe this event and register themselves:

public void register(@Observes PluginDetection pd) {
  pd.addPlugin(this);
}

Since the default for @Observes is Reception.ALWAYS, the contextual instance will automatically get created if it doesn’t yet exist.

But there is also a small issue with this approach: there is no way to disable/override a plugin with @Alternative anymore, since the event mechanism doesn’t do any bean resolving at all.

Using JPA in real projects (part 1)

Is JPA only good for samples?

This question is of course a bit provocative. But if you look at all the JPA samples out in the wild, then none of them can be applied to real world projects without fundamental changes.

This post tries to cover a few JPA aspects as well as showing off some maven-foo from a big real world project. I am personally using Apache OpenJPA because it works well and I’m a committer on the project (which means I can immediately fix bugs if I hit one). I will try to motivate my friends from JBoss to provide a parallel guide for Hibernate and maybe we even find some Glassfish/EclipseLink geek.

One of the most fundamental differences between the JPA providers is where they store the state information for the loaded entities. OpenJPA stores this info directly in the entities (EclipseLink as well afaik) and Hibernate stores it in the EntityManager and sometimes in the 1:n proxies used for lazy loading (if no weaving is used). All this is not defined in the spec but product specific behaviour. Please always keep this in mind when applying JPA techniques to another JPA provider.

I’ll have to split this article into 2 parts, otherwise it would be too much for a good read. Today’s part will focus on the general project setup, the 2nd one will cover some coding practices usable for JPA based projects.

The Project Infrastructure and Setup

A general note on my project structure: my project is not a sample but fairly big (40k users, 5 million page hits, 600++ JSF pages) and consists of 10++ WebApps, each of them having their own backend (JPA + db + businesslogic), frontend (JSF and backing beans) and api (remote APIs) JARs. Thus I have all my shared configuration in myprj/parent/fe, myprj/parent/be and myprj/parent/api maven modules, containing the pom.xml pointed to as <parent> by all backends, frontends and apis respectively.

├── parent
│   ├── api
│   ├── be (<- here I keep all my shared backend configuration) 
│   ├── fe
├── webapp1
│   ├── api
│   ├── be (referencing ../../parent/be/pom.xml)
│   └── fe
├── webapp2
│   ├── api
│   ├── be (referencing ../../parent/be/pom.xml)
│   └── fe
...

Backend Unit Test Setup

1. All my backend unit tests use TestNG and really do hit the database! A business process test which doesn’t touch the database is worth nothing imo…
We are using a local MySQL installation for the tests and an Apache Maven profile for switching to other databases like Oracle and PostgreSQL (which we both use in production).

2. We have a special TestNG test group called createData which we can depend on via @Test(dependsOnGroups="createData"). Or we just use @Test(dependsOnMethods="myTestMethodCreatingTheData").
That way all tests which create some pretty complex set of test data run first. All tests which need this data as base for their own work will run afterwards (see the sketch after this list).

3. Each test must be re-runnable and clean up its own mess in @BeforeClass. We use @BeforeClass because this also works if you kill your test in the debugger. Nice goodie: you can also check the produced data in the database later on. Too bad that there is no easy way to automatically prove this. The best bet is to make all your colleagues aware of it and tell them that they have to throw the next party if they introduce a broken or un-repeatable test ;)
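
A minimal sketch of the test ordering described in point 2 – the test method names are made up:

@Test(groups = "createData")
public void createBaseCustomers() {
    // creates the pretty complex set of test data in the database
}

@Test(dependsOnGroups = "createData")
public void chargeStorageForAllCustomers() {
    // only runs after all 'createData' tests have finished
}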

The Enhancement Question

I’ve outlined the details and pitfalls of JPA enhancement in a previous post.
I’m a big fan of build-time-enhancement because it a.) works nicely with OpenJPA and b.) my TestNG unit tests run much faster (because I only enhance those entities once). I also like the fact that I know exactly what will run on the server and my unit tests will hit side effects early on. In a big project you’ll hit enhancement and state side effects which make your app act differently in unit tests and on the EE server more often than you’d guess.
Of course, this might differ if you use another JPA provider.

For enabling build-time-enhancement with OpenJPA I have the following in my parent-be.pom.

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.openjpa</groupId>
                <artifactId>openjpa-maven-plugin</artifactId>
                <version>${openjpa.version}</version>
                <configuration>
                    <includes>
                        ${jpa-includes}
                    </includes>
                    <excludes>
                        ${jpa-excludes}
                    </excludes>
                    <addDefaultConstructor>true</addDefaultConstructor>
                    <enforcePropertyRestrictions>true</enforcePropertyRestrictions>
                    <sqlAction>${openjpa.sql.action}</sqlAction>
                    <sqlFile>${project.build.directory}/database.sql</sqlFile>
                    <connectionDriverName>com.mchange.v2.c3p0.ComboPooledDataSource</connectionDriverName>
                    <connectionProperties>
                        driverClass=${database.driver.name},
                        jdbcUrl=${database.connection.url},
                        user=${database.user},
                        password=${database.password},
                        minPoolSize=5,
                        acquireRetryAttempts=3,
                        maxPoolSize=20
                    </connectionProperties>
                </configuration>
                <executions>
                    <execution>
                        <id>mappingtool</id>
                        <phase>process-classes</phase>
                        <goals>
                            <goal>enhance</goal>
                        </goals>
                    </execution>
                </executions>
                <dependencies>
                    <dependency>
                        <groupId>log4j</groupId>
                        <artifactId>log4j</artifactId>
                        <version>1.2.12</version>
                    </dependency>
                    <dependency>
                        <!-- 
                          otherwise you get ClassNotFoundExceptions during 
                          the code coverage report run
                        -->
                        <groupId>net.sourceforge.cobertura</groupId>
                        <artifactId>cobertura</artifactId>
                        <version>1.9.2</version>
                    </dependency>
                    <dependency>
                        <groupId>c3p0</groupId>
                        <artifactId>c3p0</artifactId>
                        <version>${c3p0.version}</version>
                    </dependency>
                    <dependency>
                        <groupId>mysql</groupId>
                        <artifactId>mysql-connector-java</artifactId>
                        <version>${mysql-connector.version}</version>
                    </dependency>
                    <dependency>
                        <groupId>com.oracle</groupId>
                        <artifactId>ojdbc14</artifactId>
                        <version>${ojdbc.version}</version>
                    </dependency>
                    <dependency>
                        <groupId>postgresql</groupId>
                        <artifactId>postgresql</artifactId>
                        <version>${postrgresql-jdbc.version}</version>
                    </dependency>
                </dependencies>
            </plugin>
        </plugins>
    </build>

You might have spotted a few maven properties which I later define in each project’s pom. That way I can keep my common configuration generic and still have a way to tweak the behaviour for each sub-project. Again a nice benefit: you can easily use mvn -Dsomeproperty=anothervalue to tweak those settings on the command line.

  • ${jpa-includes} for defining the comma separated list of classes which should get enhanced, e.g. "mycomp/project/modulea/backend/*.class,mycomp/project/modulea/backend/otherstuff/*.class"
  • ${jpa-excludes} the opposite of jpa-includes
  • ${openjpa.sql.action} to define what should be done during DB schema creation. This can be build for always creating the whole DB schema (CREATE TABLE statements), or refresh for generating only ALTER TABLE statements for the changes. I’ll come back to this later.
  • ${database.driver.name} and the credentials properties are used to be able to run the schema creation against Oracle, MySQL and PostgreSQL (switched via maven profiles).

Creating the Database

For doing tests with a real database we of course need to create the schema first. We do NOT let JPA do any automatic database schema changes on JPA-startup. Doing so might unrecoverably trash your production database, so it’s always turned off!

Instead we trigger the SQL schema creation process manually via the Apache OpenJPA openjpa-maven-plugin (for the configuration see above):

$> mvn openjpa:sql

Then we check the generated SQL in target/database.sql and copy it to the structure we have in each of our backend projects:

webapp1/be/src/main/sql/
├── mysql
│   ├── createdb.sql
│   ├── createindex.sql
│   ├── database.sql
│   └── schema_delta.sql
├── oracle
│   ├── createdb.sql
│   ├── createindex.sql
│   ├── database.sql
│   └── schema_delta.sql
└── postgres
    ├── createdb.sql
    ├── createindex.sql
    ├── database.sql
    └── schema_delta.sql

The following files are involved in the db setup:

createdb.sql

This file creates the database itself. It is optional as not every database supports creating a whole database. In MySQL we just do the following:

DROP DATABASE if exists ProjextXDatabase;
CREATE DATABASE ProjextXDatabase CHARACTER SET utf8;
USE ProjextXDatabase;

In Oracle this is not that easy. It’s a major pain to drop and then set up a whole data store. A major problem is that you cannot easily access a datastore which doesn’t exist anymore via Oracle’s JDBC driver. Instead, we just drop all the tables:

DROP TABLE MyTable CASCADE constraints PURGE;
DROP TABLE AnotherTable CASCADE constraints PURGE;
...

If you have a better idea, then please speak up ;)

database.sql

This is the exact 1:1 DDL/schema file we generated via JPA (in my case via the openjpa-maven-plugin’s mvn openjpa:sql mentioned above). It is simply copied over from target/database.sql and the content remains unchanged. It runs after the createdb.sql file.

createindex.sql

This file contains the initial index tweaks which were not generated in the DDL. In Oracle and PostgreSQL this file e.g. contains all the indices on foreign keys, because OpenJPA doesn’t generate them (I remember that Hibernate does, correct?). In MySQL we don’t need those because MySQL automatically adds indices for foreign keys itself.

But this is of course a good place to add all the performance tuning stuff you ever wanted ;)

schema_delta.sql

This one is really a goldie! Once a project goes into production we do not generate full database schemas anymore! Instead we switch the openjpa-maven-plugin to the refresh mode. In this mode OpenJPA will compare the entities with the state of the configured database and only generate ALTER TABLE and similar statements for the changes into target/database.sql. This works surprisingly well!

We then review the generated schema changes and append the content to src/main/sql/[dbvendor]/schema_delta.sql. Of course we also add clean comments about the product revision in which the change got made. That way an administrator just picks the last n entries from this file and is easily able to bring the production database to the latest revision.

Doing this step manually is very important! From time to time there are changes (renaming a column for example) which cannot be handled by the generated DDL. Such changes or small migration updates need to be maintained manually.

How to create the DB for my tests?

This one is pretty easy if you know the trick: We just make use of the sql-maven-plugin.

Here is the configuration I use in my project:

    <profiles>
        <!-- Default profile for surefire with MySQL: creates database, imports testdata and runs all unit tests -->
        <profile>
            <id>default</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <build>
                <plugins>
                    <plugin>
                        <groupId>org.codehaus.mojo</groupId>
                        <artifactId>sql-maven-plugin</artifactId>
                        <configuration>
                            <driver>com.mysql.jdbc.Driver</driver>
                            <url>jdbc:mysql://localhost/</url>
                            <username>root</username>
                            <password/>
                            <escapeProcessing>false</escapeProcessing>

                            <srcFiles>
                                <srcFile>src/main/sql/mysql/createdb.sql</srcFile>
                                <srcFile>src/main/sql/mysql/database.sql</srcFile>
                                <srcFile>src/main/sql/mysql/schema_delta.sql</srcFile>
                                <srcFile>src/main/sql/mysql/createindex.sql</srcFile>
                                <srcFile>src/test/sql/mysql/testdata.sql</srcFile>
                            </srcFiles>
                        </configuration>

                        <executions>
                            <execution>
                                <id>setup-test-database</id>
                                <phase>process-test-resources</phase>
                                <goals>
                                    <goal>execute</goal>
                                </goals>
                            </execution>
                        </executions>

                        <dependencies>
                            <dependency>
                                <groupId>mysql</groupId>
                                <artifactId>mysql-connector-java</artifactId>
                                <version>${mysql-connector.version}</version>
                                <scope>runtime</scope>
                            </dependency>
                        </dependencies>
                    </plugin>
                </plugins>
            </build>
        </profile>

        <profile>
            <!-- that skips sql plugin and test!!! -->
            <id>skipSql</id>
        </profile>
        ...
        <!-- add profiles for Oracle and PostgreSQL accordingly -->
    </profiles>

Whenever you run your build, the database will be freshly set up in the process-test-resources phase. The database will then be exactly as in production!

Guess we are now basically ready to start hacking on our project!

The 2nd part will focus on how to handle JPA stuff in the application code. Stay tuned!

LieGrue,
strub
