Explaining Java Inner Classes

The Problem

Today a colleague asked me to help her find a NotSerializableException. The code looked like the following:

@ApplicationScoped
public class MyBusinessWindowHelper {
  ...
  public Button.ClickListener createExcelExportClickListener(Table t, String fileName) {
    return new Button.ClickListener() {
      @Override
      public void buttonClick(Button.ClickEvent clickEvent) {
        // some action...
      }
    };
  } 
}

Both Button.ClickListener and Table are Serializable, and the inner class doesn’t store anything else that isn’t Serializable either, so where is the problem?

People who find the answer in under a minute are probably among the top 1% of Java programmers. If you don’t know the answer yet (no shame, this is pretty hardcore stuff!) then read on.

The difference between a static and a non-static inner class

Static inner classes

Let’s first discuss static inner classes. They are pretty much the same as any other class lying around on your disk. In fact it makes almost no difference whether the class is located inside another class or is a top-level class; in terms of the generated bytecode there is really no difference.
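
Just for illustration, a static nested class looks like this and keeps no hidden reference to any outer instance:

public class OuterClass {
    public static class StaticInner {
        // behaves like a top-level class, just namespaced inside OuterClass;
        // no hidden reference to an OuterClass instance is stored here
    }
}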

Non-static inner classes

Non-static inner classes have access to the members of the outer class.

public class OuterClass
{
    private int meaningOfLife = 42;
    
    public class InnerClass {
        public int getMeaningOfLife() {
            return meaningOfLife;
        }
    }
}

As you can see, the InnerClass can return a private field of the OuterClass. But how does that work? How does the inner class know about the outer class?

The answer is stunning but simple: You DON’T get what you see!

Let’s look at the bytecode which gets generated:

$> javac OuterClass.java
$> javap -c OuterClass\$InnerClass.class

will give us the following output:

Compiled from "OuterClass.java"
public class OuterClass$InnerClass {
  final OuterClass this$0;

  public OuterClass$InnerClass(OuterClass);
    Code:
       0: aload_0
       1: aload_1
       2: putfield      #1                  // Field this$0:LOuterClass;
       5: aload_0
       6: invokespecial #2                  // Method java/lang/Object."<init>":()V
       9: return

  public int getMeaningOfLife();
    Code:
       0: aload_0
       1: getfield      #1                  // Field this$0:LOuterClass;
       4: invokestatic  #3                  // Method OuterClass.access$000:(LOuterClass;)I
       7: ireturn
}

There are 2 things which pop out:

final OuterClass this$0;

and

public OuterClass$InnerClass(OuterClass);

But hey, where does this come from? We did not have any constructor with an OuterClass parameter!

The answer is stunning but simple: Java did that for us!

Java automatically adds a field holding the this pointer of the outer class to every non-static inner class. Each constructor also gets an additional parameter which is used to initialise this final field. With the generated outer-class field (this$0) in place, the inner class can access members of the outer class.
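
To make this tangible, here is what the generated class would look like if you wrote it out as Java source. This is purely an illustration of the bytecode above; the access$000 accessor is synthetic and cannot be written in your own source code:

public class OuterClass$InnerClass {
    final OuterClass this$0;                   // the hidden reference to the outer instance

    public OuterClass$InnerClass(OuterClass outer) {
        this.this$0 = outer;                   // added to every constructor by javac
    }

    public int getMeaningOfLife() {
        // the private field of the outer class is read via the
        // synthetic accessor method javac generated on OuterClass
        return OuterClass.access$000(this$0);
    }
}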

And what about anonymous classes?

Well, anonymous classes are just inner classes without a specified name. Of course they also get the this pointer of the outer class as a constructor parameter.

And this is EXACTLY the reason why my colleague got the NotSerializableException: storing the ClickListener in a View and serialising it over to another cluster node also tried to serialise the outer class (MyBusinessWindowHelper).

Btw, this is not only an issue if you play catch and hide with NotSerializableExceptions but also if you are hunting down mem-leaks!
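
For completeness, here is a sketch of how the listener could be rewritten so that no hidden reference to the helper gets serialised (assuming the listener really only needs the Table and the file name):

public class MyBusinessWindowHelper {

  public Button.ClickListener createExcelExportClickListener(Table t, String fileName) {
    return new ExcelExportClickListener(t, fileName);
  }

  // a static nested class has no this$0 field, so serialising it
  // will not drag the MyBusinessWindowHelper along
  private static class ExcelExportClickListener implements Button.ClickListener {
    private final Table table;        // Serializable
    private final String fileName;    // Serializable

    private ExcelExportClickListener(Table table, String fileName) {
      this.table = table;
      this.fileName = fileName;
    }

    @Override
    public void buttonClick(Button.ClickEvent clickEvent) {
      // some action...
    }
  }
}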

A few more thoughts

That also explains why non-static inner classes cannot get proxied (because they do not have a default constructor). Plus it explains why they cannot be used as CDI or EJB beans.

All that is pretty obvious if one knows how the system works internally, isn’t it?

The tricky CDI ServiceProvider pattern

CDI is really cool, and works great. But it’s not the hammer for all your problems!

I’ve seen the following a lot of times and I confess that I also used it in the first place: The CDI ServiceProvider pattern. But first let’s clarify what it is:

The ServiceProvider pattern

Consider you have a bunch of pluggable modules and all of them might add an implementation for a certain interface:

// this is just a marker interface for all my plugins
public interface MyPlugin{} 

The idea is that extensions can later easily implement this marker interface and the system will automatically pick up all the implementations accordingly.

public class PluginA implements MyPlugin {
  ...
}

public class PluginB implements MyPlugin {
  ...
}

The known way

I’ve seen lots of blogs recommending the following solution:

public static <T> List<T> getContextualReferences(Class<T> type, Annotation... qualifiers)
{
    BeanManager beanManager = getBeanManager();

    Set<Bean<?>> beans = beanManager.getBeans(type, qualifiers);

    List<T> result = new ArrayList<T>(beans.size());

    for (Bean<?> bean : beans)
    {
        CreationalContext<?> creationalContext = beanManager.createCreationalContext(bean);
        result.add((T) beanManager.getReference(bean, type, creationalContext));
    }
    return result;
}

The original implementation can be found in the Apache DeltaSpike BeanProvider (and in Apache MyFaces CODI before that).

This worked really well for some time. But suddenly a few surprises happened.

What’s the problem with getBeans()?

One day we got a bug report in Apache OpenWebBeans that OWB is broken because the pattern shown above doesn’t work: JIRA issue OWB-658

What we found out is that the whole pattern stops working once a single plugin is defined as @Alternative.

I dug into the issue and became aware that this is not an OWB issue: BeanManager#getBeans() is just not made for this usage! So let’s have a look at what the CDI spec says:

11.3.4:
The method BeanManager.getBeans() returns the set of beans which have the given required type and qualifiers and are available for injection

The important part here is the difference between “give me all beans of type X” and “give me all beans which can be used for an InjectionPoint of type X”. Those two are fundamentally different, because in the latter case all other beans are no longer candidates for injection once a single @Alternative annotated bean gets spotted. OWB did it just right!

Possible Solution 1

Don’t use @Alternative on any of your plugin classes ;)

That sounds a bit ridiculous, but at least it works.

Possible Solution 2

You could fire a CDI event which collects all your plugins in a CDI-Extension-like way.
First we need a data object to store all the collected plugins.

public class PluginDetection {
  private List<MyPlugin> plugins = new ArrayList<MyPlugin>();

  public void addPlugin(MyPlugin plugin) {
    plugins.add(plugin);
  }

  public List<MyPlugin> getPlugins() {
    return plugins;
  }
}

Then we gonna fire this around:

private @Inject Event<PluginDetection> pdEvent;

public List<MyPlugin> detectPlugins() {
  PluginDetection pd = new PluginDetection();
  pdEvent.fire(pd);
  return pd.getPlugins();
}

The plugins just need to observe this event and register themselves:

public void register(@Observes PluginDetection pd) {
  pd.addPlugin(this);
}

Since the default for @Observes is Reception.ALWAYS, the contextual instance will automatically get created if it doesn’t yet exist.

But there is also a small issue with this approach: There is no way to disable/override a plugin with @Alternative anymore, since the messaging doesn’t do any bean resolving at all.

Using JPA in real projects (part 1)

Is JPA only good for samples?

This question is of course a bit provocative. But if you look at all the JPA samples out in the wild, then none of them can be applied to real world projects without fundamental changes.

This post tries to cover a few JPA aspects as well as showing off some maven-foo from a big real world project. I am personally using Apache OpenJPA because it works well and I’m a committer on the project (which means I can immediately fix bugs if I hit one). I will try to motivate my friends from JBoss to provide a parallel guide for Hibernate and maybe we even find some Glassfish/EclipseLink geek.

One of the most fundamental differences between the different JPA providers is where they store the state information for the loaded entities. OpenJPA stores this info directly in the entities (EclipseLink as well afaik) and Hibernate stores it in the EntityManager and sometimes in the 1:n proxies used for lazy loading (if no weaving is used). All this is not defined in the spec but is product-specific behaviour. Please always keep this in mind when applying JPA techniques to another JPA provider.

I’ll have to split this article into 2 parts, otherwise it would be too much for a good read. Today’s part will focus on the general project setup, the 2nd one will cover some coding practices usable for JPA based projects.

The Project Infrastructure and Setup

A general note on my project structure: my project is not a sample but fairly big (40k users, 5 mio page hits, 600++ JSF pages) and consists of 10++ WebApps, each of them having their own backend (JPA + db + businesslogic), frontend (JSF and backing beans) and api (remote APIs) JARs. Thus I have all my shared configuration in myprj/parent/fe, myprj/parent/be and myprj/parent/api maven modules, containing the pom.xml pointed to as <parent> by all backends, frontends and apis respectively.

├── parent
│   ├── api
│   ├── be (<- here I keep all my shared backend configuration) 
│   ├── fe
├── webapp1
│   ├── api
│   ├── be (referencing ../../parent/be/pom.xml)
│   └── fe
├── webapp2
│   ├── api
│   ├── be (referencing ../../parent/be/pom.xml)
│   └── fe
...

Backend Unit Test Setup

1. All my backend unit tests use testng and really do hit the database! A business process test which doesn’t touch the database is worth nothing imo…
We are using a local MySQL installation for the tests and use an Apache Maven Profile for switching to other databases like Oracle and PostgreSQL (which we both use in production).

2. We have a special testng test-group called createData which other tests can depend on via @Test(dependsOnGroups="createData"). Or we just use @Test(dependsOnMethods="myTestMethodCreatingTheData").
That way we have all tests which create some pretty complex set of test-data running first. All tests which need this data as base for their own work will run afterwards.

3. Each test must be re-runnable and clean up its own mess in @BeforeClass. We use @BeforeClass because this also works if you kill your test in the debugger. Nice goodie: you can also check the produced data in the database later on. Too bad that there is no easy way to automatically prove this. The best bet is to make all your colleagues aware of it and tell them that they have to throw the next party if they introduce a broken or un-repeatable test ;) A minimal sketch of points 2 and 3 follows below.
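
Here is that sketch, with made-up class and method names purely for illustration:

import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

public class CustomerProcessTest {

    @BeforeClass
    public void cleanupLeftovers() {
        // remove whatever a previous (possibly killed) run left behind,
        // so the test is re-runnable and the produced data can still be
        // inspected in the database after the run
        deleteTestCustomersCreatedBy("CustomerProcessTest");
    }

    @Test(groups = "createData")
    public void createBaseCustomer() {
        // creates the complex test data other tests build upon
    }

    @Test(dependsOnGroups = "createData")
    public void runBusinessProcessOnCustomer() {
        // works with the data created above
    }

    private void deleteTestCustomersCreatedBy(String marker) {
        // plain JDBC or JPA cleanup goes here
    }
}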

The Enhancement Question

I’ve outlined the details and pitfalls of JPA enhancement in a previous post.
I’m a big fan of build-time-enhancement because it a.) works nicely with OpenJPA and b.) my testng unit tests run much faster (because I only enhance those entities once). I also like the fact that I know exactly what will run on the server and my unit tests will hit side effects early on. In a big project you’ll hit enhancement and state side effects which make your app act differently in unit tests and on the EE server more often than you’d guess.
Of course, this might differ if you use another JPA provider.

For enabling build-time-enhancement with OpenJPA I have the following in my parent-be.pom.

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.openjpa</groupId>
                <artifactId>openjpa-maven-plugin</artifactId>
                <version>${openjpa.version}</version>
                <configuration>
                    <includes>
                        ${jpa-includes}
                    </includes>
                    <excludes>
                        ${jpa-excludes}
                    </excludes>
                    <addDefaultConstructor>true</addDefaultConstructor>
                    <enforcePropertyRestrictions>true</enforcePropertyRestrictions>
                    <sqlAction>${openjpa.sql.action}</sqlAction>
                    <sqlFile>${project.build.directory}/database.sql</sqlFile>
                    <connectionDriverName>com.mchange.v2.c3p0.ComboPooledDataSource</connectionDriverName>
                    <connectionProperties>
                        driverClass=${database.driver.name},
                        jdbcUrl=${database.connection.url},
                        user=${database.user},
                        password=${database.password},
                        minPoolSize=5,
                        acquireRetryAttempts=3,
                        maxPoolSize=20
                    </connectionProperties>
                </configuration>
                <executions>
                    <execution>
                        <id>mappingtool</id>
                        <phase>process-classes</phase>
                        <goals>
                            <goal>enhance</goal>
                        </goals>
                    </execution>
                </executions>
                <dependencies>
                    <dependency>
                        <groupId>log4j</groupId>
                        <artifactId>log4j</artifactId>
                        <version>1.2.12</version>
                    </dependency>
                    <dependency>
                        <!-- 
                          otherwise you get ClassNotFoundExceptions during 
                          the code coverage report run
                        -->
                        <groupId>net.sourceforge.cobertura</groupId>
                        <artifactId>cobertura</artifactId>
                        <version>1.9.2</version>
                    </dependency>
                    <dependency>
                        <groupId>c3p0</groupId>
                        <artifactId>c3p0</artifactId>
                        <version>${c3p0.version}</version>
                    </dependency>
                    <dependency>
                        <groupId>mysql</groupId>
                        <artifactId>mysql-connector-java</artifactId>
                        <version>${mysql-connector.version}</version>
                    </dependency>
                    <dependency>
                        <groupId>com.oracle</groupId>
                        <artifactId>ojdbc14</artifactId>
                        <version>${ojdbc.version}</version>
                    </dependency>
                    <dependency>
                        <groupId>postgresql</groupId>
                        <artifactId>postgresql</artifactId>
                        <version>${postrgresql-jdbc.version}</version>
                    </dependency>
                </dependencies>
            </plugin>
        </plugins>
    </build>

You might have spotted a few Maven properties which I later define in each project’s pom. That way I can keep my common configuration generic and still have a way to tweak the behaviour for each sub-project. Again a nice benefit: you can easily use mvn -Dsomeproperty=anothervalue to tweak those settings on the command line.

  • ${jpa-includes} for defining the comma separated list of classes which should get enhanced, e.g. "mycomp/project/modulea/backend/*.class,mycomp/project/modulea/backend/otherstuff/*.class"
  • ${jpa-excludes} the opposite of jpa-includes
  • ${openjpa.sql.action} to define what should be done during DB schema creation. This can be build to always create the whole DB schema (CREATE TABLE statements), or refresh to generate only ALTER TABLE statements for the changes. I’ll come back to this later.
  • ${database.driver.name} and credentials properties are used to be able to run the schema creation against Oracle, MySQL and PostgreSQL (switched via maven profiles).

Creating the Database

For doing tests with a real database we of course need to create the schema first. We do NOT let JPA do any automatic database schema changes on JPA-startup. Doing so might unrecoverably trash your production database, so it’s always turned off!

Instead we trigger the SQL schema creation process manually via the Apache OpenJPA openjpa-maven-plugin (for the configuration see above):

$> mvn openjpa:sql

Then we check the generated SQL in target/database.sql and copy it to the structure we have in each of our backend projects:

webapp1/be/src/main/sql/
├── mysql
│   ├── createdb.sql
│   ├── createindex.sql
│   ├── database.sql
│   └── schema_delta.sql
├── oracle
│   ├── createdb.sql
│   ├── createindex.sql
│   ├── database.sql
│   └── schema_delta.sql
└── postgres
    ├── createdb.sql
    ├── createindex.sql
    ├── database.sql
    └── schema_delta.sql

The following files are involved in the db setup:

createdb.sql

This file creates the database itself. It is optional, as not every database supports creating a whole database via SQL. In MySQL we just do the following:

DROP DATABASE IF EXISTS ProjextXDatabase;
CREATE DATABASE ProjextXDatabase CHARACTER SET utf8;
USE ProjextXDatabase;

In Oracle this is not that easy. It’s a major pain to drop and then set up a whole data store. A major problem is that you cannot easily access a data store which doesn’t exist anymore via Oracle’s JDBC driver. Instead, we just drop all the tables:

DROP TABLE MyTable CASCADE constraints PURGE;
DROP TABLE AnotherTable CASCADE constraints PURGE;
...

If you have a better idea, then please speak up ;)

database.sql

This is the exact 1:1 DDL/Schema file we generated via JPA (in my case via the openjpa-maven-plugin’s mvn openjpa:sql mentioned above). It is simply copied over from target/database.sql but the content remains unchanged. It runs after the createdb.sql file.

createindex.sql

This file contains the initial index tweaks which were not generated in the DDL. In Oracle and PostgreSQL this file e.g. contains all the indices on foreign keys, because OpenJPA doesn’t generate them (I remember that Hibernate does, correct?). In MySQL we don’t need those because MySQL automatically adds indices for foreign keys itself.

But this is of course a good place to add all the performance tuning stuff you ever wanted ;)

schema_delta.sql

This one is really a goldie! Once a project goes into production we do not generate full database schemas anymore! Instead we switch the openjpa-maven-plugin to the refresh mode. In this mode OpenJPA will compare the entities with the state of the configured database and only generate ALTER TABLE and similar statements for the changes in target/database.sql. This works surprisingly well!

We then review the generated schema changes and append the content to src/main/sql/[dbvendor]/schema_delta.sql. Of course we also add clean comments about the product revision in which the change got made. That way an administrator just picks the n last entries from this file and is easily able to bring the production database to the last revision.

Doing this step manually is very important! From time to time there are changes (renaming a column for example) which cannot be handled by the generated DDL. Such changes or small migration updates need to be maintained manually.

How to create the DB for my tests?

This one is pretty easy if you know the trick: We just make use of the sql-maven-plugin.

Here is the configuration I use in my project:

    <profiles>
        <!-- Default profile for surefire with MySQL: creates database, imports testdata and runs all unit tests -->
        <profile>
            <id>default</id>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
            <build>
                <plugins>
                    <plugin>
                        <groupId>org.codehaus.mojo</groupId>
                        <artifactId>sql-maven-plugin</artifactId>
                        <configuration>
                            <driver>com.mysql.jdbc.Driver</driver>
                            <url>jdbc:mysql://localhost/</url>
                            <username>root</username>
                            <password/>
                            <escapeProcessing>false</escapeProcessing>

                            <srcFiles>
                                <srcFile>src/main/sql/mysql/createdb.sql</srcFile>
                                <srcFile>src/main/sql/mysql/database.sql</srcFile>
                                <srcFile>src/main/sql/mysql/schema_delta.sql</srcFile>
                                <srcFile>src/main/sql/mysql/createindex.sql</srcFile>
                                <srcFile>src/test/sql/mysql/testdata.sql</srcFile>
                            </srcFiles>
                        </configuration>

                        <executions>
                            <execution>
                                <id>setup-test-database</id>
                                <phase>process-test-resources</phase>
                                <goals>
                                    <goal>execute</goal>
                                </goals>
                            </execution>
                        </executions>

                        <dependencies>
                            <dependency>
                                <groupId>mysql</groupId>
                                <artifactId>mysql-connector-java</artifactId>
                                <version>${mysql-connector.version}</version>
                                <scope>runtime</scope>
                            </dependency>
                        </dependencies>
                    </plugin>
                </plugins>
            </build>
        </profile>

        <profile>
            <!-- that skips sql plugin and test!!! -->
            <id>skipSql</id>
        </profile>
        ...
        <!-- add profiles for Oracle and PostgreSQL accordingly -->
    </profiles>

Whenever you run your build, the database will be freshly set up in the process-test-resources phase. The database will then be exactly as in production!

Guess we are now basically ready to start hacking on our project!

The 2nd part will focus on how to handle JPA stuff in the application code. Stay tuned!

LieGrue,
strub

Is there a way to fix the JPA EntityManager?

Using JPA is easy for small projects but has well hidden problems which are caused by some very basic design decisions. Quite a few of them are caused by the fact that the EntityManager cannot be made Serializable. Although there are JPA providers which claim that their EntityManager is Serializable (Hibernate), it effectively isn’t!

Is the EntityManager Serializable?

The LazyInitializationException is a pretty bad beast if you ever worked with EJB managed EntityManagers. That problem caused lots of people to discover alternative ways. Two of the most prominent are JBoss Seam2 if you are working with the JBoss stack and Apache MyFaces Orchestra for Spring applications.

The basic problems are summed up very well in the still largely correct Apache MyFaces Orchestra documentation:
Apache MyFaces Orchestra Persistence explanation

If you read through the whole page, you will see the TODOs at the very bottom of the page:

TODO: is the persistence-context serializable? Are all persistent objects in the context always serializable?

The simple answer is: NO not at all! Neither the EntityManager nor the state in the entities are Serializable as per the current JPA specification!

Why is the EntityManager not Serializable?

There are a few reasons:

1. Pessimistic Locking

The biggest blocker first: JPA doesn’t only support Optimistic Locking but also Pessimistic Locking. You can declare this in your persistence.xml or request it programmatically via the LockModeType parameter of many EntityManager methods.

EntityManager#find(java.lang.Class entityClass, java.lang.Object primaryKey, LockModeType lockMode) 
EntityManager#lock(java.lang.Object entity, LockModeType lockMode) 
...

But if you ever use pessimistic locking (a real hard lock on the database) the connection is bound to the database and cannot be ‘transferred’ to another EntityManager without losing the lock.
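
As a small example (standard JPA 2.0 API), the following call acquires a database lock which is tied to the current connection and therefore cannot survive serialisation:

// from here on the row is locked on the underlying JDBC connection;
// there is no way to ship that lock over to another EntityManager instance
MyEntity entity = em.find(MyEntity.class, someId, LockModeType.PESSIMISTIC_WRITE);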

2. Id and Version fields are optional

To use the optimistic locking approach, a primary key plus some ‘version’ field must be used in the entity:

 UPDATE tableX SET [somevalues], version = :oldversion + 1 WHERE id = :myId AND version = :oldversion

Obviously this update can only succeed once. Trying to update the row a second time will not find any matching database entry because the version = :oldversion condition will not be true anymore.

When you use optimistic locking in JPA you will always have such a ‘version’ column already. But the spec does not yet force you to map it in the entity! Thus this information will not be transported if you serialize the entity!

To fully support optimistic locking, those entities will need mandatory @Id and @Version columns.
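
A minimal sketch of an entity which carries both pieces of information and therefore keeps its optimistic-lock state when serialised (field names are of course up to you):

@Entity
public class CustomerEntity implements Serializable {

    @Id
    @GeneratedValue
    private Long id;

    @Version
    private Integer optLock;   // bumped by the JPA provider on every update

    // business fields plus getters and setters ...
}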

3. Losing the entity state information

As outlined in a previous blog post every JPA entity will get ‘enhanced’ with some magic code which tracks _loaded and _dirty state information. Those BitFlags will track the parts of the entity which got changed or fetched lazily.

The problem in this area is mostly caused by the JPA spec which by default prevents the JPA providers from serializing the ‘enhanced entities’ but requires serializing the ‘native’ information. At least that seems to be the common understanding of the following paragraph in the JPA spec:

“Serializing entities and merging those entities back into a persistence context may not be interoperable across vendors when lazy properties or fields and/or relationships are used.
A vendor is required to support the serialization and subsequent deserialization and merging of detached entity instances (which may contain lazy properties or fields and/or relationships that have not been fetched) back into a separate JVM instance of that vendor’s runtime, where both runtime instances have access to the entity classes and any required vendor persistence implementation classes.”

Of course, most JPA providers know a way to enable the serialization of the state fields. In OpenJPA just provide the following magic properties to your persistence.xml:

<property name="openjpa.DetachState" value="loaded(DetachedStateField=true)"/>
<property name="openjpa.Compatibility" value="IgnoreDetachedStateFieldForProxySerialization=true"/>

This will also serialize the _loaded and _dirty BitFlags along with your entity.

The problem with having the EntityManager not Serializable

Well, this one is easy:

  • You cannot store the EntityManager in a Conversation
  • You cannot store the EntityManager in a Session
  • You cannot store the EntityManager in a JSF View
  • No clustering, because Clustering means that you need to Serialize the state

What can you do today?

Today the only working solution is the entitymanager-per-request pattern: basically creating a @RequestScoped EntityManager, e.g. via a CDI @Produces method, for each and every request. That also means that you need to manually merge detached entities back on the callback; if you use JSF, that is in your action method.
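
A sketch of such an entitymanager-per-request producer (the persistence unit name is an assumption, adjust it to your persistence.xml):

@ApplicationScoped
public class EntityManagerProducer {

    @PersistenceUnit(unitName = "myPersistenceUnit")
    private EntityManagerFactory emf;

    @Produces
    @RequestScoped
    public EntityManager createEntityManager() {
        return emf.createEntityManager();
    }

    public void closeEntityManager(@Disposes EntityManager em) {
        // invoked when the request context gets destroyed
        if (em.isOpen()) {
            em.close();
        }
    }
}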

 

How to fix the JPA EntityManager in the future?

Here are my thoughts about how we can do better in the future. Please note that there is a project called Avaje eBean which is not JPA compliant but has already successfully implemented those ideas.

Provide an OptimisticEntityManager

public interface OptimisticEntityManager extends EntityManager, Serializable

The most important change here is that it implements the java.io.Serializable interface.
This OptimisticEntityManager should throw a NonOptimisticModeException whenever one tries to execute an operation which requires a non-optimistic LockModeType or which otherwise creates a lock or other non-serializable state.

There should be a way to explicitly request an OptimisticEntityManager, e.g. via

OptimisticEntityManager EntityManagerFactory#createOptimisticEntityManager();

Make @Id and @Version mandatory for those Entities

This will solve the problem with losing the optimistic lock information when serializing.

Define _loaded and _dirty Serialization

The future JPA spec could either clarify that serialization is more important than JPA-vendor inter-compatibility (who uses 2 different JPA providers in the same environment anyway?).
Or just specify that 2 BitFlags can be passed in the Serialized entity and how they should behave.

Please tell me what you think. Did we miss something? It’s not an easy move, but so far I think it is doable!

PS: Thanks to Shane Bryzak and Jason Porter for helping me get rid of the worst English grammar and wording issues at least. Hope you folks get the gist regardless of my bad English ;)

Unit Testing Strategies for CDI based projects

Even after 2 years of CDI going public there are still frequent questions about how to unit test CDI based applications. Well, here it goes.

1. Arquillian, the heavyweight champion

Arquillian is the core part of the EE testing effort of JBoss. The idea is to write a unit test exactly once and run it on the target platform via some container integration connectors.

What do Arquillian tests look like?

I will not give a detailed introduction here, but I’d like to mention that an Arquillian test will usually contain a @Deployment section which packages all the classes under test into a temporary Java archive (JAR, WAR, EAR, etc.) and uses this temporary artifact to run the tests.

There are basically 2 ways to run Arquillian tests. I’ll not go into details but only give a very rough (and probably not too exact) overview:

Remote/Managed style Arquillian tests

In this case the Arquillian tests will be started in an external Java VM, e.g. a running JBossAS7 or GlassFish-3.2. If you start your unit test locally, it will get ‘deployed’ to the other VM and executed over there. You can still debug through it as if it were running locally, which works via the Java debugging and profiling API. Some container connectors also allow running locally ‘as-client’.

The Good: this is running exactly on the target platform you do your real stuff on.
The Bad: Slower than native unit tests. No full control over the lifecycle, etc. You e.g. cannot simulate multiple servlet requests in one unit test.

Embedded style Arquillian tests

Instead of deploying your Arquillian test to another VM you are starting a standalone CDI or embedded EE container in the current Java VM.

The Good: Much faster than going remote. And you don’t need any server running.
The Bad: Sometimes this leads to ClassLoader isolation problems between your @Deployment and your unit test. This causes trouble when you e.g. want to unit test classes which start an EntityManager. The reason is that any persistence.xml will get picked up twice: once from your regular classpath and a second time from the @Deployment.

You can read more about it in the JBoss Arquillian documentation (thanks to Aslak Knutsen for the links).

2. DeltaSpike CdiControl, the featherweight champion

The method to unit test CDI based programs I like to introduce next uses the Apache DeltaSpike CdiControl API introduced in my previous post. It is based on the ideas and experience I had with the Apache OpenWebBeans CdiTest module, which I created and have heavily used for 3 years.

First of all, you will need a META-INF/beans.xml in your test class path. All your unit tests will at the end become CDI enabled classes! If you use maven, then just create an empty file.

$> touch src/test/resources/META-INF/beans.xml

Instead of creating a @Deployment like Arquillian does, we now just take all the test dependencies and scan them. Of course this means that starting the CDI container for a unit test will always scan all the classes in JARs which have a META-INF/beans.xml marker file. But on the other hand we spare creating the @Deployment and always test the ‘real thing’.

Introducing the test base class

The first step is to create a base-class for all your unit tests. I will break down the full class into distinct blocks and explain what happens. I’m using testng in this case, but jUnit tests will look pretty similar.

The main thing we need is the DeltaSpike CdiContainer. CDI containers don’t like running concurrently within the same ClassLoader. Thus we keep our CdiContainer as a static member variable and share it across parallel tests.

public abstract class CdiContainerTest {
    protected static CdiContainer cdiContainer;
    protected static int containerRefCount = 0;

Now we also need to initialize and use it for each method under test. This is also a good place to set the ProjectStage the test should execute in.
After boot()-ing the CdiContainer we also start all the contexts.
If the container is already started, we just clean the contextual instances for the current thread.

    @BeforeMethod
    public final void setUp() throws Exception {
        containerRefCount++;

        if (cdiContainer == null) {
            ProjectStageProducer.setProjectStage(ProjectStage.UnitTest);

            cdiContainer = CdiContainerLoader.getCdiContainer();
            cdiContainer.boot();
            cdiContainer.getContextControl().startContexts();
        }
        else {
            // clean the Instances by restarting the contexts
            cdiContainer.getContextControl().stopContexts();
            cdiContainer.getContextControl().startContexts();
        }
    }

We also do proper cleanup after each method finishes. We especially clean up all @RequestScoped beans to ensure that e.g. disposal methods for @RequestScoped EntityManagers get invoked.

    
    @AfterMethod
    public final void tearDown() throws Exception {
        if (cdiContainer != null) {
            cdiContainer.getContextControl().stopContext(RequestScoped.class);
            cdiContainer.getContextControl().startContext(RequestScoped.class);
            containerRefCount--;
        }
    }

Another little trick: in the @BeforeClass we ensure that the container is booted and then do some CDI magic. We get the InjectionTarget and fill all InjectionPoints of our very unit test subclass via the inject() method.
This allows us to use @Inject in our unit test classes which extend our CdiContainerTest.

    @BeforeClass
    public final void beforeClass() throws Exception {
        setUp();
        cdiContainer.getContextControl().stopContext(RequestScoped.class);
        cdiContainer.getContextControl().startContext(RequestScoped.class);

        // perform injection into the very own test class
        BeanManager beanManager = cdiContainer.getBeanManager();

        CreationalContext creationalContext = beanManager.createCreationalContext(null);

        AnnotatedType annotatedType = beanManager.createAnnotatedType(this.getClass());
        InjectionTarget injectionTarget = beanManager.createInjectionTarget(annotatedType);
        injectionTarget.inject(this, creationalContext);
    }

After the suite is finished we need to properly shutdown() the container.

    @AfterSuite
    public synchronized void shutdownContainer() throws Exception {
        if (cdiContainer != null) {
            cdiContainer.shutdown();
            cdiContainer = null;
        }
    }

}

You can see the full class in my lightweightEE sample (work in progress):
CdiContainerTest.java

The usage

Just write your unit test class and extend the CdiContainerTest class.

public class CustomerServiceTest extends CdiContainerTest
{

You can even use injection in your unit tests, because the injection points get resolved in the @BeforeClass method of the base class.

    private @Inject CustomerService custSvc;
    private @Inject UserService usrSvc;
...

And just use the injected resources in your test

    @Test(groups = "createData")
    public void testCustomerCreation() throws Exception {
        Customer cust1 = new Customer();
        ...
        custSvc.createCustomer(cust1);
    }

Btw, to use the latest Apache DeltaSpike snapshots you might need to configure the following maven repository in your project or ~/.m2/settings.xml :

  <repositories>
    <repository>
      <id>people.apache.snapshots</id>
      <url>https://repository.apache.org/content/repositories/snapshots/</url>
      <releases>
        <enabled>false</enabled>
      </releases>
      <snapshots>
        <enabled>true</enabled>
      </snapshots>
    </repository>
  </repositories>

Limits

The CdiContainer based unit tests work really fine if your project under test does not rely on EE container resources, like JNDI or JMS. It might work for e.g. EJB if you use Apache OpenWebBeans plus Apache OpenEJB or JBoss Weld plus JBoss microAS.
I would need to test this out to give more detailed infos.
As a general rule of thumb: once you need heavyweight EE resources in your backend, you should go the Arquillian route.

If you only use JNDI for configuring a JPA DataSource then please look at the documentation of the Apache MyFaces CODI ConfigurableDataSource. This will give you much more flexibility and removes the need for the almost-always-broken JNDI configuration in your persistence.xml. Instead you can use plain CDI mechanics to configure your database settings.

Summary

If you like to do more end-to-end or integration testing, then Arquillian is worth looking at. If you just need a quick unit test for your CDI based project, then take a look at my CdiContainerTest base class and grab the ideas you like. And don’t forget to provide feedback ;)

Why is OpenWebBeans so fast?

The Speed-King

Apache OpenWebBeans [1] is considered the Speed-King of dependency injection containers. A blog post by Gerhard Petracek shows this pretty nicely [2][3]. But why is that? During performance tests for the big EE application I have been developing with colleagues since late 2009 (40k users, 5 mio pages/day) we came across a few critical points.

1. Static vs Dynamic

The CDI specification requires scanning all the classes which are in locations that have a META-INF/beans.xml marker file and storing them in a big list of Bean instances. This information only gets gathered at startup, checked for inconsistencies and further optimized. At runtime all the important information is immediately available without any further processing.

This is afaik different in Spring, which allows some kind of ‘dynamic’ reconfiguration: you can switch some parts of the bean configuration programmatically at runtime. This might be neat for some usecases (e.g. scripting integration) but definitely requires the container to do much more work at runtime. It also prevents applying some kinds of very aggressive caching.

2. Reduce String Operations

In dependency injection containers you often store evaluation results in Maps. We pretty often have constellations similar to the following example:

java.lang.reflect.Method met = getInjectionMethod();
Class clazz = getInjectionClass();
String key = clazz.getName() + "/" + met.getName() + parametersString(met.getParameterTypes());
Object o = map.get(key);

This obviously is nothing I would ever write into a piece of performance-intensive code! The first step is to use a StringBuilder instead.

Using StringBuilder

StringBuilder sb = new StringBuilder();
sb.append(clazz.getName()).append('/').append(met.getName()).append('/');
appendParameterTypes(sb, met.getParameterTypes());
String key = sb.toString();

This is now twice as fast. But still way too slow for us ;)

Increase StringBuilder capacity

If I remember the sources correctly, StringBuilder by default has an internal buffer capacity of 16 chars which gets doubled every time the buffer overflows. In our example above this usually leads to expanding the internal capacity 3 or 4 times (depending on the length of the parameters, etc). Every time the capacity gets extended, the internal String handling must perform an alloc + memcpy(newbuffer, oldbuffer) + free(oldbuffer). We can reduce this by allocating more space right away:

StringBuilder sb = new StringBuilder(200);
sb.append(clazz.getName()).append('/').append(met.getName()).append('/');
...

This already helps a lot: the code now runs 3x as fast as with the default StringBuilder(). The downside is that you now allocate at least 200 chars for each key. And we are still far from what’s possible performance-wise.

Hash based solution

Another solution looks like the following:

long key = clazz.getName().hashCode() + 29L * met.getName().hashCode();
for (Class<?> t : met.getParameterTypes()) {
  key += 29L * t.hashCode();
}
Object o = map.get(key);

This performs much faster than any String based solution. Up to 50 times faster for 3 String operands, to be more precise, and even more if more Strings need to be concatenated. Not only is the key creation much faster because it needs zero memory allocation, it also makes the map access faster as well. The downside is that the hash keys might clash.

Own Key Object

The solution we use in OWB now is to create our own key object which implements hashCode() and equals() in an optimized way.
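
A minimal sketch of such a key object, assuming the class, the method name and the parameter types identify an entry (this is an illustration, not the exact OWB sources):

import java.lang.reflect.Method;

public final class MethodKey {
    private final Class<?> clazz;
    private final Method method;
    private final int hash;                    // computed once, lookups stay cheap

    public MethodKey(Class<?> clazz, Method method) {
        this.clazz = clazz;
        this.method = method;
        int h = clazz.hashCode();
        h = 29 * h + method.getName().hashCode();
        for (Class<?> param : method.getParameterTypes()) {
            h = 29 * h + param.hashCode();
        }
        this.hash = h;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof MethodKey)) {
            return false;
        }
        MethodKey other = (MethodKey) o;
        return clazz == other.clazz && method.equals(other.method);
    }
}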

3. ELResolver tuning

Expression Language based programming is excessively used in JSP and JSF pages and also in other frameworks. It is fairly easy to use, pluggable and integrates very well into existing frameworks. But many people are not aware that the EL integration is pretty expensive. It works by going down a chain of registered ELResolver implementations until one of them finds the requested object.

Nested EL calls

We will explain the impact of the ELResolver by going through a single EL invocation. Imagine the following line in a JSF page:

<h:inputText value="#{shoppingCart.user.address.city}"/>

How many invocations to an ELResolver#getValue() do you think gets executed?

The answer is: many! Let’s say we have 10 ELResolvers in the chain. This is not a fantasy value but comes pretty close to reality. While evaluating the whole expression, the EL integration splits the given EL expression on the dots (‘.’) and starts resolving with the leftmost term (“shoppingCart”). This EL part will be passed down the ELResolver chain until one ELResolver knows the given name. Since the cheap ELResolvers with a high hit-ratio are usually put first, our CDI ELResolver will be somewhere in the middle. This means we already get approximately 5 other ELResolver invocations before we find the bean…

The answer: our single EL expression already performs roughly 30 ELResolver invocations!

But what can we do to improve the performance of our very own WebBeansELResolver at least?

EL caching

The OpenWebBeans WebBeansELResolver (or any CDI containers ELResolver) will typically execute the following code to get a bean:

Set<Bean<?>> beans = beanManager.getBeans(name);
Bean<?> bean = beanManager.resolve(beans);
CreationalContext<?> creationalContext = beanManager.createCreationalContext(bean);
Object contextualReference = beanManager.getReference(bean, Object.class, creationalContext);

This gives us quite a few things we can cache.

a.) we can cache the found Bean.

b.) for NormalScoped beans (most scopes, except @Dependent) we can even cache the contextualReference because in this case we always will get a Proxy anyway (see the spec: ‘Contextual Reference’).

Negative caching

Searching a bean is expensive. Searching it and NOT finding anything is even more expensive!

To prevent our ELResolver from doing this over and over again, we also cache the misses. This sped up the OpenWebBeans EL integration big time.
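
A condensed sketch of the caching idea including the negative cache (field and marker names are made up; the real WebBeansELResolver is more involved):

private static final Object NOT_A_CDI_BEAN = new Object();   // marker for cached misses
private final Map<String, Object> cachedRefs = new ConcurrentHashMap<String, Object>();

public Object resolveBean(BeanManager beanManager, String name) {
    Object cached = cachedRefs.get(name);
    if (cached == NOT_A_CDI_BEAN) {
        return null;                           // we already know there is no such bean
    }
    if (cached != null) {
        return cached;                         // proxy of a NormalScoped bean, safe to reuse
    }

    Set<Bean<?>> beans = beanManager.getBeans(name);
    if (beans.isEmpty()) {
        cachedRefs.put(name, NOT_A_CDI_BEAN);  // cache the miss as well
        return null;
    }

    Bean<?> bean = beanManager.resolve(beans);
    Object ref = beanManager.getReference(bean, Object.class,
            beanManager.createCreationalContext(bean));
    cachedRefs.put(name, ref);                 // only cache this for NormalScoped beans
    return ref;
}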

4. Proxy tuning

One of the most elaborate parts in OWB is the possibility to configure custom proxies for any scope.

The job of a Contextual Reference

Proxies in CDI are used for implementing Interceptors, Serialization and mainly for on-the-fly resolving of the underlying ‘Contextual Instance’ of a ‘Contextual Reference’. You can read more about this in the latest JavaTechJournal [4] ‘CDI Introduction’ article. For this last task the proxy will look up the correct Contextual Instance for each method invocation and then redirect the method invocation to this resolved instance. In OpenWebBeans this is done in the NormalScopedBeanInterceptorHandler.

But is it really necessary to always perform this expensive lookup?

Caching resolved Contextual Instances

Let’s take a look at a @RequestScoped bean. Once the servlet request gets started the resolved contextual instance doesn’t change anymore until the end of the request. So it should be possible to ‘cache’ this contextual instance and clean the cache when the servlet request ends. OpenWebBeans does this by providing its own Proxy-MethodHandler for @RequestScoped beans, the RequestScopedBeanInterceptorHandler, which stores this info in a ThreadLocal:

private static ThreadLocal<HashMap<OwbBean, CacheEntry>> cachedInstances 
    = new ThreadLocal<HashMap<OwbBean, CacheEntry>>();

An @ApplicationScoped bean in a WebApp doesn’t change its instance at all once we have resolved it. Thus we use an ApplicationScopedBeanInterceptorHandler [5] to cache even more aggressively.
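
A simplified illustration of the idea behind these handlers (not the real OWB classes): resolve the contextual instance once and keep it for as long as the scope guarantees it won’t change.

public class CachingApplicationScopedHandler {
    private final Bean<Object> bean;
    private final BeanManager beanManager;
    private volatile Object cachedInstance;    // never changes for @ApplicationScoped

    public CachingApplicationScopedHandler(Bean<Object> bean, BeanManager beanManager) {
        this.bean = bean;
        this.beanManager = beanManager;
    }

    public Object invoke(Method method, Object[] args) throws Exception {
        Object instance = cachedInstance;
        if (instance == null) {
            // the expensive part: only ever done once per application
            Context context = beanManager.getContext(ApplicationScoped.class);
            instance = context.get(bean, beanManager.createCreationalContext(bean));
            cachedInstance = instance;
        }
        return method.invoke(instance, args);
    }
}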

Cache your own Scopes

As Gerhard’s blog post [2] shows, we can also use an OWB feature to configure the built-in and also custom Proxy-MethodHandlers for any given scope. The configuration is done via OpenWebBeans’ own configuration mechanism explained in a previous blog post.

Simply create a file META-INF/openwebbeans/openwebbeans.properties which contains content similar to the following:

org.apache.webbeans.proxy.mapping.javax.enterprise.context.ApplicationScoped=org.apache.webbeans.intercept.ApplicationScopedBeanInterceptorHandler

The general pattern of the config is:

org.apache.webbeans.proxy.mapping.[fully qualified scope name]=[proxy method-handler classname]

Summary

Why the hell did we put so much time and tricks into Apache OpenWebBeans?

The answer is easy: our application shows study codes of curricula on a single page. Up to 1600 rows in an <h:dataTable>, resulting in > 450.000 EL invocations! We tried this with other containers which took up to 6 seconds to render this page.

With Apache tomcat + Apache OpenWebBeans + Apache MyFaces-2.1.x [6] + JUEL [7] + Apache OpenJPA [8] we are now down to 350ms for this very page …

have fun and LieGrue,
strub

PS: we already explained a few of our tricks to our friends from the Weld community. This resulted in their @RequestScoped beans getting 3 times faster and now being almost as fast as in OWB ;)

[1] Apache OpenWebBeans
[2] os890 blog 1
[3] os890 blog 2
[4] Java-Tech-Journal CDI special
[5] ApplicationScopedBeanInterceptorHandler.java
[6] Apache MyFaces
[7] JUEL EL-2.2 Implementation
[8] Apache OpenJPA

Control CDI Containers in SE and EE

The Problem

Have you ever tried to boot a CDI container like JBoss Weld or Apache OpenWebBeans in Java SE? While doing this task you will end up knee deep in the implementation code of the container you are using! Not only is this a complicated task to do, but your code will also end up being non-portable because you need to invoke container-specific methods.

Where do I need this at all?

Booting a CDI container is not only useful if you like to hack Java SE apps, like a standalone SWT application. It is also very valuable to boot a CDI container for the unit tests of your business services if you don’t want to set up Arquillian for that task.

The Solution – Apache DeltaSpike CdiControl

The Apache DeltaSpike project [1][2] is a collection of CDI-Extensions which are created by a large community of CDI enthusiasts. It consists of most of the Apache MyFaces CODI and JBoss Seam3 community members, plus other high-profile experts in this area. The functionality of DeltaSpike is growing with each day!

One of the many features which DeltaSpike already provides in this early stage is a way to boot any CDI container and control its Context lifecycles via a few very simple interfaces [3]. Your code will end up being completely independent of the CDI implementation you use!

There are currently two implementations of this API, one for Apache OpenWebBeans, the other one for JBoss Weld. Both also successfully passed a small internal TCK test suite. The respective implementation simply gets activated by putting the correct impl-JAR into the classpath.

Maven Integration

For an Apache Maven project, you can just add the cdictrl-api and one of the impl JARs as dependencies in your pom.xml.

For using it with Apache OpenWebBeans:

<dependency>
    <groupId>org.apache.deltaspike.cdictrl</groupId>
    <artifactId>deltaspike-cdictrl-api</artifactId>
    <version>${deltaspike.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.deltaspike.cdictrl</groupId>
    <artifactId>deltaspike-cdictrl-owb</artifactId>
    <version>${deltaspike.version}</version>
</dependency>

For JBoss Weld it’s almost the same, just with the weld impl jar:

<dependency>
    <groupId>org.apache.deltaspike.cdictrl</groupId>
    <artifactId>deltaspike-cdictrl-api</artifactId>
    <version>${deltaspike.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.deltaspike.cdictrl</groupId>
    <artifactId>deltaspike-cdictrl-weld</artifactId>
    <version>${deltaspike.version}</version>
</dependency>

Note: Those features will get released with deltaspike-0.2-incubating. In the meantime the deltaspike.version is 0.2-incubating-SNAPSHOT and it is available in the Apache snapshots repository:
https://repository.apache.org/content/repositories/snapshots/org/apache/deltaspike/cdictrl/deltaspike-cdictrl-api/0.2-incubating-SNAPSHOT/

How to use the API

There are basically two parts

  1. The CdiContainer Interface will provide you with a way to boot and shutdown the CDI Container in JavaSE apps.
  2. The ContextControl interface provides a way to control the lifecycle of the built-in Contexts of the CDI container.

CdiContainer usage

You can use the CdiContainerLoader as a simple factory to gain access to the underlying CdiContainer implementation. If you like to boot a CDI container for a unit test or in a Java SE application then just use the following fragment:

// this will give you a CdiContainer for Weld or OWB, depending on the jar you added
CdiContainer cdiContainer = CdiContainerLoader.getCdiContainer();

// now we gonna boot the CDI container. This will trigger the classpath scan, etc
cdiContainer.boot();

// and finally we like to start all built-in contexts
cdiContainer.getContextControl().startContexts();

// now we can use CDI in our SE application. 
// And there is not a single line of OWB or Weld specific code in your project!

// finally we gonna stop the container 
cdiContainer.shutdown();

Pretty much self-explanatory, isn’t it? Of course, those 2 classes are of no interest for Java EE applications since the CDI Container already gets properly booted and shut down by the Servlet container integration.

ContextControl usage

The ContextControl interface allows you to start and stop built-in standard Contexts like @RequestScoped, @ConversationScoped, @SessionScoped, etc. It is provided as @Dependent bean and can get injected in the classic CDI way. This is not only usable in Java SE projects but also very helpful in Servlets and Java EE containers!

The following samples should give you an idea about the power of this tool:

Restarting the RequestContext in a unit test

I pretty frequently had the problem that I needed to test my classes with attached and also with detached JPA entities. In most of our big real world projects we are using the entitymanager-per-request approach [4] and thus have a producer method which creates a @RequestScoped EntityManager. Since a single unit test is usually treated as one ‘request’ I had problems detaching my entities. With the ContextControl this is no problem anymore as the following code fragment shows:

@Test
public void testMyBusinessLogic() {
  doSomeJpaStuff();
  MyEntity me = em.find(...);
  
  ContextControl ctxCtrl = BeanProvider.getContextualReference(ContextControl.class);

  // stopping the request context will dispose the @RequestScoped EntityManager
  ctxCtrl.stopContext(RequestScoped.class);

  // and now immediately restart the context again
  ctxCtrl.startContext(RequestScoped.class);

  // the entity 'me' is now in a detached state!
  doSomeStuffWithTheDetachedEntity(me);
}

Attaching a Request Context to a new thread in EE

Everyone who has tried to access @RequestScoped CDI beans in a new Thread created in a Servlet or any other Java EE related environment has for sure experienced the same pain: accessing the @RequestScoped bean will result in a ContextNotActiveException. But how come? Well, the Request Context usually gets started for a particular thread via a simple ServletRequestListener. The problem is obvious: no servlet request means that there is no Request Context for the thread! But this sucks if you like to reuse your nice business services in e.g. a Quartz Job. The ContextControl can help you in those situations as well:

public class CdiJob implements org.quartz.Job {
  public void execute(JobExecutionContext context) throws JobExecutionException {
    ContextControl ctxCtrl = 
      BeanProvider.getContextualReference(ContextControl.class);

    // this will implicitly bind a new RequestContext to your current thread
    ctxCtrl.startContext(RequestScoped.class);

    doYourWork();

    // at the end of the Job, we gonna stop the RequestContext
    // to ensure that all beans get properly cleaned up.
    ctxCtrl.stopContext(RequestScoped.class);
  }
}

And there are tons of other situations where this can be useful…

LieGrue,
strub

PS: Gonna change the motd to ‘All is under (cdi-) control!’
PPS: we will most probably introduce a similar API in the CDI-1.1 specification

[1] https://git-wip-us.apache.org/repos/asf?p=incubator-deltaspike.git
[2] https://issues.apache.org/jira/browse/DELTASPIKE
[3] https://git-wip-us.apache.org/repos/asf?p=incubator-deltaspike.git;a=tree;f=deltaspike/cdictrl/api/src/main/java/org/apache/deltaspike/cdise/api
[4] http://docs.redhat.com/docs/en-US/JBoss_Enterprise_Web_Server/1.0/html/Hibernate_Entity_Manager_Reference_Guide/transactions.html
