Applying interceptors to producer methods

Interceptors are really cool if you have a common problem and like to apply a solution for it over your whole code base without making every single colleague copy the same code over and over again.

In my case it was the urge to log all SOAP and REST invocations to other systems. I also like to add a logCorrelationId via HTTP header to each outgoing SOAP call. You can read more about the background over in my other logCorrelation blog post.

I’ll focus on integrating SOAP clients, but you can easily do the same for REST clients as well.

Integrating a SOAP client in an EE project

Usually I create a CDI producer for my SOAP ports. That way I can easily mock them out with a local dummy implementation by just using CDI’s @Specializes or @Alternative. If you combine this with Apache DeltaSpike @Exclude and the DeltaSpike Configuration system then you can even enable those mocks via ProjectStage or a configuration setting.

Consider you have a WSDL and you create a SOAP client with the interface CustomerService.

What we like to get from a ‘consumer’ perspective is the following usage:

public class SomeFancyClass {
  private @Inject CustomerService customerService;
  ...
}

Which means you need a CDI producer method, e.g. something like:

@ApplicationScoped
public class CustomerServiceSoapClientProducer {
  @ConfigProperty(name = "myproject.customerService.endpointUrl")
  private String customerServiceEndpointUrl;

  @Produces
  @RequestScoped
  @LogTiming
  public CustomerService createSoapPort() {
    // generated from the WSDL, e.g. via CXF
    CustomerServiceService svc = new CustomerServiceService();
    CustomerService port = svc.getCustomerServiceServicePort();

    // this sets the endpoint URL during producing.
    ((BindingProvider) port).getRequestContext().
           put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, customerServiceEndpointUrl);

    return port;
  }
}

Side note: the whole class could also be @RequestScoped to get the endpoint URL evaluated on every request. We could of course also use the DeltaSpike ConfigResolver programmatically to achieve the same. But the whole point of setting the endpoint URL manually is that we neither need to change the WSDL nor recompile the project on every server change. We can also use different endpoints for various boxes (test vs production environments, or different customers) that way.

What is this @LogTiming stuff?

Now it becomes interesting! We now have a SOAP client which looks like a regular CDI bean from a ‘user’ point of view. But we’d like to get more information about that outgoing call. After all it’s an external system and we have no clue how it behaves in terms of performance. That means we’d like to protocol each and every SOAP call and log its duration. And since we not only have 1 SOAP service client but multiple dozen of them, we of course like to do this via an Interceptor!

@Inherited
@InterceptorBinding
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
public @interface LogTiming {
}

Applying an Interceptor on a producer method?

Of course the code written above DOES work. But it behaves totally differently than many of you would guess.
If you apply an interceptor annotation to a producer method, then it will not intercept the calls to the produced bean!
Instead it will just intercept the invocation of the producer method. A producer method gets invoked when the Contextual Instance gets created. For a @Produces @RequestScoped annotated producer method this will happen the first time a method on the produced CDI bean gets called in the very request (or thread for non-servlet request based threads). And exactly this call gets intercepted.

If we just applied a stopwatch to this interceptor then we would only get the info about how long it took to create the SOAP client. That’s not what we want! We’d like to get the times of each and every CustomerService invocation! So what does our LogTiming interceptor do?

Proxying the Proxy

The trick we apply is to use our LogTiming Interceptor to wrap the produced SOAP port in yet another proxy. And this proxy logs the request times, etc. As explained before we cannot use CDI interceptors for this, but we can use java.lang.reflect.Proxy!:

@LogTiming
@Interceptor
public class WebserviceLoggingInterceptor {

    @AroundInvoke
    private Object wrapProxy(InvocationContext ic) throws Exception {
        Object producedInstance = ic.proceed();
        Class<?>[] interfaces = producedInstance.getClass().getInterfaces();
        Class<?> returnType = ic.getMethod().getReturnType();
        return Proxy.newProxyInstance(ClassUtils.getClassLoader(null), interfaces, new LoggingInvocationHandler(producedInstance, returnType));
    }
}
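
One more thing to remember: CDI interceptors are not active by default. The interceptor class has to be enabled in the beans.xml of the bean archive (the package name here is of course just an example):

<beans>
  <interceptors>
    <class>com.mycompany.interceptors.WebserviceLoggingInterceptor</class>
  </interceptors>
</beans>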

This code will register our reflect Proxy in the CDI context, and each time someone calls a method on the injected CustomerService it will hit the LoggingInvocationHandler. This handler btw can also do other neat stuff. It can pass over the logCorrelationId (explanation see my other blog post linked above) as HTTP header to the outgoing SOAP call.

The final LoggingInvocationHandler looks like the following:

public class LoggingInvocationHandler<T> implements InvocationHandler {
    private static final long SLOW_CALL_THRESHOLD = 100; // ms

    // we don't want to log toString(), hashCode(), equals() etc...
    private static final Set<String> EXCLUDED_METHODS
            = new HashSet<>(Arrays.asList("toString", "hashCode", "equals"));

    private final Logger logger;
    private final T delegate;

    public LoggingInvocationHandler(T delegate, Class<?> loggerClass) {
        this.delegate = delegate;
        this.logger = LoggerFactory.getLogger(loggerClass);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (EXCLUDED_METHODS.contains(method.getName())) {
            // don't log toString(), hashCode() etc...
            return method.invoke(delegate, args);
        }

        long start = System.currentTimeMillis();

        try {
            // setting the log correlation header if any logCorrelationId is set on the thread.
            String logCorrelationId = LogCorrelationUtil.getCorrelationId();
            if (StringUtils.isNotEmpty(logCorrelationId) && delegate instanceof BindingProvider) {
                BindingProvider port = (BindingProvider) delegate;
                Map<String, List<String>> headers = (Map<String, List<String>>) port.getRequestContext().get(MessageContext.HTTP_REQUEST_HEADERS);
                if (headers == null) {
                    headers = new HashMap<>();
                }
                headers.put(LogCorrelationUtil.REQUEST_HEADER_CORRELATION_ID, Collections.singletonList(logCorrelationId));
                port.getRequestContext().put(MessageContext.HTTP_REQUEST_HEADERS, headers);
            }

            // continue with the real call
            return method.invoke(delegate, args);
        }
        finally {
            long duration = System.currentTimeMillis() - start;
            if (duration <= SLOW_CALL_THRESHOLD) {
                logger.info("soapRemoteCall took={} ms service={} method={}",
                        duration, delegate.getClass().getName(), method.getName());
            }
            else {
                // log a more detailed msg, e.g. with params
            }
        }
    }
}

Limitations

Of course this trick only works if the producer method returns an interface! That’s because java.lang.reflect.Proxy can only proxy pure interfaces.

I’m trying to remove this limitation by bringing interceptors for produced instances to CDI-2.0, and I am also working on an Interceptors spec change to introduce ways to create subclassing proxies as easily as interface proxies. Stay tuned!

Transaction handling in EJBs and JavaEE7 @Transactional

Handling transactions in EJBs is easy, right? Well, in theory it should be. But how does the theory translate into reality once you leave the ivory tower?

I’ll show you a small example. Let’s assume we have 2 infrastructure service EJBs:

@Stateless
public class StorageServiceImpl implements StorageService {
  private @EJB CustomerService customerService;
  private @PersistenceContext EntityManager em;

  public void chargeStorage(int forYear) throws CustomerNotFoundException {
    storeNiceLetterInDb(em);
    Customer c = customerService.getCurrentCustomer(); 
    doSomethingElseInDB(); 
  }
} 

And now for the CustomerService which is an EJB as well:

@Stateless
public class CustomerServiceImpl implements CustomerService {
  public Customer getCurrentCustomer() throws CustomerNotFoundException {
    // do something if there is a current customer
    // otherwise throw a CustomerNotFoundException
  }
}

The Sunshine Case

Let’s first look at what happens if no problems occur at runtime.

In the normal operation mode some e.g. JSF backing bean will call storageService.chargeStorage(2015);. The implicit transaction interceptor will use a TransactionManager (all done in the interceptor which you do not see in your code) to check whether a transaction is already open. If not, it will open a new transaction and remember this fact. The same check will happen in the implicit transaction interceptor for the CustomerService.

When leaving CustomerService#getCurrentCustomer the interceptor will recognize that it didn’t open the transaction and thus will simply return. On the other hand, when leaving StorageService#chargeStorage its interceptor will commit the transaction and close the EntityManager.

Broken?: Handling checked Exceptions

Once we leave the sunny side of the street and hit some problems, the whole handling starts to become messy. Let’s look at what happens if a checked CustomerNotFoundException gets thrown in CustomerService#getCurrentCustomer. Most people will now get their first surprise: the database changes done in storeNiceLetterInDb() will get committed into the database.

So we got an Exception but the transaction still got committed? WT*piep*!
Too bad that this is not a bug but the behaviour is exactly as specified in “9.2.1 Application Exceptions” of the EJB specification:

An application exception does not automatically result in marking the transaction for rollback unless the ApplicationException annotation is applied to the exception class and is specified with the rollback element value true…

So this means we could annotate the CustomerNotFoundException with @javax.ejb.ApplicationException(rollback=true) to force a rollback.
And of course we need to do this for ALL checked exceptions if we like to get a rollback.
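
In code this looks like the following (assuming the exception class is in our own code base):

// forces a rollback whenever this exception passes an EJB boundary
@javax.ejb.ApplicationException(rollback = true)
public class CustomerNotFoundException extends Exception {
  public CustomerNotFoundException(String message) {
    super(message);
  }
}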

Broken?: Handling unchecked Exceptions

The good news upfront: unchecked exceptions (RuntimeExceptions) will usually cause a rollback of your transaction (unless annotated as @ApplicationException(rollback=false) of course).

Let’s assume there is some other entity lookup in the code and we get a javax.persistence.EntityNotFoundException if the address of the customer couldn’t be found. This will rollback your transaction.

But what can we do if this is kind of expected and you just like to use a default address in that case? The natural solution would be to simply catch this Exception in the calling method. In our case that would be a try/catch block in StorageServiceImpl#chargeStorage.
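
A sketch of that idea (customerService.getAddress() and Address.DEFAULT are made up for this example):

public void chargeStorage(int forYear) throws CustomerNotFoundException {
  storeNiceLetterInDb(em);
  Customer c = customerService.getCurrentCustomer();
  Address address;
  try {
    address = customerService.getAddress(c);
  }
  catch (EntityNotFoundException enfe) {
    // 'expected' case - just continue with a default address
    address = Address.DEFAULT;
  }
  doSomethingElseInDB();
}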

That’s a great idea – but it doesn’t work in many containers!

Some containers interpret the spec pretty strictly and do the Exception check on _every_ layer (EJB spec 9.3.6). And if the interceptor in the CustomerService detects an Exception then the implicit EJB interceptor will simply roll back the whole transaction and mark it as “rollbackOnly”. Catching this Exception on an outer level doesn’t help a bit. You will not get your changes into the database. And if you try to do even more on the database then you will blow up again with something like “The connection was already marked for rollback”.

And how is that with @javax.transaction.Transactional?

Basically the same as with EJBs. In my opinion this was a missed chance to clean up this behaviour.
You can read this up in chapter 3.6 of the JTA-1.2 specification.

The main difference is how to demarcate rollback vs commit exceptions. You can use the rollbackOn and dontRollbackOn attributes of @Transactional:

@Transactional(rollbackOn={SQLException.class}, dontRollbackOn={SQLWarning.class})

Now what about DeltaSpike @Transactional?

In Apache DeltaSpike @Transactional and its predecessor Apache MyFaces CODI @Transactional we have a much cleaner handling:

Exceptions only get handled on the layer where the transaction got opened. If you catch an Exception along the way then we do not care about it.

Any Exception on the outermost layer will cause a rollback of your transaction. It doesn’t matter if it is a RuntimeException or a checked Exception.

If there was no Exception in the outermost interceptor then we will commit the transaction.
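
In code this looks just like the JavaEE7 variant (a minimal sketch; note the DeltaSpike package):

import javax.enterprise.context.ApplicationScoped;
import org.apache.deltaspike.jpa.api.transaction.Transactional;

@ApplicationScoped
public class StorageServiceImpl implements StorageService {

  @Transactional
  public void chargeStorage(int forYear) throws CustomerNotFoundException {
    // any Exception bubbling out of this outermost @Transactional
    // method rolls back the transaction - checked or unchecked alike
  }
}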

PS: please note that I explicitly used interfaces in my samples. Otherwise you will get NIV (No Interface View) objects which again might behave slightly differently as they use a totally different proxying technique and default behaviour. But that might be enough material for yet another blog post.
PPS: I also spared you EJBs with TransactionManagementType.BEAN. That one is also pretty much non-portable by design, as you effectively cannot nest them: it forces you to either commit or rollback the tx on every layer. Some containers work fine while others really enforce this.

The right Scope for JBatch Artifacts

In my recent JavaLand conference talk about JSR-352 JBatch and Apache BatchEE I briefly mentioned that JBatch Artifacts should have a scope of @Dependent (or Prototype scope if you are using Spring). Too bad there was not enough time to dig into the problem in depth so here comes the detailed explanation.

What is a JBatch Artifact

A JBatch batch needs a Job Specification Language XML file in META-INF/batch-jobs/*.xml. These files describe how your batch job is built up.

Here is a small example of what such a batch JSL file could look like:

<job id="mysamplebatch" version="1.0" xmlns="http://xmlns.jcp.org/xml/ns/javaee">
  <step id="mysample-step">
    <listeners>
      <listener ref="batchUserListener" >
      <properties>
        <property name="batchUser" value="#{batchUser}"/>
      </properties>
      </listener>
    </listeners>
    <batchlet ref="myWorkerBatchlet">
      <properties>
        <property name="inputFile" value="#{inputFile}"/>
      </properties>
    </batchlet>
  </step>

In JSR-352, Artifacts are all the pieces which are defined in your JBatch JSL file and get requested by the container. In the sample above these would be batchUserListener and myWorkerBatchlet.

The following types can be referenced as Batch Artifacts from within your JSL:

  • Batchlets
  • ItemReader
  • ItemProcessor
  • ItemWriter
  • JobListener
  • StepListener
  • CheckpointAlgorithm
  • Decider
  • PartitionMapper
  • PartitionReducer
  • PartitionAnalyzer
  • PartitionCollector

The Batch Artifact Lifecycle

The JBatch spec is actually pretty clear what lifecycle needs to get applied on Batch Artifacts:

11.1 Batch Artifact Lifecycle
All batch artifacts are instantiated prior to their use in the scope in which they are declared in the Job XML and are valid for the life of their containing scope. There are three scopes that pertain to artifact lifecycle: job, step, and step-partition.
One artifact per Job XML reference is instantiated. In the case of a partitioned step, one artifact per Job XML reference per partition is instantiated. This means job level artifacts are valid for the life of the job. Step level artifacts are valid for the life of the step. Step level artifacts in a partition are valid for the life of the partition.
No artifact instance may be shared across concurrent scopes. The same instance must be used in the applicable scope for a specific Job XML reference.

The problem is that whenever you use a JavaEE artifact then you might get only a proxy. Of course the reference to this proxy gets thrown away correctly but the instance behind the proxy might survive. Let’s look at how this works internally.

How Batch Artifacts get resolved

A JBatch implementation can provide its own mechanism to load the artifacts. This is needed as it is obviously different whether you use BatchEE with CDI or Spring Batch (which also implements JSR-352).
In general there are 3 different ways you can reference a Batch Artifact in your JSL:

  1. Via a declaration in an optional META-INF/batch.xml file (see the sketch right after this list). See section 10.7.1 of the specification for further information.
  2. Via its fully qualified class name.
    In BatchEE we first try to get the class via BeanManager#getBeans(Type) and BeanManager#getReference. If that doesn’t give us the desired Contextual Reference (CDI proxy) then we simply call ClassLoader#loadClass, create the Batch Artifact with newInstance() and apply injection into this instance.
  3. Via its EL name. More precisely we use BeanManager#getBeans(String elName) plus a call to BeanManager#getReference() as shown above.
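
Such a META-INF/batch.xml simply maps the logical ref names to implementation classes (class names here are made up):

<batch-artifacts xmlns="http://xmlns.jcp.org/xml/ns/javaee">
  <ref id="batchUserListener" class="com.mycompany.batch.BatchUserListener"/>
  <ref id="myWorkerBatchlet" class="com.mycompany.batch.MyWorkerBatchlet"/>
</batch-artifacts>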

We now know what a Batch Artifact is. Whenever you are on a JavaEE server you will most likely end up with a CDI or EJB bean which got resolved via the BeanManager. If you are using Spring-Batch then you will usually get a nicely filled Spring bean.

The right Scope for a Batch Artifact

I’ve seen the usage of @javax.ejb.Stateless on Batch Artifacts in many samples. I guess the people writing such samples never used JBatch in real production yet 😉 Why so? Well, let’s look at what would happen if we implemented our StepListener as a stateless EJB:

@javax.ejb.Stateless
@javax.inject.Named // this makes it available for EL lookup
public class BatchUserListener implements StepListener {
  @Inject 
  @BatchProperty
  private String batchUser;

  @Override
  public void beforeStep() throws Exception {
     setUserForThread(batchUser);
  }

  @Override
  public void afterStep() throws Exception {
    clearUserForThread();
  }
}

Now let’s assume that the BatchUserListener gets used not only in my sample batch but in 30 other batches of my company (this ‘sample’ is actually taken from a HUGE real world project where we have been using Apache BatchEE for over a year now).

What will happen if e.g. a ‘DocumentImport’ batch runs before my sample batch? The first batch which uses this StepListener will create the instance. At the time the instance gets created by the container (and ONLY at that time) it will also perform all the injection. That means it will look up the ‘batchUser’ parameter and inject it into my @BatchProperty String. Let’s assume this DocumentImport batch uses a ‘documentImportUser’. So this is what gets injected into the ‘batchUser’ variable.

Once the batch step is done, the @Stateless instance might be put back into some pool cache. And if I’m rather unlucky then exactly this very instance will later get re-used for mysample-step. But since the listener already exists, no injection will be performed on that instance. That means the StepListener STILL contains the ‘documentImportUser’ and not the ‘mySampleUser’ which I explicitly set as parameter of my batch.

The very same issue will also happen for all injected variables which do not use proxies, e.g.:

  • @Inject StepContext
  • @Inject JobContext

TL;DR: The Solution

Use @Dependent scoped beans for your Batch Artifacts and only use another scope if you really know what you are doing.
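
For the BatchUserListener shown above that simply means:

@javax.enterprise.context.Dependent
@javax.inject.Named // this makes it available for EL lookup
public class BatchUserListener implements StepListener {
  @Inject
  @BatchProperty
  private String batchUser;

  @Override
  public void beforeStep() throws Exception {
    setUserForThread(batchUser);
  }

  @Override
  public void afterStep() throws Exception {
    clearUserForThread();
  }
}

Every step reference now gets a fresh instance with freshly injected @BatchProperty values.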

If you like to share code across different items of a Step or a Job then you can also use BatchEE’s @StepScoped and @JobScoped which are available through a portable BatchEE CDI module.

CDI in EARs

Foreword

I was banging my head against the wall for the last few days when trying to solve a few tricky issues we saw with EAR support over at Apache DeltaSpike. I’m writing this up to clean my mind and to share my knowledge with other EAR-pig-wrestlers…

The EAR ClassLoader dilemma

EARs are a constant pain when it comes to portability amongst servers. This has to do with the fact that JavaEE still doesn’t define any standards for visibility. There is no clear rule about how the ClassLoaders, isolation and visibility have to be set up. There is just a single paragraph (JavaEE7 spec 8.3.1) about which classes you might probably see.

There are 2 standard ClassLoader setups we see frequently in EARs.
For the sake of further discussion we assume an EAR with the following structure:

sample.ear
├── some-ejb.jar
├── lib
│   ├── some-shared.jar
│   └── another-shared.jar
├── war1.war
│   └── WEB-INF
│       ├── classes 
│       └── lib
│           └── war1-lib.jar
└── war2.war
    └── WEB-INF
        ├── classes 
        └── lib
            └── war2-lib.jar

Flat ClassLoader

The whole EAR is served by just a single flat ClassLoader. If you have 2 WARs inside your application then the classes in each of them can see each other. And the classes in the shared EAR lib folder can see them as well. This was e.g. used in JBoss-4 (though not from the beginning). You can also configure most of the other containers to use this setup for your EAR. But nowadays it’s hardly a default anymore (and boy is that good!)

Hierarchic ClassLoader

This is the setup used by most containers these days – but it’s still not mandated by the spec! The container will provide a shared EAR Application ClassLoader which contains only the shared EAR libs. This is the parent ClassLoader of all the ClassLoaders in the EAR. Each WAR and ejb-jar inside your EAR will get its own child WebAppClassLoader.

This means war2 doesn’t see any classes or resources from war1 and the other way around. It further means that the shared libs do not see the classes of war1 nor war2, etc!

If you need some caches in your shared libs, then you need to rely on the ThreadContextClassLoader (TCCL) as outlined in JavaEE 7 paragraph 8.2.5. While this section is about “Dynamic Class Loading” it is also valid for caches and storing other dynamic information in static variables. Otherwise you end up mixing values from war1 and probably re-using them in war2 (where you would get a ClassNotFoundException). Even if your 2 WARs contain the same jar (e.g. commons-lang.jar), the Class instances are different as they come from a different ClassLoader. If you store those in a shared-lib jar then you most probably end up with the (in)famous “Cannot cast class Xxx to class Xxx”.

One common solution to this problem will look something like:

public class MyInfoStore {
  private final Map<ClassLoader, Set<Info>> infoMap = new ConcurrentHashMap<ClassLoader, Set<Info>>();

  public void storeInfo(Info info) {
    ClassLoader tccl = Thread.currentThread().getContextClassLoader(); // probably guarded for SecurityManager
    Set<Info> infos = infoMap.get(tccl);
    if (infos == null) {
      infos = new HashSet<Info>();
      infoMap.put(tccl, infos);
    }
    infos.add(info);
  }
...
}

Of course you must really be careful to clean up this Map during shutdown. In CDI Applications you can use the @Observes BeforeShutdown event to trigger the cleanup.
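
A sketch of such a cleanup, assuming MyInfoStore exposes a removeInfo(ClassLoader) method (BeforeShutdown is a container lifecycle event, so the observer lives in a CDI Extension):

public class InfoStoreCleanupExtension implements Extension {
  public void cleanup(@Observes BeforeShutdown beforeShutdown) {
    // drop the entries stored for the application which just shuts down
    MyInfoStore.removeInfo(Thread.currentThread().getContextClassLoader());
  }
}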

The impact on us programmers

These various scenarios make it really hard to write any kind of portable framework that runs fine inside of EARs. This is not only true for client frameworks but also for container frameworks themselves, like CDI and Spring.

Integrating CDI containers in EARs

It is pretty obvious that – mostly due to the lack of a guaranteed default isolation scenario in JavaEE – CDI containers have a hard time finding a nice and portable handling of CDI beans and Extensions in EARs. I got involved in CDI in late 2008 when the name of the spec was still WebBeans. And that name was taken literally – it originally only targeted web applications and not EARs. The EAR support only got added roughly (according to some interpretations) shortly before the EE-6 specification got published. So there are multiple reasons why CDI in EARs is not really a first class citizen yet.

To my knowledge there are 3 sane ways how a container can integrate CDI in ears. And of those 3 sane ways, 5 are used in various containers 😉

A.) 1 BeanManager per EAR

The whole EAR is handled via 1 BeanManager. This is the way JBoss WildFly seems to handle things. First the BeanManager gets created and all its CDI Extensions get loaded (TODO research: all or only the ones from the shared libs?).

In reality it’s a bit more complicated. Weld uses a single BeanManager for each and every JAR in your EAR. Don’t ask me why, but that’s what I’ve seen. Still you only get one set of Extensions for your whole EAR. Keep this in mind.

The TCCL always seems to stay the shared EAR ApplicationClassLoader during boot time. Even while processing the classes of the WAR files. Thus the TCCL during @Observes ProcessAnnotatedType will be the shared EAR ApplicationClassLoader. But if you access your Extensions (or the information collected by them) later at runtime you might end up with a different TCCL. This is especially true for all servlet requests in your WARs. This makes it really hard to store anything with the usual Map<ClassLoader, Set<Info>> trick.

B.) 1 BeanManager per WAR + 1 ‘shared’ BeanManager

Each WAR fully boots up its own BeanManager. The shared EAR libs also get their own BeanManager to serve non-servlet requests like JMS and remote EJB invocations. Classes in shared EAR libs simply get discovered over again for each WAR. Each WAR also has its own set of Extensions (they are usually 1:1 to the BeanManager). This is what I’ve seen in Apache Geronimo, IBM WebSphere and early Apache TomEE containers.

This is a bit tricky when it comes to the handling of the @ApplicationScoped context (see CDI-129 for more information) or if you handle EJBs (which are by nature shared across the EAR). WebSphere e.g. seems to solve this by having its own Bean<SomeSharedEarLibClass> instance for each BeanManager (and thus for each WAR), but they share a single ApplicationContext storage and those beans are equals(). Probably they just compare the PassivationCapable#getPassivationId()?

It usually works fairly nicely and allows the usage of most common programming patterns. It also allows modifying classes from shared libs via Extensions you register in a single WAR. The downside is that you have to scan and process all the shared classes over and over again for each WAR. This obviously slows down the application boot time.

Strictly speaking, this mode also conflicts with the requirements for modularity in section 5 of the CDI specification. Overall I’m not very happy with section 5, but it should be mentioned.

C.) Hierarchic BeanManagers

In this case we have a 1:1 relation between a ClassLoader and a BeanManager. The shared EAR libs get scanned by an EAR-BeanManager. Then each of the WARs gets scanned via its own WebAppBeanManager. In contrast to scenario B, these WebAppBeanManagers will only scan the classes of the local WAR’s WEB-INF/lib and WEB-INF/classes and not the shared EAR libs over again. If you need information from shared EAR lib classes then the WebAppBeanManager simply delegates to its ‘parent’ EAR-BeanManager. Apache TomEE uses this mode since 1.7.x.

There are a few tricks the container vendor needs to apply in this case. E.g. propagating events and Bean lookups to the parent BeanManager, etc. Or suppressing sending CDI Extension events to the parent (we recently learned this the hard way – this is fixed in TomEE-1.7.2 which is to be released soon).

It also has an important impact on CDI Extension programmers: as your Extensions are 1:1 to the BeanManager, we now also have the ClassPath split up into different Extension instances. This works fine for ProcessAnnotatedType but can be tricky in some edge cases. E.g. the DeltaSpike MessageBundleExtension did collect info about a certain ProducerBean and stored it in the Extension for later usage in @Observes AfterBeanDiscovery. Too bad that in my case this certain ProducerMethod was in a shared EAR lib and thus got picked up by the Extension instance of the EAR-BeanManager, but the ‘consumer’ (the interface annotated with @MessageBundle) is in some WAR. And the WebAppBeanManager of this WAR obviously is not the one scanning the ProducerMethod of the class in the shared EAR lib. Thus the Extension caused a NullPointerException. This will be fixed in the upcoming Apache DeltaSpike-1.2.2 release.

The Impact on poor CDI Extension programmers

TODO: this is a living document. I’ll add more info and put it up for review.

The tricky CDI ServiceProvider pattern

CDI is really cool, and works great. But it’s not the hammer for all your problems!

I’ve seen the following a lot of times and I confess that I also used it in the first place: The CDI ServiceProvider pattern. But first let’s clarify what it is:

The ServiceProvider pattern

Consider you have a bunch of pluggable modules and all of them might add an implementation for a certain interface:

// this is just a marker interface for all my plugins
public interface MyPlugin{} 

The idea is that extensions can later easily implement this marker interface and the system will automatically pick up all the implementations accordingly.

public class PluginA implements MyPlugin {
  ...
}

public class PluginB implements MyPlugin {
  ...
}

The known way

I’ve seen lots of blogs recommending the following solution:

public static <T> List<T> getContextualReferences(Class<T> type, Annotation... qualifiers)
{
    BeanManager beanManager = getBeanManager();

    Set<Bean<?>> beans = beanManager.getBeans(type, qualifiers);

    List<T> result = new ArrayList<T>(beans.size());

    for (Bean<?> bean : beans)
    {
        CreationalContext<?> creationalContext = beanManager.createCreationalContext(bean);
        result.add((T) beanManager.getReference(bean, type, creationalContext));
    }
    return result;
}

The original implementation can be found in the Apache DeltaSpike BeanProvider (and in Apache MyFaces CODI before that).

This worked really well for some time. But suddenly a few surprises happened.

What’s the problem with getBeans()?

One day we got a bug report in Apache OpenWebBeans that OWB is broken because the pattern shown above doesn’t work: JIRA issue OWB-658

What we found out is that the whole pattern doesn’t work anymore once a single plugin is defined as @Alternative.

I dug into the issue and became aware that this is not an OWB issue; BeanManager#getBeans() is just not made for this usage! So let’s have a look at what the CDI spec says:

11.3.4:
The method BeanManager.getBeans() returns the set of beans which have the given required type and qualifiers and are available for injection

The important part here is the difference between “give me all beans of type X” and “give me all beans which can be used for an InjectionPoint of type X”. Those 2 are fundamentally different, because in the latter case all other beans are no candidates for injection anymore if a single @Alternative annotated bean gets spotted. OWB did it just right!

Possible Solution 1

Don’t use @Alternative on any of your plugin classes 😉

That sounds a bit ridiculous, but at least it works.

Possible Solution 2

You could collect all your plugins via a CDI event in a CDI-Extension-like way.
First we need a data object to store all the collected plugins.

public class PluginDetection {
  private List<MyPlugin> plugins = new ArrayList<MyPlugin>();

  public void addPlugin(MyPlugin plugin) {
    plugins.add(plugin);
  }

  public List<MyPlugin> getPlugins() {
    return plugins;
  }
}

Then we’re gonna fire this around:

private @Inject Event<PluginDetection> pdEvent;

public List<MyPlugin> detectPlugins() {
  PluginDetection pd = new PluginDetection();
  pdEvent.fire(pd);
  return pd.getPlugins();
}

The plugins just need to observe this event and register themselves:

public void register(@Observes PluginDetection pd) {
  pd.addPlugin(this);
}

Since the default for @Observes is Reception.ALWAYS, the contextual instance will automatically get created if it doesn’t yet exist.

But there is also a small issue with this approach: there is no way to disable/override a plugin with @Alternative anymore, since the event mechanism doesn’t do any bean resolving at all.

Unit Testing Strategies for CDI based projects

Even after 2 years of CDI going public there are still frequent questions about how to unit test CDI based applications. Well, here it goes.

1. Arquillian, the heavyweight champion

Arquillian is the core part of the EE testing effort of JBoss. The idea is to write a unit test exactly once and run it on the target platform via some container integration connectors.

What do Arquillian tests look like?

I will not give a detailed introduction here, but I’d like to mention that an Arquillian test will usually contain a @Deployment section which packages all the classes under test into its own temporary Java archive, like a JAR, EAR, or WAR, and uses this temporary artifact to run the tests.

There are basically 2 ways to run Arquillian tests. I’ll not go into details but only give a very rough (and probably not too exact) overview:

Remote/Managed style Arquillian tests

In this case the Arquillian tests will be started in an external Java VM, e.g. a started JBossAS7 or GlassFish-3.2. If you start your unit test locally, it will get ‘deployed’ to the other VM and executed over there. You can debug through as if you were working over there, which is established via the Java debugging and profiling API. Some container connectors also allow running locally ‘as-client’.

The Good: this is running exactly on the target platform you do your real stuff on.
The Bad: Slower than native unit tests. No full control over the lifecycle, etc. You e.g. cannot simulate multiple servlet requests in one unit test.

Embedded style Arquillian tests

Instead of deploying your Arquillian test to another VM you are starting a standalone CDI or embedded EE container in the current Java VM.

The Good: Much faster than going remote. And you don’t need any server running.
The Bad: Sometimes this leads to ClassLoader isolation problems between your @Deployment and your unit test. This causes trouble when you e.g. like to unit test classes which start an EntityManager. The reason is that e.g. any persistence.xml will get picked up twice. Once from your regular classpath and a second time from the @Deployment.

You can read more about it in the JBoss Arquillian documentation (thanks to Aslak Knutsen for the links).

2. DeltaSpike CdiControl, the featherweight champion

The method to unit test CDI based programs I like to introduce next uses the Apache DeltaSpike CdiControl API introduced in my previous post. It is based on the ideas and experience I had with the Apache OpenWebBeans CdiTest module, which I created and have heavily used for 3 years.

First of all, you will need a META-INF/beans.xml in your test class path. All your unit tests will at the end become CDI enabled classes! If you use maven, then just create an empty file.

$> touch src/test/resources/META-INF/beans.xml

Instead of creating a @Deployment like Arquillian does, we now just take all the test dependencies and scan them. Of course this means that starting the CDI container for a unit test will always scan all the classes in JARs which have a META-INF/beans.xml marker file. But on the other hand we skip creating the @Deployment and always test the ‘real thing’.

Introducing the test base class

The first step is to create a base class for all your unit tests. I will break down the full class into distinct blocks and explain what happens. I’m using TestNG in this case, but JUnit tests will look pretty similar.

The main thing we need is the DeltaSpike CdiContainer. CDI containers don’t like running concurrently within the same ClassLoader. Thus we have our CdiContainer as a static member variable and share it across parallel tests.

public abstract class CdiContainerTest {
    protected static CdiContainer cdiContainer;
    protected static int containerRefCount = 0;

Now we also need to initialize and use it for each method under test. This is also a good place to set the ProjectStage the test should execute in.
After boot()-ing the CdiContainer we also start all the contexts.
If the container is already started, we just clean the contextual instances for the current thread.

    @BeforeMethod
    public final void setUp() throws Exception {
        containerRefCount++;

        if (cdiContainer == null) {
            ProjectStageProducer.setProjectStage(ProjectStage.UnitTest);

            cdiContainer = CdiContainerLoader.getCdiContainer();
            cdiContainer.boot();
            cdiContainer.getContextControl().startContexts();
        }
        else {
            // clean the Instances by restarting the contexts
            cdiContainer.getContextControl().stopContexts();
            cdiContainer.getContextControl().startContexts();
        }
    }

We also do proper cleanup after each method finishes. We especially clean up all RequestScoped beans to ensure that e.g. disposal methods for @RequestScoped EntityManagers get invoked.

    
    @AfterMethod
    public final void tearDown() throws Exception {
        if (cdiContainer != null) {
            cdiContainer.getContextControl().stopContext(RequestScoped.class);
            cdiContainer.getContextControl().startContext(RequestScoped.class);
            containerRefCount--;
        }
    }

Another little trick: in the @BeforeClass method we ensure that the container is booted and then do some CDI magic. We get the InjectionTarget and fill all InjectionPoints of our very own unit test subclass via the inject() method.
This allows us to use @Inject in our unit test classes which extend our CdiContainerTest.

    @BeforeClass
    public final void beforeClass() throws Exception {
        setUp();
        cdiContainer.getContextControl().stopContext(RequestScoped.class);
        cdiContainer.getContextControl().startContext(RequestScoped.class);

        // perform injection into the very own test class
        BeanManager beanManager = cdiContainer.getBeanManager();

        CreationalContext creationalContext = beanManager.createCreationalContext(null);

        AnnotatedType annotatedType = beanManager.createAnnotatedType(this.getClass());
        InjectionTarget injectionTarget = beanManager.createInjectionTarget(annotatedType);
        injectionTarget.inject(this, creationalContext);
    }

After the suite is finished we need to properly shutdown() the container.

    @AfterSuite
    public synchronized void shutdownContainer() throws Exception {
        if (cdiContainer != null) {
            cdiContainer.shutdown();
            cdiContainer = null;
        }
    }

}

You can see the full class in my lightweightEE sample (work in progress):
CdiContainerTest.java

The usage

Just write your unit test class and extend the CdiContainerTest class.

public class CustomerServiceTest extends CdiContainerTest
{

You can even use injection in your unit tests, because they get resolved in the @BeforeClass method of the base class.

    private @Inject CustomerService custSvc;
    private @Inject UserService usrSvc;
...

And just use the injected resources in your test

    @Test(groups = "createData")
    public void testCustomerCreation() throws Exception {
        Customer cust1 = new Customer();
        ...
        custSvc.createCustomer(cust1);
    }

Btw, to use the latest Apache DeltaSpike snapshots you might need to configure the following maven repository in your project or ~/.m2/settings.xml :

  <repositories>
    <repository>
      <id>people.apache.snapshots</id>
      <url>https://repository.apache.org/content/repositories/snapshots/</url>
      <releases>
        <enabled>false</enabled>
      </releases>
      <snapshots>
        <enabled>true</enabled>
      </snapshots>
    </repository>
  </repositories>

Limits

The CdiContainer based unit tests work really fine if your project under test does not rely on EE container resources, like JNDI or JMS. It might work for e.g. EJB if you use Apache OpenWebBeans plus Apache OpenEJB or JBoss Weld plus JBoss microAS.
I would need to test this out to give more detailed infos.
As a general rule of thumb: once you need heavyweight EE resources in your backend, you should go the Arquillian route.

If you only use JNDI for configuring a JPA DataSource then please look at the documentation of the Apache MyFaces CODI ConfigurableDataSource. This will give you much more flexibility and removes the need for the almost-always-broken JNDI configuration in your persistence.xml. Instead you can use plain CDI mechanics to configure your database settings.

Summary

If you like to do more end-to-end or integration testing, then Arquillian is worth looking at. If you just need a quick unit test for your CDI based project, then take a look at my CdiContainerTest base class and grab the ideas you like. And don’t forget to provide feedback 😉

Why is OpenWebBeans so fast?

The Speed-King

Apache OpenWebBeans [1] is considered the Speed-King of dependency injection containers. A blog post by Gerhard Petracek showed this pretty nicely [2][3]. But why is this? During performance tests for the big EE application I have been developing with colleagues since late 2009 (40k users, 5 mio pages/day) we came across a few critical points.

1. Static vs Dynamic

The CDI specification requires scanning all the classes which are in locations that have a META-INF/beans.xml marker file and storing them in a big list of Bean instances. This information gets scanned only at startup, checked for inconsistencies and further optimized. At runtime all the important information is immediately available without any further processing.

This is afaik different in Spring which allows some kind of ‘dynamic’ reconfiguration. Afaik you can switch some parts of the bean configuration programmatically at runtime. This might be neat for some use cases (e.g. scripting integration) but definitely requires the container to do much more work at runtime. It also prevents applying some kinds of very aggressive caching.

2. Reduce String Operations

In dependency injection containers you often store evaluation results in Maps. We pretty often have constellations similar to the following example:

java.lang.reflect.Method met = getInjectionMethod();
Class<?> clazz = getInjectionClass();
String key = clazz.getName() + "/" + met.getName() + parametersString(met.getParameterTypes());
Object o = map.get(key);

This obviously is nothing I would ever write into a piece of performance-intensive code! The first step is to use a StringBuilder instead.

Using StringBuilder

StringBuilder sb = new StringBuilder();
sb.append(clazz.getName()).append('/').append(met.getName()).append('/');
appendParameterTypes(sb, met.getParameterTypes());
String key = sb.toString();

This is now twice as fast. But still way too slow for us 😉

Increase StringBuilder capacity

If I remember the sources correctly, StringBuilder by default has an internal buffer capacity of 16 which gets doubled every time the buffer gets exceeded. In our example above this mostly leads to expanding the internal capacity 3 or 4 times (depending on the length of the parameters, etc). Every time the capacity gets extended the internal string handling must perform an alloc + memcpy(newbuffer, oldbuffer) + free(oldbuffer). We can reduce this by initially allocating more space already:

StringBuilder sb = new StringBuilder(200);
sb.append(clazz.getName()).append('/').append(met.getName()).append('/');
...

This already helps a lot; the code now runs 3x as fast as with the default StringBuilder(). The downside is that you always need at least 200 bytes for each key. And we are still far from what’s possible performance-wise.

Hash based solution

Another solution looks like the following:

long key = clazz.getName().hashCode() + 29L * met.getName().hashCode();
for (Class<?> t : met.getParameterTypes()) {
  key += 29L * t.hashCode();
}
Object o = map.get(key);

This performs much faster than any String based solution. Up to 50 times faster for 3 String operands, to be more precise. Even more if more Strings need to be concatenated. Not only is the key creation much faster because it needs zero memory allocation; it also makes the map access faster. The downside is that the hash keys might clash.

Own Key Object

The solution we use in OWB now is to create our own key object which implements hashCode() and equals() in an optimized way.
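
The real OWB classes look a bit different, but the idea is roughly the following (a sketch with made-up names):

final class MethodKey {
  private final Class<?> clazz;
  private final java.lang.reflect.Method method;
  private final int hash;

  MethodKey(Class<?> clazz, java.lang.reflect.Method method) {
    this.clazz = clazz;
    this.method = method;
    this.hash = 31 * clazz.hashCode() + method.hashCode();
  }

  @Override
  public int hashCode() {
    return hash; // pre-computed, zero allocation per lookup
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof MethodKey)) return false;
    MethodKey other = (MethodKey) o;
    return clazz == other.clazz && method.equals(other.method);
  }
}

In contrast to the pure hash based solution this cannot produce false positives, while still avoiding all String concatenation.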

3. ELResolver tuning

Expression Language based programming is excessively used in JSP and JSF pages and also in other frameworks. It is fairly easy to use, pluggable, and integrates very well into existing frameworks. But many people are not aware that the EL integration is pretty expensive. It works by going down a chain of registered ELResolver implementations until one of them finds the requested object.

Nested EL calls

We will explain the impact of the ELResolver by going through a single EL invocation. Imagine the following line in a JSF page:

<h:inputText value="#{shoppingCart.user.address.city}"/>

How many invocations of ELResolver#getValue() do you think get executed?

The answer is: many! Let’s say we have 10 ELResolvers in the chain. This is not a fantasy value but comes pretty close to reality. While evaluating the whole expression, the EL integration splits the given EL expression on the dots (‘.’) and starts the resolving with the leftmost term (“shoppingCart”). This EL part will be passed down the ELResolver chain until one ELResolver knows the given name. Since the cheap ELResolvers with a high hit-ratio are usually put first, our CDI ELResolver will be somewhere in the middle. This means that we already get approximately 5 other ELResolver invocations before we find the bean…

The answer: our single EL expression will roughly perform 30 ELResolver invocations already!

But what can we do to improve the performance of our very own WebBeansELResolver at least?

EL caching

The OpenWebBeans WebBeansELResolver (or any CDI containers ELResolver) will typically execute the following code to get a bean:

Set<Bean<?>> beans = beanManager.getBeans(name);
Bean<?> bean = beanManager.resolve(beans);
CreationalContext<?> creationalContext = beanManager.createCreationalContext(bean);
Object contextualReference = beanManager.getReference(bean, Object.class, creationalContext);

This gives us quite a few things we can cache.

a.) we can cache the found Bean.

b.) for NormalScoped beans (most scopes, except @Dependent) we can even cache the contextualReference because in this case we always will get a Proxy anyway (see the spec: ‘Contextual Reference’).

Negative caching

Searching a bean is expensive. Searching it and NOT finding anything is even more expensive!

To prevent our ELResolver from doing this over and over again, we also cache the misses. This sped up the OpenWebBeans EL integration big time.
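
A rough sketch of the idea (the sentinel and the lookup method are made up):

// sentinel marking 'we already searched and found nothing'
private static final Object NOT_EXISTING_BEAN = new Object();
private final Map<String, Object> beanCache = new ConcurrentHashMap<>();

public Object resolveBean(String name) {
    Object cached = beanCache.get(name);
    if (cached == NOT_EXISTING_BEAN) {
        return null; // cached miss - no expensive lookup again
    }
    if (cached != null) {
        return cached;
    }
    Object bean = expensiveBeanLookup(name);
    beanCache.put(name, bean != null ? bean : NOT_EXISTING_BEAN);
    return bean;
}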

4. Proxy tuning

One of the most elaborate parts in OWB is the possibility to configure custom proxies for any scope.

The job of a Contextual Reference

Proxies in CDI are used for implementing interceptors, serialization and mainly for on-the-fly resolving of the underlying ‘Contextual Instance’ of a ‘Contextual Reference’. You can read more about this in the latest JavaTechJournal [4] ‘CDI Introduction’ article. For this last part the proxy will look up the correct Contextual Instance for each method invocation and then redirect the method invocation to this resolved instance. In OpenWebBeans this is done in the NormalScopedBeanInterceptorHandler.

But is it really necessary to always perform this expensive lookup?

Caching resolved Contextual Instances

Let’s take a look at a @RequestScoped bean. Once the servlet request gets started the resolved contextual instance doesn’t change anymore until the end of the request. So it should be possible to ‘cache’ this contextual instance and clean the cache when the servlet request ends. OpenWebBeans does this by providing its own Proxy-MethodHandler for @RequestScoped beans, the RequestScopedBeanInterceptorHandler, which stores this info in a ThreadLocal:

private static ThreadLocal<HashMap<OwbBean, CacheEntry>> cachedInstances 
    = new ThreadLocal<HashMap<OwbBean, CacheEntry>>();

An @ApplicationScoped bean in a WebApp doesn’t change the instance at all once we resolved it. Thus we use an ApplicationScopedBeanInterceptorHandler [5] to even cache more aggressively.
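
Strongly simplified, the idea behind such a caching handler looks about like this (a sketch with made-up names, not the real OWB code):

public class CachedInstanceHandler implements InvocationHandler {
    // one cache per thread; must be cleared when the request ends,
    // e.g. via a ServletRequestListener
    private static final ThreadLocal<Map<Bean<?>, Object>> CACHE =
            ThreadLocal.withInitial(HashMap::new);

    private final Bean<?> bean;
    private final BeanManager beanManager;

    public CachedInstanceHandler(Bean<?> bean, BeanManager beanManager) {
        this.bean = bean;
        this.beanManager = beanManager;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // resolve the contextual instance only once per thread and request
        Object instance = CACHE.get().computeIfAbsent(bean, b ->
                beanManager.getReference(b, b.getBeanClass(),
                        beanManager.createCreationalContext(b)));
        return method.invoke(instance, args);
    }

    public static void cleanupThread() {
        CACHE.remove();
    }
}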

Cache your own Scopes

As Gerhard’s blog post [2] shows, we can also use an OWB feature to configure the builtin and also custom Proxy-MethodHandlers for any given scope. The configuration is done via OpenWebBeans’ own configuration mechanism explained in a previous blog post.

Simply create a file META-INF/openwebbeans/openwebbeans.properties with content similar to the following:

org.apache.webbeans.proxy.mapping.javax.enterprise.context.ApplicationScoped=org.apache.webbeans.intercept.ApplicationScopedBeanInterceptorHandler

The general config pattern looks like the following:

org.apache.webbeans.proxy.mapping.[fully qualified scope name]=[proxy method-handler classname]

Summary

Why the hell did we put so much time and tricks into Apache OpenWebBeans?

The answer is easy: our application shows study codes of curricula on a single page. Up to 1600 rows in a <h:dataTable>, resulting in > 450,000 EL invocations! We tried this with other containers which used to take 6 seconds to render this page.

With Apache Tomcat + Apache OpenWebBeans + Apache MyFaces-2.1.x [6] + JUEL [7] + Apache OpenJPA [8] we are now down to 350ms for this very page …

have fun and LieGrue,
strub

PS: we already explained a few of our tricks to our friends from the Weld community. This resulted in their @RequestScoped beans getting 3 times faster and now being almost as fast as in OWB 😉

[1] Apache OpenWebBeans
[2] os890 blog 1
[3] os890 blog 2
[4] Java-Tech-Journal CDI special
[5] ApplicationScopedBeanInterceptorHandler.java
[6] Apache MyFaces
[7] JUEL EL-2.2 Implementation
[8] Apache OpenJPA