Applying interceptors to producer methods

Interceptors are really cool if you have a common problem and like to apply it to without making every single colleague copy the same code over again and again to apply a solution over the whole code base.

In my case it was the urge to log out SOAP and REST invocations to other systems. I also like to add a logCorrelationId via HTTP header to each outgoing SOAP call. You can read more about the background over in my other logCorrelation blog post.

I’ll focus on integrating SOAP clients, but you can easily do the same for REST clients as well.

Integrating a SOAP client in an EE project

Usually I create a CDI producer for my SOAP ports. That way I can easily mock them out with a local dummy implementation by just using CDI’s @Specializes or @Alternative. If you combine this with with Apache DeltaSpike @Exclude and the DeltaSpike Configuration system then you can even even enable those Mock via ProjectStage or a configuration setting.

Consider you have a WSDL and you create a SOAP client with the interface CustomerService.

What we like to get from a ‘consumer’ perspective is the following usage:

public class SomeFancyClass {
  private @Inject CustomerService customerService;
  ...
}

Which means you need a CDI producer method, e.g. something like:

@ApplicationScoped
public class CusomerServiceSoapClientProducer {
  @ConfigProperty(name = "myproject.customerService.endpointUrl")
  private String customerServiceEndpointUrl;

  @Produces
  @RequestScoped
  @LogTiming
  public CustomerService createSoapPort() {
    // generated from the WSDL, e.g. via CXF
    CustomerServiceService svc = new CustomerServiceService();
    CustomerServiceServicePort port = svc.getCustomerServiceServicePort();

    // this sets the endpoint URL during producing.
    ((BindingProvider) port).getRequestContext().
           put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, customerServiceEndpointUrl);

    return port;
  }
}

Side note: the whole class could also be @RequestScoped to get the endpoint URL evaluated on every request. We could of course also use the DeltaSpike ConfigResolver programmatically to gain the same. But the whole point of setting the endpoint URL manually is that we don’t need to change the WSDL and have to recompile the project on every server change. We can also use different endpoints for various boxes (test vs production environments, or different customers) that way.

What is this @LogTiming stuff?

Now it becomes interesting! We now have a SOAP client which looks like a regular CDI bean from a ‘user’ point of view. But we like to get more information about that outgoing call. After all it’s an external system and we have no clue how it behaves in terms of performance. That means we like to protocol each and every SOAP call and log out it’s duration. Of course since we not only have 1 SOAP service client but multiple dozen ones we like to do this via an Interceptor!

@Inherited
@InterceptorBinding
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
public @interface LogTiming {
}

Applying an Interceptor on a producer method?

Of course the code written above DOES work. But it behaves totally different as many of you will guess.
If you apply an interceptor annotation to a producer method, then it will not intercept the calls to the produced bean!
Instead it will just intercept the invocation of the producer method. A producer method gets invoked when the Contextual Instance gets created. For a @Produces @RequestScoped annotated producer method this will happen the first time a method on the produced CDI bean gets called in the very request (or thread for non-servlet request based threads). And exactly this call gets intercepted.

If we would just apply a stopwatch to this interceptor then we would get the info about how long it took to create the soap client. That’s not what we want! We like to get the times from each and every usage of that CustomerService invocation! So what does our LogTiming interceptor do?

Proxying the Proxy

The trick we apply is to to use our LogTiming Interceptor to wrap the produced SOAP port in yet another proxy. And this proxy logs out the request times, etc. As explained before we cannot use CDI interceptors, but we can use java.lang.reflect.Proxy!:

@LogTiming
@Interceptor
public class WebserviceLoggingInterceptor {

    @AroundInvoke
    private Object wrapProxy(InvocationContext ic) throws Exception {
        Object producedInstance = ic.proceed();
        Class[] interfaces = producedInstance.getClass().getInterfaces();
        Class<?> returnType = ic.getMethod().getReturnType();
        return Proxy.newProxyInstance(ClassUtils.getClassLoader(null), interfaces, new LoggingInvocationHandler(producedInstance, returnType));
    }
}

This code will register our reflect Proxy in the CDI context and each time someone calls a method on the injected CustomerService it will hit the LogInvocationHandler. This handler btw can also do other neat stuff. It can pass over the logCorrelationId (explanation see my other blog post linked above) as HTTP header to the outgoing SOAP call.

The final LoggingInvocationHandler looks like the following:

public class LoggingInvocationHandler implements InvocationHandler {
    private static final long SLOW_CALL_THRESHOLD = 100; // ms
 
    private final Logger logger;
    private final T delegate;

    public LoggingInvocationHandler(T delegate, Class loggerClass) {
        this.delegate = delegate;
        this.logger = LoggerFactory.getLogger(loggerClass);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (EXCLUDED_METHODS.contains(method.getName())) {
            // don't log toString(), hashCode() etc...
            return method.invoke(delegate, args);
        }

        long start = System.currentTimeMillis();

        try {
            // setting log correlation header if any logCorrelationId is set on the thread.
            String logCorrelationId = LogCorrelationUtil.getCorrelationId();
            if (StringUtils.isNotEmpty(logCorrelationId) && delegate instanceof BindingProvider) {
                BindingProvider port = (BindingProvider) delegate;
                Map<String, List> headers = (Map<String, List>) port.getRequestContext().get(MessageContext.HTTP_REQUEST_HEADERS);
                if (headers == null) {
                    headers = new HashMap<>();
                }
                headers.put(LogCorrelationUtil.REQUEST_HEADER_CORRELATION_ID, Collections.singletonList(logCorrelationId));
                port.getRequestContext().put(MessageContext.HTTP_REQUEST_HEADERS, headers);
            }

            // continue with the real call
            return method.invoke(delegate, args);
        }
        finally {
            long duration = System.currentTimeMillis() - start;
            if (duration <= SLOW_CALL_THRESHOLD) {
                logger.info("soapRemoteCall took={} ms service={} method={}", duration, delegate.getClass().getName, method.getName());
            }
            else {
                // log a more detailed msg, e.g. with params
            }
        }
    }

Limitations

Of course this trick only works if the producer method returns an interface! That’s caused by the reflect Proxies are only available for pure interfaces.

I’m trying to remove this limitations by bringing intercepetors for produced instances to CDI-2.0 as well on working on a interceptors spec change to introduce ways to create subclassing proxies as easy as interface proxies. Stay tuned!

What is LogCorrelation?

While working on an article about CDI interceptors on producer methods I mention logCorrelation. I will not go into detail on this topic over at the other blog post as it would be simply too much over there. And this gives a great topic for a separate post anyway. And here we go…

So what is LogCorrelation?

Consider you have a more or less distributed application topology. You might have a server which does maintain customer data. There might be another box handling all the document archive, another one which holds the calculation kernel, etc etc.

Nowadays all people would say that are microservices. 8 years ago all people called it SOA. To be honest I GIVE A SHIT how some sales people name it as all this is around since much longer than I’m working in the industry (which is a whoopie 25 years already). It’s just modular applications talking with each other somehow. Sometimes via SOAP or REST, but maybe even via MessageQueue, shared database tables or file based with batches handling the passing over – to me it doesn’t matter much.

But for all those the problem is the same: Consider a user clicks on some button in his browser or fat client. This triggers an ‘application action’. And this single action might hit the first server, then this server pings another one, etc. Synchronous or asynchronous also doesn’t matter. This might go all over the place in your company and even externally.  At the end something happens and the user most times gets some response. And it is really, REALLY hard to tell what’s wrong and where it went wrong if something doesn’t work as expected or returns wrong results. Often you don’t even have a clue which servers were involved. And if your whole application starts to behave ‘laggy’ then you will have a hard time judging which system you need to tinker with.

Now how cool would it be if you could follow this single action over all the involved servers?

And this is exactly what logCorrelation does!

What is the trick?

The trick is really plain simple. Each ‘action’ gets an own unique logCorrelationId. That might be a UUID for example. The only requirement is that it’s ‘decently’ unique.

If a server gets a request then he checks if there was a logCorrelationId passed to him. If so, then he takes this id and stores it in a ThreadLocal. If there was no id passed, then this is a ‘new’ action and we generate a fresh logCorrelationId. Of course this logCorrelationId will also get set as e.g. HTTP header for all subsequent outgoing HTTP calls on this very thread.

Where do I show the logCorrealationId?

Our applications now all handle this logCorrelationId properly, but where can I look at it? What is the benefit of all this?

At my customers I mainly use Apache log4j as logging backend, (often with slf4j as API). The point is that only log4j (and logback, but with way worse performance) support a nice little feature called MDC which stands for Mapped Diagnostic Context.  It is basically a ThreadLocal<Map<String, String>> which will get logged out in each and every line you log out on this very thread.

This log4j feature can also be accessed via the slf4j API. E.g. in a javax.servlet.Filter

MDC.set("correlationId", logCorrelationId);
MDC.set("sessionId", httpSessionId);
MDC.set("userId", loggedInUser.getId());

For enabling it in the log output you need to configure a ConversionPattern:

<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
    <appender name="console" class="org.apache.log4j.ConsoleAppender">
        <param name="Target" value="System.out"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{ISO8601} [%t] %X{sessionId} %X{userId} %X{correlationId} %-5p %c{2} %m%n"/>
        </layout>
    </appender>

If you logging is configured properly in your company and you funnel all back into log aggregation systems like ELK (OpenSource with commercial support offering) or Splunk (Commercial with limited free offering) then you can now simply follow a single action over all the various systems.

What about non-user requests?

Non user requests can sometimes even be filled with more information. At a few customers we use Camunda BPMN Suite (OpenSource with commercial support). The core has a Thread which basically polls the DB and fetches new tasks to execute from it. Those will then get ‘executed’ in a parallel thread. For those threads we intercept the Executor and fill the logCorrelationId with the camunda jobId which basically is a uuid starting with ‘cam-‘. So once a process task blows up we can exactly figure what went wrong – even on a different server.

Of course this trick is not limited to the process engine…

PS: how does my stuff look like?

Probably it’s also worth sharing my LogCorrelationUtil:

/**
 * Helper for log correlation.
 *
 * @author Mark Struberg
 */
public class LogCorrelationUtil {

    public static final String REQUEST_HEADER_CORRELATION_ID = "X_LOG_CORRELATION_ID";
    public static final String MDC_CORRELATION_ID = "correlationId";

    private LogCorrelationUtil() {
    }

    /**
     * Creates a new log correlation Id ONLY IF there is no existing one!.
     * Takes about 4 uS, because I use a much faster UUID algo
     *
     * @param logEnvironment prefix for the logCorrelationId if a new one has to be created. Determines the environment the uuid got created in.
     * @param existingLogCorrelationId oder {@code null} if there is none yet
     */
    public static String enforceLogCorrelationId(LogEnvironment logEnvironment, String existingLogCorrelationId) {
        if (existingLogCorrelationId != null && existingLogCorrelationId.length() > 0) {
            return existingLogCorrelationId;
        }
        ThreadLocalRandom random = ThreadLocalRandom.current();
        String uuid = new UUID(random.nextLong(), random.nextLong()).toString();

        if (logEnvironment != null) {
            StringBuilder sb = new StringBuilder(60);
            sb.append(logEnvironment);
            sb.append(uuid);
            uuid = sb.toString();
        }
        return uuid;
    }

    /**
     * @return the logCorrelationId for this thread or {@code null}
     */
    public static final String getCorrelationId() {
        return MDC.get(MDC_CORRELATION_ID);
    }

    /**
     * Set the given logCorrelationId for the current Thread.
     */
    public static final void setCorrelationId(String logCorrelationId) {
        MDC.put(MDC_CORRELATION_ID, logCorrelationId);
    }

    /**
     * Clears the logCorrelationId from the current Thread.
     * This method MUST be called at the end of each request 
     * to prevent mem leaks!
     */
    public static final void clearCorrelationId() {
        MDC.remove(MDC_CORRELATION_ID);
    }
}

Being ‘unstoppable’ – a Batchlets tale

How to stop a JBatch Batch

JSR-352 (JBatch) is a great specification. It takes care of many situations a user don’t think about most times. For example how to stop a Batch. But sometimes it cannot relief you from putting some brain into it.

How to trigger a Batch stop?

The JobOperator has a method to stop a specific execution: JobOperator#stop(long executionId).

Of course the JobOperator will not immediately kill the worker thread with this batch but tries to gracefully shut down the Batch.

Stopping a ‘Chunk Step’

First, what is a ‘Chunk’? A chunk is a batch <step> which consists of an ItemReader an optional ItemProcessor and an ItemWriter. A ‘chunk’ defines the transaction size of the processing. Let’s consider a chunk-size of 10. This means that our step processes 10 items and then commits all of them in a single commit.

The processing order is as following
ItemReader, ItemProcessor, ItemReader, ItemProcessor,… until we did read and process our 10 items. After that all the 10 items will get handed over to the ItemWriter to store them somewhere. After that a commit happens and the loop starts over with the next items.

If you call JobOperator#stop(executionId) for a Chunk Step then the loop which invokes the ItemReader, ItemProcessor and ItemWriter will continue with reading and processing the current Item and then hand over all the currently chained Items to the ItemWriter. After that the loop exits gracefully.

That’s nice and clean! But what about Batchlets?

Stopping a ‘Batchlet’

There is a good reason why I write this post today. In the last few weeks we had a few Batchlets which didn’t behave ‘nice’ to our ops team. Those beasts didn’t want to stop working! Of course the problem only occurred in production and not in all the tests done before. Simply because in production we have millions of items to process whereas in the test they just fed in a a few thousand items.

So why didn’t those Batchlets stop?

First we have to understand what a Batchlet is. Opposite to a Chunk Step a Batchlet is a ‘do-it-yourself’ thingy. The JBatch runtime really hands over all the control to your code. It doesn’t even do Transactions for you! It is really all in your hands. Usually such a batchlet contains a processing loop as well:

public class MyBatchlet extends AbstractBatchlet {
  @Override
  public String process() throws Exception {
    List items = readAllItemsToProcess();
    for (MyItem item : items) {
      processAndStoreMyItem();
    }
    return "OK";
  }
}

That’s nice…. but won’t stop for you…

So what is missing? Yes, AbstractBatchlet implements an empty stop() method. And this is often a bad idea…

Our code should better look somehow like the following:

public class MyBatchlet implements Batchlet {
  private volatile boolean shouldStop = false;

  @Override
  public void stop() {
    shouldStop = true;
  }

  @Override
  public String process() throws Exception {
    List items = readAllItemsToProcess();
    for (MyItem item : items) {
      processAndStoreMyItem();
      if (shouldStop) {
        return "STOPPING";
      }
    }
    return "OK";
  }
}

There are a few important details:
1.) the boolean shouldStop field really needs to be volatile. That is because the stop() method gets called from a different thread and otherwise the new value might not be visible to the worker thread. Read up more on volatile over at the excellent Angelika Langers Java Memory Model talk.

2.) I’m thinking about preventing the usage of “extends AbstractBatchlet” via a checkstyle rule. It’s actually not worth having this AbstractBatchlet. People should be aware that they missed the stop() functionality!

JTA vs resource-local performance

A few years ago I did a simple test to compare how JTA handling (via UserTransaction) compares to resource local transaction handling. Back then using JTA had a rather big impact on the performance. Time to repeat this test with a modern EE server.

So I went on and created a very simple JPA sample which does a loop and creates 1000 Customer entries in a simple h2 memdb. I did choose h2 memdb because it’s pretty fast. At least much faster than any production ready DB which stores the stuff on a disk. In the end we like to know the performance of JTA and not bench the database.

My simple sample can be downloaded at https://github.com/struberg/jtabench
To start it just run
$> mvn clean install tomee:run

So far my tests don’t show a huge problem.

When I run the benchmark against the resource-local part (http://localhost:8080/jtabench/customer/nonjta) I get
Resource-Local: 21.6 pages/second.
That means 21600 inserts per second.

If I do the very same benchmark against the JTA part (http://localhost:8080/jtabench/customer/jta) I get about
JTA: 19.0 pages/second.
And please remember that h2 memdb is really fast! Thus with a real database load the difference will simply be negligible.

If you reproduce the test yourself locally then don’t forget to clean the databases inbetween benchmark runs with http://localhost:8080/jtabench/customer/reset . This will delete all temporarily created Customer entries in the dbs.

Note: I’m not quite sure how much optimization geronimo-tx applies if there is only a single DataSource involved. Need to dig that myself. Probably will provide a follow up test with 2++ different databases…

The (mostly) unknown story behind javax.ejb.EJBException

Yesterday I blogged about what impact Exceptions do have on JavaEE transactions in EJB.
But there is another funny EJB Exception mechanism waiting for you to get discovered – the javax.ejb.EJBException.

This handling dates back to the times when EJB was mainly intended to be a competition to NeXT Distributed Objects and Microsoft DCOM. Back in those days it was all about ‘Client-Server Architecture’ and people tried to spread the load to different servers on the network. A single EJB back then needed 4 classes and was inherently remote by default.

Only much later EJBs got a ‘local’ behaviour as well. And only in EJB-3.1 the No-Interface View (NIV) got introduced which made interfaces obsolete and are local-only.
But for a very long time remoting was THE default operation mode of EJBs. So all the behaviour was tailored around this – regardless whether you are really using remoting or are running in the very same JVM.

The impact of remoting

The tricky situation with remote calls is that you cannot be sure that every class is available on the client.

Imagine a server which uses JPA. This might throw a javax.persistence.EntityNotFoundException. But what if the caller – a Swing EJB client app – doesn’t have any JPA classes on it’s classpath?
This will end up in a ClassNotFoundException or NoClassDefFoundException because de-serialisation of the EntityNotFoundException will blow up on the client side.

To avoid this from happening the server will serialize a javax.ejb.EJBException instead of the originally thrown Exception in certain cases. The EJBException will contain the original Exceptions stack trace as pure Strings. So you at least have the information about what was going wrong in a human readable format.

If you like to read up the very details then check out 9.4 Client’s View of Exceptions in the EJB specification.

Switching on the ‘Auto Pilot”

Some containers like e.g. OpenEJB/TomEE contain a dual-strategy. We have a ‘container’ (ThrowableArtifact) which wraps the orignal Throwable plus the String interpretation and sends both informations as fallback over the line.

On the client side the de-serialization logic of ThrowableArtifact first tries to de-serialize the original Exception. Whenever this is possible you will get the originally thrown Exception on the client side. If this didn’t work then we will use the passed information and instead of the original Exception we throw an EJBException with the meta information as Strings.

The impact on your program?

The main impact for you as programmer is that you need to know that you probably not only need to catch the original Exception but also an EJBException. So this really changes the way your code needs to be written.
And of course if you only got the EJBException then you do not exactly know what was really going on. If you need to react on different Exceptions in different ways then you might try to look it up in the exception message but you have no type-safe way anymore. In that case it might be better to catch it on the server side and send an explicit @ApplicationException over the line.

When do I get the EJBException and when do I get the original one?

I’d be happy to have a good answer myself ;)

My experience so far is that it is not well enough specified when each of them gets thrown. But there are some certain course grained groups of container behaviour:

  • Container with Auto-Pilot mode; Those containers will always try to give you the original Exception. And only if it is really not technically possible will give you an EJBException. E.g. TomEE works that way.
  • Container who use the original Exception for ‘local’ calls and EJBException for remote calls.
  • Container who will always give you an EJBException – even for local invocations. I have not seen those for quite some time though. Not sure if this is still state of the art?

Any feedback about which container behaves what way is welcome. And obviously also if you think there is another category!

Transaction handling in EJBs and JavaEE7 @Transactional

Handling transactions in EJBs is easy, right? Well, in theory it should be. But how does the theory translate into reality once you leave the ivory tower?

I’ll show you a small example. Let’s assume we have 2 infrastructure service EJBs:

@Stateless
public class StorageServiceImpl implements StorageService {
  private @EJB CustomerService customerService;
  private @PersistenceContext EntityManager em;

  public void chargeStorage(int forYear) throws CustomerNotFoundException {
    storeNiceLetterInDb(em);
    Customer c = customerService.getCurrentCustomer(); 
    doSomethingElseInDB(); 
  }
} 

And now for the CustomerService which is an EJB as well:

@Stateless
public class CustomerServiceImpl implements CustomerService {
  public Customer getCurrentCustomer() throws CustomerNotFoundException {
    // do something if there is a current customer
    // otherwise throw a CustomerNotFoundException
  }
}

The Sunshine Case

Let’s first look at what happens if no problems occur at runtime.

In the normal operation mode some e.g. JSF backing bean will call storageService.chargeService(2015);. The implicit transaction interceptor will use a TransactionManager (all done in the interceptor which you do not see in your code) to check whether a Transaction is already open. If not it will open a new transaction and remember this fact. The same check will happen in the implicit transaction interceptor for the CustomerService.

When leaving CustomerService#getCurrentCustomer the interceptor will recognize that it didn’t open the transaction and thus will simply return. Otoh when leaving StorageService#chargeStorage it’s interceptor will commit the transaction and close the EntityManager.

Broken?: Handling checked Exceptions

Once we leave the sunny side of the street and hit some problems the whole handling start to become messy. Let’s look what happens if there is a checked CustomerNotFoundException thrown in CustomerService#getCurrentCustomer. Most people will now find their first surprise: The database changes done in storeNiceLetterInDb() will get committed into the database.

So we got an Exception but the transaction still got committed? WT*piep*!
Too bad that this is not a bug but the behaviour is exactly as specified in “9.2.1 Application Exceptions” of the EJB specification:

An application exception does not automatically result in marking the transaction for rollback unless the ApplicationException annotation is applied to the exception class and is specified with the rollback element value true…

So this means we could annotate the CustomerNotFoundException with @javax.ejb.ApplicationException(rollback=true) to force a rollback.
And of course we need to do this for ALL checked exceptions if we like to get a rollback.

Broken?: Handling unchecked Exceptions

The good news upfront: unchecked Exceptions (RuntimeExceptions) will usually cause a rollack of your transaction (unless annotated as @AppliationException(rollback=false) of course).

Let’s assume there is some other entity lookup in the code and we get a javax.persistence.EntityNotFoundException if the address of the customer couldn’t be found. This will rollback your transaction.

But what can we do if this is kind of expected and you just like to use a default address in that case? The natural solution would be to simply catch this Exception in the calling method. In our case that would be a try/catch block in StorageServiceImpl#chargeStorage.

That’s a great idea – but it doesn’t work in many containers!

Some containers interpret the spec pretty strictly and do the Exception check on _every_ layer (EJB spec 9.3.6) . And if the interceptor in the CustomerService detects an Exception then the implicit EJB interceptor will simply roll back the whole transaction and mark it as “rollbackOnly”. Catching this Exception in an outer level doesn’t help a bit. You will not get your changes into the database. And if you try to do even more on the database then you will blow up again with something like “The connection was already marked for rollback”.

And how is that with @javax.transaction.Transactional?

Basically the same like with EJBs. In my opinion this was a missed chance to clean up this behaviour.
You can read this up in chapter 3.6 of the JTA-1.2 specification.

The main difference is how to demarcate rollback vs commit exceptions. You can use the rollbackOn and dontRollbackOn attributes of @Transactional:

@Transactional(rollbackOn={SQLException.class}, dontRollbackOn={SQLWarning.class})

Now what about DeltaSpike @Transactional?

In Apache DeltaSpike @Transactional and it’s predecessor Apache MyFaces CODI @Transactional we have a much cleaner handling:

Exceptions only get handled on the layer where the transaction got opened. If you catch an Exception along the way than we do not care about it.

Any Exception on the outermost layer will cause a rollback of your transaction. It doesn’t matter if it is a RuntimeException or a checked Exception.

If there was no Exception in the outermost interceptor then we will commit the transaction.

PS: please note that I explicitly used interfaces in my samples. Otherwise you will get NIV (No Interface View) objects which again might behave slightly different as they use a totally different proxying technique and default behaviour. But that might be enough material for yet another own blog post.
PPS: I also spared you EJBs with TransactionManagementType.BEAN. That one is also pretty much non-portable by design as you effectively cannot nest them as it forces you to either commit or rollback the tx on every layer. Some containers work fine while others really force this.

The right Scope for JBatch Artifacts

In my recent JavaLand conference talk about JSR-352 JBatch and Apache BatchEE I briefly mentioned that JBatch Artifacts should have a scope of @Dependent (or Prototype scope if you are using Spring). Too bad there was not enough time to dig into the problem in depth so here comes the detailed explanation.

What is a JBatch Artifact

A JBatch batch needs a Job Specification Language XML file in META-INF/batch-jobs/*.xml files. These files describes how your batch job is built up.

Here is a small example of how such a batch JSL file could look like

<job id="mysamplebatch" version="1.0" xmlns="http://xmlns.jcp.org/xml/ns/javaee">
  <step id="mysample-step">
    <listeners>
      <listener ref="batchUserListener" >
      <properties>
        <property name="batchUser" value="#{batchUser}"/>
      </properties>
      </listener>
    </listeners>
    <batchlet ref="myWorkerBatchlet">
      <properties>
        <property name="inputFile" value="#{inputFile}"/>
      </properties>
    </batchlet>
  </step>

In JSR-352 an Artifact are all pieces which are defined in your JBatch JSL file and get requested by the container. In the sample above this would be batchUserListener and myWorkerBatchlet.

The following types can be referenced as Batch Artifacts from within your JSL:

  • Batchlets
  • ItemReader
  • ItemProcessor
  • ItemWriter
  • JobListener
  • StepListener
  • CheckpointAlgorithm
  • Decider
  • PartitionMapper
  • PartitionReducer
  • PartitionAnalyzer
  • PartitionCollector

The Batch Artifact Lifecycle

The JBatch spec is actually pretty clear what lifecycle needs to get applied on Batch Artifacts:

11.1 Batch Artifact Lifecycle
All batch artifacts are instantiated prior to their use in the scope in which they are declared in the Job XML and are valid for the life of their containing scope. There are three scopes that pertain to artifact lifecycle: job, step, and step-partition.
One artifact per Job XML reference is instantiated. In the case of a partitioned step, one artifact per Job XML reference per partition is instantiated. This means job level artifacts are valid for the life of the job. Step level artifacts are valid for the life of the step. Step level artifacts in a partition are valid for the life of the partition.
No artifact instance may be shared across concurrent scopes. The same instance must be used in the applicable scope for a specific Job XML reference.

The problem is that whenever you use a JavaEE artifact then you might get only a proxy. Of course the reference to this proxy gets thrown away correctly but the instance behind the proxy might survive. Let’s look at how this works internally.

How Batch Artifacts get resolved

A JBatch implementation can provide it’s own mechanism to load the artifacts. This is needed as it is obviously different whether you use BatchEE with CDI or if you use Spring Batch (which also implements JSR-352).
In general there are 3 different ways you can reference a Batch Artifact in your JSL:

  1. Via a declaration in an optional META-INF/batch.xml file. See the section 10.7.1 of the specification for further information.
  2. Via it’s fully qualified class name.
    In BatchEE we first try to get the class via BeanManager#getBeans(Type) and BeanManager#getReference. If that doesn’t give us the desired Contextual Reference (CDI proxy) then we simply call ClassLoader#loadClass create the Batch Artifact with newInstance() and apply injection into this instance
  3. Via it’s EL name. More precisely we use BeanManager#getBeans(String elName) plus a call to BeanManager#getReference() as shown above.

We now know what a Batch Artifact is. Whenever you are on a JavaEE Server you will most likely end up with a CDI or EJB Bean which got resolved via the BeanManager. If you are using Spring-Batch then you will most times get a nicely filled Spring bean.

The right Scope for a Batch Artifact

I’ve seen the usage of @javax.ejb.Stateless on Batch Artifacts in many samples. I guess the people writing such samples never used JBatch in real production yet ;) Why so? Well, let’s look at what would happen if we implement our StepListener as stateless EJB:

@javax.ejb.Stateless
@javax.inject.Named // this makes it available for EL lookup
public class BatchUserListener implements StepListener {
  @Inject 
  @BatchProperty
  private String batchUser;

  @Override
  public void beforeStep() throws Exception {
     setUserForThread(   
  }

  @Override
  public void afterStep() throws Exception {
    clearUserForThread();
  }
}

Now let’s assume that the BatchUserListener gets not only used in my sample batch but in 30 other batches of my company (this ‘sample’ is actually taken from a HUGE real world project where we use Apache BatchEE since over a year now).

What will happen if e.g. a ‘DocumentImport’ batch runs before my sample batch? The first batch who uses this StepListener will create the instance. At the time when the instance gets created by the container (and ONLY at that time) it will also perform all the injection. That means it will look up the ‘batchUser’ parameter and injects it into my @BatchProperty String. Let’s assume this DocumentImport batch uses a ‘documentImportUser’. So this is what we will get injected into the ‘batchUser’ variable;

Once the batch step is done the @Stateless instance might be put back into some pool cache. And if I’m rather unlucky then exactly this very instance will later get re-used for mysample-step. But since the listener already exists there will no injection be performed on that instance. What means that the steplistener STILL contains the ‘documentImportUser’ and not the ‘mySampleUser’ which I explicitly did set as parameter of my batch.

The very same issue also will happen for all injected Variables which do not use proxies, e.g.:

  • @Inject StepContext
  • @Inject JobContext

TL;DR: The Solution

Use @Dependent scoped beans for your Batch Artifacts and only use another scope if you really know what you are doing.

If you like to share code across different items of a Step or a Job then you can also use BatchEE’s @StepScoped and @JobScoped which is available through a portable BatchEE CDI module

Follow

Get every new post delivered to your Inbox.