Friday, 21 October 2011

The Spring Batch framework's simplest tutorials (for 2.1.8)

Getting going with the Spring Batch framework is no mean feat. The official documentation is 136 pages long, and whilst incredibly useful, doesn't quite provide that first step on the ladder in the form of functioning tutorials/examples.

The SpringSource website also provides a great number of workable samples and downloads. These are fantastic once you are up and running, but they are of limited use until you have that foot on the ladder.

After some searching I did find three brilliant tutorials, which taught me a great deal. The tutorials are on 0xCAFEBABE's blog here:

Spring Batch Hello World 1

Spring Batch Hello World 2
and
Spring Batch Hello World 3

The only drawback is that they are a bit "2008" and therefore do not work out of the box with Spring 3 and Spring Batch 2.x.

At this point, if you are keen to learn about Spring Batch from the starting line, I would suggest doing what I did and simply following the tutorials. You'll have to fight your way through the stack traces, exceptions and compiler errors, but this will really educate you, and force you to examine your XML and the API docs, rather than just copying and pasting the examples into Eclipse and being done with it.

However, to give this post some beef, I will outline the updates I made to the tutorials for deployment on Spring 3 and Spring Batch 2.1.8. This might not be the best, or the only, way to modify these tutorials, and I would welcome any alternatives or feedback in the comments section.

Spring Batch Hello World 1 (update)

Once you have been through the tutorial, you'll first be faced with some compiler errors in PrintTasklet.java. This is because the Tasklet interface has changed in Spring Batch 2.x: its execute method now takes a StepContribution and a ChunkContext as parameters, and its return type is RepeatStatus. This is made clear in the updated SpringSource API docs. Ultimately, your class will now look like this:

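import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class PrintTasklet implements Tasklet {

    private String message;

    public void setMessage(String message) {
        this.message = message;
    }

    // Prints the configured message once, then tells the step there is no more work
    public RepeatStatus execute(StepContribution arg0, ChunkContext arg1) throws Exception {
        System.out.print(message);
        return RepeatStatus.FINISHED;
    }
}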
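The return type matters because the step uses it to decide what to do next. As I understand the 2.x contract, a tasklet is called again for as long as it returns RepeatStatus.CONTINUABLE, and the step moves on once it returns RepeatStatus.FINISHED. Here is a minimal, Spring-free sketch of that loop. StatusSketch, TaskletSketch and RepeatStatusSketch are stand-ins I made up for illustration, not Spring Batch classes, and the real step also wraps each call in a transaction and passes in the contribution and chunk context.

```java
// Hypothetical stand-ins, NOT the Spring Batch types: they only mirror the
// shape of the 2.x Tasklet contract (execute returns a status).
enum StatusSketch { CONTINUABLE, FINISHED }

interface TaskletSketch {
    StatusSketch execute();
}

class RepeatStatusSketch {

    // Calls the tasklet until it reports FINISHED and returns the number of calls made.
    static int runUntilFinished(TaskletSketch tasklet) {
        int calls = 0;
        StatusSketch status;
        do {
            status = tasklet.execute();
            calls++;
        } while (status == StatusSketch.CONTINUABLE);
        return calls;
    }

    // Builds a tasklet that asks to be called again until it has run 'times' times.
    static TaskletSketch countdown(int times) {
        final int[] remaining = { times };
        return () -> --remaining[0] > 0 ? StatusSketch.CONTINUABLE : StatusSketch.FINISHED;
    }

    public static void main(String[] args) {
        // Like PrintTasklet: one unit of work, then FINISHED, so it runs exactly once.
        TaskletSketch printOnce = () -> {
            System.out.println("Hello World!");
            return StatusSketch.FINISHED;
        };
        System.out.println(runUntilFinished(printOnce) + " call(s)");
        System.out.println(runUntilFinished(countdown(3)) + " call(s)");
    }
}
```

This is why PrintTasklet returns RepeatStatus.FINISHED: returning CONTINUABLE every time would have it printing the message forever.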

There are also a couple of configuration changes you will need to make. In your applicationContext.xml, the jobRepository bean will need a fourth constructor-arg (again, as you can see in the API), bringing its definition to:

 <bean id="jobRepository" class="org.springframework.batch.core.repository.support.SimpleJobRepository">
    <constructor-arg>
        <bean class="org.springframework.batch.core.repository.dao.MapJobInstanceDao"/>
    </constructor-arg>
    <constructor-arg>
        <bean class="org.springframework.batch.core.repository.dao.MapJobExecutionDao" />
    </constructor-arg>
    <constructor-arg>
        <bean class="org.springframework.batch.core.repository.dao.MapStepExecutionDao"/>
    </constructor-arg>
    <constructor-arg>
        <bean class="org.springframework.batch.core.repository.dao.MapExecutionContextDao"/>
    </constructor-arg>
 </bean>


You will also need to add a definition for a transactionManager bean to your applicationContext.xml:

<bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>

In the simpleJob.xml, you must pass this transactionManager into the taskletStep bean, making its definition:

<bean id="taskletStep" abstract="true" class="org.springframework.batch.core.step.tasklet.TaskletStep">
    <property name="jobRepository" ref="jobRepository"/>
    <property name="transactionManager" ref="transactionManager"/>
</bean>

And that should get you working. Use the batch file from the tutorial to launch your job from the command line.

Spring Batch Hello World 2 (update)

Getting the second tutorial up and running in your 2.1.8 environment is no more difficult. In fact, it's a similar set of changes. ParameterPrintTasklet.java needs the same update, to become:

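import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.listener.StepExecutionListenerSupport;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class ParameterPrintTasklet extends StepExecutionListenerSupport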
        implements Tasklet {

    private String message;

    public void beforeStep(StepExecution stepExecution) {
        JobParameters jobParameters = stepExecution.getJobParameters();
        message = jobParameters.getString("message");
    }

    public RepeatStatus execute(StepContribution stepcontribution,
            ChunkContext chunkcontext) throws Exception {
        System.out.println(message);
        return RepeatStatus.FINISHED;
    }
}


As long as you have kept your applicationContext.xml up to date from the first tutorial, all you need to change in the simpleJob.xml is to pass the transactionManager bean into your taskletStep (note this bean is now defined within the parameterJob bean):

<property name="steps">  
  <list>  
    <bean class="org.springframework.batch.core.step.tasklet.TaskletStep">  
      <property name="tasklet" ref="print"/>  
      <property name="jobRepository" ref="jobRepository"/>  
      <property name="transactionManager" ref="transactionManager"/>
    </bean>
  </list>
</property> 


Spring Batch Hello World 3 (update)

Despite being the most (functionally) complicated example, there is not a massive amount to change in tutorial 3 to make it work in the modern world. Basically, the definitions of the beans need updating to reflect the new constructors and properties of the v2.1.8 Spring Batch API.

To keep it really short and sweet, here is the simpleJob.xml once I had finished with the update:


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd">
                                
    <import resource="applicationContext.xml"/>
    
    <!-- Set up our reader and its properties -->
    <bean id="itemReader" class="org.springframework.batch.item.file.FlatFileItemReader">  
      <property name="resource" value="file:hello.txt" />  
      <property name="recordSeparatorPolicy" ref="recordSeparatorPolicy" />  
      <property name="lineMapper" ref="lineMapper" />  
    </bean>

    <bean id="recordSeparatorPolicy" class="org.springframework.batch.item.file.separator.SimpleRecordSeparatorPolicy"/>
    
    <bean id="lineMapper" class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
        <property name="lineTokenizer" ref="lineTokenizer" />
        <property name="fieldSetMapper" ref="fieldSetMapper" />
    </bean>
    
    <bean id="fieldSetMapper" class="org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper" />
    
    <bean id="lineTokenizer" class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
        <constructor-arg value="," />
    </bean>
    
    <!-- Set up our writer and its properties -->
    <bean id="itemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">  
      <property name="resource" value="file:hello2.txt" />
      <property name="lineAggregator" ref="lineAggregator"/>
    </bean>

    <bean id="lineAggregator" class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">  
      <property name="delimiter" value=" "/>
    </bean>
    
    <!-- Set up our transformation step with these beans in -->      
    <bean id="step" class="org.springframework.batch.core.step.item.SimpleStepFactoryBean">  
      <property name="transactionManager" ref="transactionManager" />  
      <property name="jobRepository" ref="jobRepository" />  
      <property name="itemReader" ref="itemReader" />  
      <property name="itemWriter" ref="itemWriter" />  
    </bean>
    
    <!-- Set up our job to run said step -->
    <bean id="readwriteJob" class="org.springframework.batch.core.job.SimpleJob">  
      <property name="name" value="readwriteJob" />  
      <property name="steps">  
        <list>  
          <ref local="step"/>  
        </list>  
      </property>  
      <property name="jobRepository" ref="jobRepository"/>
    </bean>  
    
</beans>


The big differences are in the definitions of FlatFileItemReader and FlatFileItemWriter.

The DelimitedLineTokenizer from 1.1.4 is no longer a direct property of the FlatFileItemReader, and instead needs to be passed into a lineMapper bean, which in turn is then injected into the FlatFileItemReader.

The FlatFileItemWriter no longer needs a fieldSetCreator bean, but leave the bean definition in place, as the itemReader still needs it.
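To picture how the reader and writer pieces fit together, here is a rough, Spring-free sketch of the same chain: a tokenizer splits the line, a line mapper hands the tokens to a field-set mapper, and on the way out an aggregator joins the fields with a new delimiter. LineMapperSketch and its methods are hypothetical stand-ins rather than the Spring Batch classes, the sample line is made up, and details such as quote handling are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical stand-ins that mirror the wiring in simpleJob.xml; not the Spring Batch API.
class LineMapperSketch {

    // Stand-in for the lineTokenizer bean: split a raw line on a delimiter (no quote handling).
    static Function<String, List<String>> tokenizer(String delimiter) {
        return line -> Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    // Stand-in for the lineMapper bean: tokenize first, then pass the fields to the field-set mapper.
    static <T> Function<String, T> lineMapper(Function<String, List<String>> tokenizer,
                                              Function<List<String>, T> fieldSetMapper) {
        return line -> fieldSetMapper.apply(tokenizer.apply(line));
    }

    // Stand-in for the lineAggregator bean: join the fields with the output delimiter.
    static String aggregate(List<String> fields, String delimiter) {
        return String.join(delimiter, fields);
    }

    public static void main(String[] args) {
        // Pass-through mapper: hands the fields on unchanged, as in the XML above.
        Function<List<String>, List<String>> passThrough = fields -> fields;
        Function<String, List<String>> mapper = lineMapper(tokenizer(","), passThrough);

        List<String> item = mapper.apply("Hello,World"); // made-up sample line
        System.out.println(aggregate(item, " "));        // prints "Hello World"
    }
}
```

The point of the extra lineMapper layer is that the reader only ever sees "line in, item out", so you can swap the tokenizer or the field-set mapper without touching the reader definition.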


Don't forget to give your SimpleStepFactoryBean the transactionManager, and you should be good to go. The paths of the resource files are relative to the root of the Java project.


4 comments:

  1. Thanks for pointing out the right articles to kickstart spring batch. Gives some motivation to finally jump in to batch.

    1. Thanks Anupam. Please let me know how you get on with this post, as you can see, I finished it about a year ago now and I'd love to know if it still all works and is relevant!

  2. nice post for beginners. Thanks
