
    Getting Started with Spring Boot Batch Processing


    Spring Batch is a lightweight, open-source framework for building scalable batch processing applications. Batch processing suits applications that handle a large volume of data in a single run. For example, a payroll system can use batch processing to send out payments to employees on a fixed day of the month. Spring Batch does not include a built-in scheduler; it is typically paired with a scheduler such as Quartz or Control-M to process data at a scheduled time.

    In this tutorial, we will be developing a Spring Boot application that reads data from a CSV file and stores it in an SQL database (H2 database).


    Prerequisites

    1. Java Development Kit (JDK) installed on your computer.
    2. Some knowledge of Spring Boot.

    Application setup

    • In your browser, navigate to Spring Initializr.
    • Set the project name to springbatch.
    • Add Lombok, Spring Web, H2 Database, Spring Data JPA, and Spring Batch as the project dependencies.
    • Click Generate to download the generated project zip file.
    • Decompress the downloaded file and open it in your preferred IDE.

    Data layer

    • Create a new package named domain in the root project package.
    • In the domain package created above, create a file named Customer and add the code below.
    @Entity(name = "person")
    @Getter // Lombok annotation to generate Getters for the fields
    @Setter // Lombok annotation to generate Setters for the fields
    @AllArgsConstructor // Lombok annotation to generate a constructor with all of the fields in the class
    @NoArgsConstructor // Lombok annotation to generate an empty constructor for the class
    @EntityListeners(AuditingEntityListener.class)
    public class Customer {
        @Id // Sets the id field as the primary key in the database table
        @Column(name = "id") // sets the column name for the id property
        @GeneratedValue(strategy = GenerationType.AUTO) // States that the id field should be autogenerated
        private Long id;
    
        @Column(name = "last_name")
        private String lastName;
        @Column(name = "first_name")
        private String firstName;
    
        // Returns the first and last name so that a Customer is readable when it is logged
        @Override
        public String toString() {
            return "firstName: " + firstName + ", lastName: " + lastName;
        }
    }
    

    The class above has an id field that serves as the primary key in the database, plus lastName and firstName fields that are populated from the data.csv file.
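
    Lombok generates the constructors and accessors at compile time, so they never appear in the source file. The sketch below is a hand-written approximation of what the annotations on Customer produce, with the JPA annotations left out. The class name CustomerSketch and the sample values are assumptions made for illustration. Note that the all-arguments constructor takes its parameters in field declaration order: id, lastName, firstName.

```java
// Plain-Java approximation of the Customer class without Lombok or JPA.
// CustomerSketch is a hypothetical name used only for this illustration.
class CustomerSketch {
    private Long id;
    private String lastName;
    private String firstName;

    // Roughly what @NoArgsConstructor generates
    public CustomerSketch() {
    }

    // Roughly what @AllArgsConstructor generates: parameters follow
    // the order in which the fields are declared in the class
    public CustomerSketch(Long id, String lastName, String firstName) {
        this.id = id;
        this.lastName = lastName;
        this.firstName = firstName;
    }

    // Roughly what @Getter and @Setter generate (one pair per field)
    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    @Override
    public String toString() {
        return "firstName: " + firstName + ", lastName: " + lastName;
    }

    public static void main(String[] args) {
        // The id is left null here, as it would be before the database assigns one
        CustomerSketch customer = new CustomerSketch(null, "Doe", "John");
        System.out.println(customer); // prints: firstName: John, lastName: Doe
    }
}
```

    Because lastName is declared before firstName, swapping the two string arguments compiles without complaint but stores the names in the wrong columns.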

    Repository layer

    • Create a new package named repositories in the root project package.
    • In the repositories package created above, create an interface named CustomerRepository and add the code below.
    // The interface extends JpaRepository that has the CRUD operation methods
    public interface CustomerRepository extends JpaRepository<Customer, Long> {
    }
    

    Processor

    • Create a new package named processor in the root project package.
    • In the processor package, create a new Java file named CustomerProcessor then add the code below.
    public class CustomerProcessor implements ItemProcessor<Customer, Customer> {
        // Creates a logger
        private static final Logger logger = LoggerFactory.getLogger(CustomerProcessor.class);
        // Transforms each customer read from the file before it is written to the database.
        @Override
        public Customer process(final Customer customer) throws Exception {
            final String firstName = customer.getFirstName().toUpperCase();
            final String lastName = customer.getLastName().toUpperCase();
            // Creates a new instance of Customer. The Lombok constructor takes its arguments in
            // field order (id, lastName, firstName); the id is null so that the database generates it.
            final Customer transformedCustomer = new Customer(null, lastName, firstName);
            // Logs the original and the transformed customer to the application logs
            logger.info("Converting ({}) into ({})", customer, transformedCustomer);
            return transformedCustomer;
        }
        }
    }
    

    The class above transforms data from one form to another. The ItemProcessor<I, O> takes in the input data (I), transforms it, then returns the result as the output data (O).

    In our case, Customer is both the input and the output type, so the processor changes the field values (it upper-cases the names) but not the type of the data.
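
    To make the role of the two type parameters concrete, here is a minimal plain-Java sketch in which the input and output types differ. SketchProcessor is a stand-in interface with the same shape as ItemProcessor<I, O>, not the Spring Batch type itself, and NameRow and the sample values are hypothetical.

```java
import java.util.Locale;

// Stand-in with the same shape as Spring Batch's ItemProcessor<I, O>.
// It is NOT the library interface; it only keeps this sketch free of dependencies.
interface SketchProcessor<I, O> {
    O process(I item) throws Exception;
}

class ProcessorSketch {
    // Hypothetical input type: one raw row holding two name columns
    static final class NameRow {
        final String firstName;
        final String lastName;

        NameRow(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Input (I) is NameRow and output (O) is String, so this processor
    // changes the type of the data as well as its content
    static final SketchProcessor<NameRow, String> TO_DISPLAY_NAME =
            row -> row.lastName.toUpperCase(Locale.ROOT) + ", " + row.firstName.toUpperCase(Locale.ROOT);

    public static void main(String[] args) throws Exception {
        System.out.println(TO_DISPLAY_NAME.process(new NameRow("John", "Doe"))); // prints: DOE, JOHN
    }
}
```

    When the output type differs from the input type, the writer of the step has to accept the output type rather than the type produced by the reader.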

    Configuration layer

    • Create a new package named config in the root project package. This package will contain all of our configurations.
    • In the config package, create a new Java file named BatchConfiguration and add the code below.
    @Configuration // Informs Spring that this class contains configurations
    @EnableBatchProcessing // Enables batch processing for the application
    public class BatchConfiguration {
    
        @Autowired
        public JobBuilderFactory jobBuilderFactory;
    
        @Autowired
        public StepBuilderFactory stepBuilderFactory;
    
        @Autowired
        @Lazy
        public CustomerRepository customerRepository;
    
        // Reads the data.csv file and creates an instance of the Customer entity for each row in the file.
        @Bean
        public FlatFileItemReader<Customer> reader() {
            return new FlatFileItemReaderBuilder<Customer>()
                    .name("customerReader")
                    .resource(new ClassPathResource("data.csv"))
                    .delimited()
                    .names(new String[]{"firstName", "lastName"})
                    .fieldSetMapper(new BeanWrapperFieldSetMapper<>() {{
                        setTargetType(Customer.class);
                    }})
                    .build();
        }
    
        // Creates the Writer, configuring the repository and the method that will be used to save the data into the database
        @Bean
        public RepositoryItemWriter<Customer> writer() {
            RepositoryItemWriter<Customer> iwriter = new RepositoryItemWriter<>();
            iwriter.setRepository(customerRepository);
            iwriter.setMethodName("save");
            return iwriter;
        }
    
        // Creates an instance of CustomerProcessor, which transforms each item. In our case the type (Customer) is maintained.
        @Bean
        public CustomerProcessor processor() {
            return new CustomerProcessor();
        }
    
        // Batch jobs are built from steps. A step contains the reader, processor and the writer.
        @Bean
        public Step step1(ItemReader<Customer> itemReader, ItemWriter<Customer> itemWriter)
        throws Exception {
    
            return this.stepBuilderFactory.get("step1")
            .<Customer, Customer>chunk(5)
            .reader(itemReader)
            .processor(processor())
            .writer(itemWriter)
            .build();
        }
    
        // Executes the job, saving the data from .csv file into the database.
        @Bean
        public Job customerUpdateJob(JobCompletionNotificationListener listener, Step step1)
        throws Exception {
    
            return this.jobBuilderFactory.get("customerUpdateJob").incrementer(new RunIdIncrementer())
            .listener(listener).start(step1).build();
        }
    }
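
    The chunk(5) call in step1 sets the chunk size: items are read until five have been collected, each of them is processed, and the whole group is handed to the writer at once. The plain-Java sketch below imitates that loop so the grouping is visible. It is a simplification under stated assumptions: ChunkSketch, runStep, and the sample rows are hypothetical, and the sketch leaves out the transaction handling, job metadata, and restart support that a real Spring Batch step provides.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;
import java.util.function.Function;

// Simplified imitation of a chunk-oriented step; not Spring Batch code.
class ChunkSketch {
    // Reads up to chunkSize items, processes them, writes them as one group,
    // and repeats until the reader is exhausted. Returns the number of groups written.
    static <I, O> int runStep(Iterator<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int chunkSize) {
        int chunksWritten = 0;
        while (reader.hasNext()) {
            // Read phase: collect at most chunkSize items
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: transform every item in the chunk
            List<O> processed = new ArrayList<>();
            for (I item : items) {
                processed.add(processor.apply(item));
            }
            // Write phase: the writer receives the whole chunk in one call
            writer.accept(processed);
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        // Hypothetical rows standing in for the lines of data.csv
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= 12; i++) {
            lines.add("first" + i + ",last" + i);
        }
        Function<String, String> processor = line -> line.toUpperCase(Locale.ROOT);
        Consumer<List<String>> writer = chunk -> System.out.println("Writing " + chunk.size() + " items: " + chunk);

        int chunks = runStep(lines.iterator(), processor, writer, 5);
        System.out.println("Chunks written: " + chunks); // prints: Chunks written: 3
    }
}
```

    With 12 rows and a chunk size of 5, the writer is called three times with 5, 5, and 2 items. A larger chunk size means fewer write calls, at the cost of holding more items in memory per chunk.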
    
    • In the config package, create another Java class named JobCompletionNotificationListener and add the code below.
    @Component
    public class JobCompletionNotificationListener extends JobExecutionListenerSupport {
        // Creates an instance of the logger
        private static final Logger log = LoggerFactory.getLogger(JobCompletionNotificationListener.class);
        private final CustomerRepository customerRepository;
    
        @Autowired
        public JobCompletionNotificationListener(CustomerRepository customerRepository) {
            this.customerRepository = customerRepository;
        }
    
        // The callback method from the Spring Batch JobExecutionListenerSupport class that is executed when the batch process is completed
        @Override
        public void afterJob(JobExecution jobExecution) {
            // When the batch process is completed, the customers in the database are retrieved and written to the application logs
            if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
                log.info("!!! JOB COMPLETED! verify the results");
                customerRepository.findAll()
                        .forEach(customer -> log.info("Found ({}) in the database.", customer));
            }
        }
    }
    

    Controller layer

    • Create a new package named controllers in the root project package.
    • In the controllers package created above, create a Java class named BatchController and add the code snippet below.
    @RestController
    @RequestMapping(path = "/batch") // Root path
    public class BatchController {
        @Autowired
        private JobLauncher jobLauncher;
        @Autowired
        private Job job;
    
        // Accepts a GET request, launches the batch job, and returns a String describing the outcome.
        @GetMapping(path = "/start") // Start batch process path
        public ResponseEntity<String> startBatch() {
            // The timestamp parameter makes every launch a new job instance, so the job can be run repeatedly
            JobParameters parameters = new JobParametersBuilder()
                    .addLong("startAt", System.currentTimeMillis()).toJobParameters();
            try {
                jobLauncher.run(job, parameters);
            } catch (JobExecutionAlreadyRunningException | JobRestartException
                    | JobInstanceAlreadyCompleteException | JobParametersInvalidException e) {
                e.printStackTrace();
                // Reports the failure instead of claiming that the batch process started
                return new ResponseEntity<>("Batch Process failed to start: " + e.getMessage(),
                        HttpStatus.INTERNAL_SERVER_ERROR);
            }
            return new ResponseEntity<>("Batch Process started!!", HttpStatus.OK);
        }
    }
    

    Application configuration

    In the resources directory, add the properties below to the application.properties file.

    # Sets the server port from where we can access our application
    server.port=8080
    # Disables our batch process from automatically running on application startup
    spring.batch.job.enabled=false
    

    Testing

    Open Postman and send a GET request to http://localhost:8080/batch/start to start the batch process.

    Postman GET request
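
    If Postman is not available, the same request can be sent from a small Java program. The sketch below uses the java.net.http client that ships with JDK 11 and later; the class name StartBatchClient and the helper method startRequest are hypothetical, and the program assumes the application is running locally on port 8080 as configured above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical command-line client for the /batch/start endpoint.
class StartBatchClient {
    // Builds the GET request for the endpoint exposed by BatchController
    static HttpRequest startRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/batch/start"))
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        try {
            HttpResponse<String> response = client.send(
                    startRequest("http://localhost:8080"),
                    HttpResponse.BodyHandlers.ofString());
            // Prints the HTTP status code followed by the message returned by the controller
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            // Typically means the Spring Boot application is not running
            System.out.println("Could not reach the application: " + e);
        }
    }
}
```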

    After sending the GET request, we can see from the application logs that the batch process is running.

    Batch process running

    Conclusion

    Now that you have learned how to execute batch processes, try extending the application we have developed to use Spring Boot's scheduling support, so that jobs run automatically at a given time instead of being started by an HTTP call.

    You can download the complete source code here.

    Happy coding!


    Peer Review Contributions by: Odhiambo Paul

    Published on: Jul 7, 2021
    Updated on: Jul 15, 2024