2013/11/15

Devoxx 2013 - Batch Applications for the Java Platform (JSR 352)

Scott Kurz
JSR352 spec: Batch Applications for the Java Platform
  • Java EE 7
    • CDI: not required, but used in the ref. spec
  • Java 6 prerequisite
  • SE-friendly (minus transactional req.); but not part of SE...
    • can use other dependency injection frameworks.
implementations (under development!):
  • ref impl: developed exclusively within IBM; read-only for the public; not for production use
  • Spring Batch
  • Red Hat
  • Apache BatchEE project (TomEE) ???
sequential batch reasons
  • transactional (ACID)
  • efficiency gains
  • massive parallelization not needed
  • facilitate restarts
    • reasons:
      • invalid data
      • lock contention
      • batch window closed
    • restart:
      • within job: skip completed steps
      • within step: skip processed data-chunks
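The two restart levels above can be sketched in plain Java. This is a minimal illustration, not the JSR 352 API: the `RestartSketch` class, its in-memory `checkpoints` map, and the `failAt` parameter are all hypothetical stand-ins for what a real job repository and real input data would provide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RestartSketch {
    // Hypothetical in-memory stand-in for a persistent job repository.
    static final Map<String, Integer> checkpoints = new HashMap<>();
    static final Set<String> completedSteps = new HashSet<>();
    static final List<String> written = new ArrayList<>();

    /** Runs one step; failAt simulates invalid data at that item index (-1 = no failure). */
    static void runStep(String step, int totalItems, int chunkSize, int failAt) {
        if (completedSteps.contains(step)) {
            return; // restart within job: skip completed steps
        }
        // restart within step: resume after the last committed chunk
        int position = checkpoints.getOrDefault(step, 0);
        while (position < totalItems) {
            int end = Math.min(position + chunkSize, totalItems);
            List<String> chunk = new ArrayList<>();
            for (int i = position; i < end; i++) {
                if (i == failAt) {
                    // the current chunk is discarded, earlier chunks stay committed
                    throw new IllegalStateException("invalid data at item " + i);
                }
                chunk.add(step + "#" + i);
            }
            written.addAll(chunk);           // "commit" the chunk
            position = end;
            checkpoints.put(step, position); // record the checkpoint with the commit
        }
        completedSteps.add(step);
    }

    public static void main(String[] args) {
        try {
            runStep("load", 10, 3, 7);       // first execution fails in the third chunk
        } catch (IllegalStateException e) {
            System.out.println("first execution failed: " + e.getMessage());
        }
        runStep("load", 10, 3, -1);          // restart resumes at the checkpoint
        System.out.println(written.size() + " items written");
    }
}
```

The point of the sketch is that the checkpoint is stored together with the chunk commit, so a restart neither repeats nor loses committed items.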
key concepts (roles)
  • implementor: programming model
    • batchlet: free form
    • chunk: ETL pattern (Extract, Transform, Load)
  • orchestrator:
    • job specification: Job XML (JSL); batch.xml only maps artifact names to classes
      • step
      • batchlet
      • support for EL
    • can change depending on implementations!
    • flow control:
      • define exception handling (e.g. skip)
      • interpret and act on return codes
      • retry config (retry with/without rollback)
      • define parallel jobs or steps (aka partitioning) + support merging (config + programming model (Partition* classes))
  • executor: runtime environment
    • basic components
      • Job Repository
      • JobOperator
    • implementation-specific (not defined in the JSR):
      • clustering, security, ...
      • performance
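To make the orchestrator role concrete, here is a sketch of what a Job XML document with a chunk step, skip/retry configuration, a partition plan, and a batchlet step might look like. The job id, step ids, artifact names (`recordReader`, `recordProcessor`, `recordWriter`, `reportBatchlet`), the `inputFile` job parameter, and the chosen exception classes are all made up for illustration; only the element and attribute names come from the JSR 352 job specification language.

```xml
<job id="sampleJob" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0">
  <!-- chunk step: commit every 100 items, tolerate some bad records -->
  <step id="load" next="report">
    <chunk item-count="100" skip-limit="10" retry-limit="3">
      <reader ref="recordReader">
        <properties>
          <!-- EL substitution from a job parameter (hypothetical name) -->
          <property name="input" value="#{jobParameters['inputFile']}"/>
        </properties>
      </reader>
      <processor ref="recordProcessor"/>
      <writer ref="recordWriter"/>
      <skippable-exception-classes>
        <include class="java.lang.NumberFormatException"/>
      </skippable-exception-classes>
      <retryable-exception-classes>
        <include class="java.sql.SQLTransientException"/>
      </retryable-exception-classes>
    </chunk>
    <!-- run the step as 4 partitions -->
    <partition>
      <plan partitions="4"/>
    </partition>
  </step>
  <!-- free-form step -->
  <step id="report">
    <batchlet ref="reportBatchlet"/>
    <end on="COMPLETED"/>
  </step>
</job>
```

How the `ref` names resolve to classes (batch.xml, CDI bean names, or something vendor-specific) depends on the implementation in use.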
execution of Job
  • JobInstance
  • JobExecution
  • StepExecution - chunk-loop step: each chunk runs in one global transaction:
    • ItemReader
      • data access / deserialization of records
      • checkpoint function (positioning)
    • ItemProcessor:
      • core business logic
      • Context gotchas
        • JobContext / StepContext: not reliable; consider partitions as running on separate JVMs
        • transient data: careful on restart
        • -> design "context" as part of your batch
    • ItemWriter
      • similar to ItemReader, but for output
      • accepts "chunk" of output objects
      • no restrictions on data access
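The reader / processor / writer interplay above can be sketched as a simplified chunk loop. The three interfaces below are hypothetical, cut-down stand-ins and not the real `javax.batch.api.chunk` types (which also have open/close and checkpoint methods); transaction handling is only marked by comments. The sketch keeps two conventions from the spec: a reader returning `null` signals end of input, and a processor returning `null` filters the item out. Lambdas (Java 8) are used for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ChunkLoopSketch {
    // Hypothetical, simplified roles; names only mirror the notes above.
    interface Reader<I> { I readItem(); }                 // null = input exhausted
    interface Processor<I, O> { O processItem(I item); }  // null = filter the item out
    interface Writer<O> { void writeItems(List<O> chunk); }

    /** Runs the chunk loop and returns the number of chunks handed to the writer. */
    static <I, O> int runChunkStep(Reader<I> reader, Processor<I, O> processor,
                                   Writer<O> writer, int itemCount) {
        int chunks = 0;
        boolean exhausted = false;
        while (!exhausted) {
            // a real runtime would begin a transaction here
            List<O> buffer = new ArrayList<>();
            for (int n = 0; n < itemCount; n++) {
                I item = reader.readItem();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                O out = processor.processItem(item);
                if (out != null) {
                    buffer.add(out);
                }
            }
            if (!buffer.isEmpty()) {
                writer.writeItems(buffer); // the writer receives the whole chunk at once
                chunks++;
            }
            // a real runtime would checkpoint and commit here
        }
        return chunks;
    }

    public static void main(String[] args) {
        Iterator<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7).iterator();
        int chunks = ChunkLoopSketch.<Integer, Integer>runChunkStep(
                () -> input.hasNext() ? input.next() : null,
                i -> i == 4 ? null : i * 2,   // drop item 4, double the rest
                chunk -> System.out.println("write " + chunk),
                3);
        System.out.println(chunks + " chunks written");
    }
}
```

Because the writer only sees complete chunks, anything it needs across chunks has to be carried explicitly, which is the "design context as part of your batch" advice above.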
nice first step, looks a bit like EJB 2.x, but, hey, this is only 1.0...:
  • xml config with vendor-specific constructs
  • no truly "open" / "usable" reference implementation
  • no "old" and a lot of vendor-specific aspects.
  • no annotations
