Grid Persistence

Performance, scalability, and reliability are requirements for many 24/7 web applications. Designing, implementing, and maintaining such systems is still a challenge, and you will probably never start on a greenfield project but have to integrate existing technologies.

The Resoa persistence architecture was designed with grid persistence in mind from the beginning. Finding a solution for platform-independent database clustering across several distributed service container nodes was one of the central requirements of Resoa's early-stage design.

The Resoa MASTER / SLAVE concept

Within a Resoa grid, only one node can act as the persistence MASTER for a service domain. MASTER nodes are responsible for:

  • synchronized transaction executions, which require consistency checks
  • synchronization of nodes that join the grid during runtime (sketched below)
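
The following sketch illustrates the second responsibility: how a MASTER could bring a node that joins the grid at runtime up to date by replaying the domain execution log to it. The interfaces and method names are assumptions for illustration only, not part of the Resoa API.

    import java.util.List;

    public class NodeJoinSynchronizerSketch {

        // Assumed interfaces: the domain execution log and a handle to the joining node.
        interface ExecutionLog { List<Object> entriesSince(long lastKnownSequence); }
        interface JoiningNode { long lastKnownSequence(); void replay(List<Object> missedTasks); }

        private final ExecutionLog executionLog;

        public NodeJoinSynchronizerSketch(ExecutionLog executionLog) {
            this.executionLog = executionLog;
        }

        // Called on the domain MASTER when a new node announces itself to the grid.
        public void synchronize(JoiningNode joiningNode) {
            // Replay everything the joining node has not seen yet; the node must finish
            // processing this log before it starts accepting service execution requests.
            List<Object> missedTasks = executionLog.entriesSince(joiningNode.lastKnownSequence());
            joiningNode.replay(missedTasks);
        }
    }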

The SLAVE nodes' task profile:

  • Requesting sequential task execution from the domain MASTER. Processing of the task execution log must be finished before service execution requests are accepted.
  • Read requests are always performed against the local slave storage back-end (see the routing sketch after this list).
  • Write/delete transactions are passed to the domain MASTER for execution.
  • Monitoring the grid for domain MASTER availability. If the domain MASTER node crashes, a new master node must be assigned automatically.
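
As a rough illustration of the read/write split, the following sketch shows how a SLAVE node could route requests: reads are served from the local storage back-end, while write/delete transactions are handed to the domain MASTER. The class and interface names are placeholders, not actual Resoa types.

    public class SlaveRequestRouter {

        // Assumed interfaces: the slave's local storage back-end and a client for the domain MASTER.
        interface LocalStorage { Object load(String id); }
        interface MasterClient { void submitWriteTask(Object entity); }

        private final LocalStorage localStorage;
        private final MasterClient masterClient;

        public SlaveRequestRouter(LocalStorage localStorage, MasterClient masterClient) {
            this.localStorage = localStorage;
            this.masterClient = masterClient;
        }

        // Read requests are always served from the local slave storage back-end.
        public Object read(String id) {
            return localStorage.load(id);
        }

        // Write/delete transactions are passed to the domain MASTER for execution.
        public void write(Object entity) {
            masterClient.submitWriteTask(entity);
        }
    }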

The Resoa persistence Task Journal

Calling persistor.store(myObject) can cause many statements (we call them 'Instructions') against the persistence back-end, because resolving object relations may create additional instructions. In the case of RDBMS persistence, all relations to complex types must be resolved; in addition, potential n:m relation tables have to be updated.
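
As an illustration only, a single store call for an object that references a complex type and an n:m relation could expand into a set of instructions like the following; the entity, tables, and SQL are invented for this example and do not reflect Resoa's actual statement generation.

    import java.util.List;

    public class InstructionExpansionExample {

        public static void main(String[] args) {
            // Storing an order that references a customer (complex type) and a set of
            // tags (n:m relation) could produce a task with instructions like these:
            List<String> instructions = List.of(
                "INSERT INTO customer (id, name) VALUES (?, ?)",                 // resolved complex-type relation
                "INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)",  // the stored object itself
                "DELETE FROM order_tag WHERE order_id = ?",                      // refresh the n:m relation table
                "INSERT INTO order_tag (order_id, tag_id) VALUES (?, ?)"
            );
            // All of these instructions belong to one task and run in one transaction.
            instructions.forEach(System.out::println);
        }
    }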

Resoa runs its own persistence service model, defined in resoa.persistence.journal.xsd. Its central class is org.resoa.persistence.journal.Task. All instructions within a task instance are executed within a single transaction context.
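
The following JDBC-based sketch shows what "all instructions of a task run in one transaction context" can look like in practice. Task and Instruction are simplified stand-ins here, not the classes generated from resoa.persistence.journal.xsd.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    public class TaskExecutorSketch {

        // Simplified stand-ins for the task/instruction model.
        record Instruction(String sql, List<Object> parameters) {}
        record Task(String id, List<Instruction> instructions) {}

        public void execute(Connection connection, Task task) throws SQLException {
            boolean previousAutoCommit = connection.getAutoCommit();
            connection.setAutoCommit(false);                  // one transaction for the whole task
            try {
                for (Instruction instruction : task.instructions()) {
                    try (PreparedStatement statement = connection.prepareStatement(instruction.sql())) {
                        List<Object> parameters = instruction.parameters();
                        for (int i = 0; i < parameters.size(); i++) {
                            statement.setObject(i + 1, parameters.get(i));
                        }
                        statement.executeUpdate();
                    }
                }
                connection.commit();                          // the task succeeds as a whole ...
            } catch (SQLException e) {
                connection.rollback();                        // ... or not at all
                throw e;
            } finally {
                connection.setAutoCommit(previousAutoCommit);
            }
        }
    }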

A Task instance is created for every WRITE/DELETE storage request, on both MASTER and SLAVE nodes. SLAVE nodes store an encrypted JSON representation of every new task in a local, domain-specific folder (set by the property journalRootDir) before passing the instance to the domain MASTER node. The status of the persistence request then changes to STORAGE_COMMIT.
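
A minimal sketch of this SLAVE-side journaling step could look as follows; the file layout, naming, and forwarding interface are assumptions, and the encryption of the JSON representation is omitted for brevity.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class SlaveJournalSketch {

        // Assumed forwarding interface towards the domain MASTER.
        interface MasterForwarder { void forward(String domain, String taskId, String taskJson); }

        private final Path journalRootDir;
        private final MasterForwarder masterForwarder;

        public SlaveJournalSketch(Path journalRootDir, MasterForwarder masterForwarder) {
            this.journalRootDir = journalRootDir;
            this.masterForwarder = masterForwarder;
        }

        public void submit(String domain, String taskId, String taskJson) throws IOException {
            // 1. Persist the task below the domain-specific journal folder so it survives a local crash
            //    (the real implementation stores an encrypted representation).
            Path domainDir = journalRootDir.resolve(domain);
            Files.createDirectories(domainDir);
            Path journalFile = domainDir.resolve(taskId + ".task.json");
            Files.write(journalFile, taskJson.getBytes(StandardCharsets.UTF_8));

            // 2. Only after the journal file is on disk, pass the task on to the domain MASTER.
            masterForwarder.forward(domain, taskId, taskJson);
        }
    }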

MASTER nodes are responsible for task execution. After a successful storage transaction, the task status changes to MASTER_COMMIT. All domain SLAVE nodes are informed about the execution, and the task is added to the domain execution log.
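
The MASTER-side handling might be sketched like this; all types are simplified placeholders rather than Resoa classes, and error handling is reduced to marking the task as cancelled.

    public class MasterTaskHandlerSketch {

        enum TaskStatus { MASTER_COMMIT, CANCELLED }

        // Placeholder interfaces for the back-end storage, the domain execution log and slave notification.
        interface BackendStorage { void executeInTransaction(Object task) throws Exception; }
        interface ExecutionLog { void append(Object task); }
        interface SlaveNotifier { void broadcast(Object task, TaskStatus status); }

        private final BackendStorage backendStorage;
        private final ExecutionLog executionLog;
        private final SlaveNotifier slaveNotifier;

        public MasterTaskHandlerSketch(BackendStorage backendStorage,
                                       ExecutionLog executionLog,
                                       SlaveNotifier slaveNotifier) {
            this.backendStorage = backendStorage;
            this.executionLog = executionLog;
            this.slaveNotifier = slaveNotifier;
        }

        public void handle(Object task) {
            try {
                backendStorage.executeInTransaction(task);           // execute all instructions of the task
                executionLog.append(task);                           // record the successful execution
                slaveNotifier.broadcast(task, TaskStatus.MASTER_COMMIT);
            } catch (Exception e) {
                slaveNotifier.broadcast(task, TaskStatus.CANCELLED); // execution failed, inform the slaves
            }
        }
    }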

When receiving a task with status STORAGE_COMMIT or CANCELLED, SLAVE nodes delete the local journal file again. STORAGE_COMMIT tasks are synchronously processed into the back-end storage, and the task is added to the domain execution log. The handling of cancelled tasks depends on the persistence policy (see below).
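
A corresponding SLAVE-side sketch, again with placeholder interfaces rather than Resoa classes, could look like this: remove the local journal file, replay committed tasks into the local back-end and the execution log, and hand cancelled tasks to the configured policy.

    import java.nio.file.Files;
    import java.nio.file.Path;

    public class SlaveTaskResultHandlerSketch {

        // Placeholder interfaces for the local back-end, the execution log and the configured policy.
        interface LocalBackend { void apply(Object task) throws Exception; }
        interface ExecutionLog { void append(Object task); }
        interface PersistencePolicy { void onCancelled(Object task); }

        private final LocalBackend localBackend;
        private final ExecutionLog executionLog;
        private final PersistencePolicy persistencePolicy;

        public SlaveTaskResultHandlerSketch(LocalBackend localBackend,
                                            ExecutionLog executionLog,
                                            PersistencePolicy persistencePolicy) {
            this.localBackend = localBackend;
            this.executionLog = executionLog;
            this.persistencePolicy = persistencePolicy;
        }

        public void onTaskResult(Object task, Path journalFile, boolean cancelled) throws Exception {
            Files.deleteIfExists(journalFile);       // the local journal entry is no longer needed
            if (cancelled) {
                persistencePolicy.onCancelled(task); // the configured policy decides how to handle the failure
            } else {
                localBackend.apply(task);            // synchronously replay the task into the local back-end
                executionLog.append(task);           // keep the domain execution log in sync
            }
        }
    }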

When a MASTER node crashes, the SLAVE nodes assign a new domain master. The new master then processes all outstanding task files in its journal directory and informs the remaining slave nodes by sending the latest execution logs (default: 60 sec).
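
An illustrative failover sketch, under the assumption of a simple directory scan and a log distribution interface, might look like this:

    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class MasterFailoverSketch {

        // Placeholder interfaces for replaying journal files and distributing execution logs.
        interface TaskExecutor { void execute(Path journalFile) throws Exception; }
        interface LogDistributor { void sendLatestExecutionLogs(); }

        private final TaskExecutor taskExecutor;
        private final LogDistributor logDistributor;

        public MasterFailoverSketch(TaskExecutor taskExecutor, LogDistributor logDistributor) {
            this.taskExecutor = taskExecutor;
            this.logDistributor = logDistributor;
        }

        public void takeOver(Path domainJournalDir) throws Exception {
            // 1. Replay every outstanding task file left behind in the journal directory.
            try (DirectoryStream<Path> journalFiles =
                     Files.newDirectoryStream(domainJournalDir, "*.task.json")) {
                for (Path journalFile : journalFiles) {
                    taskExecutor.execute(journalFile);
                }
            }
            // 2. Inform the remaining slave nodes by sending them the latest execution logs.
            logDistributor.sendLatestExecutionLogs();
        }
    }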

The Resoa persistence Policies

Persistence in distributed MASTER / SLAVE systems is always a compromise between transaction speed and the accepted delay in SLAVE node consistency. Resoa currently offers two policies (a sketch contrasting them follows the list):

  • LogfilePersistPolicy: On SLAVE nodes, client storage requests succeed as soon as the task journal file has been written to disk. For MASTER nodes there is no difference to the MasterCommitPolicy.
  • MasterCommitPolicy: Client storage requests succeed only when the domain MASTER node has successfully performed the back-end storage transaction. SLAVE nodes might run into a timeout in this case, even though the MASTER might still set the transaction to STORAGE_COMMIT. Check the policy implementation source documentation for more information about timeout management.
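
The sketch below contrasts the two policies from the point of view of a SLAVE node that has to decide when to report success to the client. The interface, class names, and the timeout value are assumptions for illustration, not Resoa's actual implementation.

    import java.time.Duration;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;

    public interface PersistPolicySketch {

        // Blocks until the storage request may be reported as successful to the client.
        void awaitSuccess(CompletableFuture<Void> masterAcknowledgement) throws Exception;

        // LogfilePersistPolicy-like behaviour: the journal file written by the caller is enough.
        class LogfileFirst implements PersistPolicySketch {
            @Override
            public void awaitSuccess(CompletableFuture<Void> masterAcknowledgement) {
                // nothing to wait for: the task journal file is already on disk
            }
        }

        // MasterCommitPolicy-like behaviour: wait for the MASTER's back-end commit (may time out).
        class MasterCommitFirst implements PersistPolicySketch {
            private static final Duration TIMEOUT = Duration.ofSeconds(30); // assumed timeout value

            @Override
            public void awaitSuccess(CompletableFuture<Void> masterAcknowledgement) throws Exception {
                masterAcknowledgement.get(TIMEOUT.toMillis(), TimeUnit.MILLISECONDS);
            }
        }
    }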

Every node runs a TransactionLogger for persistence requests that run into an error or timeout. Check the org.resoa.persistence.TransactionLogger implementations for more information.
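
A minimal sketch of such a logger, which only mirrors the idea described above and may differ substantially from the real org.resoa.persistence.TransactionLogger implementations, could look like this:

    import java.time.Instant;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    public class TransactionLoggerSketch {

        public record FailedRequest(String taskId, String reason, Instant occurredAt) {}

        private final Queue<FailedRequest> failedRequests = new ConcurrentLinkedQueue<>();

        // Records a persistence request that ran into an error or timeout.
        public void logFailure(String taskId, String reason) {
            failedRequests.add(new FailedRequest(taskId, reason, Instant.now()));
        }

        // Allows operators or recovery jobs to inspect the recorded failures.
        public Iterable<FailedRequest> failures() {
            return failedRequests;
        }
    }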