Camunda meets Cassandra

Jorn Horstman, André Hartmann and Lukas Niemeier from Zalando Tech visited us yesterday evening to present their prototype for running the Camunda engine on Apache Cassandra. They published their slides.

Zalando is a "multinational e-commerce company that specialises in selling shoes, clothing and other fashion and lifestyle products online". In 2014 they had a revenue of €2.3 billion, and they currently have 8,500 employees (source: Wikipedia). And: they are Camunda enterprise edition subscribers and use the Camunda process engine for processing their orders. Whenever you buy something in their online shop, a process instance is kicked off in the Camunda process engine.

Zalando's current Architecture

Zalando's system needs to scale horizontally. Currently, Zalando's order processing runs on a PostgreSQL database. They partition their order and process engine data over 8 shards. Each shard is an independent instance of PostgreSQL. Such an "instance" is a small cluster with replication for performance and failover.

At the application server level they run on Apache Tomcat and use the Spring Framework. For each shard, they have a datasource for which they create an instance of the Camunda process engine. This is replicated over 16 nodes.

When a new order comes in, it is assigned to one of the shards and an entry is made in a global mapping table, mapping orderIds to shards. Then the corresponding process instance is started in that shard.
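To make the shard assignment and the global mapping table more concrete, here is a minimal Java sketch. It is not Zalando's code: the class name ShardRouter, the round-robin assignment strategy and the in-memory map standing in for the mapping table are all assumptions made for illustration. The write-up does not say how a shard is chosen for a new order.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: assigns orders to shards and remembers each assignment. */
class ShardRouter {
    private final int shardCount;
    // In-memory stand-in for the global mapping table (orderId -> shard index).
    private final Map<String, Integer> mappingTable = new ConcurrentHashMap<>();
    private final AtomicInteger nextShard = new AtomicInteger();

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /**
     * Assigns a new order to a shard and records the assignment.
     * Round-robin is an assumption; an order that is already known keeps its shard.
     */
    int assign(String orderId) {
        return mappingTable.computeIfAbsent(orderId,
                id -> Math.floorMod(nextShard.getAndIncrement(), shardCount));
    }

    /** Looks up the shard that was recorded for a known order. */
    int resolve(String orderId) {
        Integer shard = mappingTable.get(orderId);
        if (shard == null) {
            throw new IllegalStateException("Unknown order: " + orderId);
        }
        return shard;
    }
}
```

In a setup like the one described, the returned shard index would select the datasource and the process engine instance for that shard, for example via a list of eight engines created at startup.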
When messages come in, they are first correlated to an order. Once the order is resolved, they can deduce the responsible shard for the order and resolve the corresponding process engine.

They say that this works quite well but has some drawbacks: they need to implement the sharding themselves, the mapping table needs to be maintained, and queries must be run against all of the shards, with the results aggregated manually.

In the context of their "Hackweek", they looked into Apache Cassandra as an alternative.

The Cassandra Prototype