Clustering

Abstract

Clustering is a Good Thing, because it allows scaling out without paying the steeply rising price of ever-larger server boxes.
On the other hand, no layer exists that transparently replicates a JVM across multiple hosts. This means clustering must be handled at the application level.

Replication

The problem is that Shaman is built from several high-level components (such as the Cocoon webapp or the Tomcat HTTP server) that share no uniform clustering scheme. Worse, the components that do have clustering features offer nothing for replicating "hot" data, whether that data is a program (such as a dynamic page for the Web front-end) or end-user data (such as a course to be published).

For example, Tomcat 4 supports session-aware load balancing when run behind Apache HTTPD, but it has no tool for dynamically replicating the served content. We understand and respect the reasons for that choice, because the replication scheme depends heavily on the application context.

So our decision is not to support replication features yet. Nevertheless, our design allows for a hypothetical "let's cluster it" scenario.
Keeping that perspective in mind helped us achieve a better design anyway.

Multiple VMs

The three subsystems (Spirit, Insight, Legend) are designed to run on different physical hosts. This is, in itself, a rudimentary form of load balancing.

The "big picture" diagram shows the subsystems running in different nodes, which are JVMs running on (potentially) different physical hosts.
It is also possible to run all the subsystems in the same virtual machine. This is especially useful for unit tests.
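
The single-VM arrangement described above could look roughly like the following sketch. Everything in it is hypothetical and chosen only for illustration (the Subsystem interface, the EchoSubsystem stand-in, the Node class); none of these types come from the Shaman code base. The only point it makes is that a node is a container for subsystems, so a unit test can deploy all three into one node.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Shaman itself.
public class SingleVmDeployment {

    /** Minimal contract a subsystem could expose to its peers (assumption). */
    interface Subsystem {
        String name();
        String handle(String request);
    }

    /** Stand-in subsystem that prefixes the request with its own name. */
    static final class EchoSubsystem implements Subsystem {
        private final String name;

        EchoSubsystem(String name) {
            this.name = name;
        }

        public String name() {
            return name;
        }

        public String handle(String request) {
            return name + ":" + request;
        }
    }

    /** A node models one JVM; it hosts one or more subsystems. */
    static final class Node {
        private final Map<String, Subsystem> hosted = new LinkedHashMap<>();

        void deploy(Subsystem subsystem) {
            hosted.put(subsystem.name(), subsystem);
        }

        Subsystem lookup(String name) {
            Subsystem subsystem = hosted.get(name);
            if (subsystem == null) {
                throw new IllegalArgumentException("not deployed on this node: " + name);
            }
            return subsystem;
        }

        int size() {
            return hosted.size();
        }
    }

    /** Deploys all three subsystems into a single node, as a unit test might. */
    static Node singleVm() {
        Node node = new Node();
        node.deploy(new EchoSubsystem("Spirit"));
        node.deploy(new EchoSubsystem("Insight"));
        node.deploy(new EchoSubsystem("Legend"));
        return node;
    }

    public static void main(String[] args) {
        Node node = singleVm();
        System.out.println(node.lookup("Legend").handle("ping"));
    }
}
```

In a multi-host deployment the same lookup would have to resolve to a remote proxy instead of a local object; that part is deliberately left out of the sketch.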

Perspectives

Legend

The Slide client API treats a URI as a namespace. This implicitly allows content to be spread over several WebDAV servers.
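
As a rough illustration of why URI-based namespaces leave room for several servers, the sketch below routes a resource URI to a server by its scheme and authority. It uses only java.net.URI and does not call the Slide client API; the WebdavNamespaceRouter class, the server labels, and the example.org host names are all assumptions made for the example.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: this is not the Slide client API.
public class WebdavNamespaceRouter {

    // Maps a namespace root (scheme://authority) to an illustrative server label.
    private final Map<String, String> servers = new HashMap<>();

    /** Reduces a URI to the namespace it belongs to. */
    static String namespaceOf(URI uri) {
        return uri.getScheme() + "://" + uri.getAuthority();
    }

    /** Registers a WebDAV server under the namespace of its root URI. */
    void register(String rootUri, String label) {
        servers.put(namespaceOf(URI.create(rootUri)), label);
    }

    /** Finds the server responsible for a given resource URI. */
    String serverFor(String resourceUri) {
        String label = servers.get(namespaceOf(URI.create(resourceUri)));
        if (label == null) {
            throw new IllegalArgumentException("no server registered for " + resourceUri);
        }
        return label;
    }

    public static void main(String[] args) {
        WebdavNamespaceRouter router = new WebdavNamespaceRouter();
        // Host names below are placeholders, not real Shaman hosts.
        router.register("http://dav1.example.org:8080/slide/", "legend-a");
        router.register("http://dav2.example.org:8080/slide/", "legend-b");
        System.out.println(router.serverFor("http://dav2.example.org:8080/slide/courses/intro.xml"));
    }
}
```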

Spirit

Cocoon 2 can, in principle, take advantage of Tomcat's load-balancing feature.
However, replication of content files must be handled manually. Providing a "distributed Cocoon" tool is outside the scope of the Shaman project.

Insight

The replication features that Insight could offer depend on the persistence mechanism used internally. Prevayler allows replication, but it would require much additional work.

Conclusion

We'll wait to see how Shaman behaves in a production context before (possibly) studying a replication solution.
At that point, a centralized, dedicated administration tool should be considered.