Autonomically Scalable Services in WAN

By | March 23, 2012

So I’ve got only three days left to research the area and, fortunately, a big headache as well. Nothing like a pill and a coffee, and we’re good to go! At least I can blame any inconsistency in what I write on my physical state 😛

After reading “Resource-Aware Migratory Services in Wide-Area Shared Computing Environments”, I took a look at the paper I mentioned previously: “Building Autonomically Scalable Services on Wide-Area Shared Computing Platforms”. The papers are from the same authors, and this one basically covers the issues I mentioned in the previous post, such as the lack of proper policies and metrics.

The authors claim this paper’s contributions are:

  • Models for estimating service capacity likely to be provided by a replica in the near future.
  • Models for dynamic control of the degree of service replication in order to provision the required aggregate service capacity based on the estimated service capacities of the replicas.
  • A reliable registry service for clients to locate service replicas.
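
To make the first two points a bit more concrete, here is a minimal sketch of the general idea. It is not the paper's actual models: the exponentially weighted moving average, the greedy selection, and the names `estimate_capacity` and `replicas_needed` are all my own assumptions, picked only for illustration.

```python
def estimate_capacity(samples, alpha=0.5):
    """Predict near-future capacity (requests/s) from past observations.

    Assumption: an exponentially weighted moving average, where `alpha`
    weights the most recent sample. The paper's model may differ.
    """
    if not samples:
        raise ValueError("need at least one capacity sample")
    estimate = float(samples[0])
    for sample in samples[1:]:
        estimate = alpha * sample + (1 - alpha) * estimate
    return estimate


def replicas_needed(required_capacity, candidate_estimates):
    """Pick replicas until their estimated capacities cover the demand.

    Assumption: a greedy choice, highest estimated capacity first.
    Returns the chosen replica names and the aggregate estimated capacity.
    """
    chosen = []
    total = 0.0
    ranked = sorted(candidate_estimates.items(),
                    key=lambda item: item[1], reverse=True)
    for name, capacity in ranked:
        if total >= required_capacity:
            break
        chosen.append(name)
        total += capacity
    return chosen, total


if __name__ == "__main__":
    # Hypothetical node names and capacity samples.
    history = {"node-a": [90, 110], "node-b": [180, 220], "node-c": [50, 50]}
    estimates = {n: estimate_capacity(s) for n, s in history.items()}
    print(replicas_needed(250, estimates))
```

The degree of replication then simply follows from how many replicas it takes for the sum of the estimates to reach the required aggregate capacity.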

The results show that this system can predict capacity with 95% accuracy, which doesn’t seem bad, depending on the threshold at which the service is replicated.

Overall, there were some topics that I found interesting:

  • The load balancing is done at two levels: registry level and node level. At the node level, load balancing can be achieved by either redirecting requests or redirecting clients. Also, the replicas send tokens to each other in order to estimate how many requests they can redirect to each replica.
  • They kept the deployment agents. These agents keep soft state about the replicas and are responsible for dynamically controlling the degree of replication: they create and launch replicas and detect their failures. Interestingly, they also create a recovery agent that monitors the deployment agent and revives it if needed. If shutdowns in PlanetLab or in WANs are somewhat frequent, then the probability of both going down within a short period shouldn’t be that small, which could leave the service vulnerable.
  • They seem to be able to read my mind (before I even think about it), and so they decided to allow reads on the replicas to get better load balancing. The way this replication is done is still not very explicit; maybe I have to go back to the paper that describes Ajanta.
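
As a rough illustration of the token idea from the first point above, here is a small sketch under my own assumptions, not the paper's protocol: each replica splits its spare capacity evenly among its peers and hands it out as tokens, and an overloaded replica redirects a request only to a peer it still holds tokens for. The `Replica` class and its methods are hypothetical.

```python
class Replica:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # requests it can serve per interval (assumed unit)
        self.load = 0             # requests accepted in the current interval
        self.tokens = {}          # peer name -> requests that peer offered to accept

    def spare(self):
        """Capacity left in the current interval."""
        return max(self.capacity - self.load, 0)

    def grant_tokens(self, peers):
        """Advertise spare capacity by giving each peer an equal share of tokens."""
        if not peers:
            return
        share = self.spare() // len(peers)
        for peer in peers:
            peer.tokens[self.name] = share

    def handle(self, replicas_by_name):
        """Serve locally if possible, else redirect to a peer we hold tokens for.

        Returns the name of the replica that took the request,
        or None if nobody can take it.
        """
        if self.load < self.capacity:
            self.load += 1
            return self.name
        for peer_name, count in self.tokens.items():
            if count > 0:
                self.tokens[peer_name] -= 1          # spend one token
                replicas_by_name[peer_name].load += 1
                return peer_name
        return None


if __name__ == "__main__":
    a, b = Replica("a", 1), Replica("b", 3)
    registry = {"a": a, "b": b}
    b.grant_tokens([a])  # b tells a how many requests it can absorb
    print([a.handle(registry) for _ in range(5)])
```

Tokens here are only an estimate of what a peer can absorb; a real system would have to refresh them periodically as loads change.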

I know I had a few more things to say about it but today I’m the one with limited resources, both mental and physical.
