Impact of reliability on the scalability of Flume

I will describe the mechanisms that provide fault tolerance in Flume, as well as their impact on scalability as the number of flows grows and bottlenecks appear. Here, reliability is the ability to keep delivering events and logging them to HDFS in the face of failures, without losing data. This… Read More »

Dimemas architecture and Multi-level cache simulations

The Measurement and Tools course at UPC concluded with a report on the performance analysis of a NAS benchmark on a simulated supercomputer, plus tests of some parallel applications on different simulated cache architectures. The tools used were Intel’s Pin tool and Dimemas. From the first, we took… Read More »

Speed Through installation of Flume-NG on Amazon AWS

This is a thunder-speed (:P) run-through of installing Flume NG (1.x) on Amazon AWS. I tried to make it as fast as possible. Note that for a final product, more attention should be paid to security issues. Tools used: puttygen, putty, scp, pssh, Cloudera Manager 3.7, flume-ng. Operating system: Ubuntu (on my machine), Suse on… Read More »

Update on Offline Routing and Regenerator Placement and Dimensioning

I should have posted some results on this topic by now, since the deadline for the presentation has already passed. It seems we made a mistake while implementing the decoder function in the BRKGA Java code. The routing and placement seem to be handled correctly, but we are getting wrong values in the dimensioning. Still… Read More »

Flume-based Independent News-Aggregator

As the final project of our scalable distributed systems course, two friends and I decided to implement a system that would read RSS entries from multiple websites and provide access to them through a search engine, a web API, or a nice webpage. We found that Flume would do the trick since it allows us to… Read More »

Biased Random-Key Genetic Algorithm

I have been writing the report on the BRKGA meta-heuristic, so I will quote a short description I wrote of this genetic algorithm 🙂 We have chosen the Biased Random-Key Genetic Algorithm (BRKGA) meta-heuristic to solve this complex optimization problem effectively. Typically, genetic algorithms evolve a population by applying the principle of the survival… Read More »

High Availability of Services in Wide-Area Shared Computing Networks

I finally managed to put together the final report for the research that I have been doing. Unfortunately, we only got access to PlanetLab less than a week before the final delivery (after sending lots of emails to multiple entities), so we could only run minor experiments on it. As always, I am a bit… Read More »

Cloud scale identity fabric

This week I gave a presentation on the importance of identity management as a service. It highlights the potential of protocols such as OpenID, which is used, for example, by Google. It was based on Eric Olden’s article of the same name. It doesn’t rely only on the properties of OpenID, its… Read More »