Tag Archives: flume

Impact of reliability on the scalability of Flume

I will describe the mechanisms that provide fault tolerance in Flume, as well as their impact on scalability as the number of flows grows and bottlenecks appear. In this case, reliability is the ability to keep delivering events and logging them in HDFS in the face of failures, without losing data. This… Read More »

Speed-through installation of Flume-NG on Amazon AWS

This is a thunder-speed (:P) run-through of installing Flume NG (1.x) on Amazon AWS. I tried to make it as fast as possible. Note that for a final product, more attention should be paid to security. Tools used: puttygen, putty, scp, pssh, Cloudera Manager 3.7, flume-ng. Operating system: Ubuntu (on my machine), Suse on… Read More »

Flume-based Independent News-Aggregator

As the final project of our scalable distributed systems course, two friends and I decided to implement a system that would read RSS entries from multiple websites and provide access to them through a search engine, a web API, or a nice webpage. We found that Flume would do the trick since it allows us to… Read More »