Learn Apache Flume By Gopal Sir

Apache Flume - Introduction

What is Flume?

Apache Flume is a data ingestion tool and service for collecting, aggregating, and transporting large amounts of streaming data, such as log files and events, from various sources to a centralized data store.

Flume is a highly reliable, distributed, and configurable tool. It is principally designed to copy streaming data (log data) from various web servers to HDFS.

Advantages of Flume

Here are the advantages of using Flume:

· Using Apache Flume, we can store the data in any of the centralized stores (such as HBase and HDFS).
· When the rate of incoming data exceeds the rate at which data can be written to the destination, Flume acts as a mediator between data producers and the centralized stores and provides a steady flow of data between them.
· Flume provides the feature of contextual routing.
· The transactions in Flume are channel-based where two