Flume is a highly available, highly reliable, distributed system, provided by Cloudera, for collecting, aggregating, and transmitting massive amounts of log data. Flume is based on a streaming architecture and is flexible and simple. Its main function is to read data from a server's local disk in real time and write that data to HDFS.

Figure: Simple architecture diagram

Figure: Detailed architecture diagram

Agent
An Agent is a JVM process that sends data from the source to the destination in the form of events. It is the basic unit of Flume data transmission. An Agent mainly consists of three parts: Source, Channel, and Sink.

Source
The Source is the component responsible for receiving data into the Flume Agent. The Source component can process log data of various types and formats, including avro, thrift, exec, jms, spooling directory, netcat, sequence generator, syslog, http, and legacy.

Channel
The Channel is a buffer between the Source and the Sink, so it allows the Source and the Sink to operate at different rates. The Channel is thread-safe and can handle several Source write operations and several Sink read operations simultaneously.

Flume comes with two channels: Memory Channel and File Channel. Memory Channel is suitable for scenarios where data loss is not a concern. If data loss matters, Memory Channel should not be used, because a process crash, machine downtime, or a restart will cause data loss. File Channel writes all events to disk, so no data is lost when the program is closed or the machine goes down.

Sink
The Sink continuously polls the Channel for events and removes them in batches. It writes these events in batches to a storage or indexing system, or sends them to another Flume Agent. Before removing a batch of data from the Channel, each Sink starts a transaction with the Channel. Once the batch of events has been successfully written to the storage system or to the next Flume Agent, the Sink commits the transaction with the Channel. Once the transaction is committed, the Channel deletes the events from its internal buffer. Sink destinations include hdfs, logger, avro, thrift, ipc, file, null, HBase, solr, and custom.

Event
The Event is the transmission unit, the basic unit of Flume data transmission. Data is sent from the source to the destination in the form of events.
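The transactional hand-off between Channel and Sink described here can be illustrated with a small model. The sketch below is not Flume's API: `MemoryChannelModel`, `put`, `take_batch`, and `deliver` are hypothetical names chosen for illustration, and the model is single-threaded and in-memory only. It shows just one idea: a batch of events leaves the buffer for good only if delivery succeeds, and otherwise it is restored in its original order.

```python
from collections import deque


class MemoryChannelModel:
    """Toy in-memory buffer between a source and a sink (hypothetical, not Flume's API)."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        # A source appends events to the tail of the buffer.
        self._events.append(event)

    def size(self):
        return len(self._events)

    def take_batch(self, batch_size, deliver):
        """Take up to batch_size events inside one simulated transaction.

        The events stay removed only if deliver(batch) returns normally.
        If deliver raises, the batch is put back at the head of the buffer.
        """
        batch = []
        while self._events and len(batch) < batch_size:
            batch.append(self._events.popleft())
        try:
            deliver(batch)
        except Exception:
            # Rollback: restore the batch in its original order.
            self._events.extendleft(reversed(batch))
            return False
        # Commit: nothing to do, the events are already out of the buffer.
        return True


def failing_sink(batch):
    # Stands in for a destination that is temporarily unavailable.
    raise IOError("destination unavailable")


channel = MemoryChannelModel()
for i in range(5):
    channel.put(f"event-{i}")

stored = []
rolled_back = channel.take_batch(3, failing_sink)   # delivery fails, batch restored
committed = channel.take_batch(3, stored.extend)    # delivery succeeds, batch removed
```

After the failed attempt the buffer still holds all five events; after the successful one, `stored` holds the first three and two remain in the buffer. Like Memory Channel, this model keeps everything in process memory, so a crash would lose the buffered events. A disk-backed buffer in the style of File Channel would be needed to survive a restart.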