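hadoop - Installing Flume on Ubuntu

Tags: hadoop hdfs flume

I want to install Apache Flume 1.4.0 on Ubuntu to move data and store it in HDFS, but I can't find an installation guide that shows how to install Flume properly. I have downloaded the binary zip. Can anyone help?

Best answer

Steps to install Apache Flume on Ubuntu: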

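Step 1 : Download the latest binary release of Flume (the apache-flume-<version>-bin.tar.gz archive). The commands below use 1.5.2; substitute the version you downloaded.

Step 2 : tar -xzvf apache-flume-1.5.2-bin.tar.gz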

Step 3 : sudo mv apache-flume-1.5.2-bin /usr/local/flume

Step 4 : nano ~/.bashrc

Step 5 : Add the following lines to the end of ~/.bashrc:

export FLUME_HOME=/usr/local/flume
export FLUME_CONF_DIR=$FLUME_HOME/conf
export FLUME_CLASSPATH=$FLUME_CONF_DIR
export PATH=$FLUME_HOME/bin:$PATH
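
Once ~/.bashrc has been saved and reloaded with source ~/.bashrc, a quick check along these lines can confirm that the variables took effect. This is only a minimal sketch: the function name check_flume_env is hypothetical and is not something Flume provides.

```shell
# check_flume_env is a hypothetical helper, not shipped with Flume.
# It succeeds only if FLUME_HOME is set, points at a directory,
# and $FLUME_HOME/bin appears as a whole entry in PATH.
check_flume_env() {
  if [ -z "${FLUME_HOME:-}" ]; then
    echo "FLUME_HOME is not set" >&2
    return 1
  fi
  if [ ! -d "$FLUME_HOME" ]; then
    echo "FLUME_HOME is not a directory: $FLUME_HOME" >&2
    return 1
  fi
  # Wrap PATH in colons so the pattern only matches complete entries.
  case ":$PATH:" in
    *":$FLUME_HOME/bin:"*)
      echo "ok: $FLUME_HOME/bin is on PATH"
      ;;
    *)
      echo "$FLUME_HOME/bin is not on PATH" >&2
      return 1
      ;;
  esac
}
```

For example, check_flume_env && flume-ng version should print the installed Flume version if the setup worked.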

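Step 6 : Create flume-env.sh from the template shipped in the conf directory:

cd /usr/local/flume
cp conf/flume-env.sh.template conf/flume-env.sh

Step 7 : Edit conf/flume-env.sh and set JAVA_HOME (adjust the path to match the JDK installed on your machine) and JAVA_OPTS:

nano conf/flume-env.sh

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0
export JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote"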
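
The JAVA_HOME path differs from machine to machine. If you are unsure where Java lives, one possible way to derive it is to resolve the java executable on your PATH, as sketched below. The helper name java_home_from_bin is hypothetical, and the sketch assumes GNU readlink (the default on Ubuntu) and the usual <home>/bin/java layout. With some Java 8 packages the result may end in /jre.

```shell
# java_home_from_bin is a hypothetical helper: given the path to a java
# executable, it resolves all symlinks and strips the trailing /bin/java.
java_home_from_bin() {
  real_path=$(readlink -f "$1") || return 1
  # .../<java-home>/bin/java -> .../<java-home>
  dirname "$(dirname "$real_path")"
}

# Typical use on Ubuntu, where /usr/bin/java is a chain of symlinks:
#   java_home_from_bin "$(command -v java)"
```

The printed directory is a candidate value for the JAVA_HOME line in conf/flume-env.sh.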
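
Step 8 : Create the agent configuration file conf/flume.conf from the template, then replace its contents with the configuration below:

cp conf/flume-conf.properties.template conf/flume.conf
nano conf/flume.conf

# Define a memory channel called memoryChannel on agent.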
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100

# Define a source on agent and connect to channel memoryChannel.
agent.sources.tail-source.type = exec
agent.sources.tail-source.command = tail -F /opt/hadoop-2.6.0/logs/hadoop-hadoop-datanode-node1.log
agent.sources.tail-source.channels = memoryChannel

# Define a sink that outputs to logger.
agent.sinks.log-sink.channel = memoryChannel
agent.sinks.log-sink.type = logger

agent.sinks.hdfs-sink.channel = memoryChannel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://node1:8020/flumedata/
agent.sinks.hdfs-sink.hdfs.fileType = DataStream

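# Activate channel, source and sinks.
# Both sinks read from the same channel, so each event is delivered to only one of them.
# To send every event to HDFS, list only hdfs-sink on the agent.sinks line.
agent.channels = memoryChannel
agent.sources = tail-source
agent.sinks = log-sink hdfs-sink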
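
A common mistake in a configuration like this is a source or sink that points at a channel name missing from the agent.channels line. The sketch below is one rough way to catch that before starting the agent. The function check_flume_channels is hypothetical (not a Flume tool), it assumes the agent name contains no regular-expression metacharacters, and it only looks at the simple key = value lines used above.

```shell
# check_flume_channels is a hypothetical helper, not part of Flume.
# Usage: check_flume_channels <agent-name> <properties-file>
# It reports every channel that a source or sink refers to but that is
# missing from the "<agent>.channels = ..." line.
check_flume_channels() {
  agent=$1
  file=$2
  # Channels declared on the "<agent>.channels" line, one per line.
  declared=$(sed -n "s/^[[:space:]]*$agent\.channels[[:space:]]*=[[:space:]]*//p" "$file" \
    | tr -s ' \t' '\n\n')
  # Channels referenced by "<agent>.sources.<name>.channels"
  # or "<agent>.sinks.<name>.channel".
  referenced=$(sed -n -E "s/^[[:space:]]*$agent\.(sources|sinks)\.[^.]+\.channels?[[:space:]]*=[[:space:]]*//p" "$file" \
    | tr -s ' \t' '\n\n' | sort -u)
  status=0
  for ch in $referenced; do
    if ! printf '%s\n' "$declared" | grep -qxF "$ch"; then
      echo "channel '$ch' is used but not declared in $agent.channels" >&2
      status=1
    fi
  done
  return $status
}
```

Run from /usr/local/flume, check_flume_channels agent conf/flume.conf returns success when every referenced channel is declared.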
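
Step 9 : Start the Flume-ng agent. Reload the shell configuration first, then run the agent from the Flume directory:

source ~/.bashrc
cd $FLUME_HOME
flume-ng agent -n agent -c conf -f conf/flume.conf -Dflume.root.logger=DEBUG,console

The agent runs in the foreground. To list all available commands and options, run:

flume-ng help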

Hope this helps.

This question about installing Flume on Ubuntu is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/20759634/
