hadoop - Flume memory channel is full at startup

Tags: hadoop memory flume channel

I'm having a problem with the Flume memory channel. I ran a Flume agent that flooded the memory channel, and the log started filling with "The channel is full, and cannot write data now. The source will try again after 250 milliseconds".

So far so good. I stopped the agent, edited flume.conf to increase the capacity, and tried again. The problem is that Flume floods the log with the same message right at startup:

```
16/05/14 00:21:48 INFO node.Application: Starting new configuration: { sourceRunners:{s1=EventDrivenSourceRunner: { source:Spool Directory source s1: { spoolDir: /home/vagrant/logs } }} sinkRunners:{kafka-avro-sink2=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@63203b59 counterGroup:{ name:null counters:{} } }, kafka-avro-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@591882e6 counterGroup:{ name:null counters:{} } }} channels:{mem1=org.apache.flume.channel.MemoryChannel{name: mem1}} }
16/05/14 00:21:48 INFO node.Application: Starting Channel mem1
16/05/14 00:21:48 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: mem1: Successfully registered new MBean.
16/05/14 00:21:48 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: mem1 started
16/05/14 00:21:48 INFO node.Application: Starting Sink kafka-avro-sink2
16/05/14 00:21:48 INFO node.Application: Starting Sink kafka-avro-sink1
16/05/14 00:21:48 INFO node.Application: Starting Source s1
16/05/14 00:21:48 INFO source.SpoolDirectorySource: SpoolDirectorySource source starting with directory: /home/vagrant/logs
16/05/14 00:21:48 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: s1: Successfully registered new MBean.
16/05/14 00:21:48 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: s1 started
16/05/14 00:21:49 WARN source.SpoolDirectorySource: The channel is full, and cannot write data now. The source will try again after 250 milliseconds
16/05/14 00:21:49 INFO avro.ReliableSpoolingFileEventReader: Last read was never committed - resetting mark position.
16/05/14 00:21:49 WARN source.SpoolDirectorySource: The channel is full, and cannot write data now. The source will try again after 500 milliseconds
```

So the channel is already full when I start the agent. How can I manually reset or clear it?

I have been searching for similar questions for hours without success. One somewhat annoying problem is that https://flume.apache.org/ contains so much text that almost everything Flume-related is indexed by Google. So, for example, searching for "flume channel always full" always returns https://flume.apache.org/ as the top result.

Best Answer

It looks like you have not allocated enough memory to your channel: your incoming event rate is higher than what the memory channel's available memory can absorb.

Try something like the example below. -Xmx is a sample JVM heap option; increase it gradually and find the best memory allocation based on how many agents are running in parallel.

For example:

```
flume-ng agent -n $agentnumber -c ../../config/conf/ -f ../../config/conf/youragentconf.conf -Xmx3g
```

You should also check the other configuration parameters in your agent configuration file; refer to the Memory Channel section of the Flume User Guide.
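As a sketch of what those channel parameters can look like in flume.conf (the agent name `a1` and all the numbers here are illustrative assumptions; `mem1` matches the channel name in the logs above, and you should tune the values to your own load):

```
# flume.conf fragment: sizing a memory channel for a higher event rate.
# Agent name (a1) and the numbers are assumptions, not recommendations.
a1.channels = mem1
a1.channels.mem1.type = memory
# Maximum number of events held in the channel at once
a1.channels.mem1.capacity = 100000
# Maximum events per transaction between a source/sink and the channel
a1.channels.mem1.transactionCapacity = 1000
# Optional byte-based limits on the channel's memory usage
a1.channels.mem1.byteCapacityBufferPercentage = 20
a1.channels.mem1.byteCapacity = 800000000
```

Raising `capacity` only helps if the JVM heap (-Xmx) is large enough to actually hold that many events; otherwise the agent hits heap pressure before the channel fills.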

Regarding "hadoop - Flume memory channel is full at startup", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37238284/
