apache - How do I measure the time it takes to import data from a CSV file into HBase?

Tags: apache hadoop mapreduce hdfs hbase

I import data from the logs.csv file into an HBase table with the command hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=',' -Dimporttsv.columns="HBASE_ROW_KEY,log" logs hdfs://ip:9000/tmp/logs.csv. At the end of the run I get the summary shown below, but it contains no information about how long it took to load the data into HBase. Do you know how to check that?

    2018-10-06 23:09:17,647 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task: Final Counters for attempt_local1534176268_0001_m_000001_0: Counters: 21
    File System Counters
        FILE: Number of bytes read=37162012
        FILE: Number of bytes written=37835107
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=162892986
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Map-Reduce Framework
        Map input records=175896
        Map output records=175896
        Input split bytes=106
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=18
        Total committed heap usage (bytes)=2075918336
    ImportTsv
        Bad Lines=0
    File Input Format Counters 
        Bytes Read=28671162
    File Output Format Counters 
        Bytes Written=0
2018-10-06 23:09:17,647 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner: Finishing task: attempt_local1534176268_0001_m_000001_0
2018-10-06 23:09:17,647 INFO  [Thread-37] mapred.LocalJobRunner: map task executor complete.
2018-10-06 23:09:18,191 INFO  [main] mapreduce.Job: Job job_local1534176268_0001 completed successfully
2018-10-06 23:09:18,220 INFO  [main] mapreduce.Job: Counters: 21
    File System Counters
        FILE: Number of bytes read=74323793
        FILE: Number of bytes written=75670214
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=297114810
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=7
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Map-Reduce Framework
        Map input records=1000000
        Map output records=1000000
        Input split bytes=212
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=55
        Total committed heap usage (bytes)=4151836672
    ImportTsv
        Bad Lines=0
    File Input Format Counters 
        Bytes Read=162892986
    File Output Format Counters 
        Bytes Written=0
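
For reference, the whole import can also be timed from the shell by wrapping the command above with the time builtin (a minimal sketch; the table name, column mapping and HDFS path are copied from the question as-is):

    # Prints real/user/sys time for the entire import once the job finishes.
    time hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
        -Dimporttsv.separator=',' \
        -Dimporttsv.columns="HBASE_ROW_KEY,log" \
        logs hdfs://ip:9000/tmp/logs.csv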

yarn-site.xml:
<configuration>
   <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
   </property>
</configuration>

LOGS
2018-10-16 09:39:53,350 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-10-16 09:39:53,350 WARN org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl is interrupted. Exiting.
2018-10-16 09:39:53,351 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2018-10-16 09:39:53,352 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2018-10-16 09:39:53,353 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2018-10-16 09:39:53,353 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2018-10-16 09:39:53,354 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From myserver/myip to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:238)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:369)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:637)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
Caused by: java.net.ConnectException: Call From myserver/myip to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.GeneratedConstructorAccessor30.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
    at org.apache.hadoop.ipc.Client.call(Client.java:1435)
    at org.apache.hadoop.ipc.Client.call(Client.java:1345)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy73.registerNodeManager(Unknown Source)
    at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:73)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
    at com.sun.proxy.$Proxy74.registerNodeManager(Unknown Source)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:343)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:232)
    ... 6 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788)
    at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
    at org.apache.hadoop.ipc.Client.call(Client.java:1381)
    ... 22 more
2018-10-16 09:39:53,358 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at myserver/myip
************************************************************/
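
The failure above (Connection refused to 0.0.0.0:8031) usually means the NodeManager does not know where the ResourceManager is: yarn.resourcemanager.hostname is not set in the yarn-site.xml shown, so the default 0.0.0.0 is used for the resource-tracker address. A quick check that a ResourceManager is actually running and listening on its ports (an illustrative sketch, assuming a Linux host with jps and netstat available):

    # 8031 is the resource-tracker port the NodeManager registers on,
    # 8088 is the web UI mentioned in the answer below.
    jps | grep ResourceManager
    netstat -tlnp 2>/dev/null | grep -E ':(8031|8088)'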

Best answer

It is a map/reduce job, so you can see the execution time in the YARN UI. Its default port is 8088.
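
If the web UI is not convenient, the elapsed time can also be read from the command line once the job has finished (a minimal sketch; the application id is a placeholder, and this assumes the job really runs on YARN rather than the LocalJobRunner that appears in the output above):

    # List finished applications to find the id, then print its report;
    # the report includes Start-Time and Finish-Time as epoch milliseconds,
    # and their difference is the total execution time of the import.
    yarn application -list -appStates FINISHED
    yarn application -status application_1538859000000_0001   # placeholder id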

Regarding "apache - How do I measure the time it takes to import data from a CSV file into HBase?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52686696/
