java - Ignite 启动时间过长,并且在应用程序运行时经常失败

标签 java spring tcp distributed ignite

我使用 Spring Boot 应用程序创建了两个节点的集群。没有任何外部资源 - 只有内存中的 DB 和 Apache Ignite 作为分布式缓存。当一个节点已启动且第二个节点已 promise 启动时,我对两个节点进行了 tcp 转储和 JFR。

我在尝试启动的节点上看到了这个: JFR for second node which is in starting state

十秒仅获取字节。与此同时,该节点上的另一个 Ignite SPI 线程“tcp-disco-msg-worker”尝试从同一远程 IP 和端口读取数据,也花费了大约 10 秒。

JFR SPI long read

同时,已经启动的第一个节点尝试执行相同的操作 - 读取一个字节大约十秒 First node

因此,它们似乎试图同时读取彼此的消息并互相阻止 - 当此 SPI 线程尝试读取时,它无法写入答案。

但我在另一个具有两台服务器且应用程序及其配置完全相同的集群上没有看到这种行为 - 没有任何长读取。

有谁知道我的第一个集群出了什么问题以及为什么 Ignite 尝试在两个节点上同时读取有关拓扑的元信息?

主要问题是,我的第一个集群的一个节点上的 Ignite 经常失败,因为 SPI 线程决定集群处于 SEGMENTED 状态。 SPI 线程之所以会进入这种状态,是因为它在常规通信期间无法从其中一个节点获取元信息。

该集群似乎存在一些网络问题,但所有低级测试表明一切正常。

我知道他们说的是 SEGMENTED 状态 - 长 GC、网络问题等等。但我没有看到这样的事情。 SPI 线程中的套接字通信期间只是非常长的读取。而且它只在一个集群上。另一个具有相同配置的集群不会出现此类问题。

有人能给我建议我应该检查或测试什么吗?

UPD: 我的配置:

    @Bean
public IgniteCache<String, WaveFile> getCache() {
    log.info("Ignite addresses: " + Arrays.asList(igniteAddresses));
    IgniteConfiguration conf = new IgniteConfiguration();
    Slf4jLogger gridLog = new Slf4jLogger(log);
    conf.setGridLogger(gridLog);
    conf.setClientMode(false);
    conf.setNetworkTimeout(networkTimeout);
    conf.setClassLoader(WaveFile.class.getClassLoader());

    TcpDiscoverySpi tcpDiscoverySpi = new TcpDiscoverySpi();
    TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
    ipFinder.setAddresses(Arrays.asList(igniteAddresses));
    tcpDiscoverySpi.setIpFinder(ipFinder);
    tcpDiscoverySpi.setAckTimeout(failureDetectionTimeout);
    conf.setDiscoverySpi(tcpDiscoverySpi);
    conf.setSqlConnectorConfiguration(null);

    MemoryConfiguration memoryConfiguration = new MemoryConfiguration();
    MemoryPolicyConfiguration memoryPolicyConfiguration = new MemoryPolicyConfiguration();
    memoryPolicyConfiguration.setInitialSize(5L * 1024 * 1024 * 1024);
    memoryConfiguration.setMemoryPolicies(memoryPolicyConfiguration);
    conf.setMemoryConfiguration(memoryConfiguration);

    Ignite ignite = Ignition.start(conf);
    // Cache Config
    CacheConfiguration<String, WaveFile> cacheConfig = new CacheConfiguration<>();
    cacheConfig.setName(cacheName);
    cacheConfig.setCacheMode(CacheMode.PARTITIONED);
    cacheConfig.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
    cacheConfig.setCopyOnRead(false);
    cacheConfig.setEagerTtl(eagerTtl);
    cacheConfig.setExpiryPolicyFactory(FactoryBuilder.factoryOf(
            new TouchedExpiryPolicy(new Duration(TimeUnit.SECONDS, expirationPeriod))));
    return ignite.getOrCreateCache(cacheConfig);
}

日志。你可以看到有 4 个节点,但实际上我只在其中两个节点上进行了测试。

2017-10-27 14:01:53.084 [main] ERROR  - Failed to resolve default logging config file: config/java.util.logging.properties
2017-10-27 14:02:35.175 [main] INFO  com.voice.Application - Ignite addresses: [10.200.43.28, 10.200.43.32, 10.200.43.33, 10.200.43.34]
2017-10-27 14:02:35.369 [main] INFO  o.a.ignite.internal.IgniteKernal -

>>>    __________  ________________
>>>   /  _/ ___/ |/ /  _/_  __/ __/
>>>  _/ // (7 7    // /  / / / _/
>>> /___/\___/_/|_/___/ /_/ /___/
>>>
>>> ver. 2.1.0#20170720-sha1:a6ca5c8a
>>> 2017 Copyright(C) Apache Software Foundation
>>>
>>> Ignite documentation: http://ignite.apache.org

[14:02:35]    __________  ________________
[14:02:35]   /  _/ ___/ |/ /  _/_  __/ __/
[14:02:35]  _/ // (7 7    // /  / / / _/
[14:02:35] /___/\___/_/|_/___/ /_/ /___/
[14:02:35]
[14:02:35] ver. 2.1.0#20170720-sha1:a6ca5c8a
[14:02:35] 2017 Copyright(C) Apache Software Foundation
[14:02:35]
[14:02:35] Ignite documentation: http://ignite.apache.org
[14:02:35]
[14:02:35] Quiet mode.
[14:02:35]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[14:02:35]
2017-10-27 14:02:35.370 [main] INFO  o.a.ignite.internal.IgniteKernal - Config URL: n/a
2017-10-27 14:02:35.370 [main] INFO  o.a.ignite.internal.IgniteKernal - Daemon mode: off
[14:02:35] OS: Linux 3.10.0-327.el7.x86_64 amd64
2017-10-27 14:02:35.370 [main] INFO  o.a.ignite.internal.IgniteKernal - OS: Linux 3.10.0-327.el7.x86_64 amd64
2017-10-27 14:02:35.371 [main] INFO  o.a.ignite.internal.IgniteKernal - OS user: mopopov
2017-10-27 14:02:35.371 [main] INFO  o.a.ignite.internal.IgniteKernal - PID: 31516
[14:02:35] VM information: Java(TM) SE Runtime Environment 1.8.0_121-b13 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.121-b13
2017-10-27 14:02:35.372 [main] INFO  o.a.ignite.internal.IgniteKernal - Language runtime: Java Platform API Specification ver. 1.8
2017-10-27 14:02:35.372 [main] INFO  o.a.ignite.internal.IgniteKernal - VM information: Java(TM) SE Runtime Environment 1.8.0_121-b13 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.121-b13
2017-10-27 14:02:35.373 [main] INFO  o.a.ignite.internal.IgniteKernal - VM total memory: 10.0GB
2017-10-27 14:02:35.374 [main] INFO  o.a.ignite.internal.IgniteKernal - Remote Management [restart: off, REST: on, JMX (remote: off)]
2017-10-27 14:02:35.374 [main] INFO  o.a.ignite.internal.IgniteKernal - IGNITE_HOME=null
2017-10-27 14:02:35.374 [main] INFO  o.a.ignite.internal.IgniteKernal - VM arguments: [-Xmx12000m, -XX:+UnlockCommercialFeatures, -XX:+HeapDumpOnOutOfMemoryError, -Dspring.profiles.active=prod]
2017-10-27 14:02:35.374 [main] INFO  o.a.ignite.internal.IgniteKernal - System cache's MemoryPolicy size is configured to 40 MB. Use MemoryConfiguration.systemCacheMemorySize property to change the setting.
2017-10-27 14:02:35.388 [main] INFO  o.a.ignite.internal.IgniteKernal - Configured caches [in 'sysMemPlc' memoryPolicy: ['ignite-sys-cache']]
2017-10-27 14:02:35.395 [pub-#14%null%] WARN  o.a.ignite.internal.GridDiagnostic - Initial heap size is 250MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[14:02:35] Initial heap size is 250MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[14:02:35] Configured plugins:
2017-10-27 14:02:35.477 [main] INFO  o.a.i.i.p.p.IgnitePluginProcessor - Configured plugins:
[14:02:35]   ^-- None
2017-10-27 14:02:35.478 [main] INFO  o.a.i.i.p.p.IgnitePluginProcessor -   ^-- None
[14:02:35]
2017-10-27 14:02:35.478 [main] INFO  o.a.i.i.p.p.IgnitePluginProcessor -
2017-10-27 14:02:35.555 [main] INFO  o.a.i.s.c.tcp.TcpCommunicationSpi - Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
2017-10-27 14:02:35.560 [main] WARN  o.a.i.s.c.tcp.TcpCommunicationSpi - Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[14:02:35] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
2017-10-27 14:02:35.586 [main] WARN  o.a.i.s.c.noop.NoopCheckpointSpi - Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
2017-10-27 14:02:35.627 [main] WARN  o.a.i.i.m.c.GridCollisionManager - Collision resolution is disabled (all jobs will be activated upon arrival).
[14:02:35] Security status [authentication=off, tls/ssl=off]
2017-10-27 14:02:35.628 [main] INFO  o.a.ignite.internal.IgniteKernal - Security status [authentication=off, tls/ssl=off]
2017-10-27 14:02:36.072 [main] INFO  o.a.i.i.p.r.p.t.GridTcpRestProtocol - Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
2017-10-27 14:02:36.133 [main] INFO  o.a.ignite.internal.IgniteKernal - Non-loopback local IPs: 10.200.43.34, fe80:0:0:0:250:56ff:fe93:3f60%eno16777984
2017-10-27 14:02:36.133 [main] INFO  o.a.ignite.internal.IgniteKernal - Enabled local MACs: 005056933F60
2017-10-27 14:02:36.188 [main] INFO  o.a.i.s.d.tcp.TcpDiscoverySpi - Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, locNodeId=64941c34-2912-4934-8c92-c0c089391207]
2017-10-27 14:02:51.267 [main] WARN  o.a.i.s.d.tcp.TcpDiscoverySpi - Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
2017-10-27 14:03:46.397 [exchange-worker-#29%null%] INFO  o.a.ignite.internal.exchange.time - Started exchange init [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], crd=false, evt=10, node=TcpDiscoveryNode [id=64941c34-2912-4934-8c92-c0c089391207, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.34, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, 4212-bio-lanta-01/10.200.43.34:47500], discPort=47500, order=13, intOrder=8, lastExchangeTime=1509102156183, loc=true, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], evtNode=TcpDiscoveryNode [id=64941c34-2912-4934-8c92-c0c089391207, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.34, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, 4212-bio-lanta-01/10.200.43.34:47500], discPort=47500, order=13, intOrder=8, lastExchangeTime=1509102156183, loc=true, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], customEvt=null]
2017-10-27 14:03:47.022 [exchange-worker-#29%null%] INFO  o.a.i.i.p.cache.GridCacheProcessor - Started cache [name=ignite-sys-cache, memoryPolicyName=sysMemPlc, mode=REPLICATED, atomicity=TRANSACTIONAL]
2017-10-27 14:03:47.062 [exchange-worker-#29%null%] INFO  o.a.i.i.p.cache.GridCacheProcessor - Started cache [name=SHARED_AUDIO_CACHE, memoryPolicyName=default, mode=PARTITIONED, atomicity=ATOMIC]
2017-10-27 14:03:57.115 [grid-nio-worker-tcp-comm-0-#17%null%] INFO  o.a.i.s.c.tcp.TcpCommunicationSpi - Established outgoing communication connection [locAddr=/10.200.43.34:53456, rmtAddr=/10.200.43.28:47100]
2017-10-27 14:03:57.172 [exchange-worker-#29%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture - Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], waitTime=0ms]
2017-10-27 14:03:57.252 [exchange-worker-#29%null%] INFO  o.a.ignite.internal.exchange.time - Finished exchange init [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], crd=false]
2017-10-27 14:03:57.293 [sys-#32%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture - Snapshot initialization completed [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], time=0ms]
2017-10-27 14:03:57.344 [exchange-worker-#29%null%] INFO  o.a.i.i.p.c.GridCachePartitionExchangeManager - Rebalancing scheduled [order=[ignite-sys-cache, SHARED_AUDIO_CACHE]]
2017-10-27 14:03:57.344 [exchange-worker-#29%null%] INFO  o.a.i.i.p.c.GridCachePartitionExchangeManager - Rebalancing started [top=AffinityTopologyVersion [topVer=13, minorTopVer=0], evt=NODE_JOINED, node=64941c34-2912-4934-8c92-c0c089391207]
2017-10-27 14:03:57.345 [exchange-worker-#29%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionDemander - Starting rebalancing [mode=SYNC, fromNode=110e4e93-2954-4b76-8649-a23f4a83990a, partitionsCount=100, topology=AffinityTopologyVersion [topVer=13, minorTopVer=0], updateSeq=1]
2017-10-27 14:03:57.391 [utility-#34%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionDemander - Completed (final) rebalancing [fromNode=110e4e93-2954-4b76-8649-a23f4a83990a, topology=AffinityTopologyVersion [topVer=13, minorTopVer=0], time=51 ms]
2017-10-27 14:03:57.395 [utility-#34%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionDemander - Starting rebalancing [mode=ASYNC, fromNode=110e4e93-2954-4b76-8649-a23f4a83990a, partitionsCount=518, topology=AffinityTopologyVersion [topVer=13, minorTopVer=0], updateSeq=1]
2017-10-27 14:03:57.449 [sys-#36%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionDemander - Completed (final) rebalancing [fromNode=110e4e93-2954-4b76-8649-a23f4a83990a, topology=AffinityTopologyVersion [topVer=13, minorTopVer=0], time=51 ms]
[14:03:57] Performance suggestions for grid  (fix if possible)
2017-10-27 14:03:57.463 [main] INFO  o.a.ignite.internal.IgniteKernal - Performance suggestions for grid  (fix if possible)
[14:03:57] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
2017-10-27 14:03:57.463 [main] INFO  o.a.ignite.internal.IgniteKernal - To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[14:03:57]   ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM options)
2017-10-27 14:03:57.463 [main] INFO  o.a.ignite.internal.IgniteKernal -   ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM options)
[14:03:57]   ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
2017-10-27 14:03:57.464 [main] INFO  o.a.ignite.internal.IgniteKernal -   ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
[14:03:57]   ^-- Disable processing of calls to System.gc() (add '-XX:+DisableExplicitGC' to JVM options)
2017-10-27 14:03:57.464 [main] INFO  o.a.ignite.internal.IgniteKernal -   ^-- Disable processing of calls to System.gc() (add '-XX:+DisableExplicitGC' to JVM options)
[14:03:57]   ^-- Speed up flushing of dirty pages by OS (alter vm.dirty_expire_centisecs parameter by setting to 500)
2017-10-27 14:03:57.464 [main] INFO  o.a.ignite.internal.IgniteKernal -   ^-- Speed up flushing of dirty pages by OS (alter vm.dirty_expire_centisecs parameter by setting to 500)
[14:03:57]   ^-- Reduce pages swapping ratio (set vm.swappiness=10)
2017-10-27 14:03:57.464 [main] INFO  o.a.ignite.internal.IgniteKernal -   ^-- Reduce pages swapping ratio (set vm.swappiness=10)
[14:03:57]   ^-- Disable fully synchronous writes (set 'writeSynchronizationMode' to PRIMARY_SYNC or FULL_ASYNC)
2017-10-27 14:03:57.464 [main] INFO  o.a.ignite.internal.IgniteKernal -   ^-- Disable fully synchronous writes (set 'writeSynchronizationMode' to PRIMARY_SYNC or FULL_ASYNC)
[14:03:57] Refer to this page for more performance suggestions: https://apacheignite.readme.io/docs/jvm-and-system-tuning
2017-10-27 14:03:57.464 [main] INFO  o.a.ignite.internal.IgniteKernal - Refer to this page for more performance suggestions: https://apacheignite.readme.io/docs/jvm-and-system-tuning
[14:03:57]
2017-10-27 14:03:57.465 [main] INFO  o.a.ignite.internal.IgniteKernal -
[14:03:57] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
2017-10-27 14:03:57.465 [main] INFO  o.a.ignite.internal.IgniteKernal - To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
[14:03:57]
[14:03:57] Ignite node started OK (id=64941c34)
2017-10-27 14:03:57.465 [main] INFO  o.a.ignite.internal.IgniteKernal -
2017-10-27 14:03:57.466 [main] INFO  o.a.ignite.internal.IgniteKernal -

>>> +----------------------------------------------------------------------+
>>> Ignite ver. 2.1.0#20170720-sha1:a6ca5c8a97e9a4c9d73d40ce76d1504c14ba1940
>>> +----------------------------------------------------------------------+
>>> OS name: Linux 3.10.0-327.el7.x86_64 amd64
>>> CPU(s): 6
>>> Heap: 10.0GB
>>> VM name: 31516@4212-bio-lanta-01
>>> Local node [ID=64941C34-2912-4934-8C92-C0C089391207, order=13, clientMode=false]
>>> Local node addresses: [4212-bio-lanta-01/0:0:0:0:0:0:0:1%lo, /10.200.43.34, /127.0.0.1]
>>> Local ports: TCP:11211 TCP:47100 TCP:47500

[14:03:57] Topology snapshot [ver=13, servers=2, clients=0, CPUs=12, heap=22.0GB]
2017-10-27 14:03:57.468 [main] INFO  o.a.i.i.m.d.GridDiscoveryManager - Topology snapshot [ver=13, servers=2, clients=0, CPUs=12, heap=22.0GB]
2017-10-27 14:03:58.944 [exchange-worker-#29%null%] INFO  o.a.ignite.internal.exchange.time - Started exchange init [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=1], crd=false, evt=18, node=TcpDiscoveryNode [id=64941c34-2912-4934-8c92-c0c089391207, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.34, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, 4212-bio-lanta-01/10.200.43.34:47500], discPort=47500, order=13, intOrder=8, lastExchangeTime=1509102156183, loc=true, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], evtNode=TcpDiscoveryNode [id=64941c34-2912-4934-8c92-c0c089391207, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.34, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, 4212-bio-lanta-01/10.200.43.34:47500], discPort=47500, order=13, intOrder=8, lastExchangeTime=1509102156183, loc=true, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false], customEvt=CacheAffinityChangeMessage [id=729ac7d5f51-0fc83046-7af0-474f-8487-f526629190a8, topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], exchId=null, partsMsg=null, exchangeNeeded=true]]
2017-10-27 14:03:58.952 [exchange-worker-#29%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture - Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=1], waitTime=0ms]
2017-10-27 14:03:58.975 [exchange-worker-#29%null%] INFO  o.a.ignite.internal.exchange.time - Finished exchange init [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=1], crd=false]
2017-10-27 14:03:58.991 [sys-#41%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture - Snapshot initialization completed [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=1], time=0ms]
2017-10-27 14:03:58.994 [exchange-worker-#29%null%] INFO  o.a.i.i.p.c.GridCachePartitionExchangeManager - Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=13, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT, node=110e4e93-2954-4b76-8649-a23f4a83990a]
2017-10-27 14:04:01.144 [main] INFO  com.voice.Application - Run config context -> ok
2017-10-27 14:04:01.145 [main] INFO  com.voice.Application - StartUp date: Fri Oct 27 14:01:48 MSK 2017

UPD2:节点发生故障时记录:

    2017-10-28 07:48:56.320 [grid-nio-worker-tcp-comm-2-#19%null%] INFO  o.a.i.s.c.tcp.TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:52329]
2017-10-28 07:48:56.323 [grid-nio-worker-tcp-comm-3-#20%null%] INFO  o.a.i.s.c.tcp.TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/0:0:0:0:0:0:0:1:47100, rmtAddr=/0:0:0:0:0:0:0:1:55692]
2017-10-28 07:48:56.323 [grid-nio-worker-tcp-comm-0-#17%null%] INFO  o.a.i.s.c.tcp.TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:52335]
2017-10-28 07:48:59.535 [RTP-948] INFO  com.voice.rtp.RtpReceiverImpl - Start callId 887B4B84-C47F-461E-35B0-97D0D7F86A51-7092 RTP processing with local port 36496
2017-10-28 07:49:00.116 [grid-nio-worker-tcp-comm-1-#18%null%] INFO  o.a.i.s.c.tcp.TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/0:0:0:0:0:0:0:1:47100, rmtAddr=/0:0:0:0:0:0:0:1:55698]
2017-10-28 07:49:00.117 [grid-nio-worker-tcp-comm-2-#19%null%] INFO  o.a.i.s.c.tcp.TcpCommunicationSpi - Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:52341]
2017-10-28 07:49:00.372 [tcp-disco-msg-worker-#2%null%] WARN  o.a.i.s.d.tcp.TcpDiscoverySpi - Node is out of topology (probably, due to short-time network problems).
2017-10-28 07:49:00.373 [disco-event-worker-#28%null%] WARN  o.a.i.i.m.d.GridDiscoveryManager - Local node SEGMENTED: TcpDiscoveryNode [id=c1886afe-b28f-45b5-a91b-cfdd9e635374, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.33, 127.0.0.1], sockAddrs=[4212-bio-lanta-03/10.200.43.33:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=15, intOrder=10, lastExchangeTime=1509102474635, loc=true, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false]
2017-10-28 07:49:00.399 [disco-event-worker-#28%null%] WARN  o.a.i.i.m.d.GridDiscoveryManager - Stopping local node according to configured segmentation policy.
2017-10-28 07:49:00.400 [disco-event-worker-#28%null%] WARN  o.a.i.i.m.d.GridDiscoveryManager - Node FAILED: TcpDiscoveryNode [id=110e4e93-2954-4b76-8649-a23f4a83990a, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.28, 127.0.0.1], sockAddrs=[/10.200.43.28:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=11, intOrder=7, lastExchangeTime=1509102530512, loc=false, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false]
2017-10-28 07:49:00.413 [disco-event-worker-#28%null%] INFO  o.a.i.i.m.d.GridDiscoveryManager - Topology snapshot [ver=16, servers=3, clients=0, CPUs=18, heap=31.0GB]
2017-10-28 07:49:00.450 [Thread-52685] INFO  o.a.i.i.p.r.p.t.GridTcpRestProtocol - Command protocol successfully stopped: TCP binary
2017-10-28 07:49:00.452 [disco-event-worker-#28%null%] INFO  o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture - Snapshot initialization completed [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], time=0ms]
2017-10-28 07:49:00.470 [disco-event-worker-#28%null%] WARN  o.a.i.i.m.d.GridDiscoveryManager - Node FAILED: TcpDiscoveryNode [id=64941c34-2912-4934-8c92-c0c089391207, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.34, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /10.200.43.34:47500], discPort=47500, order=13, intOrder=8, lastExchangeTime=1509102540526, loc=false, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false]
2017-10-28 07:49:00.479 [disco-event-worker-#28%null%] INFO  o.a.i.i.m.d.GridDiscoveryManager - Topology snapshot [ver=17, servers=2, clients=0, CPUs=12, heap=21.0GB]
2017-10-28 07:49:00.480 [disco-event-worker-#28%null%] WARN  o.a.i.i.m.d.GridDiscoveryManager - Node FAILED: TcpDiscoveryNode [id=f6678eb1-f2f6-4da1-8d4d-a259c435b5a6, addrs=[0:0:0:0:0:0:0:1%lo, 10.200.43.32, 127.0.0.1], sockAddrs=[/10.200.43.32:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=14, intOrder=9, lastExchangeTime=1509102550535, loc=false, ver=2.1.0#20170720-sha1:a6ca5c8a, isClient=false]
2017-10-28 07:49:00.481 [disco-event-worker-#28%null%] INFO  o.a.i.i.m.d.GridDiscoveryManager - Topology snapshot [ver=18, servers=1, clients=0, CPUs=6, heap=10.0GB]
2017-10-28 07:49:01.125 [RTP-948] ERROR com.voice.rtp.RtpReceiverImpl - During RTP socket communications
javax.cache.CacheException: class org.apache.ignite.internal.NodeStoppingException: Operation has been cancelled (node is stopping).
        at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1312)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:2630)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.get(IgniteCacheProxy.java:1188)

DEBUG 模式下的日志在这里:https://drive.google.com/drive/folders/0B00-0mjd5sy1MjNnSkdyR1M4VFk?usp=sharing

最佳答案

问题是 DNS 中缺少集群主机名。更正此问题后,不再有十秒长的读取,并且集群至少在半天内仍能正常工作。

关于java - Ignite 启动时间过长,并且在应用程序运行时经常失败,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/46961122/

相关文章:

java - 在 Java 中传递抽象类

c++ - 建立P2P TCP/UDP连接(NAT穿越)

c - TCP 连接错误 115 Operation in Progress 是什么原因?

c - MSG_CONFIRM 和 TCP

java - android.view.InflateException : Binary XML file line #6: Error inflating class com. google.android.gms.maps.MapView

java - 即使使用 JavascriptExecutor 滚动也无法点击元素

java - 当我复制两个空路径时会发生什么并且为什么它不抛出异常?

spring - "Unrecognized field (..), not marked as ignorable"当 jaxb 解码 xml 输入时

java - Spring , thymeleaf : redirecting to <select> value

java - 在发送给客户端之前注释掉 json