java - Ambiguities in the IBM JRE gencon policy and verbose:gc output

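I am helping a project tune its application server environment, and I am seeing some rather confusing output in the verbose:gc logs from the IBM JRE, which I am far less familiar with than Hotspot. In a number of cases, what is there seems to contradict what I see in the vendor's documentation. This is for an application running on Websphere, on IBM's 1.7_64 JRE. This particular run uses the gencon GC policy with a 4GB heap. Here is a sample of the verbose:gc output for a global GC, which is the source of many of the ambiguities I detail below: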

<exclusive-start id="8574" timestamp="2015-09-08T22:48:45.819" intervalms="2919.893">
    <response-info timems="0.229" idlems="0.069" threads="43" lastid="0000000005FF6800" lastname="WebContainer : 37" />
</exclusive-start>
<sys-start id="8575" timestamp="2015-09-08T22:48:45.820" intervalms="3028473.245" />
<cycle-start id="8576" type="global" contextid="0" timestamp="2015-09-08T22:48:45.820" intervalms="3028473.281" />
<gc-start id="8577" type="global" contextid="8576" timestamp="2015-09-08T22:48:45.820">
    <mem-info id="8578" free="462493104" total="4155310080" percent="11">
        <mem type="nursery" free="80236904" total="934084608" percent="8" />
        <mem type="tenure" free="382256200" total="3221225472" percent="11">
            <mem type="soa" free="221195336" total="3060164608" percent="7" />
            <mem type="loa" free="161060864" total="161060864" percent="100" />
        </mem>
        <remembered-set count="16780" />
    </mem-info>
</gc-start>
<allocation-stats totalBytes="745771152" >
    <allocated-bytes non-tlh="193377184" tlh="552393968" />
    <largest-consumer threadName="WebContainer : 49" threadId="000000000659E600" bytes="156075608" />
</allocation-stats>
<gc-op id="8579" type="mark" timems="609.025" contextid="8576" timestamp="2015-09-08T22:48:46.429">
    <trace-info objectcount="9486509" scancount="7257956" scanbytes="246533016" />
    <finalization candidates="1528" enqueued="411" />
    <ownableSynchronizers candidates="32" cleared="0" />
    <references type="soft" candidates="63260" cleared="0" enqueued="0" dynamicThreshold="30" maxThreshold="32" />
    <references type="weak" candidates="27120" cleared="11" enqueued="0" />
    <references type="phantom" candidates="17361" cleared="7885" enqueued="7885" />
    <stringconstants candidates="153217" cleared="105"    />
</gc-op>
<gc-op id="8580" type="classunload" timems="1.114" contextid="8576" timestamp="2015-09-08T22:48:46.430">
    <classunload-info classloadercandidates="4306" classloadersunloaded="26" classesunloaded="26" quiescems="0.000" setupms="0.955" scanms="0.108" postms="0.051" />
</gc-op>
<gc-op id="8581" type="sweep" timems="18.520" contextid="8576" timestamp="2015-09-08T22:48:46.449" />
<gc-op id="8582" type="compact" timems="2209.475" contextid="8576" timestamp="2015-09-08T22:48:48.658">
    <compact-info movecount="9482641" movebytes="539414400" reason="compact on aggressive collection" />
</gc-op>
<gc-end id="8583" type="global" contextid="8576" durationms="2839.085" timestamp="2015-09-08T22:48:48.659">
    <mem-info id="8584" free="3615704600" total="4155310080" percent="87">
        <mem type="nursery" free="834618336" total="934084608" percent="89" />
        <mem type="tenure" free="2781086264" total="3221225472" percent="86">
            <mem type="soa" free="2620025400" total="3060164608" percent="85" />
            <mem type="loa" free="161060864" total="161060864" percent="100" />
        </mem>
        <pending-finalizers system="367" default="44" reference="7885" classloader="0" />
        <remembered-set count="15785" />
    </mem-info>
</gc-end>
<cycle-end id="8585" type="global" contextid="8576" timestamp="2015-09-08T22:48:48.659" />
<sys-end id="8586" timestamp="2015-09-08T22:48:48.659" />
<exclusive-end id="8587" timestamp="2015-09-08T22:48:48.659" durationms="2839.704" />

The questions are:

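1) The timestamps on the GC operations (the gc-op elements) appear to be the time at which the operation completed, not the time at which it started. The IBM documentation implies that the timestamp is when the operation occurred (http://www-01.ibm.com/support/knowledgecenter/SSYKE2_7.0.0/com.ibm.java.zos.71.doc/diag/tools/gcpd_verbose_operation.html). However, when I look at the timestamps and durations (the timems attribute) of the operations, the numbers do not add up under that reading. If the timestamp is interpreted as the time the operation completed rather than the time it started, they do add up and make sense. To illustrate, the sample output above contains the following operations, with their reported durations and timestamps (I have rounded the numbers):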

    operation - duration - timestamp
    mark - 609ms - 48:46.429
    classunload - 1ms - 48:46.430
    sweep - 19ms - 48:46.449
    compact - 2209ms - 48:48.658

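As you can see, the timestamps make no sense if interpreted as start times, but they do if interpreted as end times. For example, the classunload timestamp equals the mark timestamp plus the classunload duration (not plus the mark duration, as it would if the timestamp indicated when the operation started). Similarly, the sweep timestamp equals the classunload timestamp plus the sweep duration.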
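The arithmetic can be checked mechanically. Below is a rough sketch, not anything from IBM's tooling: the class name `GcOpTimestampCheck`, the 1 ms tolerance, and the assumption that the operations run back-to-back starting at the gc-start timestamp are all mine. The timestamps and timems values are copied from the log above.

```java
import java.time.Duration;
import java.time.LocalDateTime;

final class GcOpTimestampCheck {
    // Values copied from the gc-start and gc-op elements in the log above.
    static final LocalDateTime GC_START = LocalDateTime.parse("2015-09-08T22:48:45.820");
    static final String[] TYPES = {"mark", "classunload", "sweep", "compact"};
    static final double[] TIMEMS = {609.025, 1.114, 18.520, 2209.475};
    static final LocalDateTime[] STAMPS = {
        LocalDateTime.parse("2015-09-08T22:48:46.429"),
        LocalDateTime.parse("2015-09-08T22:48:46.430"),
        LocalDateTime.parse("2015-09-08T22:48:46.449"),
        LocalDateTime.parse("2015-09-08T22:48:48.658")
    };

    static double millisBetween(LocalDateTime from, LocalDateTime to) {
        return Duration.between(from, to).toNanos() / 1_000_000.0;
    }

    /** End-time reading: the gap since the previous event should equal this op's timems. */
    static boolean fitsEndTimeReading(double toleranceMs) {
        LocalDateTime previous = GC_START; // assumption: mark begins at gc-start
        for (int i = 0; i < STAMPS.length; i++) {
            double gapMs = millisBetween(previous, STAMPS[i]);
            if (Math.abs(gapMs - TIMEMS[i]) > toleranceMs) {
                return false;
            }
            previous = STAMPS[i];
        }
        return true;
    }

    /** Start-time reading: the gap to the next op should equal this op's timems. */
    static boolean fitsStartTimeReading(double toleranceMs) {
        for (int i = 0; i + 1 < STAMPS.length; i++) {
            double gapMs = millisBetween(STAMPS[i], STAMPS[i + 1]);
            if (Math.abs(gapMs - TIMEMS[i]) > toleranceMs) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Tolerance of 1 ms because the logged timestamps only have millisecond resolution.
        System.out.println("end-time reading fits:   " + fitsEndTimeReading(1.0));
        System.out.println("start-time reading fits: " + fitsStartTimeReading(1.0));
    }
}
```

With the values from this log, the end-time reading fits and the start-time reading does not, which is the pattern described above.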

Can someone confirm whether this is the expected behavior, or am I just reading it wrong?

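2) Another issue is that every operation in the log appears to be a stop-the-world operation from start to finish. I expected scavenge GCs to be fully stop-the-world, but all the documentation I have seen from IBM indicates that at least part of the mark phase of a global GC should be done concurrently. That is not what happens in the GC log entry above: everything appears to be done in a single stop-the-world pause, with no part of the mark or sweep done concurrently. According to the (older) documentation, at least part of the mark should be done concurrently rather than entirely within a stop-the-world pause (http://www.ibm.com/developerworks/java/library/j-ibmjava2/). Is there a setting that makes a global GC of the tenured generation run part of its cycle concurrently? I know this is done in the IBM JRE's optavgpause collector and in Hotspot's concurrent collectors, neither of which is particularly new technology, so it is unclear why gencon would always force a full stop-the-world pause every time it has to perform a global collection. This matters because the 2.8-second pause shown above would put us in violation of our SLA with the customer.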
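To put numbers on that, here is a small sketch that compares the summed gc-op durations with the durationms reported on gc-end and exclusive-end. The class and method names are made up for illustration; the figures are copied from the log above.

```java
final class PauseBreakdown {
    // timems of the mark, classunload, sweep and compact gc-op elements.
    static final double[] GC_OP_TIMEMS = {609.025, 1.114, 18.520, 2209.475};
    // durationms attributes from gc-end and exclusive-end.
    static final double GC_END_DURATION_MS = 2839.085;
    static final double EXCLUSIVE_DURATION_MS = 2839.704;

    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /** Fraction of the exclusive (stop-the-world) window taken up by the given time. */
    static double shareOfPause(double partMs) {
        return partMs / EXCLUSIVE_DURATION_MS;
    }

    public static void main(String[] args) {
        double opsMs = sum(GC_OP_TIMEMS);
        System.out.printf("gc-op total: %.3f ms%n", opsMs);
        System.out.printf("share of exclusive pause, all ops: %.4f%n", shareOfPause(opsMs));
        System.out.printf("share of exclusive pause, compact: %.4f%n", shareOfPause(GC_OP_TIMEMS[3]));
    }
}
```

The four operations sum to roughly 2838.1 ms, so essentially the whole exclusive window is accounted for by mark, classunload, sweep and compact, and the compact step alone is well over three quarters of it.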

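3) In the global GC above, it is clear from the nursery occupancy figures that the nursery was also collected during the operation (the nursery shows only 8% free before the GC and 89% free after it). However, I see no information in the output about a GC taking place in the nursery; I would have expected a scavenge operation to be logged here. Does anyone know why it is missing? Is it because, during a global GC, the nursery is collected by the same heap-walking (mark-sweep-compact) approach that is used for the tenured space?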
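For reference, this is roughly how the per-area change can be pulled out of the two mem-info elements. It is only a sketch using the JDK's built-in DOM parser; `MemInfoDelta` and its methods are hypothetical names, and the XML fragments are trimmed copies of the log above with single-quoted attributes so they fit in Java string literals.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class MemInfoDelta {
    // mem-info from gc-start (id 8578) and gc-end (id 8584), reduced to the mem elements.
    static final String BEFORE =
        "<mem-info free='462493104' total='4155310080'>"
        + "<mem type='nursery' free='80236904' total='934084608'/>"
        + "<mem type='tenure' free='382256200' total='3221225472'>"
        + "<mem type='soa' free='221195336' total='3060164608'/>"
        + "<mem type='loa' free='161060864' total='161060864'/>"
        + "</mem></mem-info>";
    static final String AFTER =
        "<mem-info free='3615704600' total='4155310080'>"
        + "<mem type='nursery' free='834618336' total='934084608'/>"
        + "<mem type='tenure' free='2781086264' total='3221225472'>"
        + "<mem type='soa' free='2620025400' total='3060164608'/>"
        + "<mem type='loa' free='161060864' total='161060864'/>"
        + "</mem></mem-info>";

    /** Maps each mem type (nursery, tenure, soa, loa) to its free byte count. */
    static Map<String, Long> freeByType(String memInfoXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(memInfoXml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            Map<String, Long> free = new LinkedHashMap<>();
            NodeList mems = root.getElementsByTagName("mem"); // includes nested soa/loa
            for (int i = 0; i < mems.getLength(); i++) {
                Element mem = (Element) mems.item(i);
                free.put(mem.getAttribute("type"), Long.parseLong(mem.getAttribute("free")));
            }
            return free;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a parsable mem-info fragment", e);
        }
    }

    /** Bytes freed per area: free after the GC minus free before it. */
    static Map<String, Long> reclaimed(String beforeXml, String afterXml) {
        Map<String, Long> before = freeByType(beforeXml);
        Map<String, Long> after = freeByType(afterXml);
        Map<String, Long> delta = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : before.entrySet()) {
            delta.put(e.getKey(), after.get(e.getKey()) - e.getValue());
        }
        return delta;
    }

    public static void main(String[] args) {
        reclaimed(BEFORE, AFTER).forEach((type, bytes) ->
            System.out.println(type + ": " + bytes + " bytes freed"));
    }
}
```

On these figures the nursery gains about 754 MB of free space during the global GC, even though no scavenge is logged.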

Best answer

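You are right that you are seeing a stop-the-world collection, as indicated by the exclusive-start and exclusive-end entries that cover the entire cycle. What I find interesting is that you are seeing a global gc even though the heap is not fully allocated.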

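I think the reason is that you have sys-start and sys-end entries, which seem to indicate that something in your application (or a remote GC) is triggering System.gc(). You should address that first and then capture a fresh set of gc logs.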
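One quick way to see how often this happens is to scan the log for sys-start entries. The following is a minimal sketch with a regular expression rather than a real XML parser; `SysGcScan` is a made-up name, and reading intervalms as the time since the previous sys-start is my assumption, not something confirmed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SysGcScan {
    // Matches a sys-start element and captures its intervalms attribute.
    private static final Pattern SYS_START =
        Pattern.compile("<sys-start\\b[^>]*\\bintervalms=\"([0-9.]+)\"");

    /** Returns the intervalms value of every sys-start element found in the log text. */
    static List<Double> sysGcIntervalsMs(String log) {
        List<Double> intervals = new ArrayList<>();
        Matcher m = SYS_START.matcher(log);
        while (m.find()) {
            intervals.add(Double.parseDouble(m.group(1)));
        }
        return intervals;
    }

    public static void main(String[] args) {
        // The sys-start line from the log in the question.
        String sample = "<sys-start id=\"8575\" timestamp=\"2015-09-08T22:48:45.820\" "
            + "intervalms=\"3028473.245\" />";
        List<Double> intervals = sysGcIntervalsMs(sample);
        System.out.println("explicit GC entries: " + intervals.size());
        for (double ms : intervals) {
            System.out.printf("interval: %.1f minutes%n", ms / 60_000.0);
        }
    }
}
```

Running it over the full log file (read into a string) would show whether the sys-start entries are frequent or regularly spaced, which may help narrow down what is calling System.gc().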

For "java - Ambiguities in the IBM JRE gencon policy and verbose:gc output", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32487359/
