java - Why is String.equals so much slower for non-identical (but equal) String objects?

Tags: java string performance equals

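I was looking into whether String.equals() is really that bad, and ran into some surprising results when I tried to benchmark it.

Using jmh, I wrote a simple test (code and pom at the end) to see how many times the function can run in 1 second.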

Benchmark                                Mode  Samples          Score   Score error  Units
c.s.SimpleBenchmark.testEqualsIntern    thrpt        5  698910949.710  47115846.650  ops/s
c.s.SimpleBenchmark.testEqualsNew       thrpt        5     529118.774     21164.872  ops/s
c.s.SimpleBenchmark.testIsEmpty         thrpt        5  470846539.546  19922172.099  ops/s

There is a factor of roughly 1300x between testEqualsIntern and testEqualsNew, which is frankly quite surprising to me.

The code for String.equals() does have a test for the same object, which would kick the identical (interned, in this case) string objects out quite quickly. I just have significant difficulty believing that the additional code, which appears to amount to walking over an array of size 1 for the two tests and comparing elements, is that much of a performance hit.
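For reference, the shape of the comparison described above can be sketched roughly as follows. This is a simplified model and not the JDK source: the names EqualsSketch and sketchEquals are made up for illustration, and it goes through the public length() and charAt() methods rather than String's internal fields.

```java
// Simplified model of an equals() with an identity fast path.
// EqualsSketch and sketchEquals are hypothetical names, not JDK code.
final class EqualsSketch {
    static boolean sketchEquals(String a, Object other) {
        if (a == other) {
            return true;               // same object: no characters are inspected
        }
        if (!(other instanceof String)) {
            return false;              // only another String can be equal
        }
        String b = (String) other;
        if (a.length() != b.length()) {
            return false;              // different lengths can never be equal
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;          // first mismatch ends the comparison
            }
        }
        return true;                   // for two empty strings the loop body never runs
    }
}
```

In this model, comparing two distinct empty strings costs an identity check, a type check and a length check, which is why a three-orders-of-magnitude gap looks suspicious.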

I've also put in a test with another simple method call on the String to make sure I wasn't seeing something too crazy.

package com.shagie;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class SimpleBenchmark {
    public final static int ITERATIONS = 1000;
    public final static String EMPTY = "";
    public final static String NEW_EMPTY = new String("");

    @Benchmark
    public int testEqualsIntern() {
        int count = 0;
        String str = EMPTY;

        for(int i = 0; i < ITERATIONS; i++) {
            if(str.equals(EMPTY)) {
                count++;
            }
        }
        return count;
    }

    @Benchmark
    public int testEqualsNew() {
        int count = 0;
        String str = NEW_EMPTY;

        for(int i = 0; i < ITERATIONS; i++) {
            if(str.equals(EMPTY)) {
                count++;
            }
        }
        return count;
    }

    @Benchmark
    public int testIsEmpty() {
        int count = 0;
        String str = NEW_EMPTY;

        for(int i = 0; i < ITERATIONS; i++) {
            if(str.isEmpty()) {
                count++;
            }
        }
        return count;
    }


    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
          .include(".*" + SimpleBenchmark.class.getSimpleName() + ".*")
          .warmupIterations(5)
          .measurementIterations(5)
          .forks(1)
          .build();

        new Runner(opt).run();
    }
}
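As a side note on the two constants used in the benchmark, the following small sketch shows how they relate. The class EmptyStringIdentity and its method names are hypothetical and exist only for this illustration; the two fields mirror EMPTY and NEW_EMPTY from SimpleBenchmark.

```java
// Hypothetical demo class; mirrors the two constants from SimpleBenchmark.
final class EmptyStringIdentity {
    static final String EMPTY = "";                   // string literal, interned
    static final String NEW_EMPTY = new String("");   // always a fresh object

    // Reference comparison: the two constants are different objects.
    static boolean sameObject() {
        return NEW_EMPTY == EMPTY;
    }

    // Content comparison: both strings are empty, so they are equal.
    static boolean equalContent() {
        return NEW_EMPTY.equals(EMPTY);
    }

    // intern() returns the canonical instance, which is the literal.
    static boolean sameAfterIntern() {
        return NEW_EMPTY.intern() == EMPTY;
    }
}
```

Here sameObject() is false while equalContent() and sameAfterIntern() are true, so testEqualsIntern can exit on the identity check and testEqualsNew cannot.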

The Maven pom.xml (in case you want to reproduce this, you can set it up quickly yourself):

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.shagie</groupId>
    <artifactId>bench</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging>

    <name>String Benchmarks with JMH</name>

    <prerequisites>
        <maven>3.0</maven>
    </prerequisites>

    <dependencies>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-core</artifactId>
            <version>${jmh.version}</version>
        </dependency>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-generator-annprocess</artifactId>
            <version>${jmh.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <jmh.version>0.9.5</jmh.version>
        <javac.target>1.6</javac.target>
        <uberjar.name>benchmarks</uberjar.name>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <compilerVersion>${javac.target}</compilerVersion>
                    <source>${javac.target}</source>
                    <target>${javac.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.2</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <finalName>${uberjar.name}</finalName>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>org.openjdk.jmh.Main</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
        <pluginManagement>
            <plugins>
                <plugin>
                    <artifactId>maven-clean-plugin</artifactId>
                    <version>2.5</version>
                </plugin>
                <plugin>
                    <artifactId>maven-deploy-plugin</artifactId>
                    <version>2.8.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-install-plugin</artifactId>
                    <version>2.5.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-jar-plugin</artifactId>
                    <version>2.4</version>
                </plugin>
                <plugin>
                    <artifactId>maven-javadoc-plugin</artifactId>
                    <version>2.9.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-resources-plugin</artifactId>
                    <version>2.6</version>
                </plugin>
                <plugin>
                    <artifactId>maven-site-plugin</artifactId>
                    <version>3.3</version>
                </plugin>
                <plugin>
                    <artifactId>maven-source-plugin</artifactId>
                    <version>2.2.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-surefire-plugin</artifactId>
                    <version>2.17</version>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>

</project>

This was auto-generated (with appropriate adjustments to the group and artifact) by:

$ mvn archetype:generate \
          -DinteractiveMode=false \
          -DarchetypeGroupId=org.openjdk.jmh \
          -DarchetypeArtifactId=jmh-java-benchmark-archetype \
          -DgroupId=org.sample \
          -DartifactId=test \
          -Dversion=1.0

Running the tests:

$ mvn clean install
$ java -jar target/benchmarks.jar ".*SimpleBenchmark.*" -wi 5 -i 5 -f 1

Since it will be asked, the Java version this ran on is:

$ java -version
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)

The hardware (in case it matters) is an Intel Xeon processor running OS X 10.9.4.

Best answer

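It is very easy to write a flawed microbenchmark... and you have fallen into the trap.

The only way to understand what is happening is to look at the assembly code. You have to check for yourself that the generated code is what you expect, or whether some unwanted magic has happened. Let's try it together. You have to use addProfile(LinuxPerfAsmProfiler.class) to see the assembly code.

Here is the assembly for testEqualsIntern: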

....[Hottest Region 1]..............................................................................
[0x7fb9e11acda0:0x7fb9e11acdc8] in org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop

                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@19 (line 103)
                  0x00007fb9e11acd82: movzbl 0x94(%rdx),%r11d   ;*getfield isDone
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@29 (line 105)
                  0x00007fb9e11acd8a: mov    $0x2,%ebp
                  0x00007fb9e11acd8f: test   %r11d,%r11d
                  0x00007fb9e11acd92: jne    0x00007fb9e11acdcc  ;*ifeq
                                                                 ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@32 (line 105)
                  0x00007fb9e11acd94: nopl   0x0(%rax,%rax,1)
                  0x00007fb9e11acd9c: xchg   %ax,%ax            ;*aload
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@13 (line 103)
6.50%    3.37%    0x00007fb9e11acda0: mov    0xb0(%rdi),%r11d   ;*getfield i1
                                                                ; - org.openjdk.jmh.infra.Blackhole::consume@2 (line 350)
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@19 (line 103)
0.06%    0.05%    0x00007fb9e11acda7: mov    0xb4(%rdi),%r10d   ;*getfield i2
                                                                ; - org.openjdk.jmh.infra.Blackhole::consume@15 (line 350)
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@19 (line 103)
0.06%    0.09%    0x00007fb9e11acdae: cmp    $0x3e8,%r10d
0.03%             0x00007fb9e11acdb5: je     0x00007fb9e11acdf1  ;*return
                                                                ; - org.openjdk.jmh.infra.Blackhole::consume@38 (line 354)
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@19 (line 103)
48.85%   44.47%    0x00007fb9e11acdb7: movzbl 0x94(%rdx),%ecx    ;*getfield isDone
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@29 (line 105)
0.33%    0.62%    0x00007fb9e11acdbe: add    $0x1,%rbp          ; OopMap{r9=Oop rbx=Oop rdi=Oop rdx=Oop off=226}
                                                                ;*ifeq
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@32 (line 105)
0.03%    0.05%    0x00007fb9e11acdc2: test   %eax,0x16543238(%rip)        # 0x00007fb9f76f0000
                                                                ;   {poll}
42.31%   49.43%    0x00007fb9e11acdc8: test   %ecx,%ecx
                   0x00007fb9e11acdca: je     0x00007fb9e11acda0  ;*aload_2
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@35 (line 106)
                  0x00007fb9e11acdcc: mov    $0x7fb9f706fe40,%r10
                  0x00007fb9e11acdd6: callq  *%r10              ;*invokestatic nanoTime
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@36 (line 106)
                  0x00007fb9e11acdd9: mov    %rbp,0x10(%rbx)    ;*putfield operations
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@51 (line 108)
                  0x00007fb9e11acddd: mov    %rax,0x28(%rbx)    ;*putfield stopTime
                                                                ; - org.sample.generated.MyBenchmark_testEqualsIntern::testEqualsIntern_thrpt_jmhLoop@39 (line 106)
....................................................................................................

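As you may know, JMH takes your benchmark code and inserts it into its own measurement loop. You can easily see the generated code by looking in the target/generated-sources folder. You have to know what this code looks like in order to compare it with the assembly.

The interesting part is here: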

public void testEqualsIntern_avgt_jmhLoop(InfraControl control, RawResults result, MyBenchmark_1_jmh l_mybenchmark0_0, Blackhole_1_jmh l_blackhole1_1) throws Throwable {
    long operations = 0;
    long realTime = 0;
    result.startTime = System.nanoTime();
    do {
        l_blackhole1_1.consume(l_mybenchmark0_0.testEqualsIntern());
        operations++;
    } while(!control.isDone);
    result.stopTime = System.nanoTime();
    result.realTime = realTime;
    result.operations = operations;
}

OK, you can see this nice do/while loop doing two things:

  • calling your function
  • calling consume to prevent Hotspot from performing unwanted optimizations
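To make the structure of that loop concrete, here is a plain-Java stand-in that can be run without JMH. It is only a sketch under stated assumptions: MeasurementLoopSketch, run, stop and measureFor are hypothetical names, the volatile sink field is a crude substitute for the real Blackhole, and it needs Java 8 or later for the lambda.

```java
import java.util.function.IntSupplier;

// Plain-Java stand-in for a harness-generated measurement loop.
// MeasurementLoopSketch is a hypothetical name; this is not JMH code.
final class MeasurementLoopSketch {
    private volatile boolean isDone;   // plays the role of control.isDone
    private volatile int sink;         // crude substitute for Blackhole.consume

    // Calls the payload until stop() is invoked; returns the number of calls.
    long run(IntSupplier payload) {
        long operations = 0;
        do {
            sink = payload.getAsInt(); // call the benchmark method, consume its result
            operations++;
        } while (!isDone);
        return operations;
    }

    void stop() {
        isDone = true;
    }

    // Runs the loop for roughly the given number of milliseconds.
    static long measureFor(long millis, IntSupplier payload) {
        MeasurementLoopSketch loop = new MeasurementLoopSketch();
        Thread timer = new Thread(() -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            loop.stop();               // always end the loop, even if interrupted
        });
        timer.setDaemon(true);
        timer.start();
        return loop.run(payload);
    }
}
```

A call such as MeasurementLoopSketch.measureFor(100, () -> 1000) measures a payload that always returns the same constant, so the count it reports reflects the cost of the loop and the sink rather than of any real work.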

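Now let's get back to the assembly. Try to find these three operations in it (the loop, the consume, and your code). Can you?

You can see the JMH loop: it is the 0x00007fb9e11acdb7: movzbl 0x94(%rdx),%ecx ;*getfield isDone and the jump that follows.

You can see the Blackhole: it runs from 0x00007fb9e11acda0 to 0x00007fb9e11acdb5.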

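But where is your code? It's gone. The only trace left of it is the constant 0x3e8 (1000, the value of ITERATIONS) in the cmp instruction inside the Blackhole code. You did not follow JMH good practices and allowed Hotspot to remove your code. You are benchmarking a NOOP. By the way, have you ever tried benchmarking a NOOP? It is a good thing to do: when you later see a number close to that one, you know you have to be very careful.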
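A quick back-of-the-envelope check on the reported scores points the same way. The sketch below only does arithmetic on the numbers from the question; ThroughputSanityCheck and its methods are hypothetical helpers, not part of JMH.

```java
// Back-of-the-envelope check on the reported scores.
// ThroughputSanityCheck is a hypothetical helper, not part of JMH.
final class ThroughputSanityCheck {
    static final int ITERATIONS = 1000;   // inner loop count from SimpleBenchmark

    // Nanoseconds spent per benchmark invocation for a score given in ops/s.
    static double nanosPerOp(double opsPerSecond) {
        return 1e9 / opsPerSecond;
    }

    // Nanoseconds per inner-loop iteration, if the loop had really been executed.
    static double nanosPerIteration(double opsPerSecond) {
        return nanosPerOp(opsPerSecond) / ITERATIONS;
    }

    public static void main(String[] args) {
        double intern = 698910949.710;    // testEqualsIntern score, ops/s
        double fresh = 529118.774;        // testEqualsNew score, ops/s
        System.out.printf("ratio: %.1f%n", intern / fresh);
        System.out.printf("testEqualsIntern: %.4f ns per iteration%n", nanosPerIteration(intern));
        System.out.printf("testEqualsNew:    %.4f ns per iteration%n", nanosPerIteration(fresh));
    }
}
```

testEqualsIntern comes out at about 1.43 ns per invocation, or about 0.0014 ns for each of its 1000 supposed iterations, which is far less than a single clock cycle; the loop cannot have been executed. testEqualsNew comes out at about 1.9 ns per equals call, which is a plausible cost for a real call.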

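You can do the same analysis for the second benchmark. I have not read its assembly carefully, but you will be able to spot your for loop and the call to equals. You can go back and read the JMH samples again to avoid this kind of problem.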
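One commonly recommended way to keep the JIT from folding the benchmark away is to read the inputs from non-final fields of a state object, do a single operation per invocation, and return the result so the harness can consume it. The sketch below shows only that structure. It is an assumption about how the benchmark could be reshaped, not code from the JMH samples: EqualsBenchmarkSketch and its members are hypothetical names, and the JMH annotations are mentioned only in comments so that the example compiles without the JMH dependency.

```java
// Sketch only: in a real JMH benchmark the class would carry @State(Scope.Thread)
// and each method @Benchmark. The annotations are left out here so the example
// compiles without JMH. All names in this class are hypothetical.
final class EqualsBenchmarkSketch {
    // Non-final instance fields: not compile-time constants for the JIT.
    String interned = "";
    String sameObject = interned;            // same reference as 'interned'
    String distinctObject = new String("");  // equal content, different object

    // One equals() call per invocation; the harness supplies the loop.
    boolean equalsSameObject() {
        return sameObject.equals(interned);
    }

    // Returning the result lets the harness consume it instead of discarding it.
    boolean equalsDistinctObject() {
        return distinctObject.equals(interned);
    }
}
```

Even with this shape, checking the generated assembly remains the only way to be sure of what is actually being measured.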

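TL;DR Writing correct micro/nano benchmarks is really hard, and you should double-check that you know what you are measuring. Assembly is the only way. Watch all the presentations and read all the blog posts by Aleksey to learn more; he does a great job. Finally, measurements like this are almost always useless in real life, but they are a good learning tool.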

Regarding "java - Why is String.equals so much slower for non-identical (but equal) String objects?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25351490/
