java - String 和 StringBuilder 操作在内存使用方面的比较

标签 java string stringbuilder

根据 KathySierra 的 SCJP 学习指南:

The java.lang.StringBuffer and java.lang.StringBuilder classes should be used when you have to make modifications to strings of characters. As we discussed, String objects are immutable, so if you choose to do a lot of manipulations with String objects, you will end up with a lot of abandoned String objects in the String pool

为了解决这个问题,我查看了 String 类和 StringBuilder 的代码 source here .

String 的简化代码如下所示:

public final class String(){
     private final char [] value; //Final helps in immutability, I guess.
     public String (String original){
         value = original.value;
      }
}

StringBuilder 的简化版本如下所示:

public final class StringBuilder{
    char [] value;
    public StringBuilder(String str) {
        value = new Char[(str.length() + 16)]; // 16 here is implementation dependent.
    append(str);
}

public StringBuilder append(String str){

            //Add 'str' characters in value array if its size allows,
        else
            // Create new array of size newCapacity and copy contents of 'value' in that.
            //value = Arrays.copyOf(value, newCapacity);// here old array object is lost.

        return this;
    }
}

假设我们有一个案例:

使用字符串类:

String s1 = "abc"; // Creates one object on String pool.
s1 = s1+"def"; // Creates two objects - "def " on String pool
// and "abcdef" on the heap.

如果我使用 StringBuilder,代码将变为:

 StringBuilder s1 = StringBuilder("abc");

 // Creates one String object "abc " on String pool.
 // And one StringBuilder object "abc" on the heap.
 s1.append("def");
 // Creates one string object "def" on String pool.
 // And changes the char [] inside StringBuilder to hold "def" also.

在 StringBuilder s2 = s1.append("def"); 中,保存字符串的 char 数组被新的 char 数组替换的机会相同。旧数组现在引用较少,将被垃圾收集。

我的查询是:

使用简单的字符串连接和 StringBuilder append() 方法,进入字符串池的 String 对象的数量是相同的。

根据上面列出的代码,StringBuilder 确实首先使用了更大的 char 数组,而 String 对象包含与它持有的字符串。

  1. StringBuilder 的使用如何比正常更有效 String 字符串操作类?
  2. SCJP 指南中给出的陈述是否错误?

最佳答案

关键是expandCapacity函数:

void expandCapacity(int minimumCapacity) {
    int newCapacity = (value.length + 1) * 2; //important part here
    if (newCapacity < 0) {
        newCapacity = Integer.MAX_VALUE;
    } else if (minimumCapacity > newCapacity) {
        newCapacity = minimumCapacity;
    }
    value = Arrays.copyOf(value, newCapacity);
}

每次需要调整大小时,通过将基础数组调整为 2 倍,amortized time需要附加 1 个字符的情况已最小化。

Wikipedia有一个很好的解释:

As n elements are inserted, the capacities form a geometric progression. Expanding the array by any constant proportion ensures that inserting n elements takes O(n) time overall, meaning that each insertion takes amortized constant time. The value of this proportion a leads to a time-space tradeoff: the average time per insertion operation is about a/(a−1), while the number of wasted cells is bounded above by (a−1)n. The choice of a depends on the library or application: some textbooks use a = 2, but Java's ArrayList implementation uses a = 3/2 and the C implementation of Python's list data structure uses a = 9/8.

Many dynamic arrays also deallocate some of the underlying storage if its size drops below a certain threshold, such as 30% of the capacity. This threshold must be strictly smaller than 1/a in order to support mixed sequences of insertions and removals with amortized constant cost.

Dynamic arrays are a common example when teaching amortized analysis.

现在对于您的特定示例,它不会产生任何影响,但您会在附加大量字符时看到效果:

public static void main(String[] args){
    int numAppends = 200000;
    int numRepetitions = 3;
    long[] time1 = new long[numRepetitions];
    long[] time2 = new long[numRepetitions];
    long now;
    for (int k = 0; k < numRepetitions; k++){
        String s = "";
        now = System.nanoTime();
        for(int i = 0; i < numAppends ; i++) {
            s = s + "a";
        }
        time1[k] = (System.nanoTime() - now) / 1000000;
        StringBuilder sb = new StringBuilder();
        now = System.nanoTime();
        for(int i = 0; i < numAppends ; i++) {
            sb.append("a");     
        }
        time2[k] = (System.nanoTime() - now) / 1000000;
        System.out.println("Rep "+k+", time1: "+time1[k]+ " ms, time2: " + time2[k] + " ms");
    }
}

输出:

Rep 0, time1: 13509 ms, time2: 7 ms
Rep 1, time1: 13086 ms, time2: 1 ms
Rep 2, time1: 13167 ms, time2: 1 ms

此外,我还计算了 Arrays.copyOf() 方法在 extendCapacity() 中被调用的次数作为基准测试:第一次迭代是 49 次,但在第二次和第三次迭代中只有 15 次。输出结果如下:

newCapacity: 34
newCapacity: 70
newCapacity: 142
newCapacity: 286
newCapacity: 574
newCapacity: 1150
newCapacity: 2302
newCapacity: 4606
newCapacity: 9214
newCapacity: 18430
newCapacity: 36862
newCapacity: 73726
newCapacity: 147454
newCapacity: 294910
newCapacity: 42
Rep 2, time1: 12955 ms, time2: 134 ms

关于java - String 和 StringBuilder 操作在内存使用方面的比较,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/20429442/

相关文章:

string - 字符串内的换行符显示在 TMemoBox 上

java - 大量注释代码会影响类文件的大小吗?

c - C 中的名称排序

每次运行时 Java 字符串到字节数组的变化

c# - 将字节数组的内容转换为字符串

Android StringBuilder 与字符串连接

java - 不同版本的依赖支持 :design

java - 简单的纸牌游戏,让用户选择游戏中的牌组数量

Java:WAITING第一帧,同时从第二帧检索数据,然后点击关闭按钮,控件也进入第一帧

java - 如何在给定数字中缺少数字的文件中用数字拆分?