java - Finding the java.util.Random seed from bounded nextInt(int bound) results

Background

I have been reading and trying to work through various questions and answers about recovering the seed of java.util.Random given the outputs of nextInt().

The implementation of nextInt(int bound) is:

public int nextInt(int bound) {
   if (bound <= 0)
     throw new IllegalArgumentException("bound must be positive");

   if ((bound & -bound) == bound)  // i.e., bound is a power of 2
     return (int)((bound * (long)next(31)) >> 31);

   int bits, val;
   do {
       bits = next(31);
       val = bits % bound;
   } while (bits - val + (bound-1) < 0);
   return val;
 }

The implementation of next(int bits) is:

protected int next(int bits) {
    long oldseed, nextseed;
    AtomicLong seed = this.seed;
    do {
        oldseed = seed.get();
        nextseed = (oldseed * multiplier + addend) & mask;
    } while (!seed.compareAndSet(oldseed, nextseed));
    return (int)(nextseed >>> (48 - bits));
}

Here multiplier is 0x5DEECE66DL, addend is 0xBL, and mask is (1L << 48) - 1. The first two are hexadecimal literals; the L suffix is Java's notation for a long literal.

Calling nextInt() with no bound returns the full 32 bits from next(32), instead of discarding bits with bits % bound.

Question

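Without completely bruteforcing the full 2⁴⁸ possibilities, how would I go about finding the current seed after x amount of calls to nextInt(n) (assuming the bound is never a power of 2)? For example, say I want to find the seed given 10 calls to nextInt(344): [251, 331, 306, 322, 333, 283, 187, 54, 170, 331].
How can I determine the amount of data I'd need to find the correct seed, and not just another seed that produces the same starting data?
Does this change given bounds that are either odd/even?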

Best answer

Without completely bruteforcing the full 2⁴⁸ possibilities, how would I go about finding the current seed after x amount of calls to nextInt(n) (assuming the bound is never a power of 2)?

Let's first remove the code that is there for multithreading, error testing, and bound being a power of 2. Things boil down to

public int nextInt_equivalent(int bound) {
   int bits, val;
   do {
       seed = (seed * multiplier + addend) & mask; // 48-bit LCG
       bits = (int) (seed >> 17);                  // keep the top 31 bits
       val = bits % bound;                         // build val in [0, bound)
   } while (bits - val + (bound-1) < 0);
   return val;
 }
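The fragment above leaves seed, multiplier, addend and mask undeclared. Here is a minimal runnable sketch of the same simplification, assuming a non-power-of-2 bound and single-threaded use. The class name SimplifiedRandom and the helper agreesWithJdk are made up for illustration; the initial XOR with the multiplier follows the scrambling documented for Random.setSeed.

```java
import java.util.Random;

final class SimplifiedRandom {
    static final long MULTIPLIER = 0x5DEECE66DL;
    static final long ADDEND = 0xBL;
    static final long MASK = (1L << 48) - 1;

    private long seed; // the 48-bit LCG state

    SimplifiedRandom(long userSeed) {
        // java.util.Random scrambles the user-supplied seed the same way.
        this.seed = (userSeed ^ MULTIPLIER) & MASK;
    }

    // Same loop as nextInt(int bound), minus argument checks,
    // the power-of-2 shortcut and the atomic compare-and-set.
    int nextIntEquivalent(int bound) {
        int bits, val;
        do {
            seed = (seed * MULTIPLIER + ADDEND) & MASK; // 48-bit LCG
            bits = (int) (seed >>> 17);                 // keep the top 31 bits
            val = bits % bound;                         // val in [0, bound)
        } while (bits - val + (bound - 1) < 0);         // redraw on the rare overflow
        return val;
    }

    // Compares `count` outputs against the real java.util.Random.
    static boolean agreesWithJdk(long userSeed, int bound, int count) {
        Random jdk = new Random(userSeed);
        SimplifiedRandom mine = new SimplifiedRandom(userSeed);
        for (int i = 0; i < count; i++) {
            if (jdk.nextInt(bound) != mine.nextIntEquivalent(bound)) return false;
        }
        return true;
    }
}
```

Calling SimplifiedRandom.agreesWithJdk(42L, 344, 1000) is a quick way to check that nothing essential was lost in the simplification.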

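Next we must understand what while (bits - val + (bound-1) < 0) is about. It is there so that bits is used only when it falls in an interval whose width is a multiple of bound, which ensures a uniform distribution of val. That interval is [0, (1L<<31)/bound*bound).
The while condition is equivalent to while (bits >= (1L<<31)/bound*bound), but executes faster. The condition holds for the (1L<<31)%bound highest values of bits out of 1L<<31. When bound is 344, that is 8 values of bits out of 2³¹, or about 3.7 per billion (10⁹).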

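This is so rare that one reasonable approach is to assume it does not occur. Another is to hope that it occurs, and to test whether (and when) it occurred by checking if the seeds that cause this rare event lead to a sequence of val found in the given data. There are only ((1L<<31)%bound)<<17 values of seed to test (here slightly above a million), which is quite feasible. Either way, in the rest I assume this possibility is ruled out, and consider the generator without the while loop.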
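The figures quoted for bound 344 can be checked with a few lines of Java. This is only a sketch: the class and method names are invented for illustration, and the equivalence of the two tests is checked only over the top of the 31-bit range, where rejection can happen.

```java
final class RejectionCount {
    // Number of 31-bit values of `bits` that nextInt(bound) rejects and redraws.
    static long rejected(int bound) {
        return (1L << 31) % bound;
    }

    // Number of 48-bit seeds whose top 31 bits fall in the rejected range.
    static long rareSeedCount(int bound) {
        return rejected(bound) << 17;
    }

    // The overflow test used by java.util.Random, in 32-bit int arithmetic.
    static boolean rejectsFast(int bits, int bound) {
        int val = bits % bound;
        return bits - val + (bound - 1) < 0;
    }

    // The plain range test it is said to be equivalent to.
    static boolean rejectsPlain(int bits, int bound) {
        long limit = (1L << 31) / bound * bound; // largest multiple of bound <= 2^31
        return bits >= limit;
    }

    // Compares both tests on the `span` highest values of bits.
    static boolean testsAgreeNearTop(int bound, int span) {
        for (long b = (1L << 31) - span; b < (1L << 31); b++) {
            int bits = (int) b;
            if (rejectsFast(bits, bound) != rejectsPlain(bits, bound)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(rejected(344));                   // prints 8
        System.out.println(rareSeedCount(344));              // prints 1048576
        System.out.println(testsAgreeNearTop(344, 100_000)); // prints true
    }
}
```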

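When bound is even, or more generally a multiple of 2^S for some S>0, observe that the low-order S bits of the output val (which we are given) are also the low-order S bits of bits, and thus the bits of rank [17, 17+S) of seed. Moreover, the low-order 17+S bits of seed evolve fully independently of the other 31-S bits. When bound is 344=8×43, we have S=3, so we can attack the low-order 17+S=20 bits of seed independently. We get S=3 bits of seed directly from the first val.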
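Both observations are easy to check empirically. The sketch below uses made-up names (LowBitsDemo, lowBitsLeak, lowBitsIndependent) and ignores the rare rejection loop, as the rest of the answer does.

```java
final class LowBitsDemo {
    static final long MULTIPLIER = 0x5DEECE66DL;
    static final long ADDEND = 0xBL;
    static final long MASK = (1L << 48) - 1;

    // One step of the 48-bit LCG.
    static long step(long seed) {
        return (seed * MULTIPLIER + ADDEND) & MASK;
    }

    // Checks, over `count` steps, that the low S bits of val equal
    // bits [17, 17+S) of the state, where 2^S is the power of 2 dividing bound.
    static boolean lowBitsLeak(long seed, int bound, int count) {
        int s = Integer.numberOfTrailingZeros(bound);
        long sMask = (1L << s) - 1;
        for (int i = 0; i < count; i++) {
            seed = step(seed);
            int bits = (int) (seed >>> 17);
            int val = bits % bound; // rejection loop ignored
            if ((val & sMask) != ((seed >>> 17) & sMask)) return false;
        }
        return true;
    }

    // Checks that the low 17+S bits of the next state depend only on
    // the low 17+S bits of the current state.
    static boolean lowBitsIndependent(long seed, int bound) {
        int k = 17 + Integer.numberOfTrailingZeros(bound);
        long kMask = (1L << k) - 1;
        return (step(seed) & kMask) == (step(seed & kMask) & kMask);
    }
}
```

For bound 344, Integer.numberOfTrailingZeros gives S=3, and both checks should hold for any starting state.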

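We get the low-order 17 bits of seed by elimination: for each of the 2¹⁷ candidates, combined with the S=3 bits we already know, do these 17+S=20 bits of seed lead to a sequence of val whose low-order S bits match the given sequence? With enough values we can fully determine the 17+S bits. We need ⌈17/S+1⌉ = 7 val to narrow the 17+S low-order bits of seed down to a single value this way. If we have fewer, we need to carry several candidates into the next step. In the question we have more than enough val to narrow down to a single value, and to be convinced we got it right.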

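Then, once we have these 17+S=20 bits of seed, we can find the remaining 31-S=28 with a moderate amount of brute force. We could test the 2²⁸ values of the unknown bits of seed and check which one fully matches the known val. But better: we know seed % (bound<<17) exactly, and thus need to test only 2³¹/bound values of seed (here about 6 million).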
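The two phases can be put together as follows. This is a sketch under the assumptions stated above (even bound, no rejection during the observed calls), not a tested reference implementation; the names SeedRecoverySketch, recover and resume are invented for illustration. It returns the candidate states right after the call that produced vals[0].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SeedRecoverySketch {
    static final long MULTIPLIER = 0x5DEECE66DL;
    static final long ADDEND = 0xBL;
    static final long MASK = (1L << 48) - 1;

    static long step(long seed) {
        return (seed * MULTIPLIER + ADDEND) & MASK;
    }

    // All 48-bit states (taken just after vals[0] was produced) consistent with vals.
    static List<Long> recover(int[] vals, int bound) {
        int s = Integer.numberOfTrailingZeros(bound);
        long sMask = (1L << s) - 1;
        long kMask = (1L << (17 + s)) - 1;
        long stride = (long) bound << 17;
        List<Long> found = new ArrayList<>();
        for (long low17 = 0; low17 < (1L << 17); low17++) {
            // Phase 1: keep low17 only if the low 17+S bits reproduce the low S bits of every val.
            long low = ((vals[0] & sMask) << 17) | low17;
            if (!lowBitsMatch(low, vals, sMask, kMask)) continue;
            // Phase 2: seed % (bound << 17) is known, so walk that residue class.
            for (long seed = ((long) vals[0] << 17) | low17; seed <= MASK; seed += stride) {
                if (fullMatch(seed, vals, bound)) found.add(seed);
            }
        }
        return found;
    }

    static boolean lowBitsMatch(long low, int[] vals, long sMask, long kMask) {
        for (int i = 1; i < vals.length; i++) {
            low = step(low) & kMask;
            if (((low >>> 17) & sMask) != (vals[i] & sMask)) return false;
        }
        return true;
    }

    static boolean fullMatch(long seed, int[] vals, int bound) {
        for (int i = 1; i < vals.length; i++) {
            seed = step(seed);
            if ((int) (seed >>> 17) % bound != vals[i]) return false; // rejection ignored
        }
        return true;
    }

    // Hypothetical helper: a Random whose internal state equals `state`,
    // relying on the documented scrambling (seed ^ 0x5DEECE66DL) & mask.
    static Random resume(long state) {
        return new Random(state ^ MULTIPLIER);
    }

    public static void main(String[] args) {
        int[] vals = {251, 331, 306, 322, 333, 283, 187, 54, 170, 331}; // data from the question
        System.out.println(recover(vals, 344));
    }
}
```

To predict future outputs, advance a recovered state with step once per remaining observed value and pass the result to resume. With an odd bound, S=0 and phase 1 filters nothing, so the same code degenerates into the 2⁴⁸/bound search.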

How can I determine the amount of data I'd need to find the correct seed?

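As a working heuristic for all but pathological LCGs, and for many other PRNGs, you need as much information as there are bits in the state, thus 48 bits. Each output gives you log2(bound) bits, so you need ⌈48/log2(bound)⌉ values, here 6 (which requires keeping track of a few candidates for the low 20 bits of seed, and thus correspondingly more work in the second phase). Extra values give confidence that the actual state was recovered, but AFAIK a wrong guess just won't happen unless the while comes into play.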
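Both counts used in this answer are quick to compute. The sketch below, with made-up names, evaluates ⌈48/log2(bound)⌉ in exact integer arithmetic as the smallest n with bound^n ≥ 2⁴⁸, and ⌈17/S+1⌉ for the low-bit phase.

```java
import java.math.BigInteger;

final class OutputsNeeded {
    // Smallest n with bound^n >= 2^stateBits, i.e. ceil(stateBits / log2(bound)).
    static int minOutputs(int stateBits, int bound) {
        if (bound < 2) throw new IllegalArgumentException("bound must be at least 2");
        BigInteger target = BigInteger.ONE.shiftLeft(stateBits);
        BigInteger reach = BigInteger.ONE;
        int n = 0;
        while (reach.compareTo(target) < 0) {
            reach = reach.multiply(BigInteger.valueOf(bound));
            n++;
        }
        return n;
    }

    // ceil(17/S + 1): outputs needed to pin down the low 17+S bits, for an even bound.
    static int outputsForLowBits(int bound) {
        int s = Integer.numberOfTrailingZeros(bound);
        if (s == 0) throw new IllegalArgumentException("bound must be even");
        return (int) Math.ceil(17.0 / s + 1);
    }

    public static void main(String[] args) {
        System.out.println(minOutputs(48, 344));    // prints 6
        System.out.println(outputsForLowBits(344)); // prints 7
    }
}
```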

Does this change given bounds that are either odd/even?

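The above attack strategy does not work well for an odd bound (we cannot guess the low-order bits separately and need to search 2⁴⁸/bound values of seed). However, there are better attacks that need fewer guesses, remain applicable even if we raise the number of state bits considerably, and also cover an odd bound. They are harder to explain (read: I can barely get them working with a math package, and cannot explain how; see question).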

Regarding "java - Finding the java.util.Random seed from bounded nextInt(int bound) results", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65576946/
