algorithm - When should iterative deepening with alpha-beta pruning and a transposition table terminate?

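How do I know when I can stop increasing the depth of an iterative deepening algorithm that uses negamax with alpha-beta pruning and a transposition table? The following pseudocode is taken from the Wikipedia page: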

function negamax(node, depth, α, β, color)
 alphaOrig := α

 // Transposition Table Lookup; node is the lookup key for ttEntry
 ttEntry := TranspositionTableLookup( node )
 if ttEntry is valid and ttEntry.depth ≥ depth
     if ttEntry.Flag = EXACT
         return ttEntry.Value
     else if ttEntry.Flag = LOWERBOUND
         α := max( α, ttEntry.Value)
     else if ttEntry.Flag = UPPERBOUND
         β := min( β, ttEntry.Value)
     endif
     if α ≥ β
         return ttEntry.Value
 endif

 if depth = 0 or node is a terminal node
     return color * the heuristic value of node

 bestValue := -∞
 childNodes := GenerateMoves(node)
 childNodes := OrderMoves(childNodes)
 foreach child in childNodes
     val := -negamax(child, depth - 1, -β, -α, -color)
     bestValue := max( bestValue, val )
     α := max( α, val )
     if α ≥ β
         break

 // Transposition Table Store; node is the lookup key for ttEntry
 ttEntry.Value := bestValue
 if bestValue ≤ alphaOrig
     ttEntry.Flag := UPPERBOUND
 else if bestValue ≥ β
     ttEntry.Flag := LOWERBOUND
 else
     ttEntry.Flag := EXACT
 endif
 ttEntry.depth := depth 
 TranspositionTableStore( node, ttEntry )

 return bestValue

This is the iterative deepening call:

while(depth < ?)
{
    depth++;
    rootNegamaxValue := negamax( rootNode, depth, -∞, +∞, 1)
}

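Of course, when I know the total number of moves in the game, I can use depth < numberOfMovesLeft as the upper bound. But if that information is not given, how do I know when another negamax call will not give a better result than the previous run? What do I need to change in the algorithm?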

Best answer

The short answer is: when you run out of time (the transposition table is irrelevant to the question and the answer).

Here I assume that your evaluation function is reasonable (it gives a good approximation of the position).

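The main idea of combining iterative deepening with alpha-beta is the following. Suppose you have 15 seconds to come up with the best move. How deep can you search? I don't know, and neither does anyone else. You can try searching to depth = 8, only to find that the search finishes in 1 second (so you wasted the remaining 14 seconds). By trial and error you find that depth = 10 gives you a result in 13 seconds, so you decide to always use it. But then something goes badly wrong (your alpha-beta does not prune well enough, or some positions take too long to evaluate) and your result is not ready within 15 seconds. So you either make a random move or lose the game.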

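To avoid this, it is better to always have a result ready. So you do the following. Find the best result for depth = 1 and store it. Find the best result for depth = 2 and overwrite it. And so on. From time to time, check how much time is left, and if you are really close to the time limit, return your best move.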
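The procedure just described can be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's engine: the toy game (take 1 to 3 stones, whoever takes the last stone wins), the `TimeUp` exception, and the names `legal_moves`, `iterative_deepening`, `time_limit_s` and `max_depth` are all hypothetical. The transposition table is left out, since the answer treats it as irrelevant to the stopping question.

```python
import time

class TimeUp(Exception):
    """Raised inside the search once the deadline has passed."""

def legal_moves(pile):
    # Toy game (assumption): take 1-3 stones; taking the last stone wins.
    return [take for take in (1, 2, 3) if take <= pile]

def negamax(pile, depth, alpha, beta, deadline):
    # "From time to time" check; here simply on every node.
    if time.monotonic() >= deadline:
        raise TimeUp
    if pile == 0:
        return -1  # the opponent took the last stone: side to move has lost
    if depth == 0:
        return 0   # outcome unknown; a real engine would call its heuristic
    best = -float("inf")
    for take in legal_moves(pile):
        val = -negamax(pile - take, depth - 1, -beta, -alpha, deadline)
        best = max(best, val)
        alpha = max(alpha, val)
        if alpha >= beta:
            break
    return best

def iterative_deepening(pile, time_limit_s, max_depth=64):
    deadline = time.monotonic() + time_limit_s
    moves = legal_moves(pile)
    best_move = moves[0]      # always have some answer ready
    completed_depth = 0
    for depth in range(1, max_depth + 1):
        scores = {}
        try:
            for move in moves:
                # Full window at the root so that scores are exact and
                # comparable; a real engine would narrow it.
                scores[move] = -negamax(pile - move, depth - 1,
                                        -float("inf"), float("inf"), deadline)
        except TimeUp:
            break             # discard the unfinished iteration
        # Iteration finished: overwrite the stored result and put the
        # best-scoring moves first for the next, deeper pass.
        moves.sort(key=lambda m: scores[m], reverse=True)
        best_move = moves[0]
        completed_depth = depth
    return best_move, completed_depth

if __name__ == "__main__":
    print(iterative_deepening(7, time_limit_s=1.0, max_depth=10))
```

The `max_depth` cap only guards against looping forever on tiny games; in this sketch the deadline is what normally ends the loop, and an iteration interrupted by the deadline is thrown away so the returned move always comes from a completed depth.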

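Now you don't need to worry about time: your method returns the best result found so far. With all this recomputation of different subtrees you waste only about half of your resources (and that is if you search the whole tree, which with alpha-beta you most likely do not). An additional benefit is that on each depth iteration you can reorder the moves from best to worst, which makes pruning more aggressive.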
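To get a feel for the "half of your resources" figure, here is a small back-of-the-envelope calculation. It rests on assumptions that are not in the answer: a full-width tree with a constant branching factor `b`, and the leaf count `b ** k` used as a rough proxy for the cost of the depth-k iteration. The function name `re_search_share` is made up for this sketch.

```python
def re_search_share(b, d):
    """Fraction of total iterative-deepening work spent on iterations 1..d-1.

    Assumes a full-width tree with constant branching factor b and counts
    only the b**k leaves of each depth-k iteration (a rough proxy).
    """
    total = sum(b ** k for k in range(1, d + 1))
    earlier = total - b ** d  # everything except the final iteration
    return earlier / total

if __name__ == "__main__":
    for b in (2, 3, 10, 35):
        print(b, round(re_search_share(b, 8), 4))
```

Under these assumptions the share of repeated work approaches 1/b as the depth grows, so it is close to one half for b = 2 and much smaller for wider trees. Reading the answer's "half" as this narrow-tree case is an interpretation, not something the answer states.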

Regarding this question, we found a similar one on Stack Overflow: https://stackoverflow.com/questions/33090419/
