python - Does the loop break when checking for duplicates in a list?

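I'm trying to build a list from random results drawn from another list, and I want it to contain no duplicates (with one exception). The problem appears when I check for duplicates: as soon as a duplicate is found, the loop seems to break automatically, and I don't know why.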

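As far as I can tell, everything looks correct. I've run the code through pythontutor.com/visualise, tried different snippets for checking duplicates, and changed the loop to a for loop, a while loop, and a range loop. I tested my definition of turning_point, even copy-pasted it into the loop itself instead of using it as a function, tried moving the "if" statement around, and so on. I've spent a whole evening on this and still can't solve it.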

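Edit: I don't actually want such a large weight on one particular entry ("conclusion" in this example); I only did that to test the duplicate check. In reality the weights are closer to 3, 1, 1, 2, 1, and so on. Also, in my actual code action_table has 43 values, not 7.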


import random

#first list to draw random items from
action_table = ["conclusion", "none", "confrontation", "protector", "crescendo", "destroy the thing", "meta"]

#code to draw random items from first list
#first variable weighted to guarantee duplicates for testing
def turning_point():
    turning_point = (random.choices(action_table, [500, 1, 1, 1, 1, 1, 1]))
    return turning_point

#generate second list called 'plot_points'
plot_points = []
for x in range(int(input("How many Turning Points to generate? "))):
    tp = turning_point()
#the code below is what I got from this site
#added tp != to allow duplicates of "none" result
    if any(plot_points.count(tp) > 1 for tp in plot_points) and tp != "none": 
        continue
#results get added to the plot_points list:
    plot_points.append(tp)
print(plot_points)

If I remove the line that checks for duplicates, this is what I get:

[['conclusion'], ['conclusion'], ['meta'], ['conclusion'], ['conclusion']]

If I keep that line, this is what I get:

[['conclusion'], ['conclusion']]

What I want to get is something like this:

[['conclusion'], ['none'], ['none'], ['meta'], ['protector']]

Best answer

The error is here:

tp != "none"

tp is always a list with a single element, because random.choices() returns a list containing one element by default. From the documentation for random.choices():

random.choices(population, weights=None, *, cum_weights=None, k=1)
    Return a k sized list of elements chosen from the population with replacement.

With k left at 1, tp is a 1-element list every time and can never be equal to "none". It will be equal to ["none"], or ["conclusion"], and so on. That means tp != "none" is always true.
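
A quick way to confirm this is a minimal sketch like the one below. The shortened three-item table and the equal weights are made up purely for illustration:

```python
import random

# Shortened table with equal weights, for illustration only
action_table = ["conclusion", "none", "meta"]

tp = random.choices(action_table, weights=[1, 1, 1])  # k defaults to 1
print(type(tp), len(tp))  # a list holding exactly one element

# A list never compares equal to a string, so this is True for every draw
always_true = tp != "none"

# Indexing with [0] unwraps the single element into a plain string
value = random.choices(action_table, weights=[1, 1, 1])[0]
```

Unwrapping with [0] (or using random.choice() when no weights are needed) gives a string that can be compared against "none" directly.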

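Next, your any() test only kicks in once plot_points holds the same nested list more than once (so at least 2 copies). Note that the generator expression rebinds tp to each element of plot_points, so it never looks at the value you just drew. From that point on every new draw is skipped, because tp != "none" is always true: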

>>> plot_points = [["conclusion"], ["conclusion"]]
>>> tp = ["conclusion"]
>>> any(plot_points.count(tp) > 1 for tp in plot_points)
True
>>> tp != "none"
True
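
What that condition really checks can be sketched as follows. The generator expression inside any() has its own loop variable named tp, which hides the freshly drawn tp, so the test only asks whether anything already in plot_points occurs more than once. The should_skip helper is a hypothetical replacement check, not code from the question:

```python
plot_points = [["conclusion"], ["conclusion"]]
tp = ["meta"]  # a fresh draw that is not in plot_points yet

# The inner `tp` shadows the outer one, so the fresh draw is never inspected.
skipped = any(plot_points.count(tp) > 1 for tp in plot_points) and tp != "none"
print(skipped)  # True, even though ["meta"] has not been used yet


def should_skip(value, seen):
    """Hypothetical check: skip a plain-string draw only if it was already
    picked and is not the repeatable "none" entry."""
    return value != "none" and value in seen
```

With a check along these lines, the draw would be unwrapped first (for example tp[0]) and the results stored as plain strings.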

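The weights you gave make anything other than "conclusion" very, very unlikely to be picked. With your 7 options, ["conclusion"] is picked 500 out of every 506 times you call turning_point(), so the situation above occurs most of the time (976562500000 out of every 1036579476493 experiments produce ["conclusion"] 5 times in a row, or about 33 in every 35 tests). So you will very rarely see any other option produced twice, let alone 3 times (only about 3 in every 64777108 tests repeat any other option 3 times or more).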
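
The five-in-a-row figure can be reproduced with exact fractions. This is only a worked check of the arithmetic, assuming the weights from the question (500 for "conclusion" and 1 for each of the other six entries):

```python
from fractions import Fraction

# Weight of "conclusion" divided by the total weight (500 + 6 * 1 = 506)
p_conclusion = Fraction(500, 500 + 6 * 1)

# Probability of drawing "conclusion" five times in a row
p_five_in_a_row = p_conclusion ** 5

print(p_five_in_a_row)         # 976562500000/1036579476493
print(float(p_five_in_a_row))  # roughly 0.942, close to 33/35
```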

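If you must produce a list in which nothing repeats except "none", then weighting the choices makes no sense. The moment "conclusion" has been picked, you can't pick it again anyway. If the goal is to make it highly likely that a "conclusion" element is part of the result, then handle that as a separate swap at the end, and first just shuffle a list of the remaining choices. After shuffling you can cut the result down to size; the first N elements are all random and unique: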

>>> import random
>>> action_table = ["confrontation", "protector", "crescendo", "destroy the thing", "meta"]
>>> random.shuffle(action_table)  # changes the list order in-place
>>> action_table[:3]
['meta', 'crescendo', 'destroy the thing']

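You can pad that list with "none" elements to make it long enough for the requested length, then insert a "conclusion" at a random position according to the chance that one should be included: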

import random

def plot_points(number):
    action_table = ["none", "confrontation", "protector", "crescendo", "destroy the thing", "meta"]
    if number > len(action_table):
        # add enough `"none"` elements to reach the requested length
        action_table += ["none"] * (number - len(action_table))
    random.shuffle(action_table)
    action_table = action_table[:number]
    if number > 0 and random.random() < 0.8:
        # 80% of the time, overwrite a random slot with "conclusion"
        action_table[random.randrange(number)] = "conclusion"
    return action_table

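Note that this is a pseudo-weighted selection; "conclusion" is picked 80% of the time, and uniqueness is preserved because only "none" is repeated to pad out the result. Otherwise you can't have uniqueness for the other elements.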

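However, if you must have

- unique values in the output list (possibly repeating "none")
- a weighted selection of the inputs

then you want a weighted random sample selection without replacement. You can implement this with the standard Python library: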

import heapq
import math
import random

def weighted_random_sample(population, weights, k):
    """Chooses k unique random elements from a population sequence.

    The probability of items being selected is based on their weight.

    Implementation of the algorithm by Pavlos Efraimidis and Paul
    Spirakis, "Weighted random sampling with a reservoir" in 
    Information Processing Letters 2006. Each element i is selected
    by assigning ids with the formula R^(1/w_i), with w_i the weight
    for that item, and the top k ids are selected. 

    """ 
    if not 0 <= k <= len(population):
        raise ValueError("Sample larger than population or negative")
    if len(weights) != len(population):
        raise ValueError("The number of weights does not match the population")

    key = lambda iw: math.pow(random.random(), 1 / iw[1])
    decorated = heapq.nlargest(k, zip(population, weights), key=key)
    return [item for item, _ in decorated]

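Use weighted_random_sample() to pick your elements if you need 7 items or fewer; otherwise pad with extra "none" values and just shuffle (since all 7 items end up being selected anyway):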

def plot_points(number):
    action_table = ["conclusion", "none", "confrontation", "protector", "crescendo", "destroy the thing", "meta"]

    if number > len(action_table):
        # more items than are available
        # pad out with `"none"` elements and shuffle
        action_table += ["none"] * (number - len(action_table))
        random.shuffle(action_table)
        return action_table

    weights = [3, 1, 1, 1, 2, 2, 1]
    return weighted_random_sample(action_table, weights, number)

Demo:

>>> plot_points(5)
['none', 'conclusion', 'meta', 'crescendo', 'destroy the thing']
>>> plot_points(5)
['conclusion', 'destroy the thing', 'crescendo', 'meta', 'confrontation']
>>> plot_points(10)
['none', 'crescendo', 'protector', 'confrontation', 'meta', 'destroy the thing', 'none', 'conclusion', 'none', 'none']

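Of course, if your real action_table is much larger and you don't allow picking more plot points than you have actions, there is no need to pad anything at all and you can just use weighted_random_sample() directly.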
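
For comparison, weighted sampling without replacement can also be sketched as repeated weighted draws that remove each pick from the pool. This is a simpler, different technique from the Efraimidis and Spirakis approach used above, and draw_without_replacement is a hypothetical helper name, not part of the random module:

```python
import random


def draw_without_replacement(population, weights, k):
    """Pick k unique items, one weighted draw at a time (illustrative sketch)."""
    if not 0 <= k <= len(population):
        raise ValueError("Sample larger than population or negative")
    if len(weights) != len(population):
        raise ValueError("The number of weights does not match the population")

    # Work on copies so the caller's lists are left untouched
    items = list(population)
    remaining_weights = list(weights)
    picked = []
    for _ in range(k):
        # random.choices returns a 1-element list, so take element 0
        index = random.choices(range(len(items)), weights=remaining_weights)[0]
        picked.append(items.pop(index))
        remaining_weights.pop(index)
    return picked


sample = draw_without_replacement(
    ["conclusion", "confrontation", "protector", "crescendo"], [3, 1, 1, 2], 3
)
print(sample)
```

Each draw costs time proportional to the size of the remaining pool, so this version is mainly useful for small tables or as a reference to compare against.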

This answer to "python - Does the loop break when checking for duplicates in a list?" comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/56924413/
