go - Randomly selecting an element from the slices generated from a map over a restricted key range in Go. Is there an O(1) shortcut?

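In my program, which simulates many-particle evolution, I have a map that takes a key pop (the population size) and returns a slice containing the sites that have that population: myMap[pop][]int. These slices are generally large.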

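At each evolution step I choose a random population size RandomPop. I then want to randomly pick a site whose population is at least RandomPop. The sitechosen is used to update my population structure, and I use a second map to update the myMap keys efficiently. My current (slow) implementation looks like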

func Evolve( ..., myMap map[int][]int ,...){

    RandomPop = rand.Intn(rangeofpopulation)+1

    for i:=RandomPop; i<rangeofpopulation;i++{
        preallocatedslice=append(preallocatedslice,myMap[i]...)
    }

    randomindex:= rand.Intn(len(preallocatedslice))
    sitechosen= preallocatedslice[randomindex]

    UpdateFunction(sitechosen)

    //reset preallocated slice 
    preallocatedslice=preallocatedslice[0:0]

}

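This code (obviously) hits a huge bottleneck when copying the values from the map into the preallocated slice: runtime.memmove accounts for 87% of my CPU usage. Is there an O(1) way to randomly pick an entry contained in the union of the slices indicated by myMap with keys between 0 and RandomPop? I'm open to packages that let you manipulate custom hash tables, if anyone knows of any. Suggestions don't need to be safe for concurrent use.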

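Other things I've tried: I previously had my map record all sites with a value of at least pop, but that took >10GB of memory and was silly. I tried storing pointers to the relevant slices to build a lookup slice, but Go forbids this. I could sum the lengths of the slices, generate a random number based on that, and then walk through the slices in myMap by length, but that is much slower than simply keeping an updated CDF of my populations and doing a binary search on it. The binary search is fast, but updating the CDF, even when done manually, is O(n). I was really hoping to abuse hash tables to speed up both random selection and updating, if possible.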
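The CDF-with-binary-search approach described above pays O(n) per update. One standard way to bring both the lookup and the update down to O(log n) is a Fenwick tree (binary indexed tree) over the bucket sizes. This is not the O(1) shortcut asked for, and it is not from the original post: it is a minimal sketch, assuming keys run from 1 to maxPop inclusive and that the caller calls Add whenever a site enters or leaves a bucket. The names Fenwick, pickSite, and intn are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Fenwick is a binary indexed tree over keys 1..n that stores bucket sizes.
// (Hypothetical helper type, not part of the original post.)
type Fenwick struct {
	tree []int
}

// NewFenwick creates a tree for keys 1..n with all counts set to zero.
func NewFenwick(n int) *Fenwick { return &Fenwick{tree: make([]int, n+1)} }

// Add adds delta to the count stored at key k (1-based) in O(log n).
func (f *Fenwick) Add(k, delta int) {
	for ; k < len(f.tree); k += k & -k {
		f.tree[k] += delta
	}
}

// Prefix returns the sum of the counts for keys 1..k in O(log n).
func (f *Fenwick) Prefix(k int) int {
	s := 0
	for ; k > 0; k -= k & -k {
		s += f.tree[k]
	}
	return s
}

// Find returns the smallest key whose prefix sum exceeds target, together
// with the offset of target inside that key's bucket.
// It assumes 0 <= target < Prefix(n).
func (f *Fenwick) Find(target int) (key, offset int) {
	step := 1
	for step*2 < len(f.tree) {
		step *= 2
	}
	pos := 0
	for ; step > 0; step >>= 1 {
		next := pos + step
		if next < len(f.tree) && f.tree[next] <= target {
			pos = next
			target -= f.tree[next]
		}
	}
	return pos + 1, target
}

// pickSite picks a site uniformly among all sites stored under keys
// minPop..maxPop (inclusive) without copying any bucket.
// intn must behave like rand.Intn.
func pickSite(f *Fenwick, buckets map[int][]int, minPop, maxPop int, intn func(int) int) (int, bool) {
	below := f.Prefix(minPop - 1)
	total := f.Prefix(maxPop) - below
	if total == 0 {
		return 0, false // no site has a population in the requested range
	}
	key, off := f.Find(below + intn(total))
	return buckets[key][off], true
}

func main() {
	// Toy data: population size -> sites with that population.
	buckets := map[int][]int{1: {10, 11}, 3: {20}, 4: {30, 31, 32}}
	const maxPop = 4

	f := NewFenwick(maxPop)
	for pop, sites := range buckets {
		f.Add(pop, len(sites))
	}

	// Pick a site whose population is at least 3.
	if site, ok := pickSite(f, buckets, 3, maxPop, rand.Intn); ok {
		fmt.Println("chosen site:", site)
	}
}
```

When a site moves from population a to population b, the matching bookkeeping would be f.Add(a, -1) and f.Add(b, 1) alongside the change to the two slices, so the tree never has to be rebuilt.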

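A vague idea I have is to concoct some kind of nested structure of maps that point to their contents and also to the maps with keys smaller than theirs, or something like that.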

Best answer

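I have a question after looking at your code. Why do you have to copy the values from the map into the slice? I mean, I think I follow the logic behind it... but I wonder whether there is a way to skip that step.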

So we have:

func Evolve( ..., myMap map[int][]int ,...){

    RandomPop = rand.Intn(rangeofpopulation)+1

    for i:=RandomPop; i<rangeofpopulation;i++{
        // Pool of preselected `sites`; one of them will be `sitechosen`.
        // The loop visits `n` keys, where `n` is the number of iterations,
        // i.e. n = rangeofpopulation - RandomPop,
        // and every key may contribute many sites to `preallocatedslice`.
        preallocatedslice=append(preallocatedslice,myMap[i]...) 
    }

    // Once we have the list of sites, we select one of them.
    // Under a uniform distribution every site has a chance of
    // 1/len(preallocatedslice) of being selected.
    randomindex:= rand.Intn(len(preallocatedslice))
    sitechosen= preallocatedslice[randomindex]

    UpdateFunction(sitechosen)
    ...

}

But what if we change it to:

func Evolve( ..., myMap map[int][]int ,...){

    if len(myMap) == 0 {
        // Nothing to do, print a log! 
        return
    }

    // This variable will hold the chosen candidate: the slice of sites
    // stored under a single key of myMap.
    var siteChosen []int

    // Our random population size is a value from 1 to rangeOfPopulation 
    randPopSize := rand.Intn(rangeOfPopulation) + 1

    for i := randPopSize; i < rangeOfPopulation; i++ {
        // We are going to pretend that the current candidate is the siteChosen 
        siteChosen = myMap[i]

        // Now, instead of copying `myMap[i]` to preAllocatedSlice,
        // we test whether the current candidate is actually the `siteChosen` here.

        // We treat each candidate as having a 1/n chance of being chosen,
        // where n = rangeOfPopulation - randPopSize
        n := float64(rangeOfPopulation - randPopSize)
        // we roll the dice: this is true with probability 1/n
        isTheChosenOne := rand.Float64() < 1/n

        if isTheChosenOne {
            // If the candidate is the chosen one,
            // then we don't need to iterate over all the other elements.
            break
        }

    }

    // Here `siteChosen` is either a.- a selected candidate, or
    // b.- the last element assigned in the loop
    // (when `isTheChosenOne` was always false, which is a probable scenario),
    // or still nil if the loop body never ran.
    UpdateFunction(siteChosen)
    ...
}

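Also, you may want to compute n, or 1/n, outside the loop. The idea is to test inside the loop whether the candidate is the siteChosen, and to avoid copying the candidates into this preselection pool.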
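If the goal is to keep the selection uniform over individual sites, as in the question's original code, while still avoiding the copy, one possible variant is to draw a single index over the combined length of the buckets and then walk the buckets until that index falls inside one. This is only a sketch and not the answer's code: pickWithoutCopy and intn are hypothetical names, and the key range minPop..maxPop is assumed to be inclusive (pass rangeofpopulation-1 as maxPop to mirror the loop above). It still visits every key in the range, so it is O(number of keys) rather than O(1), but no site is ever copied.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickWithoutCopy returns a site chosen uniformly from all sites stored under
// keys minPop..maxPop (inclusive), without building a merged slice.
// intn must behave like rand.Intn. The second result is false when the
// range contains no sites.
func pickWithoutCopy(buckets map[int][]int, minPop, maxPop int, intn func(int) int) (int, bool) {
	// First pass: count the candidate sites.
	total := 0
	for k := minPop; k <= maxPop; k++ {
		total += len(buckets[k])
	}
	if total == 0 {
		return 0, false
	}

	// Second pass: locate the r-th candidate by skipping whole buckets.
	r := intn(total)
	for k := minPop; k <= maxPop; k++ {
		if r < len(buckets[k]) {
			return buckets[k][r], true
		}
		r -= len(buckets[k])
	}
	return 0, false // not reached when total > 0
}

func main() {
	// Toy data: population size -> sites with that population.
	buckets := map[int][]int{1: {10, 11}, 3: {20}, 4: {30, 31, 32}}

	// Pick a site whose population is between 3 and 4.
	if site, ok := pickWithoutCopy(buckets, 3, 4, rand.Intn); ok {
		fmt.Println("chosen site:", site)
	}
}
```

Each site in the range is selected with probability 1/total, which matches what indexing into the merged slice would give.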

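Regarding "go - Randomly selecting an element from the slices generated from a map over a restricted key range in Go. Is there an O(1) shortcut?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52611168/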
