Pythonic way to replace specific values with other values in a numpy array at particular indices

Tags: python arrays numpy

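I have seen answers to each part of my question. For example, np.where(arr == b, c, arr) converts all b's in arr to c, and arr[arr == b] = c does the same in place. However, I have 1000 labels in a numpy array labels_test, each either 1 or 6. I want to flip 30% of the correct labels to wrong labels to generate an erroneous dataset. So I created the following list of indices that should be changed.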

l = [np.random.choice(1000) for x in range(100)] (I am not sure whether each index appears only once)

I want something like

np.put(labels_test, l, if labels_test[l] == 1 then 6, and if labels_test[l] == 6 then 1)

We can do this for the toy example below:

np.random.seed(1)
labels_test = [np.random.choice([1,6]) for x in range(20)]
[6, 6, 1, 1, 6, 6, 6, 6, 6, 1, 1, 6, 1, 6, 6, 1, 1, 6, 1, 1]

Best answer

Here is one way:

>>> import numpy as np
>>> labels_test = np.random.choice([1, 6], 20)
>>> ind = np.random.choice(labels_test.shape[0], labels_test.shape[0]//3, replace=False)
>>> labels_test
array([1, 6, 1, 1, 6, 1, 1, 1, 6, 1, 1, 1, 6, 6, 6, 6, 6, 1, 1, 1])
>>> labels_test[ind] = 7 - labels_test[ind]
>>> labels_test
array([1, 6, 1, 6, 6, 6, 1, 1, 6, 1, 6, 1, 1, 6, 1, 6, 6, 1, 1, 6])

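This flips one third of the labels, rounded down (labels_test.shape[0]//3), by sampling without replacement, so no index is chosen twice. For the 20 labels above that is 6 labels, which happens to be exactly 30%. To flip 30% of an array of any length, use int(round(0.3 * labels_test.shape[0])) as the sample size instead. The expression 7 - labels_test[ind] swaps the two labels because 1 + 6 = 7. Depending on your requirements, a suitable alternative may be to flip each label independently with probability 0.3.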
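
The two options mentioned above (flipping an exact fraction, or flipping each label with a fixed probability) could be sketched roughly as follows. This is only an illustration, assuming the array holds exactly two label values. The function names flip_exact and flip_bernoulli and the parameters a, b, fraction and p are hypothetical and not part of the original answer, and the sketch uses NumPy's Generator API (np.random.default_rng) rather than the np.random.* calls shown in the answer.

```python
import numpy as np


def flip_exact(labels, fraction, a=1, b=6, rng=None):
    """Return a copy with round(fraction * n) randomly chosen labels swapped a <-> b."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels).copy()
    n_flip = int(round(fraction * labels.size))
    # replace=False guarantees that every index is picked at most once
    idx = rng.choice(labels.size, size=n_flip, replace=False)
    # a + b - x maps a -> b and b -> a (this is 7 - x for the labels 1 and 6)
    labels[idx] = a + b - labels[idx]
    return labels


def flip_bernoulli(labels, p, a=1, b=6, rng=None):
    """Return a copy in which each label is swapped independently with probability p."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels).copy()
    # Boolean mask: True where the label should be flipped
    mask = rng.random(labels.size) < p
    labels[mask] = a + b - labels[mask]
    return labels


rng = np.random.default_rng(0)  # fixed seed only to make the demo repeatable
original = rng.choice([1, 6], size=1000)

exact = flip_exact(original, 0.3, rng=rng)
noisy = flip_bernoulli(original, 0.3, rng=rng)

n_exact = int((exact != original).sum())  # exactly 300 labels differ
n_noisy = int((noisy != original).sum())  # close to 300, but varies from run to run
```

With flip_exact the number of changed labels is fixed in advance. With flip_bernoulli only the expected number is fixed, which may be closer to how label noise arises in practice. Both functions return a copy and leave the input array untouched.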

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/58470227/
