python - Numpy.genfromtxt deletes square brackets in dtype.names

I am trying to read data from a file using numpy.genfromtxt. I set the names parameter to a comma-separated list of strings, for example

names = ['a', '[b]', 'c']

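However, when the array is returned, dtype.names is ('a', 'b', 'c')

The deletechars parameter is either left unset or explicitly forced to None. I have checked that creating a numpy.ndarray with a dtype whose named columns contain square brackets preserves the brackets, so it must be genfromtxt that removes them. Is there a way to turn off this unexpected feature?

Note that this behavior also occurs if the names parameter is set to True. I have tested this in numpy versions 1.6.1 and 1.9.9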

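Best answer

I have raised this field-name mangling behavior before on the numpy issue tracker and on the mailing list. It has also come up in several previous questions on SO.

In fact, np.genfromtxt mangles field names by default even if you specify them directly by passing a list of strings as the names= argument: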

import numpy as np
from io import BytesIO

s = '[5],name with spaces,(x-1)!\n1,2,3\n4,5,6'

# BytesIO needs bytes on Python 3, hence s.encode()
x = np.genfromtxt(BytesIO(s.encode()), delimiter=',', names=True)
print(repr(x))
# array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)], 
#       dtype=[('5', '<f8'), ('name_with_spaces', '<f8'), ('x1', '<f8')])

# note: s is split on commas only, so the third name is '(x-1)!\n1'
names = s.split(',')[:3]
x = np.genfromtxt(BytesIO(s.encode()), delimiter=',', skip_header=1, names=names)
print(repr(x))
# array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)], 
#       dtype=[('5', '<f8'), ('name_with_spaces', '<f8'), ('x1\n1', '<f8')])

This happens despite the fact that field names containing non-alphanumeric characters are perfectly legal:

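# dtype was not defined in the snippet; this is an assumed reconstruction
# that matches the output shown below
dtype = [(name, '<f4') for name in names]
x2 = np.empty(2, dtype=dtype)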
x2[:] = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
print(repr(x2))
# array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)], 
#       dtype=[('[5]', '<f4'), ('name with spaces', '<f4'), ('(x-1)!\n1', '<f4')])

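The logic of this behavior escapes me.

As you have seen, passing None as the deletechars= argument is not enough to prevent this, because None is replaced internally by a default set of characters defined in numpy.lib._iotools.NameValidator.

However, you can pass an empty sequence instead: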
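To make the difference between None and an empty sequence concrete, here is a minimal sketch that calls the validator directly. It assumes the private import path numpy.lib._iotools is available in your NumPy version (it is an internal module and may move), and the sample names are made up for illustration.

```python
# Assumption: NameValidator lives in the private module numpy.lib._iotools;
# being internal, its location is not guaranteed across NumPy versions.
from numpy.lib._iotools import NameValidator

# Hypothetical sample names, chosen to include brackets and spaces
names = ['a', '[b]', 'name with spaces']

# deletechars=None falls back to the validator's built-in default character set
default_names = NameValidator(deletechars=None).validate(names)

# An empty string supplies an empty set, so the brackets survive
# (spaces are still replaced by '_' through the separate replace_space setting)
empty_names = NameValidator(deletechars='').validate(names)

print(default_names)
print(empty_names)
```

In this sketch only the second call should keep '[b]' intact, which mirrors what genfromtxt does with the same deletechars values.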

x = np.genfromtxt(BytesIO(s.encode()), delimiter=',', names=True, deletechars='')
print(repr(x))
# array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)], 
#       dtype=[('[5]', '<f8'), ('name_with_spaces', '<f8'), ('(x-1)!', '<f8')])

This can be an empty string, list, tuple, and so on. It does not matter which, as long as its length is zero.
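As a quick check against the names from the question, the sketch below reads the same data with several zero-length values for deletechars. The inline data string and the variable names are made up for illustration, and it assumes a NumPy version where genfromtxt accepts a StringIO source.

```python
import numpy as np
from io import StringIO

# Hypothetical stand-in for the file contents (no header row)
data = "1,2,3\n4,5,6"
names = ['a', '[b]', 'c']

# Default behavior: the brackets are stripped from the field names
default_names = np.genfromtxt(StringIO(data), delimiter=',', names=names).dtype.names
print('default', default_names)

# Any zero-length sequence should leave the names untouched
results = {}
for empty in ('', [], ()):
    arr = np.genfromtxt(StringIO(data), delimiter=',', names=names,
                        deletechars=empty)
    results[type(empty).__name__] = arr.dtype.names
    print(type(empty).__name__, arr.dtype.names)
```

Note that spaces are controlled by a separate replace_space parameter of genfromtxt (default '_'), which is why name_with_spaces still appears in the output above even with deletechars=''.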

Source: the Stack Overflow question "python - Numpy.genfromtxt deletes square brackets in dtype.names": https://stackoverflow.com/questions/35234643/
