python - Getting genfromtxt/loadtxt to ignore the data type of skipped columns/rows

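I have a file of integer data in which the first row and the first column are used for names.

I would like to use genfromtxt or loadtxt and still have numpy read the data as a homogeneous array. To that end I tried the skiprows and usecols options, but they did not help. In the (working) example below, I would expect print(test_array.shape) to give (3, 3) and print(test_array) to give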

[[0 0 0]
 [0 1 0]
 [1 0 0]]

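Is there any way to get what I want without trimming the first row and first column with Unix tools before loading the file? Note that the actual file I want to load is big (about 6 GB), so any solution should not be too computationally expensive.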

from __future__ import print_function
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
import numpy as np

example_file = StringIO("FID 1 2 3\n11464_ATCACG 0 0 0\n11465_CGATGT 0 1 0\n11466_TTAGGC 1 0 0")
test_array = np.loadtxt(example_file, skiprows=1, usecols=(1,), dtype=int)

print(test_array.shape) #(3,)
print(test_array) #[0 0 1]

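Best answer

You can use the usecols and skip_header arguments of np.genfromtxt, and then it works as expected. The output below is floating point because genfromtxt defaults to dtype=float; pass dtype=int to get integers: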

>>> test_array = np.genfromtxt(example_file, skip_header=1, usecols=(1,2,3))
>>> print(test_array)
[[ 0.  0.  0.]
 [ 0.  1.  0.]
 [ 1.  0.  0.]]
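
The same idea also works with np.loadtxt, as long as usecols lists every data column rather than only the first one. The following is a minimal sketch, not the answerer's code: the helper name load_int_block is hypothetical, and it assumes the header row has exactly one field per column, so that the number of data columns can be counted from it instead of being hard-coded.

```python
from io import StringIO

import numpy as np

# Same sample data as in the question: a header row plus a leading name column.
data = "FID 1 2 3\n11464_ATCACG 0 0 0\n11465_CGATGT 0 1 0\n11466_TTAGGC 1 0 0"


def load_int_block(fileobj, dtype=int):
    """Hypothetical helper: load everything except the first row and column."""
    # Consume the header line ourselves and use it to count the columns.
    ncols = len(fileobj.readline().split())
    # Column 0 holds the names, so only columns 1..ncols-1 are parsed.
    # The header is already consumed, so skiprows is not needed here.
    return np.loadtxt(fileobj, usecols=tuple(range(1, ncols)), dtype=dtype)


test_array = load_int_block(StringIO(data))
print(test_array.shape)
print(test_array)

# The genfromtxt call from the answer, with dtype=int to avoid floats.
test_array_gen = np.genfromtxt(StringIO(data), skip_header=1,
                               usecols=(1, 2, 3), dtype=int)
```

For a file of several gigabytes, passing a smaller integer type such as dtype=np.int8 may reduce memory use considerably, assuming every value fits in that type.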

For more on python - getting genfromtxt/loadtxt to ignore the data type of skipped columns/rows, see the similar question on Stack Overflow: https://stackoverflow.com/questions/19616884/
