Suppose I only have the string representation of a numpy.array:
>>> import numpy as np
>>> arr = np.random.randint(0, 10, (4, 4))
>>> print(arr) # this one!
[[9 4 7 3]
 [1 6 4 2]
 [6 7 6 0]
 [0 5 6 7]]
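How can I convert this back into a numpy array? Inserting the , by hand is not complicated, but I am looking for a programmatic approach.
A simple regular expression that replaces whitespace with , actually works for single-digit integers: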
>>> import re
>>> sub = re.sub(r'\s+', ',', """[[8 6 2 4 0 2]
...  [3 5 8 4 5 6]
...  [4 6 3 3 0 3]]
... """)
>>> sub
'[[8,6,2,4,0,2],[3,5,8,4,5,6],[4,6,3,3,0,3]],' # the trailing "," is a bit annoying
It can be converted into an almost identical array (the dtype may be lost, but that is fine):
>>> import ast
>>> np.array(ast.literal_eval(sub)[0])
array([[8, 6, 2, 4, 0, 2],
       [3, 5, 8, 4, 5, 6],
       [4, 6, 3, 3, 0, 3]])
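The remark about the dtype can be made concrete. The printed form does not carry the dtype, so np.array infers one from the parsed Python values. Here is a minimal sketch, assuming the original dtype is known from somewhere else (int8 is chosen purely for illustration):

```python
import ast
import re

import numpy as np

# Printed form of a small integer array; the original dtype is not part of the text.
text = "[[8 6 2]\n [3 5 8]]"

# Same whitespace-to-comma substitution as above; strip() guards against a
# trailing newline turning into a trailing comma.
as_list = ast.literal_eval(re.sub(r'\s+', ',', text.strip()))

inferred = np.array(as_list)                 # NumPy picks its default integer dtype
restored = np.array(as_list, dtype=np.int8)  # assumption: the original dtype was int8

print(inferred.dtype, restored.dtype)
```

The inferred integer dtype is platform dependent, so passing dtype explicitly is the only way to get a specific one back.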
But it fails for multi-digit integers and floats:
>>> re.sub(r'\s+', ',', """[[ 0. 1. 6. 9. 1. 4.]
...  [ 4. 8. 2. 3. 6. 1.]]
... """)
'[[,0.,1.,6.,9.,1.,4.],[,4.,8.,2.,3.,6.,1.]],'
because the padding whitespace after each opening bracket also turns into a , there.
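The solution does not have to be regex-based; any other approach that works for unabridged (not shortened with ...) bool/int/float/complex arrays with 1 to 4 dimensions would be fine.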
Best answer
Here is a very manual solution:
import re
import numpy

def parse_array_str(array_string):
    tokens = re.findall(r'''            # Find all...
                            \[ |        # opening brackets,
                            \] |        # closing brackets, or
                            [^\[\]\s]+  # sequences of other non-whitespace characters''',
                        array_string,
                        flags=re.VERBOSE)
    tokens = iter(tokens)

    # Chomp first [, handle case where it's not a [
    first_token = next(tokens)
    if first_token != '[':
        # Input must represent a scalar
        if next(tokens, None) is not None:
            raise ValueError("Can't parse input.")
        return float(first_token)  # or int(token), but not bool(token) for bools

    list_form = []
    stack = [list_form]
    for token in tokens:
        if token == '[':
            # enter a new list
            stack.append([])
            stack[-2].append(stack[-1])
        elif token == ']':
            # close a list
            stack.pop()
        else:
            stack[-1].append(float(token))  # or int(token), but not bool(token) for bools

    if stack:
        raise ValueError("Can't parse input - it might be missing text at the end.")

    return numpy.array(list_form)
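The comments in the parser note that float(token) can be swapped for int(token), but that bool(token) does not work for boolean arrays: bool('False') is True, because any non-empty string is truthy. One possible way to cover bool/int/float/complex with a single converter is sketched below. The name convert_token is a hypothetical helper, not part of the answer; it would take the place of the float(token) calls.

```python
def convert_token(token):
    """Turn one printed array element into a Python scalar (hypothetical helper)."""
    # Booleans are printed as the words True / False, so compare the text itself.
    if token == 'True':
        return True
    if token == 'False':
        return False
    # Try the narrowest numeric type first, then widen.
    for cast in (int, float, complex):
        try:
            return cast(token)
        except ValueError:
            pass
    raise ValueError("Can't parse token: %r" % token)


# A few tokens of the kind found in printed integer, float, complex and bool arrays.
print([convert_token(t) for t in ['12', '0.', '1.+2.j', 'False']])
```

Trying int before float keeps integer arrays integral, while a token such as 0. falls through to float.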
Alternatively, a manual solution based on detecting where to insert the commas:
import ast
import re
import numpy

# Example input: the printed form of a 2-D float array
array_string = """[[ 0.  1.  6.]
 [ 4.  8.  2.]]"""

pattern = r'''# Match (mandatory) whitespace between...
              (?<=\]) # ] and
              \s+
              (?= \[) # [, or
              |
              (?<=[^\[\]\s])
              \s+
              (?= [^\[\]\s]) # two non-bracket non-whitespace characters
           '''

# Replace such whitespace with a comma
fixed_string = re.sub(pattern, ',', array_string, flags=re.VERBOSE)

output_array = numpy.array(ast.literal_eval(fixed_string))
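To check that the comma-insertion approach handles the cases where the naive substitution broke down, it can be wrapped in a function and run as a round trip. This is only a sketch: parse_with_commas is a hypothetical name, and the sample values are chosen so that the default print precision does not truncate them.

```python
import ast
import re

import numpy as np

# Same idea as the pattern above: whitespace between "]" and "[",
# or between two non-bracket, non-whitespace characters.
PATTERN = r'''
    (?<=\]) \s+ (?=\[)
    |
    (?<=[^\[\]\s]) \s+ (?=[^\[\]\s])
'''


def parse_with_commas(array_string):
    """Hypothetical wrapper: insert the missing commas, then let literal_eval parse."""
    fixed = re.sub(PATTERN, ',', array_string.strip(), flags=re.VERBOSE)
    return np.array(ast.literal_eval(fixed))


# Round trip: array -> printed string -> array.
samples = [
    np.array([[5, 100], [42, 7]]),             # multi-digit integers
    np.array([[0.5, 1.0], [4.0, 8.25]]),       # floats
    np.array([[True, False], [False, True]]),  # booleans
]
for original in samples:
    assert np.array_equal(parse_with_commas(str(original)), original)
```

Padding after an opening bracket or before a closing one is left alone, and ast.literal_eval tolerates it, so no leading commas appear.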
Regarding python - parsing the string representation of a numpy array, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43879345/