python - How do I recursively split a 2D array into a tensor?

I have a DataFrame whose index consists of tuples of length 2:

          1   2  -1
(0, 1)    0   1   0
(0, 2)    1   0   0
(0, -1)   0   0   0
(1, 1)    1   0   0
(1, 2)    0   1   0
(1, -1)   1   1   1

I converted it to a 2D numpy array and managed to split it into a 3D array (along the first index value) with a split function:

arr = np.array(np.array_split(arr,2))

Result:

[[[0 1 0]
 [1 0 0]
 [0 0 0]]

[[1 0 0]
 [0 1 0]
 [1 1 1]]]
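
For reference, the steps described so far can be reproduced with the short sketch below. How the original DataFrame was built is not shown in the question, so the construction here (the `index` and `data` lists and the use of `pd.MultiIndex.from_tuples`) is an assumption chosen to match the printed table.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the table shown in the question.
index = [(0, 1), (0, 2), (0, -1), (1, 1), (1, 2), (1, -1)]
data = [[0, 1, 0], [1, 0, 0], [0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1]]
df = pd.DataFrame(data, index=pd.MultiIndex.from_tuples(index), columns=[1, 2, -1])

# 2D array of shape (6, 3), then split into 2 blocks along the first axis.
arr = df.to_numpy()
arr = np.array(np.array_split(arr, 2))
print(arr.shape)  # (2, 3, 3)
```

The split only lines up with the first index value because the rows are already grouped by that value and every group has the same number of rows.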

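I would like to write a function that takes the split further, for example building a 5D tensor from an index of tuples of length 4 such as (0,0,0,0).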

Any ideas on how to do this recursively?

Best answer

Generate sample data with the following code:

import pandas as pd
import numpy as np
import itertools

def create_fake_data_frame(nlevels = 2, ncols = 3):
    result = pd.DataFrame(
        index=itertools.product(*(nlevels * [[0, 1]])),
        data=np.arange(ncols*2**nlevels).reshape(2**nlevels, ncols)
    )
    result = convert_index_of_tuples_to_multiindex(result)
    return result

def convert_index_of_tuples_to_multiindex(df):
    return df.set_index(pd.MultiIndex.from_tuples(df.index))

# Increase nlevels to get dataframes with more levels in their MultiIndex
df = create_fake_data_frame(nlevels=3)
print(df)

Here is the result:

        0   1   2
0 0 0   0   1   2
    1   3   4   5
  1 0   6   7   8
    1   9  10  11
1 0 0  12  13  14
    1  15  16  17
  1 0  18  19  20
    1  21  22  23
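
Before reshaping anything, it can help to check what shape the final tensor should have. The sketch below builds an equivalent frame with `pd.MultiIndex.from_product` (an assumption; the answer itself goes through `itertools.product`) and reads the target shape off the index. It assumes every index combination occurs exactly once.

```python
import numpy as np
import pandas as pd

nlevels, ncols = 3, 3
index = pd.MultiIndex.from_product(nlevels * [[0, 1]])
df = pd.DataFrame(
    np.arange(ncols * 2**nlevels).reshape(2**nlevels, ncols),
    index=index,
)

print(df.index.nlevels)   # 3
print(df.index.levshape)  # (2, 2, 2)

# One axis per index level, plus one axis for the columns.
target_shape = df.index.levshape + (df.shape[1],)
print(target_shape)       # (2, 2, 2, 3)
```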

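Next, modify the DataFrame so that each row has a single column whose value is the list of values from the corresponding row of the original DataFrame: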

def data_frame_with_single_column_of_lists(df):
    if len(df.columns) <= 1:
        return df
    result = df.apply(collapse_columns_into_lists, axis=1)
    return result

def collapse_columns_into_lists(s):
    result = s.copy()
    result['lists'] = result.values.tolist()
    result = result[['lists']]
    return result

df = data_frame_with_single_column_of_lists(df)
print(df)

The output will look like this:

              lists
0 0 0     [0, 1, 2]
    1     [3, 4, 5]
  1 0     [6, 7, 8]
    1   [9, 10, 11]
1 0 0  [12, 13, 14]
    1  [15, 16, 17]
  1 0  [18, 19, 20]
    1  [21, 22, 23]
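
A possible shortcut for this step, shown only as a sketch and not as part of the original answer: the same single-column frame can be built without a row-wise `apply` by converting the values to nested lists in one call. The variable name `lists` is chosen here for illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(3 * [[0, 1]])
df = pd.DataFrame(np.arange(24).reshape(8, 3), index=index)

# Each row of the 2D array becomes one Python list in the 'lists' column.
lists = pd.Series(df.to_numpy().tolist(), index=df.index, name="lists").to_frame()
print(lists)
```

This avoids calling a Python function once per row, which may matter for larger frames.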

Finally, obtain the tensor with the following code:

def increase_list_nesting_by_removing_an_index_level(df):
    def list_of_lists(series):
        result = series.to_frame().set_index(series.index.droplevel(-1))
        result = result.apply(lambda x: x['lists'], axis=1).to_frame()
        result = [x[0] for x in result.values.tolist()]
        return result
    grouped = df.groupby(df.index.droplevel(-1))
    result = grouped.agg(list_of_lists)
    if isinstance(result.index[0], tuple):
        result = convert_index_of_tuples_to_multiindex(result)
    return result

def tensor_from_data_frame(df):
    if df.index.nlevels <= 1:
        return np.array([i[0] for i in df.values])

    result = increase_list_nesting_by_removing_an_index_level(df)
    result = tensor_from_data_frame(result)
    return result

tensor = tensor_from_data_frame(df)
print(tensor)

The result looks like this:

[[[[ 0  1  2]
   [ 3  4  5]]

  [[ 6  7  8]
   [ 9 10 11]]]


 [[[12 13 14]
   [15 16 17]]

  [[18 19 20]
   [21 22 23]]]]
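
Since the question asked for a recursive version of the `np.array_split` approach, here is a minimal sketch of one that works on the numpy array directly. The function name `split_recursively` is hypothetical. It assumes the index is a full Cartesian product (every combination present exactly once) and that the rows are sorted by index. Under those assumptions a single `reshape` gives the same tensor, which the last lines check.

```python
import numpy as np
import pandas as pd

def split_recursively(arr, level_sizes):
    """Split the first axis of `arr` once for each entry in `level_sizes`."""
    if not level_sizes:
        return arr
    # Cut into level_sizes[0] equal blocks, then recurse into each block.
    parts = np.array_split(arr, level_sizes[0])
    return np.array([split_recursively(p, level_sizes[1:]) for p in parts])

index = pd.MultiIndex.from_product(3 * [[0, 1]])
df = pd.DataFrame(np.arange(24).reshape(8, 3), index=index).sort_index()

# The last index level is left as the row axis of the innermost blocks.
level_sizes = df.index.levshape[:-1]
tensor = split_recursively(df.to_numpy(), level_sizes)
print(tensor.shape)  # (2, 2, 2, 3)

# Non-recursive equivalent under the same assumptions.
same = df.to_numpy().reshape(*df.index.levshape, -1)
print(np.array_equal(tensor, same))  # True
```

If some index combinations are missing, neither variant applies as written, because the blocks would no longer have equal sizes.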

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/63661000/
