python - 如何根据条件将数据框列拆分为单独的列

我正在尝试将以下数据框拆分为单独的列。我希望一列中的所有文本和数字在空白处分开。

df[0].head(10)

0                                                   []
1               [Andaman and Nicobar, 194, 52, 142, 0]
2        [Andhra Pradesh, 40,646, 19,814, 20,298, 534]
3                [Arunachal Pradesh, 609, 431, 175, 3]
4                   [Assam, 20,646, 6,490, 14,105, 51]
5                  [Bihar, 23,589, 8,767, 14,621, 201]
6                      [Chandigarh, 660, 169, 480, 11]
7              [Chhattisgarh, 4,964, 1,429, 3,512, 23]
8    [Dadra and Nagar Haveli and Daman, 585, 182, 4...
9                          [Daman and Diu, 0, 0, 0, 0]
Name: 0, dtype: object

如果我只在空白处拆分并展开，虽然数字被正确拆分，但文本被拆分成多列。由于不同观察的文本跨越不同数量的列，我无法再次连接它们。显然，解决方案是编写正确的“正则表达式”并对其进行拆分。我无法弄清楚所需的正则表达式，因此请求输入。

df1 = df[0].str.split(' ', expand= True)
df1.head(10)
    0   1   2   3   4   5   6   7   8   9
0   []  None    None    None    None    None    None    None    None    None
1   [Andaman    and     Nicobar,    194,    52,     142,    0]  None    None    None
2   [Andhra     Pradesh,    40,646,     19,814,     20,298,     534]    None    None    None    None
3   [Arunachal  Pradesh,    609,    431,    175,    3]  None    None    None    None
4   [Assam,     20,646,     6,490,  14,105,     51]     None    None    None    None    None
5   [Bihar,     23,589,     8,767,  14,621,     201]    None    None    None    None    None
6   [Chandigarh,    660,    169,    480,    11]     None    None    None    None    None
7   [Chhattisgarh,  4,964,  1,429,  3,512,  23]     None    None    None    None    None
8   [Dadra  and     Nagar   Haveli  and     Daman,  585,    182,    401,    2]
9   [Daman  and     Diu,    0,  0,  0,  0]  None    None    None

我期望的结果应该是这样的:

        0                                   1       2       3       4       5       6       7       8       9
    0   []                                  None    None    None    None    None    None    None    None    None
    1   [Andaman and Nicobar,               194,    52,     142,    0]      None    None    None    None    None
    2   [Andhra Pradesh,                    40,646, 19,814, 20,298, 534]    None    None    None    None    None
    3   [Arunachal Pradesh,                 609,    431,    175,    3]      None    None    None    None    None
    4   [Assam,                             20,646, 6,490,  14,105, 51]     None    None    None    None    None
    5   [Bihar,                             23,589, 8,767,  14,621, 201]    None    None    None    None    None
    6   [Chandigarh,                        660,    169,    480,    11]     None    None    None    None    None
    7   [Chhattisgarh,                      4,964,  1,429,  3,512,  23]     None    None    None    None    None
    8   [Dadra and Nagar Haveli and Daman,  585,    182,    401,    2]      None    None    None    None    None
    9   [Daman and Diu,                     0,      0,      0,      0]      None    None    None    None    None

最佳答案

您可以使用 str.replace 和 str.extract 来 reshape 数据框。

names = df[0].str.extract('(\D+)').replace('\[|,','',regex=True).rename(columns={0 : 'names'})


df_new = names.join(df[0].str.replace('\D+,','').str.strip(']').str.split(' ',expand=True))

print(df_new)

                                  names 0        1        2        3     4
0                   Andaman and Nicobar       194,      52,     142,     0
1                        Andhra Pradesh    40,646,  19,814,  20,298,   534
2                     Arunachal Pradesh       609,     431,     175,     3
3                                 Assam    20,646,   6,490,  14,105,    51
4                                 Bihar    23,589,   8,767,  14,621,   201
5                            Chandigarh       660,     169,     480,    11
6                          Chhattisgarh     4,964,   1,429,   3,512,    23
7      Dadra and Nagar Haveli and Daman       585,     182,     4...  None
8                         Daman and Diu         0,       0,       0,     0

关于python - 如何根据条件将数据框列拆分为单独的列，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/62968023/

python - 如何根据条件将数据框列拆分为单独的列

上一篇：machine-learning - 如何计算精度和F1？

下一篇：添加 com.google.android.material 后，Android 底部导航栏颜色变为黑色