python-3.x - Split a dataframe column before a specific string in all rows

I have a dataframe (df) with 30,000 rows from a web scraping exercise:

Name     NameID                                                            Age

John     www.link.com/www.link.com/https://www.link.com/ct/John             25
Samanta  www.link.com/www.link.com/https://www.link.com/ct/Samanta          24
Johnny   www.link.com/www.link.com/                                         22
Mary     www.link.com/www.link.com/https://www.link.com/ct/Mary             35

I want to clean the "NameID" column so that only the part starting at "https://www.link.com/ct/" is kept. So my output dataframe should look like this:

Name     NameID                                  Age

John     https://www.link.com/ct/John             25
Samanta  https://www.link.com/ct/Samanta          24
Johnny                                            22
Mary     https://www.link.com/ct/Mary             35

My code so far:

df['NameID'] = df['NameID'].str.split("https://www.link.com/ct/")[1][1]
df['NameID'] =  "https://www.link.com/ct/" + df['NameID'].astype(str)

The output currently looks like this:

Name     NameID                                  Age

John     https://www.link.com/ct/John             25
Samanta  https://www.link.com/ct/John             24
Johnny   https://www.link.com/ct/John             22
Mary     https://www.link.com/ct/John             35

Any help?

Best answer

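You are close; you need .str[1] instead of [1][1], so that the second piece is taken from every row's list rather than from a single row. Try changing your code to: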

df['NameID'] = df['NameID'].str.split("https://www.link.com/ct/").str[1]
df['NameID'] =  "https://www.link.com/ct/" + df['NameID'].astype(str)

df

      Name                           NameID  Age
0     John     https://www.link.com/ct/John   25
1  Samanta  https://www.link.com/ct/Samanta   24
2   Johnny      https://www.link.com/ct/nan   22
3     Mary     https://www.link.com/ct/Mary   35
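To see why the original attempt filled every row with the same value, it may help to compare the two ways of indexing side by side. The sketch below rebuilds the four sample rows from the question and assumes a default integer index; which row plain [1] picks depends on the index labels of the real dataframe.

```python
import pandas as pd

marker = "https://www.link.com/ct/"

# Stand-in for the scraped data: the four sample rows from the question.
df = pd.DataFrame({
    "Name": ["John", "Samanta", "Johnny", "Mary"],
    "NameID": [
        "www.link.com/www.link.com/https://www.link.com/ct/John",
        "www.link.com/www.link.com/https://www.link.com/ct/Samanta",
        "www.link.com/www.link.com/",
        "www.link.com/www.link.com/https://www.link.com/ct/Mary",
    ],
    "Age": [25, 24, 22, 35],
})

# Each row becomes a list of the pieces around the marker.
parts = df["NameID"].str.split(marker)

# Plain [1] selects the single row labelled 1; the second [1] then takes
# the second piece of that one list. The result is one Python string,
# which pandas broadcasts to every row when it is assigned to a column.
single = parts[1][1]

# .str[1] takes the second piece of every row's list, and yields NaN
# for rows whose list has only one piece (no marker found).
per_row = parts.str[1]
```

Here single is one string, while per_row is a Series with one value per row, which is what the column assignment needs.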

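You can tweak the code slightly to return '' for rows without a match (Johnny above, which currently shows https://www.link.com/ct/nan), as specified in your desired result.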
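One possible version of that tweak is sketched below. It is not part of the original answer: the sample frame, the marker and tail variables, and the use of Series.where to blank out unmatched rows are all illustrative choices.

```python
import pandas as pd

marker = "https://www.link.com/ct/"

# Stand-in for the scraped data: the four sample rows from the question.
df = pd.DataFrame({
    "Name": ["John", "Samanta", "Johnny", "Mary"],
    "NameID": [
        "www.link.com/www.link.com/https://www.link.com/ct/John",
        "www.link.com/www.link.com/https://www.link.com/ct/Samanta",
        "www.link.com/www.link.com/",
        "www.link.com/www.link.com/https://www.link.com/ct/Mary",
    ],
    "Age": [25, 24, 22, 35],
})

# Text after the marker for every row; NaN where the marker is absent.
tail = df["NameID"].str.split(marker).str[1]

# Re-attach the marker, then replace rows that had no match with ''.
df["NameID"] = (marker + tail.astype(str)).where(tail.notna(), "")
```

With the sample rows, Johnny's NameID ends up as an empty string and the other three rows keep their https://www.link.com/ct/... value.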

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/66349310/
