python - Convert float values to integers, then concatenate them in a pandas DataFrame

I have a DataFrame named "sample" with three columns, "birthDay", "birthMonth", and "birthYear", all of which hold float values.

I want to add a new column, "dateOfBirth", built from these values formatted as integers.

I tried sample["dateOfBirth"] = sample["birthDay"].map(str) + "/" + sample["birthMonth"].map(str) + "/" + sample["birthYear"].map(str), but the result is "11.0/3.0/1988.0" and "4.0/20.0/2001.0".

Any help is much appreciated.

Best answer

Setup

import numpy as np
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

Option 1
Make dateOfBirth a series of Timestamps

# dictionary map to rename to canonical date names
# enables convenient conversion using pd.to_datetime
m = dict(birthDay='Day', birthMonth='Month', birthYear='Year')
sample['dateOfBirth'] = pd.to_datetime(sample.rename(columns=m))

sample
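
One payoff of a Timestamp column is that the date parts stay available as integers through the dt accessor. A minimal sketch, assuming the sample frame from the setup; the astype(int) cast and the names parts, years, and months are additions for this illustration, not part of the answer.

```python
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

m = dict(birthDay='Day', birthMonth='Month', birthYear='Year')
# Cast to int first (an extra precaution in this sketch), then assemble.
parts = sample.rename(columns=m).astype(int)
sample['dateOfBirth'] = pd.to_datetime(parts)

# The column is datetime64, so date parts come back as integers.
years = sample['dateOfBirth'].dt.year.tolist()
months = sample['dateOfBirth'].dt.month.tolist()
```

Keeping real dates also means sorting and date arithmetic work without parsing strings again.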

Option 2
If you insist on a string,
use the dt accessor with strftime

# dictionary map to rename to canonical date names
# enables convenient conversion using pd.to_datetime
m = dict(birthDay='Day', birthMonth='Month', birthYear='Year')

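# '%-m' and '%-d' drop zero padding; the '-' flag is a platform-specific
# strftime extension and may not work on Windows
sample['dateOfBirth'] = pd.to_datetime(sample.rename(columns=m)) \
                          .dt.strftime('%-m/%-d/%Y')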

sample
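
Because the '-' flag in '%-m' and '%-d' depends on the platform's strftime, a portable variant can build the string from the dt accessor's integer parts instead. This is a sketch rather than the answer's method; dates is a hypothetical stand-in for the Timestamp column from option 1.

```python
import pandas as pd

# Hypothetical stand-in for the Timestamp column built in option 1.
dates = pd.Series(pd.to_datetime(['1988-11-03', '2001-04-20']))

# Month, day and year are integers, so no zero padding appears.
d = dates.dt
date_strings = (
    d.month.astype(str) + '/' + d.day.astype(str) + '/' + d.year.astype(str)
)
```

This trades the single format string for three conversions, but behaves the same on every platform.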

Option 3
If you really want to rebuild the string from the values,
use apply

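# '0.0f' formats each float with zero decimal places, e.g. 3.0 -> '3'
f = '{birthMonth:0.0f}/{birthDay:0.0f}/{birthYear:0.0f}'.format
# axis=1 passes each row to f as keyword arguments
sample['dateOfBirth'] = sample.apply(lambda x: f(**x), axis=1)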
sample
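
The same string can also be built without a row-wise apply by doing literally what the question title describes: cast the floats to integers, then to strings, then concatenate. A minimal sketch, assuming every value is present and integer-valued; cols and parts are names introduced here.

```python
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

cols = ['birthMonth', 'birthDay', 'birthYear']
# 3.0 -> 3 -> '3'; astype(int) raises if any value is missing.
parts = sample[cols].astype(int).astype(str)
sample['dateOfBirth'] = (
    parts['birthMonth'] + '/' + parts['birthDay'] + '/' + parts['birthYear']
)
```

This is exactly the asker's original attempt with one extra cast, which is what removes the trailing ".0".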

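Nulls
If one or more of the date columns has missing values:
Options 1 and 2 need no changes and are the recommended options anyway.
If you want to build the string from the floats, we can use a boolean mask and loc to assign.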

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
        [20., np.nan, 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

sample

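f = '{birthMonth:0.0f}/{birthDay:0.0f}/{birthYear:0.0f}'.format
# True only for rows where all three date parts are present
mask = sample[['birthDay', 'birthMonth', 'birthYear']].notnull().all(axis=1)
# only masked rows are assigned; the remaining rows stay NaN
sample.loc[mask, 'dateOfBirth'] = sample.apply(lambda x: f(**x), axis=1)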
sample
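
An alternative to the mask is to let pandas' nullable dtypes carry the missing value through the conversion. The following is a sketch under the assumption that the non-missing floats are integer-valued; it is not the answer's approach, and cols and parts are names introduced here.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame([
        [3., 11., 1988.],
        [20., 4., 2001.],
        [20., np.nan, 2001.],
    ], columns=['birthDay', 'birthMonth', 'birthYear'])

cols = ['birthMonth', 'birthDay', 'birthYear']
# float -> nullable integer -> nullable string; NaN becomes <NA>.
parts = sample[cols].astype('Int64').astype('string')
# <NA> propagates through concatenation, so incomplete rows stay missing.
sample['dateOfBirth'] = (
    parts['birthMonth'] + '/' + parts['birthDay'] + '/' + parts['birthYear']
)
```

Rows with any missing part end up as a missing dateOfBirth, matching the outcome of the mask-based version.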

Timing
Given sample

Timing
Given sample times 10,000

Regarding python - Convert float values to integers, then concatenate them in a pandas DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41030317/
