python - pandas2ri.ri2py_dataframe(r_dataframe) returns floats instead of dates in ISO-8601 (YYYY-MM-DD) format

Tags: python r pandas dataframe

Code

from rpy2.robjects import pandas2ri

# First, convert the input dataframe to an R dataframe to be used by our R function:
input_dataframe_r = pandas2ri.py2ri(input_dataframe)
output_dataframe_r = r_generate_notifications(input_dataframe_r, metric_name, lookback, moving_average, sigmas)

# And convert it back to a Pandas dataframe:
output_dataframe_py = pandas2ri.ri2py_dataframe(output_dataframe_r)

print('output dataframe r:', output_dataframe_r)
print('\n')
print('output dataframe py:', output_dataframe_py)
print('\n')

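Problem description

I have a Pandas dataframe in Python that I want to run some R math on. So I take a Pandas dataframe, input_dataframe, as an argument, perform some operations on it (in this case an R function called r_generate_notifications()), and then convert the result back to a Python Pandas dataframe with output_dataframe_py = pandas2ri.ri2py_dataframe(output_dataframe_r).

The problem is that the R code returns some dates created with ymd(), and when I convert to a Pandas dataframe those dates all become floats. I'm not sure whether this is a bug in the code or user error. I also posted this as a bug on the Pandas GitHub: https://github.com/pandas-dev/pandas/issues/21044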

Expected output (R dataframe)

output dataframe r:    notify_day daily_value is_high_value time_period_length time_period_value
1  2017-05-09    11033.79             1                  7          30938.45
2  2017-05-18     1613.64             1                  7          25669.63
3  2017-05-19     2121.38             1                  7          28048.14
4  2017-05-26     1774.44             1                  7          28185.27
5  2017-06-12      693.24             1                  7          26170.57
6  2017-06-24     2275.77             1                  7          36550.32
7  2017-06-29     5336.76             1                  7          32748.46
8  2017-06-30     8921.38             1                  7          43366.39
9  2017-07-11     4007.84             0                  7          28986.47
10 2017-07-20     5766.12             0                  7          24627.51
11 2017-08-01     4150.32             1                  7          24760.60
12 2017-08-04      734.40             0                  7          20645.43
13 2017-08-12        0.00             1                  7           9898.20
14 2017-12-29     5000.00             1                  7          12467.02
15 2018-01-28        0.00             1                  7          12538.81
16 2018-02-14        0.00             1                  7          14351.24
17 2018-02-20    10628.82             1                  7          20905.00
18 2018-03-16      237.44             1                  7          24400.76
19 2018-03-21      917.96             1                  7          26485.20
20 2018-03-24     1272.85             1                  7          39287.70
21 2018-03-26     3231.26             1                  7          41543.95
22 2018-03-29     9493.31             1                  7          43060.81
23 2018-03-30    21696.04             0                  7          34854.90
24 2018-03-31     1403.33             0                  7          13158.86
25 2018-04-06        0.00             0                  7          15240.38
26 2018-04-08      453.68             0                  7          18004.12
27 2018-04-18     4666.36             1                  7          27038.60
28 2018-04-21        0.00             0                  7          24620.15
29 2018-04-23     4306.88             1                  7          27470.00
   time_period_start time_period_end comparison_days_ago comparison_value
1         2017-05-03      2017-05-09                  28         19056.30
2         2017-05-12      2017-05-18                  14         21610.99
3         2017-05-13      2017-05-19                  28         24321.11
4         2017-05-20      2017-05-26                  28         14530.01
5         2017-06-06      2017-06-12                  28         20087.97
6         2017-06-18      2017-06-24                  28         30796.60
7         2017-06-23      2017-06-29                  14         28394.23
8         2017-06-24      2017-06-30                  28         22758.57
9         2017-07-05      2017-07-11                  14         36122.77
10        2017-07-14      2017-07-20                  28         29509.53
11        2017-07-26      2017-08-01                   7         19662.71
12        2017-07-29      2017-08-04                  28         30518.06
13        2017-08-06      2017-08-12                   1          4487.40
14        2017-12-23      2017-12-29                  28             0.00
15        2018-01-22      2018-01-28                  28         10393.82
16        2018-02-08      2018-02-14                  28          2177.36
17        2018-02-14      2018-02-20                  28           602.64
18        2018-03-10      2018-03-16                  28         19042.76
19        2018-03-15      2018-03-21                  28         14042.68
20        2018-03-18      2018-03-24                  28          9351.16
21        2018-03-20      2018-03-26                  28          7909.36
22        2018-03-23      2018-03-29                  28           464.28
23        2018-03-24      2018-03-30                   1         43060.81
24        2018-03-25      2018-03-31                  14         24163.32
25        2018-03-31      2018-04-06                  14         17591.66
26        2018-04-02      2018-04-08                  14         39418.18
27        2018-04-12      2018-04-18                  14         12906.06
28        2018-04-15      2018-04-21                  28         39287.70
29        2018-04-17      2018-04-23                  14         18153.08
   comparison_period_start comparison_period_end
1               2017-04-05            2017-04-11
2               2017-04-28            2017-05-04
3               2017-04-15            2017-04-21
4               2017-04-22            2017-04-28
5               2017-05-09            2017-05-15
6               2017-05-21            2017-05-27
7               2017-06-09            2017-06-15
8               2017-05-27            2017-06-02
9               2017-06-21            2017-06-27
10              2017-06-16            2017-06-22
11              2017-07-19            2017-07-25
12              2017-07-01            2017-07-07
13              2017-08-05            2017-08-11
14              2017-11-25            2017-12-01
15              2017-12-25            2017-12-31
16              2018-01-11            2018-01-17
17              2018-01-17            2018-01-23
18              2018-02-10            2018-02-16
19              2018-02-15            2018-02-21
20              2018-02-18            2018-02-24
21              2018-02-20            2018-02-26
22              2018-02-23            2018-03-01
23              2018-03-23            2018-03-29
24              2018-03-11            2018-03-17
25              2018-03-17            2018-03-23
26              2018-03-19            2018-03-25
27              2018-03-29            2018-04-04
28              2018-03-18            2018-03-24
29              2018-04-03            2018-04-09

Actual output (Python/Pandas dataframe)

output dataframe py:     notify_day  daily_value  is_high_value  time_period_length  \
0      17295.0     11033.79            1.0                   7   
1      17304.0      1613.64            1.0                   7   
2      17305.0      2121.38            1.0                   7   
3      17312.0      1774.44            1.0                   7   
4      17329.0       693.24            1.0                   7   
5      17341.0      2275.77            1.0                   7   
6      17346.0      5336.76            1.0                   7   
7      17347.0      8921.38            1.0                   7   
8      17358.0      4007.84            0.0                   7   
9      17367.0      5766.12            0.0                   7   
10     17379.0      4150.32            1.0                   7   
11     17382.0       734.40            0.0                   7   
12     17390.0         0.00            1.0                   7   
13     17529.0      5000.00            1.0                   7   
14     17559.0         0.00            1.0                   7   
15     17576.0         0.00            1.0                   7   
16     17582.0     10628.82            1.0                   7   
17     17606.0       237.44            1.0                   7   
18     17611.0       917.96            1.0                   7   
19     17614.0      1272.85            1.0                   7   
20     17616.0      3231.26            1.0                   7   
21     17619.0      9493.31            1.0                   7   
22     17620.0     21696.04            0.0                   7   
23     17621.0      1403.33            0.0                   7   
24     17627.0         0.00            0.0                   7   
25     17629.0       453.68            0.0                   7   
26     17639.0      4666.36            1.0                   7   
27     17642.0         0.00            0.0                   7   
28     17644.0      4306.88            1.0                   7   

    time_period_value  time_period_start  time_period_end  \
0            30938.45            17289.0          17295.0   
1            25669.63            17298.0          17304.0   
2            28048.14            17299.0          17305.0   
3            28185.27            17306.0          17312.0   
4            26170.57            17323.0          17329.0   
5            36550.32            17335.0          17341.0   
6            32748.46            17340.0          17346.0   
7            43366.39            17341.0          17347.0   
8            28986.47            17352.0          17358.0   
9            24627.51            17361.0          17367.0   
10           24760.60            17373.0          17379.0   
11           20645.43            17376.0          17382.0   
12            9898.20            17384.0          17390.0   
13           12467.02            17523.0          17529.0   
14           12538.81            17553.0          17559.0   
15           14351.24            17570.0          17576.0   
16           20905.00            17576.0          17582.0   
17           24400.76            17600.0          17606.0   
18           26485.20            17605.0          17611.0   
19           39287.70            17608.0          17614.0   
20           41543.95            17610.0          17616.0   
21           43060.81            17613.0          17619.0   
22           34854.90            17614.0          17620.0   
23           13158.86            17615.0          17621.0   
24           15240.38            17621.0          17627.0   
25           18004.12            17623.0          17629.0   
26           27038.60            17633.0          17639.0   
27           24620.15            17636.0          17642.0   
28           27470.00            17638.0          17644.0   

    comparison_days_ago  comparison_value  comparison_period_start  \
0                  28.0          19056.30                  17261.0   
1                  14.0          21610.99                  17284.0   
2                  28.0          24321.11                  17271.0   
3                  28.0          14530.01                  17278.0   
4                  28.0          20087.97                  17295.0   
5                  28.0          30796.60                  17307.0   
6                  14.0          28394.23                  17326.0   
7                  28.0          22758.57                  17313.0   
8                  14.0          36122.77                  17338.0   
9                  28.0          29509.53                  17333.0   
10                  7.0          19662.71                  17366.0   
11                 28.0          30518.06                  17348.0   
12                  1.0           4487.40                  17383.0   
13                 28.0              0.00                  17495.0   
14                 28.0          10393.82                  17525.0   
15                 28.0           2177.36                  17542.0   
16                 28.0            602.64                  17548.0   
17                 28.0          19042.76                  17572.0   
18                 28.0          14042.68                  17577.0   
19                 28.0           9351.16                  17580.0   
20                 28.0           7909.36                  17582.0   
21                 28.0            464.28                  17585.0   
22                  1.0          43060.81                  17613.0   
23                 14.0          24163.32                  17601.0   
24                 14.0          17591.66                  17607.0   
25                 14.0          39418.18                  17609.0   
26                 14.0          12906.06                  17619.0   
27                 28.0          39287.70                  17608.0   
28                 14.0          18153.08                  17624.0   

    comparison_period_end  
0                 17267.0  
1                 17290.0  
2                 17277.0  
3                 17284.0  
4                 17301.0  
5                 17313.0  
6                 17332.0  
7                 17319.0  
8                 17344.0  
9                 17339.0  
10                17372.0  
11                17354.0  
12                17389.0  
13                17501.0  
14                17531.0  
15                17548.0  
16                17554.0  
17                17578.0  
18                17583.0  
19                17586.0  
20                17588.0  
21                17591.0  
22                17619.0  
23                17607.0  
24                17613.0  
25                17615.0  
26                17625.0  
27                17614.0  
28                17630.0  

Output of "pd.show_versions()"

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-122-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: None
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.14.3
scipy: None
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Best answer

As a workaround, could you add a step that converts the R dates to strings before sending them back to Python?

library(lubridate)
df[sapply(df, is.Date)] <- lapply(df[sapply(df, is.Date)], as.character)

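I don't really work with dates, so this is my (simple) understanding. R stores dates as numbers, along with some extra information giving the date/time the numbering starts from, time zone information, and so on. When your dataframe goes back to Python, it looks like that information is lost in translation, so storing the dates as character strings is probably much safer.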
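If changing the R side is not convenient, the floats can probably also be repaired on the Python side. The following is only a minimal sketch, assuming the floats are R Date day counts (days since 1970-01-01), which is consistent with the question's output (17295.0 lines up with 2017-05-09). The names restore_r_dates and DATE_COLUMNS are hypothetical helpers, not part of rpy2 or pandas; the column names are copied from the question.

```python
import pandas as pd

# Date columns as they appear in the question's output (assumed list).
DATE_COLUMNS = [
    "notify_day",
    "time_period_start",
    "time_period_end",
    "comparison_period_start",
    "comparison_period_end",
]


def restore_r_dates(df, date_columns=DATE_COLUMNS):
    """Return a copy of df with R day counts converted to datetime64 columns."""
    restored = df.copy()
    for column in date_columns:
        if column in restored.columns:
            # Assumption: the value is the number of days since 1970-01-01,
            # which matches pandas' default Unix origin when unit="D".
            restored[column] = pd.to_datetime(restored[column], unit="D")
    return restored


# Small demonstration with values copied from the first two rows of the question.
sample = pd.DataFrame({
    "notify_day": [17295.0, 17304.0],
    "daily_value": [11033.79, 1613.64],
    "time_period_start": [17289.0, 17298.0],
})
fixed = restore_r_dates(sample)
print(fixed["notify_day"].dt.strftime("%Y-%m-%d").tolist())
# ['2017-05-09', '2017-05-18']
```

If the dates were instead converted to strings in R as suggested above, they would arrive in Pandas as text, and pd.to_datetime(column, format="%Y-%m-%d") should parse them into proper dates.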

Regarding "python - pandas2ri.ri2py_dataframe(r_dataframe) returns floats instead of dates in ISO-8601 (YYYY-MM-DD) format", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50341900/
