python - 每周总和不等于每月总和 pandas

标签 python pandas

这是我的数据的一个子集:(用于最小可重现代码)

01-01-20,128921.04
02-01-20,125338.56
03-01-20,100824.66
04-01-20,129203.39
05-01-20,164149.36
06-01-20,120360.65
07-01-20,113249.99
08-01-20,130191.88
09-01-20,101189.75
10-01-20,103243.14
11-01-20,105493.14
12-01-20,140929.83
13-01-20,111561.6
14-01-20,99481.08
15-01-20,121164.38
16-01-20,99812.96
17-01-20,87739.93
18-01-20,107949.75
19-01-20,115452.02
20-01-20,130104.09
21-01-20,110515.02
22-01-20,121331.84
23-01-20,96556.54
24-01-20,80605.72
25-01-20,92569.24
26-01-20,138800.83
27-01-20,92882.83
28-01-20,92318.18
29-01-20,102570.51
30-01-20,83223.74
31-01-20,78875.25
01-02-20,92812.68
02-02-20,100769.52
03-02-20,100436.34
04-02-20,90445.25
05-02-20,111517.47
06-02-20,97293.94
07-02-20,86028.53
08-02-20,98991.03
09-02-20,139218.02
10-02-20,112893.87
11-02-20,88203.69
12-02-20,103741.31
13-02-20,78289.06
14-02-20,87642.83
15-02-20,89486.19
16-02-20,135595
17-02-20,107230.7
18-02-20,93497.96
19-02-20,110915.03
20-02-20,85369.66
21-02-20,89731.12
22-02-20,100323.54
23-02-20,138563.05
24-02-20,94594.24
25-02-20,92561.89
26-02-20,138485.5
27-02-20,90229.11
28-02-20,80458.92
29-02-20,98203.12
01-03-20,133082.48
02-03-20,97570.18
03-03-20,85414.1
04-03-20,92503.9
05-03-20,80030.37
06-03-20,93478.99
07-03-20,103215.04
08-03-20,97862.86
09-03-20,106146.68
10-03-20,86679.19
11-03-20,84391.71
12-03-20,56142.73
13-03-20,73461.15
14-03-20,68306.21
15-03-20,81142.78
16-03-20,47639.16
17-03-20,45149.38
18-03-20,44238.06
19-03-20,101972.48
20-03-20,59555.46
21-03-20,60318.8
22-03-20,83097.88
23-03-20,44619.96
24-03-20,74030.29
25-03-20,122993.73
26-03-20,84086.55
27-03-20,94102.68
28-03-20,96118.46
29-03-20,125752.09
30-03-20,90891.25
31-03-20,101679.05
01-04-20,86656.78
02-04-20,93691.5
03-04-20,88798.26
04-04-20,94957.87
05-04-20,120458.96
06-04-20,93696.39
07-04-20,100141.77
08-04-20,137059.7
09-04-20,100172.98
10-04-20,138442.39
11-04-20,93426.7
12-04-20,133409.21
13-04-20,102913.83
14-04-20,130870.24
15-04-20,125922.79
16-04-20,129950.29
17-04-20,107779.74
18-04-20,115455.62
19-04-20,178068.78
20-04-20,109783.91
21-04-20,146683.79
22-04-20,126564.06
23-04-20,147555.21
24-04-20,118911.06
25-04-20,130732.86
26-04-20,155833.93
27-04-20,126342.17
28-04-20,162196.66
29-04-20,150163.04
30-04-20,183581.71
01-05-20,136151.93
02-05-20,115936.72
03-05-20,174859.86
04-05-20,146856.66
05-05-20,157275.86
06-05-20,138543.82
07-05-20,147154.01
08-05-20,116823.23
09-05-20,125782.39
10-05-20,151169.15
11-05-20,130266.93
12-05-20,149603.86
13-05-20,118424.35
14-05-20,163479.16
15-05-20,128805.56
16-05-20,133935.71
17-05-20,186671.37
18-05-20,111359.55
19-05-20,65546
20-05-20,282153.48
21-05-20,169061.38
22-05-20,135515.54
23-05-20,151844.17
24-05-20,186766.55
25-05-20,189683.25
26-05-20,169501.28
27-05-20,184518.67
28-05-20,153312.42
29-05-20,195488.54
30-05-20,164950.36
31-05-20,203940.3
01-06-20,157677.94
02-06-20,152351.26
03-06-20,186902.06
04-06-20,169101.35
05-06-20,177544.26
06-06-20,187603.88
07-06-20,258896.99
08-06-20,200043.02
09-06-20,236888.89
10-06-20,177424.67
11-06-20,192883.2
12-06-20,160047.5
13-06-20,165664.22
14-06-20,248579.8
15-06-20,170747.19
16-06-20,231164.85
17-06-20,167219.84
18-06-20,156622
19-06-20,173309.74
20-06-20,157201.39
21-06-20,193249.82
22-06-20,134175.03
23-06-20,199480.99
24-06-20,157446.14
25-06-20,150972.53
26-06-20,154339.08
27-06-20,148897.88
28-06-20,182340.04
29-06-20,141234.55
30-06-20,288646.56
01-07-20,282625.02
02-07-20,248820.12
03-07-20,235875.02
04-07-20,240315.18
05-07-20,270446.5
06-07-20,154621.78
07-07-20,138240.25
08-07-20,146545.45
09-07-20,124980.06
10-07-20,
11-07-20,
...

当我在 pandas 中读取这些数据时,我尝试按周和按月对其进行汇总,如下所示:

week_wise = data.groupby([(data.index.year), (data.index.week)])\
                    .sum().loc[2020:][[col for col in data.columns if 'Date' not in col]]

month_wise = data.groupby([(data.index.year), (data.index.month)])\
                    .sum().loc[2017:][[col for col in data.columns if 'Date' not in col]]

但是 month_wise 有值:(其余为零)

$3,426,610.90
$2,933,528.57
$2,615,673.65
$3,730,222.20
$4,785,382.06
$5,478,656.67

总计为:22970074.05 而 week_wise 返回值:

$648,437.00
$814,658.40
$743,161.70
$770,483.30
$643,452.70
$723,930.60
$695,851.90
$725,631.10
$727,615.30
$650,075.40
$556,270.50
$441,971.20
$641,703.80
$677,133.70
$796,349.10
$890,961.30
$936,064.80
$1,049,232.00
$983,605.10
$1,011,187.00
$1,102,247.00
$1,261,395.00
$1,290,078.00
$1,381,531.00
$1,249,515.00
$1,127,652.00
$1,707,963.00
$564,387.50

总计为:24,164,107.40 美元

但是,当我在 pandas 中求和时,两个总和都等于第三个值,即:2.481254e+07。我之前提到的总和是通过将值粘贴到 Excel 并应用 SUM 公式获得的。那么,当我知道我正确地对它们进行分组时,为什么这两个总和不同。更重要的是,为什么它们都不与外部计算的总和相匹配? 非常感谢任何帮助!

最佳答案

我已经重新检查了您的值,pandas 确实给了您正确的总和。

为什么每月总和与 pandas 总和不匹配?

您的数据有进入 7 月的日期,但您提到的总和只到 6 月。所以很明显,当您将这些值粘贴到外部软件时,它会更少!

为什么每周总和与 pandas 总和不匹配?

它确实匹配 pandas 总和。我在 Google 表格中进行了计算,但您可以通过任何其他方式查看,两者的总和相同。

关于python - 每周总和不等于每月总和 pandas,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/62958592/

相关文章:

python - 如何使用opencv拍照并同时可视化网络摄像头

python - 根据正则表达式拆分字符串

pandas - 将具有 nan 值的 str 类型字典转换为 dict 类型对象

python - 如何从不等长列表的字典创建 DataFrame,并截断到特定长度?

python - 有没有办法使用 `json.dump` 和 `gzip` ?

python - wx.Grid 和 wx.StockCursor

python - 如何使用二进制搜索来搜索大量名称

python - pandas python 中的 3 个月聚合和转移周期分组

python - 用 Pandas 的值列表替换选定的整数值

python - 将 pandas 输出格式化为 csv