My data:
Chemical date concentration limit
A 01-01-2016 0.2 0.01
A 01-02-2016 0.2 0.01
A 01-01-2017 0.005 0.01
A 01-02-2017 0.2 0.01
B 01-01-2016 0.3 0.1
B 01-02-2016 0.05 0.1
B 01-01-2017 0.2 0.1
B 01-02-2017 0.2 0.1
C 01-01-2016 1.2 1
C 01-02-2016 0.8 1
C 01-01-2017 0.9 1
C 01-02-2017 0.9 1
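I want to show, for each chemical and each year, the percentage of measurements that exceed the limit (note that each chemical has a different limit). So I would like to get something like this: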
Year A B C
2016 100% 50% 50%
2017 50% 100% 0
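I already have code that counts how many times each chemical exceeds its limit per year, but I am getting the percentage calculation wrong.
This is what I use for the counts: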
library(tidyverse)
counts <- data %>%
  group_by(Chemical, grp = format(date, format = '%Y')) %>%
  mutate(exceed = concentration >= limit) %>% # TRUE/FALSE
  summarise(tot_exceed = sum(exceed)) %>% # number of TRUE values per group
  spread(Chemical, tot_exceed, fill = 0)
This gives me:
Year A B C
2016 2 1 1
2017 1 2 0
For the percentage, I tried this:
percentage_exceed <- data %>%
  group_by(Chemical, grp = format(date, format = '%Y')) %>%
  mutate(exceed = concentration >= limit, countconc = length(concentration)) %>%
  summarise(percent = (sum(exceed)/countconc)*100) %>%
  spread(Chemical, percent, fill = 0)
But I do not get the result I want. Can you help me?
Best answer
dt = read.table(text = "
Chemical date concentration limit
A 01-01-2016 0.2 0.01
A 01-02-2016 0.2 0.01
A 01-01-2017 0.005 0.01
A 01-02-2017 0.2 0.01
B 01-01-2016 0.3 0.1
B 01-02-2016 0.05 0.1
B 01-01-2017 0.2 0.1
B 01-02-2017 0.2 0.1
C 01-01-2016 1.2 1
C 01-02-2016 0.8 1
C 01-01-2017 0.9 1
C 01-02-2017 0.9 1
", header = TRUE, stringsAsFactors = FALSE)
library(tidyverse)
library(lubridate)
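dt %>%
  mutate(year = year(dmy(date))) %>%                 # parse dd-mm-yyyy, keep the year
  group_by(year, Chemical) %>%
  summarise(Total = n(),                             # measurements per year and chemical
            Num_exceed = sum(concentration >= limit)) %>% # how many reach or exceed the limit
  ungroup() %>%
  mutate(Prc = paste0(Num_exceed / Total * 100, "%")) %>% # percentage as text
  select(year, Chemical, Prc) %>%
  spread(Chemical, Prc)                              # one column per chemical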
# # A tibble: 2 x 4
# year A B C
# <dbl> <chr> <chr> <chr>
# 1 2016 100% 50% 50%
# 2 2017 50% 100% 0%
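As a cross-check of the numbers above, here is a minimal base R sketch that does not come from the original answer. It relies on the fact that `mean()` of a logical vector is the share of `TRUE` values, so the separate count and total are not needed. The names `exceed` and `pct` are made up for illustration, and the dates are assumed to be in dd-mm-yyyy format, as in the sample data.

```r
# Same sample data as in the question
dt <- read.table(text = "
Chemical date concentration limit
A 01-01-2016 0.2 0.01
A 01-02-2016 0.2 0.01
A 01-01-2017 0.005 0.01
A 01-02-2017 0.2 0.01
B 01-01-2016 0.3 0.1
B 01-02-2016 0.05 0.1
B 01-01-2017 0.2 0.1
B 01-02-2017 0.2 0.1
C 01-01-2016 1.2 1
C 01-02-2016 0.8 1
C 01-01-2017 0.9 1
C 01-02-2017 0.9 1
", header = TRUE, stringsAsFactors = FALSE)

# Extract the year from the dd-mm-yyyy string (assumed format)
dt$year <- format(as.Date(dt$date, format = "%d-%m-%Y"), "%Y")

# TRUE where the measurement reaches or exceeds its own limit
exceed <- dt$concentration >= dt$limit

# mean() of a logical vector is the proportion of TRUE values;
# tapply() applies it to every year/chemical combination
pct <- tapply(exceed, list(Year = dt$year, Chemical = dt$Chemical), mean) * 100
pct
```

The result is a numeric matrix with years as rows and chemicals as columns, so it can still be formatted with `paste0(pct, "%")` or converted with `as.data.frame()` if a data frame is preferred.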
Regarding "r - percentage of each value per year", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53813647/