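r - How to assign sequential values to a variable in R while defining the sequence based on the number of values contained in a different variable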

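I have patient data in which each patient received the same assessment at several time points. I would like to number these assessments sequentially by date.

Here is my input: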

12 x 3 df with cols: pt_id, assess_date, assess_id

Here is my desired output:

12 x 5 df with cols: pt_id, assess_date, assess_id, num_assess, assess_num

Here is what I have tried:

data <- data %>% 
           group_by(pt_id) %>%
           mutate(num_assess <- n_distinct(assess_date))

data$assess_num <- NA

data <- data %>% 
           group_by(pt_id) %>% 
           for(i in 1:num_assess) {
              assess_num <- i
            }

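I also tried using n_distinct to define the sequence without creating the assess_num variable first, but that did not work either.

Here is the error I receive:

Error in for (. in i) 1:num_assess : 4 arguments passed to 'for' which requires 3

Thoughts? TIA!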

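Best answer

Neat solution from @desc. If your dates are stored as Date values and you want the assessment number to be numeric, the following script works. It uses data.example from desc (thanks), but the dates are in d/m/y format, which is why the format passed to as.Date is "%d/%m/%Y".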

> data.example = structure(list(
+   pt_id = c(1234L, 1234L, 1234L, 1234L, 4567L, 4567L,
+             4567L, 4567L, 8900L, 8900L, 8900L, 8900L),
+   assess_date = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
+                           .Label = c("1/1/2019", "1/2/2019", "1/3/2019", "1/4/2019"),
+                           class = "factor"),
+   assess_id = c(64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L)),
+   class = "data.frame", row.names = c(NA, -12L))
> 
> data.example$assess_date <- as.Date(data.example$assess_date, format = "%d/%m/%Y")
> data.example$assess_num <- as.numeric(format(data.example$assess_date, "%m"))
> data.example
   pt_id assess_date assess_id assess_num
1   1234  2019-01-01        64          1
2   1234  2019-02-01        64          2
3   1234  2019-03-01        64          3
4   1234  2019-04-01        64          4
5   4567  2019-01-01        64          1
6   4567  2019-02-01        64          2
7   4567  2019-03-01        64          3
8   4567  2019-04-01        64          4
9   8900  2019-01-01        64          1
10  8900  2019-02-01        64          2
11  8900  2019-03-01        64          3
12  8900  2019-04-01        64          4
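
Note that taking the month only matches the visit order here because every patient in data.example has exactly one assessment on the first of each consecutive month, starting in January. For dates that do not follow that pattern, a grouped ranking is one possible alternative. The sketch below is not part of the original answer: it assumes dplyr is installed, uses a small made-up data frame with uneven visits per patient, and ranks dates within each pt_id with dense_rank(). It also shows two likely reasons the attempt in the question failed: inside mutate() the column should be named with `=` rather than `<-`, and a for loop cannot be a step in a %>% chain, because the pipe passes the data frame to `for` as an extra argument.

```r
library(dplyr)

# Hypothetical example data (not from the question): uneven numbers of
# visits per patient and dates that are not one per month.
data <- data.frame(
  pt_id       = c(1234L, 1234L, 1234L, 4567L, 4567L, 8900L),
  assess_date = as.Date(c("2019-03-15", "2019-01-10", "2019-02-20",
                          "2019-01-05", "2019-06-30", "2019-04-01")),
  assess_id   = 64L
)

result <- data %>%
  group_by(pt_id) %>%
  mutate(
    num_assess = n_distinct(assess_date),  # name the column with `=`, not `<-`
    assess_num = dense_rank(assess_date)   # 1 = earliest date for this patient
  ) %>%
  ungroup() %>%
  arrange(pt_id, assess_date)

print(result)
```

dense_rank() gives repeated dates for the same patient the same number; if each row should instead get its own number, row_number() applied after arrange(assess_date) within the group is a possible substitute.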

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/54954195/
