r - Looping to get ChangePoint data in R

Tags: r for-loop dplyr reshape2

Here is my sample data:

# 9 countries x 2 days x 24 hours = 432 rows
datex <- rep(rep(c("2018-01-18", "2018-01-19"), each = 24), times = 9)
hourx <- rep(0:23, 18)
country <- rep(c("indonesia", "brunei", "filipina", "kamboja", "laos",
                 "malaysia", "myanmar", "singapura", "vietnam"), each = 48)
transaction <- c(1080,993,1014,1003,885,934,1068,1128,1188,1335,1218,1297,1186,1123,1139,1170,1207,1223,1145,1178,1300,1282,1209,922,1797,979,997,851,1099,962,1070,1028,1182,1431,1431,1498,1364,1357,1115,1248,1191,1350,1297,1276,1448,1319,1097,1131,1122,964,881,893,986,874,980,978,1115,1127,1357,1308,1198,1173,1090,1134,1175,1182,1230,1125,1319,1298,1138,941,1216,1060,926,831,1003,975,1085,1046,1166,1519,0,1537,1181,1237,1235,1407,1174,1196,1355,1416,1341,1309,1132,1162,1111,948,1028,979,920,1000,1077,1020,1287,1224,1263,1406,1262,1089,1169,1272,1270,1146,1280,1105,1275,1291,1170,965,1376,953,1004,889,1084,1073,1042,1182,1326,1599,0,1522,1229,1163,1091,1353,1105,1305,1426,1362,1478,1201,1166,1068,1184,947,915,1074,1024,918,1041,1231,1217,1096,1307,1271,1166,1202,1127,1240,1307,1113,1216,1179,1302,1215,1106,979,2232,1028,1036,924,1168,930,1116,1088,1054,1589,0,1526,1307,1371,1234,1203,1203,1157,1343,1445,1397,1238,1192,1057,1156,966,989,959,1143,912,1066,1014,1300,1110,1224,1223,1248,1314,1107,1270,1118,1179,1158,1164,1385,1280,1143,938,1523,1010,1043,883,1127,953,1155,1077,1162,1537,0,1442,1189,1351,1162,1263,1309,1264,1357,1498,1358,1351,1127,1122,105,211,790,297,138,100,106,102,102,99,434,402,464,710,85,144,405,91,95,101,106,99,127,120,94,614,327,280,215,99,101,104,103,181,0,103,657,1724,22,279,418,135,87,169,174,112,125,99,1067,942,1065,938,981,944,1023,1095,1100,1079,1210,1250,1293,1165,1138,1279,1284,1156,1250,1136,1264,1254,1173,921,3866,989,1019,807,1120,940,1173,1043,1071,1619,0,1524,1348,1259,0,1232,1217,1313,1360,1409,1359,1212,1088,1069,1068,930,897,973,918,967,997,1005,1299,1055,1253,1535,1255,1272,1156,1252,1301,1097,1233,1153,1320,1234,1143,945,1772,1046,944,890,1039,940,1065,1151,1189,1509,0,1483,1265,1169,1243,1118,1118,1342,1351,1511,1373,1257,1122,1141,1110,920,827,920,1006,856,1076,1205,1183,1233,1316,1228,1111,1101,1092,1244,1212,1174,1147,1124,1254,1159,1089,929,1215,977,850,855,1191,964,1126,1055,1160,1613,0,1403,1207,1491,1108,1160,0,1293,1286
,1409,1335,1307,1046,1177)
mydata <- data.frame(datex, hourx, country, transaction)

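My task is to get the change point data of the transactions for each country in my data. Here is the script I currently run by hand to get the change point data for the country "indonesia":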

# Manual Script
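library(changepoint)
cp_data <- subset(mydata, country == "indonesia")
# Detect changes in mean and variance with the PELT method
cp_process <- cpt.meanvar(cp_data$transaction, method = "PELT")
# cpts() returns the change point positions; +1 selects the row that follows each one
cp_index <- cpts(cp_process) + 1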

cp_index
# [1]  8 24 34 36

cp_result <- rbind(cp_data[cp_index[1],],
               cp_data[cp_index[2],],
               cp_data[cp_index[3],],
               cp_data[cp_index[4],])
cp_result
#         datex hourx   country transaction
# 8  2018-01-18     7 indonesia        1128
# 24 2018-01-18    23 indonesia         922
# 34 2018-01-19     9 indonesia        1431
# 36 2018-01-19    11 indonesia        1498

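Please help: how can I loop over the data to get all the change point data for every country? Thank you.

Best answer

You can put the code you already have into a function and apply it to each country: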

library(changepoint)
library(dplyr)

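# Return the row positions that follow each detected change point
filter_change_point <- function(transaction) {
  cp_process <- cpt.meanvar(transaction, method = "PELT")
  cpts(cp_process) + 1
}

# Run the detection once per country and keep only the matching rows
mydata %>%
  group_by(country) %>%
  slice(filter_change_point(transaction)) %>%
  ungroup()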

#   datex      hourx country   transaction
#   <chr>      <int> <chr>           <dbl>
# 1 2018-01-19     9 brunei           1519
# 2 2018-01-19    12 brunei           1181
# 3 2018-01-19     9 filipina         1599
# 4 2018-01-19    11 filipina         1522
# 5 2018-01-18     7 indonesia        1128
# 6 2018-01-18    23 indonesia         922
# 7 2018-01-19     9 indonesia        1431
# 8 2018-01-19    11 indonesia        1498
# 9 2018-01-18     7 kamboja          1231
#10 2018-01-18    23 kamboja           979
# … with 23 more rows
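
Since the question asks for a loop, the same per-country idea can also be sketched in base R with split() and lapply(), without dplyr. This is only a minimal sketch under stated assumptions: the `toy` data frame and the helper name `change_point_rows` are made up for illustration and are not part of the original post; the changepoint calls are the same ones used in the question.

```r
library(changepoint)

# Hypothetical toy data (not from the question): two groups,
# each with one obvious shift in level.
set.seed(1)
toy <- data.frame(
  country     = rep(c("a", "b"), each = 40),
  hourx       = rep(0:39, times = 2),
  transaction = c(rnorm(20, 100, 5), rnorm(20, 500, 5),
                  rnorm(30, 200, 5), rnorm(10, 900, 5))
)

# Hypothetical helper: given one country's rows, return the rows that
# follow each detected change point (same +1 offset as in the question).
change_point_rows <- function(df) {
  cp_index <- cpts(cpt.meanvar(df$transaction, method = "PELT")) + 1
  df[cp_index, , drop = FALSE]
}

# split() builds one data frame per country; lapply() runs the helper on each;
# do.call(rbind, ...) stacks the per-country results back into one data frame.
per_country <- lapply(split(toy, toy$country), change_point_rows)
cp_result <- do.call(rbind, per_country)
cp_result
```

To try this on the data from the question, replace `toy` with `mydata`. A country with no detected change point simply contributes zero rows.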

Regarding "r - Looping to get ChangePoint data in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67159144/
