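r - Calculating the RMSE when my data is not aligned correctly

Tags: r machine-learning

I have the following data and am trying to calculate the root mean squared error (RMSE) between the actual and predicted values: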

# A time tibble: 6 x 4
# Index: index
  IRI_KEY index      value key   
    <dbl> <date>     <dbl> <fct> 
1  648459 2005-01-31  1.43 actual
2  648459 2005-02-07  1.16 actual
3  648459 2005-02-14  1.22 actual
4  648459 2005-02-21  1.16 actual
5  648459 2005-02-28  1.04 actual
6  648459 2005-03-07  1.45 actual

And the tail:

# A time tibble: 6 x 4
# Index: index
  IRI_KEY index      value key    
    <dbl> <date>     <dbl> <fct>  
1      NA 2011-12-12  1.79 predict
2      NA 2011-12-19  1.76 predict
3      NA 2011-12-26  1.76 predict
4      NA 2012-01-02  1.67 predict
5      NA 2012-01-09  1.64 predict
6      NA 2012-01-16  1.69 predict

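First, I am trying to fill the NA values with the same ID key found in that column (the ID key changes from one data frame to the next). The "actual" rows have an ID key assigned to them, but for some reason the "predict" rows do not.

Second, I am trying to calculate the RMSE between "actual" and "predict". After I use the spread function, the result comes back as "NaN" because of the NA values in both the "actual" and "predict" columns.

How can I calculate the RMSE, or how should I arrange the data so that the dates match?

I trained a model up to the date 2011-01-24 and tested it from 2011-01-24 to 2012-01-16.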

 rmse_calculation <- 
    df %>%
      spread(key = key, value = value) %>%
    rename(truth    = actual,
           estimate = predict)
  rmse(truth, estimate)

Data:

df <- structure(list(IRI_KEY = c(648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, 648459, 
648459, 648459, 648459, 648459, 648459, 648459, 648459, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA), index = structure(c(12814, 12821, 12828, 12835, 12842, 
12849, 12856, 12863, 12870, 12877, 12884, 12891, 12898, 12905, 
12912, 12919, 12926, 12933, 12940, 12947, 12954, 12961, 12968, 
12975, 12982, 12989, 12996, 13003, 13010, 13017, 13024, 13031, 
13038, 13045, 13052, 13059, 13066, 13073, 13080, 13087, 13094, 
13101, 13108, 13115, 13122, 13129, 13136, 13143, 13150, 13157, 
13164, 13171, 13178, 13185, 13192, 13199, 13206, 13213, 13220, 
13227, 13234, 13241, 13248, 13255, 13262, 13269, 13276, 13283, 
13290, 13297, 13304, 13311, 13318, 13325, 13332, 13339, 13346, 
13353, 13360, 13367, 13374, 13381, 13388, 13395, 13402, 13409, 
13416, 13423, 13430, 13437, 13444, 13451, 13458, 13465, 13472, 
13479, 13486, 13493, 13500, 13507, 13514, 13521, 13528, 13535, 
13542, 13549, 13556, 13563, 13570, 13577, 13584, 13591, 13598, 
13605, 13612, 13619, 13626, 13633, 13640, 13647, 13654, 13661, 
13668, 13675, 13682, 13689, 13696, 13703, 13710, 13717, 13724, 
13731, 13738, 13745, 13752, 13759, 13766, 13773, 13780, 13787, 
13794, 13801, 13808, 13815, 13822, 13829, 13836, 13843, 13850, 
13857, 13864, 13871, 13878, 13885, 13892, 13899, 13906, 13913, 
13920, 13927, 13934, 13941, 13948, 13955, 13962, 13969, 13976, 
13983, 13990, 13997, 14004, 14011, 14018, 14025, 14032, 14039, 
14046, 14053, 14060, 14067, 14074, 14081, 14088, 14095, 14102, 
14109, 14116, 14123, 14130, 14137, 14144, 14151, 14158, 14165, 
14172, 14179, 14186, 14193, 14200, 14207, 14214, 14221, 14228, 
14235, 14242, 14249, 14256, 14263, 14270, 14277, 14284, 14291, 
14298, 14305, 14312, 14319, 14326, 14333, 14340, 14347, 14354, 
14361, 14368, 14375, 14382, 14389, 14396, 14403, 14410, 14417, 
14424, 14431, 14438, 14445, 14452, 14459, 14466, 14473, 14480, 
14487, 14494, 14501, 14508, 14515, 14522, 14529, 14536, 14543, 
14550, 14557, 14564, 14571, 14578, 14585, 14592, 14599, 14606, 
14613, 14620, 14627, 14634, 14641, 14648, 14655, 14662, 14669, 
14676, 14683, 14690, 14697, 14704, 14711, 14718, 14725, 14732, 
14739, 14746, 14753, 14760, 14767, 14774, 14781, 14788, 14795, 
14802, 14809, 14816, 14823, 14830, 14837, 14844, 14851, 14858, 
14865, 14872, 14879, 14886, 14893, 14900, 14907, 14914, 14921, 
14928, 14935, 14942, 14949, 14956, 14963, 14970, 14977, 14984, 
14991, 14998, 15005, 15012, 15019, 15026, 15033, 15040, 15047, 
15054, 15061, 15068, 15075, 15082, 15089, 15096, 15103, 15110, 
15117, 15124, 15131, 15138, 15145, 15152, 15159, 15166, 15173, 
15180, 15187, 15194, 15201, 15208, 15215, 15222, 15229, 15236, 
15243, 15250, 15257, 15264, 15271, 15278, 15285, 15292, 15299, 
15306, 15313, 15320, 15327, 15334, 15341, 15348, 15355, 14998, 
15005, 15012, 15019, 15026, 15033, 15040, 15047, 15054, 15061, 
15068, 15075, 15082, 15089, 15096, 15103, 15110, 15117, 15124, 
15131, 15138, 15145, 15152, 15159, 15166, 15173, 15180, 15187, 
15194, 15201, 15208, 15215, 15222, 15229, 15236, 15243, 15250, 
15257, 15264, 15271, 15278, 15285, 15292, 15299, 15306, 15313, 
15320, 15327, 15334, 15341, 15348, 15355), class = "Date"), value = c(1.42824767211314, 
1.15935636773992, 1.21562423396038, 1.16087592721371, 1.03655779775518, 
1.45014602307116, 1.19603891069525, 1.51629361136222, 1.35248187520545, 
1.19313036089064, 1.23466779056019, 1.21321049528827, 1.19503355008839, 
1.25756070009974, 1.4265698115632, 1.62505506289166, 1.41987592239844, 
1.34910786957776, 1.65086551211581, 1.77928559677544, 1.83845338283235, 
1.83580669517815, 1.67750364243548, 1.63087084240032, 1.53321928861015, 
1.51027605545301, 1.63050891497539, 1.4862366729199, 1.76886165853052, 
1.51076508754458, 1.79972192638831, 1.40905379487872, 1.33271255551288, 
1.35204242431244, 1.33470871214462, 1.27922055778867, 1.19085428349673, 
1.16260508468215, 1.20011754027413, 1.08580541093404, 1.23437608684114, 
1.21860360879203, 1.27448459413395, 1.21110406725922, 1.1601869743601, 
1.21755561640455, 1.31757665372039, 1.3375114790927, 1.0713257506421, 
1.34170917203277, 1.21640792427817, 1.23702534970888, 1.30153826689137, 
1.10825732300252, 1.4498640625571, 1.33638090869313, 1.16528603779432, 
1.11272227006406, 1.24456830644998, 1.08932241378247, 1.2616330691761, 
1.24871988673321, 1.2694941591514, 1.25607040153018, 1.42968365090233, 
1.54595667506417, 1.4955572237206, 1.4892478841414, 1.56504325252197, 
1.48020564768688, 1.60796032677947, 1.64874719617122, 1.71994456839498, 
1.52466258015572, 2.00665769675817, 1.5300656898625, 1.56161480192207, 
1.50940919433804, 1.58004290205282, 1.51336116478672, 1.52644487316346, 
1.48834327809579, 1.46930588866215, 1.36631372054513, 1.44059995744445, 
1.5180608970494, 1.41812200439152, 1.49925405818079, 1.31689765959184, 
1.21907772239039, 1.30753278259585, 1.57082329051166, 1.49143144852514, 
1.38216169956339, 1.44188483722575, 1.13440481876605, 1.35646235618919, 
1.47180541887913, 1.40450054293681, 1.27349389048426, 1.29528063459954, 
1.19942539235927, 1.30759474350784, 1.22991541621001, 1.20042686177282, 
1.24226428750839, 1.31647286005394, 1.36705779079172, 1.17554267519, 
1.16899741356497, 1.43296930422939, 1.21387581786617, 1.38582954148396, 
1.28783359001873, 1.21384290720918, 1.3800531898995, 1.42612748362677, 
1.38707139063945, 1.35097838761857, 1.55653136781291, 1.56210596088512, 
1.46432027353204, 1.73308661593706, 1.81117569489508, 1.95852654293544, 
1.93756438714927, 1.66868242849234, 1.73257095484375, 1.61222759440404, 
1.6253033081607, 1.48045275517727, 1.55148523707946, 1.72319831625545, 
1.56521419500555, 1.71621780116811, 1.60341610957045, 1.42595759455634, 
1.38612153699359, 1.31866799689145, 1.24054071692306, 1.51727627953577, 
1.46369642055739, 1.62365073651995, 1.48616110223811, 1.45200956674952, 
1.57953386960066, 1.35567084839722, 1.31102635489236, 1.18306026521603, 
1.34712117984483, 1.48011116841267, 1.31952399321848, 1.28955782668141, 
1.2847877566953, 1.38781149288866, 1.14273437468433, 1.32920929503918, 
1.2625010370967, 1.24823649783691, 1.32533796650272, 1.15633519566998, 
1.35643701026316, 1.26623559190508, 1.2779230360069, 1.39393252918254, 
1.34649076777916, 1.57333920622946, 1.56022699221392, 1.37469189146892, 
1.40291710899725, 1.4280748466788, 1.54743544515232, 1.76335619076727, 
1.64154787821095, 1.59923604539536, 1.87343400218089, 1.64411494469552, 
2.06337974786263, 1.79679434639182, 1.6449219227191, 1.60191406672643, 
1.80552930592686, 1.71079029585556, 1.76324045494208, 1.75172505216794, 
1.75498018535201, 1.50968674188647, 1.71003279170095, 1.54195285019629, 
1.5921076852091, 1.58468618049952, 1.40466129233649, 1.51748231420534, 
1.70208735303382, 1.85091455978708, 1.74926589474411, 1.32473081658656, 
1.48632510896193, 1.49174172910242, 1.27765485727251, 1.42447037214314, 
1.79061536646697, 1.62876010610048, 1.3411075302794, 1.43372361571107, 
1.30745132006153, 1.11947181750234, 1.3814366092412, 1.46127530431355, 
1.29002256883274, 1.2398180314717, 1.39955595479061, 1.23212239602012, 
1.42879839332803, 1.40682720430913, 1.72735766769174, 1.24631738756635, 
1.3074957545204, 1.33889060033108, 1.3199907585375, 1.45011590182187, 
1.47464294283024, 1.63025156324233, 1.43518657525572, 1.70101536967133, 
1.53879821021698, 1.66216734455997, 1.71016735176043, 1.59135918593749, 
1.90867635488099, 1.81890270995124, 1.91974059472128, 1.99327137052795, 
1.93435630760761, 1.70498489998951, 1.94617134446234, 1.79821897461961, 
1.70860912987869, 1.62583692511332, 1.70284656450383, 1.75349832294427, 
1.55992661541648, 1.64923355919767, 1.58374450488874, 1.43099121772556, 
1.6720951989917, 1.63569433069745, 1.56297225511903, 1.37234218101439, 
1.62846496684787, 1.45468005665216, 1.46447402492545, 1.49422003453774, 
1.36416239454076, 1.26665784696327, 1.42621220161668, 1.31906561671418, 
1.56293535063656, 1.41124392931084, 1.48256373828037, 1.5517304198153, 
1.46775941254522, 1.33131685935843, 1.44720659024551, 1.37716132331258, 
1.52882131002331, 1.49772761055849, 1.49598881675741, 1.4436527605176, 
1.49981336417591, 1.48315006689715, 1.51558578205289, 1.42774117545654, 
1.53336088586741, 1.48915800705672, 1.32963057893323, 1.66758248928613, 
1.9868088974245, 1.6013366125517, 1.92183146143593, 2.03298402768403, 
2.04942459105045, 2.05963631611094, 1.94588422660773, 1.96201751711786, 
1.98861196288086, 1.81221909789435, 1.90862937715181, 1.88729319156364, 
1.73284989132999, 1.67492630039261, 1.96341445187649, 1.65044353051518, 
1.56141975468635, 1.53636815843546, 1.51077277695073, 1.69938051312308, 
1.73580450473473, 1.57871171461789, 1.71561146182678, 1.62626622286327, 
1.56926630931672, 1.60160751499099, 1.59430576978708, 1.70308313972817, 
1.60980830874125, 2.10166142081649, 1.70173661542508, 1.70249301702887, 
1.39310069185542, 1.26684098191883, 1.41331947258224, 1.38942452543244, 
1.49664320528791, 1.2603789824492, 1.39558797557555, 1.43708777666337, 
1.59937247371089, 1.52662972813667, 1.55327235669201, 1.36463952847974, 
1.48182556775146, 1.41239170175741, 1.37685943760087, 1.45727766276937, 
1.3575730455734, 1.44108325750848, 1.50900111871182, 1.77063869557307, 
1.97657756150903, 1.8883656161968, 1.95877784079574, 2.24763616539836, 
2.16270152004098, 2.09783115130459, 2.27274727762128, 2.46035830600469, 
2.02426295139333, 2.39422867018116, 1.92394855093771, 2.17810121167828, 
2.00612799115504, 1.78424713919667, 2.0400432816189, 1.79256103489444, 
1.91409478802101, 1.73150194252708, 1.6912418337357, 1.75277447327501, 
1.85091969553689, 1.56679068785437, 1.65536989557469, 1.60663371555237, 
2.01662556616972, 1.67081920134216, 1.66342984073245, 1.89203261364869, 
1.9256563676998, 2.17669290665916, 1.96119726513824, 1.77590077949215, 
1.91168033977827, 1.85350279506872, 1.8255506513295, 1.53524437633556, 
1.5037629155505, 1.45284901948611, 1.49595326670589, 1.4695719361713, 
1.52677696453418, 1.51511572943787, 1.51445931388811, 1.49460322790447, 
1.51590263465507, 1.50960240456058, 1.52183314361577, 1.48856608717673, 
1.52846450808968, 1.51187726330412, 1.45224630607792, 1.57541341759254, 
1.65339389119157, 1.5530652177716, 1.64143401336441, 1.66086748404603, 
1.66333965982899, 1.6648278163954, 1.64606881590278, 1.64903884298644, 
1.65370103757561, 1.63443933109847, 1.62254021970341, 1.56240224916707, 
1.56597342109254, 1.51312911363279, 1.65650202008756, 1.56305567273967, 
1.53024422924801, 1.49368346466127, 1.51304956503642, 1.5722161158099, 
1.59860807361417, 1.50182216056083, 1.60084661780045, 1.55037806375165, 
1.45223419121299, 1.62816386241768, 1.73158686552491, 1.62782065585366, 
1.71864664092992, 1.78757241857122, 1.76051806693622, 1.76237510711282, 
1.67273139326038, 1.63774229399432, 1.69105865330396), key = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("actual", 
"predict"), class = "factor")), row.names = c(NA, -416L), index_quo = ~index, index_time_zone = "UTC", class = c("tbl_time", 
"tbl_df", "tbl", "data.frame"))

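EDIT:

The data start on 2005-01-31 and end on 2012-01-16, at weekly intervals. There are 364 weeks in the model (364/52 = 7 years). I trained the model on the first 6 years (2005-01-31 to 2011-01-17) and tested it on the last year (from the week of 2011-01-24 to 2012-01-16).

I have the predictions for that last year, and I also have the actual values for the same period. I am trying to calculate the RMSE of my predictions over those last 52 weeks.

EDIT 2:

So basically, after looking at the rmse_calculation table (row 364), I am trying to "push up" the predict column and then drop all the NA values in the predict column. That would leave me with just 52 observations, from which I can calculate the RMSE for the 52 weeks.

EDIT 3:

Filling in the IRI_KEY column is not that important.

Best answer

It looks like we can safely discard IRI_KEY in order to spread the key/value pairs by index. With that, we can do either a left join or a spread to get effectively the same association: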

df %>%
  select(-IRI_KEY) %>%
  spread(key, value) %>%
  filter(complete.cases(.))
# # A tibble: 52 x 3
#    index      actual predict
#    <date>      <dbl>   <dbl>
#  1 2011-01-24   1.39    1.54
#  2 2011-01-31   1.50    1.50
#  3 2011-02-07   1.26    1.45
#  4 2011-02-14   1.40    1.50
#  5 2011-02-21   1.44    1.47
#  6 2011-02-28   1.60    1.53
#  7 2011-03-07   1.53    1.52
#  8 2011-03-14   1.55    1.51
#  9 2011-03-21   1.36    1.49
# 10 2011-03-28   1.48    1.52
# # ... with 42 more rows
df %>%
  select(-IRI_KEY) %>%
  spread(key, value) %>%
  filter(complete.cases(actual, predict)) %>%
  with(., ModelMetrics::rmse(actual, predict))
# [1] 0.3130566
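
To see why IRI_KEY is dropped before spreading, here is a small sketch on made-up data (the `toy` tibble and its values are hypothetical, not taken from the question). It assumes dplyr and tidyr are installed. Because spread keeps one row per combination of the remaining columns, the NA keys on the predict rows stop them from lining up with the actual rows:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with the same shape as df:
# actual rows carry an IRI_KEY, predict rows have NA instead.
toy <- tibble(
  IRI_KEY = c(1, 1, 1, NA, NA),
  index   = as.Date(c("2011-01-10", "2011-01-17", "2011-01-24",
                      "2011-01-17", "2011-01-24")),
  value   = c(1.0, 1.2, 1.4, 1.1, 1.5),
  key     = factor(c("actual", "actual", "actual", "predict", "predict"))
)

# Keeping IRI_KEY: (IRI_KEY, index) differs between actual and predict rows,
# so every row ends up with an NA in one of the two value columns.
misaligned <- toy %>%
  spread(key, value)

# Dropping IRI_KEY: rows are identified by index alone, so matching dates align.
aligned <- toy %>%
  select(-IRI_KEY) %>%
  spread(key, value)

nrow(misaligned)               # one row per (IRI_KEY, index) pair
nrow(aligned)                  # one row per date
sum(complete.cases(aligned))   # dates that have both an actual and a predict
```

In the misaligned result no row has both values, which is consistent with the NaN described in the question.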

We have to use filter(complete.cases(actual, predict)) because rmse does not like NA values, and it does not accept the na.rm=TRUE argument that is standard in many other R functions.
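
If you would rather not filter first, the RMSE formula is short enough to write by hand in base R. The sketch below is only an illustration: `rmse_complete` is a hypothetical helper name, not part of ModelMetrics, and it simply drops any pair in which either value is NA before averaging.

```r
# Hypothetical helper (not a ModelMetrics function): RMSE over the pairs
# where both the actual and the predicted value are present.
rmse_complete <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  keep <- complete.cases(actual, predicted)   # TRUE where neither value is NA
  sqrt(mean((actual[keep] - predicted[keep])^2))
}

# Made-up values: only the first two pairs are complete.
actual    <- c(1.0, 2.0, NA, 4.0)
predicted <- c(1.5, 2.0, 3.0, NA)

result <- rmse_complete(actual, predicted)
result   # sqrt(mean(c(0.25, 0))) = sqrt(0.125)
```

Silently dropping incomplete pairs is a design choice. If missing values would indicate a problem in your pipeline, an explicit filter step makes that easier to notice.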


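The downside of this spread approach is that it discards your IRI_KEY because (as @MrFlick highlighted) the key was not carried through your prediction step. An alternative is to left-join your predicted values onto the actual rows that share the same index: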

df %>%
  filter(key == "predict") %>%
  select(index, value) %>%
  left_join(filter(df, key == "actual"), by="index") %>%
  rename(actual = value.y, predict = value.x)
# # A tibble: 52 x 5
#    index      predict IRI_KEY actual key   
#    <date>       <dbl>   <dbl>  <dbl> <fct> 
#  1 2011-01-24    1.54  648459   1.39 actual
#  2 2011-01-31    1.50  648459   1.50 actual
#  3 2011-02-07    1.45  648459   1.26 actual
#  4 2011-02-14    1.50  648459   1.40 actual
#  5 2011-02-21    1.47  648459   1.44 actual
#  6 2011-02-28    1.53  648459   1.60 actual
#  7 2011-03-07    1.52  648459   1.53 actual
#  8 2011-03-14    1.51  648459   1.55 actual
#  9 2011-03-21    1.49  648459   1.36 actual
# 10 2011-03-28    1.52  648459   1.48 actual
# # ... with 42 more rows

This lets us use the rmse function in the same way:

df %>%
  filter(key == "predict") %>%
  select(index, value) %>%
  left_join(filter(df, key == "actual"), by="index") %>%
  rename(actual = value.y, predict = value.x) %>%
  with(., ModelMetrics::rmse(actual, predict))
# [1] 0.3130566

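Note: I did not start with this approach because its output suggests that I know the predicted values are associated with that IRI_KEY value, which I do not (only you do). If you are not sure that the date alone is enough to identify the key, then this approach is wrong and may (or will) lead to incorrect inferences later in your analysis pipeline.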
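
One way to sanity-check the join before trusting it is sketched below on made-up data (the `toy` tibble is hypothetical and dplyr is assumed to be installed). It checks two things: that no date appears more than once within a key, and that every predicted date has a matching actual date. Passing both checks only shows that the join by index is one-to-one. It does not prove that the predictions belong to a particular IRI_KEY.

```r
library(dplyr)

# Hypothetical toy data: actuals for three weeks, predictions for three weeks,
# with the last predicted week having no actual value.
toy <- tibble(
  index = as.Date(c("2011-01-10", "2011-01-17", "2011-01-24",
                    "2011-01-17", "2011-01-24", "2011-01-31")),
  value = c(1.0, 1.2, 1.4, 1.1, 1.5, 1.3),
  key   = factor(c("actual", "actual", "actual",
                   "predict", "predict", "predict"))
)

# Check 1: dates repeated within a key would make the join one-to-many.
dup_counts <- toy %>%
  count(key, index) %>%
  filter(n > 1)

# Check 2: predicted dates with no actual counterpart would become NA after the join.
unmatched <- toy %>%
  filter(key == "predict") %>%
  anti_join(filter(toy, key == "actual"), by = "index")

nrow(dup_counts)   # 0 means every date is unique within its key
unmatched$index    # predicted dates that have no actual value
```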

Regarding "r - Calculating the RMSE when my data is not aligned correctly", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54296434/
