r - How does order() work on character variables in data.table?

Tags: r data.table

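I want to reorder the rows of a table by a character variable (stage in my example). If I first save the desired order ( order(dt1$stage) ) to a variable and then apply it as dt1[myorder, stage], it works fine. But when I try to do the same thing inline, as in dt1[order(dt1$stage), ], the order is different! I must be missing something very basic...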

library(data.table)

dt1 <- fread('
id stage pos
 1 I       1
 2 II      2
 3 III     5
 4 IV      6
 5 IIa     3
 6 IIb     7
 7 IIIa    8
 8 IIIb    4
 9 IVa     9
10 IVb    10')

sort(dt1$stage) # OK
# I II IIa IIb III IIIa IIIb IV IVa IVb

myorder <- order(dt1$stage)
dt1[myorder         , stage] # OK
# I II IIa IIb III IIIa IIIb IV IVa IVb

dt1[order(dt1$stage), stage] # different!
# I II III IIIa IIIb IIa IIb IV IVa IVb

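Best answer

Inside dt1[...], the call to order() is replaced by data.table's fast ordering rather than base::order. The precomputed myorder was produced by base::order outside of [ ], which is why it gives the locale-aware result. According to ?data.table::order: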

Note that queries like x[order(.)] are optimised internally to use data.table's fast order.

Also note that data.table always reorders in "C-locale" (see Details). To sort by session locale, use x[base::order(.)].

data.table implements its own fast radix-based ordering.

data.table always reorders in "C-locale". As a consequence, the ordering may be different to that obtained by base::order. In English locales, for example, sorting is case-sensitive in C-locale. Thus, sorting c("c", "a", "B") returns c("B", "a", "c") in data.table but c("a", "B", "c") in base::order.
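To see the C-locale effect without data.table, here is a minimal sketch in base R. It relies on base R's radix method, which also compares strings in C-locale, as a stand-in for data.table's internal ordering; it is not data.table's own code. The stage vector is copied from the question, and the variable names are only illustrative.

```r
# Stage labels from the question, in their original row order.
stage <- c("I", "II", "III", "IV", "IIa", "IIb", "IIIa", "IIIb", "IVa", "IVb")

# method = "radix" orders character vectors in C-locale (byte order),
# where every uppercase ASCII letter sorts before every lowercase one.
c_locale <- stage[order(stage, method = "radix")]
print(c_locale)
# "I" "II" "III" "IIIa" "IIIb" "IIa" "IIb" "IV" "IVa" "IVb"

# The example from the documentation behaves the same way.
abc <- c("c", "a", "B")
abc_c_locale <- abc[order(abc, method = "radix")]
print(abc_c_locale)
# "B" "a" "c"
```

In byte order "IIIb" comes before "IIa" because the third character "I" has a lower code than "a", which matches the inline result shown in the question.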

If we want to reproduce the result of base sort, use base::order explicitly:

dt1[base::order(stage)]$stage
#[1] "I"    "II"   "IIa"  "IIb"  "III"  "IIIa" "IIIb" "IV"   "IVa"  "IVb" 

Source: this question on Stack Overflow: https://stackoverflow.com/questions/65432097/
