r - How does order() work on character variables in data.table?

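I want to reorder the rows of a table by a character variable (stage in my example). If I first save the desired order (order(dt1$stage)) to a variable and then apply it as dt1[myorder, stage], it works as expected. But when I try to do the same thing inline, as dt1[order(dt1$stage), stage], the order is different! I must be missing something very basic...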

dt1 <- fread('
id stage pos
 1 I       1
 2 II      2
 3 III     5
 4 IV      6
 5 IIa     3
 6 IIb     7
 7 IIIa    8
 8 IIIb    4
 9 IVa     9
10 IVb    10')

sort(dt1$stage) # OK
# I II IIa IIb III IIIa IIIb IV IVa IVb

myorder <- order(dt1$stage)
dt1[myorder         , stage] # OK
# I II IIa IIb III IIIa IIIb IV IVa IVb

dt1[order(dt1$stage), stage] # different!
# I II III IIIa IIIb IIa IIb IV IVa IVb

Best answer

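The inline call uses data.table's fast order, not base::order. The saved myorder, by contrast, was computed outside [ ] by base::order, which is why the two results differ. According to ?data.table::order: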

Note that queries like x[order(.)] are optimised internally to use data.table's fast order.

Also note that data.table always reorders in "C-locale" (see Details). To sort by session locale, use x[base::order(.)].

data.table implements its own fast radix-based ordering.

data.table always reorders in "C-locale". As a consequence, the ordering may be different to that obtained by base::order. In English locales, for example, sorting is case-sensitive in C-locale. Thus, sorting c("c", "a", "B") returns c("B", "a", "c") in data.table but c("a", "B", "c") in base::order.
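
To see why "III" lands before "IIa" in C-locale, it may help to look at the character codes. The following is a minimal sketch in base R only; it assumes that sort(..., method = "radix") is an acceptable stand-in for data.table's ordering, since that method also sorts character vectors in C-locale. The names stage, c_locale_sorted and doc_example are made up for illustration.

```r
# Stage labels from the question, in their original row order.
stage <- c("I", "II", "III", "IV", "IIa", "IIb", "IIIa", "IIIb", "IVa", "IVb")

# In C-locale, strings are compared by character code, so every ASCII
# uppercase letter sorts before every ASCII lowercase letter.
utf8ToInt("I")  # 73
utf8ToInt("a")  # 97

# method = "radix" makes base R sort character vectors in C-locale as well.
c_locale_sorted <- sort(stage, method = "radix")
print(c_locale_sorted)
# "I" "II" "III" "IIIa" "IIIb" "IIa" "IIb" "IV" "IVa" "IVb"

# The example from the help page, reproduced without data.table.
doc_example <- sort(c("c", "a", "B"), method = "radix")
print(doc_example)
# "B" "a" "c"
```

Because "I" (73) is smaller than "a" (97), "III" and "IIIa" compare as smaller than "IIa" at the third character, which matches the unexpected output in the question.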

If we want to reproduce the result of base sort, use base::order:

dt1[base::order(stage)]$stage
#[1] "I"    "II"   "IIa"  "IIb"  "III"  "IIIa" "IIIb" "IV"   "IVa"  "IVb"
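
The three ways of ordering discussed here can be put side by side. This is only a sketch: it rebuilds the stage column with data.table() instead of fread(), and the names dt, inline_order, idx, precomputed_order and base_order are made up for illustration.

```r
library(data.table)

# Same stage values as dt1 in the question.
dt <- data.table(
  id    = 1:10,
  stage = c("I", "II", "III", "IV", "IIa", "IIb", "IIIa", "IIIb", "IVa", "IVb")
)

# order() written directly inside [ ] is replaced by data.table's fast order,
# which always sorts in C-locale.
inline_order <- dt[order(stage), stage]

# order() evaluated outside [ ] is plain base::order and follows the session locale.
idx <- order(dt$stage)
precomputed_order <- dt[idx, stage]

# Namespacing the call inside [ ] opts out of the optimisation.
base_order <- dt[base::order(stage), stage]

print(inline_order)
print(identical(precomputed_order, base_order))
```

The last line should print TRUE in any locale, since both results come from base::order. Whether they also match inline_order depends on the session locale: in a C-locale session all three agree.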

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/65432097/
