r - Creating a sequence object from SPELL data

Tags: r, traminer

I am trying to use seqdef to create a sequence object from data in SPELL format. Here is a sample of my data:

spell <- structure(list(ID = c(1, 3, 3, 4, 5, 5, 6, 8, 9, 10, 11, 11, 
12, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 
15, 15, 15, 15, 15, 16, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 
19), status = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 2, 3, 1, 2, 3, 2, 3, 1, 1, 1, 3, 1, 3, 3, 1, 3, 1, 1, 1, 
1, 1, 3, 3, 1, 3, 1, 1, 1), time1 = c(1, 1, 57, 1, 1, 91, 1, 
1, 1, 1, 1, 104, 1, 1, 60, 109, 121, 1, 42, 47, 54, 64, 72, 78, 
85, 116, 1, 29, 39, 69, 74, 78, 88, 1, 16, 40, 68, 1, 30, 123, 
1, 39, 51, 1, 61), time2 = c(125, 57, 125, 125, 91, 125, 125, 
125, 125, 125, 104, 125, 125, 60, 109, 121, 125, 42, 47, 54, 
64, 72, 78, 85, 116, 125, 29, 39, 69, 74, 78, 88, 125, 16, 40, 
68, 125, 30, 123, 125, 39, 51, 125, 61, 125)), .Names = c("ID", 
"status", "time1", "time2"), row.names = c(NA, 45L), class = "data.frame")

When I try to define the sequence object, a strange error is thrown:
spell.seq <- seqdef(data=spell, informat="SPELL", id="ID", begin="time1", end="time2", 
                    status="status", limit=125,process=FALSE)

 [>] time axis: 1 -> 125
 [>] SPELL data converted into 17 STS sequences
 [>] 3 distinct states appear in the data: 
     1 = 1
     2 = 2
     3 = 3
 [>] state coding:
       [alphabet]  [label]  [long label] 
     1  1           1        1
     2  2           2        2
     3  3           3        3
 [>] 17 sequences in the data set
 [>] min/max sequence length: 125/125
Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  invalid 'row.names' length

However, if I do the same thing indirectly through seqformat, keeping the same arguments, no error is thrown:
sts <- seqformat(data=spell,from="SPELL",to="STS",
                 id="ID",begin="time1",end="time2",status="status",
                 limit=125,process=FALSE)

seqs <- seqdef(sts,right="DEL")

I am using TraMineR 1.8-5 and R 3.0.0 on Windows 7 64-bit. Is this a bug, or am I doing something wrong? Thanks in advance.

Best answer

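A quick look at the source of seqdef(), specifically at how the row.names are set, shows that they are set from the value of the id argument.

The entry for id in ?seqdef reads: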

id
optional argument for setting the rownames of the sequence object. If NULL (default), the rownames are taken from the input data. If set to "auto", sequences are numbered from 1 to the number of sequences. A vector of rownames of length equal to the number of sequences may be specified as well.



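In the example from your question you pass id="ID", which does not meet these criteria. Changing this to id=NULL allows the command to complete as expected, and checking for equality with identical(spell.seq, seqs) yields TRUE.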
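
The row-name mechanism behind the error can be reproduced in base R without TraMineR. The following is only a sketch under assumptions: `seqs_like` is a hypothetical stand-in for the 17-row sequence object, and `try_rownames` is a helper invented for illustration, not part of TraMineR. It shows that a single string is rejected as row names for 17 rows, while a vector with one name per row is accepted.

```r
# Hypothetical stand-in for the 17-sequence object; the error in the question
# comes from `row.names<-.data.frame`, so a plain data frame is enough here.
seqs_like <- data.frame(t1 = rep(1, 17), t2 = rep(1, 17))

# Illustrative helper (not a TraMineR function): try to assign row names
# and return "ok" on success or the error message on failure.
try_rownames <- function(df, id) {
  tryCatch({
    row.names(df) <- id
    "ok"
  }, error = function(e) conditionMessage(e))
}

# A single string, which is what id = "ID" amounts to when used as row names
res_string <- try_rownames(seqs_like, "ID")

# One name per sequence, as the help page for the id argument requires
res_vector <- try_rownames(seqs_like, paste0("case", 1:17))

print(res_string)
print(res_vector)
```

The first call reports a row-name length error and the second succeeds, which is consistent with the documented requirement that `id` be NULL, "auto", or a vector whose length equals the number of sequences.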

Regarding "r - Creating a sequence object from SPELL data", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16110508/
