r - ExpressionSet - phenoData

Tags: r bioconductor

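I should say first that I have only just started programming in R. I cannot create an ExpressionSet from my data. When I try to combine the assayData and phenoData into an ExpressionSet, I get this error:

Error in validObject(.Object) : invalid class "ExpressionSet" object: sampleNames differ between assayData and phenoData

Please take a look at the sample data, the phenoData table I made, and the R program. I think the phenoData needs to be modified for this to work.

Please tell me how to solve this problem and how to change the phenoData.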

AssayData                                                       
    0h-1    0h-2    6h-1    6h-2    12h-1   12h-2   24h-1   24h-2   48h-1   48h-2   72h-1   72h-2   96h-1   96h-2
171407  4.021342514 4.021342514 6.847201005 6.847201005 3.189312274 3.189312274 3.322687671 3.322687671 4.929574559 4.929574559 4.040127938 4.040127938 3.181587044 3.181587044
171415  267.8091012 267.8091012 358.8511895 358.8511895 266.4562608 266.4562608 210.259177  210.259177  243.1496956 243.1496956 248.2780935 248.2780935 235.7079055 235.7079055
171426  13.3620332  13.3620332  5.581083074 5.581083074 12.5236932  12.5236932  8.433621131 8.433621131 13.07390505 13.07390505 12.94673202 12.94673202 23.43214156 23.43214156
171453  37.65310777 37.65310777 27.88942772 27.88942772 54.7409581  54.7409581  78.86045287 78.86045287 63.61655487 63.61655487 67.31327606 67.31327606 62.35426899 62.35426899

PhenoData                                                       
        condition   time    rep                                         
0h-1    Control 0   1                                           
0h-2    Control 0   2                                           
6h-1    treatment   6   1                                           
6h-2    treatment   6   2                                           
12h-1   treatment   12  1                                           
12h-2   treatment   12  2                                           
24h-1   treatment   24  1                                           
24h-2   treatment   24  2                                           
48h-1   treatment   48  1                                           
48h-2   treatment   48  2                                           
72h-1   treatment   72  1                                           
72h-2   treatment   72  2                                           
96h-1   treatment   96  1                                           
96h-2   treatment   96  2   

My code:

library("Biobase")
library("betr")
exprs <- as.matrix(read.table("Timecourse-Assaydata.txt", header=TRUE, sep="\t", row.names=1, as.is=TRUE))
pData <- read.table("Timecourse-Phenodata.txt", row.names=1, header=TRUE, sep="\t")
metadata <- data.frame(labelDescription = c("Hour of treatment", "Treatment time", "number of replicates"), row.names = c("condition", "time", "rep"))
phenoData <- new("AnnotatedDataFrame", data = pData, varMetadata = metadata)

exprspop <- new("ExpressionSet", exprs = exprs, phenoData = phenoData)

Error in validObject(.Object) : invalid class "ExpressionSet" object: sampleNames differ between assayData and phenoData

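Best answer

The right place for this question is the Bioconductor support site. It is best to provide a reproducible example that captures the essence of the problem; building one often helps to pin down the cause.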

library(Biobase)

exprs <- matrix(0, nrow=5, ncol=3,
                dimnames=list(letters[1:5], LETTERS[1:3]))
pData <- data.frame(id=c("foo", "bar", "baz"),
                    row.names=c("x", "y", "z"))
phenoData <- AnnotatedDataFrame(data=pData)

results in

> ExpressionSet(exprs, phenoData=phenoData)
Error in validObject(.Object) : 
  invalid class "ExpressionSet" object: sampleNames differ between assayData and
phenoData

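The problem is that the colnames of exprs (i.e., the names of the samples in the experiment) differ from the row.names of pData (i.e., the descriptions of the samples):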

> row.names(pData)
[1] "x" "y" "z"
> colnames(exprs)
[1] "A" "B" "C"
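
With more samples than in this toy example, it can help to list exactly which names disagree before building the object. The following is a minimal base-R sketch of such a check; report_name_mismatch is a hypothetical helper name introduced here for illustration and is not part of Biobase.

```r
# Hypothetical helper (not part of Biobase): list the sample names that
# appear on only one side, and whether names and order match exactly.
report_name_mismatch <- function(exprs, pData) {
  list(
    only_in_exprs = setdiff(colnames(exprs), row.names(pData)),
    only_in_pData = setdiff(row.names(pData), colnames(exprs)),
    same_order    = identical(colnames(exprs), row.names(pData))
  )
}

# Same toy data as in the example above
exprs <- matrix(0, nrow = 5, ncol = 3,
                dimnames = list(letters[1:5], LETTERS[1:3]))
pData <- data.frame(id = c("foo", "bar", "baz"),
                    row.names = c("x", "y", "z"))

mismatch <- report_name_mismatch(exprs, pData)
print(mismatch)
```

Here every name is reported as present on one side only, which matches the error message.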

The solution is to make them the same:

> colnames(exprs) <- row.names(pData)
> eset <- ExpressionSet(exprs, phenoData=phenoData)
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 5 features, 3 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: x y z
  varLabels: id
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:  
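
For data like that in the question, one plausible cause (an assumption, since the question does not show colnames(exprs)) is that read.table() with header=TRUE passes the column headers through make.names() by default (check.names=TRUE). A header such as 0h-1 then becomes X0h.1, while the row names of the phenoData table are left as written. The sketch below uses short inline text as a stand-in for the two tab-separated files and shows that check.names=FALSE keeps the headers unchanged.

```r
# Inline stand-ins for the two tab-separated files (abbreviated to two
# samples and rounded values; the real files are assumed to look similar).
assay_txt <- "\t0h-1\t0h-2\n171407\t4.02\t4.02\n171415\t267.81\t267.81\n"
pheno_txt <- "\tcondition\ttime\trep\n0h-1\tControl\t0\t1\n0h-2\tControl\t0\t2\n"

# Default behaviour: headers are made syntactically valid R names
mangled <- as.matrix(read.table(text = assay_txt, header = TRUE,
                                sep = "\t", row.names = 1))

# check.names = FALSE keeps the headers exactly as they are in the file
kept <- as.matrix(read.table(text = assay_txt, header = TRUE,
                             sep = "\t", row.names = 1,
                             check.names = FALSE))

pData <- read.table(text = pheno_txt, header = TRUE,
                    sep = "\t", row.names = 1)

colnames(mangled)                            # "X0h.1" "X0h.2"
colnames(kept)                               # "0h-1" "0h-2"
identical(colnames(kept), row.names(pData))  # TRUE
```

If this is indeed the cause, reading the assay file with check.names=FALSE would be an alternative to overwriting colnames(exprs) afterwards.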

Other elements can be added to an existing ExpressionSet with the assayDataElement<- replacement function, e.g.,

> assayDataElement(eset, "foo") <- sqrt(exprs)
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 5 features, 3 samples 
  element names: exprs, foo 
protocolData: none
phenoData
  sampleNames: x y z
  varLabels: id
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:  

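Or start from scratch. Note that the output below reports phenoData: none, i.e. the pData= argument in this call did not populate the sample annotation; pass an AnnotatedDataFrame via phenoData= as in the earlier example if it is needed.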

> env = new.env()
> env$exprs = exprs
> env$sqrt = sqrt(exprs)
> lockEnvironment(env)
> ExpressionSet(env, pData=pData)
ExpressionSet (storageMode: environment)
assayData: 5 features, 3 samples 
  element names: exprs, sqrt 
protocolData: none
phenoData: none
featureData: none
experimentData: use 'experimentData(object)'
Annotation: 

Regarding r - ExpressionSet - phenoData, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/7363991/
