Reading a series matrix correctly in R

I downloaded GSE60341_series_matrix.txt.gz (found here). When I read it into R as a table with

x <- read.table("GSE60341_series_matrix.txt", fill = TRUE)

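all the information comes in row-wise. In other words, I get a matrix of size
(42977 rows and 3 columns), whereas the number of samples should be 1951.
So ideally I should get a table with 1951 rows and k columns, one row per sample and one column per sample attribute.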

Opening the text file shows me:

    sapiens"    "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"  "Homo sapiens"
    !Sample_title   "20120811_NC18_NC18_01" "20120811_NC18_NC18_02" "20120811_NC18_NC18_03" "20120811_NC18_NC18_04" "20120811_NC18_NC18_05"
    !Sample_characteristics_ch1 "stimulation: Unstim"   "stimulation: Activated"    "stimulation: IFNb" "stimulation: Unstim"   "stimulation: Activated"    "stimulation: IFNb" "stimulation: Unstim"   "stimulation: Activated"    "stimulation: IFNb" "stimulation: Unstim"   "stimulation: Activated"    "stimulation: IFNb" "stimulation: Unstim"   "stimulation: Activated"    "stimulation: IFNb" "stimulation: Unstim"   "stimulation: Activated"    "stimulation: IFNb" "stimulation: Unstim"   "stimulation: Activated"

"lane: 9"   "lane: 11"  "lane: 12"  "lane: 1"   "lane: 2"   "lane: 3"   "lane: 4"   "lane: 5"   "lane: 6"   "lane: 7"   "lane: 8"   "lane: 9"   "lane: 10"  "lane: 11"  "lane: 12"  "lane: 1"   "lane: 2"   "lane: 3"

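The information in each category (lane, stimulation, Sample_title) is laid out in rows, but I want the categories in columns. Can I get a table where the rows represent samples and the columns represent, say, [Sample_title, stimulation]?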

Best answer

read.table is meant for reading generic ASCII table formats, whereas this file is in a special format used by the NCBI Gene Expression Omnibus (GEO).
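To make the format difference concrete: the per-sample metadata in a series matrix file sits in lines that start with !Sample_, with one tab-separated, quoted value per sample. Below is a minimal base-R sketch of turning those lines into a table with one row per sample. The demo_lines content is a made-up miniature, not the real GSE60341 file, and the sketch assumes every !Sample_ line carries the same number of values. It only illustrates the layout; it is not a substitute for a proper GEO parser.

```r
# Hypothetical miniature of a series matrix file (NOT the real GSE60341 data):
# tab-separated "!Sample_" lines with one quoted value per sample.
demo_lines <- c(
  "!Series_title\t\"demo series\"",
  "!Sample_title\t\"s01\"\t\"s02\"\t\"s03\"",
  "!Sample_characteristics_ch1\t\"stimulation: Unstim\"\t\"stimulation: Activated\"\t\"stimulation: IFNb\"",
  "!Sample_characteristics_ch1\t\"lane: 1\"\t\"lane: 2\"\t\"lane: 3\"",
  "!series_matrix_table_begin"
)
path <- tempfile(fileext = ".txt")
writeLines(demo_lines, path)

# Keep only the per-sample header lines and split them on tabs.
lines <- readLines(path)
sample_rows <- strsplit(grep("^!Sample_", lines, value = TRUE), "\t", fixed = TRUE)

# The first field is the label; the remaining fields are the values, one per sample.
# make.unique() disambiguates repeated labels such as characteristics_ch1.
labels <- make.unique(sub("^!Sample_", "", vapply(sample_rows, `[`, character(1), 1)))
values <- lapply(sample_rows, function(r) gsub("\"", "", r[-1], fixed = TRUE))

# Each header line becomes a column, so each sample becomes a row.
samples <- as.data.frame(setNames(values, labels), stringsAsFactors = FALSE)
print(samples)
```

With the demo input this gives a data frame with three rows (one per sample) and the columns title, characteristics_ch1 and characteristics_ch1.1.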

Here is what you need to do:

Install the GEOquery package, which reads GEO files, by pasting this code into R:

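# biocLite() has been retired; current Bioconductor releases install via BiocManager.
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("GEOquery")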

Load the package into memory with the following line:

library("GEOquery")

Edit the following line so that the full path to your file is inside the quotes, then run it to read the data into memory as the object gse:

gse <- getGEO(filename = "~/Downloads/GSE60341_series_matrix.txt.gz")

Now, if you run View(gse), you will see a nicely formatted table in gse with 1950 rows.
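If you want the sample table as an ordinary data frame, a sketch along the following lines may help. It assumes getGEO() returned an ExpressionSet, which is what it gives for a series matrix file as far as I know, so the per-sample metadata is available through Biobase::pData(). The column names title and characteristics_ch1 are assumptions about how GEOquery labels the !Sample_ fields; check colnames(samples) on your own object. The guard at the top simply skips the example when the package or the file is not present.

```r
path <- path.expand("~/Downloads/GSE60341_series_matrix.txt.gz")  # adjust to your system

# Run only when GEOquery is installed and the file is actually there.
if (requireNamespace("GEOquery", quietly = TRUE) && file.exists(path)) {
  gse <- GEOquery::getGEO(filename = path)

  # One row per sample, one column per "!Sample_" field.
  samples <- Biobase::pData(gse)

  # Assumed column names: "title" plus every characteristics_ch1 column
  # (stimulation, lane, ...). Inspect colnames(samples) to confirm.
  keep <- c("title", grep("^characteristics_ch1", colnames(samples), value = TRUE))
  print(head(samples[, keep]))
}
```

Selecting columns by pattern rather than by exact name avoids guessing which characteristics_ch1 column holds stimulation and which holds lane.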

See the GEOquery documentation for more information.

For more on reading a series matrix correctly in R, see the similar question on Stack Overflow: https://stackoverflow.com/questions/28179032/
