R: select entire columns in which at least one value meets a condition

Tags: r, matrix

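I have a large matrix with about 300 rows and 200,000 columns. I want to reduce it by selecting the entire columns that contain at least one value greater than 0.5 or less than -0.5 (the whole column, not just those particular values). I want to keep the row and column names. By running tmp <- mymat > 0.5 | mymat < -0.5 I was able to get a TRUE/FALSE matrix, and I want to extract every column that has at least one TRUE in it. I naively tried mymat[tmp], but that only returns a vector of the values that satisfy the condition. How can I get the actual columns of the original matrix? Thanks.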

Best answer

Try this:

> set.seed(007) # make the example reproducible
> X <- matrix(rnorm(100), 20) # generate some data
> X <- cbind(X, runif(20, max=.48)) # add a column whose values are all < 0.5
> colnames(X) <- paste('col', 1:ncol(X), sep='') # add some column names
> X # this is what the matrix looks like
              col1        col2         col3        col4         col5        col6
 [1,]  2.287247161  0.83975036  1.218550535  0.07637147  0.342585350 0.335107187
 [2,] -1.196771682  0.70534183 -0.699317079  0.15915528  0.004248236 0.419502015
 [3,] -0.694292510  1.30596472 -0.285432752  0.54367418  0.029219842 0.346358090
 [4,] -0.412292951 -1.38799622 -1.311552673  0.70480735 -0.393423429 0.212185020
 [5,] -0.970673341  1.27291686 -0.391012431  0.31896914 -0.792704563 0.224824248
 [6,] -0.947279945  0.18419277 -0.401526613  1.10924979 -0.311701865 0.415837389
 [7,]  0.748139340  0.75227990  1.350517581  0.76915419 -0.346068592 0.057660111
 [8,] -0.116955226  0.59174505  0.591190027  1.15347367 -0.304607588 0.007812921
 [9,]  0.152657626 -0.98305260  0.100525456  1.26068350 -1.785893487 0.298192099
[10,]  2.189978107 -0.27606396  0.931071996  0.70062351  0.587274672 0.216225091
[11,]  0.356986230 -0.87085102 -0.262742349  0.43262716  1.635794434 0.026097800
[12,]  2.716751783  0.71871055 -0.007668105 -0.92260172 -0.645423474 0.190567072
[13,]  2.281451926  0.11065288  0.367153007 -0.61558421  0.618992169 0.402829397
[14,]  0.324020540 -0.07846677  1.707162545 -0.86665969  0.236393598 0.248196976
[15,]  1.896067067 -0.42049046  0.723740263 -1.63951709  0.846500899 0.406511129
[16,]  0.467680511 -0.56212588  0.481036049 -1.32583924 -0.573645739 0.162457572
[17,] -0.893800723  0.99751344 -1.567868244 -0.88903673  1.117993204 0.383801555
[18,] -0.307328300 -1.10513006  0.318250283 -0.55760233 -1.540001132 0.347037954
[19,] -0.004822422 -0.14228783  0.165991451 -0.06240231 -0.438123899 0.262938992
[20,]  0.988164149  0.31499490 -0.899907630  2.42269298 -0.150672971 0.139233120
> 
> # logical index: TRUE for every column that meets the condition
> ind <- apply(X, 2, function(col) any(abs(col) > 0.5))
> X[, ind] # col6 has no value above 0.5 in absolute value, so it is dropped
              col1        col2         col3        col4         col5
 [1,]  2.287247161  0.83975036  1.218550535  0.07637147  0.342585350
 [2,] -1.196771682  0.70534183 -0.699317079  0.15915528  0.004248236
 [3,] -0.694292510  1.30596472 -0.285432752  0.54367418  0.029219842
 [4,] -0.412292951 -1.38799622 -1.311552673  0.70480735 -0.393423429
 [5,] -0.970673341  1.27291686 -0.391012431  0.31896914 -0.792704563
 [6,] -0.947279945  0.18419277 -0.401526613  1.10924979 -0.311701865
 [7,]  0.748139340  0.75227990  1.350517581  0.76915419 -0.346068592
 [8,] -0.116955226  0.59174505  0.591190027  1.15347367 -0.304607588
 [9,]  0.152657626 -0.98305260  0.100525456  1.26068350 -1.785893487
[10,]  2.189978107 -0.27606396  0.931071996  0.70062351  0.587274672
[11,]  0.356986230 -0.87085102 -0.262742349  0.43262716  1.635794434
[12,]  2.716751783  0.71871055 -0.007668105 -0.92260172 -0.645423474
[13,]  2.281451926  0.11065288  0.367153007 -0.61558421  0.618992169
[14,]  0.324020540 -0.07846677  1.707162545 -0.86665969  0.236393598
[15,]  1.896067067 -0.42049046  0.723740263 -1.63951709  0.846500899
[16,]  0.467680511 -0.56212588  0.481036049 -1.32583924 -0.573645739
[17,] -0.893800723  0.99751344 -1.567868244 -0.88903673  1.117993204
[18,] -0.307328300 -1.10513006  0.318250283 -0.55760233 -1.540001132
[19,] -0.004822422 -0.14228783  0.165991451 -0.06240231 -0.438123899
[20,]  0.988164149  0.31499490 -0.899907630  2.42269298 -0.150672971
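
One edge case worth keeping in mind: if only a single column meets the condition, `X[, ind]` collapses to a plain vector and the column name is lost. The following is a minimal sketch of how `drop = FALSE` avoids that. The small matrix `m` and the names `keep`, `as_vector` and `as_matrix` are made up for illustration and are not part of the original answer.

```r
# Hypothetical 3 x 3 matrix with row and column names (illustration only)
m <- matrix(c(0.1, 0.2, 0.3,
              0.9, 0.0, -0.1,
              0.2, 0.1, 0.4),
            nrow = 3,
            dimnames = list(paste0("row", 1:3), paste0("col", 1:3)))

# TRUE for columns containing at least one value with |value| > 0.5
keep <- apply(m, 2, function(col) any(abs(col) > 0.5))

as_vector <- m[, keep]                # a single match collapses to a vector
as_matrix <- m[, keep, drop = FALSE]  # stays a 3 x 1 matrix with its dimnames
```

Since the question asks to keep both row and column names, indexing with `drop = FALSE` is probably the safer default.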

# The same can be done in one step, without creating 'ind'
X[, apply(X, 2, function(col) any(abs(col) > 0.5))]
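
With around 200,000 columns, a vectorised alternative may be worth trying. The sketch below reuses the logical matrix `tmp` from the question and counts the TRUE values per column with `colSums`. The data are regenerated the same way as in the example above, and the names `keep` and `result` are assumptions made for illustration.

```r
# Rebuild example data in the same way as the answer above
set.seed(7)
mymat <- matrix(rnorm(100), 20)
mymat <- cbind(mymat, runif(20, max = 0.48))  # last column stays below 0.5
colnames(mymat) <- paste0("col", seq_len(ncol(mymat)))

# Logical matrix from the question: TRUE where the value is outside [-0.5, 0.5]
tmp <- mymat > 0.5 | mymat < -0.5

# A column qualifies when it holds at least one TRUE
keep <- colSums(tmp) > 0

# drop = FALSE keeps the result a matrix, with its dimnames, even for one column
result <- mymat[, keep, drop = FALSE]
```

This selects the same columns as the `apply` version, as long as the matrix has no missing values. If it does contain `NA`, `colSums(tmp, na.rm = TRUE)` would be needed.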

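This question and answer about selecting entire columns in R where at least one value meets a condition come from a similar question on Stack Overflow: https://stackoverflow.com/questions/11652879/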
