R: How to get the amount of memory available to R on Linux

Tags: r linux

I know I can query /proc/meminfo like this:

memfree <- tryCatch(
    as.numeric(system("/usr/bin/awk '/MemAvailable/ {print $2}' /proc/meminfo", intern=TRUE))*1024,
    error = function(e) 0)

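Unfortunately, this has two major limitations:

  • Spawning a subshell is itself memory-intensive; I have seen it fail many times while many MB of memory were still available.
  • It is not compatible with per-process memory limits assigned through the kernel API.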
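The first limitation can be avoided by reading /proc/meminfo inside the process instead of forking awk. Below is a minimal C++ sketch of that idea; the names `meminfo_bytes` and `mem_available_bytes` are hypothetical, not part of R or Rcpp. It only addresses the subshell problem and still ignores per-process limits.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): scans text in the
// /proc/meminfo format ("Key:   value kB") and returns the value for `key`
// in bytes, or -1 if the key is missing or its value cannot be parsed.
long long meminfo_bytes(std::istream& in, const std::string& key) {
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(in, line)) {
        // Skip lines that do not start with "key:".
        if (line.compare(0, prefix.size(), prefix) != 0) continue;
        std::istringstream fields(line.substr(prefix.size()));
        long long kib = -1;
        if (fields >> kib && kib >= 0) return kib * 1024LL;
        return -1;
    }
    return -1;
}

// Reads MemAvailable straight from /proc/meminfo; no child process is spawned.
// Returns -1 if the file cannot be opened (for example, on a non-Linux system).
long long mem_available_bytes() {
    std::ifstream in("/proc/meminfo");
    return meminfo_bytes(in, "MemAvailable");
}
```

Keeping the parser separate from the file access makes it easy to check against a fixed string; from R, such a function could presumably be exposed through Rcpp in the same way as the functions in the answer.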

Surely R has a way to know how much memory is actually available. But where do I find it?


I have filed a bug in R's bug tracker: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16793

Best answer

This is an interesting question, and I think you can actually work around it by using Rcpp. Here is a possible solution (see the comments in the code):

#install ulimit package from github
#devtools::install_github("krlmlr/ulimit")

#ideally, delete all objects from the workspace first;
#uncomment the next line yourself if you want that :)
#rm(list=ls())

library(Rcpp)

#source code for function that gets the memory used by a process
#taken from
#stackoverflow.com/questions/669438/how-to-get-memory-usage-at-run-time-in-c
src_string_1="
long mem_used_bytes(int pid) {
    long rss = 0L;
    FILE* fp = NULL;
    std::string file_path=\"/proc/\"+std::to_string(pid)+\"/statm\";
    if ( (fp = fopen(file_path.c_str(), \"r\" )) == NULL )
        return (size_t)0L;      /* Can't open? */
    if ( fscanf( fp, \"%*s%ld\", &rss ) != 1 )
    {
        fclose( fp );
        return (size_t)0L;      /* Can't read? */
    }
    fclose( fp );
    return (size_t)rss * (size_t)sysconf( _SC_PAGESIZE);
}
"

#source code for function that gets available memory
#code snippets taken from http://linux.die.net/man/2/getrlimit
src_string_2="
long mem_limit_bytes(int pid_int) {
    long res;
    struct rlimit tmp;
    pid_t pid=pid_int;
    if (prlimit(pid, RLIMIT_AS, NULL, &tmp) != 0)
      Rcpp::stop(\"prlimit() failed\");
    if (tmp.rlim_cur==RLIM_INFINITY) {
      //there is no memory limit for the current process (should be default)
      Rcpp::Rcout<<\"No limit detected\\n\";
      struct sysinfo tmp2;
      sysinfo(&tmp2);
      res = tmp2.mem_unit * tmp2.totalram;
    } else {
      //memory limit set
      Rcpp::Rcout<<\"Limit detected\\n\";
      res=tmp.rlim_cur;
    }
    return res;
}
"

#compile functions; for convenience, we use c++11
cppFunction(src_string_1,
            plugins=c("cpp11"),
            includes=c("#include <cstdio>",
                       "#include <string>",
                       "#include <sys/resource.h>",
                       "#include <unistd.h>"))
cppFunction(src_string_2,
            includes=c("#include <sys/resource.h>",
                       "#include <unistd.h>",
                       "#include <sys/sysinfo.h>"))

#memory without limit set; returns total system memory
mem_limit_bytes(Sys.getpid())/1e6
#No limit detected
#[1] 8228.246

#set limit for current R process
ulimit::memory_limit(4000)

#now the C++ function will detect the limit
mem_limit_bytes(Sys.getpid())/1e6
#Limit detected
#[1] 4194.304
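
When only the current process is of interest, the Linux-specific prlimit call is not strictly needed: the POSIX getrlimit function reports the same soft limit for the calling process. A minimal sketch under that assumption follows; the function name `own_as_limit_bytes` is hypothetical.

```cpp
#include <sys/resource.h>

// Hypothetical helper (not from the original answer): returns the soft
// RLIMIT_AS limit of the calling process in bytes, or -1 if no limit is set
// or the limit could not be queried.
long long own_as_limit_bytes() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_AS, &lim) != 0) return -1;   // query failed
    if (lim.rlim_cur == RLIM_INFINITY) return -1;     // no limit set
    return static_cast<long long>(lim.rlim_cur);
}
```

A caller would treat -1 as "fall back to total system memory", which mirrors the sysinfo branch of mem_limit_bytes above.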

Now try the mem_used_bytes function:

#first some garbage collection
gc()
old_mem_mb=mem_used_bytes(Sys.getpid())/1e6

#allocate a matrix with approx 800MB
NN=1e4
expected_memory_mb=NN^2*8/1e6
A=matrix(runif(NN**2),NN,NN)

#garbage collection, again
gc()

#query used memory again
new_mem_mb=mem_used_bytes(Sys.getpid())/1e6

#the following value should be close to 1
(new_mem_mb-old_mem_mb)/expected_memory_mb
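
One caveat when combining the two numbers: RLIMIT_AS caps the address space of the process, while mem_used_bytes reports the resident set size (the second field of statm). The first field of statm is the total program size in pages, which is probably the more relevant figure for judging how close the process is to an address-space limit. The following C++ sketch illustrates the arithmetic; `parse_statm` and `as_headroom_bytes` are hypothetical names, and the numbers in the usage note are made up.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from the original answer): extracts the first two
// fields of /proc/<pid>/statm, i.e. total program size and resident set size,
// both counted in pages. Returns false if the text cannot be parsed.
bool parse_statm(const std::string& text,
                 long long& size_pages, long long& resident_pages) {
    std::istringstream in(text);
    if (!(in >> size_pages >> resident_pages)) return false;
    return size_pages >= 0 && resident_pages >= 0;
}

// Bytes remaining before an address-space limit of limit_bytes is reached,
// given the current program size in pages; clamped at zero.
long long as_headroom_bytes(long long limit_bytes,
                            long long size_pages, long long page_bytes) {
    const long long used = size_pages * page_bytes;
    return limit_bytes > used ? limit_bytes - used : 0;
}
```

For example, with a limit of 4194304000 bytes, a program size of 205000 pages and 4096-byte pages, the headroom is 4194304000 - 205000 * 4096 = 3354624000 bytes. In practice the page size would come from sysconf(_SC_PAGESIZE), as in mem_used_bytes.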

Edit: here is a simpler single-file version that needs fewer headers and uses plain C++:

#include <Rcpp.h>
#include <cstdio>
#include <unistd.h>
#include <sys/resource.h>
#include <sys/sysinfo.h>

// [[Rcpp::export]]
long mem_used_bytes(int pid) {
    long rss = 0L;
    FILE* fp = NULL;
    char filepath[128];
    snprintf(filepath, sizeof(filepath), "/proc/%d/statm", pid);
    if ( (fp = fopen(filepath, "r" )) == NULL )
        return (size_t)0L;      /* Can't open? */
    if ( fscanf( fp, "%*s%ld", &rss ) != 1 ) {
        fclose( fp );
        return (size_t)0L;      /* Can't read? */
    }
    fclose( fp );
    return (size_t)rss * (size_t)sysconf( _SC_PAGESIZE);
}

// [[Rcpp::export]]
long mem_limit_bytes(int pid_int) {
    long res;
    struct rlimit tmp;
    pid_t pid=pid_int;
    if (prlimit(pid, RLIMIT_AS, NULL, &tmp) != 0)
        Rcpp::stop("prlimit() failed");
    if (tmp.rlim_cur==RLIM_INFINITY) {
        //there is no memory limit for the current process (should be default)
        Rcpp::Rcout << "No limit detected\n";
        struct sysinfo tmp2;
        sysinfo(&tmp2);
        res = tmp2.mem_unit * tmp2.totalram;
    } else {
        //memory limit set
        Rcpp::Rcout << "Limit detected\n";
        res=tmp.rlim_cur;
    }
    return res;
}

/*** R
## memory without limit set; returns total system memory
mem_limit_bytes(Sys.getpid())/1e6

## try out the `mem_used_bytes` function
## first some garbage collection
gc()
old_mem_mb <- mem_used_bytes(Sys.getpid())/1e6

## allocate a matrix with approx 800MB
NN <- 1e4
expected_memory_mb <- NN^2 * 8 / 1e6
A <- matrix(runif(NN**2),NN,NN)

##garbage collection, again
gc()

## query used memory again
new_mem_mb <- mem_used_bytes(Sys.getpid())/1e6

## the following value should be close to 1
(new_mem_mb - old_mem_mb)/expected_memory_mb

*/

Regarding how to get the amount of memory available to R on Linux, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36372397/
