r - 使用 foreach 时如何记录(打印或 futile.logger)

标签 r logging parallel-foreach

我想将 foreach 包与日志记录一起使用。我通常使用 futile.logger 包。当工作交给 worker 时,日志信息丢失(这很奇怪,因为您需要指示 foreach 日志包)

我看过 this post但它不使用 foreach

  library(foreach)                                                                                                                                                                                                                                                                                                       
  library(futile.logger)                                                                                                                                                                                                                                                                                                 
  library(doParallel)                                                                                                                                                                                                                                                                                                    
  flog.threshold(DEBUG)                                                                                                                                                                                                                                                                                                  
  cluster <- makeCluster(8)
  registerDoParallel(cluster)
  doStuff <- function(input){                                                                                                                                                                                                                                                                                            
    flog.debug('Doing some stuff with %s', input)                                                                                                                                                                                                                                                                      
    return(input)                                                                                                                                                                                                                                                                                                      
  }                                                                                                                                                                                                                                                                                                                      
  res <- lapply(FUN=doStuff, X=seq(1,8,1))
  # >> this prints                                                                                                                                                                                                                                                                         
  res2 <- foreach(input = seq(1,8,1)) %do% doStuff(input)                                                                                                                                                                                                                                                                
  # >> this prints
  res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)        
  # >> this does not                                                                                                                                                                                                                          
  identical(res,res2) && identical(res,res3)

我不太关心并行后端,可以是任何东西,但我怎样才能让日志正常工作

最佳答案

遵循来自 How can I print when using %dopar% 的解决方案: 这个想法是使用 snow 来设置你的集群,并设置 outfile="" 将 worker 输出重定向到 master。

library(foreach)
library(futile.logger)
library(doParallel)

library(doSNOW)
cluster <- makeCluster(3, outfile="") # I only have 4 cores, but you could do 8
registerDoSNOW(cluster)
flog.threshold(DEBUG)

doStuff <- function(input){
  flog.info('Doing some stuff with %s', input) # change to flog.info
  return(input) 
  } 
res <- lapply(FUN=doStuff, X=seq(1,8,1))
# >> this prints                                                              
res2 <- foreach(input = seq(1,8,1)) %do% doStuff(input) 
# >> this prints
res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)  
# >> this prints too

输出:

> res3 <- foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input)  
Type: EXEC 
Type: EXEC 
Type: EXEC 
Type: EXEC 
Type: EXEC 
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 3
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 1
INFO [2016-08-08 08:22:39] Doing some stuff with 2
Type: EXEC 
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 5
INFO [2016-08-08 08:22:39] Doing some stuff with 4
Type: EXEC 
Type: EXEC 
INFO [2016-08-08 08:22:39] Doing some stuff with 6
INFO [2016-08-08 08:22:39] Doing some stuff with 7
INFO [2016-08-08 08:22:39] Doing some stuff with 8

输出到日志文件。这是一个输出到日志文件的替代方案,在How to log using futile logger from within a parallel method in R?之后。 .它的优点是输出更清晰,但仍然需要 flog.info:

library(doSNOW)
library(foreach)
library(futile.logger)
nworkers <- 3
cluster <- makeCluster(nworkers)
registerDoSNOW(cluster)
loginit <- function(logfile) flog.appender(appender.file(logfile))
foreach(input=rep('~/Desktop/out.log', nworkers), 
  .packages='futile.logger') %dopar% loginit(input)
doStuff <- function(input){
  flog.info('Doing some stuff with %s', input)
  return(input) 
  } 
foreach(input = seq(1,8,1), .packages='futile.logger') %dopar% doStuff(input) 
stopCluster(cluster)
readLines("~/Desktop/out.log")

输出:

> readLines("~/Desktop/out.log")
[1] "INFO [2016-08-08 10:07:30] Doing some stuff with 2"
[2] "INFO [2016-08-08 10:07:30] Doing some stuff with 1"
[3] "INFO [2016-08-08 10:07:30] Doing some stuff with 3"
[4] "INFO [2016-08-08 10:07:30] Doing some stuff with 4"
[5] "INFO [2016-08-08 10:07:30] Doing some stuff with 5"
[6] "INFO [2016-08-08 10:07:30] Doing some stuff with 6"
[7] "INFO [2016-08-08 10:07:30] Doing some stuff with 7"
[8] "INFO [2016-08-08 10:07:30] Doing some stuff with 8"

关于r - 使用 foreach 时如何记录(打印或 futile.logger),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/38828344/

相关文章:

r - 为什么这个并行计算代码只使用了 1 个 CPU?

r - 将数据框的部分转置为单独的列

c++ - 在哪里存储可能从低完整性进程和 Windows 服务生成的日志文件?

r - R : how to use the cores 中的并行计算

java - Hibernate无法创建表

logging - GM_log 和其他 GM 函数在 Greasemonkey 脚本中不起作用

r - 在 R foreach() 下并行运行时无法识别动态库依赖项

r - 将预测的时间序列与 R 中的原始序列重叠

r - 如何从日期中提取季度?

r - 在新创建的列中,使用上面行中的值使用 dplyr 计算下一行