r - How to process a LAScatalog in parallel with lidR in R

I used to process LiDAR catalogs with the following code (using the LAScatalog processing engine from the great lidR package):

library(lidR)
library(parallel) # detectCores()
library(stringr)  # str_pad()

lasdir <- "D:\\LAS\\"
output <- "D:\\LAS\\PRODUCTS\\"
epsg = "+init=epsg:25829"
res = 1

no_cores <- detectCores()
cat <- lascatalog(lasdir = lasdir, 
                  outputdir = output, 
                  pattern = '*COL.laz$|*COL.LAZ$',
                  catname = "Catalog",
                  clipcat = FALSE, clipcatbuf = FALSE, clipbuf = 1000, clipcatshape = clipcatshape,
                  cat_chunk_buffer = 20,
                  cores = no_cores, progress = TRUE,
                  laz_compression = TRUE, epsg = epsg,
                  retilecatalog = FALSE, tile_chunk_buffer = 10,
                  tile_chunk_size = 1000,
                  filterask = FALSE,
                  filter = "-keep_first -drop_z_below 2")

DEM_output <- paste0(output,"DEM_", str_pad(res, 3, "left", pad = "0"), "/")
opt_output_files(cat) <- paste0(DEM_output,"{ORIGINALFILENAME}") #set filepaths
DEM <- grid_terrain(cat, res = res, algorithm = knnidw(k = 5, p = 2))

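The package has since been updated, and the cores parameter no longer seems to have any effect: the process still runs, but no longer in parallel. A message states: "Option no longer supported. See ?lidR-parallelism."

How can I process a catalog in parallel now?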

Best answer

Since lidR 2.1.0 (July 2019), the opt_cores() function has been deprecated. See the changelog.

The strategy used to process the tiles in parallel must now be explicitly declared by users. This is anyway how it should have been designed from the beginning! For users, restoring the exact former behavior implies only one change.

In versions < 2.1.0 the following was correct:
library(lidR)
ctg <- catalog("folder/")
opt_cores(ctg) <- 4L
hmean <- grid_metrics(ctg, mean(Z))
In versions >= 2.1.0 this must be explicitly declared with the future package:
library(lidR)
library(future)
plan(multisession)
ctg <- catalog("folder/")
hmean <- grid_metrics(ctg, mean(Z))
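Applied to the DEM step from the question, the change could look roughly like the sketch below. It is an illustration under assumptions, not the asker's actual pipeline: the function name build_dem_parallel and its arguments are made up for this example, the custom lascatalog() wrapper from the question is replaced by plain readLAScatalog(), and the file pattern and point filter from the question are left out.

```r
# Sketch only: `build_dem_parallel` and its arguments are hypothetical names.
# Requires the lidR and future packages when the function is actually called.
build_dem_parallel <- function(lasdir, outdir, res = 1, workers = 2L) {
  # Declare the parallel strategy explicitly (this replaces the old `cores`
  # option); plan() returns the previous strategy so it can be restored.
  old_plan <- future::plan(future::multisession, workers = workers)
  on.exit(future::plan(old_plan), add = TRUE)

  # Build the catalog and set the engine options used in the question.
  ctg <- lidR::readLAScatalog(lasdir)
  lidR::opt_chunk_buffer(ctg) <- 20
  lidR::opt_output_files(ctg) <- file.path(outdir, "{ORIGINALFILENAME}")

  # Chunks are now dispatched to the registered workers.
  # Note: in lidR >= 4.0 this step is named rasterize_terrain().
  lidR::grid_terrain(ctg, res = res, algorithm = lidR::knnidw(k = 5, p = 2))
}
```

A call such as build_dem_parallel("D:/LAS/", "D:/LAS/PRODUCTS/DEM_001", res = 1, workers = 4L) would then mirror the setup in the question. Restoring the previous plan on exit keeps the parallel strategy from leaking into the rest of the session.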

In addition, this is fully documented in the manual page named lidR-parallelism.

?lidR::`lidR-parallelism`

chunk-based parallelism

When processing a LAScatalog, the internal engine splits the dataset into chunks and each chunk is read and processed sequentially in a loop. But actually this loop can be parallelized with the future package. By default the chunks are processed sequentially, but they can be processed in parallel by registering an evaluation strategy. For example, the following code is evaluated sequentially:
ctg <- readLAScatalog("folder/")
out <- grid_metrics(ctg, mean(Z))
But this one is evaluated in parallel with two cores:
library(future)
plan(multisession, workers = 2L)
ctg <- readLAScatalog("folder/")
out <- grid_metrics(ctg, mean(Z))
With chunk-based parallelism any algorithm can be parallelized by processing several subsets of a dataset [...]

To take full advantage of this new syntax, you need to understand how future works. See the future package documentation.
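As a starting point, the basic mechanics of future can be sketched as follows: register a strategy, check how many workers it provides, and restore the previous strategy afterwards. Leaving one core free is only an assumption made for this example, not a recommendation from the lidR documentation.

```r
library(future)

# Assumption for this sketch: leave one core free, but never go below one worker.
n_workers <- max(1L, availableCores() - 1L)

# Register a parallel strategy; plan() invisibly returns the previous one.
old_plan <- plan(multisession, workers = n_workers)

# Number of workers the current strategy provides.
workers_in_use <- nbrOfWorkers()
print(workers_in_use)

# ... LAScatalog processing calls would go here ...

# Restore the previous strategy once the work is done.
plan(old_plan)
```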

Regarding "r - How to process a LAScatalog in parallel with lidR in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59138769/
