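r - Text justification in R

Tags: r text justify

How can I justify text in R? By justified I mean that every line in a paragraph has exactly the same length (as when you justify text in OpenOffice or Excel). I tried to find an option for this with strwrap and cat, but without success.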

## Get some sample text from the Wikisource API
library(httr)
library(xml2)
name <- "Invictus"
url <- URLencode(sprintf("https://en.wikisource.org/w/api.php?action=parse&prop=text&page=%s&format=json", name))
res <- read_html(content(GET(url))$parse$text[[1]])
string <- iconv(xml_text(xml_find_all(res, "//p"), trim=TRUE), "latin1", "ASCII", sub=" ")[1:2]
(string <- trimws(gsub('\\n|\\s{3,}', ' ', paste(string, collapse=" "))))
# [1] "Out of the night that covers me, Black as the pit from pole to pole, I thank whatever gods may be For my unconquerable soul.  In the fell clutch of circumstance I have not winced nor cried aloud. Under the bludgeonings of chance My head is bloody, but unbow'd.  Beyond this place of wrath and tears Looms but the Horror of the shade, And yet the menace of the years Finds and shall find me unafraid.  It matters not how strait the gate, How charged with punishments the scroll, I am the master of my fate: I am the captain of my soul."

Some attempts using the functions mentioned above

## Using these I can get left/right/center justified text but not
## justified like in other text editing programs or newspapers.
width <- 30
cat(paste(strwrap(string, width=width), collapse='\n'))

## Or with cat
tokens <- strsplit(string, '\\s+')[[1]]               # tokenise to pass to cat
out <- capture.output(cat(tokens, fill=width, sep=" "))  # strings <= width chars
cat(paste(out, collapse='\n'))
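
For comparison, the three alignments that do work out of the box can be sketched as follows. This is a minimal illustration and not part of the original question: the sample `txt` and the helper name `centre_line` are made up here, and the padding relies only on `formatC()` and `strrep()` from base R. Note that padding is added only at the ends of each line, so the gaps between words never change, which is why none of these produces justified text.

```r
# Hypothetical sample text; any single string works.
txt <- "Out of the night that covers me, Black as the pit from pole to pole"
width <- 30
lines <- strwrap(txt, width = width)

left  <- formatC(lines, width = width, flag = "-")  # pad on the right
right <- formatC(lines, width = width)              # pad on the left

# Hypothetical helper: split the padding between both sides of one line.
centre_line <- function(line, width) {
    pad <- max(width - nchar(line), 0)
    paste0(strrep(" ", pad %/% 2), line, strrep(" ", pad - pad %/% 2))
}
centre <- vapply(lines, centre_line, character(1), width = width,
                 USE.NAMES = FALSE)

cat(right, sep = "\n")
```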

Best answer

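Well, if there is no built-in way, the function below works well enough for my purposes. Thanks for the comment above about how to do this with HTML styling.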

justify <- function(string, width=getOption('width'), 
                    fill=c('random', 'right', 'left')) {
    strs <- strwrap(string, width=width)
    paste(fill_spaces(strs, width, match.arg(fill)), collapse="\n")
}

fill_spaces <- function(lines, width, fill) {
    tokens <- strsplit(lines, '\\s+')
    ## Pad every line except the last, which stays left-aligned
    res <- vapply(head(tokens, -1L), function(x) {
        nspace <- length(x) - 1L                 # gaps between words
        extra <- width - sum(nchar(x)) - nspace  # spaces missing to reach width
        ## Nothing to distribute: single-word line, or line already at/over width
        if (nspace < 1L || extra <= 0) return(paste(x, collapse=' '))
        times <- rep.int(1L + extra %/% nspace, nspace)  # equal share per gap
        extra <- extra %% nspace                         # leftover spaces
        if (extra > 0) {
            inds <- switch(fill,
                           right = seq_len(extra),               # widen leftmost gaps
                           left = (nspace - extra + 1L):nspace,  # widen rightmost gaps
                           sample.int(nspace, extra))            # 'random'
            times[inds] <- times[inds] + 1L
        }
        ## Interleave the gaps with the words: '', w1, gap1, w2, gap2, ...
        paste(c(rbind(c('', strrep(' ', times)), x)), collapse='')
    }, character(1))
    c(res, paste(tail(tokens, 1L)[[1]], collapse=' '))
}

## With the default fill='random' the widened gaps differ from run to run
cat(justify(string, width=40))
# Out  of the night  that covers me, Black
# as  the pit from  pole to pole, I  thank
# whatever   gods    may    be   For    my
# unconquerable soul. In  the fell  clutch
# of  circumstance I have  not  winced nor
# cried  aloud. Under the  bludgeonings of
# chance My  head  is bloody, but unbow'd.
# Beyond this  place  of  wrath and  tears
# Looms but  the Horror of the  shade, And
# yet  the menace of the years  Finds  and
# shall  find me unafraid. It  matters not
# how strait  the  gate,  How charged with
# punishments the scroll,  I am the master
# of  my fate:  I  am  the  captain  of my
# soul.
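
The arithmetic inside `fill_spaces` can be checked by hand on a single line. The sketch below is a deterministic variant written for illustration only (the name `gap_widths` is hypothetical and not part of the answer): each gap gets an equal share of the missing spaces and the remainder goes to the leftmost gaps, which corresponds to `fill='right'` in the answer.

```r
# Hypothetical helper: width of each gap needed to stretch `words` to `width`.
gap_widths <- function(words, width) {
    ngaps <- length(words) - 1L
    total <- width - sum(nchar(words))       # spaces to distribute over all gaps
    # Too few gaps or too little room: fall back to single spaces.
    if (ngaps < 1L || total < ngaps) return(rep.int(1L, max(ngaps, 0L)))
    gaps <- rep.int(total %/% ngaps, ngaps)  # equal share per gap
    extra <- total %% ngaps                  # leftover spaces
    gaps[seq_len(extra)] <- gaps[seq_len(extra)] + 1L  # leftmost gaps get one more
    gaps
}

words <- strsplit("as the pit from pole to pole, I thank", " ")[[1]]
gaps <- gap_widths(words, width = 40)
gaps
# [1] 2 2 2 1 1 1 1 1

# Interleave words and gaps: w1, gap1, w2, gap2, ..., w9, ''
line <- paste(c(rbind(words, c(strrep(" ", gaps), ""))), collapse = "")
nchar(line)
# [1] 40
```

Here the nine words contain 29 characters, so 11 spaces must be spread over 8 gaps: one each, with the 3 left over going to the first three gaps.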

Source: this question on Stack Overflow: https://stackoverflow.com/questions/34710597/
