r - Print a data.frame column colored by type

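In my knitr document, I am trying to print a column of a data frame. To help with visualization, I would like to change the output color based on the value of another column. A simple example follows.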

date_vector <- rep(NA, 10)
type_vector <- rep(NA, 10)
types <- c("A", "B")
CDate <- Sys.Date()
date_vector[1] <- as.character(CDate)
type_vector[1] <- sample(types, size = 1)
for (i in 2:10) {
  CDate <- as.Date(CDate) + rexp(n = 1, rate = 1/5)
  date_vector[i] <- as.character(CDate)
  type_vector[i] <- sample(types, size = 1)
}

test_df <- data.frame(Date=date_vector, Type=type_vector)

When I print test_df$Date, I see the following:

date_vector
[1] "2016-01-06" "2016-01-07" "2016-01-22" "2016-01-28" "2016-01-29" "2016-02-01" "2016-02-04"
[8] "2016-02-12" "2016-02-13" "2016-02-15"

Instead, I would like to see the same dates printed in color, because the types of the entries are as follows:

type_vector
[1] "A" "A" "B" "B" "A" "A" "B" "A" "B" "A"

So blue marks dates of type A and green marks dates of type B.

Best answer

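This answer is more general than the question. The question asks for a way to color one column of a data frame according to another column. This answer addresses the more general case of highlighting elements of a vector according to a second, logical vector that indicates which elements to highlight.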

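In principle this is simple: print a vector and highlight the elements indicated by another logical vector. Highlighting x can be as simple as wrapping it in \\textcolor{blue}{x} or \\emph{x}.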
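As a minimal sketch of that wrapping step, the snippet below marks up only the selected elements with sprintf. The sample dates are taken from the question; the variable names highlightThis and marked are hypothetical and not part of the answer's function.

```r
# Elements to print and a logical vector marking which ones to highlight.
x <- c("2016-01-06", "2016-01-07", "2016-01-22")
highlightThis <- c(FALSE, TRUE, FALSE)  # hypothetical name, for illustration only

# Wrap only the marked elements; "%s" is replaced by the element itself.
marked <- x
marked[highlightThis] <- sprintf("\\textcolor{blue}{%s}", x[highlightThis])

# Unmarked elements are left untouched.
marked
```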

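In practice it is not that simple. print(x) does a lot of useful things: it arranges the data nicely in columns, adds quotes around character data, wraps the output to respect getOption("width"), prefixes each line of output with the index of its first element, and so on. The problem is that we cannot use print to print highlighted data, because print escapes the backslash in \\textcolor. The standard solution to this problem is to use cat instead of print. However, cat applies none of the nice formatting listed above.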
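The difference between the two can be seen by capturing their output. This is only an illustrative sketch; the variable names are hypothetical.

```r
marked <- "\\textcolor{blue}{2016-01-07}"

# print() shows the string in its escaped form: index prefix, quotes,
# and a doubled backslash, which LaTeX would not read as a command.
printed <- capture.output(print(marked))

# cat() writes the raw characters: one backslash, no quotes, no index.
catted <- capture.output(cat(marked))

printed
catted
```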

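The challenge is therefore to write a function that reproduces some of the desired features of print. This is a fairly complex task, so I restrict myself to the following main features: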

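Total line width <= getOption("width").

Quotes are added automatically around non-numeric and non-logical values (if quote is not set).

The index of the first element is added to each line of output (if printIndex = TRUE).

Rounding is applied to numeric input (digits).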

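In addition, there are two highlighting features:

Elements of x indicated by condition are wrapped in the "highlight pattern".

The highlight pattern is not counted when computing the line width. This assumes that highlighting only adds markup but no visible output.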
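To illustrate why the line width is computed before markup is added, the sketch below compares the visible width of one line with the width of its raw marked-up text. The variable names are hypothetical; the width formula mirrors the terms described above (elements, quotes, separating spaces, index).

```r
x <- c("2016-01-06", "2016-01-07", "2016-01-22")
condition <- c(FALSE, TRUE, FALSE)
index <- "[1] "

# Visible width: element characters, two quotes per element,
# one space between neighbours, plus the index prefix.
visibleWidth <- sum(nchar(x)) + 2 * length(x) + (length(x) - 1) + nchar(index)

# Width of the raw text once markup is added. The extra characters
# are consumed by LaTeX and never appear in the typeset output.
marked <- x
marked[condition] <- sprintf("\\emph{%s}", x[condition])
rawWidth <- sum(nchar(marked)) + 2 * length(x) + (length(x) - 1) + nchar(index)

visibleWidth  # 42
rawWidth      # 49
```

Wrapping on rawWidth would break lines too early whenever a line contains highlighted elements.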

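Note that this function lacks important features of print, such as handling of missing values. Moreover, it converts the input x to character (via as.character). The result may differ from that of print, because the S3 method (print.*) for the input's class is never used.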

printHighlighted <- function(x, condition = rep(FALSE, length(x)), highlight = "\\emph{%s}", printIndex = TRUE, width = getOption("width"), digits = getOption("digits"), quote = NULL) {

  stopifnot(length(x) == length(condition))
  stopifnot(missing(digits) || (!missing(digits) && is.numeric(x))) # Raise error when input is non-numeric but "digits" supplied.

  if (missing(quote)) {
    if (is.numeric(x) || is.logical(x)) {
      quote <- FALSE
    } else {
      quote <- TRUE
    }
  }

  nquotes <- 0

  if (!printIndex) {
    currentLineIndex <- ""
  }

  if (is.numeric(x)) {
    x <- round(x, digits = digits)
  }

  fitsInLine <- function(x, elementsCurrentLine, currentLineIndex, nquotes, width) {
    return(sum(nchar(x[elementsCurrentLine])) + # total width of elements in current line
             nchar(currentLineIndex) + # width of the index of the first element (if shown)
             sum(elementsCurrentLine) - 1 + # width of spaces between elements
             nquotes <= # width of quotes added around elements
             width)
  }

  x <- as.character(x)
  elementsCurrentLine <- rep(FALSE, times = length(x))


  for (i in seq_along(x)) {

    if (!any(elementsCurrentLine) && printIndex) { # this is a new line AND show index
      currentLineIndex <- sprintf("[%s] ", i)
    }

    elementsCurrentLine[i] <- TRUE # Add element i to current line. Each line holds at least one element. Therefore, if i is the first element of this line, add it regardless of line width. If there already are elements in the line, the previous loop iteration checked that this element will fit.

    if (i < length(x)) { # not the last element

      # check whether next element will fit in this line
      elementsCurrentLineTest <- elementsCurrentLine
      elementsCurrentLineTest[i + 1] <- TRUE

      if (quote) {
        nquotes <- sum(elementsCurrentLineTest) * 2
      }

      if (fitsInLine(x, elementsCurrentLineTest, currentLineIndex, nquotes, width)) {
        next # Next element will fit; do not print yet.
      }
    }

    # Next element won't fit in current line. Print and start a new line.

    # print
    toPrint <- x[elementsCurrentLine]
    toMarkup <- condition[elementsCurrentLine]

    toPrint[toMarkup] <- sprintf(fmt = highlight, toPrint[toMarkup]) # add highlighting

    if (quote) {
      toPrint <- sprintf('"%s"', toPrint)
    }

    cat(currentLineIndex)
    cat(toPrint)
    cat("\n")

    # clear line
    elementsCurrentLine <- rep(FALSE, times = length(x))
  }
}

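To use this function with knitr, the chunk option results = "asis" must be set. Otherwise the output is wrapped in a verbatim environment, where the markup responsible for highlighting is displayed instead of being interpreted.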

Finally, to reproduce the look of a normal chunk, wrap the whole chunk in

\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}
\begin{kframe}
\begin{alltt}
<<your-chunk>>=
printHighlighted(...)
@
\end{alltt}
\end{kframe}
\end{knitrout}

Example

To save some space, the example assumes that the function definition of printHighlighted is available in the file printHighlighted.R.

\documentclass{article}
\begin{document}

Some text ....

\begin{knitrout}\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}\begin{alltt}
<<results = "asis", echo = FALSE>>=
source("printHighlighted.R")
data <- seq(from = as.Date("2015-01-15"), by = "day", length.out = 100)
cond <- rep(FALSE, 100)
cond[c(3, 55)] <- TRUE

printHighlighted(x = data, condition = cond, highlight = "\\textcolor{blue}{%s}", width = 60)
@
\end{alltt}\end{kframe}\end{knitrout}

Some text ....

\end{document}

This turned out to be quite long... If anyone thinks this is overkill for such a simple problem, I would be happy to see shorter solutions.
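For the specific case in the question, where each type maps to one color, a shorter sketch is possible if line wrapping, quoting and the index prefix are given up. It assumes a chunk with results = "asis" and a small data frame shaped like test_df; the lookup vector typeColors is hypothetical.

```r
test_df <- data.frame(
  Date = c("2016-01-06", "2016-01-07", "2016-01-22"),
  Type = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one LaTeX color name per type.
typeColors <- c(A = "blue", B = "green")

# Pick the color for each row by its type, then wrap the date in it.
colored <- sprintf("\\textcolor{%s}{%s}",
                   unname(typeColors[test_df$Type]),
                   test_df$Date)

# With results = "asis", cat() emits the LaTeX markup directly.
cat(colored, sep = " ")
```

Unlike printHighlighted, this does not respect getOption("width"), so long columns rely on LaTeX's own line breaking.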

Regarding "r - Print a data.frame column colored by type", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34653880/
