r - Error when trying to convert a list to a data frame using ldply ((function (..., row.names = NULL, : arguments imply differing number of rows: )

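I'm trying to scrape the standard stats for a football team's players using RStudio. I can extract the information into a list, but I can't turn it into a data frame; it gives me this error: (Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 33, 27, 24, 35, 5, 4, 54, 38, 18, 2, 1). I'm a newbie in R and can't figure out how to solve it. Below are the page I'm trying to extract data from and the code I'm using. Any help is very welcome!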

https://fbref.com/en/squads/2b390eca/2016-2017/Athletic-Bilbao

install.packages('rvest')
install.packages('plyr')
install.packages('dplyr')
library(rvest)
library(plyr)
library(dplyr)

years = c(2017:2018)
urls = list()
for (i in 1:length(years)) {
  url = paste0('https://fbref.com/en/squads/2b390eca/',years[i],'-',years[i+1],'/Athletic-Bilbao')
  urls[[i]] = url #https://fbref.com/en/squads/d5348c80/',years1[i],'-',years2[i+1],'/AEK-Athens
}


tbl = list()
years = 2017
j = 1
for (j in seq_along(urls)) {
  tbl[[j]] = urls[[j]] %>%
    read_html() %>%
    html_nodes("table") %>%
    html_table()
  tbl[[j]]$Year = years
  j = j+1
  years = years+1
}

Data = ldply(tbl,data.frame)

Best answer

I found that two fixes were needed.

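First, your second URL is wrong. I think you want years[i] + 1, i.e. move the + 1 outside the index. With years[i+1], the last iteration reads past the end of the vector and gets NA, so the second URL ends up containing 2018-NA. With years[i] + 1 you get 2017-2018 and 2018-2019.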
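To see why the second URL breaks, here is a small offline sketch (no scraping involved) that compares the two index expressions. The names `wrong` and `fixed` are only illustrative and do not appear in the original code.

```r
years <- 2017:2018

# years[i + 1] reads past the end of the vector when i is the last index,
# which yields NA; paste0() then turns that into the string "NA"
wrong <- vapply(
  seq_along(years),
  function(i) paste0(years[i], "-", years[i + 1]),
  character(1)
)

# years[i] + 1 adds one to the value instead of to the index
fixed <- paste0(years, "-", years + 1)

print(wrong)
print(fixed)
```

Printing the season strings before requesting any page is a cheap way to catch this kind of indexing slip.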

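Second, the page contains many tables with differing numbers of rows and columns, and you are trying to join them all together when you only need the first one (the standard stats). If you only want the first table, use html_node instead of html_nodes, i.e. html_node("table").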
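The row-count error can be reproduced without touching the website. The sketch below uses two made-up data frames as stand-ins for the tables that html_nodes("table") followed by html_table() would return; their columns and values are invented for illustration. Since ldply(tbl, data.frame) calls data.frame() on each list element, handing it a list of unequal tables triggers the same complaint.

```r
# Toy stand-ins for the list of tables scraped from one page
tables <- list(
  data.frame(Player = c("A", "B", "C"), Goals = c(1, 2, 3)),
  data.frame(Keeper = c("D", "E"), Saves = c(4, 5))
)

# data.frame() on the whole list tries to bind the tables side by side,
# which fails because they have 3 and 2 rows respectively
result <- tryCatch(data.frame(tables), error = function(e) conditionMessage(e))
print(result)

# Taking just the first table, as html_node("table") does, avoids the clash
first <- data.frame(tables[[1]])
print(dim(first))
```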

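I'm also not sure the Year column is set up to work the way you expected, since you would currently get 2019 and 2020. I've changed it so that you get 2017 and 2018. By the way, you don't need to increment j; the for loop does that for you.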
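As a quick check of the point about j, this minimal sketch shows that assigning to the loop variable inside a for loop has no effect on the next iteration, and that indexing years by j removes the need for a separate counter. The vectors `seen_j` and `year_tag` are hypothetical helpers used only to record what the loop saw.

```r
years <- 2017:2018
seen_j <- integer(0)
year_tag <- integer(0)

for (j in seq_along(years)) {
  seen_j <- c(seen_j, j)
  year_tag <- c(year_tag, years[j])  # index by j instead of keeping a counter
  j <- j + 10  # overwritten at the start of the next iteration
}

print(seen_j)
print(year_tag)
```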

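library(rvest)
library(plyr)
library(dplyr)

years = c(2017:2018)
urls = list()

# Build one URL per season, e.g. .../2017-2018/Athletic-Bilbao
for (i in seq_along(years)) {
  url = paste0('https://fbref.com/en/squads/2b390eca/',years[i],'-',years[i] + 1,'/Athletic-Bilbao')
  urls[[i]] = url
}

tbl = list()

for (j in seq_along(urls)) {
  # html_node() returns only the first table on the page (the standard stats)
  tbl[[j]] <- urls[[j]] %>%
              read_html() %>%
              html_node("table") %>%
              html_table()
  # Tag every row with the season's starting year
  tbl[[j]]$Year = years[j]
}

# Stack the per-season tables into a single data frame
data = ldply(tbl,data.frame)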
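For trying out the final combining step without hitting the site, here is a rough offline sketch. The two small data frames are invented stand-ins for the scraped season tables, and base R's do.call(rbind, ...) is swapped in for ldply so the snippet needs no extra packages; it assumes, as in this toy data, that every season's table has the same columns.

```r
years <- 2017:2018

# Invented stand-ins for the first table of each season's page
tbl <- list(
  data.frame(Player = c("A", "B"), Goals = c(3, 1)),
  data.frame(Player = c("A", "C", "D"), Goals = c(2, 0, 5))
)

# Tag each table with its season's starting year
for (j in seq_along(tbl)) {
  tbl[[j]]$Year <- years[j]
}

# Stack the tables row-wise into one data frame
combined <- do.call(rbind, tbl)
print(combined)

# Row counts per season, as a sanity check on the Year column
print(table(combined$Year))
```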

This Q&A is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/60875436/
