r - Fill down based on a pattern

Tags: r dplyr

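I have 4 columns: City, Locality.Name, Buy.Rates and Buy.Rates.1. When three of them (Locality.Name, Buy.Rates and Buy.Rates.1) hold the same value in a row, I want to take that value and put it into a new column named "Updated.City", then keep using it for the following rows until a new value appears.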

df <- structure(list(City = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), .Label = c("delhi-ncr", "gurgaon", "noida", 
"greater-noida", "ghaziabad", "faridabad", "mumbai", "bangalore", 
"chennai", "hyderabad", "pune", "kolkata", "ahmedabad", "bhubaneswar", 
"coimbatore", "indore", "nagpur", "vadodara", "chandigarh", "jaipur", 
"lucknow", "surat"), class = "factor"), Locality.Name = c("Delhi East", 
"Akshardham", "Dilshad Colony", "Dilshad Garden", "I P Extension", 
"Delhi South", "Adhchini", "Alaknanda", "Ashram", "Aya Nagar", 
"Chattarpur"), Buy.Rates = c("Delhi East", "Rs. 16,150 - 18,190/sq. ft.", 
"Rs. 5,398 - 6,290/sq. ft.", "Rs. 6,290 - 8,372/sq. ft.", "Rs. 8,288 - 9,435/sq. ft.", 
"Delhi South", "-", "Rs. 10,710 - 12,070/sq. ft.", "Rs. 9,520 - 11,008/sq. ft.", 
"-", "Rs. 3,485 - 4,760/sq. ft."), Buy.Rates.1 = c("Delhi East", 
"-1.7%", "-10.19%", "7.01%", "0.96%", "Delhi South", "-", "-3.24%", 
"-", "-", "7.78%")), row.names = c(1L, 2L, 3L, 4L, 5L, 70L, 71L, 
72L, 73L, 74L, 75L), class = "data.frame")

Desired output (with the added column "Updated.City"):

+-----------+--------------+----------------+-----------------------------+-------------+
|   City    | Updated.City | Locality.Name  |          Buy.Rates          | Buy.Rates.1 |
+-----------+--------------+----------------+-----------------------------+-------------+
| delhi-ncr | Delhi East   | Delhi East     | Delhi East                  | Delhi East  |
| delhi-ncr | Delhi East   | Akshardham     | Rs. 16,150 - 18,190/sq. ft. | -1.70%      |
| delhi-ncr | Delhi East   | Dilshad Colony | Rs. 5,398 - 6,290/sq. ft.   | -10.19%     |
| delhi-ncr | Delhi East   | Dilshad Garden | Rs. 6,290 - 8,372/sq. ft.   | 7.01%       |
| delhi-ncr | Delhi East   | I P Extension  | Rs. 8,288 - 9,435/sq. ft.   | 0.96%       |
| delhi-ncr | Delhi South  | Delhi South    | Delhi South                 | Delhi South |
| delhi-ncr | Delhi South  | Adhchini       | -                           | -           |
| delhi-ncr | Delhi South  | Alaknanda      | Rs. 10,710 - 12,070/sq. ft. | -3.24%      |
| delhi-ncr | Delhi South  | Ashram         | Rs. 9,520 - 11,008/sq. ft.  | -           |
| delhi-ncr | Delhi South  | Aya Nagar      | -                           | -           |
| delhi-ncr | Delhi South  | Chattarpur     | Rs. 3,485 - 4,760/sq. ft.   | 7.78%       |
+-----------+--------------+----------------+-----------------------------+-------------+

Best answer

Using dplyr and tidyr:

library(dplyr)
library(tidyr)

df %>% 
  mutate(Updated.City = if_else(Locality.Name == Buy.Rates & Locality.Name == Buy.Rates.1,
                                Locality.Name, NA_character_)) %>% 
  fill(Updated.City, .direction = "down")

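This first creates Updated.City, which holds the value of Locality.Name on rows where all three columns match and NA everywhere else. It then fills the column downward, replacing each NA with the most recent non-NA value above it.

This gives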

        City  Locality.Name                   Buy.Rates Buy.Rates.1 Updated.City
1  delhi-ncr     Delhi East                  Delhi East  Delhi East   Delhi East
2  delhi-ncr     Akshardham Rs. 16,150 - 18,190/sq. ft.       -1.7%   Delhi East
3  delhi-ncr Dilshad Colony   Rs. 5,398 - 6,290/sq. ft.     -10.19%   Delhi East
4  delhi-ncr Dilshad Garden   Rs. 6,290 - 8,372/sq. ft.       7.01%   Delhi East
5  delhi-ncr  I P Extension   Rs. 8,288 - 9,435/sq. ft.       0.96%   Delhi East
6  delhi-ncr    Delhi South                 Delhi South Delhi South  Delhi South
7  delhi-ncr       Adhchini                           -           -  Delhi South
8  delhi-ncr      Alaknanda Rs. 10,710 - 12,070/sq. ft.      -3.24%  Delhi South
9  delhi-ncr         Ashram  Rs. 9,520 - 11,008/sq. ft.           -  Delhi South
10 delhi-ncr      Aya Nagar                           -           -  Delhi South
11 delhi-ncr     Chattarpur   Rs. 3,485 - 4,760/sq. ft.       7.78%  Delhi South
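
For comparison, the same mark-then-fill idea can be sketched in base R without extra packages. This is only an illustration, not part of the original answer: the data frame `toy` and its shortened rate strings are made up, and the sketch assumes none of the three columns contains NA.

```r
# Hypothetical toy data with the same column layout as the question
toy <- data.frame(
  Locality.Name = c("Delhi East", "Akshardham", "Dilshad Colony",
                    "Delhi South", "Adhchini"),
  Buy.Rates     = c("Delhi East", "Rs. 1", "Rs. 2", "Delhi South", "-"),
  Buy.Rates.1   = c("Delhi East", "-1.7%", "-10.19%", "Delhi South", "-"),
  stringsAsFactors = FALSE
)

# TRUE on "header" rows, where all three columns hold the same value
is_header <- toy$Locality.Name == toy$Buy.Rates &
  toy$Locality.Name == toy$Buy.Rates.1

# cumsum() numbers the blocks 1, 2, ...; rows before the first header get 0
block <- cumsum(is_header)

# Look up each row's header; the leading NA covers block 0
headers <- toy$Locality.Name[is_header]
toy$Updated.City <- c(NA_character_, headers)[block + 1]

print(toy)
```

Here `cumsum()` plays the role of `fill()`: every row inherits the header of the block it belongs to. If the columns can contain NA, the comparison itself returns NA and this sketch would need an explicit check first.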

A similar question on Stack Overflow: https://stackoverflow.com/questions/64315040/
