Reducing the number of factor levels within the same column

Tags: r

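I need to reduce the number of levels of the factor variable "Weapon Description", which has 80 levels, down to 8. I have used grepl before when I wanted a binary outcome, but now I need 8 levels and I don't know how to proceed. The example below shows how I would handle it if the outcome were binary. I need help extending this to 8 types.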

crime_3yr$Weapon.Used<-ifelse(grepl(crime_3yr$Weapon.Description,pattern = "GUN|AXE|RIFLE"),"Melee","Ranged")

Currently, the first 10 levels are:

 [2] "AIR PISTOL/REVOLVER/RIFLE/BB GUN"
 [3] "ANTIQUE FIREARM"                               
 [4] "ASSAULT WEAPON/UZI/AK47/ETC"                   
 [5] "AUTOMATIC WEAPON/SUB-MACHINE GUN"              
 [6] "AXE"                                           
 [7] "BELT FLAILING INSTRUMENT/CHAIN"                
 [8] "BLACKJACK"                                     
 [9] "BLUNT INSTRUMENT"                              
[10] "BOARD"                                         
[11] "BOMB THREAT"

I would like them to become:

 [2] "hand gun"
 [3] "hand gun"                               
 [4] "Assault rifle"                   
 [5] "Assault rifle"              
 [6] "melee"                                           
 [7] "melee"                
 [8] "melee"                                     
 [9] "melee"                              
[10] "misc"                                         
[11] "misc"

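I realize I haven't listed the 8 levels I want, because I haven't decided on the final groups yet. I just need to know how to collapse the original levels into more than 2 levels. The 80 levels are: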

 [2] "AIR PISTOL/REVOLVER/RIFLE/BB GUN"
 [3] "ANTIQUE FIREARM"                               
 [4] "ASSAULT WEAPON/UZI/AK47/ETC"                   
 [5] "AUTOMATIC WEAPON/SUB-MACHINE GUN"              
 [6] "AXE"                                           
 [7] "BELT FLAILING INSTRUMENT/CHAIN"                
 [8] "BLACKJACK"                                     
 [9] "BLUNT INSTRUMENT"                              
[10] "BOARD"                                         
[11] "BOMB THREAT"                                   
[12] "BOTTLE"                                        
[13] "BOW AND ARROW"                                 
[14] "BOWIE KNIFE"                                   
[15] "BRASS KNUCKLES"                                
[16] "CAUSTIC CHEMICAL/POISON"                       
[17] "CLEAVER"                                       
[18] "CLUB/BAT"                                      
[19] "CONCRETE BLOCK/BRICK"                          
[20] "DEMAND NOTE"                                   
[21] "DIRK/DAGGER"                                   
[22] "DOG/ANIMAL (SIC ANIMAL ON)"                    
[23] "EXPLOXIVE DEVICE"                              
[24] "FIRE"                                          
[25] "FIXED OBJECT"                                  
[26] "FOLDING KNIFE"                                 
[27] "GLASS"                                         
[28] "HAMMER"                                        
[29] "HAND GUN"                                      
[30] "HECKLER & KOCH 91 SEMIAUTOMATIC ASSAULT RIFLE" 
[31] "HECKLER & KOCH 93 SEMIAUTOMATIC ASSAULT RIFLE" 
[32] "ICE PICK"                                      
[33] "KITCHEN KNIFE"                                 
[34] "KNIFE WITH BLADE 6INCHES OR LESS"              
[35] "KNIFE WITH BLADE OVER 6 INCHES IN LENGTH"      
[36] "LIQUOR/DRUGS"                                  
[37] "M-14 SEMIAUTOMATIC ASSAULT RIFLE"              
[38] "M1-1 SEMIAUTOMATIC ASSAULT RIFLE"              
[39] "MAC-10 SEMIAUTOMATIC ASSAULT WEAPON"           
[40] "MAC-11 SEMIAUTOMATIC ASSAULT WEAPON"           
[41] "MACE/PEPPER SPRAY"                             
[42] "MACHETE"                                       
[43] "MARTIAL ARTS WEAPONS"                          
[44] "OTHER CUTTING INSTRUMENT"                      
[45] "OTHER FIREARM"                                 
[46] "OTHER KNIFE"                                   
[47] "PHYSICAL PRESENCE"                             
[48] "PIPE/METAL PIPE"                               
[49] "RAZOR"                                         
[50] "RAZOR BLADE"                                   
[51] "RELIC FIREARM"                                 
[52] "REVOLVER"                                      
[53] "RIFLE"                                         
[54] "ROCK/THROWN OBJECT"                            
[55] "ROPE/LIGATURE"                                 
[56] "SAWED OFF RIFLE/SHOTGUN"                       
[57] "SCALDING LIQUID"                               
[58] "SCISSORS"                                      
[59] "SCREWDRIVER"                                   
[60] "SEMI-AUTOMATIC PISTOL"                         
[61] "SEMI-AUTOMATIC RIFLE"                          
[62] "SHOTGUN"                                       
[63] "SIMULATED GUN"                                 
[64] "STARTER PISTOL/REVOLVER"                       
[65] "STICK"                                         
[66] "STRAIGHT RAZOR"                                
[67] "STRONG-ARM (HANDS, FIST, FEET OR BODILY FORCE)"
[68] "STUN GUN"                                      
[69] "SWITCH BLADE"                                  
[70] "SWORD"                                         
[71] "SYRINGE"                                       
[72] "TIRE IRON"                                     
[73] "TOY GUN"                                       
[74] "UNK TYPE SEMIAUTOMATIC ASSAULT RIFLE"          
[75] "UNKNOWN FIREARM"                               
[76] "UNKNOWN TYPE CUTTING INSTRUMENT"               
[77] "UNKNOWN WEAPON/OTHER WEAPON"                   
[78] "UZI SEMIAUTOMATIC ASSAULT RIFLE"               
[79] "VEHICLE"                                       
[80] "VERBAL THREAT"   

Best answer

library(dplyr)

example <- data.frame(key = c(1:10), 
                      values = c("knife", "gun", "bomb", "fork", 
                                 "ball", "dog", "cat", "paper", 
                                 "redfish", "honey")
                      )
  key values
1   1  knife
2   2    gun
3   3   bomb
4   4   fork
5   5   ball
6   6    dog

example %>% 
    mutate(newValues = case_when(
        grepl(x = values, pattern = "knife|gun|bomb") ~ "weapon",
        grepl(x = values, pattern = "fork|ball|paper|honey") ~ "other",
        grepl(x = values, pattern = "cat|dog|redfish") ~ "pet",
        TRUE ~ "Unknown")
    ) 

  key values newValues
1   1  knife    weapon
2   2    gun    weapon
3   3   bomb    weapon
4   4   fork     other
5   5   ball     other
6   6    dog       pet
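
Applied to the question's data, the same pattern might look like the sketch below. The group names and regular expressions are assumptions chosen for illustration (the asker had not settled on the final 8 levels), and crime_sample is a hypothetical stand-in for crime_3yr built from a few of the listed levels. Since case_when uses the first condition that matches, the more specific patterns are placed first.

```r
library(dplyr)

# Hypothetical stand-in for crime_3yr, using a few of the 80 levels
# listed in the question
crime_sample <- data.frame(
  Weapon.Description = c("ANTIQUE FIREARM",
                         "ASSAULT WEAPON/UZI/AK47/ETC",
                         "AXE",
                         "BLUNT INSTRUMENT",
                         "HAND GUN",
                         "BOMB THREAT",
                         "KITCHEN KNIFE"),
  stringsAsFactors = TRUE
)

# The patterns and group names below are illustrative assumptions
crime_sample <- crime_sample %>%
  mutate(Weapon.Used = case_when(
    # case_when takes the first condition that is TRUE, so order matters
    grepl("ASSAULT|SUB-MACHINE|UZI", Weapon.Description) ~ "assault rifle",
    grepl("GUN|FIREARM|PISTOL|REVOLVER", Weapon.Description) ~ "hand gun",
    grepl("AXE|KNIFE|BLUNT|BLACKJACK|CHAIN", Weapon.Description) ~ "melee",
    TRUE ~ "misc"  # anything that matched none of the patterns above
  ))

crime_sample
```

Adding more groups is just a matter of adding more grepl lines before the final TRUE fallback.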

Notes: 1. If you don't want to create a new column, just assign the result back to the same column; 2. If you need a factor, just pipe the result of case_when into factor.
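
Both notes can be combined in one step. The following is a minimal sketch, assuming a small made-up data frame; the explicit levels argument is optional and is only there to fix the order of the resulting levels.

```r
library(dplyr)

# Small made-up data frame for illustration
example <- data.frame(key = 1:4,
                      values = c("knife", "dog", "fork", "rock"))

example <- example %>%
  mutate(values = case_when(   # same column name, so the column is overwritten
    grepl("knife|gun|bomb", values) ~ "weapon",
    grepl("cat|dog|redfish", values) ~ "pet",
    grepl("fork|ball|paper|honey", values) ~ "other",
    TRUE ~ "Unknown"
  ) %>%
    # case_when returns a character vector; convert it to a factor
    factor(levels = c("weapon", "pet", "other", "Unknown")))

levels(example$values)   # the collapsed set of levels
nlevels(example$values)  # how many levels remain
```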

For more on reducing factor levels within the same column, see the similar question on Stack Overflow: https://stackoverflow.com/questions/49439495/
