python - Removing specific characters/strings/character sequences in Python

Tags: python pandas dataframe type-conversion tuples

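I am building a long list of what appear to be tuples, which I want to convert to a DataFrame later, but some recurring character sequences are preventing that. Here is a small sample of the output: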

0,"GAME_ID                      21900001
EVENTNUM                            2
EVENTMSGTYPE                       12
EVENTMSGACTIONTYPE                  0
PERIOD                              1
WCTIMESTRING                  8:04 PM
PCTIMESTRING                    12:00
HOMEDESCRIPTION                      
NEUTRALDESCRIPTION                   
VISITORDESCRIPTION                   
SCORE                             NaN
SCOREMARGIN                       NaN
PERSON1TYPE                         0
PLAYER1_ID                          0
PLAYER1_NAME                      NaN
PLAYER1_TEAM_ID                   NaN
PLAYER1_TEAM_CITY                 NaN
PLAYER1_TEAM_NICKNAME             NaN
PLAYER1_TEAM_ABBREVIATION         NaN
PERSON2TYPE                         0
PLAYER2_ID                          0
PLAYER2_NAME                      NaN
PLAYER2_TEAM_ID                   NaN
PLAYER2_TEAM_CITY                 NaN
PLAYER2_TEAM_NICKNAME             NaN
PLAYER2_TEAM_ABBREVIATION         NaN
PERSON3TYPE                         0
PLAYER3_ID                          0
PLAYER3_NAME                      NaN
PLAYER3_TEAM_ID                   NaN
PLAYER3_TEAM_CITY                 NaN
PLAYER3_TEAM_NICKNAME             NaN
PLAYER3_TEAM_ABBREVIATION         NaN
VIDEO_AVAILABLE_FLAG                0
DESCRIPTION                          
TIME_ELAPSED                        0
TIME_ELAPSED_PERIOD                 0
Name: 0, dtype: object"

The desired output is:

GAME_ID                      21900001
EVENTNUM                            2
EVENTMSGTYPE                       12
EVENTMSGACTIONTYPE                  0
PERIOD                              1
WCTIMESTRING                  8:04 PM
PCTIMESTRING                    12:00
HOMEDESCRIPTION                      
NEUTRALDESCRIPTION                   
VISITORDESCRIPTION                   
SCORE                             NaN
SCOREMARGIN                       NaN
PERSON1TYPE                         0
PLAYER1_ID                          0
PLAYER1_NAME                      NaN
PLAYER1_TEAM_ID                   NaN
PLAYER1_TEAM_CITY                 NaN
PLAYER1_TEAM_NICKNAME             NaN
PLAYER1_TEAM_ABBREVIATION         NaN
PERSON2TYPE                         0
PLAYER2_ID                          0
PLAYER2_NAME                      NaN
PLAYER2_TEAM_ID                   NaN
PLAYER2_TEAM_CITY                 NaN
PLAYER2_TEAM_NICKNAME             NaN
PLAYER2_TEAM_ABBREVIATION         NaN
PERSON3TYPE                         0
PLAYER3_ID                          0
PLAYER3_NAME                      NaN
PLAYER3_TEAM_ID                   NaN
PLAYER3_TEAM_CITY                 NaN
PLAYER3_TEAM_NICKNAME             NaN
PLAYER3_TEAM_ABBREVIATION         NaN
VIDEO_AVAILABLE_FLAG                0
DESCRIPTION                          
TIME_ELAPSED                        0
TIME_ELAPSED_PERIOD                 0

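How do I strip the leading 0 and " at the start, as well as the trailing junk after TIME_ELAPSED_PERIOD? The int at the start and the int in the bottom row both increase by 1 until my program finishes, which could go up to about 320,000, so I need code that handles a range of int values. I think it is easiest to do this after the list has been created, so I don't think I need to show you any of my code. Systematically processing the characters would be enough. Thanks!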

Best answer

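If your input data is a list, you can try the following approach. Note that the code appears to assume each list element holds a single line of the output, not a whole multi-line record: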

inputlist = Your_list_to_be_corrected  # placeholder: assign your input list here

# Drop the footer rows of the form 'Name: 0, dtype: object"'
inputlist = [x for x in inputlist if "dtype: object" not in x]

# Fix the rows containing GAME_ID by stripping the leading int and the '"' in front of it
sep = 'GAME_ID'
for index, element in enumerate(inputlist):
    if sep in element:
        # Keep only the text after the first 'GAME_ID', then put the label back
        inputlist[index] = sep + element.split(sep, 1)[1]
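
If each list element is instead one whole multi-line record (as the sample in the question suggests), the line-by-line filter above would discard every record, since each one contains "dtype: object". A minimal sketch for that case, using regular expressions so that any integer (0 up to roughly 320,000) is handled. The names `PREFIX_RE`, `FOOTER_RE`, and `clean_record` are hypothetical and not part of the original answer, and the sketch assumes every record starts with `<int>,"` and ends with `Name: <int>, dtype: <type>"`:

```python
import re

# Leading '<int>,' plus an optional double quote, e.g. '0,"' or '319999,"'.
PREFIX_RE = re.compile(r'^\s*\d+,"?')
# Trailing footer line, e.g. 'Name: 0, dtype: object"' (closing quote optional).
FOOTER_RE = re.compile(r'\n?Name: \d+, dtype: \w+"?\s*$')


def clean_record(text: str) -> str:
    """Strip the assumed '<int>,"' prefix and 'Name: <int>, dtype: ...' footer."""
    text = PREFIX_RE.sub("", text, count=1)
    text = FOOTER_RE.sub("", text, count=1)
    return text


# Shortened sample record in the shape shown in the question.
sample = (
    '0,"GAME_ID                      21900001\n'
    'EVENTNUM                            2\n'
    'TIME_ELAPSED_PERIOD                 0\n'
    'Name: 0, dtype: object"'
)

cleaned = clean_record(sample)
print(cleaned)
```

To clean a whole list of such records, something like `[clean_record(r) for r in records]` should work. If the elements turn out to be real tuples holding pandas objects and not strings, cleaning text is probably the wrong tool, and building the DataFrame from the objects directly would be worth trying first.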

Regarding python - Removing specific characters/strings/character sequences in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60275899/
