Python pandas DataFrame: removing a column from the header

Tags: python numpy pandas header

I have the following code:

import pandas as pd

data = pd.read_csv('audit_nor.csv')
d1 = pd.get_dummies(data)
header = d1.columns.values
print(header)
print(type(header))

The output is:

['ID' 'Age' 'Income' 'Deductions' 'Hours' 'Adjustment' 'Adjusted'
 'Employment_Consultant' 'Employment_PSFederal' 'Employment_PSLocal'
 'Employment_PSState' 'Employment_Private' 'Employment_SelfEmp'
 'Employment_Unemployed' 'Employment_Volunteer' 'Education_Associate'
 'Education_Bachelor' 'Education_College' 'Education_Doctorate'
 'Education_HSgrad' 'Education_Master' 'Education_Preschool'
 'Education_Professional' 'Education_Vocational' 'Education_Yr10'
 'Education_Yr11' 'Education_Yr12' 'Education_Yr5t6' 'Education_Yr7t8'
 'Education_Yr9' 'Marital_Absent' 'Marital_Divorced' 'Marital_Married'
 'Marital_Married-spouse-absent' 'Marital_Unmarried' 'Marital_Widowed'
 'Occupation_Cleaner' 'Occupation_Clerical' 'Occupation_Executive'
 'Occupation_Farming' 'Occupation_Machinist' 'Occupation_Professional'
 'Occupation_Repair' 'Occupation_Sales' 'Occupation_Service'
 'Occupation_Support' 'Occupation_Transport' 'Sex_Female' 'Sex_Male'
 'Accounts_Cuba' 'Accounts_England' 'Accounts_Germany' 'Accounts_India'
 'Accounts_Indonesia' 'Accounts_Iran' 'Accounts_Ireland' 'Accounts_Jamaica'
 'Accounts_Malaysia' 'Accounts_Mexico' 'Accounts_Philippines'
 'Accounts_Portugal' 'Accounts_UnitedStates' 'Accounts_Vietnam']
<type 'numpy.ndarray'>

I am trying to remove 'ID' from the header so that I can drop the entire 'ID' column from the DataFrame. I tried:

columns = header.delete('ID')

But I get an error:

AttributeError: 'numpy.ndarray' object has no attribute 'delete'

What is the correct way to do this? Thanks!

Best answer

You can use numpy.delete, with numpy.where to find the index:

import numpy as np

print(np.where(header == 'ID'))
(array([0], dtype=int64),)

columns = np.delete(header, np.where(header == 'ID'))
print(columns)
['Age' 'Income' 'Deductions' 'Hours' 'Adjustment' 'Adjusted'
 'Employment_Consultant' 'Employment_PSFederal' 'Employment_PSLocal'
 'Employment_PSState' 'Employment_Private' 'Employment_SelfEmp'
 'Employment_Unemployed' 'Employment_Volunteer' 'Education_Associate'
 'Education_Bachelor' 'Education_College' 'Education_Doctorate'
 'Education_HSgrad' 'Education_Master' 'Education_Preschool'
 'Education_Professional' 'Education_Vocational' 'Education_Yr10'
 'Education_Yr11' 'Education_Yr12' 'Education_Yr5t6' 'Education_Yr7t8'
 'Education_Yr9' 'Marital_Absent' 'Marital_Divorced' 'Marital_Married'
 'Marital_Married-spouse-absent' 'Marital_Unmarried' 'Marital_Widowed'
 'Occupation_Cleaner' 'Occupation_Clerical' 'Occupation_Executive'
 'Occupation_Farming' 'Occupation_Machinist' 'Occupation_Professional'
 'Occupation_Repair' 'Occupation_Sales' 'Occupation_Service'
 'Occupation_Support' 'Occupation_Transport' 'Sex_Female' 'Sex_Male'
 'Accounts_Cuba' 'Accounts_England' 'Accounts_Germany' 'Accounts_India'
 'Accounts_Indonesia' 'Accounts_Iran' 'Accounts_Ireland' 'Accounts_Jamaica'
 'Accounts_Malaysia' 'Accounts_Mexico' 'Accounts_Philippines'
 'Accounts_Portugal' 'Accounts_UnitedStates' 'Accounts_Vietnam']
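
The same idea as a self-contained sketch, assuming a shortened four-element array in place of the real header (the names are a sample of the columns above, chosen only for illustration). np.delete returns a new array and leaves header unchanged. A boolean mask gives the same result without looking up the index first.

```python
import numpy as np

# Shortened stand-in for d1.columns.values (an assumption for illustration).
header = np.array(['ID', 'Age', 'Income', 'Sex_Male'], dtype=object)

# np.where returns a tuple of index arrays; take the first (and only) one.
idx = np.where(header == 'ID')[0]

# np.delete returns a new array; header itself is not modified.
columns = np.delete(header, idx)

# Equivalent result using a boolean mask instead of an index lookup.
columns_masked = header[header != 'ID']

print(columns)
print(columns_masked)
```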

Or you can use a list comprehension to remove ID:

columns = [x for x in header if x != 'ID']
print(columns)
['Age', 'Income', 'Deductions', 'Hours', 'Adjustment', 'Adjusted', 'Employment_Consultant', 'Employment_PSFederal', 'Employment_PSLocal', 'Employment_PSState', 'Employment_Private', 'Employment_SelfEmp', 'Employment_Unemployed', 'Employment_Volunteer', 'Education_Associate', 'Education_Bachelor', 'Education_College', 'Education_Doctorate', 'Education_HSgrad', 'Education_Master', 'Education_Preschool', 'Education_Professional', 'Education_Vocational', 'Education_Yr10', 'Education_Yr11', 'Education_Yr12', 'Education_Yr5t6', 'Education_Yr7t8', 'Education_Yr9', 'Marital_Absent', 'Marital_Divorced', 'Marital_Married', 'Marital_Married-spouse-absent', 'Marital_Unmarried', 'Marital_Widowed', 'Occupation_Cleaner', 'Occupation_Clerical', 'Occupation_Executive', 'Occupation_Farming', 'Occupation_Machinist', 'Occupation_Professional', 'Occupation_Repair', 'Occupation_Sales', 'Occupation_Service', 'Occupation_Support', 'Occupation_Transport', 'Sex_Female', 'Sex_Male', 'Accounts_Cuba', 'Accounts_England', 'Accounts_Germany', 'Accounts_India', 'Accounts_Indonesia', 'Accounts_Iran', 'Accounts_Ireland', 'Accounts_Jamaica', 'Accounts_Malaysia', 'Accounts_Mexico', 'Accounts_Philippines', 'Accounts_Portugal', 'Accounts_UnitedStates', 'Accounts_Vietnam']
# if you need to filter df by these columns
df = df[columns]
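
A runnable sketch of the list-based filtering, assuming a small hypothetical DataFrame in place of the real data (its values are made up for illustration). It also shows Index.drop, a pandas method that builds the same column list without going through a NumPy array.

```python
import pandas as pd

# Small hypothetical frame standing in for the one-hot encoded data.
df = pd.DataFrame({'ID': [1, 2], 'Age': [38, 35], 'Income': [81838.0, 72099.0]})

# Build the list of columns to keep, then select them.
columns = [c for c in df.columns if c != 'ID']
filtered = df[columns]

# Same selection via Index.drop, which raises KeyError if 'ID' is missing.
filtered_alt = df[df.columns.drop('ID')]

print(filtered.columns.tolist())
```

Both selections return a new DataFrame, so df keeps its ID column until the result is assigned back to it.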

Or slice off the first item of the array (this only works if ID is the first element of header):

columns = header[1:]
print(columns)
['Age' 'Income' 'Deductions' 'Hours' 'Adjustment' 'Adjusted'
 'Employment_Consultant' 'Employment_PSFederal' 'Employment_PSLocal'
 'Employment_PSState' 'Employment_Private' 'Employment_SelfEmp'
 'Employment_Unemployed' 'Employment_Volunteer' 'Education_Associate'
 'Education_Bachelor' 'Education_College' 'Education_Doctorate'
 'Education_HSgrad' 'Education_Master' 'Education_Preschool'
 'Education_Professional' 'Education_Vocational' 'Education_Yr10'
 'Education_Yr11' 'Education_Yr12' 'Education_Yr5t6' 'Education_Yr7t8'
 'Education_Yr9' 'Marital_Absent' 'Marital_Divorced' 'Marital_Married'
 'Marital_Married-spouse-absent' 'Marital_Unmarried' 'Marital_Widowed'
 'Occupation_Cleaner' 'Occupation_Clerical' 'Occupation_Executive'
 'Occupation_Farming' 'Occupation_Machinist' 'Occupation_Professional'
 'Occupation_Repair' 'Occupation_Sales' 'Occupation_Service'
 'Occupation_Support' 'Occupation_Transport' 'Sex_Female' 'Sex_Male'
 'Accounts_Cuba' 'Accounts_England' 'Accounts_Germany' 'Accounts_India'
 'Accounts_Indonesia' 'Accounts_Iran' 'Accounts_Ireland' 'Accounts_Jamaica'
 'Accounts_Malaysia' 'Accounts_Mexico' 'Accounts_Philippines'
 'Accounts_Portugal' 'Accounts_UnitedStates' 'Accounts_Vietnam']

# if you need to filter df by these columns
df = df[columns]
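
Since positional slicing silently removes the wrong column when ID is not first, a guarded version may be safer. This is a sketch on a small hypothetical DataFrame, and the fallback of leaving the frame untouched is an assumption about the desired behavior.

```python
import pandas as pd

# Small hypothetical frame; 'ID' happens to be the first column here.
df = pd.DataFrame({'ID': [1, 2], 'Age': [38, 35], 'Income': [81838.0, 72099.0]})

header = df.columns.values

# Only slice by position when 'ID' really is the first column.
if header[0] == 'ID':
    trimmed = df.iloc[:, 1:]
else:
    trimmed = df

print(trimmed.columns.tolist())
```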

But if you just need to remove the ID column, use drop:

df = df.drop('ID', axis=1)
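
A short sketch of drop on a small hypothetical DataFrame. The columns= keyword (available in pandas 0.21 and later) is equivalent to passing axis=1, and errors='ignore' is an optional setting that skips labels that are not present.

```python
import pandas as pd

# Small hypothetical frame standing in for the real data.
df = pd.DataFrame({'ID': [1, 2], 'Age': [38, 35], 'Income': [81838.0, 72099.0]})

# drop returns a new DataFrame; df itself keeps its 'ID' column.
without_id = df.drop(columns='ID')

# errors='ignore' avoids a KeyError when the label is already gone.
still_fine = without_id.drop(columns='ID', errors='ignore')

print(without_id.columns.tolist())
```

Assigning the result back, as in the line above the sketch, is what actually replaces the original frame.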

Source: a similar question on Stack Overflow about removing a column from a pandas DataFrame header: https://stackoverflow.com/questions/36185883/
