I wrote the code below, and it gives me this error:
"Given feature/column names do not match the ones for the data given during fit."
The training and prediction data have the same features.
import xgboost as xgb
from sklearn.compose import ColumnTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# data_preprocessing, target_columns, df_train and predicted are defined elsewhere
df_train = data_preprocessing(df_train)

# Split X and y
X_train = df_train.drop(target_columns, axis=1)
y_train = df_train[target_columns]

# Names of the categorical (object dtype) columns
categorical_columns = X_train.columns[X_train.dtypes == 'O'].tolist()
# Names of the numerical (non-object dtype) columns
numerical_columns = X_train.columns[X_train.dtypes != 'O'].tolist()

# Scaling & encoding objects
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
categorical_transformer = OneHotEncoder(handle_unknown='ignore')
col_transformers = ColumnTransformer(
    # name, transformer itself, columns to apply
    transformers=[("scaler_onestep", numeric_transformer, numerical_columns),
                  ("ohe_onestep", categorical_transformer, categorical_columns)])

# One XGBoost binary classifier per target column
model = MultiOutputClassifier(
    xgb.XGBClassifier(objective="binary:logistic",
                      colsample_bytree=0.5))

# Define a pipeline
pipeline = Pipeline([("preprocessing", col_transformers), ("XGB", model)])
pipeline.fit(X_train, y_train)

# Preprocess the data to predict on
predicted = data_preprocessing(predicted)
X_predicted = predicted.drop(target_columns, axis=1)
predictions = pipeline.predict(X_predicted)
I get the error during prediction. How can I solve this? I couldn't find any solution.
Best answer
Try reordering the columns in X_predicted so that they exactly match those of X_train.
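A minimal sketch of that reordering, assuming both frames are pandas DataFrames. The helper name align_columns and the toy column names (age, city, income) are hypothetical and only serve the illustration; they do not come from the question. The helper also reports columns that are missing from the prediction data, since a missing column cannot be fixed by reordering.

```python
import pandas as pd


def align_columns(X_new: pd.DataFrame, fitted_columns) -> pd.DataFrame:
    """Return X_new restricted to the fitted columns, in the fitted order.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    fitted_columns = list(fitted_columns)
    missing = [c for c in fitted_columns if c not in X_new.columns]
    if missing:
        # Reordering cannot help if a training column is absent.
        raise ValueError(f"Columns missing from prediction data: {missing}")
    # Selecting with a list reorders the columns and drops any extras.
    return X_new[fitted_columns]


# Toy data: same features as training, but in a different order.
X_train = pd.DataFrame({"age": [25, 40], "city": ["A", "B"], "income": [1.0, 2.0]})
X_predicted = pd.DataFrame({"income": [3.0], "city": ["A"], "age": [31]})

X_aligned = align_columns(X_predicted, X_train.columns)
print(list(X_aligned.columns))
```

In the question's code this would amount to calling pipeline.predict(X_predicted[X_train.columns]) instead of pipeline.predict(X_predicted), assuming X_train is still in scope at prediction time.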
Regarding python - "Given feature/column names do not match the ones for the data given during fit." error, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68362413/