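I recently wrote a logistic regression model using scikit-learn. However, I'm having a hard time plotting the decision boundary line. I'm explicitly multiplying the coefficients and the intercept and plotting the result (which in turn produces a wrong figure).
Could someone point me in the right direction on how to plot the decision boundary?
Is there an easier way to plot the line without having to manually multiply the coefficients and the intercepts?
Thanks a million!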
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
#Import Dataset
dataset = pd.read_csv("Students Exam Dataset.txt", names=["Exam 1", "Exam 2", "Admitted"])
print(dataset.head())
#Visualizing Dataset
positive = dataset[dataset["Admitted"] == 1]
negative = dataset[dataset["Admitted"] == 0]
plt.scatter(positive["Exam 1"], positive["Exam 2"], color="blue", marker="o", label="Admitted")
plt.scatter(negative["Exam 1"], negative["Exam 2"], color="red", marker="x", label="Not Admitted")
plt.title("Student Admission Plot")
plt.xlabel("Exam 1")
plt.ylabel("Exam 2")
plt.legend()
plt.plot()
plt.show()
#Preprocessing Data
col = len(dataset.columns)
x = dataset.iloc[:,0:col].values
y = dataset.iloc[:,col-1:col].values
print(f"X Shape: {x.shape} Y Shape: {y.shape}")
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1306)
#Initialize Model
reg = LogisticRegression()
reg.fit(x_train, y_train)
#Output
predictions = reg.predict(x_test)
accuracy = accuracy_score(y_test, predictions) * 100
coeff = reg.coef_
intercept = reg.intercept_
print(f"Accuracy Score : {accuracy} %")
print(f"Coefficients = {coeff}")
print(f"Intercept Coefficient = {intercept}")
#Visualizing Output
xx = np.linspace(30,100,100)
decision_boundary = (coeff[0,0] * xx + intercept.item()) / coeff[0,1]
plt.scatter(positive["Exam 1"], positive["Exam 2"], color="blue", marker="o", label="Admitted")
plt.scatter(negative["Exam 1"], negative["Exam 2"], color="red", marker="x", label="Not Admitted")
plt.plot(xx, decision_boundary, color="green", label="Decision Boundary")
plt.title("Student Admission Plot")
plt.xlabel("Exam 1")
plt.ylabel("Exam 2")
plt.legend()
plt.show()
Best Answer
Is there an easier way to plot the line without having to manually multiply the coefficients and the intercepts?
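Yes. If you don't need to build it from scratch, the mlxtend package has an excellent implementation for plotting decision boundaries from scikit-learn classifiers. Its documentation is extensive, and the package is easy to install with pip install mlxtend.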
First, a couple of points about the Preprocessing block of the code you posted:
1. x should not include the class labels.
2. y should be a 1d array.
#Preprocessing Data
col = len(dataset.columns)
x = dataset.iloc[:,0:col-1].values # assumes your labels are always in the final column.
y = dataset.iloc[:,col-1:col].values
y = y.reshape(-1) # convert to 1d
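One consequence worth spelling out: the model in the question was fitted on an x that still contained the Admitted column, so the split and the fit need to be rerun after this fix. Here is a minimal sketch of that, assuming the same three-column layout as the question. The original data file isn't available here, so the DataFrame below is made-up stand-in data, and max_iter=1000 is just a cautious choice for unscaled features.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for "Students Exam Dataset.txt" (made-up values, same columns).
rng = np.random.default_rng(0)
scores = rng.uniform(30, 100, size=(100, 2))
admitted = (scores.sum(axis=1) > 130).astype(int)
dataset = pd.DataFrame(
    {"Exam 1": scores[:, 0], "Exam 2": scores[:, 1], "Admitted": admitted}
)

# Features: every column except the last. Labels: the last column, as a 1d array.
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=1306
)

# Refit on the two exam scores only.
reg = LogisticRegression(max_iter=1000)
reg.fit(x_train, y_train)

print(x.shape, y.shape)  # (100, 2) (100,)
print(reg.coef_.shape)   # (1, 2): one weight per exam score
```

With the labels removed from x, reg.coef_ holds exactly two weights, which is what any 2D boundary plot expects.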
Now, with the model refitted on the corrected x and y, plotting is as simple as:
from mlxtend.plotting import plot_decision_regions
plot_decision_regions(x, y,
                      X_highlight=x_test,
                      clf=reg,
                      legend=2)
This particular plot highlights the x_test data points by circling them.
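If you would still rather draw the line by hand, note that the boundary is the set of points where w1*x1 + w2*x2 + b = 0. Solving for x2 gives x2 = -(w1*x1 + b) / w2, and the expression in the question is missing that minus sign. The sketch below assumes a model trained on exactly two features, and again uses made-up data rather than the original file. The names w1, w2 and b are only illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up two-feature data standing in for the exam scores.
rng = np.random.default_rng(0)
x = rng.uniform(30, 100, size=(200, 2))
y = (x.sum(axis=1) > 130).astype(int)

reg = LogisticRegression(max_iter=1000).fit(x, y)
w1, w2 = reg.coef_[0]
b = reg.intercept_[0]

# Boundary: w1*x1 + w2*x2 + b = 0  =>  x2 = -(w1*x1 + b) / w2
xx = np.linspace(30, 100, 100)
decision_boundary = -(w1 * xx + b) / w2

# Sanity check: points on the line should get a decision score of about zero.
boundary_points = np.column_stack([xx, decision_boundary])
scores = reg.decision_function(boundary_points)
print(np.allclose(scores, 0.0, atol=1e-6))
```

The resulting xx and decision_boundary arrays can be passed to plt.plot(xx, decision_boundary, color="green", label="Decision Boundary") exactly as in the question's plotting code.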
Source: a similar question on Stack Overflow about an error when plotting a decision boundary with Matplotlib in Python: https://stackoverflow.com/questions/53379333/