python - Train/test split without using scikit-learn

I have a house price prediction dataset, and I need to split it into train and test sets.
Is it possible to do this using numpy or scipy?
I cannot use scikit-learn at the moment.

Best answer

I know your question asks only about doing a train_test_split with numpy or scipy, but there is actually a very simple way to do it with Pandas:

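import pandas as pd

# df is assumed to be your existing DataFrame (e.g. the house price data)

# Shuffle the dataset (pass random_state=... to make the shuffle reproducible)
shuffle_df = df.sample(frac=1)

# Define the size of the train set (70% of the rows)
train_size = int(0.7 * len(df))

# Split the dataset
train_set = shuffle_df[:train_size]
test_set = shuffle_df[train_size:]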

This works well for anyone who wants a quick, simple solution.
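
Since the question asks specifically about numpy, the same shuffle-then-slice idea can be sketched without Pandas. This is only a minimal illustration, not part of the original answer: the helper name `split_train_test`, its `train_frac` and `seed` parameters, and the toy arrays are all assumptions made for the example. It shuffles row indices once and applies them to both the features and the targets so that the rows stay aligned.

```python
import numpy as np


def split_train_test(X, y, train_frac=0.7, seed=0):
    """Hypothetical helper: shuffle rows, then split into train and test parts."""
    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")

    # One shared permutation keeps each feature row paired with its target.
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))

    train_size = int(train_frac * len(X))
    train_idx, test_idx = indices[:train_size], indices[train_size:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


# Toy data standing in for the house price dataset (assumed shapes).
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

X_train, X_test, y_train, y_test = split_train_test(X, y)
print(X_train.shape, X_test.shape)
```

Passing a fixed `seed` makes the split reproducible; changing it gives a different shuffle of the same rows.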

Regarding "python - Train/test split without using scikit-learn", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47202182/
