python - How to use the tabulate library with float64 values

Tags: python, kaggle

To pretty-print my data, I am using the tabulate library in Python. This is the code I am using:

import pandas as pd
from tabulate import tabulate

train = pd.read_csv('../misc/data/train.csv')
test = pd.read_csv('../misc/data/test.csv')

# Pretty-print the first rows of the training data
print(tabulate(train.head(), headers='keys', tablefmt='psql'))

The data is the Titanic dataset from Kaggle. Now I need to use tabulate on data that has float64 values. This is the code that gives me the error:

surv_age = train[train['Survived'] == 1]['Age'].value_counts()
dead_age = train[train['Survived'] == 0]['Age'].value_counts()

print(tabulate(surv_age, headers='keys', tablefmt='psql'))

df = pd.DataFrame([surv_age, dead_age])
df.index = ['Survived', 'Dead']
df.plot(kind='hist', stacked=True, figsize=(15, 8))
plt.xlabel('Age')
plt.ylabel('Number of passengers')
plt.show()

The error is:

Traceback (most recent call last):
  File "main.py", line 49, in <module>
    print(tabulate(surv_age, headers='keys', tablefmt='psql'))
  File "/usr/local/lib/python2.7/dist-packages/tabulate.py", line 1109, in tabulate
    tabular_data, headers, showindex=showindex)
  File "/usr/local/lib/python2.7/dist-packages/tabulate.py", line 741, in _normalize_tabular_data
    rows = [list(row) for row in vals]
TypeError: 'numpy.float64' object is not iterable

Line 49 is the print(tabulate(...)) line in my code.

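How can I iterate over the float64 values of the data so that they can be pretty-printed in a table? If this is not possible with tabulate, please suggest an alternative way to pretty-print the data. Here is an example of what tabulate produces: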

+----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------+
|    |   PassengerId |   Survived |   Pclass | Name                                                | Sex    |   Age |   SibSp |   Parch | Ticket           |    Fare | Cabin   | Embarked   |
|----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------|
|  0 |             1 |          0 |        3 | Braund, Mr. Owen Harris                             | male   |    22 |       1 |       0 | A/5 21171        |  7.25   | nan     | S          |
|  1 |             2 |          1 |        1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female |    38 |       1 |       0 | PC 17599         | 71.2833 | C85     | C          |
|  2 |             3 |          1 |        3 | Heikkinen, Miss. Laina                              | female |    26 |       0 |       0 | STON/O2. 3101282 |  7.925  | nan     | S          |
|  3 |             4 |          1 |        1 | Futrelle, Mrs. Jacques Heath (Lily May Peel)        | female |    35 |       1 |       0 | 113803           | 53.1    | C123    | S          |
|  4 |             5 |          0 |        3 | Allen, Mr. William Henry                            | male   |    35 |       0 |       0 | 373450           |  8.05   | nan     | S          |
+----+---------------+------------+----------+-----------------------------------------------------+--------+-------+---------+---------+------------------+---------+---------+------------+

Best answer

Quoting the tabulate documentation:

The following tabular data types are supported:

  • list of lists or another iterable of iterables
  • list or another iterable of dicts (keys as columns)
  • dict of iterables (keys as columns)
  • two-dimensional NumPy array
  • NumPy record arrays (names as columns)
  • pandas.DataFrame

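Your variable surv_age is one-dimensional (value_counts() returns a pandas Series, not a two-dimensional table). As the traceback shows, tabulate calls list(row) on each element, and a single numpy.float64 is not iterable. You need to reshape the data into a two-dimensional NumPy array. You can do this easily with numpy.reshape: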

import numpy as np
surv_age = np.reshape(surv_age.values, (-1, 1))  # .values gives the underlying 1-D array

You can also do this with np.expand_dims, like this:

surv_age = np.expand_dims(surv_age, axis=1)

Regarding "python - How to use the tabulate library with float64 values", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41240764/
