python - About the float_precision argument to pandas.read_csv

Tags: python algorithm pandas floating-point ieee-754

The documentation for the argument named in this post's title says:

float_precision : string, default None

Specifies which converter the C engine should use for floating-point values. The options are None for the ordinary converter, high for the high-precision converter, and round_trip for the round-trip converter.

I'd like to learn more about the three algorithms mentioned, preferably without having to dig into the source code¹.

Q: Do these algorithms have names that I can Google to learn exactly what they do and how they differ?

(Also, a side question: what exactly is the "C engine" in this context? Is it something specific to Pandas, or something Python-wide? Neither?)

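¹ Not being familiar with the code base in question, I expect it would take me a long time to locate the relevant source code. But even assuming I managed to find it, my experience with this kind of algorithm is that the implementations are heavily optimized and sit at such a low level that, without some high-level description, it is really hard, at least for me, to follow what is going on.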

Best answer

You asked about the actual algorithms. The closest I could find is: https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/parsers.pyx#L492

This is taken from a related answer, with thanks to MaxU (Understanding pandas.read_csv() float parsing):

Ordinary: double_converter_nogil = xstrtod
High: double_converter_nogil = precise_xstrtod
Round-Trip: double_converter_withgil = round_trip
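
To see how these converters are selected from Python, here is a minimal sketch. The sample data and the helper name parse_with are made up for illustration. It assumes that the round-trip converter agrees with Python's own float(), which is what I would expect but is not stated in the documentation quoted above.

```python
import io

import pandas as pd

# Hypothetical sample data: a single column of decimal strings.
CSV_TEXT = "x\n0.1\n1.2345678901234567\n3.141592653589793\n"


def parse_with(float_precision):
    """Parse CSV_TEXT with the C engine and the requested float converter."""
    frame = pd.read_csv(
        io.StringIO(CSV_TEXT),
        engine="c",  # float_precision applies to the C engine
        float_precision=float_precision,
    )
    return frame["x"].tolist()


default_values = parse_with(None)            # whatever the default converter is
high_values = parse_with("high")             # high-precision converter
round_trip_values = parse_with("round_trip")  # round-trip converter

# Python's built-in float() serves as the reference conversion here.
reference = [float(s) for s in CSV_TEXT.split()[1:]]

for name, values in [
    ("default", default_values),
    ("high", high_values),
    ("round_trip", round_trip_values),
]:
    # True where the parsed value is bit-for-bit what float() gives.
    print(name, [v == r for v, r in zip(values, reference)])
```

Whether the default and high-precision converters match float() on every input may depend on the pandas version and on the input, so the sketch prints the comparison instead of assuming a result.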

From here on you are in C land. You also asked why pandas uses C: the critical code paths are written in Cython or C.
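
For some intuition about why one string-to-double converter can be more precise than another, without reading the C code, here is a toy sketch in pure Python. The function naive_strtod is hypothetical and is not the pandas xstrtod implementation. It only shows how accumulating digits in floating point can lose accuracy in the last bits, while a "round-trip" conversion means that printing a float and parsing it back gives the same float.

```python
def naive_strtod(text):
    """Toy converter for unsigned decimals such as "12.345" (no sign, no exponent).

    Illustration only; this is NOT the pandas xstrtod implementation.
    """
    mantissa = 0.0
    fraction_digits = 0
    seen_point = False
    for ch in text:
        if ch == ".":
            seen_point = True
            continue
        # Once the running value no longer fits in 53 bits, each step is
        # rounded to the nearest double, so small errors can accumulate.
        mantissa = mantissa * 10.0 + int(ch)
        if seen_point:
            fraction_digits += 1
    # Scale back down by the number of digits after the decimal point.
    return mantissa / 10.0 ** fraction_digits


def round_trips(value):
    """True if printing a float and parsing it back yields the same float."""
    return float(repr(value)) == value


samples = ["0.1", "1.2345678901234567", "3.141592653589793"]
for text in samples:
    exact = float(text)          # Python's built-in conversion, used as reference
    approx = naive_strtod(text)  # may differ in the last bits for long inputs
    print(text, approx == exact, round_trips(exact))
```

For short inputs the toy converter and float() usually agree. Any difference tends to show up with inputs that have 16 or more significant digits.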

Source: this question about the float_precision argument to pandas.read_csv comes from Stack Overflow: https://stackoverflow.com/questions/44697714/
