Linux error - sort: both SI and IEC prefixes present on units

I'm using Python for Hadoop Streaming, but I have a problem with sorting. My test data gives the expected result:

cat movies.dat smallest.dat users.dat | ./Mapper.py | sort -h

Output:

Boomerang (1992)::F::3

However, the non-test data does not:

cat movies.dat ratings.dat users.dat | ./Mapper.py | sort -h

It fails with:

sort: both SI and IEC prefixes present on units
close failed in file object destructor:
Error in sys.excepthook:

Original exception was:

As a result, the Hadoop Streaming output is an empty file.

The test data is taken from the original files. ratings.dat comes from the MovieLens dataset, for example:

1::1193::5::978300760
1::661::3::978302109

and so on.

Can anyone explain what is happening, and what I can do about it?

Best answer

I changed the output order to F::Boomerang (1992)::3 and the error went away. I don't understand why, but it is no longer a problem.
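
For anyone who wants to try the same workaround, here is a minimal sketch of the reordering in Python. It is not the original Mapper.py (which isn't shown); the function name reorder_fields and the field names title, group and rating are assumptions made for illustration. One unverified guess about the cause: sort -h is a human-numeric sort that looks for a leading number followed by a unit suffix, so lines beginning with a title that starts with digits may be read as sizes, while lines beginning with a letter such as F are not.

```python
def reorder_fields(line, sep="::"):
    """Move the second field to the front of a three-field record.

    Hypothetical helper: turns 'title::group::rating' into
    'group::title::rating'. The field meanings are assumed.
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 3:
        # Leave unexpected lines untouched rather than guess at their layout.
        return line.rstrip("\n")
    title, group, rating = fields
    return sep.join([group, title, rating])


if __name__ == "__main__":
    # The sample record from the question.
    print(reorder_fields("Boomerang (1992)::F::3"))
```

If plain lexical order is enough, dropping -h and using plain sort in the local test pipeline should also avoid the unit-suffix parsing.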

Related question on Stack Overflow: https://stackoverflow.com/questions/37040036/