Today I tried running the following command on Linux to test Hadoop's Streaming interface:
cat test.txt|php wc_mapper.php|python Reducer.py
It fails with this error:
Traceback (most recent call last):
  File "Reducer.py", line 7, in <module>
    word,count = line.split()
ValueError: need more than 0 values to unpack
The contents of test.txt are:
hello world
hello world
hello world
wc_mapper.php, written in PHP, contains:
#!/usr/bin/php
<?php
error_reporting(E_ALL ^ E_NOTICE);
$word2count = array();
while (($line = fgets(STDIN)) !== false) {
    $line = trim($line);
    $words = preg_split('/\W/', $line, 0, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        echo $word, chr(9), "1", PHP_EOL;
    }
}
?>
Reducer.py, written in Python, contains:
#!/usr/bin/python
from operator import itemgetter
import sys
word2count = {}
for line in sys.stdin:
    line = line.strip()
    word,count = line.split()
    try:
        count = int(count)
        word2count[word] = word2count.get(word, 0) + count
    except ValueError:
        pass
sorted_word2count = sorted(word2count.items(), key=itemgetter(0))
for word,count in sorted_word2count:
    print '%s\t%s'%(word,count)
Does anyone know what causes this error and how to fix it?
When I run only the first part of the pipeline,
cat test.txt|php wc_mapper.php|sort
I get the following output:
hello 1
hello 1
hello 1
world 1
world 1
world 1
The first line of the output is empty but still takes up a line.
Best answer
Pass a separator to split():
try:
    word,count = line.split(" ")
except:
    print("Error")
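I used a single space as the separator; change it to match your data. Since the mapper writes a tab (chr(9)) between the word and the count, "\t" would be the matching separator here.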
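The traceback is consistent with the blank line seen in the mapper output: in Python, calling split() on an empty string returns an empty list, so there is nothing to unpack into word and count. A minimal sketch of a reducer that skips such lines might look like the following. It assumes Python 3 syntax (print as a function, unlike the Python 2 script in the question), and the helper names parse_line and reduce_counts are hypothetical, introduced only for illustration.

```python
def parse_line(line):
    """Return (word, count) for a 'word<TAB>count' line, or None if malformed."""
    line = line.strip()
    if not line:
        # Blank line: "".split() would yield [], which cannot be unpacked.
        return None
    parts = line.split("\t")
    if len(parts) != 2:
        # Not exactly two tab-separated fields; ignore the line.
        return None
    word, count = parts
    try:
        return word, int(count)
    except ValueError:
        # The count field is not an integer; ignore the line.
        return None


def reduce_counts(lines):
    """Sum the counts per word over an iterable of mapper output lines."""
    word2count = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        word, count = parsed
        word2count[word] = word2count.get(word, 0) + count
    return word2count


# Sample input mimicking the sorted mapper output, blank first line included.
sample = ["\n", "hello\t1\n", "hello\t1\n", "hello\t1\n",
          "world\t1\n", "world\t1\n", "world\t1\n"]

for word, count in sorted(reduce_counts(sample).items()):
    print("%s\t%s" % (word, count))
```

In an actual Streaming reducer, sys.stdin (after import sys) would be passed to reduce_counts in place of the sample list. Skipping malformed lines silently is one design choice; logging them to stderr would make unexpected mapper output easier to spot.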
Source: a similar question on Stack Overflow, "php - ValueError: need more than 0 values to unpack error in Python": https://stackoverflow.com/questions/37571862/