from pyspark import SparkContext, SparkConf, sql
from pyspark.sql import Row
sc = SparkContext.getOrCreate()
sqlContext = sql.SQLContext(sc)
df = sc.parallelize([
    Row(nama='Roni', umur=27, tingi=168),
    Row(nama='Roni', umur=6, tingi=168),
    Row(nama='Roni', umur=89, tingi=168)])
df.show()
error: Traceback (most recent call last):
  File "<ipython-input-24-bfb18ebba99e>", line 8, in <module>
    df.show()
AttributeError: 'RDD' object has no attribute 'show'
Best answer
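The error is clear: sc.parallelize returns an RDD, so df is an RDD, and show is a DataFrame method that an RDD does not have. Use toDF to convert it to a DataFrame, as in the following code: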
df = df.toDF()
df.show()
Regarding python - AttributeError: 'RDD' object has no attribute 'show', we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53618990/