I have a dataset with 4 columns, including "Date", "Num_week", and "Calendar".
df.head() looks like this:
Date Num_week Calendar
412 2012-01-01 1 (2012, 1)
413 2012-01-02 2 (2012, 1)
414 2012-01-03 2 (2012, 1)
415 2012-01-04 2 (2012, 1)
416 2012-01-05 2 (2012, 1)
I sort the values in the column with sorted(list(set(date_week['calendar']))).
Result:
['(2012, 1)',
'(2012, 10)',
'(2012, 11)',
'(2012, 12)',
'(2012, 2)',
'(2012, 3)', etc.
I tried to split out the year and the month in a loop:
for year, month in list(set(date_week['calendar'])):
    print(year, month)
But I get a ValueError:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-168-cf01e0d2888e> in <module>()
----> 1 for year, month in list(set(date_week['calendar'])):
2 print(year, month)
ValueError: too many values to unpack (expected 2)
I have already tried using .items(), but it gave the wrong result.
Can you help me solve this?
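Best answer
The problem is that the column holds string representations of tuples rather than actual tuples, so you need to convert them first: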
import ast
date_week['Calendar'] = date_week['Calendar'].apply(ast.literal_eval)
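To illustrate why the original loop fails, here is a minimal sketch using a made-up sample value that mirrors the question's data. Unpacking a string iterates over its characters, so '(2012, 1)' yields nine values instead of two, while ast.literal_eval parses it into a real tuple:

```python
import ast

# Hypothetical sample value mirroring one cell of the Calendar column
raw = '(2012, 1)'

# Unpacking a string iterates over its characters: 9 of them here, not 2
try:
    year, month = raw
except ValueError as exc:
    error_message = str(exc)

# literal_eval safely parses the string into a real tuple of ints
parsed = ast.literal_eval(raw)
year, month = parsed

print(error_message)
print(year, month)
```

ast.literal_eval only evaluates Python literals, which is why it is generally preferred over eval for data like this.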
After that, you can use your own solution or this alternative:
for year, month in date_week['Calendar'].unique():
    print(year, month)
2012 1
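The conversion should also fix the sort order shown in the question. The sketch below uses a plain list of made-up values as a stand-in for the Calendar column (an assumption for brevity, not the asker's actual data) to compare sorting before and after parsing:

```python
import ast

# Hypothetical values standing in for date_week['Calendar']
calendar_strings = ['(2012, 1)', '(2012, 10)', '(2012, 2)', '(2012, 1)']

# As strings, the values are compared character by character,
# so '(2012, 10)' lands before '(2012, 2)'
sorted_as_strings = sorted(set(calendar_strings))

# After parsing, tuples of ints are compared numerically
sorted_as_tuples = sorted({ast.literal_eval(s) for s in calendar_strings})

for year, month in sorted_as_tuples:
    print(year, month)
```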
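EDIT: An alternative solution uses Series.str.findall and converts each result to a tuple (note that the elements are then strings, not integers):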
date_week['Calendar'] = date_week['Calendar'].str.findall(r'\d+').apply(tuple)
print(date_week)
Date Num_week Calendar
412 2012-01-01 1 (2012, 1)
413 2012-01-02 2 (2012, 1)
414 2012-01-03 2 (2012, 1)
415 2012-01-04 2 (2012, 1)
416 2012-01-05 2 (2012, 1)
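Because findall returns the matched digits as strings, the resulting tuples would still compare as text. If integer tuples are wanted, one possible variation (a sketch on a made-up sample frame, not part of the original answer) converts each piece with int:

```python
import pandas as pd

# Hypothetical sample frame; Calendar holds strings, as in the question
date_week = pd.DataFrame({'Calendar': ['(2012, 1)', '(2012, 10)', '(2012, 2)']})

# findall returns lists of digit strings, e.g. ['2012', '1'];
# map each piece to int so the tuples compare numerically
date_week['Calendar'] = (
    date_week['Calendar']
    .str.findall(r'\d+')
    .apply(lambda parts: tuple(int(p) for p in parts))
)

calendar_values = sorted(set(date_week['Calendar']))
print(calendar_values)
```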
Regarding "python - too many values to unpack (expected 2) [list]", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55688033/