我在连接操作上遇到了非常严重的错误。我也尝试了合并(left_index,right_index),但结果相同。
索引是相同的(根据设计),通过两个索引上的index.is_unique (TRUE) 和index.get_duplicates() (EMPTY) 检查。
基础版:
df1.join(series)
merge(df1, series_as_df,
<小时/>
print tempres.index
奇怪的是打印值: 打印温度.索引.值 [1970-01-16 121:04:45 1970-01-16 121:04:35 1970-01-16 121:04:25 1970-01-16 121:04:15 1970-01-16 121:04:05 1970-01-16 121:03:55 1970-01-16 121:03:45 1970-01-16 121:03:35 1970-01-16 121:03:25 1970-01-16 121:03:15 1970-01-16 121:03:05 1970-01-16 121:02:55 1970-01-16 121:02:45 1970-01-16 121:02:35 1970-01-16 121:02:25 1970-01-16 121:02:15 1970-01-16 121:02:05 1970-01-16 121:01:55 1970-01-16 121:01:45 1970-01-16 121:01:35 1970-01-16 121:01:25 ...]
如果需要的话我可以添加pickled系列和df...
使用最新的 pandas 版本 0.10.x
谢谢
吕克
我的代码(从较大的代码中剪切出来)
XYTparams (existing dataframe)
prep_functions[funcname] = [list of values, same length as XYTparams]
iSeries = Series(prep_functions[funcname], index = XYTparams.index, name = funcname)
XYTparams = XYTparams.join(iSeries)
回顾我的问题:
我在基本 DataFrame 上使用连续合并和连接。有一次,我在尝试下一次合并/连接时开始出现错误。我无法在简单的测试中重现这一点,但我在问题开始之前保存了数据帧。
我找不到问题所在。
base_df = load('SPOparams.pic')
lookup_df = load('lookup.pic')
print base_df
print lookup_df
print base_df.count()
print base_df['VKCSKEY1']
print lookup_df['traf_key']
# reset index does not change a thing
base_df = base_df.reset_index(drop=True)
print base_df.index
print base_df.index.get_duplicates()
print lookup_df.index
print lookup_df.index.get_duplicates()
# checking value matches
for k in lookup_df['traf_key']:
print k, k in base_df['VKCSKEY1'].values
# why does this merge is unsuccesfull ???
# in any combination of the parameters
df_result =merge(base_df, lookup_df,
how='left',
#how = 'outer',
left_on ='VKCSKEY1',
right_on ='traf_key',
#left_index=True,
#right_index = True,
#sort=True,
#suffixes=('', '.m'), copy=True
)
print df_result
输出:
1.6.1
0.10.1
<class 'pandas.core.frame.DataFrame'>
Int64Index: 89 entries, 0 to 88
Data columns:
T 89 non-null values
X 89 non-null values
Y 89 non-null values
precip_quantity_1hour 89 non-null values
pressure 89 non-null values
rel_humidity 89 non-null values
temp 89 non-null values
temp_max 0 non-null values
temp_min 0 non-null values
wind_direction 89 non-null values
wind_speed 89 non-null values
BC_TRAF 89 non-null values
closest 89 non-null values
closest.m 89 non-null values
AGGP.P50_ID 89 non-null values
AGGP.FUNC_CLASS 89 non-null values
AGGP.SPEED_CAT 89 non-null values
LINK_ID 89 non-null values
FUNC_CLASS 89 non-null values
SPEED_CAT 89 non-null values
AR_AUTO 89 non-null values
AR_BUS 89 non-null values
AR_TAXIS 89 non-null values
AR_CARPOOL 89 non-null values
AR_PEDEST 89 non-null values
AR_TRUCKS 89 non-null values
STCA20_PCT 89 non-null values
VKC_LINKNR 89 non-null values
TRVIC150R1 89 non-null values
closest.m 89 non-null values
closest.m.m 89 non-null values
VKCP.LINK_ID 89 non-null values
VKCP.FUNC_CLASS 89 non-null values
VKCP.SPEED 89 non-null values
VKCP.LINKNR 89 non-null values
VKCP.TWIN_ID 89 non-null values
VKCSKEY1 89 non-null values
dtypes: datetime64[ns](1), float64(13), int64(9), object(14)
<class 'pandas.core.frame.DataFrame'>
Index: 30 entries, (60744, 0) to (58314, 0)
Data columns:
traf_key 30 non-null values
weekday_nr 30 non-null values
linknr 30 non-null values
weekday 30 non-null values
vr0 30 non-null values
vr1 30 non-null values
vr2 30 non-null values
vr3 30 non-null values
vr4 30 non-null values
vr5 30 non-null values
vr6 30 non-null values
vr7 30 non-null values
vr8 30 non-null values
vr9 30 non-null values
vr10 30 non-null values
vr11 30 non-null values
vr12 30 non-null values
vr13 30 non-null values
vr14 30 non-null values
vr15 30 non-null values
vr16 30 non-null values
vr17 30 non-null values
vr18 30 non-null values
vr19 30 non-null values
vr20 30 non-null values
vr21 30 non-null values
vr22 30 non-null values
vr23 30 non-null values
au0 30 non-null values
au1 30 non-null values
au2 30 non-null values
au3 30 non-null values
au4 30 non-null values
au5 30 non-null values
au6 30 non-null values
au7 30 non-null values
au8 30 non-null values
au9 30 non-null values
au10 30 non-null values
au11 30 non-null values
au12 30 non-null values
au13 30 non-null values
au14 30 non-null values
au15 30 non-null values
au16 30 non-null values
au17 30 non-null values
au18 30 non-null values
au19 30 non-null values
au20 30 non-null values
au21 30 non-null values
au22 30 non-null values
au23 30 non-null values
sn0 30 non-null values
sn1 30 non-null values
sn2 30 non-null values
sn3 30 non-null values
sn4 30 non-null values
sn5 30 non-null values
sn6 30 non-null values
sn7 30 non-null values
sn8 30 non-null values
sn9 30 non-null values
sn10 30 non-null values
sn11 30 non-null values
sn12 30 non-null values
sn13 30 non-null values
sn14 30 non-null values
sn15 30 non-null values
sn16 30 non-null values
sn17 30 non-null values
sn18 30 non-null values
sn19 30 non-null values
sn20 30 non-null values
sn21 30 non-null values
sn22 30 non-null values
sn23 30 non-null values
dtypes: float64(24), int64(50), object(2)
T 89
X 89
Y 89
precip_quantity_1hour 89
pressure 89
rel_humidity 89
temp 89
temp_max 0
temp_min 0
wind_direction 89
wind_speed 89
BC_TRAF 89
closest 89
closest.m 89
AGGP.P50_ID 89
AGGP.FUNC_CLASS 89
AGGP.SPEED_CAT 89
LINK_ID 89
FUNC_CLASS 89
SPEED_CAT 89
AR_AUTO 89
AR_BUS 89
AR_TAXIS 89
AR_CARPOOL 89
AR_PEDEST 89
AR_TRUCKS 89
STCA20_PCT 89
VKC_LINKNR 89
TRVIC150R1 89
closest.m 89
closest.m.m 89
VKCP.LINK_ID 89
VKCP.FUNC_CLASS 89
VKCP.SPEED 89
VKCP.LINKNR 89
VKCP.TWIN_ID 89
VKCSKEY1 89
0 (60744, 0)
1 (60744, 0)
2 (60744, 0)
3 (60750, 0)
4 (60768, 0)
5 (60768, 0)
6 (60758, 0)
7 (60758, 0)
8 (69223, 0)
9 (69223, 0)
10 (69223, 0)
11 (64265, 0)
12 (64265, 0)
13 (64265, 0)
14 (64265, 0)
15 (64265, 0)
16 (64265, 0)
17 (64265, 0)
18 (64265, 0)
19 (64265, 0)
20 (64216, 0)
21 (64216, 0)
22 (64216, 0)
23 (64216, 0)
24 (64216, 0)
25 (64216, 0)
26 (64216, 0)
27 (64216, 0)
28 (64216, 0)
29 (57085, 0)
30 (57085, 0)
31 (57085, 0)
32 (57085, 0)
33 (57085, 0)
34 (57085, 0)
35 (57014, 0)
36 (57033, 0)
37 (57033, 0)
38 (64065, 0)
39 (64065, 0)
40 (64065, 0)
41 (64065, 0)
42 (64065, 0)
43 (57070, 0)
44 (64062, 0)
45 (64062, 0)
46 (64062, 0)
47 (64062, 0)
48 (57070, 0)
49 (64061, 0)
50 (64061, 0)
51 (64061, 0)
52 (64061, 0)
53 (59849, 0)
54 (59415, 0)
55 (58487, 0)
56 (58054, 0)
57 (58054, 0)
58 (58054, 0)
59 (52551, 0)
60 (58054, 0)
61 (58054, 0)
62 (58054, 0)
63 (58054, 0)
64 (52551, 0)
65 (58054, 0)
66 (58488, 0)
67 (58488, 0)
68 (58028, 0)
69 (58464, 0)
70 (58028, 0)
71 (57989, 0)
72 (58595, 0)
73 (58027, 0)
74 (57989, 0)
75 (58595, 0)
76 (58595, 0)
77 (58019, 0)
78 (58595, 0)
79 (58595, 0)
80 (58019, 0)
81 (58595, 0)
82 (58595, 0)
83 (66715, 0)
84 (58595, 0)
85 (59295, 0)
86 (67614, 0)
87 (58314, 0)
88 (58314, 0)
Name: VKCSKEY1, Length: 89
VKCSKEY1
(60744, 0) (60744, 0)
(60750, 0) (60750, 0)
(60768, 0) (60768, 0)
(60758, 0) (60758, 0)
(69223, 0) (69223, 0)
(64265, 0) (64265, 0)
(64216, 0) (64216, 0)
(57085, 0) (57085, 0)
(57014, 0) (57014, 0)
(57033, 0) (57033, 0)
(64065, 0) (64065, 0)
(57070, 0) (57070, 0)
(64062, 0) (64062, 0)
(64061, 0) (64061, 0)
(59849, 0) (59849, 0)
(59415, 0) (59415, 0)
(58487, 0) (58487, 0)
(58054, 0) (58054, 0)
(52551, 0) (52551, 0)
(58488, 0) (58488, 0)
(58028, 0) (58028, 0)
(58464, 0) (58464, 0)
(57989, 0) (57989, 0)
(58595, 0) (58595, 0)
(58027, 0) (58027, 0)
(58019, 0) (58019, 0)
(66715, 0) (66715, 0)
(59295, 0) (59295, 0)
(67614, 0) (67614, 0)
(58314, 0) (58314, 0)
Name: traf_key
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], dtype=int64)
[]
Index([(60744, 0), (60750, 0), (60768, 0), (60758, 0), (69223, 0), (64265, 0), (64216, 0), (57085, 0), (57014, 0), (57033, 0), (64065, 0), (57070, 0), (64062, 0), (64061, 0), (59849, 0), (59415, 0), (58487, 0), (58054, 0), (52551, 0), (58488, 0), (58028, 0), (58464, 0), (57989, 0), (58595, 0), (58027, 0), (58019, 0), (66715, 0), (59295, 0), (67614, 0), (58314, 0)], dtype=object)
[]
(60744, 0) True
(60750, 0) True
(60768, 0) True
(60758, 0) True
(69223, 0) True
(64265, 0) True
(64216, 0) True
(57085, 0) True
(57014, 0) True
(57033, 0) True
(64065, 0) True
(57070, 0) True
(64062, 0) True
(64061, 0) True
(59849, 0) True
(59415, 0) True
(58487, 0) True
(58054, 0) True
(52551, 0) True
(58488, 0) True
(58028, 0) True
(58464, 0) True
(57989, 0) True
(58595, 0) True
(58027, 0) True
(58019, 0) True
(66715, 0) True
(59295, 0) True
(67614, 0) True
(58314, 0) True
Traceback (most recent call last):
File "L:\temp\pandas_join_bug.py", line 43, in <module>
right_on ='traf_key',
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 36, in merge
return op.get_result()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 185, in get_result
ldata, rdata = self._get_merge_data()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 277, in _get_merge_data
copydata=False)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1194, in _maybe_rename_join
to_rename = self.items.intersection(other.items)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 666, in intersection
indexer = self.get_indexer(other.values)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 812, in get_indexer
raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
一旦发生错误,我就无法使任何合并或连接语句成功。起初我没有发现该错误与重复的合并/连接操作有关。最新集合的任何单个合并/连接现在都可以工作。一旦我需要另一个合并/连接,我就会得到同样的错误。
最佳答案
重复的列名会导致此错误,请尝试消除重复的列名
关于python - Pandas 加入/合并 'Reindexing only valid with uniquely valued Index ',我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/15356287/