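Hi everyone. I need to anonymize the raw table to produce an anonymized table. In other words, I need to replace the non-repeated sets with asterisks.
Actually, I have already run this code: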
for j in range(len(zz_new)):
    for i in range(len(zz)):
        if zz_new.iloc[j][0] != zz.iloc[i][0]:
            zz_new.iat[j,0]="*"
        if zz_new.iloc[j][1] != zz.iloc[i][1]:
            zz_new.iat[j,1]="*"
        if zz_new.iloc[j][2] != zz.iloc[i][2]:
            zz_new.iat[j,2]="*"
        if zz_new.iloc[j][3] != zz.iloc[i][3]:
            zz_new.iat[j,3]="*"
        if zz_new.iloc[j][4] != zz.iloc[i][4]:
            zz_new.iat[j,4]="*"
But the result looks like this: My anonymized table. I would like to know whether you can help me get to the anonymized table.
Best answer
Use the value_counts() method:
df
     age  education
0  30-39    HS-grad
1  40-49  Bachelors
2  30-39    HS-grad
3  30-39       11th

# True for every education value that occurs exactly once
vcnt = df.education.value_counts().eq(1)
vcnt
HS-grad      False
Bachelors     True
11th          True
Name: education, dtype: bool

# vcnt.loc[vcnt].index holds the values that occur once; replace them with "*"
df["education"] = df.education.replace(vcnt.loc[vcnt].index, "*")
df
     age education
0  30-39   HS-grad
1  40-49         *
2  30-39   HS-grad
3  30-39         *
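
The answer above handles a single column, while the question loops over five. A minimal sketch of how the same idea might be applied to every column is shown here. It assumes that "non-repeated" means a value that occurs exactly once within its own column, which may not be what the asker intended. The helper name mask_unique_values and its parameters are hypothetical, and it uses Series.duplicated(keep=False) in place of value_counts(), which gives the same per-column result.

```python
import pandas as pd


def mask_unique_values(df, columns=None, placeholder="*"):
    """Return a copy of df in which values that occur only once
    in their column are replaced by the placeholder.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    out = df.copy()
    cols = df.columns if columns is None else columns
    for col in cols:
        # duplicated(keep=False) is True for every value that appears
        # more than once in the column, including its first occurrence.
        repeated = out[col].duplicated(keep=False)
        # Keep repeated values, replace the rest with the placeholder.
        out[col] = out[col].where(repeated, placeholder)
    return out


# Sample data taken from the answer above.
df = pd.DataFrame(
    {
        "age": ["30-39", "40-49", "30-39", "30-39"],
        "education": ["HS-grad", "Bachelors", "HS-grad", "11th"],
    }
)

anonymized = mask_unique_values(df)
print(anonymized)
```

Passing columns=["education"] would limit the masking to that column and should match the result of the answer above. The original DataFrame is left unchanged because the helper works on a copy.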
Regarding python - How to replace non-repeated values in a CSV file column with asterisks ("*")?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59508602/