python - Converting a DTM to text

I want to transform the following DTM (document-term matrix)

pd.DataFrame({"ID": [1,2,3,4,5],
              "t1": [0,0,1,1,0],
              "t2": [1,1,0,0,0],
              "t3": [1,0,1,0,0],
              "t4": [0,0,0,0,0]})

into this DataFrame

pd.DataFrame({"ID": [1,2,3,4,5],
              "text": ["t2, t3", "t2", "t1, t3", "t1", ""]})
>> 1  t2, t3
   2      t2
   3  t1, t3

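My attempt is the following script

import numpy as np
# replace each 1 with its column name and each 0 with "", leaving ID untouched
for col in df.columns.drop("ID"): df[col] = np.where(df[col] == 1, col, "")
df["text"] = df.drop(columns="ID").apply(lambda x: " ".join(x), axis=1).str.split().apply(lambda x: ", ".join(x))

but I would like to know whether there is a more Pythonic way to do this.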

Best answer

Use DataFrame.dot, after selecting the term columns either with filter or by position with iloc:

df1 = df.filter(like='t')

#df1 = df.iloc[:, 1:]
df = df[['ID']].join(df1.dot(df1.columns + ', ').str[:-2].rename('new'))
print (df)
   ID     new
0   1  t2, t3
1   2      t2
2   3  t1, t3
3   4      t1
4   5
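To make it clearer why the dot product yields text, here is a minimal pure-Python sketch of the arithmetic it relies on: multiplying a string by 1 keeps it, multiplying by 0 gives an empty string, and "summing" the products concatenates them. The names labels, rows, and row_to_text are hypothetical and used only for illustration; this is not how pandas implements dot internally.

```python
# Column names with the separator appended, as in df1.columns + ', '
labels = ["t1, ", "t2, ", "t3, ", "t4, "]

# The 0/1 rows of the DTM from the question, keyed by ID
rows = {
    1: [0, 1, 1, 0],
    2: [0, 1, 0, 0],
    3: [1, 0, 1, 0],
    4: [1, 0, 0, 0],
    5: [0, 0, 0, 0],
}

def row_to_text(flags, labels):
    # int * str repeats the string: 1 * "t2, " == "t2, ", 0 * "t2, " == ""
    joined = "".join(flag * label for flag, label in zip(flags, labels))
    # drop the trailing ", " (the same role .str[:-2] plays in the answer)
    return joined[:-2]

texts = {doc_id: row_to_text(flags, labels) for doc_id, flags in rows.items()}
print(texts)
```

One consequence of this arithmetic: if the matrix holds counts rather than 0/1 flags, a value of 2 would repeat the label twice, so the trick is best suited to binary matrices.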

Or via set_index:

df1 = df.set_index('ID')
df = df1.dot(df1.columns + ', ').str[:-2].reset_index(name='new')
print (df)
   ID     new
0   1  t2, t3
1   2      t2
2   3  t1, t3
3   4      t1
4   5
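If the matrix might contain counts greater than 1, a different technique may be safer than dot: build a boolean mask and join the matching column names row by row. The sketch below is not part of the answer above; the helper name dtm_to_text and its parameters id_col, sep, and out_col are assumptions made for illustration.

```python
import pandas as pd

def dtm_to_text(dtm, id_col="ID", sep=", ", out_col="text"):
    """Join the names of all columns with a positive value, per row (hypothetical helper)."""
    terms = dtm.set_index(id_col)
    present = terms.gt(0)  # True wherever the term occurs at least once
    # For each row, pick the column labels where the mask is True and join them
    text = present.apply(lambda row: sep.join(row.index[row.to_numpy()]), axis=1)
    return text.rename(out_col).reset_index()

df = pd.DataFrame({"ID": [1, 2, 3, 4, 5],
                   "t1": [0, 0, 1, 1, 0],
                   "t2": [1, 1, 0, 0, 0],
                   "t3": [1, 0, 1, 0, 0],
                   "t4": [0, 0, 0, 0, 0]})

result = dtm_to_text(df)
print(result)
```

This is a row-wise apply, so it will likely be slower than dot on large frames; the trade-off is that each term appears once regardless of its count and the separator is not hard-coded into a slice.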

Regarding python - converting a DTM to text, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51026819/
