python - Reading from 2 different directories into a pandas DataFrame while keeping the files paired

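I have two directories. One contains images and the other contains masks. Every image in the images folder has a mask with the same file name in the masks folder. I want to create a pandas DataFrame in which one column holds the image locations and a second column holds the locations of the corresponding masks. As a first attempt, I wrote the following code: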

import os
import pandas as pd

# Generate a list of all the files and their
def generate_list(images, masks):

    images_df = pd.concat([pd.DataFrame([file],
                                        columns=['images']) for file in os.listdir(images)], ignore_index = True)
    masks_df = pd.concat([pd.DataFrame([file],
                                       columns=['masks']) for file in os.listdir(masks)], ignore_index = True)

    df = pd.concat([images_df, masks_df], axis=0, ignore_index=True)

    print(df)

    return df

However, I get this output:

       images     masks
0    47_1.bmp       NaN
1     5_1.bmp       NaN
2    26_1.bmp       NaN
3    24_1.bmp       NaN
4     7_1.bmp       NaN
5    19_1.bmp       NaN
6      19.bmp       NaN
7      18.bmp       NaN
8    45_1.bmp       NaN 
26    4_1.bmp       NaN
..        ...       ...
131       NaN    14.bmp
132       NaN  50_1.bmp
133       NaN  15_1.bmp
134       NaN  28_1.bmp
135       NaN   9_1.bmp
136       NaN    16.bmp
137       NaN  17_1.bmp
138       NaN    17.bmp
139       NaN  33_1.bmp

Apparently, os.listdir has shuffled the file lists that are passed to concat.

How should I go about this?

Best answer

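import os
import pandas as pd

def generate_list(images, masks):
    # os.listdir returns names in arbitrary order, so sort both listings to keep rows aligned.
    images_df = pd.DataFrame({'images': [os.path.join(images, f) for f in sorted(os.listdir(images))]})
    masks_df = pd.DataFrame({'masks': [os.path.join(masks, f) for f in sorted(os.listdir(masks))]})

    # axis=1 places the two frames side by side as columns instead of stacking rows.
    df = pd.concat([images_df, masks_df], axis=1)

    # Shuffle the rows; each image stays paired with its mask.
    return df.sample(frac=1)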

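This is my new answer. The axis was wrong! With axis=0, concat stacks the two frames on top of each other, so each row has a value in only one column and NaN in the other. With axis=1, the frames are placed side by side and each row holds an image together with a mask.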
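
Concatenating along axis=1 pairs rows purely by position, so it relies on both directories listing the same names in the same order. If that cannot be guaranteed, one possible alternative is to pair the files explicitly by file name with a merge. The following is a minimal sketch, not part of the original answer; the helper name pair_by_filename and the intermediate column name filename are hypothetical, and it assumes both directories contain only the files of interest.

```python
import os
import pandas as pd

def pair_by_filename(images_dir, masks_dir):
    """Hypothetical helper: pair image and mask paths that share a file name."""
    # One row per file name found in each directory.
    images_df = pd.DataFrame({"filename": sorted(os.listdir(images_dir))})
    masks_df = pd.DataFrame({"filename": sorted(os.listdir(masks_dir))})

    # Build the full path for each file.
    images_df["images"] = [os.path.join(images_dir, f) for f in images_df["filename"]]
    masks_df["masks"] = [os.path.join(masks_dir, f) for f in masks_df["filename"]]

    # An inner merge keeps only the file names present in both directories,
    # so an image without a mask (or the reverse) is dropped instead of misaligned.
    paired = images_df.merge(masks_df, on="filename", how="inner")
    return paired[["images", "masks"]]
```

Calling pair_by_filename(images, masks).sample(frac=1) would give the same shuffled result as the answer above, while rows stay matched even if the directory listings differ in order or content.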

For more on this topic, see the similar question on Stack Overflow: https://stackoverflow.com/questions/56763211/
