python - Cannot create a Dataset.from_generator() from a generator that takes pandas DataFrames as arguments

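I want to create a dataset pipeline from a generator that uses pandas DataFrames to look up image paths on disk and load the images into the pipeline. TensorFlow does not allow this and fails with the message Can't convert non-rectangular Python sequence to Tensor.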

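When passing the generator to tf.data.Dataset.from_generator, I tried using .values in the args parameter, but then I would have to rewrite all of the DataFrame-based code that finds the paths of the correct images.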

This is the command that creates the dataset:

train_dataset = tf.data.Dataset.from_generator(make_triplet_dataset, (tf.float32, tf.float32, tf.float32), args = ([train_families, train_positive_relations]))

This is the make_triplet_dataset generator (it takes pandas DataFrames as arguments):

import os
import random

import cv2

def make_triplet_dataset(families, positive_relations):
    """
    Dataset Generator that returns a random anchor, positive and negative images each time it is called
    """
    while True:
        
        # generates random triplet
        anchor, positive, negative = make_triplet(families, positive_relations)
        
        # builds the path for the randomly chosen images
        path_anchor_img = 'train/' + anchor + '/' + random.choice(os.listdir('train/' + anchor))
        path_positive_img = 'train/' + positive + '/' + random.choice(os.listdir('train/' + positive))
        path_negative_img = 'train/' + negative + '/' + random.choice(os.listdir('train/' + negative))
        
        # loads and preprocesses the images to be used in the algorithm
        anchor_img = preprocess_input(cv2.imread(path_anchor_img)) # preprocess does a (img/127.5) - 1 operation
        positive_img = preprocess_input(cv2.imread(path_positive_img))
        negative_img = preprocess_input(cv2.imread(path_negative_img))
        
        yield (anchor_img, positive_img, negative_img)

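The make_triplet function is a nested function that uses pandas DataFrames to generate the image paths. I want to build a TensorFlow dataset from a generator that yields the images of a triplet, using pandas DataFrames to find the image paths and load them into the pipeline. Any help would be appreciated.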

Best answer

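Found the answer. Instead of passing the generator function's pandas DataFrame arguments through the args parameter of tf.data.Dataset.from_generator, I used a lambda to pass them to the generator function directly: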

train_dataset = tf.data.Dataset.from_generator(lambda: make_triplet_dataset(train_families, train_positive_relations), output_types=(tf.float32, tf.float32, tf.float32))
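
For anyone who wants to try the idea without the image folders, here is a minimal sketch of the same closure approach. The DataFrames, their column names, toy_triplet_generator and the 4x4x3 zero "images" are all hypothetical stand-ins, not the asker's data or make_triplet logic. It also assumes TensorFlow 2.4 or later and uses output_signature in place of output_types, which newer TensorFlow documentation marks as deprecated. The point it tries to show is that the lambda takes no arguments, so the DataFrames never pass through args and reach the generator as ordinary pandas objects.

```python
import numpy as np
import pandas as pd
import tensorflow as tf

# Hypothetical stand-ins for train_families and train_positive_relations.
families = pd.DataFrame({"person": ["F1/P1", "F1/P2", "F2/P1"]})
relations = pd.DataFrame({"p1": ["F1/P1"], "p2": ["F1/P2"]})


def toy_triplet_generator(families, positive_relations):
    """Yield (anchor, positive, negative) arrays; the images are faked."""
    while True:
        # The arguments are still DataFrames here, so pandas methods work.
        pair = positive_relations.sample(n=1).iloc[0]
        others = families[~families["person"].isin([pair["p1"], pair["p2"]])]
        negative_name = others.sample(n=1).iloc[0]["person"]
        # A real generator would load images for pair["p1"], pair["p2"]
        # and negative_name from disk; constant arrays stand in for them.
        img = np.zeros((4, 4, 3), dtype=np.float32)
        yield img, img + 1.0, img - 1.0


# Assumed fixed image shape, only for this sketch.
spec = tf.TensorSpec(shape=(4, 4, 3), dtype=tf.float32)

dataset = tf.data.Dataset.from_generator(
    # Zero-argument callable: the DataFrames are captured by the closure.
    lambda: toy_triplet_generator(families, relations),
    output_signature=(spec, spec, spec),
)

# The generator is infinite, so take a single element to inspect it.
anchor, positive, negative = next(iter(dataset.take(1)))
```

functools.partial(toy_triplet_generator, families, relations) should bind the arguments just as well as the lambda, since from_generator only needs a callable that takes no arguments.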

A similar question about this topic can be found on Stack Overflow: https://stackoverflow.com/questions/56598250/
