numpy - Image adjustment using Conv2d

Tags: numpy opencv tensorflow machine-learning tensor

I am working on a CNN-related project using TensorFlow. I import the images (20 of them) with:

for filename in glob.glob('input_data/*.jpg'):
    input_images.append(cv2.imread(filename,0))

image_size_input = len(input_images[0])

Because they are grayscale, the images have shape (250, 250). However, conv2d needs a 4D input tensor to be fed. My input tensor looks like this:

x = tf.placeholder(tf.float32,shape=[None,image_size_output,image_size_output,1], name='x')

So I cannot convert the 2D images above into the given (4D) shape. How do I handle the None dimension? I tried this:

input_images_padded = []
for image in input_images:
    temp = np.zeros((1,image_size_output,image_size_output,1))
    for i in range(image_size_input):
        for j in range(image_size_input):
            temp[0,i,j,0] = image[i,j]
    input_images_padded.append(temp)

I get the following error:

File "/opt/intel/intelpython3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 975, in _run
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))

ValueError: Cannot feed value of shape (20, 1, 250, 250, 1) for Tensor 'x_11:0', which has shape '(?, 250, 250, 1)'
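
The shape in this error can be reproduced with NumPy alone. The sketch below is a minimal illustration, assuming 20 arrays of shape (1, 250, 250, 1) as stand-ins for the padded images; it does not read any files. Converting a list of such arrays adds a new leading axis, which gives the 5D shape from the traceback, while joining them along the existing first axis gives a 4D batch that is compatible with a placeholder of shape (?, 250, 250, 1).

```python
import numpy as np

# Hypothetical stand-ins for the 20 padded images, each already shaped (1, 250, 250, 1).
input_images_padded = [np.zeros((1, 250, 250, 1)) for _ in range(20)]

# Converting the list to one array stacks the items along a NEW leading axis.
as_fed = np.asarray(input_images_padded)
print(as_fed.shape)  # (20, 1, 250, 250, 1)

# Joining along the EXISTING first axis keeps the result 4D.
batch = np.concatenate(input_images_padded, axis=0)
print(batch.shape)  # (20, 250, 250, 1)
```

The None in the placeholder shape only means that the batch size is left open, so any array of shape (N, 250, 250, 1) can be fed.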

Full code (for reference):

import tensorflow as tf
from PIL import Image
import glob
import cv2
import os
import numpy as np
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 

input_images = []
output_images = []

for filename in glob.glob('input_data/*.jpg'):
    input_images.append(cv2.imread(filename,0))

for filename in glob.glob('output_data/*.jpg'):
    output_images.append(cv2.imread(filename,0))    

image_size_input = len(input_images[0])
image_size_output = len(output_images[0])

'''
now adding padding to the input images to convert from 125x125 to 250x250 sized images
'''
input_images_padded = []
for image in input_images:
    temp = np.zeros((1,image_size_output,image_size_output,1))
    for i in range(image_size_input):
        for j in range(image_size_input):
            temp[0,i,j,0] = image[i,j]
    input_images_padded.append(temp)

output_images_padded = []
for image in output_images:
    temp = np.zeros((1,image_size_output,image_size_output,1))
    for i in range(image_size_input):
        for j in range(image_size_input):
            temp[0,i,j,0] = image[i,j]
    output_images_padded.append(temp)



sess = tf.Session()
'''
Creating tensor for the input
'''
x = tf.placeholder(tf.float32,shape=    [None,image_size_output,image_size_output,1], name='x')
'''
Creating tensor for the output
'''
y = tf.placeholder(tf.float32,shape=    [None,image_size_output,image_size_output,1], name='y')


def create_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))

def create_biases(size):
    return tf.Variable(tf.constant(0.05, shape=[size]))

def create_convolutional_layer(input, bias_count, filter_height, filter_width, num_input_channels, num_out_channels, activation_function):  


    weights = create_weights(shape=[filter_height, filter_width, num_input_channels, num_out_channels])

    biases = create_biases(bias_count)


    layer = tf.nn.conv2d(input=input,
                  filter=weights,
                 strides=[1, 1, 1, 1],
                 padding='SAME')

    layer += biases


    layer = tf.nn.max_pool(value=layer,
                            ksize=[1, 2, 2, 1],
                            strides=[1, 1, 1, 1],
                            padding='SAME')

    if activation_function=="relu":
        layer = tf.nn.relu(layer)

    return layer


'''
Conv. Layer 1: Patch extraction
64 filters of size 1 x 9 x 9
Activation function: ReLU
Output: 64 feature maps
Parameters to optimize: 
    1 x 9 x 9 x 64 = 5184 weights and 64 biases
'''
layer1 = create_convolutional_layer(input=x,
                                bias_count=64,
                                filter_height=9,
                                filter_width=9,
                                num_input_channels=1,
                                num_out_channels=64,
                                activation_function="relu")

'''
Conv. Layer 2: Non-linear mapping
32 filters of size 64 x 1 x 1
Activation function: ReLU
Output: 32 feature maps
Parameters to optimize: 64 x 1 x 1 x 32 = 2048 weights and 32 biases
'''

layer2 = create_convolutional_layer(input=layer1,
                                bias_count=32,
                                filter_height=1,
                                filter_width=1,
                                num_input_channels=64,
                                num_out_channels=32,
                                activation_function="relu")

'''Conv. Layer 3: Reconstruction
1 filter of size 32 x 5 x 5
Activation function: Identity
Output: HR image
Parameters to optimize: 32 x 5 x 5 x 1 = 800 weights and 1 bias'''
layer3 = create_convolutional_layer(input=layer2,
                                bias_count=1,
                                filter_height=5,
                                filter_width=5,
                                num_input_channels=32,
                                num_out_channels=1,
                                activation_function="identity")

'''print(layer1.get_shape().as_list()) 
print(layer2.get_shape().as_list())
print(layer3.get_shape().as_list())'''

'''
    applying gradient descent algorithm
'''
#loss_function
loss = tf.reduce_sum(tf.square(layer3-y))
#optimiser
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)


init = tf.global_variables_initializer()
sess.run(init)
for i in range(len(input_images)):
    sess.run(train,{x: input_images_padded, y:output_images_padded})


curr_loss = sess.run([loss], {x: x_train, y: y_train})
print("loss: %s"%(curr_loss))

Best answer

I think your image_padded is wrong. I have no experience writing tf code (although I have read some). But try this:

import numpy as np

# imgs is your sequence of input images (2D arrays of equal size)
# padded is the 4D array to feed to the placeholder
cnt = len(imgs)
H, W = imgs[0].shape[:2]
padded = np.zeros((cnt, H, W, 1))
for i in range(cnt):
    padded[i, :, :, 0] = imgs[i]
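
The same idea can be extended to the zero-padding mentioned in the question's code comment (125x125 inputs placed on a 250x250 canvas). The following is only a sketch under assumptions: the helper name to_batch and the synthetic test images are hypothetical, each image is assumed to be 2D and no larger than the target size, and the image is placed in the top-left corner as in the question's loops.

```python
import numpy as np

def to_batch(imgs, out_size):
    """Zero-pad each 2D image into the top-left corner of an
    out_size x out_size canvas; returns shape (N, out_size, out_size, 1)."""
    batch = np.zeros((len(imgs), out_size, out_size, 1), dtype=np.float32)
    for i, img in enumerate(imgs):
        h, w = img.shape[:2]
        # Slice assignment replaces the per-pixel double loop.
        batch[i, :h, :w, 0] = img
    return batch

# Synthetic stand-ins for three 125x125 grayscale images.
imgs = [np.full((125, 125), 7, dtype=np.uint8) for _ in range(3)]
batch = to_batch(imgs, 250)
print(batch.shape)
```

Building one preallocated 4D array avoids the extra axis that appears when a list of (1, H, W, 1) arrays is fed directly.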

Regarding numpy - Image adjustment using Conv2d, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46989600/
