python - Converting a NumPy array to a tensor

Tags: python pandas numpy tensorflow deep-learning

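I used pandas to load a file into a DataFrame, and now I want to train a deep learning model with TensorFlow. I have not managed to train the model: after splitting the data into training and test sets, when I go to compile the model, it tells me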

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

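I think the problem is that the NumPy arrays have different sizes, but even after padding (so that all arrays within the column have the same dimension), the problem was not solved. Below is a sample of the column from my dataset: what should I do if I want to convert it to a tensor?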

df = pd.read_parquet('example.parquet')
df['column']

0                            [0, 1, 1, 1, 0, 1, 0, 1, 0]
1          [0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
2          [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1]
3                      [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1]
4                   [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0]
                         ...                        
115                          [0, 1, 0, 0, 1, 1, 1, 1, 1]
116    [0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, ...
117     [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1]
118    [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, ...
119                    [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1]

Note that I have shown the original column here, not the one I tried and failed to pad.
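One way to check whether a column like this is behind the error is to look at the dtype and shape NumPy reports for it. The sketch below is only an illustration: `rows` is a made-up stand-in for the column, and the object array is built by hand on the assumption that a pandas column holding one array per cell looks like this once converted to NumPy.

```python
import numpy as np

# Hypothetical stand-in for df['column']: each row is its own array,
# and the rows have different lengths.
rows = [np.array([0, 1, 1]), np.array([0, 1, 0, 0, 1]), np.array([0, 1])]

# Store the rows in a 1-D object array, one array per cell
# (assumed to mirror a pandas column of arrays converted to NumPy).
col = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    col[i] = row

print(col.dtype)   # object
print(col.shape)   # (3,)

# Distinct row lengths: more than one value means the rows are ragged.
row_lengths = sorted({len(row) for row in col})
print(row_lengths)  # [2, 3, 5]
```

An `object` dtype with a one-dimensional shape means the container holds references to separate arrays rather than one rectangular block of numbers, which matches the "Unsupported object type numpy.ndarray" wording in the error.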

These are the steps I use to train the model (in case they are useful)

from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
Y = label_encoder.fit_transform(Y)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 42)
#create model
model = Sequential()

#add model layers
model.add(BatchNormalization())
model.add(Dense(20, activation='softmax', input_shape=(X_train.shape)))

# compile model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50)

Update: full traceback

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_16380/3421148994.py in <module>
  1 from livelossplot import PlotLossesKeras
  2 
----> 3 model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50, callbacks=[PlotLossesKeras()])

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1132          training_utils.RespectCompiledTrainableState(self):
1133       # Creates a `tf.data.Dataset` and handles batch and epoch iteration.
-> 1134       data_handler = data_adapter.get_data_handler(
1135           x=x,
1136           y=y,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in get_data_handler(*args, **kwargs)
1381   if getattr(kwargs["model"], "_cluster_coordinator", None):
1382     return _ClusterCoordinatorDataHandler(*args, **kwargs)
-> 1383   return DataHandler(*args, **kwargs)
1384 
1385

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in __init__(self, x, y, sample_weight, batch_size, steps_per_epoch, initial_epoch, epochs, shuffle, class_weight, max_queue_size, workers, use_multiprocessing, model, steps_per_execution, distribute)
   1136 
   1137     adapter_cls = select_data_adapter(x, y)
-> 1138     self._adapter = adapter_cls(
   1139         x,
   1140         y,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in __init__(self, x, y, sample_weights, sample_weight_modes, batch_size, epochs, steps, shuffle, **kwargs)
    228                **kwargs):
    229     super(TensorLikeDataAdapter, self).__init__(x, y, **kwargs)
--> 230     x, y, sample_weights = _process_tensorlike((x, y, sample_weights))
    231     sample_weight_modes = broadcast_sample_weight_modes(
    232         sample_weights, sample_weight_modes)

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in _process_tensorlike(inputs)
   1029     return x
   1030 
-> 1031   inputs = tf.nest.map_structure(_convert_numpy_and_scipy, inputs)
   1032   return tf.__internal__.nest.list_to_tuple(inputs)
   1033
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\nest.py in map_structure(func, *structure, **kwargs)
    867 
    868   return pack_sequence_as(
--> 869       structure[0], [func(*x) for x in entries],
    870       expand_composites=expand_composites)
    871 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\nest.py in <listcomp>(.0)
    867 
    868   return pack_sequence_as(
--> 869       structure[0], [func(*x) for x in entries],
    870       expand_composites=expand_composites)
    871 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in _convert_numpy_and_scipy(x)
   1024       if issubclass(x.dtype.type, np.floating):
   1025         dtype = backend.floatx()
-> 1026       return tf.convert_to_tensor(x, dtype=dtype)
   1027     elif _is_scipy_sparse(x):
   1028       return _scipy_sparse_to_sparse_tensor(x)

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\dispatch.py in wrapper(*args, **kwargs)
    204     """Call target, and fall back on dispatchers if there is a TypeError."""
    205     try:
--> 206       return target(*args, **kwargs)
    207     except (TypeError, ValueError):
    208       # Note: convert_to_eager_tensor currently raises a ValueError, not a
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor_v2_with_dispatch(value, dtype, dtype_hint, name)
   1428     ValueError: If the `value` is a tensor not of given `dtype` in graph mode.
   1429   """
-> 1430   return convert_to_tensor_v2(
   1431       value, dtype=dtype, dtype_hint=dtype_hint, name=name)
   1432 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor_v2(value, dtype, dtype_hint, name)
   1434 def convert_to_tensor_v2(value, dtype=None, dtype_hint=None, name=None):
   1435   """Converts the given `value` to a `Tensor`."""
-> 1436   return convert_to_tensor(
   1437       value=value,
   1438       dtype=dtype,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\profiler\trace.py in wrapped(*args, **kwargs)
    161         with Trace(trace_name, **trace_kwargs):
    162           return func(*args, **kwargs)
--> 163       return func(*args, **kwargs)
    164 
    165     return wrapped

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, dtype_hint, ctx, accepted_result_types)
   1564 
   1565     if ret is None:
-> 1566       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1567 
   1568     if ret is NotImplemented:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\tensor_conversion_registry.py in _default_conversion_function(***failed resolving arguments***)
     50 def _default_conversion_function(value, dtype, name, as_ref):
     51   del as_ref  # Unused.
---> 52   return constant_op.constant(value, dtype, name=name)
     53 
     54 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in constant(value, dtype, shape, name)
    269     ValueError: if called on a symbolic tensor.
    270   """
--> 271   return _constant_impl(value, dtype, shape, name, verify_shape=False,
    272                         allow_broadcast=True)
    273 
    ~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in _constant_impl(value, dtype, shape, name, verify_shape, allow_broadcast)
    281       with trace.Trace("tf.constant"):
    282         return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
--> 283     return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    284 
    285   g = ops.get_default_graph()

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    306 def _constant_eager_impl(ctx, value, dtype, shape, verify_shape):
    307   """Creates a constant on the current device."""
--> 308   t = convert_to_eager_tensor(value, ctx, dtype)
    309   if shape is None:
    310     return t

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in convert_to_eager_tensor(value, ctx, dtype)
    104       dtype = dtypes.as_dtype(dtype).as_datatype_enum
    105   ctx.ensure_initialized()
--> 106   return ops.EagerTensor(value, ctx.device_name, dtype)
    107 
    108 

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Best answer

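I am assuming that you are currently working with the padded data, and that after padding you apply scaling. Once that is done, the shapes of X for training and test are (120,3) and (84,3) respectively.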

The first obvious mistake is in the line below

model.add(Dense(20, activation='softmax', input_shape=(X_train.shape)))

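You do not specify the batch dimension in input_shape. To put it more simply: suppose you were feeding images to the model. For a 1-channel image, what would you write in input_shape? Something like the following.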

height = 224
width = 224
model.add(Dense(20, activation='softmax', input_shape=(height, width)))

# In your case you have written
model.add(Dense(20, activation='softmax', input_shape=(120, 3)))

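This tells the model that every input of shape (120,3) has a corresponding label, which is not the case. You should pass only the feature dimension, as shown below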

model.add(Dense(20, activation='softmax', input_shape=(3,)))

After this, the error should go away. Also, I do not see you using the batch_size parameter in model.fit; you should use it.
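To see why the number of samples does not belong in input_shape, here is a small NumPy sketch of what a dense layer computes. It is only an illustration under assumed shapes (120 samples with 3 features, as stated above); the weights are random and no Keras code is involved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: 120 samples with 3 features each, as in the answer.
X_train = rng.normal(size=(120, 3))

# Per-sample shape: drop the leading (batch) axis.
per_sample_shape = X_train.shape[1:]
print(per_sample_shape)  # (3,)

# A dense layer with 20 units is, at its core, X @ W + b, where W has
# one row per input feature. The number of samples never enters W.
W = rng.normal(size=(per_sample_shape[0], 20))
b = np.zeros(20)

out_full = X_train @ W + b         # all 120 samples at once
out_batch = X_train[:32] @ W + b   # a batch of 32 samples

print(out_full.shape)   # (120, 20)
print(out_batch.shape)  # (32, 20)
```

The same weights handle any number of rows, so only the per-sample shape needs to be declared. In Keras terms this corresponds to passing `input_shape=X_train.shape[1:]` rather than `X_train.shape`.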

The second thing I see is not a syntax error but a methodological mistake in the code below.

#create model
model = Sequential()
#add model layers
model.add(BatchNormalization()) # RED FLAG
model.add(Dense(20, activation='softmax', input_shape=(X_train.shape)))

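You should not use BatchNormalization on the input. The main reason to use BatchNormalization is to speed up training of the model, and it is not needed on the input for that. Another important point is that BatchNormalization normalizes with respect to the training batch, not the entire dataset, so it is of little use unless you use a batch size large enough to be representative of the whole population.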
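The point about batch statistics can be illustrated with a few lines of NumPy. This is a simplified sketch, not the Keras layer: `batch_normalize` is a hypothetical helper that leaves out the learnable scale and shift and the moving averages used at inference, and the dataset is made up.

```python
import numpy as np

def batch_normalize(batch, eps=1e-5):
    """Normalize each feature using the mean and variance of this batch only."""
    mean = batch.mean(axis=0)
    var = batch.var(axis=0)
    return (batch - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))  # made-up dataset

small = data[:4]     # a very small batch
large = data[:512]   # a much larger batch

# The same first sample is normalized differently depending on its batch.
print(batch_normalize(small)[0])
print(batch_normalize(large)[0])

# Distance of each batch mean from the mean of the full dataset.
print(np.abs(small.mean(axis=0) - data.mean(axis=0)))
print(np.abs(large.mean(axis=0) - data.mean(axis=0)))
```

Comparing the two pairs of printed values shows how much the result depends on which samples happen to share a batch.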

Update: You are not padding correctly. After padding, the output of X.shape should be ( _ , _ ) and not ( _ , ). So do the following

import numpy as np
import pandas as pd

# Create some example data with rows of different lengths: row i holds 0..i
random_array = [list(range(i + 1)) for i in range(20)]

df = pd.DataFrame({'values': random_array})

# Left-pad every row with zeros to a common length of 21.
# Building a new list avoids chained assignment into df['values'][i].
max_len = 21
padded = [np.pad(row, (max_len - len(row), 0)) for row in df['values']]

# Stack the equal-length rows into one 2-D array
final_array = np.array(padded)
print(final_array.shape) # This will give (20, 21) and not (20,)
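The same idea can be wrapped in a small helper that pads to the longest row instead of a hard-coded length. This is a sketch under assumptions: `pad_and_stack` is a hypothetical name, the left-padding mirrors the code above, and the `float32` output dtype is a choice made here rather than something the answer specifies.

```python
import numpy as np

def pad_and_stack(rows, pad_value=0, dtype=np.float32):
    """Left-pad variable-length rows to the longest length and return a 2-D array."""
    max_len = max(len(row) for row in rows)
    out = np.full((len(rows), max_len), pad_value, dtype=dtype)
    for i, row in enumerate(rows):
        # Write each row into the right-hand end of its padded slot.
        out[i, max_len - len(row):] = row
    return out

rows = [[0, 1, 1], [0, 1, 0, 0, 1], [0, 1]]  # made-up rows
X = pad_and_stack(rows)
print(X.shape)  # (3, 5)
print(X.dtype)  # float32
```

The result is a single rectangular numeric array rather than an object array of separate arrays. For the data in the question, the analogous call would presumably be `pad_and_stack(df['column'])`.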

Regarding python - Converting a NumPy array to a tensor, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68785329/
