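I am trying to run the code below. Everything works until I try to fit the model on the training data and labels.
I keep getting the following error and cannot find the cause. Can you help me?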
UnimplementedError: Cast string to float is not supported [[node metrics/accuracy/Cast (defined at :1) ]] [Op:__inference_distributed_function_53201]
Function call stack: distributed_function
import numpy as np
import pandas as pd
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense, GRU, Embedding, CuDNNGRU, Activation
from tensorflow.python.keras.optimizers import Adam
from tensorflow.python.keras.preprocessing.text import Tokenizer
from tensorflow.python.keras.preprocessing.sequence import pad_sequences
import tensorflow as tf
datas=pd.read_csv('data.csv', sep='delimiter', engine='python')
targets=pd.read_csv('label.csv', sep='delimiter', engine='python')
data=datas['XDESCRIPTION'].values.tolist()
target=targets['YMode'].values.tolist()
cutoff=int(len(data)*0.80)
x_train,x_test=data[:cutoff],data[cutoff:]
y_train,y_test=target[:cutoff],target[cutoff:]
tokenizer=Tokenizer()
tokenizer.fit_on_texts(data)
tokenizer.fit_on_texts(target)
x_train_tokens=tokenizer.texts_to_sequences(x_train)
x_test_tokens=tokenizer.texts_to_sequences(x_test)
num_tokens=[len(tokens) for tokens in x_train_tokens +x_test_tokens]
num_tokens=np.array(num_tokens)
np.mean(num_tokens)
max_tokens=np.mean(num_tokens)+2*np.std(num_tokens)
max_tokens=int(max_tokens)
max_tokens
np.sum(num_tokens<max_tokens)/len(num_tokens)
x_train_pad=pad_sequences(x_train_tokens, maxlen=max_tokens)
x_test_pad=pad_sequences(x_test_tokens, maxlen=max_tokens)
idx=tokenizer.word_index
inverse_map=dict(zip(idx.values(),idx.keys()))
def tokens_to_string(tokens):
    words=[inverse_map[token] for token in tokens if token!=0]
    text=" ".join(words)
    return text
model=Sequential()
embedding_size=41
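# num_words is not defined in the original post; vocabulary size + 1 is assumed here (index 0 is reserved for padding)
num_words=len(tokenizer.word_index)+1
model.add(Embedding(input_dim=num_words,output_dim=embedding_size,input_length=max_tokens))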
model.add(GRU(units=16,return_sequences=True))
model.add(GRU(units=8,return_sequences=True))
model.add(GRU(units=4))
model.add(Dense(1,activation="sigmoid"))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x=np.array(x_train_pad), y=np.array(y_train),epochs=2,batch_size=256)
Best answer
Your y_train and y_test arrays are definitely arrays of strings. You can see this from these two lines:
target=targets['YMode'].values.tolist()
y_train,y_test=target[:cutoff],target[cutoff:]
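A quick way to confirm this before calling model.fit is to count the Python types in the label list. The sketch below is only illustrative: the values in `target` are made up as a stand-in for `targets['YMode'].values.tolist()`, and `summarize_label_types` is a hypothetical helper, not part of pandas or Keras.

```python
from collections import Counter

def summarize_label_types(labels):
    """Count how many labels there are of each Python type."""
    return Counter(type(lab).__name__ for lab in labels)

# Made-up labels, standing in for targets['YMode'].values.tolist()
target = ["0", "1", "1", "0"]
type_counts = summarize_label_types(target)
print(type_counts)  # Counter({'str': 4})
```

If the count shows `str`, the labels will most likely hit the same Cast failure in the accuracy metric that the error message reports.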
If the data in the CSV file is numeric, you can convert the target array to int as follows:
target = [int(lab) for lab in target]
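Note that int() raises ValueError on strings such as "1.0" and on non-numeric text, so a slightly more defensive conversion can help locate bad rows. This is a minimal sketch under assumptions: `to_int_labels` is a hypothetical helper, and the sample labels are invented rather than taken from the asker's file.

```python
def to_int_labels(labels):
    """Convert numeric-looking labels to int, reporting the first bad entry."""
    converted = []
    for position, lab in enumerate(labels):
        try:
            value = float(lab)  # accepts "1", "1.0", " 1 ", 1, 1.0
        except (TypeError, ValueError):
            raise ValueError(
                f"label {lab!r} at position {position} is not numeric"
            ) from None
        if not value.is_integer():  # also rejects nan and inf
            raise ValueError(
                f"label {lab!r} at position {position} is not a whole number"
            )
        converted.append(int(value))
    return converted

# Invented sample labels, standing in for the YMode column
target = to_int_labels(["0", "1.0", " 1 ", 0])
print(target)  # [0, 1, 1, 0]
```

If a label is not numeric at all, the raised error names the offending value and its position, which is usually a sign that the column is categorical.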
However, if your data is categorical, you can solve the problem by label-encoding it.
from sklearn.preprocessing import LabelEncoder
target=targets['YMode'].values.tolist()
label_encoder = LabelEncoder()
Y = np.array(label_encoder.fit_transform(target))
y_train,y_test=Y[:cutoff],Y[cutoff:]
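For intuition, label encoding amounts to mapping each distinct class to an integer index over the sorted classes. The following is a plain-Python sketch of that idea, not scikit-learn's implementation; `encode_labels` and the sample class names are hypothetical. It also checks the number of classes, on the assumption that the model keeps the single sigmoid output with binary_crossentropy from the question, which expects labels of 0 and 1.

```python
def encode_labels(labels):
    """Map each distinct label to an integer index over the sorted classes."""
    classes = sorted(set(labels))
    index_of = {cls: i for i, cls in enumerate(classes)}
    return [index_of[lab] for lab in labels], classes

# Invented class names, standing in for the YMode column
target = ["manual", "auto", "auto", "manual"]
encoded, classes = encode_labels(target)
print(encoded)  # [1, 0, 0, 1]
print(classes)  # ['auto', 'manual']

# Dense(1, activation="sigmoid") with binary_crossentropy assumes two classes
if len(classes) != 2:
    print(f"warning: found {len(classes)} classes, expected 2")
```

If the warning fires, the output layer and loss would probably need to change as well, for example to a softmax output with sparse_categorical_crossentropy.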
Regarding python - UnimplementedError: Cast string to float is not supported, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61465675/