python - ValueError: Buffer dtype mismatch, expected 'double' but got 'float'

Tags: python pandas nlp

def cast_vector(row):
    return np.array(list(map(lambda x: x.astype('float32'), row)))

words = pd.DataFrame(word_vectors.vocab.keys())
words.columns = ['words']
words['vectors'] = words.words.apply(lambda x: word_vectors.wv[f'{x}'])
words['vectors_typed'] = words.vectors.apply(cast_vector)
words['cluster'] = words.vectors_typed.apply(lambda x: model.predict([np.array(x)]))
#words.cluster = words.cluster.apply(lambda x: x[0])

Why do I get this error even though the vectors are float32?


Best answer

For me, changing the KMeans definition so that it is fit on the word vectors cast to double worked. The resulting code is:

import numpy as np
import pandas as pd
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

word_vectors = Word2Vec.load("../models/word2vec.model").wv

# Fit on float64 ('double') so the cluster centers share the dtype used in predict.
kmeans = KMeans(n_clusters=2, max_iter=1000, random_state=True, n_init=50).fit(X=word_vectors.vectors.astype('double'))

def cast_vector(row):
    # Convert one word vector from float32 to float64.
    return np.array(list(map(lambda x: x.astype('double'), row)))

# .vocab is the gensim < 4.0 API; gensim >= 4.0 exposes word_vectors.key_to_index instead.
words = pd.DataFrame(word_vectors.vocab.keys())
words.columns = ['words']
words['vectors'] = words.words.apply(lambda x: word_vectors[f'{x}'])
words['vectors_typed'] = words.vectors.apply(cast_vector)
words['cluster'] = words.vectors_typed.apply(lambda x: kmeans.predict([np.array(x)]))
words.cluster = words.cluster.apply(lambda x: x[0])
words['cluster_value'] = [1 if i==0 else -1 for i in words.cluster]
# Inverse of the distance to the nearest cluster center.
words['closeness_score'] = words.apply(lambda x: 1/(kmeans.transform([x.vectors_typed]).min()), axis=1)
words['sentiment_coeff'] = words.closeness_score * words.cluster_value

words.head(10)
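
The fix comes down to making the array used for fitting and the array used for prediction share the same dtype. Here is a minimal sketch of that check using NumPy only, assuming random stand-in data rather than a real Word2Vec model; `ensure_double` is a hypothetical helper name, not part of any library:

```python
import numpy as np

# Stand-in for word_vectors.vectors (assumption: float32, as gensim stores them).
vectors = np.random.default_rng(0).random((5, 4)).astype("float32")

# Casting each element to float32 again changes nothing, so the
# cast_vector from the question leaves the dtype as it was.
recast = np.array([x.astype("float32") for x in vectors[0]])
print(recast.dtype)  # float32

def ensure_double(a):
    """Hypothetical helper: return `a` as a float64 ('double') array."""
    return np.asarray(a, dtype=np.float64)

train = ensure_double(vectors)        # what would be passed to fit
query = ensure_double([vectors[0]])   # what would be passed to predict
print(train.dtype, query.dtype)  # float64 float64
```

Printing the dtype of both arrays before calling fit and predict is a quick way to see whether they disagree.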

A similar question about "python - ValueError: Buffer dtype mismatch, expected 'double' but got 'float'" can be found on Stack Overflow: https://stackoverflow.com/questions/64546786/
