I'm new to spaCy. I'm trying to find the words that are related to money amounts or dates.
import spacy
from spacy import displacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("""The client is look at 5x income. He has a loan with $5000 outstanding which can be repaid now this will free up $50 monthly. Credit card outstanding of $60 client will look to pay this off with bonus in September.""")
displacy.render(doc, style="ent", jupyter=True)
displacy.render(doc, style="dep", jupyter=True)
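Based on the dependency output (not shown here), I'm trying to find the words that each money amount or date refers to (for example, $60 -> Credit card outstanding). After reading many tutorials (including spaCy's own) and blog posts, I think I should use dependency-based rule matching. However, it seems I have to spell out the number (the money amount) in the pattern with a specific structure (for example, a pattern written for $10000). Can we create a single pattern that covers any MONEY entity?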
Also, could someone help me build the pattern for $60 and $5000? Thanks.
Best answer
You can use something like this:
import spacy

nlp = spacy.load("en_core_web_sm")
# Merge noun phrases and entities into single tokens for easier analysis
nlp.add_pipe("merge_entities")
nlp.add_pipe("merge_noun_chunks")
doc = nlp("""The client is look at 5x income. He has a loan with $5000 outstanding which can be repaid now this will free up $50 monthly. Credit card outstanding of $60 client will look to pay this off with bonus in September.""")
for token in doc:
    if token.ent_type_ == "MONEY":
        # The amount is an attribute or direct object, so look for a subject
        # (or adjectival modifier) to the left of its head
        if token.dep_ in ("attr", "dobj"):
            subj = [w for w in token.head.lefts if w.dep_ in ("nsubj", "amod")]
            if subj:
                print(subj[0], "-->", token)
        # The amount is the object of a preposition, so report the word
        # that the preposition attaches to
        elif token.dep_ == "pobj" and token.head.dep_ == "prep":
            print(token.head.head, "-->", token)
Output:
a loan --> 5000
this --> 50
Credit card --> 60
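The question also asked whether one dependency pattern can cover any amount, rather than a specific number. A minimal sketch of that idea, assuming spaCy v3's DependencyMatcher: the patterns key on the entity label (ENT_TYPE) instead of the token text, so any MONEY or DATE token qualifies. To keep the sketch independent of a trained model, the toy sentence, its heads, dependency labels and entity tags are annotated by hand here; a real parser may label the same text differently, and the pattern names ("entity", "prep", "governor", "ENTITY_LINK") are illustrative only.

```python
import spacy
from spacy.matcher import DependencyMatcher
from spacy.tokens import Doc

# A blank pipeline only supplies the vocabulary; no trained model is loaded.
nlp = spacy.blank("en")

# Hand-annotated toy sentence (assumption: not the output of a real parser).
words = ["He", "repaid", "$", "60", "in", "September", "."]
heads = [1, 1, 3, 1, 1, 4, 1]  # absolute index of each token's head
deps = ["nsubj", "ROOT", "nmod", "dobj", "prep", "pobj", "punct"]
ents = ["O", "O", "O", "B-MONEY", "O", "B-DATE", "O"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps, ents=ents)

# Match on the entity label, not on the token text, so any amount or date fits.
entity = {"ENT_TYPE": {"IN": ["MONEY", "DATE"]}}

# Case 1: the entity is a direct object or attribute of its governor.
direct = [
    {"RIGHT_ID": "entity",
     "RIGHT_ATTRS": {**entity, "DEP": {"IN": ["dobj", "attr"]}}},
    {"LEFT_ID": "entity", "REL_OP": "<", "RIGHT_ID": "governor",
     "RIGHT_ATTRS": {"IS_PUNCT": False}},
]
# Case 2: the entity is the object of a preposition attached to the governor.
via_prep = [
    {"RIGHT_ID": "entity", "RIGHT_ATTRS": {**entity, "DEP": "pobj"}},
    {"LEFT_ID": "entity", "REL_OP": "<", "RIGHT_ID": "prep",
     "RIGHT_ATTRS": {"DEP": "prep"}},
    {"LEFT_ID": "prep", "REL_OP": "<", "RIGHT_ID": "governor",
     "RIGHT_ATTRS": {"IS_PUNCT": False}},
]

matcher = DependencyMatcher(nlp.vocab)
matcher.add("ENTITY_LINK", [direct, via_prep])

# token_ids follow the order of the pattern nodes: entity first, governor last.
found = []
for _, token_ids in matcher(doc):
    ent_tok, gov_tok = doc[token_ids[0]], doc[token_ids[-1]]
    found.append((ent_tok.i, gov_tok.text, ent_tok.text, ent_tok.ent_type_))

# Sort by the entity's position in the text and drop the index.
results = [item[1:] for item in sorted(found)]
for governor, ent_text, label in results:
    print(governor, "-->", ent_text, f"({label})")
```

On real text the same patterns would be run on a Doc produced by spacy.load("en_core_web_sm"), ideally after adding merge_entities as in the answer so that a multi-token amount or date is a single node. Which dependency labels to accept is a judgment call that depends on how the parser analyses the sentences at hand.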
Regarding python - spaCy rule-based matching to identify words related to money/dates in Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66320574/