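Python: Check whether any word in a list exists in a document

I'm trying to teach myself Python. At the moment I'm taking the free Python class offered on Udacity. I'm also reading HTLPTHW.

One of the modules is a bit dated and asks you to use the URLLIB module against a website that no longer exists. What it does is return True/False depending on whether curse words are present in a given document. It opens the file, reads its contents, submits them to the URL as a search query, and then resolves to True or False based on the search result.

I was trying to think of a way around this, and I figured I could use a list of curse words to search for in the document. If a curse word from the list is also found in the opened document, it should raise an alert.

I've run into some trouble, possibly in part because I kept most of the original code structure from the tutorial, which means a lot of the code may be tailored to the URLLIB approach rather than to a keyword search.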

def read_text():
    quotes = open("/Users/Ishbar/Desktop/movie_quotes.txt")
    contents_of_file = quotes.read()
    print(contents_of_file)
    quotes.close()
    check_profanity(contents_of_file)

def check_profanity(text_to_check):
    Word_db = ["F***","S***","A**"]
    quotes = open("/Users/Ishbar/Desktop/movie_quotes.txt")
    contents_of_file = quotes.read()
    output == Word_db
    if str(Word_db) in quotes.read():
        output == 1
    if output == 1:
        print("Profanity Alert!!")
    elif output == 0:
        print("This document has no curse words.")
    else:
        print("ERROR: Could not scan the document properly.")
read_text()

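I just can't get the code to cooperate. It either always finds profanity or never finds any. I thought I could have it modify the output, with the default state of the output being "no profanity" unless something is found.

For that, do I even need an elif for profanity present/absent, if the state is always "absent" unless a word is found?

Best answer

Since you have already read the contents of the file in read_text(), you don't have to read the file again in check_profanity().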
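As a rough sketch of that idea, the file can be read once and its text handed on. The temporary file and the one-word placeholder check below are assumptions made only so the example runs on its own; they are not part of the original code.

```python
import os
import tempfile

def check_profanity(text_to_check):
    # Placeholder check for this sketch: it works only on the text it receives
    # and never reopens the file.
    return "bad" in text_to_check.split()

def read_text(path):
    # Read the file exactly once; the with block closes it automatically.
    with open(path) as quotes:
        contents_of_file = quotes.read()
    return check_profanity(contents_of_file)

# Hypothetical stand-in for movie_quotes.txt so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "movie_quotes.txt")
    with open(path, "w") as handle:
        handle.write("this file contains bad words")
    result = read_text(path)

print(result)  # True
```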

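Also, the line if str(Word_db) in quotes.read(): converts the whole list to a single string and checks whether that string exists in the file. It is equivalent to:

if "['F***', 'S***', 'A**']" in quotes.read():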
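A short sketch of why that test cannot succeed. It uses io.StringIO as a stand-in for the opened file and a harmless word list, both assumptions for the demo. It also shows a second effect in the original code: calling read() a second time on the same file object returns an empty string, because the first call already consumed the file.

```python
import io

word_db = ["bad", "verybad"]

# str() on a list gives its printed form, brackets and quotes included.
as_text = str(word_db)
print(as_text)  # ['bad', 'verybad']

# io.StringIO stands in for the opened file here (an assumption for the demo).
quotes = io.StringIO("this file contains bad words")
first_read = quotes.read()   # the whole text
second_read = quotes.read()  # empty: the read position is already at the end

# The text never contains the list's printed form, so this is False.
found = as_text in first_read
print(found)              # False
print(repr(second_read))  # ''
```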

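You need to check whether any element of the list exists in the file. This can be done with a for loop or, as in the code below, by intersecting the set of listed words with the set of words in the text.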

def check_profanity(text_to_check):
    Word_db = ["bad","verybad"]
    # Split the text on whitespace and see whether any word is also in Word_db
    if set(Word_db).intersection(set(text_to_check.split())):
        print("Profanity Alert!!")
    else:
        print("This document has no curse words.")

check_profanity("this file contains bad words")   # 1st call
check_profanity("this file contains good words")  # 2nd call

Output:

Profanity Alert!!
This document has no curse words.
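The same check written as an explicit for loop might look like the sketch below. The function name contains_profanity and the word_db parameter are illustrative choices, not from the original code. Because the function falls through to return False, "no profanity" is the default state and no separate elif or error branch is needed.

```python
def contains_profanity(text_to_check, word_db):
    """Return True as soon as one listed word appears among the text's words."""
    words_in_text = text_to_check.split()
    for word in word_db:
        if word in words_in_text:
            return True
    # Default state: nothing was found.
    return False

word_db = ["bad", "verybad"]
print(contains_profanity("this file contains bad words", word_db))   # True
print(contains_profanity("this file contains good words", word_db))  # False
```

Like the set version, this compares whole whitespace-separated words exactly, so "Bad" or "bad." would not match "bad".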

You can also do this with a regular expression.

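import re

def check_profanity(text_to_check):
    Word_db = ["bad","verybad"]
    # re.escape keeps characters such as * from being read as regex operators
    pattern = "|".join(re.escape(word) for word in Word_db)
    if re.search(pattern, text_to_check):
        print("Profanity Alert!!")
    else:
        print("This document has no curse words.")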
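A pattern built by simply joining the words also matches inside longer words, so "badge" would count as a hit for "bad". One possible refinement, sketched here under the assumption that the listed words consist of letters only, anchors the alternatives at word boundaries and ignores case. The name contains_profanity is again just illustrative.

```python
import re

def contains_profanity(text_to_check, word_db):
    # \b anchors the alternatives at word boundaries;
    # re.escape guards any special characters in the listed words.
    alternatives = "|".join(re.escape(word) for word in word_db)
    pattern = r"\b(?:" + alternatives + r")\b"
    return re.search(pattern, text_to_check, flags=re.IGNORECASE) is not None

word_db = ["bad", "verybad"]
print(contains_profanity("That was BAD.", word_db))      # True
print(contains_profanity("a badge of honour", word_db))  # False
```

Unlike the split-based checks, this tolerates punctuation next to a word and differences in capitalisation.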

For more on "Python: Check whether any word in a list exists in a document", see the similar question on Stack Overflow: https://stackoverflow.com/questions/38240963/
