python - Finding words below (and between) strings in Python

Tags: python regex

I have text like this:

<div style="margin-left:10px;margin-right:10px;">
<!-- start of lyrics -->
There are times when I've wondered<br />
And times when I've cried<br />
When my prayers they were answered<br />
At times when I've lied<br />
But if you asked me a question<br />
Would I tell you the truth<br />
Now there's something to bet on<br />
You've got nothing to lose<br />
<br />
When I've sat by the window<br />
And gazed at the rain<br />
With an ache in my heart<br />
But never feeling the pain<br />
And if you would tell me<br />
Just what my life means<br />
Walking a long road<br />
Never reaching the end<br />
<br />
God give me the answer to my life<br />
God give me the answer to my dreams<br />
God give me the answer to my prayers<br />
God give me the answer to my being
<!-- end of lyrics -->
</div>

I want to print the lyrics of this song, but re.findall and re.search do not work in this case. How can I do it? I am using this code:

lyrics = re.findall('<div style="margin-left:10px;margin-right:10px;">(.*?)</div>', open('file.html','r').read())   

for words in lyrics:
    print words

Best answer

Try this:

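import re

with open(r'<file_path>', 'r') as file:
    for line in file:
        # Skip lines that start with a tag (the div, the comments, a bare <br />)
        if re.match(r'^<', line) is None:
            # Keep the text before the first '<' (the trailing <br />)
            print(line[:line.find('<')])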

Output

There are times when I've wondered
And times when I've cried
When my prayers they were answered
At times when I've lied
But if you asked me a question
Would I tell you the truth
Now there's something to bet on
You've got nothing to lose
When I've sat by the window
And gazed at the rain
With an ache in my heart
But never feeling the pain
And if you would tell me
Just what my life means
Walking a long road
Never reaching the end
God give me the answer to my life
God give me the answer to my dreams
God give me the answer to my prayers
God give me the answer to my being
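
For what it's worth, the `re.findall` call in the question most likely returns nothing because `.` does not match newlines by default, so `(.*?)` cannot span the multi-line div. Below is a minimal sketch of a regex-only alternative. It assumes Python 3 and that the `<!-- start of lyrics -->` and `<!-- end of lyrics -->` comments are present as in the question. The `extract_lyrics` helper and the abbreviated `SAMPLE` string are hypothetical names used only for illustration.

```python
import re

# Hypothetical sample, abbreviated from the HTML in the question.
SAMPLE = """<div style="margin-left:10px;margin-right:10px;">
<!-- start of lyrics -->
There are times when I've wondered<br />
And times when I've cried<br />
<br />
God give me the answer to my being
<!-- end of lyrics -->
</div>
"""


def extract_lyrics(html):
    """Return the lyric lines found between the two marker comments."""
    # re.DOTALL lets '.' match newlines, so the lazy group can span lines.
    match = re.search(
        r"<!-- start of lyrics -->(.*?)<!-- end of lyrics -->",
        html,
        re.DOTALL,
    )
    if match is None:
        return []
    # Drop the <br /> tags, then keep only the non-empty lines.
    text = re.sub(r"<br\s*/?>", "", match.group(1))
    return [line.strip() for line in text.splitlines() if line.strip()]


for line in extract_lyrics(SAMPLE):
    print(line)
```

Unlike the line-by-line filter above, this version drops the blank lines between verses. Regular expressions remain fragile on real-world HTML, so treat it as a quick fix rather than a general parser.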

Edit: using urllib to fetch the lyrics from the web:

from io import BytesIO
from urllib.request import urlopen  # Python 3; in Python 2 this was urllib.urlopen

from lxml import etree

# Fetch the page from the URL
resultado = urlopen('http://www.azlyrics.com/lyrics/ironmaiden/noprayerforthedying.html')
html = resultado.read()
# Parse the HTML into an element tree
parser = etree.HTMLParser()
tree = etree.parse(BytesIO(html), parser)
# Apply the XPath rule: text nodes directly inside the lyrics div
e = tree.xpath("//div[@style='margin-left:10px;margin-right:10px;']/text()")
# Print each text node without surrounding whitespace
for i in e:
    print(str(i).strip())
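
If lxml is not installed, a similar result can be sketched with the standard library's `html.parser`. This is a swapped-in technique, not the lxml/XPath method above. It assumes Python 3 and works on an in-memory string rather than a live page. The `LyricsParser` class, the `LYRICS_STYLE` constant, and the abbreviated `SAMPLE` string are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser

# The style attribute value used by the lyrics div in the question.
LYRICS_STYLE = "margin-left:10px;margin-right:10px;"

# Hypothetical sample, abbreviated from the HTML in the question.
SAMPLE = """<div style="margin-left:10px;margin-right:10px;">
<!-- start of lyrics -->
Walking a long road<br />
Never reaching the end<br />
<br />
God give me the answer to my life
<!-- end of lyrics -->
</div>
<div>not lyrics</div>
"""


class LyricsParser(HTMLParser):
    """Collect text nodes found inside the div with the lyrics style."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # div nesting depth; 0 means outside the lyrics div
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            self.depth += 1          # nested div inside the lyrics div
        elif dict(attrs).get("style") == LYRICS_STYLE:
            self.depth = 1           # entering the lyrics div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.lines.append(text)


parser = LyricsParser()
parser.feed(SAMPLE)
parser.close()
for line in parser.lines:
    print(line)
```

Matching on the exact `style` string mirrors the XPath predicate in the answer, so it breaks in the same way if the site changes that attribute.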

Regarding python - Finding words below (and between) strings in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19348446/
