linux - Using the right grep options

Tags: linux shell ubuntu grep

I need help setting up the right grep command in a shell script. I'm currently using:

grep '[A-Za-z]|' words

(I'm working with a large text file that contains a great many words. It is about the size of several dictionaries.)

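The problem is that I have to filter out single-letter words (except a, i, and o). Every other word must be at least 2 characters long and must contain a vowel.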

I'm not sure how to do this. I assume you use an "or" expression. What I tried, without success, is:

grep '[A-Za-z]|[a,i,o]'

Beyond that, I don't know how to force it to match 2 characters.

Best answer

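Assume you want to select valid words (case-insensitively), where a valid word is one of:

  • the single-letter words "a", "i", "o"
  • a word of two or more letters that starts with a vowel
  • a word of two or more letters that ends with a vowel
  • a word of three or more letters with a vowel in the middle

Then you can use: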

grep -Eiwo -e '[aio]|[aeiouy][a-z]+|[a-z]+[aeiouy]|[a-z]+[aeiouy][a-z]+'

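Options:

  • -E: extended regular expressions (needed for | and +)
  • -i: case-insensitive matching
  • -w: match only at word boundaries
  • -o: output only the matched text
  • -e regexp: the regular expression to match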
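
To make the effect of these options concrete, here is a small sketch that compares a few of them on throwaway input. The input strings are made up for illustration and are not part of the original answer.

```shell
# -E: without it, grep uses basic regular expressions, where a bare |
# is an ordinary character rather than an "or".
# grep exits with status 1 when nothing matches; "|| true" keeps that
# from aborting a script that runs under "set -e".
printf 'a\nb\n' | grep -c  'a|b' || true   # counts lines containing the literal text "a|b"
printf 'a\nb\n' | grep -Ec 'a|b'           # counts lines containing a or b

# -o without -w: [aio] also matches letters inside a longer word,
# and each match is printed on its own line.
printf 'oboe\n' | grep -Eio  '[aio]'

# -o with -w: a match must be a whole word, so "oboe" yields nothing.
printf 'oboe\n' | grep -Eiwo '[aio]' || true
```

The first pair shows why the pattern in the question did not behave as an "or": it was run without -E.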

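Regular expression:

  • 4 alternatives, separated by |
  • selects the single-letter words
  • selects words that start with a vowel and have one or more letters after the leading vowel (the class [aeiouy] counts y as a vowel)
  • selects words that end with a vowel and have one or more letters before the trailing vowel
  • selects words that start with one or more letters, followed by a vowel, followed by one or more letters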
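
As a quick check of the alternatives, the following sketch runs the command on a made-up sample line. The function name select_words and the sample words are assumptions for illustration only. Because the character class is [aeiouy], y is treated as a vowel.

```shell
# Hypothetical helper wrapping the command from the answer; reads text on stdin.
select_words() {
    grep -Eiwo -e '[aio]|[aeiouy][a-z]+|[a-z]+[aeiouy]|[a-z]+[aeiouy][a-z]+'
}

# Made-up sample line: "b" is a single letter other than a, i, o,
# and "tsk" contains no vowel, so neither should be selected.
printf 'a b I rhythm by tsk oboe\n' | select_words
```

With GNU grep this should print a, I, rhythm, by and oboe, one per line.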

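These patterns are not mutually exclusive. For example, given the input word oboe, all three multi-letter patterns match it; only the single-letter pattern fails (it does have four letters, after all). However, only one pattern needs to match; the others do not change the output for that word.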
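
The overlap can be checked by trying each alternative on its own. This is only a sketch; the helper name matches_alone is hypothetical.

```shell
# Hypothetical helper: succeeds if word $1 matches the single alternative $2
# as a whole word, using the same -E, -i and -w options as the full command.
matches_alone() {
    printf '%s\n' "$1" | grep -Eiwq -e "$2"
}

# Try each of the four alternatives separately against "oboe".
for pattern in '[aio]' '[aeiouy][a-z]+' '[a-z]+[aeiouy]' '[a-z]+[aeiouy][a-z]+'; do
    if matches_alone oboe "$pattern"; then
        printf '%-22s matches oboe\n' "$pattern"
    else
        printf '%-22s does not match oboe\n' "$pattern"
    fi
done
```

Only the first pattern should report no match, because -w requires the single letter to be a whole word.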

Here is some text (supposedly from "Dracula"):

 "Then he spoke to me mockingly, 'And so you, like the others, would play
your brains against mine. You would help these men to hunt me and
frustrate me in my designs! You know now, and they know in part already,
and will know in full before long, what it is to cross my path. They
should have kept their energies for use closer to home. Whilst they
played wits against me--against me who commanded nations, and intrigued
for them, and fought for them, hundreds of years before they were
born--I was countermining them. And you, their best beloved one, are now
to me, flesh of my flesh; blood of my blood; kin of my kin; my bountiful
wine-press for a while; and shall be later on my companion and my
helper. You shall be avenged in turn; for not one of them but shall
minister to your needs. But as yet you are to be punished for what you
have done. You have aided in thwarting me; now you shall come to my
call. When my brain says "Come!" to you, you shall cross land or sea to
do my bidding; and to that end this!' With that he pulled open his
shirt, and with his long sharp nails opened a vein in his breast. When
the blood began to spurt out, he took my hands in one of his, holding
them tight, and with the other, my neck and pressed my mouth to
the wound, so that I must either suffocate or swallow some of the---- Oh
my God! my God! what have I done? What have I done to deserve such a
fate, I who have tried to walk in meekness and righteousness all my
days. God pity me! Look down on a poor soul in worse than mortal peril;
and in mercy pity those to whom she is dear!" Then she began to rub her
lips as though to cleanse them from pollution.

"Oh, no, not distressed me," she replied, "but I have been more touched
than I can say by your grief. That is a wonderful machine, but it is
cruelly true. It told me, in its very tones, the anguish of your heart.
It was like a soul crying out to Almighty God. No one must hear them
spoken ever again! See, I have tried to be useful. I have copied out the
words on my typewriter, and none other need now hear your heart beat, as
I did." 

Here is the sorted, case-insensitive list of words that the grep command above selects from this text, presented in columns:

a             again         against       aided         all           Almighty      already
And           anguish       are           as            avenged       be            beat
been          before        began         beloved       best          bidding       blood
born          bountiful     brain         brains        breast        but           by
call          can           cleanse       closer        come          commanded     companion
copied        countermining cross         cruelly       crying        days          dear
deserve       designs       did           distressed    do            done          down
either        end           energies      ever          fate          flesh         for
fought        from          frustrate     full          God           grief         hands
have          he            hear          heart         help          helper        her
his           holding       home          hundreds      hunt          I             in
intrigued     is            it            its           kept          kin           know
land          later         like          lips          long          Look          machine
me            meekness      men           mercy         mine          minister      mockingly
more          mortal        mouth         must          my            nails         nations
neck          need          needs         no            none          not           now
of            Oh            on            one           open          opened        or
other         others        out           part          path          peril         pity
play          played        pollution     poor          press         pressed       pulled
punished      replied       righteousness rub           say           says          sea
See           shall         sharp         she           shirt         should        so
some          soul          spoke         spoken        spurt         such          suffocate
swallow       than          that          the           their         them          Then
these         they          this          those         though        thwarting     tight
to            told          tones         took          touched       tried         true
turn          typewriter    use           useful        vein          very          walk
was           were          what          When          while         Whilst        who
whom          will          wine          With          wits          wonderful     words
worse         would         wound         years         yet           you           your
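
The answer does not show how the list was sorted and de-duplicated. One plausible way, offered here purely as an assumption, is to pipe the matches through sort with -f (ignore case when comparing) and -u (keep one copy of each word). The function name sorted_words and the LC_ALL=C setting are hypothetical choices made to keep the ordering predictable.

```shell
# Hypothetical helper: select words with the answer's command, then sort them
# case-insensitively and drop duplicates. Reads text on stdin.
sorted_words() {
    grep -Eiwo -e '[aio]|[aeiouy][a-z]+|[a-z]+[aeiouy]|[a-z]+[aeiouy][a-z]+' |
        LC_ALL=C sort -fu    # LC_ALL=C is an assumption, not from the answer
}

# One sentence from the sample text; "know" occurs twice and is listed once.
printf 'You know now, and they know in part already\n' | sorted_words
```

Arranging the result in columns would need a further step that the answer does not describe.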

A similar question about "linux - Using the right grep options" can be found on Stack Overflow: https://stackoverflow.com/questions/60518630/
