regex - Sorting 2 lines at a time with sed

Tags: regex sorting sed

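I want to sort a large playlist file with sed. Each playlist entry consists of 2 lines, so I need to sort on the string that follows tvg-name= in the first line (in the sample below, the first one is UK: BBC Red Button 1 (SG)), and the first line has to move together with the line after it (the one starting with http).

I have seen very similar regex examples, but the syntax is beyond me and I cannot reverse-engineer them to fit my file. Can anyone help?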

#EXTM3U
#EXTINF:-1 tvg-id="BBCRedButton.uk" tvg-name="UK: BBC Red Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: BBC Red Button 1 (SG)
http://wesite.test:8080/live/XXXXXXXX/XXXXXXXXX/112233.ts
#EXTINF:-1 tvg-id="BBC1London.uk" tvg-name="UK: BBC ONE (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: BBC ONE (HD) (720p)
http://wesite.test:8080/live/XXXXXXXX/XXXXXXXXX/36011.ts
#EXTINF:-1 tvg-id="BBC1Scotland.uk" tvg-name="UK: BBC One Scot FHD (1080p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: BBC One Scot FHD (1080p)
http://wesite.test:8080/live/XXXXXXXX/XXXXXXXXX/24651.ts
#EXTINF:-1 tvg-id="" tvg-name="Act Of Grace" tvg-logo="http://wesite.test:8080/images/skQxxCIxEWuXL4tmPcfDoFjtSZU_small.jpg" group-title="_Movies_",Act Of Grace
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/102408.mp4
#EXTINF:-1 tvg-id="" tvg-name="Act Of Valor" tvg-logo="http://wesite.test:8080/images/1Xcd5ci69pVXMPP02DU11Ffq0yY_small.jpg" group-title="_Movies_",Act Of Valor
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/96873.mp4
#EXTINF:-1 tvg-id="" tvg-name="Action Jackson" tvg-logo="http://wesite.test:8080/images/rg5WY1SiyPDuTYZs2vgTV0csVbz_small.jpg" group-title="_Movies_",Action Jackson
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/96874.mp4
#EXTINF:-1 tvg-id="" tvg-name="Acts Of Vengeance" tvg-logo="http://wesite.test:8080/images/r5o6vWPYOQs6bv91gQ8kQs2zQYl_small.jpg" group-title="_Movies_",Acts Of Vengeance
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/107101.mp4
#EXTINF:-1 tvg-id="" tvg-name="Creed" tvg-logo="http://wesite.test:8080/images/hKzhV274pkZBSpXfCjUyzbyYKLl_small.jpg" group-title="Drama",Creed
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/102773.mp4
#EXTINF:-1 tvg-id="" tvg-name="Creep Van" tvg-logo="http://wesite.test:8080/images/r2tSTcns0gHVynnLVPwCtTePfOt_small.jpg" group-title="Horror",Creep Van
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/105306.mp4
#EXTINF:-1 tvg-id="" tvg-name="Creepozoids" tvg-logo="http://wesite.test:8080/images/gZ3HBNBYe6hDTsocsXjDYGv0ZXD_small.jpg" group-title="",Creepozoids
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/106831.mp4
#EXTINF:-1 tvg-id="" tvg-name="Creepshow 2" tvg-logo="http://wesite.test:8080/images/qxJWtBb89RaSRgLuz1d6ZuFTfVG_small.jpg" group-title="Horror",Creepshow 2
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/105307.mp4
#EXTINF:-1 tvg-id="" tvg-name="Acts Of Violence" tvg-logo="http://wesite.test:8080/images/pK9CugRd3DIP0THBH8WlGrvk5vy_small.jpg" group-title="_Movies_",Acts Of Violence
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/96711.mp4
#EXTINF:-1 tvg-id="" tvg-name="Adaptation" tvg-logo="http://wesite.test:8080/images/5trb1V5f3IsjpZx2GiuUylowl3W_small.jpg" group-title="_Movies_",Adaptation
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/99509.mp4

Expected output:
#EXTM3U
#EXTINF:-1 tvg-id="BBCRedButton.uk" tvg-name="UK: BBC Red Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: BBC Red Button 1 (SG)
http://wesite.test:8080/live/XXXXXXXX/XXXXXXXXX/112233.ts
#EXTINF:-1 tvg-id="BBC1London.uk" tvg-name="UK: BBC ONE (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: BBC ONE (HD) (720p)
http://wesite.test:8080/live/XXXXXXXX/XXXXXXXXX/36011.ts
#EXTINF:-1 tvg-id="BBC1Scotland.uk" tvg-name="UK: BBC One Scot FHD (1080p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: BBC One Scot FHD (1080p)
http://wesite.test:8080/live/XXXXXXXX/XXXXXXXXX/24651.ts
#EXTINF:-1 tvg-id="" tvg-name="Act Of Grace" tvg-logo="http://wesite.test:8080/images/skQxxCIxEWuXL4tmPcfDoFjtSZU_small.jpg" group-title="_Movies_",Act Of Grace
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/102408.mp4
#EXTINF:-1 tvg-id="" tvg-name="Act Of Valor" tvg-logo="http://wesite.test:8080/images/1Xcd5ci69pVXMPP02DU11Ffq0yY_small.jpg" group-title="_Movies_",Act Of Valor
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/96873.mp4
#EXTINF:-1 tvg-id="" tvg-name="Action Jackson" tvg-logo="http://wesite.test:8080/images/rg5WY1SiyPDuTYZs2vgTV0csVbz_small.jpg" group-title="_Movies_",Action Jackson
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/96874.mp4
#EXTINF:-1 tvg-id="" tvg-name="Acts Of Vengeance" tvg-logo="http://wesite.test:8080/images/r5o6vWPYOQs6bv91gQ8kQs2zQYl_small.jpg" group-title="_Movies_",Acts Of Vengeance
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/107101.mp4
#EXTINF:-1 tvg-id="" tvg-name="Acts Of Violence" tvg-logo="http://wesite.test:8080/images/pK9CugRd3DIP0THBH8WlGrvk5vy_small.jpg" group-title="_Movies_",Acts Of Violence
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/96711.mp4
#EXTINF:-1 tvg-id="" tvg-name="Adaptation" tvg-logo="http://wesite.test:8080/images/5trb1V5f3IsjpZx2GiuUylowl3W_small.jpg" group-title="_Movies_",Adaptation
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/99509.mp4   
#EXTINF:-1 tvg-id="" tvg-name="Creed" tvg-logo="http://wesite.test:8080/images/hKzhV274pkZBSpXfCjUyzbyYKLl_small.jpg" group-title="Drama",Creed
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/102773.mp4
#EXTINF:-1 tvg-id="" tvg-name="Creep Van" tvg-logo="http://wesite.test:8080/images/r2tSTcns0gHVynnLVPwCtTePfOt_small.jpg" group-title="Horror",Creep Van
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/105306.mp4
#EXTINF:-1 tvg-id="" tvg-name="Creepozoids" tvg-logo="http://wesite.test:8080/images/gZ3HBNBYe6hDTsocsXjDYGv0ZXD_small.jpg" group-title="",Creepozoids
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/106831.mp4
#EXTINF:-1 tvg-id="" tvg-name="Creepshow 2" tvg-logo="http://wesite.test:8080/images/qxJWtBb89RaSRgLuz1d6ZuFTfVG_small.jpg" group-title="Horror",Creepshow 2
http://wesite.test:8080/movie/XXXXXXXX/XXXXXXXXX/105307.mp4

I copied and pasted a section of movies from my master file, and neither of the scripts made any change to the output. Example below:

Input:
#EXTINF:-1 tvg-id="" tvg-name="The Santa Clause" tvg-logo="http://liquidit.info:8080/images/hrZjAYAF1o37k4Qb442c4yxwVLw_small.jpg" group-title=_Movies_",The Santa Clause
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/104987.mp4
#EXTINF:-1 tvg-id="" tvg-name="The Santa Clause 2" tvg-logo="http://liquidit.info:8080/images/i7tbiDPIaa4VsQh1wWmbkY4zTRX_small.jpg" group-title=_Movies_",The Santa Clause 2
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/104988.mp4
#EXTINF:-1 tvg-id="" tvg-name="The Santa Clause 3 The Escape Clause" tvg-logo="http://liquidit.info:8080/images/kvKXyrc3cUGqXin2u76Ef8lApMI_small.jpg" group-title=_Movies_",The Santa Clause 3 The Escape Clause
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/101023.mp4
#EXTINF:-1 tvg-id="" tvg-name="The Sapphires" tvg-logo="http://liquidit.info:8080/images/h7zn7Sf0Jl6mFZjGj4TCHjSJj6T_small.jpg" group-title=_Movies_",The Sapphires
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/101024.mp4
#EXTINF:-1 tvg-id="" tvg-name="Fracture" tvg-logo="http://liquidit.info:8080/images/sl5QYze20MclzDLxLDqe3sEJdiW_small.jpg" group-title=_Movies_",Fracture
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/107279.mp4

Because the sort is alphabetical, the expected output would have the movie entry "Fracture" at the top:
#EXTINF:-1 tvg-id="" tvg-name="Fracture" tvg-logo="http://liquidit.info:8080/images/sl5QYze20MclzDLxLDqe3sEJdiW_small.jpg" group-title=_Movies_",Fracture
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/107279.mp4  
#EXTINF:-1 tvg-id="" tvg-name="The Santa Clause" tvg-logo="http://liquidit.info:8080/images/hrZjAYAF1o37k4Qb442c4yxwVLw_small.jpg" group-title=_Movies_",The Santa Clause
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/104987.mp4
#EXTINF:-1 tvg-id="" tvg-name="The Santa Clause 2" tvg-logo="http://liquidit.info:8080/images/i7tbiDPIaa4VsQh1wWmbkY4zTRX_small.jpg" group-title=_Movies_",The Santa Clause 2
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/104988.mp4
#EXTINF:-1 tvg-id="" tvg-name="The Santa Clause 3 The Escape Clause" tvg-logo="http://liquidit.info:8080/images/kvKXyrc3cUGqXin2u76Ef8lApMI_small.jpg" group-title=_Movies_",The Santa Clause 3 The Escape Clause
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/101023.mp4
#EXTINF:-1 tvg-id="" tvg-name="The Sapphires" tvg-logo="http://liquidit.info:8080/images/h7zn7Sf0Jl6mFZjGj4TCHjSJj6T_small.jpg" group-title=_Movies_",The Sapphires
http://liquidit.info:8080/movie/Wallace47B/2IhPHmO9Q8/101024.mp4

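Best answer

If you want to do this with a tool like sed, you must either A) trust that your fields do not contain anything malignant like tvg-id="http://...", or B) write a really painstaking script.

I would try something crude but effective, like this. First, combine the two lines into one: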

sed 'N;s/\n//'
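
One caveat with the join: the sample file starts with an #EXTM3U header, and N simply pairs whatever two lines come next, so the header would be glued to the first #EXTINF line and every later pair would be off by one. A minimal sketch of a join that skips such a header (join_pairs is a made-up name, and the input lines are invented for illustration):

```shell
# Hypothetical helper: join each pair of lines read from stdin,
# after dropping an #EXTM3U header if it is on line 1.
join_pairs() {
    sed -e '1{/^#EXTM3U/d;}' -e 'N;s/\n//'
}

printf '%s\n' '#EXTM3U' '#EXTINF:-1 tvg-name="A",A' 'http://example.test/a.ts' | join_pairs
# prints: #EXTINF:-1 tvg-name="A",Ahttp://example.test/a.ts
```

The header is thrown away here, so it would have to be written back to the top of the final output separately.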

Then copy the tvg-name field to the front of the line:
sed 's/\(.*tvg-name=\)\("[^"]*"\)/\2\1\2/'

Then sort:
sort
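
Note that sort compares according to the current locale, and some locales may give spaces and punctuation little weight, which can change where a title like "Creep Van" lands. If the bytewise order shown in the expected output is what is wanted, forcing the C locale is one option. A small sketch with three titles taken from the question (sort_bytewise is a made-up wrapper name):

```shell
# Hypothetical wrapper: sort stdin bytewise, independent of the user's locale.
sort_bytewise() {
    LC_ALL=C sort
}

printf '%s\n' 'Creepozoids' 'Creep Van' 'Creed' | sort_bytewise
# prints Creed, Creep Van, Creepozoids (one per line)
```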

Then remove the field I added:
sed 's/^"[^"]*"//'

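Then split the one line back into two. The greedy \(.*\) makes the cut at the last http on the line, so an http:// inside tvg-logo stays in the first part:
sed 'h;s/\(.*\)http.*/\1/;p;g;s/.*http/http/'

Putting it all together:
sed 'N;s/\n//;s/\(.*tvg-name=\)\("[^"]*"\)/\2\1\2/' filename | sort | sed 's/^"[^"]*"//;h;s/\(.*\)http.*/\1/;p;g;s/.*http/http/'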
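
The one-liner assumes the file starts directly with an #EXTINF line. For a file like the sample, which begins with #EXTM3U, something along the following lines might be a starting point. It is only a sketch of the same join / key / sort / split idea, not a tested tool: sort_playlist is a made-up name, and it assumes that no line contains a tab and that every #EXTINF line has a tvg-name. It uses a tab-delimited key without the quotes, because in the C locale a quoted key would put "The Santa Clause 2" ahead of "The Santa Clause" (a space sorts before a double quote).

```shell
#!/bin/sh
# Sketch only: sort_playlist is a hypothetical name, and the script assumes
# that no line contains a tab and that every #EXTINF line has a tvg-name.
sort_playlist() {
    tab=$(printf '\t')

    # Pass an #EXTM3U header on line 1 through untouched, if there is one.
    sed -n '1{/^#EXTM3U/p;}' "$1"

    # 1. drop the header and join each pair of lines with a tab
    # 2. prepend the tvg-name value (without quotes) as a tab-delimited key
    # 3. sort bytewise on that key only
    # 4. cut the key off again and turn the remaining tab back into a newline
    sed -e '1{/^#EXTM3U/d;}' \
        -e "N;s/\n/$tab/;s/^\(.*tvg-name=\"\([^\"]*\)\".*\)\$/\2$tab\1/" "$1" |
        LC_ALL=C sort -t "$tab" -k1,1 |
        cut -f2- |
        tr '\t' '\n'
}
```

It would be called as sort_playlist playlist.m3u > sorted.m3u, where both file names are placeholders. Note that this sorts every entry by name, so the UK: channels would end up after the movies rather than staying at the top.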

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/51590404/
