My .srt file looks like this:
1
00:00:00,000 --> 00:00:01,000
This is the first line:
and it has a secondary line,
it may have more lines

2
00:00:01,000 --> 00:00:02,000
This is the second line
it may have more lines

3
00:00:02,000 --> 00:00:03,000
This is the last line
and it has a secondary line too,
it may have more lines
I am using a Scanner, but the file is not parsed correctly with the following code:
var indexString: NSString?
scanner.scanUpToCharacters(from: CharacterSet.newlines, into: &indexString)
var startTimeString: NSString?
scanner.scanUpTo(" --> ", into: &startTimeString)
scanner.scanString("-->", into: nil)
var endTimeString: NSString?
scanner.scanUpToCharacters(from: CharacterSet.newlines, into: &endTimeString)
var textString: NSString?
scanner.scanUpTo("\n", into: &textString)
if textString != nil {
    textString = (textString?.replacingOccurrences(of: "\r\n", with: " "))! as NSString
    textString = (textString?.trimmingCharacters(in: CharacterSet.whitespaces))! as NSString
}
Best answer
Consider using a simple regular expression:
let pattern = "(?<index>^\\d+$)\\n^(?<startTime>\\d\\d:[0-5]\\d:[0-5]\\d,\\d{1,3}) --> (?<endTime>\\d\\d:[0-5]\\d:[0-5]\\d,\\d{1,3})$\\n(?<text>(?:^.+$\\n?)+)"
let regex = try NSRegularExpression(pattern: pattern, options: .anchorsMatchLines)
let matches = regex.matches(in: srt, range: NSRange(..<srt.endIndex, in: srt))
let firstTextRange = matches[0].range(withName: "text")
let firstText = Range(firstTextRange, in: srt).flatMap { range in String(srt[range]) }
I recommend caching the regular expression so that it is compiled once rather than on every parse.
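As a rough sketch of that caching advice, the regular expression can be stored in a static property and reused to collect every cue, not just the first match. The pattern is the one from the answer. The `SubtitleCue` and `SRTParser` names are hypothetical and only serve this illustration. The sketch also assumes that cues are separated by a blank line, that lines end with "\n" rather than "\r\n", and that timing lines carry no trailing whitespace.

```swift
import Foundation

/// One parsed subtitle cue. The type and its field names are illustrative assumptions.
struct SubtitleCue {
    let index: Int
    let startTime: String
    let endTime: String
    let text: String
}

enum SRTParser {
    // Compiled once on first use and reused afterwards (the cached regex).
    private static let cueRegex: NSRegularExpression = {
        let pattern = "(?<index>^\\d+$)\\n^(?<startTime>\\d\\d:[0-5]\\d:[0-5]\\d,\\d{1,3}) --> (?<endTime>\\d\\d:[0-5]\\d:[0-5]\\d,\\d{1,3})$\\n(?<text>(?:^.+$\\n?)+)"
        // The pattern is a fixed literal, so failing to compile it is a programmer error.
        return try! NSRegularExpression(pattern: pattern, options: .anchorsMatchLines)
    }()

    /// Returns every cue the pattern matches; assumes "\n" line endings.
    static func parse(_ srt: String) -> [SubtitleCue] {
        let fullRange = NSRange(srt.startIndex..<srt.endIndex, in: srt)

        return cueRegex.matches(in: srt, range: fullRange).compactMap { match -> SubtitleCue? in
            // Look up a named capture group and convert it back to a Swift String.
            func group(_ name: String) -> String? {
                Range(match.range(withName: name), in: srt).map { String(srt[$0]) }
            }
            guard let indexText = group("index"), let index = Int(indexText),
                  let start = group("startTime"),
                  let end = group("endTime"),
                  let text = group("text") else { return nil }
            // The text group keeps its trailing newline, so trim it here.
            return SubtitleCue(index: index, startTime: start, endTime: end,
                               text: text.trimmingCharacters(in: .whitespacesAndNewlines))
        }
    }
}
```

Calling `SRTParser.parse(srt)` returns one `SubtitleCue` per matched block, with multi-line subtitle text kept together in `text` and its lines still separated by "\n". Using `compactMap` rather than indexing `matches[0]` avoids a crash when the input contains no valid cue.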
Regarding "ios - How to parse a .srt file using Scanner in Swift if it contains multi-line subtitle text?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50616447/