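I'm new to Haskell and would appreciate some guidance on my problem. I want a text-encoding function that returns the list of unique words in a text, together with a list in which every word of the text is represented by its index in that list. For example, given: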
["The more I like, the more I love.","The more I love, the more I hate."]
the output could be
(["The", "more", "I", "like,", "the", "love.", "love,", "hate."],
[1, 2, 3, 4, 5, 2, 3, 6, 1, 2, 3, 7, 5, 2, 3, 8])
I have already written the part that removes duplicates:
removeDuplicates :: Eq a => [a] -> [a]
removeDuplicates = rdHelper []
  where
    rdHelper seen [] = seen
    rdHelper seen (x:xs)
      | x `elem` seen = rdHelper seen xs
      | otherwise     = rdHelper (seen ++ [x]) xs
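Best answer
You can iterate over the list of words and accumulate both the unique words and the indexes. If a word is already in the accumulated word list, append its index to the accumulated index list. If it is not, add the word to the word list and append a new index (the length of the word list + 1).
To be honest, the Haskell code is easier to understand than my description: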
import Data.List (findIndex)

-- Fold step: look the word up among the unique words seen so far
-- and record its 1-based index.
-- Note: the parameter `words` shadows Prelude's `words` inside `build` only.
build :: ([String], [Int]) -> String -> ([String], [Int])
build (words, indexes) word =
  let
    maybeIndex = findIndex (== word) words
  in
    case maybeIndex of
      Just index ->
        (words, indexes ++ [index + 1])
      Nothing ->
        (words ++ [word], indexes ++ [length words + 1])

buildIndexes :: ([String], [Int])
buildIndexes =
  let
    listOfWords = words "The more I like, the more I love. The more I love, the more I hate."
  in
    foldl build ([], []) listOfWords
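Here I use a single concatenated string as the input:
"The more I like, the more I love. The more I love, the more I hate."
Feel free to adapt the code to your needs.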
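For instance, to start from the list of sentences shown in the question rather than one concatenated string, each sentence can be split with `words` and the results concatenated. The following is only a minimal sketch of that adaptation: `encode` is a hypothetical name that does not appear in the question or the answer, and it swaps the fold for `nub` and `elemIndex` from `Data.List`.

```haskell
import Data.List (elemIndex, nub)
import Data.Maybe (mapMaybe)

-- Hypothetical helper (not from the original answer): encode a list of
-- sentences as (unique words in order of first appearance, 1-based indexes).
encode :: [String] -> ([String], [Int])
encode sentences = (uniqueWords, indexes)
  where
    -- Split every sentence on whitespace and flatten into one word list.
    allWords = concatMap words sentences
    -- nub keeps the first occurrence of each word, preserving order.
    uniqueWords = nub allWords
    -- elemIndex is 0-based, so shift by one to get 1-based indexes.
    indexes = mapMaybe (fmap (+ 1) . (`elemIndex` uniqueWords)) allWords
```

Because `words` splits on whitespace only, punctuation stays attached to the words, so "love." and "love," are treated as different words here, just as in the question's example.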
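By the way, it is probably more efficient to insert elements at the head of each list and reverse the resulting lists at the end, because `++` has to copy its entire left-hand list on every append.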
import Data.List (findIndex)

build :: ([String], [Int]) -> String -> ([String], [Int])
build (words, indexes) word =
  let
    maybeIndex = findIndex (== word) words
  in
    case maybeIndex of
      Just index ->
        -- words is stored newest-first, so convert the position found
        -- back into a 1-based index in insertion order.
        (words, (length words - index) : indexes)
      Nothing ->
        (word : words, (length words + 1) : indexes)

buildIndexes :: ([String], [Int])
buildIndexes =
  let
    listOfWords = words "The more I like, the more I love. The more I love, the more I hate."
    (listOfUniqueWords, listOfIndexes) = foldl build ([], []) listOfWords
  in
    (reverse listOfUniqueWords, reverse listOfIndexes)
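Both versions above still scan the list of unique words with `findIndex` for every input word. If that lookup becomes a bottleneck, one option is to keep the word-to-index mapping in a `Data.Map`. The sketch below is an assumption-laden variant rather than part of the original answer: it assumes the `containers` package is available, and the names `EncoderState`, `step`, and `encodeWithMap` are hypothetical.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Hypothetical state: word -> index map, unique words (newest first),
-- and indexes (newest first).
type EncoderState = (Map.Map String Int, [String], [Int])

-- Fold step: reuse the stored index for a known word, or assign the next one.
step :: EncoderState -> String -> EncoderState
step (seen, uniques, indexes) word =
  case Map.lookup word seen of
    Just i -> (seen, uniques, i : indexes)
    Nothing ->
      let i = Map.size seen + 1 -- next unused 1-based index
      in (Map.insert word i seen, word : uniques, i : indexes)

-- Hypothetical entry point: takes an already-split list of words.
encodeWithMap :: [String] -> ([String], [Int])
encodeWithMap ws =
  let (_, uniques, indexes) = foldl' step (Map.empty, [], []) ws
  in (reverse uniques, reverse indexes)
```

It could be called as `encodeWithMap (words "The more I like, the more I love. The more I love, the more I hate.")`. The map stores each word's final index directly, so no position arithmetic is needed when the accumulated lists are reversed at the end.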
Regarding the Haskell text encoder, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45452696/