python - Most common character in a string

Write a function that takes a string consisting of alphabetic characters as an input argument and returns the most common character. Ignore white space, i.e. do not count any white space as a character. Note that capitalization does not matter here, i.e. a lower case character is equal to an upper case character. In case of a tie between certain characters, return the last character that has the highest count.

Here is the updated code:

def most_common_character (input_str):
    input_str = input_str.lower()
    new_string = "".join(input_str.split())
    print(new_string)
    length = len(new_string)
    print(length)
    count = 1
    j = 0
    higher_count = 0
    return_character = ""
    for i in range(0, len(new_string)):
        character = new_string[i]
        while (length - 1):
            if (character == new_string[j + 1]):
                count += 1
            j += 1
            length -= 1    
            if (higher_count < count):
                higher_count = count
    return (character)     

#Main Program
input_str = input("Enter a string: ")
result = most_common_character(input_str)
print(result)

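The above is my code. I am getting a string index out of bound error and I don't understand why. Also, the code only checks the occurrences of the first character; I am confused about how to move on to the next character and get the maximum count.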

The error I get when I run the code:

> Your answer is NOT CORRECT Your code was tested with different inputs.
> For example when your function is called as shown below:
> 
> most_common_character ('The cosmos is infinite')
> 
> ############# Your function returns ############# e The returned variable type is: type 'str'
> 
> ######### Correct return value should be ######## i The returned variable type is: type 'str'
> 
> ####### Output of student print statements ###### thecosmosisinfinite 19

Best answer

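You can use a regular expression pattern to search for all the characters. \w matches any alphanumeric character and the underscore; for ASCII text this is equivalent to the set [a-zA-Z0-9_]. A + after [\w] matches one or more repetitions, so [\w]+ returns whole words that have to be joined back together before counting, whereas a bare \w returns one character per match.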

Finally, you use a Counter to tally them and most_common(1) to get the highest value. See below for the tie case.

from collections import Counter
import re

s = "Write a function that takes a string consisting of alphabetic characters as input argument and returns the most common character. Ignore white spaces i.e. Do not count any white space as a character. Note that capitalization does not matter here i.e. that a lower case character is equal to a upper case character. In case of a tie between certain characters return the last character that has the most count"

>>> Counter(c.lower() for c in re.findall(r"\w", s)).most_common(1)
[('t', 46)]
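
To make the difference between the two patterns concrete, here is a minimal sketch. The sample text is an illustrative assumption, not something from the question:

```python
import re
from collections import Counter

text = "The cosmos is"  # illustrative input, chosen for this sketch

# r"\w" yields one match per character, skipping the spaces
chars = re.findall(r"\w", text.lower())
# r"\w+" yields one match per run of characters, i.e. whole words
words = re.findall(r"\w+", text.lower())

print(chars)  # ['t', 'h', 'e', 'c', 'o', 's', 'm', 'o', 's', 'i', 's']
print(words)  # ['the', 'cosmos', 'is']

# Counting the single-character matches gives per-letter totals
counts = Counter(chars)
print(counts.most_common(1))  # [('s', 3)]
```

Counting the list from r"\w+" directly would tally words rather than letters, which is why the characters have to be separated first.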

In the case of a tie, it gets a bit trickier.

def top_character(some_string):
    # r"\w" yields one match per character; r"\w+" would yield whole words
    joined_characters = re.findall(r"\w", some_string.lower())
    d = Counter(joined_characters)
    highest = max(d.values())
    top_characters = [c for c, n in d.items() if n == highest]
    if len(top_characters) == 1:
        return top_characters[0]
    # On a tie, scan from the end and return the first tied character found
    for c in reversed(joined_characters):
        if c in top_characters:
            return c

>>> top_character(s)
't'

>>> top_character('the the')
'e'
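
The tie rule can also be expressed more compactly. The following is only a sketch of an alternative, not the answer's method: the name last_top_character is hypothetical, it filters with str.isalpha instead of a regular expression, and returning an empty string for input without letters is an assumption. It relies on the documented behavior that max() returns the first maximal item it encounters.

```python
from collections import Counter

def last_top_character(some_string):
    # Hypothetical helper: keep only letters, lowercased
    chars = [c for c in some_string.lower() if c.isalpha()]
    if not chars:
        return ""  # assumption: empty result when there are no letters
    counts = Counter(chars)
    # max() returns the first maximal item it sees, so scanning the
    # characters in reverse resolves ties in favor of the later one
    return max(reversed(chars), key=counts.__getitem__)

print(last_top_character('the the'))
print(last_top_character('The cosmos is infinite'))
```

With the two inputs used in this thread, this should print e and then i.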

For the code above and the sentence "The cosmos is infinite", you can see that 'i' occurs more often than 'e' (the output of your function):

>>> Counter(c.lower() for c in "".join(re.findall(r"[\w]+", 'The cosmos is infinite'))).most_common(3)
[('i', 4), ('s', 3), ('e', 2)]

You can see the problem in this block of code:

for i in range(0, len(new_string)):
    character = new_string[i]
    ...
return (character)

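You are iterating over the sentence and assigning each letter to the variable character, which is never reassigned anywhere else. The counts you compute are never used to choose what to return, so the function always returns the last character of the string.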
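
As a minimal sketch of how the loop-based approach could be repaired while keeping the question's variable names: the count is reset for every character, and the best character seen so far is stored separately from the loop variable. This is one possible fix rather than the only one, and it favors clarity over speed (it rescans the string for each character).

```python
def most_common_character(input_str):
    # Lowercase and drop all whitespace, as in the question's code
    new_string = "".join(input_str.lower().split())
    higher_count = 0
    return_character = ""
    for character in new_string:
        # Count this character afresh on every pass of the outer loop
        count = 0
        for other in new_string:
            if other == character:
                count += 1
        # ">=" lets a later character win a tie
        if count >= higher_count:
            higher_count = count
            return_character = character
    return return_character

print(most_common_character('The cosmos is infinite'))
```

For the grader's example this should print i, the expected return value.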

Regarding python - Most common character in a string, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35594767/
