python - Most common character in a string

Tags: python algorithm python-3.x

Write a function that takes a string consisting of alphabetic characters as its input argument and returns the most common character. Ignore white space, i.e. do not count any white space as a character. Capitalization does not matter here, i.e. a lower case character is equal to an upper case character. In case of a tie between certain characters, return the last character that has the highest count.

Here is my updated code:

def most_common_character (input_str):
    input_str = input_str.lower()
    new_string = "".join(input_str.split())
    print(new_string)
    length = len(new_string)
    print(length)
    count = 1
    j = 0
    higher_count = 0
    return_character = ""
    for i in range(0, len(new_string)):
        character = new_string[i]
        while (length - 1):
            if (character == new_string[j + 1]):
                count += 1
            j += 1
            length -= 1    
            if (higher_count < count):
                higher_count = count
    return (character)     

#Main Program
input_str = input("Enter a string: ")
result = most_common_character(input_str)
print(result)

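The above is my code. I am getting a "string index out of bound" error and I don't understand why. Also, the code only checks the occurrences of the first character. I am confused about how to move on to the next character and keep track of the maximum count.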

The error I get when I run the code:

> Your answer is NOT CORRECT Your code was tested with different inputs.
> For example when your function is called as shown below:
> 
> most_common_character ('The cosmos is infinite')
> 
> ############# Your function returns ############# e The returned variable type is: type 'str'
> 
> ######### Correct return value should be ######## i The returned variable type is: type 'str'
> 
> ####### Output of student print statements ###### thecosmosisinfinite 19

Best answer

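You can use a regular expression pattern to find all the characters. \w matches any alphanumeric character and the underscore; for ASCII text this is equivalent to the set [a-zA-Z0-9_]. The + after [\w] means match one or more repetitions.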

Finally, you use Counter to tally them and most_common(1) to get the top value. See below for how to handle ties.

from collections import Counter
import re

s = "Write a function that takes a string consisting of alphabetic characters as input argument and returns the most common character. Ignore white spaces i.e. Do not count any white space as a character. Note that capitalization does not matter here i.e. that a lower case character is equal to a upper case character. In case of a tie between certain characters return the last character that has the most count"

>>> Counter(c.lower() for c in re.findall(r"\w", s)).most_common(1)
[('t', 46)]
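
To make the difference between \w and \w+ concrete, here is a small sketch. The sample string and variable names are made up for illustration and are not part of the original answer.

```python
import re

sample = "The cosmos"  # illustrative input, not from the question

# \w matches one word character at a time, so every letter is its own match.
single_chars = re.findall(r"\w", sample)

# \w+ matches runs of word characters, so each match is a whole word.
whole_words = re.findall(r"\w+", sample)

# Joining the words gives the text with the whitespace removed.
letters_only = "".join(whole_words)

print(single_chars)
print(whole_words)
print(letters_only)
```

So when the pattern is \w+, the matches need to be joined (or iterated character by character) before they are counted; otherwise Counter would tally whole words rather than letters.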

Ties are a bit trickier.

def top_character(some_string):
    # Join the \w+ matches so we count single characters, not whole words.
    joined_characters = "".join(re.findall(r"\w+", some_string.lower()))
    d = Counter(joined_characters)
    highest_count = max(d.values())
    top_characters = [c for c, n in d.most_common() if n == highest_count]
    if len(top_characters) == 1:
        return top_characters[0]
    # On a tie, scan from the end and return the first tied character found.
    for c in reversed(joined_characters):
        if c in top_characters:
            return c

>>> top_character(s)
't'

>>> top_character('the the')
'e'
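
As a more compact variant of the same idea, the tie-break can be folded into a single max() call. This is only a sketch: the function name last_most_common is hypothetical, it assumes "last" means the tied character whose final occurrence comes latest in the string, and it assumes an empty input should return an empty string. Unlike the regex version, it strips only whitespace, which is sufficient if the input really is purely alphabetic as the task states.

```python
from collections import Counter


def last_most_common(text):
    """Return the most frequent non-whitespace character, ignoring case.

    Ties go to the character whose final occurrence comes latest in the
    text (an assumed reading of the task).
    """
    cleaned = "".join(text.lower().split())  # drop whitespace, ignore case
    if not cleaned:
        return ""  # assumption: empty input yields an empty string
    counts = Counter(cleaned)
    # Compare by count first, then by the position of the last occurrence.
    return max(counts, key=lambda ch: (counts[ch], cleaned.rindex(ch)))


print(last_most_common("the the"))
print(last_most_common("The cosmos is infinite"))
```

The key function returns a (count, last index) tuple, so Python's tuple ordering handles both the frequency comparison and the tie-break.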

As for your code above and the sentence "The cosmos is infinite", you can see that "i" occurs more often than "e" (the output of your function):

>>> Counter(c.lower() for c in "".join(re.findall(r"[\w]+", 'The cosmos is infinite'))).most_common(3)
[('i', 4), ('s', 3), ('e', 2)]

You can see the problem in this block of your code:

for i in range(0, len(new_string)):
    character = new_string[i]
    ...
return (character)     

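You are iterating over the sentence and assigning each letter to the variable character, which is never reassigned anywhere else. The function then returns character, so it always returns the last character in the string, regardless of the counts.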
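
If you would rather keep the loop-based approach from the question, one possible fix is sketched below. It is not the only way to do it. It reuses the question's variable names, resets the counter for every character (in the posted code, count, j and length are never reset, so the inner while loop does its work only for the first character), and stores the best candidate in return_character instead of returning the loop variable. Returning an empty string for empty input is an assumption.

```python
def most_common_character(input_str):
    """Loop-based sketch that tracks the best candidate explicitly."""
    new_string = "".join(input_str.lower().split())
    higher_count = 0
    return_character = ""  # assumption: empty input returns ""
    for character in new_string:
        # Count this character afresh; the counter is reset on every pass.
        count = 0
        for other in new_string:
            if other == character:
                count += 1
        # ">=" lets a later character replace an earlier one on a tie.
        if count >= higher_count:
            higher_count = count
            return_character = character
    return return_character


print(most_common_character("The cosmos is infinite"))
```

The nested loops make this quadratic in the length of the string, which is fine for short inputs; the Counter-based version does the counting in a single pass.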

Regarding python - Most common character in a string, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35594767/
