c++ - How does the behavior of std::tolower change across locales?

Converts the given character to lowercase according to the character conversion rules defined by the currently installed C locale.

In the default "C" locale, the following uppercase letters ABCDEFGHIJKLMNOPQRSTUVWXYZ are replaced with respective lowercase letters abcdefghijklmnopqrstuvwxyz.

How does this behavior change in different locales?

Best answer

In fact, the example on that website shows the difference:

#include <iostream>
#include <cctype>
#include <clocale>

int main()
{
    unsigned char c = '\xb4'; // the character Ž in ISO-8859-15
                              // but ´ (acute accent) in ISO-8859-1 

    std::setlocale(LC_ALL, "en_US.iso88591");
    std::cout << std::hex << std::showbase;
    std::cout << "in iso8859-1, tolower('0xb4') gives "
              << std::tolower(c) << '\n';
    std::setlocale(LC_ALL, "en_US.iso885915");
    std::cout << "in iso8859-15, tolower('0xb4') gives "
              << std::tolower(c) << '\n';
}

Output:

in iso8859-1, tolower('0xb4') gives 0xb4
in iso8859-15, tolower('0xb4') gives 0xb8

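C has no notion of encoding, so a char (and therefore a char const*) is just bytes. Switching locales switches how those bytes are interpreted. Here, the byte 0xb4 (180) lies outside the ASCII range (0-127), so its meaning depends on the locale you switch to:

In ISO-8859-1, it represents ´ (acute accent), which has no lowercase form, so tolower returns it unchanged
In ISO-8859-15, it represents Ž, so tolower converts it to ž (0xb8 in that locale)

You would think this would be irrelevant in a post-Unicode world, but many people have not yet transitioned to Unicode...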

Source: the corresponding question on Stack Overflow: https://stackoverflow.com/questions/25236733/
