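I was trying to test some code that converts between string encodings, and while trying to create an NSString containing an invalid UTF-8 sequence I ran into this behavior: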
char before = 0xa1;
NSString *s = [NSString stringWithFormat:@"%c",before];
char after = [s characterAtIndex:0]; // = 0xb0
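For most (but not all) characters in the 0x80-0xFF range, the character in the NSString differs from the one I specified.
Does anyone know why this happens?
Here are the before and after values (in hexadecimal) for every char value from 0x01 to 0xFF: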
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 5
6 -> 6
7 -> 7
8 -> 8
9 -> 9
a -> a
b -> b
c -> c
d -> d
e -> e
f -> f
10 -> 10
11 -> 11
12 -> 12
13 -> 13
14 -> 14
15 -> 15
16 -> 16
17 -> 17
18 -> 18
19 -> 19
1a -> 1a
1b -> 1b
1c -> 1c
1d -> 1d
1e -> 1e
1f -> 1f
20 -> 20
21 -> 21
22 -> 22
23 -> 23
24 -> 24
25 -> 25
26 -> 26
27 -> 27
28 -> 28
29 -> 29
2a -> 2a
2b -> 2b
2c -> 2c
2d -> 2d
2e -> 2e
2f -> 2f
30 -> 30
31 -> 31
32 -> 32
33 -> 33
34 -> 34
35 -> 35
36 -> 36
37 -> 37
38 -> 38
39 -> 39
3a -> 3a
3b -> 3b
3c -> 3c
3d -> 3d
3e -> 3e
3f -> 3f
40 -> 40
41 -> 41
42 -> 42
43 -> 43
44 -> 44
45 -> 45
46 -> 46
47 -> 47
48 -> 48
49 -> 49
4a -> 4a
4b -> 4b
4c -> 4c
4d -> 4d
4e -> 4e
4f -> 4f
50 -> 50
51 -> 51
52 -> 52
53 -> 53
54 -> 54
55 -> 55
56 -> 56
57 -> 57
58 -> 58
59 -> 59
5a -> 5a
5b -> 5b
5c -> 5c
5d -> 5d
5e -> 5e
5f -> 5f
60 -> 60
61 -> 61
62 -> 62
63 -> 63
64 -> 64
65 -> 65
66 -> 66
67 -> 67
68 -> 68
69 -> 69
6a -> 6a
6b -> 6b
6c -> 6c
6d -> 6d
6e -> 6e
6f -> 6f
70 -> 70
71 -> 71
72 -> 72
73 -> 73
74 -> 74
75 -> 75
76 -> 76
77 -> 77
78 -> 78
79 -> 79
7a -> 7a
7b -> 7b
7c -> 7c
7d -> 7d
7e -> 7e
7f -> 7f
80 -> c4 [CHANGED]
81 -> c5 [CHANGED]
82 -> c7 [CHANGED]
83 -> c9 [CHANGED]
84 -> d1 [CHANGED]
85 -> d6 [CHANGED]
86 -> dc [CHANGED]
87 -> e1 [CHANGED]
88 -> e0 [CHANGED]
89 -> e2 [CHANGED]
8a -> e4 [CHANGED]
8b -> e3 [CHANGED]
8c -> e5 [CHANGED]
8d -> e7 [CHANGED]
8e -> e9 [CHANGED]
8f -> e8 [CHANGED]
90 -> ea [CHANGED]
91 -> eb [CHANGED]
92 -> ed [CHANGED]
93 -> ec [CHANGED]
94 -> ee [CHANGED]
95 -> ef [CHANGED]
96 -> f1 [CHANGED]
97 -> f3 [CHANGED]
98 -> f2 [CHANGED]
99 -> f4 [CHANGED]
9a -> f6 [CHANGED]
9b -> f5 [CHANGED]
9c -> fa [CHANGED]
9d -> f9 [CHANGED]
9e -> fb [CHANGED]
9f -> fc [CHANGED]
a0 -> 2020 [CHANGED]
a1 -> b0 [CHANGED]
a2 -> a2
a3 -> a3
a4 -> a7 [CHANGED]
a5 -> 2022 [CHANGED]
a6 -> b6 [CHANGED]
a7 -> df [CHANGED]
a8 -> ae [CHANGED]
a9 -> a9
aa -> 2122 [CHANGED]
ab -> b4 [CHANGED]
ac -> a8 [CHANGED]
ad -> 2260 [CHANGED]
ae -> c6 [CHANGED]
af -> d8 [CHANGED]
b0 -> 221e [CHANGED]
b1 -> b1
b2 -> 2264 [CHANGED]
b3 -> 2265 [CHANGED]
b4 -> a5 [CHANGED]
b5 -> b5
b6 -> 2202 [CHANGED]
b7 -> 2211 [CHANGED]
b8 -> 220f [CHANGED]
b9 -> 3c0 [CHANGED]
ba -> 222b [CHANGED]
bb -> aa [CHANGED]
bc -> ba [CHANGED]
bd -> 3a9 [CHANGED]
be -> e6 [CHANGED]
bf -> f8 [CHANGED]
c0 -> bf [CHANGED]
c1 -> a1 [CHANGED]
c2 -> ac [CHANGED]
c3 -> 221a [CHANGED]
c4 -> 192 [CHANGED]
c5 -> 2248 [CHANGED]
c6 -> 2206 [CHANGED]
c7 -> ab [CHANGED]
c8 -> bb [CHANGED]
c9 -> 2026 [CHANGED]
ca -> a0 [CHANGED]
cb -> c0 [CHANGED]
cc -> c3 [CHANGED]
cd -> d5 [CHANGED]
ce -> 152 [CHANGED]
cf -> 153 [CHANGED]
d0 -> 2013 [CHANGED]
d1 -> 2014 [CHANGED]
d2 -> 201c [CHANGED]
d3 -> 201d [CHANGED]
d4 -> 2018 [CHANGED]
d5 -> 2019 [CHANGED]
d6 -> f7 [CHANGED]
d7 -> 25ca [CHANGED]
d8 -> ff [CHANGED]
d9 -> 178 [CHANGED]
da -> 2044 [CHANGED]
db -> 20ac [CHANGED]
dc -> 2039 [CHANGED]
dd -> 203a [CHANGED]
de -> fb01 [CHANGED]
df -> fb02 [CHANGED]
e0 -> 2021 [CHANGED]
e1 -> b7 [CHANGED]
e2 -> 201a [CHANGED]
e3 -> 201e [CHANGED]
e4 -> 2030 [CHANGED]
e5 -> c2 [CHANGED]
e6 -> ca [CHANGED]
e7 -> c1 [CHANGED]
e8 -> cb [CHANGED]
e9 -> c8 [CHANGED]
ea -> cd [CHANGED]
eb -> ce [CHANGED]
ec -> cf [CHANGED]
ed -> cc [CHANGED]
ee -> d3 [CHANGED]
ef -> d4 [CHANGED]
f0 -> f8ff [CHANGED]
f1 -> d2 [CHANGED]
f2 -> da [CHANGED]
f3 -> db [CHANGED]
f4 -> d9 [CHANGED]
f5 -> 131 [CHANGED]
f6 -> 2c6 [CHANGED]
f7 -> 2dc [CHANGED]
f8 -> af [CHANGED]
f9 -> 2d8 [CHANGED]
fa -> 2d9 [CHANGED]
fb -> 2da [CHANGED]
fc -> b8 [CHANGED]
fd -> 2dd [CHANGED]
fe -> 2db [CHANGED]
ff -> 2c7 [CHANGED]
Best answer
Try using:
unichar before = 0xa1;
NSString *s = [NSString stringWithFormat:@"%C",before];
unichar after = [s characterAtIndex:0];
NSLog(@"Read back char was %C", after);
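Technically, a plain "char" should only hold 0-127. UTF-8 uses the high bits of a byte to mark multi-byte sequences, so it is not well defined what a lone byte such as 0xFF would produce. When you call stringWithFormat, char arguments are promoted to int, so with a signed char your 0xA1 becomes 0xFFFFFFA1; the system probably sees a negative value and does who knows what with it.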
Source: a similar question on Stack Overflow about why an NSString character specified with %c gets changed (iOS): https://stackoverflow.com/questions/17304446/