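I am trying to reduce bandwidth consumption by compressing the JSON String that I send over WebSocket from my Spring Boot application to the browser client (this is on top of the permessage-deflate WebSocket extension). This scenario uses the following JSON String, which has a length of 383 characters: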
{"headers":{},"body":{"message":{"errors":{"password":"Password length must be at least 8 characters.","retype":"Retype Password cannot be null.","username":"Username length must be between 6 to 64 characters."},"links":[],"success":false,"target":{"password":"","retype":"","username":""}},"target":"/user/session/sign-up"},"statusCode":"UNPROCESSABLE_ENTITY","statusCodeValue":422}
To benchmark, I send both the uncompressed and the compressed string from the server as follows:
Object response = …,
SimpMessageHeaderAccessor simpHeaderAccessor =
        SimpMessageHeaderAccessor.create(SimpMessageType.MESSAGE);
simpHeaderAccessor.setSessionId(sessionId);
simpHeaderAccessor.setContentType(
        new MimeType("application", "json", StandardCharsets.UTF_8));
simpHeaderAccessor.setLeaveMutable(true);
// Sends the uncompressed message.
messagingTemplate.convertAndSendToUser(sessionId, uri, response,
        simpHeaderAccessor.getMessageHeaders());
ObjectMapper mapper = new ObjectMapper();
String jsonString;
try {
    jsonString = mapper.writeValueAsString(response);
} catch (JsonProcessingException e) {
    jsonString = response.toString();
}
log.info("The payload is application/json.");
log.info("uncompressed payload (" + jsonString.length() + " character):");
log.info(jsonString);
String lzStringCompressed = LZString.compress(jsonString);
simpHeaderAccessor = SimpMessageHeaderAccessor.create(SimpMessageType.MESSAGE);
simpHeaderAccessor.setSessionId(sessionId);
simpHeaderAccessor.setContentType(
        new MimeType("text", "plain", StandardCharsets.UTF_8));
simpHeaderAccessor.setLeaveMutable(true);
// Sends the compressed message.
messagingTemplate.convertAndSendToUser(sessionId, uri, lzStringCompressed,
        simpHeaderAccessor.getMessageHeaders());
log.info("The payload is text/plain.");
log.info("compressed payload (" + lzStringCompressed.length() + " character):");
log.info(lzStringCompressed);
The following lines are logged in the Java console:
The payload is application/json.
uncompressed payload (383 character):
{"headers":{},"body":{"message":{"errors":{"password":"Password length must be at least 8 characters.","retype":"Retype Password cannot be null.","username":"Username length must be between 6 to 64 characters."},"links":[],"success":false,"target":{"password":"","retype":"","username":""}},"target":"/user/session/sign-up"},"statusCode":"UNPROCESSABLE_ENTITY","statusCodeValue":422}
The payload is text/plain.
compressed payload (157 character):
??????????¼??????????????p??!-??7??????????????????????????????????u??????????????????????·}???????????????????????????????????????/?┬R??b,??????m??????????
The browser then receives both messages sent by the server, and they are captured by this JavaScript:
stompClient.connect({}, function(frame) {
    stompClient.subscribe(stompClientUri, function(payload) {
        try {
            JSON.parse(payload.body);
            console.log("The payload is application/json.");
            console.log("uncompressed payload (" + payload.body.length + " character):");
            console.log(payload.body);
            payload = JSON.parse(payload.body);
        } catch (e) {
            try {
                payload = payload.body;
                console.log("The payload is text/plain.");
                console.log("compressed payload (" + payload.length + " character):");
                console.log(payload);
                var decompressPayload = LZString.decompress(payload);
                console.log("decompressed payload (" + decompressPayload.length + " character):");
                console.log(decompressPayload);
                payload = JSON.parse(decompressPayload);
            } catch (e) {
            } finally {
            }
        } finally {
        }
    });
});
The following lines are displayed in the browser's debugging console:
The payload is application/json.
uncompressed payload (383 character):
{"headers":{},"body":{"message":{"errors":{"password":"Password length must be at least 8 characters.","retype":"Retype Password cannot be null.","username":"Username length must be between 6 to 64 characters."},"links":[],"success":false,"target":{"password":"","retype":"","username":""}},"target":"/user/session/sign-up"},"statusCode":"UNPROCESSABLE_ENTITY","statusCodeValue":422}
The payload is text/plain.
compressed payload (157 character):
ᯡࠥ䅬ࢀጨᎡ乀ஸ̘͢¬ߑ䁇啰˸⑱ᐣ䱁ሢ礒⽠݉ᐮ皆⩀p瑭漦!-䈠ᷕ7ᡑ刡⺨狤灣મ啃嵠ܸ䂃ᡈ硱䜄ቀρۯĮニᴴဠ䫯⻖֑点⇅劘畭ᣔ奢⅏㛥⡃Ⓛ撜u≂㥋╋ၲ⫋䋕᪒丨ಸ䀭䙇Ꮴ吠塬昶⬻㶶Т㚰ͻၰú}㙂沁⠈ƹ⁄᧸㦓ⴼ䶨≋愐㢡ᱼ溜涤簲╋㺮橿䃍砡瑧ᮬ敇⼺ℙ滆䠢榵ⱀ盕ີ‣Ш眨રą籯/ሤÂR儰Ȩb,帰Ћ愰䀥․䰂m㛠ளǀ䀭❖⧼㪠Ө柀䀠
decompressed payload (383 character):
{"headers":{},"body":{"message":{"errors":{"password":"Password length must be at least 8 characters.","retype":"Retype Password cannot be null.","username":"Username length must be between 6 to 64 characters."},"links":[],"success":false,"target":{"password":"","retype":"","username":""}},"target":"/user/session/sign-up"},"statusCode":"UNPROCESSABLE_ENTITY","statusCodeValue":422}
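At this point I can verify that any String value compressed by my Spring Boot application can be decompressed by the browser to recover the original String. There is a problem, though. When I inspected the browser debugger to see whether the size of the transferred messages had actually been reduced, it told me it had not. This is the raw uncompressed message (598B):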
a["MESSAGE destination:/user/session/broadcast
content-type:application/json;charset=UTF-8
subscription:sub-0
message-id:5lrv4kl1-1
content-length:383
{"headers":{},"body":{"message":{"errors":{"password":"Password length must be at least 8 characters.","retype":"Retype Password cannot be null.","username":"Username length must be between 6 to 64 characters."},"links":[],"success":false,"target":{"password":"","retype":"","username":""}},"target":"/user/session/sign-up"},"statusCode":"UNPROCESSABLE_ENTITY","statusCodeValue":422}
While this is the raw compressed message (589B):
a["MESSAGE destination:/user/session/broadcast
content-type:text/plain;charset=UTF-8
subscription:sub-0
message-id:5lrv4kl1-2
content-length:425
á¯¡à ¥ä¬à¢á¨á¡ä¹à®¸Ì͢¬ßäå°Ë¸â±á£ä±á¢ç¤â½Ýá®çâ©pç漦!-ä á·7á¡å¡âº¨ç¤ç£àª®ååµÜ¸äá¡ç¡±äáÏۯĮãá´´á䫯â»Öç¹âåçá£å¥¢âã¥â¡âæuâã¥âá²â«äáªä¸¨à²¸ääá¤å塬æ¶â¬»ã¶¶Ð¢\u2029ã°Í»á°Ãº}ã᥸æ²âƹâ᧸ã¦â´¼ä¶¨âæ㢡ᱼæºæ¶¤ç°²â㺮橿äç¡ç§á®¬æ⼺âæ»ä¢æ¦µâ±çີâ£Ð¨ç¨àª°Ä籯/á¤ÃRå°È¨b,帰Ðæ°ä¥â¤ä°mãளÇäâ⧼㪠Өæä \u0000"]
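The debugging console indicates that the uncompressed message was transferred with a total size of 598B, with 383 characters as the size of the message payload (indicated by the content-length header). The compressed message, on the other hand, was transferred with a total size of 589B, 9B smaller than the uncompressed one, with 425 characters as the size of the message payload. I have several questions:
- Is the content-length of the STOMP message indicated in bytes, or in characters?
- Why is the content-length of the uncompressed message, which is 383, smaller than that of the compressed message, which is 425?
- Does this mean that reducing the character length does not necessarily reduce the size?
- Why is the content-length of the compressed message, which is 425, not the same as the value returned in the Java console (using lzStringCompressed.length()), which is 157, considering that the uncompressed message was transferred with a content-length of 383, which is the same length as in the Java console? Both are transferred with charset=UTF-8 encoding.
- Why is the content-length of the compressed message, which is 425, not the same as the value returned in the Java console (using lzStringCompressed.length()), which is 157, while the JavaScript code payload.length returns 157, not 425?
- If the message really gets bloated during the transfer, why does the application/json message remain unaffected while only the text/plain one gets bloated?
While a difference of 9B is still a difference, I am reconsidering whether the overhead of compressing and decompressing the message is worth keeping. I have to test other String values.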
Best answer
All of the questions are closely related.
- Is the content-length of the STOMP message indicated in bytes, or in characters?
In bytes. As you can see in the STOMP specification:
"All frames MAY include a content-length header. This header is an octet count for the length of the message body...."
From the STOMP point of view, the body is an array of bytes, and the content-type and content-length headers determine what the body contains and how it should be interpreted.
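To make the octet count concrete, here is a minimal sketch that serializes a MESSAGE frame by hand. The class StompFrameSketch and its methods are hypothetical names made up for illustration; they are not part of Spring or of any STOMP library, which build frames for you. The point is only that content-length comes from the encoded body, not from String.length().

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; real STOMP libraries build frames for you.
final class StompFrameSketch {

    // The value a sender puts in content-length: octets of the UTF-8 encoded body.
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    // Serializes a MESSAGE frame: command line, headers, blank line, body, NUL octet.
    static byte[] messageFrame(String destination, String contentType, String body) {
        byte[] bodyBytes = body.getBytes(StandardCharsets.UTF_8);
        String head = "MESSAGE\n"
                + "destination:" + destination + "\n"
                + "content-type:" + contentType + "\n"
                + "content-length:" + bodyBytes.length + "\n"
                + "\n";
        byte[] headBytes = head.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(bodyBytes, 0, bodyBytes.length);
        out.write(0); // frame terminator
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String ascii = "{\"success\":false}";
        String nonAscii = "\u1BE1\u0825\u456C"; // three chars outside the ASCII range
        System.out.println(ascii.length() + " chars, content-length " + contentLength(ascii));
        System.out.println(nonAscii.length() + " chars, content-length " + contentLength(nonAscii));
    }
}
```

For an ASCII-only body the two numbers printed on a line agree; for the second sample they diverge, which is the same effect seen in the question.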
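- Why is the content-length of the uncompressed message, which is 383, smaller than that of the compressed message, which is 425?
Because of the conversion to UTF-8 that is performed when your STOMP server sends the information to the client. You have a message, a String, and that message is composed of a series of characters. Without going into much detail (please review this or this other one, great answers, if you need more information), internally every char in Java is a UTF-16 code unit. Representing those code units in a given character set, UTF-8 in your case, may require a variable number of bytes: one to three for a single char, and four for a surrogate pair of two chars.
For the uncompressed message you have 383 chars, pure ASCII, which are encoded in UTF-8 with one byte per char. That is why you get the same value in the content-length header.
That is not the case for the compressed message. When you compress the message, you get an arbitrary sequence of bytes, fewer than in the original message, packed into 157 chars (Unicode code units) carrying arbitrary information. But then you encode those chars as UTF-8. Some of the 157 chars will be represented with one byte, as in the original message, but because the content of the compressed message is arbitrary, most of them are likely to need two or three bytes. That is why you get more bytes than for the uncompressed message.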
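The numbers in the question fit this explanation: 425 bytes for 157 chars is about 2.7 bytes per char on average. The sketch below reproduces the effect on a small scale. Utf8SizeSketch is an illustrative name, and the non-ASCII sample is made-up data of the same kind as the compressed output, not the actual payload.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper, not part of the code in the question.
final class Utf8SizeSketch {

    // UTF-16 code units: the number String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Octets after UTF-8 encoding: the number content-length reports.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "Password cannot be null.";
        // Assumed sample data: two 3-byte chars, one 2-byte char, one ASCII char.
        String arbitrary = "\u1BE1\u0825\u00ACp";
        System.out.println(codeUnits(ascii) + " chars -> " + utf8Bytes(ascii) + " bytes");
        System.out.println(codeUnits(arbitrary) + " chars -> " + utf8Bytes(arbitrary) + " bytes");
    }
}
```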
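- Does this mean that reducing the character length does not necessarily reduce the size?
In general, compressing data gives you a smaller amount of data, at least when the input has enough redundancy.
If the information is large enough to make compression worthwhile, and you are able to send the compressed information as raw binary (similar to what happens when a server sends a response marked Content-Encoding: gzip or deflate), it can give you great benefits. But if the client library can only handle text messages and not binary ones, as is the case with SockJS for example, the encoding problem can actually work against you, as you have seen.
To mitigate the problem, you can first try encoding the compressed bytes in an intermediate text encoding such as Base64, which gives you roughly 4/3 (about 1.33 times) the number of compressed bytes: if that value is smaller than the number of uncompressed bytes, compressing the message may be worth it.
In any case, as the specification indicates, STOMP is text based but also allows the transmission of binary messages. It also states that the default encoding for STOMP is UTF-8, but that alternative encodings may be specified for message bodies.
If you are using stomp-js, as your code suggests (please be aware that I have not used this library), the documentation indicates that it can process binary messages as well. Basically, your server must send the raw bytes with a content-type header whose value is application/octet-stream. The library can then process this information on the client side with something similar to: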
// within message callback
if (message.headers['content-type'] === 'application/octet-stream') {
    // message is binary
    // call message.binaryBody
} else {
    // message is text
    // call message.body
}
If this works and you can send the compressed information this way, compression can give you great benefits, as mentioned above.
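Before committing to either route, it may help to measure the three candidate sizes for a given payload: the uncompressed UTF-8 bytes, the raw compressed bytes (binary frame), and the Base64 text of the compressed bytes (text frame). The sketch below does this under one loud assumption: it uses DEFLATE from java.util.zip in place of LZString, so the absolute numbers will differ from the question's. CompressionSizeSketch and the sample JSON are illustrative only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.Deflater;

// Illustrative size comparison; DEFLATE stands in for LZString here.
final class CompressionSizeSketch {

    // Compresses the input with DEFLATE and returns the raw compressed bytes.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Octets on the wire if the bytes are sent as Base64 text (Base64 is pure ASCII).
    static int base64Length(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw).length();
    }

    public static void main(String[] args) {
        // Assumed sample payload, loosely modeled on the JSON in the question.
        String json = "{\"errors\":{\"password\":\"Password length must be at least 8 characters.\","
                + "\"retype\":\"Retype Password cannot be null.\"},\"success\":false}";
        byte[] plain = json.getBytes(StandardCharsets.UTF_8);
        byte[] compressed = deflate(plain);
        System.out.println("uncompressed UTF-8 bytes: " + plain.length);
        System.out.println("compressed raw bytes:     " + compressed.length);
        System.out.println("compressed as Base64:     " + base64Length(compressed));
    }
}
```

As a rough check on the question's own numbers: if each of the 157 compressed chars carries up to 16 bits of payload (an assumption about LZString's output format), that is at most 314 bytes, and Base64 of 314 bytes is 420 characters, still more than the 383 bytes of the uncompressed JSON. For a message this small, only the binary route would be expected to help.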
- Why is the content-length of the compressed message, which is 425, not the same as the value returned in the Java console (using lzStringCompressed.length()), which is 157, considering that the uncompressed message was transferred with a content-length of 383, which is the same length as in the Java console? Both are transferred with charset=UTF-8 encoding.
Consider the Javadoc of the length method of the String class:
"Returns the length of this string. The length is equal to the number of Unicode code units in the string."
As you can see, the length method gives you the number of Unicode code units required to represent the String, while the content-length header gives you the number of bytes required to represent them in UTF-8, as mentioned above. In fact, computing the length of a string can be a tricky task.
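One way to see why it is tricky is that a single string has at least three different "lengths". The sketch below, with the illustrative name StringLengthSketch, reports UTF-16 code units (what length() returns), Unicode code points, and UTF-8 bytes for a few sample strings.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper showing three different notions of string length.
final class StringLengthSketch {

    // Returns {UTF-16 code units, Unicode code points, UTF-8 bytes} for the given string.
    static int[] measure(String s) {
        return new int[] {
            s.length(),
            s.codePointCount(0, s.length()),
            s.getBytes(StandardCharsets.UTF_8).length
        };
    }

    public static void main(String[] args) {
        // ASCII letter, Latin-1 letter, euro sign, and an emoji stored as a surrogate pair.
        String[] samples = { "a", "\u00E9", "\u20AC", "\uD83D\uDE00" };
        for (String s : samples) {
            int[] m = measure(s);
            System.out.println(m[0] + " code unit(s), " + m[1] + " code point(s), "
                    + m[2] + " UTF-8 byte(s)");
        }
    }
}
```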
- Why is the content-length of the compressed message, which is 425, not the same as the value returned in the Java console (using lzStringCompressed.length()), which is 157, while the JavaScript code payload.length returns 157, not 425?
Because, as you can see in the documentation, length in JavaScript also gives the length of the String object in UTF-16 code units:
"The length property of a String object contains the length of the string, in UTF-16 code units. length is a read-only data property of string instances."
- If the message really gets bloated during the transfer, why does the application/json message remain unaffected while only the text/plain one gets bloated?
As indicated above, it has nothing to do with the Content-Type, but with the encoding of the information.
Regarding javascript - the length of a compressed Java String not matching the content-length when sent as a WebSocket message, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/63952094/