When non-breaking space characters (U+00A0, NBSP) are sent in chat, they are converted to regular space characters. However, regular spaces are cleaned up (removed at the beginning and end, runs collapsed, etc.) before this conversion, so NBSPs can be used to bypass the cleanup.
Examples ("#" stands for an NBSP character):
a##########b##########c creates a lot of spaces between a, b, and c
#####abc creates a lot of spaces before the chat message "abc" (since sometime between 17w06a and 17w15a, Minecraft removes NBSPs at the beginning of the message)
# creates a "blank" chat message containing only a space
The cause
Decompiled via MCP 9.24 beta:
net.minecraft.network.NetHandlerPlayServer.processChatMessage(CPacketChatMessage)
/**
 * Process chat messages (broadcast back to clients) and commands (executes)
 */
public void processChatMessage(CPacketChatMessage packetIn)
{
    // ...
    else
    {
        this.playerEntity.markPlayerActive();
        String s = packetIn.getMessage();
        s = StringUtils.normalizeSpace(s); // <- here
        // ...
    }
}
net.minecraft.network.NetHandlerPlayServer.processChatMessage(CPacketChatMessage) calls org.apache.commons.lang3.StringUtils.normalizeSpace(String), which uses its WHITESPACE_PATTERN.
Mojang can fix this by first replacing all NBSPs with spaces via s.replace('\u00A0', ' ') (though it is kind of hacky), or by using their own pattern instead of Apache's WHITESPACE_PATTERN.
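The first suggestion could be sketched as follows. This is only an illustration: collapseSpaces is a hypothetical JDK-only stand-in for StringUtils.normalizeSpace (trim, then collapse runs of whitespace), and the class and method names are assumptions, not part of Minecraft's code.

```java
public class ChatSpaceFix {
    // Hypothetical stand-in for StringUtils.normalizeSpace: trims the message
    // and collapses each run of regex whitespace into a single space.
    static String collapseSpaces(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // Suggested order: convert NBSP (U+00A0) to regular spaces first, so the
    // trimming and collapsing apply to them as well.
    static String sanitize(String message) {
        return collapseSpaces(message.replace('\u00A0', ' '));
    }

    public static void main(String[] args) {
        // Brackets make leading, trailing and repeated spaces visible.
        System.out.println("[" + sanitize("a\u00A0\u00A0\u00A0b") + "]");
        System.out.println("[" + sanitize("\u00A0abc") + "]");
        System.out.println("[" + sanitize("\u00A0") + "]");
    }
}
```

With this ordering, a message made only of NBSPs ends up as an empty string, which an emptiness check performed afterwards could reject.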
Original description
Copy-pasting non-breaking space characters (or using Opt-Space on a Mac) into chat causes them to be converted into normal spaces when the chat message is sent. However, this can cause bugs, as they can be strung together to create multiple spaces in a row, something that is not possible with regular spaces. This can also be used to send "blank" chat messages (only a space).
A way to fix:
It seems to me that the game first changes all double-spaces to single space characters, then checks if the message is empty, and then converts non-breaking spaces to spaces.
To fix the issue, the game should first convert non-breaking spaces to spaces, then change double-spaces to single spaces and check if the message is empty.
Possibly not a Minecraft bug; see this comment.
Comments 15
Looks like it is not a Minecraft bug.
The problem lies in the StringUtils (Apache) class. The Pattern WHITESPACE_PATTERN is probably incorrect.
WHITESPACE_PATTERN
Pattern.compile("(?: |\\u00A0|\\s|[\\s&&[^ ]])\\s*");
The Apache method normalizeSpace(String) first trims the text using the method String.trim(), which does not trim the \u00A0 character (see Character.isWhitespace(char)).
After that, it replaces every part that matches the pattern with a space. As the pattern is not working correctly, it replaces every single \u00A0 individually instead of collapsing a run of them.
However, even if the pattern worked as documented
/**
* A regex pattern for recognizing blocks of whitespace characters.
* The apparent convolutedness of the pattern serves the purpose of
* ignoring "blocks" consisting of only a single space: the pattern
* is used only to normalize whitespace, condensing "blocks" down to a
* single space, thus matching the same would likely cause a great
* many noop replacements.
*/
it would cause a single \u00A0 to stay, whereas multiple \u00A0 would become a single space.
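The behavior described in this comment can be reproduced with a small sketch. It assumes normalizeSpace does exactly what is described above (String.trim(), then replacing every match of the quoted pattern with one space); the class below is a reconstruction for illustration, not the Apache source.

```java
import java.util.regex.Pattern;

public class NormalizeSpaceDemo {
    // Pattern as quoted in the comment above.
    static final Pattern WHITESPACE_PATTERN =
            Pattern.compile("(?: |\\u00A0|\\s|[\\s&&[^ ]])\\s*");

    // Reconstruction of the described steps: trim with String.trim(), then
    // replace every pattern match with one regular space.
    static String normalizeSpace(String s) {
        return WHITESPACE_PATTERN.matcher(s.trim()).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Brackets make the resulting spaces visible.
        System.out.println("[" + normalizeSpace("a   b") + "]");                // regular spaces
        System.out.println("[" + normalizeSpace("a\u00A0\u00A0\u00A0b") + "]"); // NBSPs
        System.out.println("[" + normalizeSpace("\u00A0") + "]");               // NBSP only
    }
}
```

A run of regular spaces collapses because the trailing \s* consumes the rest of the run. By default, \s does not match U+00A0, so each NBSP is matched on its own and becomes a separate space, and a message consisting of a single NBSP survives trim() and turns into a single space.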
Please link to this comment.
Can anyone confirm this? If so, please close this report as invalid.
Isn't this behavior the purpose of the nbsp character - that it isn't coalesced into a single space? That's how it works in HTML and in most other contexts...
@@unknown, but I think the whole point of this replacement is to prevent spam messages or messages of that kind, so I assume it should apply to nbsp characters as well. Otherwise I do not understand why this normalization is done in the first place.
I just checked, and normalizeSpace explicitly checks for NBSP and replaces it with a space, but doesn't remove duplicates (note that Character.isWhitespace for NBSP is false). The analysis refers to some regex, but that regex doesn't exist anymore.
I don't think removing duplicate spaces to prevent spam makes too much sense; it's just as easy to spam, say, _ or █, except those symbols are even wider than a space.
The behavior where an NBSP allows sending an empty message is different, and I'd say that actually is an issue (note that it also happens for other spaces, including the ogham space mark (U+1680)).
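The character-level claims in this comment can be checked with a short JDK-only sketch. The class and helper names are hypothetical; the sketch only inspects how the JDK classifies these characters and says nothing about what the game does with them afterwards.

```java
public class SpaceClassification {
    // True if a message consisting only of this character is still non-empty
    // after String.trim(), which strips only characters up to U+0020.
    static boolean survivesTrim(char c) {
        return !String.valueOf(c).trim().isEmpty();
    }

    public static void main(String[] args) {
        char[] candidates = {
            ' ',       // regular space
            '\u00A0',  // NO-BREAK SPACE
            '\u1680'   // OGHAM SPACE MARK
        };
        for (char c : candidates) {
            System.out.printf("U+%04X isWhitespace=%b isSpaceChar=%b survivesTrim=%b%n",
                    (int) c,
                    Character.isWhitespace(c),
                    Character.isSpaceChar(c),
                    survivesTrim(c));
        }
    }
}
```

Any character that survives trim() and is rendered as blank can produce a visually empty message unless it is handled explicitly.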
Confirmed for 16w14a.