Developer report
ThreadedAnvilChunkStorage
overuses synchronization, resulting in massive performance drops to the server.
Because most of the methods are synchronized, the File IO Thread writing a chunk out is done under a synchronized context, blocking the entire class.
The main thread then will hit this class to load chunks, save chunks, check if chunks exists, etc, and will be blocked while any asynchronous chunk saving is occurring.
This totally defeats the entire purpose of having asynchronous IO, and drastically impacts performance.
Please see my patch (licensed MIT, or WTFPL if that makes it even easier legally) for my improvements to the Chunk Save process that has worked fine for us for years: https://github.com/PaperMC/Paper/blob/ffd9c779233fb6e5bb428d865e87c4f3d46607e0/Spigot-Server-Patches/0062-Chunk-save-queue-improvements.patch
Comments 4
I'm not set up right now with any kind of decompile for 1.13.1, and your patch doesn't look like it's for what MCP was calling AnvilChunkLoader.java. Is this bug unrelated to MC-119971? I haven't gotten a chance to have a look at how Mojang fixed that – did they roll their own solution or did they use mine as a reference?
Thanks.
my patch is based on mc-dev mappings, not mcp.
I believe that MC-119971 is only fixed due to the fact that this file is over synchronized and blocks the getChunkAt call until done....
As soon as they fix this issue, they would reintroduce your issue.
They really need to implement my entire patch, which uses a queue to know what work to do, and then uses the map to track latest version of what to save.
I have carefully placed synchronizes to ensure that the removed entry from the map is actually the entry that just got saved, because there is another race condition in this code otherwise:
Chunk can be saved
Chunk is polled on IO thread, saving, but a new version of the Chunk is queued up at same time
IO Thread deletes entry from map, erasing the new one instead of old, making further getChunkAt calls load stale data.
My patch as linked solves all of these issues.
Issue still remains in 1.13.1 because a new synchronize was added to the FileIO thread, resulting in the exact same problem.
Please see https://github.com/PaperMC/Paper/blob/ffd9c779233fb6e5bb428d865e87c4f3d46607e0/Spigot-Server-Patches/0062-Chunk-save-queue-improvements.patch for my work on optimizing this area of the code.
Ultimately, I would love to see all of this work brought into vanilla (all of my work is licensed MIT).