Since the way structures are saved has changed for 1.6.4, the save-all is causing huge lagspikes on servers with bigger maps.
In 1.6.2, save-all didn't cause any lagspikes.
1.7.2 and the lagspikes still occurs.
Attachments
Comments 24
Even when /save-all does not cause long pauses, I often see the following exception:
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
at java.util.HashMap$KeyIterator.next(HashMap.java:928)
at dg.a(SourceFile:21)
at dg.a(SourceFile:314)
at dg.a(SourceFile:23)
at dg.a(SourceFile:314)
at dg.a(SourceFile:23)
at dr.a(SourceFile:94)
at dr.a(SourceFile:84)
at dr.a(SourceFile:23)
at ayc.a(SourceFile:93)
at ayc.a(SourceFile:75)
at mj.a(SourceFile:616)
at mj.a(SourceFile:601)
at bg.b(SourceFile:36)
at z.a(SourceFile:58)
at net.minecraft.server.MinecraftServer.g(SourceFile:674)
at nn.run(SourceFile:84)
at java.lang.Thread.run(Thread.java:722)
As well as:
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:819)
at java.util.ArrayList$Itr.next(ArrayList.java:791)
at mj.a(SourceFile:455)
at aow.a(SourceFile:225)
at aow.a(SourceFile:96)
at mi.b(SourceFile:146)
at mi.a(SourceFile:175)
at mj.a(SourceFile:604)
at bg.b(SourceFile:36)
at z.a(SourceFile:58)
at net.minecraft.server.MinecraftServer.g(SourceFile:674)
at nn.run(SourceFile:84)
at java.lang.Thread.run(Thread.java:722)
java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
at mj.a(SourceFile:455)
at aow.a(SourceFile:225)
at aow.a(SourceFile:96)
at mi.b(SourceFile:146)
at mi.a(SourceFile:175)
at mj.a(SourceFile:604)
at bg.b(SourceFile:36)
at z.a(SourceFile:58)
at net.minecraft.server.MinecraftServer.g(SourceFile:674)
at nn.run(SourceFile:84)
at java.lang.Thread.run(Thread.java:722)
The stack traces happen most often when RCON is used to issue the /save-all command. But I suspect they are hidden and happen in the background when /save-all is issued by OP from the Minecraft client.
This is related. In my case, save spikes are indeed caused by minecraft.dat. Client timeouts can occur if a) the server is already behind (Can't keep up!) and b) the save takes long enough. The combination can cause the client to think the server is no longer responding. It is my understanding that the timeout threshold is 20 seconds. So if it takes 10 seconds to do a /save-all (random), and your server is 10 seconds behind, you have a high chance of timeouts.
Still have this issue in 1.7.4. When the game saves the world, either automatically by the game, or externally by cron, there is a save spike that continues to increase over the course of time until it is great enough to cause the clients to disconnect.
I have a start script that deletes the Mineshaft.dat file. Every time it deletes the file, the file has grown to about 1MB. It only takes 30 minutes for it to grow to that size. When the file is small, /save-all is always fast. But when it grows to about 512KB to 1MB, the /save-all can take up to 40 seconds.
There is a direct correlation to save spikes and the size of Mineshaft.dat, at least on my server.
Total world size is currently 43GB. The world was restarted when 1.7.0 was released, so there are no legacy chunks from prior versions. And, of course, no mods.
Now that Mineshaft.dat has grown to 21,228,624, the server can take up to 10 seconds to save, often results in Full GC event.
To see these events, here's how I launch the server:
java -server -Xmx4G -Xms1024M -XX:PermSize=128m -XX:MaxPermSize=256m -XX:InitiatingHeapOccupancyPercent=80 -XX:+PrintGCDetails -Xloggc:server_gc.log -XX:+UseG1GC -XX:+UseFastAccessorMethods -XX:+UseStringCache -XX:+OptimizeStringConcat -XX:+UseBiasedLocking -XX:+UseNUMA -Xnoclassgc -XX:OnError=/Users/steve/Minecraft/Swim3/scripts/jvm_on_error.sh -XX:OnOutOfMemoryError=/Users/steve/Minecraft/Swim3/scripts/jvm_on_out_of_memory.sh
Here is what the GC is saying when this occurs:
400.910: [Full GC 3941M->1988M(4096M), 10.3670910 secs]
This means it took > 10 seconds to do a Full GC with only 6 minutes of uptime, 4GB memory. This is unusual. Deleting the Mineshaft.dat eliminates the Full GC events.
It's certainly an improvement, but I think this is still too long and this item should be re-opened.
Also still getting the following stack trace in the console during save (14w20b):
java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
at on.a(SourceFile:536)
at on.a(SourceFile:516)
at ayu.a(SourceFile:226)
at ayu.a(SourceFile:99)
at om.b(SourceFile:161)
at om.a(SourceFile:192)
at on.a(SourceFile:697)
at bz.b(SourceFile:36)
at z.a(SourceFile:65)
at net.minecraft.server.MinecraftServer.g(SourceFile:748)
at qt.run(SourceFile:80)
at java.lang.Thread.run(Thread.java:722)
java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
at on.a(SourceFile:536)
at on.a(SourceFile:516)
at ayu.a(SourceFile:226)
at ayu.a(SourceFile:99)
at om.b(SourceFile:161)
at om.a(SourceFile:192)
at on.a(SourceFile:697)
at bz.b(SourceFile:36)
at z.a(SourceFile:65)
at net.minecraft.server.MinecraftServer.g(SourceFile:748)
at qt.run(SourceFile:80)
at java.lang.Thread.run(Thread.java:722)
... and ...
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:819)
at java.util.ArrayList$Itr.next(ArrayList.java:791)
at on.a(SourceFile:536)
at on.a(SourceFile:516)
at ayu.a(SourceFile:226)
at ayu.a(SourceFile:99)
at om.b(SourceFile:161)
at om.a(SourceFile:192)
at on.a(SourceFile:697)
at bz.b(SourceFile:36)
at z.a(SourceFile:65)
at net.minecraft.server.MinecraftServer.g(SourceFile:748)
at qt.run(SourceFile:80)
at java.lang.Thread.run(Thread.java:722)
14w21b
[00:45:05] [RCON Client #838/ERROR]: Couldn't process command: 'save-all'
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:819) ~[?:1.7.0_07]
at java.util.ArrayList$Itr.next(ArrayList.java:791) ~[?:1.7.0_07]
at oq.a(SourceFile:704) ~[minecraft_server.jar:?]
at bz.b(SourceFile:37) ~[minecraft_server.jar:?]
at z.a(SourceFile:75) [minecraft_server.jar:?]
at z.a(SourceFile:59) [minecraft_server.jar:?]
at net.minecraft.server.MinecraftServer.g(SourceFile:760) [minecraft_server.jar:?]
at qv.run(SourceFile:80) [minecraft_server.jar:?]
at java.lang.Thread.run(Thread.java:722) [?:1.7.0_07]
[00:47:21] [RCON Listener #2/INFO]: Rcon connection from: /192.168.0.3
14w21b; Java 1.7.0_60-b19; Mac OS 10.9.2
[08:17:36] [Server thread/WARN]: Can't keep up! Did the system time change, or is the server overloaded? Running 2416ms behind, skipping 48 tick(s)
s: Saving entity NBT
at ub.e(SourceFile:1112)
at ub.d(SourceFile:1074)
at ayx.a(SourceFile:210)
at ayx.a(SourceFile:99)
at op.b(SourceFile:161)
at op.a(SourceFile:192)
at oq.a(SourceFile:700)
at net.minecraft.server.MinecraftServer.a(SourceFile:339)
at net.minecraft.server.MinecraftServer.v(SourceFile:524)
at net.minecraft.server.MinecraftServer.run(SourceFile:422)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NullPointerException
at aku.k(SourceFile:127)
at akv.a(SourceFile:128)
at acc.b(SourceFile:228)
at ub.e(SourceFile:1100)
... 10 more
s: Saving entity NBT
at ub.e(SourceFile:1112)
at ub.d(SourceFile:1074)
at ayx.a(SourceFile:210)
at ayx.a(SourceFile:99)
at op.b(SourceFile:161)
at op.a(SourceFile:192)
at oq.a(SourceFile:700)
at net.minecraft.server.MinecraftServer.a(SourceFile:339)
at net.minecraft.server.MinecraftServer.v(SourceFile:524)
at net.minecraft.server.MinecraftServer.run(SourceFile:422)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NullPointerException
at aku.k(SourceFile:127)
at akv.a(SourceFile:128)
at acc.b(SourceFile:228)
at ub.e(SourceFile:1100)
... 10 more
s: Saving entity NBT
at ub.e(SourceFile:1112)
at ub.d(SourceFile:1074)
at ayx.a(SourceFile:210)
at ayx.a(SourceFile:99)
at op.b(SourceFile:161)
at op.d(SourceFile:220)
at oq.c(SourceFile:179)
at net.minecraft.server.MinecraftServer.w(SourceFile:570)
at np.w(SourceFile:284)
at net.minecraft.server.MinecraftServer.v(SourceFile:506)
at net.minecraft.server.MinecraftServer.run(SourceFile:422)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NullPointerException
at aku.k(SourceFile:127)
at akv.a(SourceFile:128)
at acc.b(SourceFile:228)
at ub.e(SourceFile:1100)
... 11 more
[08:17:53] [Server thread/WARN]: Can't keep up! Did the system time change, or is the server overloaded? Running 2468ms behind, skipping 49 tick(s)
[08:17:56] [RCON Listener #2/INFO]: Rcon connection from: /192.168.0.3
I have seen /save-all cause 20+ second pauses, which, combined with existing pauses from a busy server, is enough to cause clients (netty) to timeout.
Observed in 1.7.2.