[MC-63590] Watchdog shuts down server

To raise the server watchdog's time limit or switch it off:

Set max-tick-time in server.properties to a value higher than the default 60000 (milliseconds, i.e. 60 seconds), or to -1 to disable the watchdog.
https://minecraft.fandom.com/wiki/Server.properties#max-tick-time
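
For illustration, here is a minimal Java sketch of how a tool could read this setting from the text of server.properties. The class and method names are made up for this example. Treating a missing key as 60000 and any non-positive value as "disabled" are assumptions drawn from this report, not confirmed server behavior.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper, not Minecraft code: interprets max-tick-time as described in this report. */
final class MaxTickTimeSketch {
    /** Default value seen in server.properties (milliseconds). */
    static final long DEFAULT_MAX_TICK_TIME_MS = 60_000L;

    /** Returns the configured limit in milliseconds; falls back to the default if the key is absent. */
    static long readMaxTickTime(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("max-tick-time");
        return raw == null ? DEFAULT_MAX_TICK_TIME_MS : Long.parseLong(raw.trim());
    }

    /** Assumption: -1 (or any non-positive value) means the watchdog is switched off. */
    static boolean watchdogEnabled(long maxTickTimeMs) {
        return maxTickTimeMs > 0;
    }

    public static void main(String[] args) {
        long limit = readMaxTickTime("max-tick-time=-1\n");
        System.out.println(watchdogEnabled(limit) ? "watchdog on: " + limit + " ms" : "watchdog off");
    }
}
```

The value is parsed as a long so that very large limits also fit.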

The server crashes randomly. Most recent crash report attached.

Description: Watching Server

java.lang.Error
	at java.util.ArrayList.indexOf(ArrayList.java:298)
	at java.util.ArrayList.contains(ArrayList.java:281)
	at java.util.ArrayList.batchRemove(ArrayList.java:700)
	at java.util.ArrayList.removeAll(ArrayList.java:671)
	at aqa.i(SourceFile:1339)
	at ql.i(SourceFile:470)
	at net.minecraft.server.MinecraftServer.y(SourceFile:605)
	at ph.y(SourceFile:303)
	at net.minecraft.server.MinecraftServer.x(SourceFile:529)
	at net.minecraft.server.MinecraftServer.run(SourceFile:445)
	at java.lang.Thread.run(Thread.java:744)
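
A note on reading this trace: the top frames are ArrayList.removeAll calling contains and indexOf, which is a linear search for every element checked. Whether that is why the tick was slow here is only a guess; the trace shows where the thread was, not how long it had been there. The sketch below (hypothetical class and method names, not server code) shows the same call shape next to a variant that looks elements up in a HashSet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

/** Illustrative only: two ways to remove many elements from an ArrayList. */
final class RemoveAllSketch {
    /** Same call shape as the trace: every element of list triggers a linear contains() on toRemove. */
    static <T> void removeAllLinearLookup(List<T> list, List<T> toRemove) {
        list.removeAll(toRemove);
    }

    /** Same result, but contains() is a hash lookup instead of a scan. */
    static <T> void removeAllHashedLookup(List<T> list, List<T> toRemove) {
        list.removeAll(new HashSet<>(toRemove));
    }

    public static void main(String[] args) {
        List<Integer> loaded = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        removeAllHashedLookup(loaded, Arrays.asList(2, 4));
        System.out.println(loaded);
    }
}
```

With n elements in the list and m elements to remove, the first variant does on the order of n*m comparisons, the second roughly n hash lookups.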

Update: This crash appears to be caused by a new “feature” called the Server Watchdog. This is a thread that kills the server when there is too much lag. Sometimes, but by no means always, this is logged accordingly:

2014-07-26 21:06:11 [Server Watchdog/FATAL]: A single server tick took 35.28 seconds (should be max 0.05)
2014-07-26 21:06:11 [Server Watchdog/FATAL]: Considering it to be crashed, server will forcibly shutdown.
2014-07-26 21:06:13 [Server Watchdog/ERROR]: This crash report has been saved to: /opt/wurstmineberg/server/./crash-reports/crash-2014-07-26_23.06.12-server.txt
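
To make the mechanism concrete, here is a rough Java sketch of the general watchdog pattern: one thread records when each tick starts, and a second thread checks how long ago that was. This is a guess at the pattern, not Mojang's implementation. All names are hypothetical, and the 30-second limit is only chosen to match the log lines above.

```java
import java.util.Locale;

/** Illustrative watchdog logic; not the server's actual code. */
final class TickWatchdogSketch {
    private final long maxTickTimeMs;       // assumption: a non-positive value disables the check
    private volatile long lastTickStartMs;  // written by the server thread, read by the watchdog thread

    TickWatchdogSketch(long maxTickTimeMs, long nowMs) {
        this.maxTickTimeMs = maxTickTimeMs;
        this.lastTickStartMs = nowMs;
    }

    /** Called by the (hypothetical) server thread whenever a new tick begins. */
    void tickStarted(long nowMs) {
        lastTickStartMs = nowMs;
    }

    /** Called periodically from a separate watchdog thread. */
    boolean isStalled(long nowMs) {
        return maxTickTimeMs > 0 && nowMs - lastTickStartMs > maxTickTimeMs;
    }

    /** Builds a message in the style of the FATAL log line quoted above. */
    String describe(long nowMs) {
        double seconds = (nowMs - lastTickStartMs) / 1000.0;
        return String.format(Locale.ROOT,
                "A single server tick took %.2f seconds (should be max %.2f)", seconds, 0.05);
    }

    public static void main(String[] args) {
        TickWatchdogSketch dog = new TickWatchdogSketch(30_000L, 0L);
        long now = 35_280L; // pretend 35.28 s have passed without a new tick starting
        if (dog.isStalled(now)) {
            System.out.println(dog.describe(now));
        }
    }
}
```

Note that a check like this cannot tell a tick that is slow but still progressing from one that is truly hung.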

Thanks @unknown for finding this.

Linked issues

is duplicated by 221

MC-7007 Enderman Name Tags are not high enough Resolved

MC-63624 Server Crash Resolved

MC-63668 Server Crash when using /fill Resolved

MC-63671 Server sometimes crashes with high randomTickSpeed Resolved

MC-63726 Watching Server java.lang.Error Resolved

relates to 1

MC-77255 Server: ConcurrentModificationException Resolved

Attachments

2014-08-05-4.log

crash-2014-07-31_11.26.44-server.txt

crash-report.txt

Comments 97

Fenhl (Max Dominik Weber) 2014-07-25T08:00:06Z

About a dozen crashes later, I am fairly certain that the crash happens when a certain part of the Nether is unloaded, ~30 seconds after using a specific portal to the Overworld. The region of the Nether in question has a MC-15019 style item dupe in it, so the same bug that causes the dupe may be responsible for this crash.

Anthony Martin 2014-07-25T17:37:59Z

Typical log entry prior to watchdog kicking in:

[20:05:23] [Server thread/WARN]: Can't keep up! Did the system time change, or is the server overloaded? Running 30556ms behind, skipping 611 tick(s)
[20:05:26] [Server thread/WARN]: Can't keep up! Did the system time change, or is the server overloaded? Running 2918ms behind, skipping 58 tick(s)
[20:05:59] [Server thread/WARN]: Can't keep up! Did the system time change, or is the server overloaded? Running 3954ms behind, skipping 79 tick(s)
[20:06:49] [Server Watchdog/FATAL]: A single server tick took 44.60 seconds (should be max 0.05)
[20:06:49] [Server Watchdog/FATAL]: Considering it to be crashed, server will forcibly shutdown.
[20:06:51] [Server Watchdog/ERROR]: This crash report has been saved to: /Users/steve/Minecraft/Swim3/./crash-reports/crash-2014-07-24_20.06.51-server.txt
[20:06:51] [Server Shutdown Thread/INFO]: Stopping server
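
The numbers in the "Can't keep up!" lines are consistent with one tick per 50 ms (the 0.05 s in the watchdog message): the skipped tick count is the backlog divided by 50, rounded down. That is inferred from the log values, not taken from the server source. A small Java check, with hypothetical names:

```java
/** Illustrative arithmetic only: relates "Running Nms behind" to "skipping K tick(s)". */
final class SkippedTicksSketch {
    /** 0.05 s per tick, the "should be max" value in the watchdog message. */
    static final long MS_PER_TICK = 50L;

    static long skippedTicks(long msBehind) {
        return msBehind / MS_PER_TICK; // integer division drops the partial tick
    }

    public static void main(String[] args) {
        for (long behind : new long[] {30556L, 2918L, 3954L}) {
            System.out.println(behind + " ms behind -> " + skippedTicks(behind) + " tick(s)");
        }
    }
}
```

For the three WARN lines above this gives 611, 58 and 79 ticks, matching the log.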

Anthony Martin 2014-07-25T17:44:06Z

The criteria Watchdog is using might be a little overzealous. Instead of arbitrarily stopping the server, it would be nice if there were a way for it to execute an external script.

JVM offers such an option, e.g.:

-XX:OnOutOfMemoryError=/Users/steve/minecraft/scripts/jvm_on_out_of_memory.sh

If we were allowed the option to execute our own script, we could kick all of the players, save, then stop the server. Often, when the server is told to just stop during an overloaded condition, it just sits there for a long time. In my experience, kick/save/stop is typically more reliable than just stop during an overloaded condition.

Anthony Martin 2014-07-26T20:18:48Z

Also, I don't think it's always appropriate for the server to shut down due to being behind. There are cases where network conditions can cause the server to fall behind. When the network conditions clear up, the server can catch up, if it weren't for the watchdog. Just considering the server crashed because of network conditions seems like an overreaction.

Fenhl (Max Dominik Weber) 2014-07-27T07:47:19Z

Okay, so what I described above is only one very specific cause of a much more general problem. Loading and unloading the Nether causes lag, and the watchdog kills the server when there's too much lag. I can see why a behavior such as this could be desirable, but in my opinion the threshold is much too low and the watchdog should be optional.

Anthony Martin 2014-07-28T00:30:37Z

In my case, the chunks near spawn tend to have a lot of wither skulls floating motionless in the sky. On my server, the area about 2,000 blocks around spawn is sometimes full of withers that attack at random, throwing their skulls everywhere. At some point (not sure if this is still the case), skulls that entered an unloaded chunk froze in place. I believe some of these chunks take longer to load as a result.

Anthony Martin 2014-07-28T06:13:00Z

Fenhl, you said:

... the crash happens when a certain part of the Nether is unloaded, ~30 seconds after using a specific portal to the Overworld. ...

Are you sure it wasn't caused by something in the Overworld itself? Because I just noticed something. There are two main locations I frequent in my world. One had villagers, one did not. Then I cured a new zombie villager at the location without villagers. Now the watchdog takes down the server when I AFK at both locations. The place without villagers had far fewer crashes, if any (there are other players who may have villagers too). It's now bad at both locations, since both have villagers.

So, do you have villagers at the place you refer to as "a specific portal to the Overworld"?

To verify this, I went to The End, where there were never any villagers. I went AFK for quite some time with no watchdog event. Then I spawned a villager in The End; after about 10 minutes, I got a watchdog crash.

I believe it might even have to do with the villager proximity to the player.

Come to think of it, the 10 minute period might be related to the day/night cycle. It might be related to doors placed before 14w30c.

I managed to capture a debug report just before the watchdog thread took the server down:

http://pastebin.com/NqpmH0HC

DaMaloma 2014-07-29T12:19:50Z

I got a lot of crashes like this when someone had created one of these: MC-45568. I'm actually grateful that the watchdog alerted me to the situation by crashing the server, but maybe the threshold is a bit low.

Jason C. McDonald 2014-07-30T16:37:32Z

Confirmed for 14w31a. We get this a lot, rather randomly. Not much pattern to it, though often it happens when an entity is spawned/summoned (even an arrow). I'm getting annoyingly fast on the server restart, but I had to do it seven times in ten minutes this morning. Oy.

That said, it is always preceded by "Can't keep up" messages, as with the others. For whatever reason, the crashes seem to slow down the more people are on my server. Eh??

Anthony Martin 2014-07-30T16:47:36Z

I confirm it's still happening in 14w31a:

[09:43:57] [Server Watchdog/FATAL]: A single server tick took 60.00 seconds (should be max 0.05)
[09:43:57] [Server Watchdog/FATAL]: Considering it to be crashed, server will forcibly shutdown.
[09:43:59] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[09:43:59] [Server Watchdog/ERROR]: This crash report has been saved to: /Users/steve/Minecraft/Swim3/./crash-reports/crash-2014-07-30_09.43.59-server.txt
[09:43:59] [Server Shutdown Thread/INFO]: Stopping server

Here's the crash report itself: http://pastebin.com/3W4SM6BZ

naturalismus 2014-07-30T17:26:36Z

@Anthony Martin: The crashes also happen when there is no villager around. As Fenhl said, "Loading and unloading the Nether causes lag, and the watchdog kills the server when there's too much lag." Sometimes this happens when new chunks are generated, sometimes when some chunks are unloaded, and sometimes due to too many chickens or players.

@Jason McDonald: On our server the problem seems to be worse with more people.

Anthony Martin 2014-07-30T17:46:43Z

I hope this is related, but I notice that my max tick is now 60.00, and at the same time, I noticed a new key in server.properties:

max-tick-time=60000

lapppy 2014-07-30T18:09:09Z

Still in 14w31a.
Setting "/gamerule doMobSpawning false" seems to stop it (for me at least), but that prevents mob spawning.

Anthony Martin 2014-07-30T20:53:04Z

I just confirmed that "/gamerule doMobSpawning false" eliminates crashes even when I'm near villagers. I think that points to Zombie AI being the problem.

Frisk Dreemurr 2014-07-30T21:02:56Z

I think that this problem is only dependent on the lag, nothing else.

Anthony Martin 2014-07-30T22:48:00Z

Absolutely, it's only dependent on the lag. But lag is caused by something, which this feature will help us diagnose. One of those things might be Zombies pathing to Villagers.

DaMaloma 2014-07-31T10:28:19Z

I had to revert to 14w30c on my server because of this - so for me 14w31a is much, much worse. I tried with my old world and with a fresh one. It took somewhat longer for the fresh one to start crashing so I thought it was only world related until the fresh world also started crashing.

Anthony Martin 2014-07-31T14:51:36Z

I also reverted back to 14w30c after testing 14w31a. Too many crashes for me as well.

Aleksiki2000 2014-07-31T15:15:50Z

Anthony Martin:
Absolutely, it's only dependent on the lag. But lag is caused by something, which this feature will help us diagnose. One of those things might be Zombies pathing to Villagers.

MC-17630 may cause the lag.

William Sandey 2014-07-31T20:45:29Z

Getting a server crash also since upgrading to 14w31a this morning (attached crash report)

Slackware64 14.0
Java 1.8.0_05

Jacob Brown 2014-07-31T21:03:06Z

I can confirm this as it happens to me as well, but only on the 14w31a snapshot. Same error as well.

Dale s 2014-07-31T21:21:13Z

I also have the same problem with 14w31a. Does anyone know how to fix this?
I went back to 14w30c. I still get lag (53 ticks skipped), but it crashes maybe once a day.

Are we just going to have to wait until Aug 5, when they have the official 1.8?

Jacob Brown 2014-07-31T22:26:37Z

OK, I just read this on the wiki. Apparently the fix "might" be easy (I still need to test it): try changing "max-tick-time=60000" to something like "max-tick-time=36000000". You can find this in server.properties.
EDIT: It seems to have worked. It has not crashed after its usual 10 minutes of running; it has now been 30 minutes. Looks like that was it. Hope this helps.

Dale s 2014-07-31T22:51:31Z

What about the lag? With the other snapshot, all I get is 53 ticks skipped. How many ticks are you skipping with the new snapshot? Are people able to play? Are they having lots of block lag?

Dale s 2014-08-01T00:32:47Z

OK, I upgraded to the new snapshot 14w31a, but I get a lot more lag messages:
"[20:29:16] [Server thread/WARN]: Can't keep up! Did the system time change, or is the server overloaded? Running 8861ms behind, skipping 177 tick(s)"

It used to always be "skipping 53 tick(s)"; now it's always higher than 53.
No one has complained about block lag yet. That might be a temporary fix for now, but something is causing a lot of lag on the server.

Thanks.

Update: It took an hour to crash. It crashed the same way.
I will set max-tick-time=360000000

Does this tell you anything?

[21:04:01] [Server thread/ERROR]: Encountered an unexpected exception
u: Ticking entity
at net.minecraft.server.MinecraftServer.y(SourceFile:610) ~[minecraft.jar:?]
at pk.y(SourceFile:303) ~[minecraft.jar:?]
at net.minecraft.server.MinecraftServer.x(SourceFile:530) ~[minecraft.jar:?]
at net.minecraft.server.MinecraftServer.run(SourceFile:446) [minecraft.jar:?]
at java.lang.Thread.run(Thread.java:722) [?:1.7.0]
Caused by: java.lang.NullPointerException
at aay.l(SourceFile:207) ~[minecraft.jar:?]
at aay.k(SourceFile:155) ~[minecraft.jar:?]
at xl.bE(SourceFile:450) ~[minecraft.jar:?]
at xk.l(SourceFile:1430) ~[minecraft.jar:?]
at xl.l(SourceFile:316) ~[minecraft.jar:?]
at afc.l(SourceFile:39) ~[minecraft.jar:?]
at afm.l(SourceFile:151) ~[minecraft.jar:?]
at xk.j(SourceFile:1298) ~[minecraft.jar:?]
at xl.j(SourceFile:194) ~[minecraft.jar:?]
at afc.j(SourceFile:44) ~[minecraft.jar:?]
at aqh.a(SourceFile:1410) ~[minecraft.jar:?]
at qo.a(SourceFile:601) ~[minecraft.jar:?]
at aqh.g(SourceFile:1388) ~[minecraft.jar:?]
at aqh.i(SourceFile:1281) ~[minecraft.jar:?]
at qo.i(SourceFile:479) ~[minecraft.jar:?]
at net.minecraft.server.MinecraftServer.y(SourceFile:606) ~[minecraft.jar:?]
... 4 more

Joe Crump 2014-08-01T19:08:50Z

FYI: That call stack includes a Skeleton entity (class name "afm" in snapshot 31a).

Adam Conway 2014-08-04T07:00:19Z

[21:23:27] [Server Watchdog/FATAL]: A single server tick took 60.00 seconds (should be max 0.05)
[21:23:27] [Server Watchdog/FATAL]: Considering it to be crashed, server will forcibly shutdown.

Yup. Same here. Crash report for this one is here: https://bugs.mojang.com/browse/MC-65257

Nathan Adams 2014-08-04T13:54:53Z

This is the server just shutting down because it was lagging too much. You can configure the threshold in server.properties ("max-tick-time"), but it is not a bug.

Adam Conway 2014-08-04T14:04:12Z

In my case there is no lag whatsoever until the server shuts down. I have a video of this. Server plays fine, mobs move, blocks place fine and then the server shuts down saying a tick has not passed when it obviously has. I have video evidence of this.

This issue only started in 14w31a, a downgrade to 14w30c works fine. The issue IS a bug in 14w31a. Raising the threshold does nothing.

Anthony Martin 2014-08-04T15:17:58Z

https://twitter.com/inertia186/status/496310644455374848

lapppy 2014-08-04T18:24:10Z

@Dinnerbone Can you at least look at some of the causes of this? Natural mob spawning causes huge tick lag in the recent snapshots, which is most likely a common reason for this crash.

https://bugs.mojang.com/browse/MC-58120

https://bugs.mojang.com/browse/MC-63708

William Sandey 2014-08-05T14:11:16Z

I have "max-tick-time" set to 1920000 and still getting the crash. How high is it supposed to be to stop the crashing from occurring?

Anthony Martin 2014-08-05T16:43:14Z

Have you tried:

max-tick-time=-1

If it's like the other properties, maybe that's how you disable it.

Anthony Martin 2014-08-05T21:01:38Z

What pisses me off about this so-called "works as intended" situation is that about half the time, when the server considers itself crashed, the java process has to be killed (e.g.: killall -9 java) because this stupid "watchdog" thread doesn't always shut down cleanly. But of course, it "works as intended" so it will never be fixed, even if I could pry out a crash report.

Net result: I get a useless "Considering it to be crashed, server will forcibly shutdown" message, and the server sits there for hours on end until I notice it and force kill java. Thanks for that.

And if I create a new bug detailing this, it'll naturally be marked as duplicate of this one.

Anthony Martin 2014-08-05T21:11:24Z

If it helps, here's the crash report that resulted in a java process that required kill -9 to recover from:

http://pastebin.com/X8q1KgwH

Nathan Adams 2014-08-06T09:49:17Z

It calls System.exit(1). If this is not killing the process for you, then there is a bug in the version of java you're using and is completely out of our hands.

From the next snapshot, you may use a value of -1 to disable the watchdog, but you could just as easily use a value of 9,223,372,036,854,775,807 (the maximum supported) until then.

As for the root causes: there are an uncountable number of possible causes, some helpable some not, each one is its own issue and not to be collected in this issue. The watchdog is literally a case of: "something is lagging the server very much, and instead of letting it hang for a long period of time until somebody realizes it and has to kill the process I'm going to kill it for them, so they can at least notice and clean up."

Nathan Adams 2014-08-06T09:55:45Z

I've made it try even harder to shut down (halting the process) after 10 seconds of requesting an exit, which should absolutely guarantee that the server stops unless it is definitely a bug in your java version.
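
As a sketch of what "exit, then halt after a grace period" can look like in plain Java: System.exit runs shutdown hooks and waits for them, so a hook that never returns can keep the process alive, while Runtime.halt stops the JVM without running hooks. The class and method names below are hypothetical and this is not the server's actual change.

```java
/** Illustrative only: request a normal exit, with a hard stop as a fallback. */
final class ForcedShutdownSketch {

    /** Starts a daemon thread that runs haltAction after delayMs unless it is interrupted first. */
    static Thread scheduleHalt(long delayMs, Runnable haltAction) {
        Thread timer = new Thread(() -> {
            try {
                Thread.sleep(delayMs);
                haltAction.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // treated as "cancelled"
            }
        }, "Forced Halt Timer");
        timer.setDaemon(true); // daemon threads keep running during the shutdown sequence
        timer.start();
        return timer;
    }

    /** Never returns: either exit completes, or the timer halts the JVM after graceMs. */
    static void shutdownHard(long graceMs) {
        scheduleHalt(graceMs, () -> Runtime.getRuntime().halt(1)); // halt() skips shutdown hooks
        System.exit(1); // runs shutdown hooks; blocks if a hook never finishes
    }

    public static void main(String[] args) throws InterruptedException {
        // Harmless demo: print instead of halting.
        scheduleHalt(100L, () -> System.out.println("grace period over, would halt now")).join();
    }
}
```

Passing the halt action in as a Runnable keeps the timer testable without actually killing the JVM.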

Adam Conway 2014-08-06T10:35:21Z

Dinnerbone I feel you did not even read my comments.

There is no lag leading up to the crash. The server operates as normal until my players get a Netty error.

I have tried this on three different Ubuntu boxes trying both Java 7 and 8 on all three. Same result.

Nathan Adams 2014-08-06T10:37:39Z

Give me some log output, please.

Sean Moran 2014-08-06T12:21:41Z

14w30c with Java 8.

https://docs.google.com/file/d/0B1ESg-ZJ8WoiMVhGUEpRWDQ2aF9McVcxVk5zYWlPMzVoSU9F/edit?usp=docslist_api

Adam Conway 2014-08-06T14:48:47Z

https://bugs.mojang.com/browse/MC-65257 this was on one of my boxes.

Also can get the video of it running fine until the crash if needed.

Joe S 2014-08-06T15:13:57Z

Dinnerbone you said to just do this

"From the next snapshot, you may use a value of -1 to disable the watchdog, but you could just as easily use a value of 9,223,372,036,854,775,807"

Would you suggest setting the max value OVER completely disabling watchdog?

Thank you for your insight.

Anthony Martin 2014-08-06T16:56:49Z

I found the root cause for me. So, think about this. The initial watchdog was 30 seconds. Then the default was changed to 60 seconds. That's how long I had to find and deal with:

http://cl.ly/image/1a2L050t1A08

Yes, those were "natural" spawn. That is to say, no admin purposefully placed them. They spawned during the snapshots. The watchdog didn't help this, it just got in the way.

Anthony Martin 2014-08-06T17:17:08Z

Instead of a thread that just stops the server when there are problems, it'd be nice if there were a way for admins to disable individual threads that are causing problems. In my case, it would have been nice to be able to toggle mob AI. For example, this would have been nice:

/gamerule doMobAI false

Anthony Martin 2014-08-06T21:49:09Z

@unknown, here's an example where System.exit(1) failed and I had to do kill -9 instead. Is it a race condition? (14w32a, java 1.8.0_11, Mac OS X 10.9.4)

[14:40:07] [User Authenticator #96/INFO]: UUID of player RCminecraft2013 is 587fb1ea-fb5e-43bd-91ad-7419aa9f5efe
[14:40:08] [Server thread/INFO]: RCminecraft2013[/38.86.65.14:55776] logged in with entity id 315398810 at (5555.5, 36.0, -5557.5)
[14:40:08] [Server thread/INFO]: RCminecraft2013 joined the game
[14:40:08] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:08] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:08] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:08] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:08] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:10] [Server thread/WARN]: Can't keep up! Did the system time change, or is the server overloaded? Running 4794ms behind, skipping 95 tick(s)
[14:40:10] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:10] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:11] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:11] [Server thread/INFO]: [@: Successfully spread 1 players around 5500.5,-5499.5]
[14:40:13] [Server thread/INFO]: Minecraftio_RD has just earned the achievement [Time to Strike!]
[14:40:14] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:17] [Server thread/INFO]: Minecraftio_RD has just earned the achievement [Hot Topic]
[14:40:17] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:23] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:23] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:23] [RCON Client #2188/INFO]: [Rcon: Given [Rules] * 1 to RCminecraft2013]
[14:40:23] [Server thread/ERROR]: Encountered an unexpected exception
u: Exception ticking world
	at net.minecraft.server.MinecraftServer.y(SourceFile:602) ~[minecraft_server.jar:?]
	at pl.y(SourceFile:305) ~[minecraft_server.jar:?]
	at net.minecraft.server.MinecraftServer.x(SourceFile:530) ~[minecraft_server.jar:?]
	at net.minecraft.server.MinecraftServer.run(SourceFile:446) [minecraft_server.jar:?]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_11]
Caused by: java.util.ConcurrentModificationException
	at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:886) ~[?:1.8.0_11]
	at java.util.ArrayList$Itr.next(ArrayList.java:836) ~[?:1.8.0_11]
	at aqm.a(SourceFile:2316) ~[minecraft_server.jar:?]
	at aqy.a(SourceFile:73) ~[minecraft_server.jar:?]
	at qq.c(SourceFile:182) ~[minecraft_server.jar:?]
	at net.minecraft.server.MinecraftServer.y(SourceFile:598) ~[minecraft_server.jar:?]
	... 4 more
[14:40:23] [Server thread/ERROR]: This crash report has been saved to: /Users/steve/Minecraft/Swim3/./crash-reports/crash-2014-08-06_14.40.23-server.txt
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:24] [RCON Client #2189/INFO]: [RCminecraft2013: Played sound 'loz_key' to RCminecraft2013]
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:24] [RCON Client #2190/INFO]: [Rcon: Title command successfully executed]
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:24] [RCON Client #2191/INFO]: [Rcon: Title command successfully executed]
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:24] [User Authenticator #97/INFO]: UUID of player dragoneith is 834a5b4c-89e3-4222-9d85-2b2223c80a77
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:24] [RCON Client #2195/INFO]: [SWS8964: Played sound 'smas_smb3_thud' to SWS8964]
[14:40:24] [RCON Client #2195/INFO]: [Thelocust88: Played sound 'smas_smb3_thud' to Thelocust88]
[14:40:24] [RCON Client #2195/INFO]: [snowydesert: Played sound 'smas_smb3_thud' to snowydesert]
[14:40:24] [RCON Client #2195/INFO]: [Burnhard14: Played sound 'smas_smb3_thud' to Burnhard14]
[14:40:24] [RCON Client #2195/INFO]: [RCminecraft2013: Played sound 'smas_smb3_thud' to RCminecraft2013]
[14:40:24] [RCON Client #2195/INFO]: [Minecraftio_RD: Played sound 'smas_smb3_thud' to Minecraftio_RD]
[14:40:24] [RCON Client #2195/INFO]: [TheDownfall: Played sound 'smas_smb3_thud' to TheDownfall]
[14:40:24] [RCON Client #2195/INFO]: [SuEk0: Played sound 'smas_smb3_thud' to SuEk0]
[14:40:24] [RCON Client #2195/INFO]: [BusierMold58: Played sound 'smas_smb3_thud' to BusierMold58]
[14:40:24] [RCON Client #2195/INFO]: [inertia186: Played sound 'smas_smb3_thud' to inertia186]
[14:40:24] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:25] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:25] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:25] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:25] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:25] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:25] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:29] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:39] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:50] [User Authenticator #98/INFO]: UUID of player snowydesert is 54c91052-7477-448d-80ee-3414c986de8e
[14:40:52] [User Authenticator #99/INFO]: UUID of player Thelocust88 is 09dbee4d-4afa-457a-951f-17339775bb65
[14:40:52] [User Authenticator #100/INFO]: UUID of player TheDownfall is 502a4d27-a280-4e65-b620-54718adf816c
[14:40:54] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:40:54] [User Authenticator #101/INFO]: UUID of player SuEk0 is db47a9b5-213f-405f-bfaf-35c21e548eff
[14:40:55] [User Authenticator #102/INFO]: UUID of player BusierMold58 is ab6c18f8-a2ea-4e2a-bfbd-eac0ac20ac29
[14:41:04] [User Authenticator #103/INFO]: UUID of player Minecraftio_RD is 33285fea-ec93-4abf-a6e9-0ae835aa0057
[14:41:07] [User Authenticator #104/INFO]: UUID of player snowydesert is 54c91052-7477-448d-80ee-3414c986de8e
[14:41:09] [User Authenticator #105/INFO]: UUID of player Thelocust88 is 09dbee4d-4afa-457a-951f-17339775bb65
[14:41:19] [User Authenticator #106/INFO]: UUID of player SuEk0 is db47a9b5-213f-405f-bfaf-35c21e548eff
[14:41:20] [User Authenticator #107/INFO]: UUID of player TheDownfall is 502a4d27-a280-4e65-b620-54718adf816c
[14:41:21] [User Authenticator #108/INFO]: UUID of player BusierMold58 is ab6c18f8-a2ea-4e2a-bfbd-eac0ac20ac29
[14:41:29] [User Authenticator #109/INFO]: UUID of player RCminecraft2013 is 587fb1ea-fb5e-43bd-91ad-7419aa9f5efe
[14:42:33] [User Authenticator #110/INFO]: UUID of player dragoneith is 834a5b4c-89e3-4222-9d85-2b2223c80a77
[14:42:37] [User Authenticator #111/INFO]: UUID of player RCminecraft2013 is 587fb1ea-fb5e-43bd-91ad-7419aa9f5efe
[14:42:37] [User Authenticator #112/INFO]: UUID of player Thelocust88 is 09dbee4d-4afa-457a-951f-17339775bb65
[14:42:46] [User Authenticator #113/INFO]: UUID of player snowydesert is 54c91052-7477-448d-80ee-3414c986de8e
[14:43:21] [User Authenticator #114/INFO]: UUID of player BusierMold58 is ab6c18f8-a2ea-4e2a-bfbd-eac0ac20ac29
[14:43:45] [User Authenticator #115/INFO]: UUID of player Thelocust88 is 09dbee4d-4afa-457a-951f-17339775bb65
[14:44:29] [User Authenticator #116/INFO]: UUID of player snowydesert is 54c91052-7477-448d-80ee-3414c986de8e
[14:45:00] [Server thread/INFO]: Saving...
[14:45:00] [Server thread/INFO]: Saved the world
[14:45:01] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:45:01] [RCON Client #2206/INFO]: [Rcon: Saved the world]
[14:45:01] [RCON Listener #2/INFO]: Rcon connection from: /127.0.0.1
[14:45:08] [User Authenticator #117/INFO]: UUID of player BusierMold58 is ab6c18f8-a2ea-4e2a-bfbd-eac0ac20ac29
[14:45:23] [Server Watchdog/FATAL]: A single server tick took 300.01 seconds (should be max 0.05)
[14:45:23] [Server Watchdog/FATAL]: Considering it to be crashed, server will forcibly shutdown.
[14:45:25] [Server Watchdog/ERROR]: This crash report has been saved to: /Users/steve/Minecraft/Swim3/./crash-reports/crash-2014-08-06_14.45.25-server.txt
[14:45:25] [Server Shutdown Thread/INFO]: Stopping server

And the crash report:

http://pastebin.com/d4w0z0is
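
For readers unfamiliar with the exception in the log above: ArrayList's iterator throws ConcurrentModificationException from next() when the list was structurally changed after the iterator was created. The single-threaded Java sketch below reproduces that and shows the iterator-based removal that avoids it. The names are hypothetical, and nothing here says which server code changed the list; a change made from another thread can surface the same way, on a best-effort basis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

/** Illustrative only: how ArrayList iteration reacts to modification. */
final class ComodificationSketch {
    /** Removes even numbers the unsafe way; returns true if the iterator detected the change. */
    static boolean removeDuringForEach(List<Integer> list) {
        try {
            for (Integer value : list) {
                if (value % 2 == 0) {
                    list.remove(value); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Removes even numbers through the iterator, which keeps its bookkeeping consistent. */
    static void removeWithIterator(List<Integer> list) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println("exception seen: " + removeDuringForEach(ids));
    }
}
```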

Nathan Adams 2014-08-07T08:57:39Z

It looks like your server died at 14:40, watchdog realized at 14:45 and then initiated the shutdown which didn't go through. I'll investigate, thank you for the log.

Nathan Adams 2014-08-07T09:01:17Z

Roughly how long was it stuck like this before you had to kill it? Seconds or minutes?

Anthony Martin 2014-08-07T14:38:13Z

This particular example was allowed to sit for about 15 minutes before I killed it. I've seen it sit for hours.

I think it's related to RCON. It seems to happen most if I try to remove massive amounts of entities with RCON. The same commands in-game do not cause it to hang.

Thanks for looking into it. 🙂

Anthony Martin 2014-08-10T07:56:56Z

Another crash that required kill -9 to stop Java. It was allowed to sit for about 15 minutes before kill -9 was used.

This time in 14w32d:

http://pastebin.com/X88kQctM

Sean Moran 2014-08-10T12:33:40Z

It appears that crash doesn't have a causal relation to the watchdog. The description says exception in server tick. What's interesting is that it decided to crash when you had 11 players on.

Anthony Martin 2014-08-10T15:18:07Z

That is correct, the crash in my previous comment (#comment-187211) has no connection with the watchdog. That's because I disabled it by setting max-tick-time=-1.

@unknown said that all the watchdog does is send System.exit(1) but the crash I posted in my previous comment required an OS kill signal (kill -9), which the watchdog has not been able to deal with, so far.

Sean Moran 2014-08-10T15:29:14Z

Btw, we were able to reduce server tick lag after finding about 200 naturally spawned zombies trying to pathfind to villagers. It seems the tracking causes immense strain on servers. Perhaps it's the cause.

Andrejs Kilis 2014-08-23T06:27:48Z

Resolution: Works as intended??? My server used to work perfectly fine before the watchdog, so please give us a server option to shut it down!!!

Todd Dudek 2014-08-24T01:02:45Z

Okay, this just stops it from crashing, but it doesn't solve the massive tick lag spikes that make the game unplayable

Fenhl (Max Dominik Weber) 2014-08-24T14:14:19Z

Those are a different issue.

Todd Dudek 2014-08-24T15:31:31Z

How? The server crashes are caused by the watchdog stopping the server because of these massive tick lag spikes. Turning off the watchdog doesn't solve the issue.

Warren Liddell 2014-08-27T10:54:03Z

I agree. I run 1.6 and 1.7 servers fine, not a problem. 1.8 was fine for a while, then all of a sudden a server with 16 GB of RAM isn't enough to keep up. It's ridiculous.

Todd Dudek 2014-08-27T21:56:05Z

If you look in the task manager, it doesn't pull a lot of RAM, but the CPU is absolutely hogged. Something is messed up in the code; it has to be.

rodney cheney 2014-08-28T16:34:09Z

To clear up a lot of the lag, I change the difficulty to peaceful and then back to hard every 5 minutes.

It kills all hostile mobs, and the game stops lagging for the most part. I still get random spikes, but I've noticed a no-lag period every 5 minutes.

Pierre Waldén 2014-09-02T16:30:04Z

Ehm... I understand shutting down the server if it takes more than one minute to do one game tick, but isn't the issue here that a game tick takes longer than 60 seconds in the first place?

I can not see how "the server takes > 60 seconds to do a gametick" (on a fully functional server) can ever be intended, so is there any report related to this one where this is discussed? Because all the posts I find related to this seem to be closed as duplicates of this one.

Anthony Martin 2014-09-02T17:33:01Z

@unknown, it is true that > 60 seconds to do a gametick is a problem in and of itself. The watchdog does not solve these particular problems, it only masks them. Some of these other problems have been discussed in this very "thread" and @unknown has already said he would look at some of these situations. I assume he has by now and couldn't find the root cause.

The fact that the watchdog can be disabled now is a result of the issue you describe. Disabling the watchdog will pretty much be required in order to avoid "closed as duplicate" of this one.

Keybounce 2014-09-05T00:13:21Z

In my experience, a massive lag like this in the server only happens when a major garbage collection is needed while using the default "throughput" garbage collector.

When I switched to using CMS collection, those rare, large garbage delays went away.

Maybe CMS needs to become the new recommended default for servers? Even clients benefit from it in my experience.

Pierre Waldén 2014-09-05T19:50:02Z

@unknown: You are right that garbage collections can produce something similar, but at least on our server this is not the case here.
We already use a different garbage collector (G1 in our case) and have tested this a lot. The garbage collections are done in half a tick (25ms) and not often enough to cause this kind of behavior. The CPU choke is a result of a combination of code changes that were made.
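
For anyone who wants to check GC time on their own setup, the standard java.lang.management API reports per-collector counts and accumulated time. A minimal Java sketch follows, with a hypothetical class name. It reports on the JVM it runs in, so for a server it would have to run inside the server process or be queried remotely over JMX, which is not shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Illustrative only: reads the GC statistics the JVM exposes about itself. */
final class GcTimeSketch {
    /** Sums the accumulated collection time in milliseconds over all collectors of this JVM. */
    static long totalGcTimeMs() {
        long total = 0L;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long timeMs = gc.getCollectionTime(); // -1 if this collector does not report it
            if (timeMs > 0) {
                total += timeMs;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms total");
        }
    }
}
```

Sampling the total before and after a lag spike gives a rough idea of how much of the spike was spent collecting garbage.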

Brandon Enright 2014-09-07T17:54:37Z

If I leave the watchdog timer at the default 60 seconds sometimes my server will kill itself in a few minutes, before any player has attempted to join. I've also noticed that upon starting, the 1.8 server sits at 100% CPU for much, much longer than the 1.7.10 server did.

Brandon Enright 2014-09-07T23:16:21Z

The previous comments mentioning turning off mob spawning have no effect on my server. It lags just as much on peaceful and with mob spawning off.

Lorenz Hahn 2014-09-10T17:15:04Z

I saw these problems on my server, too. Changing the JVM and playing with garbage collection options didn't help at all.

In the game we noticed that mobs didn't disappear when they used to. E.g. a pigman left hell with us players and was blocked by a door from returning to hell. We took a ride on our horses and were more than 1000 blocks away before we came back. The pigman should have despawned by the time we returned, but he was still present.

Switching the game to peaceful and back to normal removed the symptoms.

Root cause or other bug?

Has anyone else observed this behavior?

Brandon Enright 2014-09-18T23:37:31Z

I decided to explore garbage collector options and I've found that using the G1 garbage collector (using the flag -XX:+UseG1GC) makes the problem significantly better.

I think garbage collection is the source of the terrible performance and very long lags. What's probably happening is that the server now requires more CPU and memory than it used to. When a garbage collection happens, it takes so long that the server gets behind. When the garbage collection finishes, the server detects that it's behind and tries to "catch up" by skipping ticks. This in turn probably causes another garbage collection, and so the server gets stuck in a vicious loop where garbage collections cause the server to do work that causes another garbage collection, and so forth. Instead of getting useful work done, it spends almost all of its time in garbage collection cycles.

The G1 garbage collector is designed to spread the garbage collection work out and not cause long delays which seems to prevent the vicious garbage collection cycle from killing performance.

With G1 now my server just runs slowly with lag at all times. It is now playable, but just barely.

I doubt it's just one thing that needs fixing. I suspect a lot of effort needs to go into using less memory and probably also less CPU so that garbage collection isn't needed so often and when it does happen, there is enough spare CPU left for the garbage collection to not get the server too behind. Right now I think the server can't even keep up without garbage collection so any time spent doing garbage collection means the server gets behind.
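
One way to check the garbage collection theory above is to measure how much wall-clock time the JVM actually spends collecting. The following is only a minimal sketch, not anything shipped with the server: `GcTimeProbe` and its method names are made up for illustration, and it reports only on the JVM it runs in, so it would have to be loaded into the server process (or adapted to a remote JMX connection) to say anything about the server itself. It relies on the standard `java.lang.management` API, whose collection times are approximate.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the Minecraft server.
class GcTimeProbe {
    /** Approximate milliseconds spent in GC since JVM start, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if this collector does not report time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of a wall-clock interval spent in GC, given two samples of totalGcMillis(). */
    static double gcFraction(long gcBeforeMs, long gcAfterMs, long wallMs) {
        if (wallMs <= 0) {
            return 0.0;
        }
        return (double) (gcAfterMs - gcBeforeMs) / wallMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        Thread.sleep(1000); // sample window; in practice, sample around a lag spike
        long wallMs = (System.nanoTime() - start) / 1_000_000;
        double share = gcFraction(gcBefore, totalGcMillis(), wallMs);
        System.out.printf("GC share of the last %d ms: %.1f%%%n", wallMs, 100 * share);
    }
}
```

If the reported share stays small during a lag spike, garbage collection is probably not the main cause on that machine; if it is close to 100%, the vicious-cycle explanation becomes more plausible.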

Sean Shannon 2014-09-19T03:20:57Z

I've found 2 important things to work around this issue and I hope the mojangsters see this.
#1 for the tick issue, setting the tick rate to 100 seconds seems to drastically reduce the random crashes.

more importantly
#2 running the vanilla server client with the nogui argument DRAMATICALLY improves the response rate of the server.
java -d64 -Xmx2048M -Xms2048M -XX:PermSize=128M -XX:MaxPermSize=256m -jar minecraft_server.1.8.jar nogui -o true

my server was nearly unplayable with only a single person connected and now there's only a tiny bit of lag with 5 people connected.

this issue seems to be very specific and more importantly reproducible.

Pierre Waldén 2014-09-19T04:49:25Z

@unknown:

setting the tick rate to 100 seconds

What does this even mean?
Is rate not 1/t anymore?

Also.. Using nogui is a no brainer.

Sean Shannon 2014-09-19T05:20:32Z

max-tick-time=100000

sorry for not being clear.
and while I agree that running nogui may be normal to some, this is not working as designed.
the 1.7.10 server had zero issues running with the gui open.
the 1.8 server is impossible to run with the gui open.

thusly a bug is apparent.
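
To put the max-tick-time=100000 value above into the units that were asked about: the setting is in milliseconds, and the watchdog log message gives 0.05 seconds as the intended length of one tick. A small sketch of the arithmetic follows; the class and method names are hypothetical and only there for illustration.

```java
// Hypothetical helper for unit conversion; names are illustrative only.
class TickTimeUnits {
    // 0.05 s per tick, taken from the watchdog log message ("should be max 0.05")
    static final long NOMINAL_TICK_MS = 50;

    /** Converts a max-tick-time value (milliseconds) to seconds. */
    static double toSeconds(long maxTickTimeMs) {
        return maxTickTimeMs / 1000.0;
    }

    /** Number of nominal 50 ms ticks that fit into the given max-tick-time. */
    static long nominalTicks(long maxTickTimeMs) {
        return maxTickTimeMs / NOMINAL_TICK_MS;
    }

    public static void main(String[] args) {
        long maxTickTimeMs = 100_000; // the value from server.properties above
        System.out.println(toSeconds(maxTickTimeMs) + " s");
        System.out.println(nominalTicks(maxTickTimeMs) + " nominal ticks");
    }
}
```

So the value is a limit on how long a single tick may take (100 seconds, or the time 2000 normal ticks would need), not a tick rate.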

Brandon Enright 2014-09-19T06:48:09Z

I have always run with nogui. The performance problems are intolerable even with nogui specified.

StevenNL2000 2014-09-19T07:14:31Z

~~I am suspecting that this bug is part of a chain that looks like this: MC-17630 > MC-46812 > MC-63590.~~ This is not true according to @unknown.

Brandon Enright 2014-09-19T07:24:40Z

@StevenNL2000 except that the first bug is about zombies and the second is about animals and other mobs and plenty of people have tested this bug and the load has nothing to do with any mobs. Turning off mob spawning and killing all mobs has no effect.

Don't be fooled by idiots posting in this bug about how they tracked the issue down to <simple fix here> because it isn't mobs or nogui.

StevenNL2000 2014-09-19T07:29:06Z

@unknown, this issue is caused by lag, as lag is the only thing that causes a reduced tickrate, which triggers Watchdog. I am pointing out a likely chain for the lag cause.

Pierre Waldén 2014-09-19T07:35:53Z

@unknown:
It is about lag, and that can be caused by several things. The zombie pathfinding is probably one of the largest problems though.
Saying that a "cant keep up" message has nothing to do with any mob is just stupid and can be explored by putting 10000000 mobs in a world and see what happens.

@unknown:
I agree.. The zombie pathfinding algorithm is probably one of the largest problems causing lag atm due to the way it grows.
(Of course that is not the only problem causing lag, but it would be a good start to optimize this)

Edit: Oops.. Did not see your last post before posting this.

Brandon Enright 2014-09-19T07:51:42Z

@Pierre Waldén

Saying that a "cant keep up" message has nothing to do with any mob is just stupid and can be explored by putting 10000000 mobs in a world and see what happens.

Turning off mob spawning and killing all mobs still results in the problem. Therefore mobs aren't the root cause of the issue even if they may make it worse.

You've committed a logical fallacy known as "affirming the consequent" (look it up). Next time try not to be an idiot.

Pierre Waldén 2014-09-19T08:09:04Z

@unknown

I doubt it's just one thing that needs fixing. I suspect a lot of effort needs to go into using less memory and probably also less CPU

the load has nothing to do with any mobs

Therefore mobs aren't the root cause of the issue even if they may make it worse.

Look at your own posts...

Also... I do not know if I ever said that it was the "root" of the problem. Did I?

Edit:
Your posts so far have been good, so let's not go off on a tangent here and instead focus on the problem. I have long been trying to shed some light on G1GC, for example, so I agree with you on a lot of things.

StevenNL2000 2014-09-19T14:43:08Z

@unknown, @unknown, it seems that docm77 found one cause of the lag at https://twitter.com/docm77/status/512882629184454656.

kumasasa 2014-09-19T17:30:27Z

@unknown: Please keep it civil and don't call others idiot.

Fábio Cabrita 2014-09-30T12:29:55Z

Since I am getting the same problem, I made some tests with 1 GB of RAM. It crashes with just one player, and CPU and RAM are always at 100%.
There is no difference between using 512 MB and 1024 MB.

rowan popat 2014-10-23T15:33:33Z

Hey, I created a server yesterday and it worked fine. Then the next day, there were crashes every 10-15 mins. When I look at the crash report, this is what it says: Description: Watching Server. I don't understand, so can someone please help?

kumasasa 2014-10-23T18:56:28Z

@unknown: Read the red / yellow box in the description of this ticket.

For technical support please use the Mojang Support Center.

rowan popat 2014-10-24T09:13:35Z

Hi @[Mod] Kumasasa what yellow box are you talking about?

user-f2760 2014-10-24T13:23:35Z

where it says this at the top of this ticket....

To set the server watchdog to a higher time or switch it off:
Set max-tick-time in server.properties to a value higher than 60000 or to -1
http://minecraft.gamepedia.com/Server.properties#section_3

kumasasa 2014-10-25T00:23:21Z

That:

To set the server watchdog to a higher time or switch it off:

Set max-tick-time in server.properties to a value higher than 60000 or to -1
http://minecraft.gamepedia.com/Server.properties#section_3
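
As a rough illustration of what that setting implies, here is a minimal sketch of a watchdog-style check. It is not the actual server code (which is obfuscated and may well differ); `WatchdogSketch` and `tickTimedOut` are made-up names, and the only assumptions taken from this ticket are that max-tick-time is in milliseconds and that -1 switches the watchdog off.

```java
// Illustrative only: the real watchdog is obfuscated server code and may differ.
class WatchdogSketch {
    /**
     * Returns true if the current tick has been running longer than the
     * configured limit. A limit of -1 means the watchdog is switched off.
     */
    static boolean tickTimedOut(long tickStartMs, long nowMs, long maxTickTimeMs) {
        if (maxTickTimeMs == -1) {
            return false; // watchdog disabled
        }
        return nowMs - tickStartMs > maxTickTimeMs;
    }

    public static void main(String[] args) {
        long tickStartMs = 0;
        long nowMs = 61_000; // a tick that has been stuck for 61 seconds

        System.out.println(tickTimedOut(tickStartMs, nowMs, 60_000));  // limit of 60 s
        System.out.println(tickTimedOut(tickStartMs, nowMs, 100_000)); // raised limit
        System.out.println(tickTimedOut(tickStartMs, nowMs, -1));      // switched off
    }
}
```

Raising the limit or setting it to -1 only stops the forced shutdown; a tick that takes 61 seconds is still a 61-second freeze for the players.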

StanleyMines 2015-11-19T09:11:13Z

Just want to double check, In the newer versions, this is fixed, right (allowing numbers higher than 60,000ms)?

rowan popat 2018-01-11T14:01:01Z

Hi, just want to say thank you for helping me. I changed the max time to -1.
But now it just lags a lot. How do I stop that?

Scolution 2018-02-01T01:47:20Z

My bug report was flagged as a duplicate of this one, but this one is about the 2014 update, and the same bug is now in the latest snapshot...

[Mod] Neko 2018-02-01T01:52:05Z

@unknown, see the resolution.

Resolution: Works As Intended

Nothing has changed since 2014.

Scolution 2018-02-01T02:44:49Z

Why is my server still crashing then... with the same bug

Warren Liddell 2018-02-01T02:56:18Z

It's not a bug. If your server is crashing and you've done the above, then I'd suggest getting a better server and/or cutting back the number of users that can be online at any one time.

Scolution 2018-02-01T03:23:49Z

I've a good server. It works great but not with the latest snapshot..
The server lags a lot, and we use 8GB RAM for it and only play with a maximum of 5-10 players at the same time.

[Mod] Neko 2018-02-01T03:26:08Z

@unknown:

To set the server watchdog to a higher time or switch it off:
Set max-tick-time in server.properties to a value higher than 60000 or to -1

This should fix your problem.

Scolution 2018-02-01T04:01:14Z

This creates a huge amount of lag. I already tried it.

Warren Liddell 2018-02-01T04:53:50Z

Then it is quite apparent your server cannot handle what you have running. I would also suggest you add more RAM. 8GB is rather small even for 5-10 people; I wouldn't run less than 16GB for a Vanilla server these days, imho.

WellyPL 2019-02-16T10:15:37Z

I don't understand why so many issues reporting the server performance drop are getting linked to this one. The problem isn't the watchdog itself, but the fact that lately, in the 1.14 snapshots, servers are not optimized and often freeze for a few seconds every minute. Sometimes the freeze lasts over a minute, which results in the watchdog stopping the server. Neither CPU nor RAM is the problem here: I tested it on machines with different hardware and systems, and the problem is the same on every single one of them.

bob tucker 2024-05-05T00:28:18Z

If the server timeout stops when you update the console window (type a key, right-click, etc.) and you can't see it happening until you update the window, what helped for me was turning off "quick edit mode".

(This is for Windows only, I believe.) You have to right-click on the top bar of the console window while the server is running, then go to defaults and turn off quick edit mode. Then restart the server (resetting the console window).

No random timeouts anymore!

My problem was that the server was running completely fine (20tps, 10-20 mspt), but would randomly timeout. This seems to have fixed it.