My current GC parameters are -Xmx2560M -Xloggc:/srv/minecraft/server0/logs/gc.log -verbose:gc -XX:+PrintGCDateStamps -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalPacing -XX:ParallelGCThreads=2 -XX:+AggressiveOpts
I'm willing to try something else if you have any suggestions.
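One thing I could try on the next restart, purely as an experiment and not something I have verified on this machine, is switching to G1, which I believe this JDK 7 build ships with. Something along these lines (the jar name is just a stand-in for whatever the server jar is called, and the pause target is only a hint to the collector):
java -Xmx2560M -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -verbose:gc -XX:+PrintGCDateStamps -Xloggc:/srv/minecraft/server0/logs/gc.log -jar minecraft_server.jar nogui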
The problem is exacerbated the longer the service runs. Running at a view-distance of 12 with an uptime of 120k seconds, the lag was almost unbearable.
I have experienced the issue both with and without -XX:+DisableExplicitGC on my SMP server; the flag does not seem to make a difference. The issue seems more pervasive when a player is in a jungle biome. Perhaps the problem is related to the random lighting checks?
I have been noticing GC times increasing as the heap expands on my vanilla 1.7.4 SMP server. Once the heap has expanded to about 80% of the max, significant lag is noticeable. This only started happening after updating from 1.7.2 to 1.7.4. We need to restart the MC service every few days or the game becomes unplayable. The lag is accompanied by "Can't keep up!" messages in the server log. We are running a 64-bit JDK 7 with the CMS GC and 2.5 GB allowed for heap.
We have been experimenting with lowering the view-distance, stepping it down from 15 with each service restart. My last test was with a view-distance of 12, and the symptoms started happening after an uptime of about 100k seconds. We have now lowered the view-distance to 8 to see if it makes a difference.
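Since the lag seems to track how far the heap has expanded, another experiment we may try (just a guess at this point, nothing we have verified) is pinning the initial heap to the max so it never has to grow at runtime, i.e. adding {{-Xms2560M}} alongside the existing {{-Xmx2560M}} in the startup flags.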
I have experienced similar issues with SMP in MC-41874. My comments can be found here:
https://mojang.atlassian.net/browse/MC-41874?focusedCommentId=125551&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-125551
Before I moved the GC log to its own file, I was able to glean this from the console's standard output after the issue had been observed:
2013-12-16 16:54:10,553 WARN [server0]: Can't keep up! Did the system time change, or is the server overloaded? Running 7696ms behind, skipping 153 tick(s)
[GC 1979499K->1835083K(2604416K), 0.0286580 secs]
[GC 1971403K->1833500K(2604416K), 0.0196050 secs]
[GC 1969820K->1836337K(2604416K), 0.0204280 secs]
2013-12-16 16:54:26,196 WARN [server0]: Can't keep up! Did the system time change, or is the server overloaded? Running 5460ms behind, skipping 109 tick(s)
[GC 1972657K->1835088K(2604416K), 0.0216870 secs]
[GC 1971408K->1839286K(2604416K), 0.0254290 secs]
[GC 1975606K->1843388K(2604416K), 0.0289380 secs]
[GC 1979708K->1891431K(2604416K), 0.1430030 secs]
2013-12-16 16:54:43,226 WARN [server0]: Can't keep up! Did the system time change, or is the server overloaded? Running 6222ms behind, skipping 124 tick(s)
[GC 2027751K->1906185K(2604416K), 0.0606420 secs]
[GC 2042506K->1897370K(2604416K), 0.0245430 secs]
[GC 2033690K->1896936K(2604416K), 0.0238370 secs]
2013-12-16 16:54:59,839 WARN [server0]: Can't keep up! Did the system time change, or is the server overloaded? Running 5803ms behind, skipping 116 tick(s)
[GC 2033256K->1900197K(2604416K), 0.0298560 secs]
[GC 2036517K->1899627K(2604416K), 0.0339490 secs]
[GC 2035947K->1901086K(2604416K), 0.0346990 secs]
2013-12-16 16:55:14,918 WARN [server0]: Can't keep up! Did the system time change, or is the server overloaded? Running 5295ms behind, skipping 105 tick(s)
[GC 2037406K->1903222K(2604416K), 0.0368690 secs]
[GC 2039542K->1918848K(2604416K), 0.0429730 secs]
[GC 2055168K->1963249K(2604416K), 0.0681000 secs]
[GC 2099569K->1979768K(2604416K), 0.0318620 secs]
Here is how things look shortly after the service was restarted:
2013-12-16T17:06:41.670+0000: 2.586: [GC 34176K->5096K(123776K), 0.0126660 secs]
2013-12-16T17:06:42.452+0000: 3.368: [GC 39272K->15050K(123776K), 0.0482260 secs]
2013-12-16T17:06:42.578+0000: 3.493: [GC 49226K->56896K(123776K), 0.1089240 secs]
2013-12-16T17:06:42.687+0000: 3.602: [GC 57469K(123776K), 0.0062070 secs]
2013-12-16T17:06:42.777+0000: 3.692: [GC 91072K->93277K(128768K), 0.1311230 secs]
2013-12-16T17:06:43.328+0000: 4.244: [GC 127453K->120743K(156224K), 0.0868910 secs]
2013-12-16T17:06:43.613+0000: 4.529: [GC 154919K->132860K(168320K), 0.0196110 secs]
2013-12-16T17:06:43.633+0000: 4.549: [GC 133513K(168320K), 0.0062470 secs]
2013-12-16T17:06:43.811+0000: 4.726: [GC 166161K->142054K(251336K), 0.0221200 secs]
2013-12-16T17:06:43.833+0000: 4.749: [GC 142725K(251336K), 0.0014100 secs]
2013-12-16T17:06:43.965+0000: 4.881: [GC 176230K->153956K(251336K), 0.0153910 secs]
2013-12-16T17:06:44.071+0000: 4.986: [GC 155974K(251336K), 0.0014620 secs]
2013-12-16T17:07:06.177+0000: 27.093: [GC 185162K->156267K(283004K), 0.0189160 secs]
2013-12-16T17:07:06.360+0000: 27.276: [GC 190443K->167460K(283004K), 0.0220990 secs]
2013-12-16T17:07:06.509+0000: 27.424: [GC 201636K->180087K(283004K), 0.0188000 secs]
2013-12-16T17:07:06.657+0000: 27.573: [GC 214263K->189685K(283004K), 0.0137700 secs]
2013-12-16T17:07:07.134+0000: 28.049: [GC 223861K->199684K(283004K), 0.0135040 secs]
2013-12-16T17:07:08.079+0000: 28.994: [GC 233860K->203110K(283004K), 0.0076930 secs]
2013-12-16T17:07:09.773+0000: 30.688: [GC 237286K->201733K(283004K), 0.0037100 secs]
2013-12-16T17:07:11.449+0000: 32.365: [GC 235909K->203326K(283004K), 0.0040140 secs]
2013-12-16T17:07:14.406+0000: 35.322: [GC 237502K->204060K(283004K), 0.0072080 secs]
2013-12-16T17:07:18.097+0000: 39.012: [GC 238236K->204551K(283004K), 0.0039480 secs]
2013-12-16T17:07:22.447+0000: 43.363: [GC 238727K->206192K(283004K), 0.0056700 secs]
2013-12-16T17:07:26.810+0000: 47.726: [GC 240368K->204933K(283004K), 0.0059880 secs]
2013-12-16T17:07:29.956+0000: 50.872: [GC 239109K->208049K(283004K), 0.0127380 secs]
The "Can't keep up!" messages are very consistent and correspond to the observed lag.
We are running a stock Debian 7 (wheezy) server with vanilla MC 1.7.4. The stock Java version in this environment is:
java version "1.7.0_25"
OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1~deb7u1)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
This was not an issue in 1.7.2.
I've been having a substantial lag problem since I updated my vanilla server from 1.7.2 to 1.7.4. I'm using the 64-bit OpenJDK 7 JVM with CMS as my garbage collector and a 2.5 GB heap.
At first I thought the problem only happened when 4-5 people were logged in at a time, but that doesn't matter so much. What does matter is how long the instance has been running and how far the heap has expanded.
Unfortunately I've only been reviewing the GC times since the problem started happening. What I have observed is a constant minor GC every few seconds, each taking about two thousandths of a second, but as the heap expands toward the -Xmx setting they take significantly longer, into the hundredths of a second and sometimes more.
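If more detail would help, I can also enable full pause logging on the next restart. As far as I know these are all standard HotSpot options in JDK 7, though I have not measured their overhead on our box (the log path is just a placeholder for wherever the server writes its logs):
-Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime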
Restarting the service helps for 12-36 hours.
Let me know what other information could be helpful.
I'm sorry you do not understand the explanation, but it is nonetheless true. The problem isn't the specific setting; it is how much contiguous memory can be allocated at the time the JVM instance starts. This is related to the JVM implementation, and it may behave entirely differently on Windows than on Linux. I don't understand what point you are trying to make other than that you cannot reproduce this issue on Linux with different settings. The OP stated they were using 32-bit Windows, and it would likely work with a lower -Xms/-Xmx setting, assuming their OS can allocate it.
As I said, this is not a bug; use a 64-bit JVM or a different -Xms setting.
This is not a bug. It is because you are using a 32-bit JVM and it cannot allocate enough contiguous heap space. Use a 64-bit version of Java or don't set your minimum memory that high.
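To illustrate the point, a 32-bit JVM will usually start fine when asked for a smaller heap, for example something like the line below (the jar name is just a placeholder, and the exact ceiling depends on the OS, but on 32-bit Windows it is typically well under 2 GB):
java -Xms512M -Xmx1024M -jar minecraft_server.jar nogui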
This is happening on 1.7.2 on Linux with 64-bit Java. Before finding this bug I found it was happening with view-distances of 10 and up. I'm going to try 8 tonight. When it happens I have to constantly hit F3+A to reload the chunks, and it is better for a few minutes.
The GUI probably wasn't helping, but this issue goes far beyond that. I run my vanilla server with {{ -Xms2560M -Xmx2560M -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalPacing -XX:ParallelGCThreads=3 -XX:+AggressiveOpts -XX:MaxPermSize=256m -XX:PermSize=256m }}. After an indeterminate amount of time (but usually within a day of server start) we start getting "Can't keep up!" messages all over the place.
Given enough time this even happens with just one person logged into the server, and it gets drastically worse the more people are on. As I mentioned, kicking off a new JVM temporarily resolves the issue.
The issue is much more prevalent in 1.7.4 than it was in 1.7.2. I'm attributing this to the view-distance bug that was fixed. I hate to lower the server view-distance all the way to 8 though. We used to run the server at a view-distance of 15 on 1.6.x with no issues.
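For reference, the setting being discussed here is the view-distance line in server.properties, which takes effect on the next restart; the value below is just what is being tested in this thread, not a recommendation:
view-distance=8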