mojira.dev
MC-150824

1.14 Server using 100% of CPU and RAM

I've seen other reports of this, but it is NOT resolved. Please don't mark this as a duplicate, because it is very much an issue, and a HUGE one at that, affecting every single server I have come across in this version. There is so much rubber-banding and server lag because, even when no one is playing, 100% of a CPU core is being used! Please fix this! It's unplayable!

Comments 26

Damien Boyington

Happens to me all the time, even when nobody's on.

I use the JDK and force MC to use all cores, and it is still constantly ~1,200-15,000 ms behind. It averages around 2,000-3,000 ms behind when no one is logged in. The server has 6 cores and 12 GB of RAM, and uses ~30% CPU with no one on. It does not scale at all.

Cannot reproduce. What JVM arguments are you using? Can you provide a debug profile?

/debug start

(wait a little while)

/debug stop

Then attach the report you'll find in the debug folder.

Here is my debug report. One core of my CPU (8 cores / 16 threads) sits at 100% all the time.

[media]

@unknown Try disabling the server's GUI with the nogui option.
Also, what are your JVM arguments?

David Chamberlin

The bug I'm referring to in SystemUtils is that it can never reach the code that uses the direct executor service instead of the fork-join pool in the single-core case, because the number of available cores minus one is clamped to a value from 1 to 7. At least, it was that way in 1.14.3.

If I recall correctly, it would stall at 0% when creating the spawn areas, waiting for the main thread to finish, because it couldn't allocate a worker thread in the pool on a single core.
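To make the clamping argument concrete, here is a minimal sketch of the arithmetic as described above. It is not the Minecraft source: the class and method names are hypothetical, and the "adjusted" variant is only one assumed way the lower bound could be changed so that the single-core branch becomes reachable.

```java
// Hypothetical illustration only; none of these names come from the Minecraft code base.
final class ExecutorClampSketch {

    // Restrict value to the inclusive range [min, max].
    static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }

    // Behaviour as described in the comment: (cores - 1) clamped to 1..7.
    static int workersAsDescribed(int availableProcessors) {
        return clamp(availableProcessors - 1, 1, 7);
    }

    // Assumed alternative: allow the lower bound to reach 0.
    static int workersWithZeroAllowed(int availableProcessors) {
        return clamp(availableProcessors - 1, 0, 7);
    }

    // The direct-executor fallback is only chosen when no worker threads are requested.
    static boolean takesDirectExecutorBranch(int workers) {
        return workers <= 0;
    }

    public static void main(String[] args) {
        for (int cores : new int[] {1, 2, 8, 16}) {
            int described = workersAsDescribed(cores);
            int adjusted = workersWithZeroAllowed(cores);
            System.out.printf(
                "cores=%d described=%d direct=%b | adjusted=%d direct=%b%n",
                cores, described, takesDirectExecutorBranch(described),
                adjusted, takesDirectExecutorBranch(adjusted));
        }
    }
}
```

With a lower bound of 1, a single reported core still yields one worker, so the fallback check can never succeed. With a lower bound of 0, it can.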

David Chamberlin

But the more relevant concern is whether the availableProcessors() method on that particular JVM under your VM is reporting the correct value (all the time). It has been reported that on some VMs it can report only one core, or report the hardware number of cores rather than the number allocated to the VM.
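One way to check this on the guest is to ask the JVM directly. The following is a small sketch, with a hypothetical class name, that samples the value a few times so it can be compared against the number of vCPUs allocated to the VM. It should be run with the same JVM the server uses.

```java
// Hypothetical helper for checking what the JVM reports inside the VM.
final class ProcessorCountCheck {

    // The standard Java API call the comment refers to.
    static int reportedProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample several times, since the concern is whether the value is stable.
        for (int i = 0; i < 5; i++) {
            System.out.println("availableProcessors() = " + reportedProcessors());
            Thread.sleep(1000);
        }
    }
}
```

If the reported value does not match the VM's allocation, recent HotSpot JVMs accept `-XX:ActiveProcessorCount=<n>` to override it. Whether that changes the behavior reported here is untested.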

Well, I don't know, of course. I don't have access to the server code. The server does not appear to stall at startup, though. World generation seems to proceed smoothly in the reproducer above. Restarting an existing production 1.14.3 server also seems to proceed smoothly. For that same existing server, the symptoms on the virtual guest include:

1) Occasional "can't keep up" messages when the server is completely idle, and a veritable barrage of them when players are present.

2) On the production guest, I configure the CPU as pass-through and include the TSC clock. With that configuration, the guest kernel reports that the TSC is behaving badly enough to be unusable and switches to the HPET timer-counter.

3) Persistent lag and rubber-banding behaviors during game play. Network does not seem to matter. It happens to players on long-haul links and on the (10 Gb/s wired) LAN.

On the host side, when the server is idle CPU load hovers around 100 to 103 percent. It goes up only a small amount when players are present.

Most of those symptoms are displayed by the reproducer I gave. I tried to simplify the reproducer by using the generic qemu CPU (Haswell) and I didn't enable the TSC. Also, I only tried game-play from the wired LAN.

Do those symptoms sound like they might be related to this JVM availableProcessors() bug you are talking about? If so, do you have a workaround you would care to share?

David Chamberlin

Hard to say if they are related. HPET is per system and higher precision; it uses more resources, but allows for better sync when using multiple cores. TSC, by contrast, is per CPU and faster, and synchronizes across all cores on Nehalem+ CPUs.

Might want to try both HPET standalone, and TSC with HPET as a backup. 

If your core is spending a lot of time servicing interrupts, then changing these could help.

Also, not sure if it will help, but there is a difference between running a server on an SMP kernel, a preempt kernel, or an actual realtime kernel, as these also affect interrupts and their priorities.

Have you tried -pre6 yet? I've heard other reports that some saw a performance improvement going from pre5 to pre6.

I did that. The production VM is configured with both clocks. Very quickly after the Minecraft server starts, the guest kernel reports that the TSC is unusable and switches to HPET. The test VM is configured with HPET only. Both configurations reliably reproduce the problems.
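For anyone comparing configurations, the clock source the guest kernel has settled on can be read from sysfs on Linux. This is a minimal sketch with a hypothetical class name. It assumes the standard Linux sysfs clocksource directory and reports "(unavailable)" if the files cannot be read, for example on another operating system.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper; the sysfs location is the usual one on Linux guests.
final class ClockSourceCheck {

    static final Path SYSFS_DIR =
        Paths.get("/sys/devices/system/clocksource/clocksource0");

    // Read a small text file and trim it; empty if it cannot be read.
    static Optional<String> readTrimmed(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Clock source currently in use, e.g. after the kernel abandons the TSC.
        System.out.println("current:   "
            + readTrimmed(SYSFS_DIR.resolve("current_clocksource")).orElse("(unavailable)"));
        // Clock sources the kernel considers usable on this guest.
        System.out.println("available: "
            + readTrimmed(SYSFS_DIR.resolve("available_clocksource")).orElse("(unavailable)"));
    }
}
```

Running it before and shortly after the server starts would show whether the switch from TSC to HPET coincides with server startup, as described above.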

Tyler Kim

(Unassigned)

Confirmed

Performance

lag, multiplayer

Minecraft 1.14, Minecraft 1.14.2
