momcilo

momcilo on MC-149018 2019-06-10T11:18:36Z

@Maxou I could not change timerslack_ns because of the old kernel on my 16.04 install, so I set up a fresh 18.04, installed mscs and rsynced the same world there.

Findings on hosting the server inside a VM (allocated 2 cores + 4 GB RAM):

CPU usage was still high on the freshly installed machine, which had only the minimum required to run the Minecraft server + mscs.
I set a timer slack of 200000 ns for the process using the CPU, and CPU usage inside the VM dropped to 10-14% (from 25%). However, the hypervisor itself was still using 35%, with all cores running at 2.5 to 3.1 GHz (I disabled TurboBoost for troubleshooting). Attempts to change the timer slack of VBoxHeadless through the same mechanism made no difference.
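
For anyone who wants to repeat the timer slack experiment, here is a minimal sketch of reading and writing the value through /proc. The helper names `get_timerslack` and `set_timerslack` and the `PROC_ROOT` override are made up for illustration; the timerslack_ns file itself only exists on kernels 4.6 and newer, and writing it for another user's process generally needs root.

```shell
#!/bin/sh
# PROC_ROOT is a hypothetical override (defaults to /proc) so the helpers
# can be pointed at a scratch directory for a dry run.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the current timer slack (in nanoseconds) of a process or thread ID.
get_timerslack() {
    cat "$PROC_ROOT/$1/timerslack_ns"
}

# Set the timer slack of a process or thread ID to the given nanoseconds.
# Returns 1 if the file is missing or not writable.
set_timerslack() {
    slack_file="$PROC_ROOT/$1/timerslack_ns"
    if [ ! -w "$slack_file" ]; then
        echo "cannot write $slack_file (old kernel or insufficient permissions?)" >&2
        return 1
    fi
    echo "$2" > "$slack_file"
}
```

With the example PID used elsewhere in this thread, `set_timerslack 8132 200000` would apply the 200000 ns value, and `get_timerslack 8132` would read it back. Writing 0 resets the value to the default.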

Clearly, setting the default timer slack is not a solution for the VM, so I tried running directly on the hardware itself:

I installed mscs, synced the world again and ran it without modifying timerslack_ns.
CPU usage went closer to idle and is now 13-14%, with all cores at a frequency below 1 GHz (finally!). The CPU temperature finally dropped.
I moved another world over (lots of redstone), and this one was running at 25% (all cores ~1.01 GHz).

So far we know that:

The issue affects VMs (at least VirtualBox-based ones, but possibly others).
The issue affects some CPUs, while others are unaffected. Mine is a modest 2-core (4-thread) Intel Core i7-5557U.

For me the issue is mitigated at the moment, since I moved to hosting directly on the hardware, but at this point we need the developers to look into it.

IMHO: 15-25% at 1.01 GHz for an empty server looks like too much.

momcilo on MC-149018 2019-06-09T07:20:27Z

@Maxou: as a user with the same problem, I am curious about the differences in your configurations. IMHO: having 2 different machines with different symptoms is perfect for troubleshooting.

Could you please check the content of the following file: /proc/${PID}/timerslack_ns (you need to find the actual PID of the running Minecraft server), e.g. if the PID is 8132:
```
PID=8132
cat /proc/${PID}/timerslack_ns
```

What else could be different between these machines (e.g. kernel options)?
Is a hypervisor running on either of the machines?
Do the processes run under a regular account or as root?
Perhaps the output of the following could be compared?
```
dpkg -l | sort
```
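
To narrow that comparison down, one option is to save a package list on each machine and print only the lines that differ. This is just a sketch: `compare_lists` and the file names are made up for illustration, and the `dpkg-query` format string in the comments is only one way to get a stable "name version" list.

```shell
#!/bin/sh
# compare_lists A B: print the lines that appear in only one of the two files.
compare_lists() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 1
    # comm needs both inputs sorted the same way, so re-sort under the C locale.
    LC_ALL=C sort "$1" > "$tmp_a"
    LC_ALL=C sort "$2" > "$tmp_b"
    # Column 1: only in A. Column 2 (tab-indented): only in B.
    LC_ALL=C comm -3 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# On each machine (file names are placeholders):
#   dpkg-query -W -f='${Package} ${Version}\n' > machine-a.txt
# Then, with both files copied to one machine:
#   compare_lists machine-a.txt machine-b.txt
```

Empty output means both machines have the same packages at the same versions.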

momcilo on MC-149018 2019-06-08T11:11:54Z

In addition, I found an interesting article dissecting the implementation of parkNanos():

https://hazelcast.com/blog/locksupport-parknanos-under-the-hood-and-the-curious-case-of-parking/

momcilo on MC-149018 2019-06-08T10:56:31Z

I hope this qualifies as 'constructive'. 😃

I made some additional tests.

1. On an Ubuntu 16.04 LTS VM running OpenJDK 1.8u191: ~20-25% CPU usage on idle. (The hypervisor host, an Intel NUC5I7RYH, was otherwise idle.)

2. I ran the same version of the server on a Windows 10 host with Oracle OpenJDK 1.8u181; the server went idle very soon after initialization.

3. I tried the same version of Oracle JDK on Ubuntu 16.04 and again saw ~20-25% CPU.

4. I enabled the JMX console on the Ubuntu server and connected to it from the Windows 10 host. I used the sampler to monitor CPU usage and found that the majority of CPU time is spent inside "Server Thread", namely in java.util.concurrent.locks.LockSupport.parkNanos(): over 90% in total.

5. I repeated step 4 on a Windows 10 laptop with an i7, and the majority of time was still in parkNanos(), but with far less CPU usage (~50%) and far less time spent in total!

Suggestions to the developers:

How many nanoseconds are you actually passing to parkNanos()?
Would it be possible to make this configurable?
Is there an alternative, since it appears this function hogs the CPU on Linux systems?
