mojira.dev
MC-154629

server v1.14.2 running in kvm guest uses 100% of CPU on host

Running the 1.14.2 jar in a Debian 9.9 or 10 guest uses 100% of a core on the host when idle, with no players. With players, it uses at least 100% of a core. Running `top` on the guest shows the server using only 6 to 8 percent of the vCPU.

 

/proc/interrupts shows the local timer interrupts increasing at around 180K/sec on the guest and host.
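
One way to put a number on that interrupt rate is to sample the local timer row of /proc/interrupts twice and divide the difference by the interval. The following is only a sketch, assuming an x86 kernel where that row is labelled `LOC:`. The function names `loc_total` and `loc_rate` are made up for illustration and are not from the original report.

```shell
#!/bin/sh
# Sum the per-CPU "Local timer interrupts" (LOC) counters.
# Reads /proc/interrupts by default; another file can be passed for testing.
# Assumption: the row is labelled "LOC:" (x86); it prints nothing otherwise.
loc_total() {
    awk '$1 == "LOC:" {
        s = 0
        # Only the numeric per-CPU columns are added; the trailing
        # description words ("Local timer interrupts") are skipped.
        for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i
        print s
    }' "${1:-/proc/interrupts}"
}

# Print the average number of LOC interrupts per second over an interval
# given in whole seconds (default 5), summed across all CPUs.
loc_rate() {
    interval="${1:-5}"
    before=$(loc_total)
    sleep "$interval"
    after=$(loc_total)
    echo $(( (after - before) / interval ))
}
```

Calling `loc_rate 10` on the guest and again on the host, with the server idle, should give figures comparable to the rate quoted above.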

 

To reproduce:

1) On a Debian 9.9 system, craft a qemu-kvm guest with sufficient memory and disk

2) Download the 1.14.2 jar, start it with the flags given on the download page

3) Wait for the world to be created

4) Enjoy the "can't keep up" messages.

 

Profiling with visualvm seems to indicate a large majority of the time is spent in netty/epoll-wait. Really, like, almost all...

 

An strace, on the guest, of the most active thread just pours out the following:

 

futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=733}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=724}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=753}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=742}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=669}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=710}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=743}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=700}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=736}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=729}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f5be0a2d418, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f5be0a2d468, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=710}) = -1 ETIMEDOUT (Connection timed out)

 

Seems a rather short time to wait...
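
To summarize a longer capture than the excerpt above, the timeouts passed to `FUTEX_WAIT` can be counted and averaged from saved strace output. This is a rough sketch that assumes the output format shown in the trace. The function name `futex_wait_stats` is hypothetical, and the capture command in the comment is an assumption, since the report does not say how strace was invoked.

```shell
#!/bin/sh
# Count the timed FUTEX_WAIT calls in strace output and print their mean timeout.
# Reads the files given as arguments, or standard input when none are given.
# A capture could be made with something like (assumed, not from the report):
#   strace -f -e trace=futex -p "$PID" -o futex.log
futex_wait_stats() {
    awk '
        /FUTEX_WAIT/ && match($0, /tv_nsec=[0-9]+/) {
            # Skip the 8 characters of "tv_nsec=" to keep only the digits.
            ns = substr($0, RSTART + 8, RLENGTH - 8)
            sum += ns
            n++
        }
        END {
            if (n) printf "%d waits, mean timeout %d ns\n", n, sum / n
        }
    ' "$@"
}
```

The sketch only looks at `tv_nsec`, which is enough here because every wait in the trace has `tv_sec=0`.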

 

On the host side, the qemu-kvm profiling seems to indicate that the guest thread is spending all of its time servicing timer interrupts.


Comments

Till Mueller

@Lee Ward

I am aware that I am resurrecting a very old issue here, but has this problem ever been fully fixed for you? I am running into it with all Minecraft server versions >= 1.14 (KVM VM running Debian 9 / 10; host running Debian 9 / Ubuntu 19.04) and none of the reports marked as "resolved" here have offered a solution to this issue for me. I am kinda lost and very much hope that there has been some development concerning this issue?

Thank you very much in advance,
Till

Lee Ward

No. It is not fixed, nor mitigated at all. Honestly, I just got tired of going around and around with the devs on the issue. If I'm remembering correctly, the final substantial activity from them was when they tried to reproduce and couldn't. I suspect they tried to reproduce using a Linux guest on a Windows host, which will not reproduce the issue.

Really sorry. I just gave up. The overhead I incurred to test plus the lack of meaningful effort on the part of the Mojang developers just left me feeling super frustrated.

As an aside, my efforts led me to believe either the server is not using the Java Netty package in a nice way or that there is a bug in the Netty package. I checked the Netty bug reports and, yes, others are having the same problems with it. Still... see https://github.com/netty/netty/issues/5896. In that case, the reporting individual ditched it and wrote their own. Doesn't bode well for a fix in Minecraft then 😞

Till Mueller

First of all, sorry for not seeing your message sooner - I still have to figure out how to make this Jira instance send me emails.

Thank you so much for responding! It's unfortunate that this seems to have not been looked at properly, maybe I'll try to get another bug report raised when I have some time for it. You're probably right that this is somehow linked to the netty issue (thanks for sharing that!) so if the MC developers do not recognize this on their side maybe we'll have to resort to using some external fix if there ever is one (not sure if this is even possible, I am not familiar enough with these codebases).

Anyway, at least it's good to know that this issue is not due to me doing something blatantly wrong. However, I am wondering whether it might be possible to get one of the alternative Minecraft server projects (Spigot / Paper) to fix this issue? I did experience the same issue with every alternative I tried, but maybe they'll be more responsive - I'll have a look into that.

Again, thank you for sharing your knowledge with me. Should I figure something out I'll make sure to share it here.

Till

Krister Bäckman

Just signed up to comment that I'm running into this as well. Debian 10 as KVM host and Ubuntu 18.04 as VM. Running Minecraft 1.15.2 with the default-jre JVM.

Stuart Tuvey

Also experiencing the issue, Ubuntu 18.04 on both host and guest, hypervisor is KVM. MC 1.15.2 running on openjdk-8-jre-headless.

Stuart Tuvey

Following up, I was able to resolve my issue by switching the guest to FreeBSD 12. With this guest, a basic survival world created with default settings idles at ~6% CPU (observed from the host, not the guest) with no players logged in. The guest was created with virt-manager default settings, other than the amount of RAM and cores.

Lee Ward

Just to clarify, the host was unchanged, still Ubuntu 18.04?

Stuart Tuvey

Correct, the host was unchanged, Ubuntu 18.04.

Side note: I also tried an OpenBSD guest, but I found that even after fiddling with libvirt's clock settings I couldn't get the idle CPU usage (without Minecraft running, observed from the host) much below 20%, so I gave up on it.

El Santo

Hi!

 

Same problem here, running 64-bit Debian 10 on both host and guest.

 

Please fix this.

 

Thank you

El Santo

This is marked as solved, but the problem is not solved.

 

The problem is high CPU usage on host side, while low on guest side

 

Switching the guest OS to FreeBSD is not a solution either.

Arisa Bot

Thank you for your report!
We're tracking this issue in MC-149018, so this ticket is being resolved and linked as a duplicate.

That ticket has already been resolved as Fixed. The fix will arrive in the next version or is already included in the latest development version of the game, you can check the Fix Version/s field in that ticket to learn more.

If you haven't already, you might like to make use of the search feature to see if the issue has already been mentioned.

-- I am a bot. This action was performed automagically! Please report any issues in Discord or Reddit

Robert C.

This is still an issue as of 1.16.2. It also seems to have migrated to MC-183518 with additional information. It doesn't appear to be getting traction there either. I'm surprised considering how many MC servers are hosted on VPS. At the end of the day, it's just under $20 USD per year in wasted electricity for me, and having one fewer core to borrow idle time from in my host. In the meantime, I posted information in the comments on the above-linked issue.

Krister Bäckman

I have found a workaround for this. After discussion on the VFIO discord

https://discord.com/channels/244187921228234762/244190447147286529/790969366476095501

they suggested changing a KVM module parameter on the host: `kvm.halt_poll_ns=0`

You can apply it on the host at runtime as root with

```
echo 0 > /sys/module/kvm/parameters/halt_poll_ns
```
The default value on Arch Linux seems to be 200000.
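
A value written to /sys this way does not survive a reboot or a reload of the module. One common way to keep it is an `options` line under /etc/modprobe.d. The sketch below assumes `kvm` is built as a loadable module on the host. The file name `kvm-halt-poll.conf` and both function names are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Print the current halt_poll_ns value in nanoseconds.
# The path can be overridden, which is mainly useful for testing.
current_halt_poll_ns() {
    cat "${1:-/sys/module/kvm/parameters/halt_poll_ns}"
}

# Write a modprobe options file so the value is applied whenever the kvm
# module is loaded. Must run as root when targeting /etc/modprobe.d.
# Assumption: kvm is a loadable module; with a built-in kvm, the same
# setting would go on the kernel command line as kvm.halt_poll_ns=<value>.
persist_halt_poll_ns() {
    value="${1:-0}"
    dir="${2:-/etc/modprobe.d}"
    printf 'options kvm halt_poll_ns=%s\n' "$value" > "$dir/kvm-halt-poll.conf"
}
```

Running `persist_halt_poll_ns 0` as root on the host and then checking `current_halt_poll_ns` after the next reboot would confirm whether the setting stuck.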

Lee Ward

Krister's workaround, above, verified. Tested with 64 bit Debian 10.8 host and guest running stock kernels.

Guest usage fell from 100% to ~24%

Power usage on the host server decreased from ~113W to ~87W

Thank you very much, Krister!

 

Lee Ward

(Unassigned)

Unconfirmed

(Unassigned)

Minecraft 1.14.2
