mojira.dev

Barteks2x

Assigned

No issues.

Reported

MC-263430 Memory leak in TickingTracker (Fixed)
MC-251456 Matrix4f.invert is broken (Duplicate)
MC-235967 Block entity ticking checks chunk (target) ticket level instead of actual chunk ticking state when checking for ticking chunk, resulting in block entities being sometimes ticked in border chunks (Awaiting Response)
MCL-19244 Launcher randomly quits (Duplicate)
MC-229557 Mushroom generation very rare due to light check with no light calculated while placing mushroom feature (Duplicate)
MC-170134 Minecraft uses several times more VRAM than needed after exploring terrain for a while (Fixed)
MC-162522 Setting render distance to values >8 chunks freezes the game for >1 second (minutes for higher values) (Fixed)
MC-159361 Chunks near the edge of view distance aren't sent to client after movement (Confirmed)
MC-158988 Minecraft will remove up to several hundred chunks per region if region file isn't exact multiple of 4096 bytes (Fixed)
MC-141286 Teleporting big distance freezes the server (Fixed)
MC-128547 Minecraft keeps previously used World in memory if the player has hit any entity in that world (Fixed)
MC-121586 Watchdog thread kills the server if average tick time is larger than max-tick-time/(15*20) ms for too long (Fixed)
MC-120079 Chunks not rendering at certain positions when joining server where the chunks have been already loaded (Awaiting Response)
MC-109120 Parts of area under blocks set with /fill are still fully lit clientside (Duplicate)
MC-107257 Possible rare concurrency issue may cause changes in chunk to be reverted (Fixed)
MC-62558 Rendering empty chunks/cubes in 14w29b (or number of rendered chunks on debug screen (C) is wrong) (Fixed)

Comments

This could also be fully fixed by writing chunks to regions in batches instead of one by one, which I have implemented as a proof-of-concept mod (is it ok to link the code here?), while keeping the safety properties regarding data loss. It makes a huge difference with an HDD: the time to save the world can go down from minutes to the normal few seconds. This would likely also benefit servers, as it would allow actually using that option without being forced to have an SSD.
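The batching idea can be sketched as follows. This is a minimal model, not the actual proof-of-concept mod; all class and method names here are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of batched region writes (illustrative names, not the
// actual mod): chunk data is queued in memory and flushed in one pass,
// so an HDD does one sequential write per region instead of a seek and
// sync for every single chunk.
public class BatchedRegionWriter {
    private final Map<Long, byte[]> pending = new LinkedHashMap<>();

    public void queueChunk(int x, int z, byte[] data) {
        pending.put(((long) x << 32) | (z & 0xFFFFFFFFL), data);
    }

    // Returns how many chunks were written in this batch. A real
    // implementation would write all pending sectors to the region file
    // and fsync once, preserving the data-loss safety properties.
    public int flush() {
        int written = pending.size();
        // ... file I/O would happen here ...
        pending.clear();
        return written;
    }
}
```

The point of the design is that durability is per batch rather than per chunk: one fsync after the batch gives the same crash-safety guarantee at a fraction of the I/O cost.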

First, Minecraft no longer uses GL 1.1, and that issue was fixed long ago afaik. So no, this is not a possible cause. And the non-VBO rendering mode is no longer even available.

How did you check it? If you check the ticket level (getFullStatus()), that's the same thing it's doing: getting the target/ticket level. The actual "is ticking" check would be ServerChunkCache.isTickingChunk, which checks whether the ticking chunk future in ChunkHolder is complete and successful.
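The distinction can be modeled with a CompletableFuture. This is a toy model; the field and method names only mirror the vanilla classes and are not the real API:

```java
import java.util.concurrent.CompletableFuture;

// Toy model of the difference between the target (ticket) level and the
// actual ticking state. Names mirror, but are not, the vanilla classes.
public class ChunkHolderModel {
    public enum FullStatus { BORDER, TICKING }

    // What getFullStatus() reports: the *target* level from the ticket.
    public FullStatus targetStatus = FullStatus.TICKING;

    // The ticking future only completes once neighbouring chunks finish loading.
    public CompletableFuture<Void> tickingFuture = new CompletableFuture<>();

    // Mirrors what ServerChunkCache.isTickingChunk checks: the future must
    // be complete AND successful, not merely the target level being TICKING.
    public boolean isActuallyTicking() {
        return tickingFuture.isDone() && !tickingFuture.isCompletedExceptionally();
    }
}
```

In this model a chunk can report a TICKING target status while isActuallyTicking() is still false, which is exactly the window the report describes.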

They don't keep ticking in border chunks; they only tick in chunks that are about to become ticking as soon as loading finishes, but aren't ticking yet because some chunks are still loading. This is not going to be easy to see ingame without a debugger attached.

While I have not looked at it in a mod dev environment, I can confirm that the issue still exists in 1.16-pre4. I also recorded a video showing what I do ingame together with video memory usage: https://www.youtube.com/watch?v=kZ7_GPrkdUU

At 6:08 I reload all chunks and video memory usage immediately begins going down to where it started. Then I show a second cycle.

I can still reproduce it in 1.15 pre3, shown in this video https://www.youtube.com/watch?v=itf6gPlFTmc

While it eventually fixes itself after some waiting, I have to stay in the same place for a while for that to happen.

This appears to be the same or similar issue as MC-159361. And while the value that appears in the log may have been fixed, the effect seen ingame is not.

If the edge chunks are never rendered, why would they be part of render distance? Also, they sometimes do appear to be rendered, for example when switching from a higher render distance to a lower one. The client chunk cache value after changing from a lower to a higher render distance seems to confirm that: the value is lower than what the render distance would imply. For example, when changing from render distance 2 to 8 (after ensuring that the entire surrounding area is fully generated), the expected number of chunks would be 289, but it is 225 instead, corresponding to render distance 7.
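The numbers line up with the usual (2r+1)^2 count of chunks within render distance r:

```java
// For render distance r, the client chunk cache should hold a
// (2r+1) x (2r+1) square of chunks centred on the player.
public class ChunkMath {
    public static int cachedChunks(int renderDistance) {
        int side = 2 * renderDistance + 1;
        return side * side;
    }
}
```

cachedChunks(8) is 289 and cachedChunks(7) is 225, which is why an observed cache of 225 at render distance 8 points at an effective distance of 7.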

I also now verified that editing the code to bind 0 when the ID is negative solves this issue, but it then creates GL errors.

I haven't looked deeper into the code to find what change exactly makes it attempt to bind and use deleted buffers, but it might be a result of removing VBOs as a separate option, since I also found that in older versions this was actually the cause of changing render distance with VBOs disabled being slow. In older versions, when VBOs were enabled, MC didn't attempt to use deleted buffers, but when they were disabled, it did attempt to use already-deleted display lists (ID -1). Fixing it in 1.12.2 (it's probably exactly the same in 1.14.4) for display lists the same way (using 0 instead of -1) has the same effect: a few GL errors, but changing render distance is fast.
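The workaround described above amounts to a guard like this around the ID passed to the driver (a sketch; the exact method it belongs in differs between versions, and the class name here is illustrative):

```java
// Sketch of the "bind 0 instead of -1" workaround: a deleted buffer or
// display list is tracked internally as ID -1, and handing that value to
// the driver is what triggers the problem. Binding 0 ("no buffer") avoids
// it, at the cost of a few GL errors elsewhere.
public class BindGuard {
    public static int sanitize(int id) {
        return id < 0 ? 0 : id;
    }
}
```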

I think I figured out what the REAL issue is, and it's completely unlike what I thought it would be. And of course not without first getting very confused on a false track of what the issue is. A tl;dr of what is actually wrong is at the bottom; below follows the full story of how I discovered it.

First, I modified the 1.12.2 and latest snapshot code to track currently used GL buffer IDs. I noticed that in 1.12.2 the last ID is always a very large number, while in the latest snapshot it's just 1, 2, 3 and 4. This turned out to be completely irrelevant to the issue, but it put me on the right track.

I started looking at mesa source code, since this issue seems to exclusively affect mesa-based GPU drivers.

Grepping the code for glGenBuffers, I quickly found the implementation in src/mesa/main/bufferobj.c. From there we can follow the code into the _mesa_HashFindFreeKeyBlock() call in src/mesa/main/hash.c. Here we can see 2 code paths: one where if (maxKey - numKeys > table->MaxKey) holds, and another where this condition is false. Without fully understanding what this line did, my initial assumption was that the fast path is when the max allocated ID is already the maximum, and the slow path is otherwise. That turned out to be completely wrong.

Mesa actually tries to keep handing out sequential IDs as long as possible, and unless Minecraft somehow managed to exhaust all 32-bit integers, this should never reach the slow path. After a quick search through the code, we can find that the only other place where mesa writes to MaxKey is _mesa_HashInsert_unlocked, which is called, among other places, from glBindBuffer.

In order to see where this is called from, I modified the mesa code to check for a value close to the maximum 32-bit unsigned int and cause a segfault (because that generates a useful hs_err file with a Java stacktrace in it). With that change, I quickly crashed the JVM and got my answer: when binding a VertexBuffer during VBO upload, Minecraft ends up binding already-deleted VertexBuffers. This gives mesa an ID of -1, which ends up being converted to 4294967295. This doesn't break anything, but it has the effect of always triggering the slow path in _mesa_HashFindFreeKeyBlock, resulting in extremely slow render distance changes.

 

TL;DR:

Minecraft binds already-deleted VertexBuffers, to which it internally assigns an ID of -1. This changes the mesa driver's internal state in a way that permanently triggers a very slow code path in glGenBuffers.
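A toy model of why one bad bind poisons every later glGenBuffers call. This only models the MaxKey bookkeeping described above, not mesa's actual hash table:

```java
// Toy model of mesa's free-key logic: IDs are handed out sequentially
// while (maxKey - numKeys > table->MaxKey) holds. Binding ID 4294967295
// (-1 as unsigned) pushes MaxKey to the 32-bit maximum, so the fast-path
// test can never pass again and every glGenBuffers call falls into the
// slow free-key search.
public class HashTableModel {
    static final long UINT_MAX = 0xFFFFFFFFL;
    long maxKey = 0;   // highest key ever inserted (table->MaxKey analogue)
    long numKeys = 0;  // number of live entries

    void insert(long key) {
        numKeys++;
        if (key > maxKey) maxKey = key;
    }

    boolean genUsesFastPath() {
        return UINT_MAX - numKeys > maxKey;
    }
}
```

In this model, sequential inserts keep the fast path available essentially forever, while a single insert of the maximum key disables it permanently, matching the behavior observed in the segfault trace.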

After a bit of testing in a mod development environment with the latest snapshot, I was able to find a fix/workaround in GlStateManager. It's not exactly the best solution, but it works really well in practice and requires very few code changes:

 

private static final int GENBUFFERS_BATCH = 2048;
private static IntBuffer BUFFERS;

public static int genBuffers() {
    RenderSystem.assertThread(RenderSystem::isOnRenderThreadOrInit);
    // Refill the batch when empty: one glGenBuffers call produces
    // GENBUFFERS_BATCH IDs instead of one ID per call.
    if (BUFFERS == null || !BUFFERS.hasRemaining()) {
        if (BUFFERS == null) {
            BUFFERS = BufferUtils.createIntBuffer(GENBUFFERS_BATCH);
        }
        BUFFERS.rewind();
        GL15.glGenBuffers(BUFFERS);
        BUFFERS.rewind();
    }
    return BUFFERS.get();
}

This confirms that the issue is the huge number of glGenBuffers calls, but it's still unclear to me why it's a performance issue to begin with, especially since glDeleteBuffers is not an issue at all.

A better solution would be to allocate and delete all the chunk buffers in a single glGenBuffers/glDeleteBuffers call, but that doesn't fit nicely with the way this code is written.

 

I also verified that this code hasn't changed in any significant way since 1.12.2, so I really have no idea what has changed here that suddenly makes this code path so much slower.

The issue still persists with 19w45b. The VisualVM sampler still shows the same parts of the code involved. But now it correctly shows GPU information on the F3 screen.

This time, again, it didn't occur on the first attempt, so initially I almost thought it was fixed. The first 1 or 2 times I set the render distance, it works just fine even at values as high as 22, but after that it still takes well over a second to set it to 8 or 10, and whole minutes to set it above 16 chunks.

 

Aside from an Intel GPU on Linux, I was also able to reproduce this issue with the Linux nouveau drivers on an nvidia GPU (a GeForce GT 740M). On the same computer and OS, the issue doesn't occur when using the official nvidia drivers.

This has nothing to do with block entities. This happened in 1.12.2 when VBOs were disabled. Now it happens almost always. I haven't analyzed the code yet, but it's most likely due to the Intel drivers on Linux taking much more time to create buffers (or at least MC triggering some slower code path there).

I'm quite sure this no longer applies to 1.14.x, as a lot of the code is different now, but I haven't verified it yet.

It's probably easiest to see the effects at render distance 3. Normally, when all chunks are loaded and rendered on the client, the farthest parts of the terrain should be deep within fog.

To reproduce it:

Create a new world and set render distance to 3 (it works with any render distance, but is easiest to see with 3), then go far (a few hundred blocks) away from the spawn area. As you walk around, you will see new chunks being loaded, and the edge of loaded terrain will be clearly visible, not in fog, sometimes getting almost as close as 1 chunk away.

 

To see the correct behavior, set view distance to 7, unpause, wait a while, and set it back to 3, without moving. The edge of loaded terrain will be hidden in fog, noticeably farther away.

No, the linked issue is a completely different scenario, one that I myself thought was actually working as intended, and that behavior hasn't changed... ever since multiplayer support was added to MC. The server doesn't send chunks outside of the server view distance (obviously), and since the server controls unloading chunks on the client, it also unloads chunks that may otherwise be within client view distance. That's an obvious consequence of the way the game works.

This issue is about chunks within view distance not being sent at the correct view distance in singleplayer (and in some ways also on a server), when the client and server have the same view distance set.

 

The main issue is that chunks appear to be sent to the client only within "(server?) view distance - 2" chunks, while chunk unloading happens only at "(client?) view distance + 1". It looks like the server only sends chunks once they are in a state where entities can be updated, which requires 32 blocks to be loaded in each direction. So when moving around, even in already generated areas, the player will never be able to see the full render distance.
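In numbers (based on the observed behavior above; the formulas are my inference, not confirmed from the code):

```java
// Inferred from the observed behavior, not confirmed in the code: chunks
// are sent within viewDistance - 2 of the player, but only unloaded beyond
// viewDistance + 1, so viewDistance - 2 acts as the effective visible
// radius while the player is moving.
public class ChunkRadii {
    public static int sendRadius(int viewDistance)   { return viewDistance - 2; }
    public static int unloadRadius(int viewDistance) { return viewDistance + 1; }
}
```

For view distance 10, that would mean chunks appear within 8 chunks of the player but linger until 11 chunks away, which matches the asymmetry described above.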

I just realized that it may be a duplicate, as I didn't search for similar issues in 1.14.x yet. I first noticed it in 19w34a and only searched for issues in that version, and confirmed it for 1.14.4 while writing this issue report.

This appears to be fixed in 1.13.

 

The memory leak from WorldGenBigTree is actually fixed by MinecraftForge; that's likely why I didn't notice it when first testing and analyzing with Forge. Only later did I confirm the issue with vanilla and find that hitting an entity doesn't always fix it.

17w45b makes no difference for me. It still consumes all the available memory.

It's an issue in the same part of the code, but a different issue. The fix suggested in MC-121196 could also fix this one (but I haven't fully analyzed the code yet).