This is more of a theoretical issue and maybe not so relevant for practical purposes. I mainly post this because any bug makes it hard or impossible to properly reason about code, and because this issue is automatically fixed by my proposed solution for MC-177685, so it is mainly intended as a supplement to that report.
The issue can occur when a player chunk ticket gets scheduled for removal and, shortly after, another player chunk ticket gets scheduled for addition for the same chunk. The problem is that the scheduled removal will cancel some pending additions at the time the removal is executed.
net.minecraft.server.world.ChunkTicketManager.java
private void updateTicket(long pos, int distance, boolean oldWithinViewDistance, boolean withinViewDistance) {
    if (oldWithinViewDistance != withinViewDistance) {
        ChunkTicket<?> chunkTicket = new ChunkTicket(ChunkTicketType.PLAYER, ChunkTicketManager.NEARBY_PLAYER_TICKET_LEVEL, new ChunkPos(pos));
        if (withinViewDistance) {
            // addition path: re-checks the view distance right before actually adding the ticket
            ChunkTicketManager.this.playerTicketThrottler.send(ChunkTaskPrioritySystem.createMessage(() -> {
                ChunkTicketManager.this.mainThreadExecutor.execute(() -> {
                    if (this.isWithinViewDistance(this.getLevel(pos))) {
                        ChunkTicketManager.this.addTicket(pos, chunkTicket);
                        ChunkTicketManager.this.chunkPositions.add(pos);
                    } else {
                        ChunkTicketManager.this.playerTicketThrottlerSorter.send(ChunkTaskPrioritySystem.createSorterMessage(() -> {
                        }, pos, false));
                    }
                });
            }, pos, () -> {
                return distance;
            }));
        } else {
            // removal path: the 'true' flag cancels all pending tasks at pos when this message is executed
            ChunkTicketManager.this.playerTicketThrottlerSorter.send(ChunkTaskPrioritySystem.createSorterMessage(() -> {
                ChunkTicketManager.this.mainThreadExecutor.execute(() -> {
                    ChunkTicketManager.this.removeTicket(pos, chunkTicket);
                });
            }, pos, true));
        }
    }
}
The problematic part is basically

ChunkTicketManager.this.playerTicketThrottlerSorter.send(ChunkTaskPrioritySystem.createSorterMessage(() -> { ... }, pos, true));

which will remove all pending tasks at the same chunk residing in a LevelPrioritizedQueue at the time this task is executed on the ChunkTaskPrioritySystem.
As a result, if a re-addition of a player chunk ticket is scheduled before earlier removals have finished their execution on the ChunkTaskPrioritySystem, this can lead to the re-addition being canceled and hence to a loss of chunk tickets.
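To make the cancelling behaviour concrete, here is a minimal single-threaded model of what a sorter message with the clear flag does. All names (SorterModel, enqueue, handleSorterMessage, clearPending) are made up for illustration and do not match the actual game code:

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model of pending per-position tasks, standing in for a LevelPrioritizedQueue.
class SorterModel {
    private final Map<Long, Queue<Runnable>> pending = new HashMap<>();

    // corresponds to sorting an addition task into the queue
    void enqueue(long pos, Runnable task) {
        pending.computeIfAbsent(pos, p -> new ArrayDeque<>()).add(task);
    }

    // corresponds to a sorter message with a clear flag: with clearPending == true,
    // every pending task at pos is dropped, including a queued re-addition that
    // has already reached this stage
    void handleSorterMessage(long pos, Runnable action, boolean clearPending) {
        if (clearPending) {
            pending.remove(pos);
        }
        action.run();
    }
}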
As it turns out, this is actually quite hard to achieve. The argument for this, however, requires some deeper knowledge of the internal ordering of the ChunkTaskPrioritySystem and is hence, in my opinion, not really well suited for simple arguments about code correctness. Furthermore, there will be a slight bug in the following argument.
The problematic case occurs when a chunk ticket gets scheduled for addition after a scheduled removal. So, let's consider such a situation.
1. When the ChunkTaskPrioritySystem looks at the addition task, it does not directly execute it, but first sorts it into a LevelPrioritizedQueue. Before that point, the task resides in some simpler prioritized queue together with the removal task.
2. By the internal priorities of this simple queue, the ChunkTaskPrioritySystem will look at the scheduled removal before it looks at the scheduled addition, because it has a higher priority and was added earlier. (This is the buggy step in the argument!)
3. When it looks at the scheduled removal, it will directly execute its associated action (the removal of the chunk ticket) and furthermore remove all pending tasks in the LevelPrioritizedQueue. However, at this point the addition task still resides in the simple queue and not in the LevelPrioritizedQueue, so it will actually not be removed.

Hence, everything is actually fine and works as it should. (A small simulation of this pipeline follows below.)
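The following toy simulation walks through exactly this interleaving under the (flawed) ordering assumption; the two stages and all names are illustrative only, not the real implementation:

import java.util.ArrayDeque;
import java.util.PriorityQueue;
import java.util.Queue;

public class TwoStageDemo {
    // priority first, then insertion order, as the flawed argument assumes
    record Task(int priority, long seq, String name, boolean clearsStage2) {}

    public static void main(String[] args) {
        PriorityQueue<Task> stage1 = new PriorityQueue<>((a, b) ->
            a.priority() != b.priority()
                ? Integer.compare(a.priority(), b.priority())
                : Long.compare(a.seq(), b.seq()));
        Queue<Task> stage2 = new ArrayDeque<>(); // stands in for the LevelPrioritizedQueue

        stage1.add(new Task(0, 0, "removal", true));   // higher priority, added first
        stage1.add(new Task(1, 1, "addition", false)); // lower priority, added later

        while (!stage1.isEmpty()) {
            Task t = stage1.poll();
            if (t.clearsStage2()) {
                stage2.clear(); // clears only stage 2; the addition still sits in stage 1
                System.out.println(t.name() + " executed, stage 2 cleared");
            } else {
                stage2.add(t);  // the addition is sorted into stage 2 afterwards
                System.out.println(t.name() + " sorted into stage 2");
            }
        }
        System.out.println("tasks surviving in stage 2: " + stage2.size()); // prints 1
    }
}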
Now, as pointed out, the bug lies within the following false statement:
If a task T1 gets added to a TaskQueue.Prioritized before a task T2 and it has a strictly higher priority, then it gets returned by poll() before task T2.
The problem is that while the individual queues for each priority level are thread-safe, the collection as a whole is not, in the sense described here, i.e. it does not satisfy some basic PRAM consistency (or whatever you like to call it).
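The following sketch shows the gap: two individually thread-safe queues scanned in level order. The names are hypothetical and only model the structure described above, not the real TaskQueue.Prioritized:

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class PrioritizedModel<T> {
    private final Queue<T> highPriority = new ConcurrentLinkedQueue<>();
    private final Queue<T> lowPriority = new ConcurrentLinkedQueue<>();

    void add(T task, boolean high) {
        (high ? highPriority : lowPriority).add(task);
    }

    // Each queue is thread-safe on its own, but the scan over both is not
    // atomic. Interleaving that breaks the statement above:
    //   1. the consumer calls poll() and finds highPriority empty
    //   2. a producer adds T1 to highPriority, then T2 to lowPriority
    //   3. the consumer continues and returns T2 from lowPriority
    // T1 was added first with strictly higher priority, yet T2 is polled first.
    T poll() {
        T task = highPriority.poll();
        return task != null ? task : lowPriority.poll();
    }
}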
So, this makes the above argument invalid and allows it to be broken with some very precise timing. However, due to this precise timing requirement, it is not really possible to break multiple chunk tickets at once, so there will always be enough tickets around to hide the effect. Hence, I cannot really provide concrete steps to visualize the issue.
As stated in the beginning, this is more of a theoretical consideration for arguing about code correctness and a supplement to MC-177685.
The case where a removal gets scheduled after an addition, on the other hand, looks unproblematic, as this case explicitly checks whether the ticket should still be added right before doing so.
As a proposed solution, I think performing a similar check right before the removal of a ticket, and removing the code that cancels all pending tasks, should solve the issue (see the sketch below).
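A minimal sketch of that proposal, as a modification of the else branch of the updateTicket method shown above. The re-check mirrors the one on the addition path; whether isWithinViewDistance is exactly the right condition here is an assumption on my part:

} else {
    ChunkTicketManager.this.playerTicketThrottlerSorter.send(ChunkTaskPrioritySystem.createSorterMessage(() -> {
        ChunkTicketManager.this.mainThreadExecutor.execute(() -> {
            // re-check right before removing, mirroring the check on the addition path
            if (!this.isWithinViewDistance(this.getLevel(pos))) {
                ChunkTicketManager.this.removeTicket(pos, chunkTicket);
            }
        });
    }, pos, false)); // false: no longer cancels pending tasks at this chunk
}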
However, I prefer the solution of removing the ChunkTaskPrioritySystem used in the ChunkTicketManager altogether, as proposed in MC-177685, which will then automatically solve this issue as well.
Best,
PhiPro