Noise to biome mapping depends on previous sampling position
While doing a code analysis of the new biome generation, I ran into inconsistent biome placement with the MultiNoiseBiomeSource. This will cause discrepancies between worlds with the same seed, which are likely to cause headaches in the future. For example, this affects structure generation near biome borders, and a fix will completely change outer stronghold positions! (Strongholds are highly dependent on biome positions, and each stronghold depends on those generated before it.)
The problem is easiest to demonstrate with a small test function:
public void inconsistentBiomeSampling()
{
    // Get an overworld biome source
    MultiNoiseBiomeSource bs = MultiNoiseBiomeSource.Preset.OVERWORLD
            .getBiomeSource(BuiltinRegistries.BIOME);

    // Get a noise sampler for a seed
    long seed = 7;
    ChunkGeneratorSettings gen = ChunkGeneratorSettings.getInstance();
    NoiseColumnSampler noise = new NoiseColumnSampler(
            gen.getGenerationShapeConfig(),
            gen.hasNoiseCaves(),
            seed,
            BuiltinRegistries.NOISE_PARAMETERS,
            gen.getRandomProvider());

    // Example of biome coordinates for this seed where the sampling
    // is inconsistent.
    int x = 116, y = 15, z = 0;

    // First sample the biome when no previous result is cached.
    // (This is a local optimum.)
    Biome b1 = bs.getBiomeAtPoint(noise.sample(x, y, z));

    // Now sample a position nearby that may be a better fit to the
    // first sampling point.
    bs.getBiomeAtPoint(noise.sample(x, y, z+1));

    // Checking the first sampling point again gets the cached result
    // from the last call, if that node was a better fit.
    Biome b2 = bs.getBiomeAtPoint(noise.sample(x, y, z));

    if (b1 != b2) {
        System.out.println(
                "seed:"+seed+"@["+x+","+y+","+z+"] "+
                "can have biomes: "+b1+" or "+b2);
    }
}
Which outputs:
seed:7@[116,15,0] can have biomes: minecraft:ocean or minecraft:cold_ocean
The inconsistency arises from the SearchTree that is used to map noise points to biomes. In its current form, it does not always find the biome with the smallest distance in the noise parameter space. This may be a sacrifice made for performance, which is not pretty, but not necessarily a problem as long as there remains a unique mapping from noise values to biomes. However, a second optimization caches the resulting node from the last call and offers it as a candidate for the next lookup. Reusing this node is incorrect: the cached node may be closer to the sampled noise point than the node the SearchTree would find on its own, so the biome returned for a given point depends on which point was sampled before it.
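The argument can be reproduced in a toy model. The following is only a sketch under stated assumptions: the class and method names are hypothetical, the leaves are 2D points, and the greedy descent is a deliberately crude stand-in for a search that can miss the nearest leaf. It is not the game's algorithm.

```java
import java.util.List;

// Toy model only: all names are hypothetical, nothing here is game code.
class CachedSearchDemo {
    // A node is an axis-aligned box in 2D; a leaf is a single point with a value.
    static class Node {
        final long[] min, max;
        final String value;         // null for branch nodes
        final List<Node> children;  // empty for leaves

        Node(String value, long x, long y) {
            this.min = new long[]{x, y};
            this.max = new long[]{x, y};
            this.value = value;
            this.children = List.of();
        }

        Node(Node... kids) {
            this.min = new long[]{Long.MAX_VALUE, Long.MAX_VALUE};
            this.max = new long[]{Long.MIN_VALUE, Long.MIN_VALUE};
            this.value = null;
            this.children = List.of(kids);
            for (Node k : kids) {
                for (int i = 0; i < 2; i++) {
                    min[i] = Math.min(min[i], k.min[i]);
                    max[i] = Math.max(max[i], k.max[i]);
                }
            }
        }

        // Squared distance from point p to this node's box.
        long distance(long[] p) {
            long sum = 0;
            for (int i = 0; i < 2; i++) {
                long d = Math.max(0, Math.max(min[i] - p[i], p[i] - max[i]));
                sum += d * d;
            }
            return sum;
        }
    }

    // Assumption: an inexact search that always descends into the closest box
    // and never backtracks, so it can end in a local optimum.
    static Node greedy(Node node, long[] p) {
        while (!node.children.isEmpty()) {
            Node best = null;
            for (Node c : node.children) {
                if (best == null || c.distance(p) < best.distance(p)) best = c;
            }
            node = best;
        }
        return node;
    }

    static class Tree {
        final Node root;
        final boolean useCache;
        Node previous; // result of the last lookup

        Tree(Node root, boolean useCache) {
            this.root = root;
            this.useCache = useCache;
        }

        String get(long x, long y) {
            long[] p = {x, y};
            Node found = greedy(root, p);
            // The cached node wins whenever it happens to be closer.
            if (useCache && previous != null && previous.distance(p) < found.distance(p)) {
                found = previous;
            }
            previous = found;
            return found.value;
        }
    }

    static Tree demoTree(boolean useCache) {
        Node a = new Node(new Node("a0", 0, 0), new Node("a1", 10, 0));
        Node b = new Node(new Node("b0", 6, 1), new Node("b1", 6, 9));
        return new Tree(new Node(a, b), useCache);
    }

    public static void main(String[] args) {
        Tree t = demoTree(true);
        String first = t.get(7, 0);  // greedy search ends in "a1"
        t.get(6, 1);                 // caches "b0"
        String again = t.get(7, 0);  // cached "b0" is closer than "a1"
        System.out.println(first + " then " + again); // prints "a1 then b0"
    }
}
```

With useCache set to false, the same three lookups return "a1" for the point (7, 0) both times. The mapping is then still not the nearest leaf, but it is a function of the point alone.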
Below is a graphic with a 2D example outlining why this is the case.
[media]
The relevant methods are located in MultiNoiseUtil.java:
public T get(NoiseValuePoint point, NodeDistanceFunction<T> distanceFunction) {
    long[] ls = point.getNoiseValueList();
    // Caching previousResultNode is problematic
    TreeLeafNode<T> treeLeafNode = this.firstNode.getResultingNode(ls, this.previousResultNode.get(), distanceFunction);
    this.previousResultNode.set(treeLeafNode);
    return treeLeafNode.value;
}

@Override
protected TreeLeafNode<T> getResultingNode(long[] otherParameters, @Nullable TreeLeafNode<T> alternative, NodeDistanceFunction<T> distanceFunction) {
    // Suggestion: the distance for 'alternative' is already known by the recursive caller and could be passed as an argument.
    long l = alternative == null ? Long.MAX_VALUE : distanceFunction.getDistance(alternative, otherParameters);
    TreeLeafNode<T> treeLeafNode = alternative;
    for (TreeNode<T> treeNode : this.subTree) {
        long n;
        long m = distanceFunction.getDistance(treeNode, otherParameters);
        if (l <= m) continue;
        TreeLeafNode<T> treeLeafNode2 = treeNode.getResultingNode(otherParameters, treeLeafNode, distanceFunction);
        long l2 = n = treeNode == treeLeafNode2 ? m : distanceFunction.getDistance(treeLeafNode2, otherParameters);
        if (l <= n) continue;
        l = n;
        treeLeafNode = treeLeafNode2;
    }
    return treeLeafNode;
}
Remark:
The bug primarily causes different parts of the code to disagree about the biome at a given position, especially near biome borders. Among other things, this has the potential to make the population of chunks depend on the order in which the world is generated. However, the biomes stored in the anvil chunk storage will most likely be consistent (although wrong) across multiple generations of the same world, since the generator iterates over the biome points inside each chunk in the same order.
Comments 7
Is this related to large biomes world seeds not matching their default counterparts?
No, this just causes the game to think that a location has a different biome, depending on which part of the code is looking. Also, it mostly affects just the location of biome borders, not the generation as a whole.
Regarding the large biomes in 1.18-pre1, the random number generators now get initialized with an MD5 hash of the generator name. This noise generator name is "minecraft:temperature" for normal generation and "minecraft:temperature_large" with large biomes and similar for the other noise parameters. This means the noise generation is completely different for large biomes.
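To illustrate why a different name means unrelated noise, here is a minimal sketch that derives seed material from the MD5 hash of a generator name. The class name, the helper seedFromName, and the choice to read the 16-byte digest as two longs are assumptions for illustration; how the game actually turns the hash into generator state, and how it combines it with the world seed, is not shown here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not game code.
class NameSeedDemo {
    // Hash the name with MD5 and read the 16-byte digest as two longs.
    static long[] seedFromName(String name) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(name.getBytes(StandardCharsets.UTF_8));
            ByteBuffer buf = ByteBuffer.wrap(digest);
            return new long[]{buf.getLong(), buf.getLong()};
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long[] normal = seedFromName("minecraft:temperature");
        long[] large = seedFromName("minecraft:temperature_large");
        System.out.println(Arrays.toString(normal));
        System.out.println(Arrays.toString(large));
        System.out.println("same seed material: " + Arrays.equals(normal, large));
    }
}
```

Because the two names hash to unrelated values, any generator seeded this way produces unrelated noise for the two settings, even when the world seed is identical.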
Added a warning that this has a major effect on stronghold positions and should preferably be fixed before the main release.
I do not understand why the Mojang Priority of this ticket is Low while MC-55596 is Important, since both describe very similar issues.
Relates to MC-55596