Minecraft stores all loaded chunks in a hash table. The hash function used starts by XOR'ing the chunk's X and Z coordinates, so that many nearby chunks will get the same hash. This causes chunk lookups - and block updates, block lookups, tile entity lookups, etc - to take O(sqrt(N)) time instead of O(1), where N is the number of loaded chunks.
For a server with 10 players and a view-distance of 10, there will be 4410 loaded chunks, and chunk lookups will be up to 66 times slower than they should be, or 20 times slower in the best case.
Comments 13
after X ^ Z the rest of the function is:
private static int hash(int par0)
{
par0 ^= par0 >>> 20 ^ par0 >>> 12;
return par0 ^ par0 >>> 7 ^ par0 >>> 4;
}
X and Z are ints. I haven't tested any alternative options.
Is this still a concern in the current Minecraft version 1.6.2 / Launcher version 1.1.3 ? If so, please update the affected versions in order to best aid Mojang ensuring bugs are still valid in the latest releases/pre-releases.
Is this still a concern in the current Minecraft version 1.7.2 / Launcher version 1.3.4 ? If so, please update the affected versions in order to best aid Mojang ensuring bugs are still valid in the latest releases/pre-releases.
Well, Grum's presence in this thread changes the "unlikely to be fixed" part. I still think it should be Mojang's job to see if bug reports are still valid.
The problem with X^Z is that nearby chunks tend to get the same hash. X^Z is the first stage of the hashing function, so no matter how good the rest of the function is, chunks 0,5 and 1,4 and 2,7 and 3,6 and 4,1 and 5,0 and 6,3 and 7,2 will always get the same hash.
A Python simulation of the existing hash function on a 65x65 chunk area centred on random coordinates showed that this creates an 8192 slot hashmap, with 272 slots used, and an average chain length of 15. If centred around the (0,0) chunk instead, the average chain length is 25.6.
Using Grum's suggestion to replace X^Z, the average chain length was 1.03 for a 65x65 area, 4.08 for a 257x257 area, and 8.14 for a 513x513 area.
Using Brandon's suggestion to replace X^Z, the average chain length was 1.28 for a 65x65 area, 1.31 for a 257x257 area, and 1.32 for a 513x513 area, .
I doubt anyone has a loaded region of 257x257 chunks (or even 65x65), so while Brandon's hash function generates a low, predictable rate of collisions (edit: bucket collisions, not hash collisions), Grum's may be a bit better for typical Minecraft usage. Grum's is also simpler.
After some testing we'll be using the normal LCG one.
private static final int HASH_A = 0x19660d;
private static final int HASH_C = 0x3c6ef35f;
private static final int HASH_Z_XOR = 0xDEADBEEF;
@Override
public int hashCode() {
final int xTransform = HASH_A * x + HASH_C;
final int zTransform = HASH_A * (z ^ HASH_Z_XOR) + HASH_C;
return xTransform ^ zTransform;
}
You may be modifying the wrong method. Chunk coordinates are hashed in the (MCP names) LongHashMap.getHashedKey and LongHashMap.hash methods. MCP code follows:
private static int getHashedKey(long par0)
{
return hash((int)(par0 ^ par0 >>> 32)); // <-- this is the part where I noticed the inefficient hashing. Every chunk lookup uses this hash function.
}
private static int hash(int par0)
{
par0 ^= par0 >>> 20 ^ par0 >>> 12;
return par0 ^ par0 >>> 7 ^ par0 >>> 4;
}
Don't undo what you did already, but make sure to change it in LongHashMap (above) as well.
This might be an effect, but:
Seed 1, render distance 8, travel from spawn to x=2000, v1.7.2: 19.94 ticks/sec
Seed 1, render distance 8, travel from spawn to x=2000, v1.7.4: 11.03 ticks/sec
The extra chunk ticking seems to have slowed down the server. You can see debug files, etc. at MC-41874
@unknown:
I still think it should be Mojang's job to see if bug reports are still valid.
You're a modder yourself, right? Do you check if every bug reported to you about your mods is valid from version to version? Do you really think it's even possible to do so, for all reports?
You're smart enough to identify a bad hash function. You're obviously a reasonably skilled programmer. So why isn't it obvious to you that:
Number of employees checking bug reports cannot possibly scale with the number of users reporting them.
The user is the one person best situated to verify whether or not their own experience has changed.
Mojang cannot determine what is happening on a user's computer. Depending on how clear the report is, they may not actually understand what the problem is. So unless you're suggesting they fly out to the user's home and see the problem in person (which would scale particularly poorly as a solution), I don't really understand what you're getting at.
From another angle, when Mojang does check reports, and closes them because they think they've fixed it, someone tends to come along and complain that it isn't fixed, and the report was closed prematurely. As the saying goes, damned if you do, damned if you don't.
Hmm that's pretty bad. What hash is being applied to X ^ Y? Is it just a Java built-in? Are X and Z floats?
An affine transform like AX + Z mod N would probably do a lot better. Have you tested various options?