this was achieved by storing binary representations of the blockstates, rather than the original BlockStateData.
Due to the insane object:data ratio of Tag objects (40:1 for ByteTag, for example), even modestly sized NBT can explode in memory footprint. This was previously seen with the absurd 25 MB footprint on file load.
Previously, I attempted to mitigate this by deduplicating tag objects, but that was treating a symptom rather than addressing the cause.
We don't actually need to keep the NBT around in memory, since it's not used for anything other than matching blockstates. This means we can accept the code being a little slower, since the lookup is slow anyway and the result is cached.
In fact, using encoded ordered states as hash keys significantly improves the speed of lookups for stuff like walls, which have many thousands of states.
We keep generateStateData() around, since it's still possible we may need the associated BlockStateData, and it can easily be reconstructed from the binary-encoded representation in BlockStateDictionaryEntry.
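As a rough illustration of the idea (a minimal sketch only; encodeStateKey and the serialization used here are hypothetical stand-ins, not the actual BlockStateDictionary code):

```php
<?php

final class StateKeySketch{

	/**
	 * Builds a deterministic binary key for a blockstate from its name and
	 * scalar properties, so equal states always produce the same string key.
	 * @param array<string, int|string> $states
	 */
	public static function encodeStateKey(string $name, array $states) : string{
		ksort($states, SORT_STRING); //order the states so key equality matches state equality
		return serialize([$name, $states]); //stand-in for the real binary NBT encoding
	}
}

//lookup becomes a plain string-keyed array access instead of tag-by-tag NBT comparison
$stateToRuntimeId = [];
$stateToRuntimeId[StateKeySketch::encodeStateKey("minecraft:cobblestone_wall", [
	"wall_connection_type_north" => "none",
	"wall_post_bit" => 0,
])] = 1234; //some network runtime ID
```

The same blob can be decoded back into BlockStateData on demand, which is why generateStateData() can stay.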
closes #5154
this hack sends only the bare essential data needed to create the tiles in LevelChunkPacket, and then separately sends the full tile data using BlockActorDataPacket afterwards.
This is necessary because the client doesn't correctly handle items in NBT when chunks are sent without using the SubChunkRequest system. In 4.6 this manifests as incorrect items shown in item frames; in 5.0 the items simply don't show up at all (the difference is due to the modernization of the serialization format in 5.0).
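Conceptually the workaround looks something like the sketch below (function names here are placeholders, not the real PocketMine-MP / BedrockProtocol API):

```php
<?php

/** placeholder for queueing a LevelChunkPacket carrying the given per-tile NBT blobs */
function sendLevelChunkPacket(array $minimalTileNbt) : void{ /* ... */ }

/** placeholder for queueing one BlockActorDataPacket for the tile at x/y/z */
function sendBlockActorDataPacket(int $x, int $y, int $z, array $fullNbt) : void{ /* ... */ }

/** @param array<int, array{x: int, y: int, z: int, id: string, fullNbt: array}> $tiles */
function sendChunkWithTiles(array $tiles) : void{
	//1) the chunk payload only carries the bare minimum needed for the client to spawn each tile
	$minimal = [];
	foreach($tiles as $tile){
		$minimal[] = ["id" => $tile["id"], "x" => $tile["x"], "y" => $tile["y"], "z" => $tile["z"]];
	}
	sendLevelChunkPacket($minimal);

	//2) the full NBT (including item data) follows separately, which the client does handle correctly
	foreach($tiles as $tile){
		sendBlockActorDataPacket($tile["x"], $tile["y"], $tile["z"], $tile["fullNbt"]);
	}
}
```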
while it's a bit hacky to have this outside of the protocol namespace, it makes it much easier to use the protocol library for alternative purposes, such as a client or a MITM proxy.
It also removes all but one remaining core dependency of the protocol library, bringing it very close to being separable from the server core entirely.
the expectation is that eventually this will receive arbitrary internal runtime IDs instead of static id/meta, and RuntimeBlockMapping doesn't really care about this crap anyway.
this prepares for a fully dynamic block mapper, and allows a small performance improvement to chunk encoding by eliding the constant lazy-init checks.
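The kind of win this enables looks roughly like the following (illustrative sketch only, with hypothetical names; the real RuntimeBlockMapping is more involved):

```php
<?php

final class EagerMapperSketch{
	/** @var int[] internal ID => network runtime ID */
	private array $table;

	public function __construct(){
		//build the lookup table once up front, instead of lazily on first use
		$this->table = [0 => 0, 1 => 4867, 2 => 75]; //placeholder data
	}

	public function toRuntimeId(int $internalId) : int{
		//hot path during chunk encoding: a bare array read, no "is it initialized yet?" branch
		return $this->table[$internalId];
	}
}
```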
this optimization relies on the fact that palette entries are always non-negative, and for non-negative values a zigzag varint is byte-for-byte identical to the plain unsigned varint of the value shifted left by 1 bit. This eliminates some function call overhead, making the encoding slightly less agonizingly slow.
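Concretely, the equivalence being exploited is the following (sketch with hypothetical writer functions, not the actual serializer API):

```php
<?php

/** writes a plain unsigned (non-negative) varint */
function writeUnsignedVarInt(int $value) : string{
	$buf = "";
	do{
		$byte = $value & 0x7f;
		$value >>= 7;
		$buf .= chr($value !== 0 ? ($byte | 0x80) : $byte); //set the continuation bit if more follows
	}while($value !== 0);
	return $buf;
}

/** writes a signed varint via the generic zigzag path (handles negatives too) */
function writeSignedVarInt(int $value) : string{
	return writeUnsignedVarInt(($value << 1) ^ ($value >> 63)); //64-bit zigzag encoding
}

//palette entries are always >= 0, so the zigzag step degenerates to a left shift,
//and the signed-varint call can be skipped entirely:
assert(writeSignedVarInt(42) === writeUnsignedVarInt(42 << 1));
```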