This reverts commit b172c93e45.
I made a significant mistake with this change: the scaling factor of
batch-by-chunk is O(nSendBytes), while the scaling factor of sending
directly to players is O(nSendBytes * nActivePlayers). It seems like the
real problem is that this system isn't getting enough usage.
While it does reduce compression efficiency in some cases, it falls back
to letting the sessions do individual compression when the amount of
data is less than 256 bytes anyway (compression-threshold in
pocketmine.yml).
My motivation for scrapping this system was to reduce the broadcast
system's complexity to make it easier to thread the living shit out of
compression, but it seems like this change was a step in the wrong
direction for performance.
A few steps can be taken to improve the usefulness of this system (and
also improve output bandwidth):
- Make general entity updates use this system. Movement and velocity
already kinda used this system, but crucially, players did not,
because we couldn't prevent the player from receiving its own movement
updates if we did that, which caused all kinds of problems.
- Therefore, we need to reintroduce static "self" entity IDs, like we
had in the shoghi days when entity ID 0 referred to the "self"
player.
- However, since entity ID 0 has a variety of interesting bugs since
it usually refers to "no entity" in MCPE, it would be better to use
1 (or 2, like MiNET does).
- The fixed ID used should be close to zero to maximize varint
encoding efficiency.
- We assumed that changes to player's position and velocity would be
ignored by the client. This assumption depends on the client and
could be broken at any time, so it's best not to rely on it.
- Make block updates use this system (when chunk updates are not sent).
Currently, block updates use a separate mechanism which creates a
second batch for every active chunk, which wastes CPU, and decreases
bandwidth efficiency on multiple fronts (reduced compression
efficiency, more cross-thread interactions, more bytes wasted on
RakNet packet headers).
Grouping packets to batch by chunk instead of by player reduces bandwidth efficiency, because the number of active chunks is almost always larger than the number of active players.
With the old mechanism, a batch of packets for each active chunk (which could be dozens, or hundreds... or thousands) would be created once, and then sent to many players. This makes the compression load factor O(nActiveChunks). Broadcasting directly is simpler and changes the load factor to O(nActivePlayers), which usually means a smaller number of larger batches are created, achieving better compression ratios for approximately the same cost (the same amount of data is being compressed, just in a different way).
The motivation for this is to prevent passing a dynamic argument to cancellation, which in almost all cases is a bug in user code. This same mistake also appears in a few places in the PM core (as seen in this commit), but in those cases the mistakes were mostly harmless since they were taking place before the event was actually called.
closes#3836
this produces a major performance improvement for large render distances, and reduces the impact of lighting calculation to zero on servers which have random blockupdates turned off.
again, this should never happen, because chunk unloading cleans this stuff out. But if it did happen, loading chunks is not the way to take care of it.
Various bugs existed for a while with stuff using chunk managers instead of worlds when interacting with terrain due to a behavioural inconsistency between World::getChunk() (return from cache or load from disk), and SimpleChunkManager::getChunk() (return from cache only). This change brings the two in line.
World::getOrLoadChunk() has been added as a replacement, which has the same behaviour as the old getChunk() and also makes it more obvious that there is an issue with code using it during refactoring.
the expectation is that eventually this will receive arbitrary internal runtime IDs instead of static id/meta, and RuntimeBlockMapping doesn't really care about this crap anyway.
this solves multiple architectural issues:
- improves reusability by avoiding having old state info stick around to fuck stuff up
- prevents access to propagation state from outside of propagation
this also reduces the latent memory usage of light-updates after they have been used.
TODO: we could probably change LightPropagationContext to LightPropagator and move all the propagation-specific code into it if we can solve the subchunk-iterator and effective light problems.
this new form allows skipping some useless checks during sky light calculation and also allows getting rid of the last hard dependency on core Block classes.
We're getting real close to native light now.
I accidentally added this during a separation of my local changes, but it's useful anyway, so we should use it.
This removes BlockLightUpdate's implicit dependency on Block, which is a
step towards native light.
it's useful to have an immutable stub around for the sake of feeding back dummy read values, but for write values it has to barf instead of being quiet.
There's still some issues with LightArray which I don't currently have a solution for, but I'm thinking about separating light storage from chunks anyway.
recalculateHeightMapColumn is stateless, so it can't make any assumptions about which subchunks to check for blocks. However, in most the average case (6 allocated subchunks), this causes 2500+ useless SubChunk->getHighestBlockAt() calls (10 per column). Since we're calculating in bulk, we can figure out which subchunks are empty one time and ignore them for all 256 columns.
In the average case, this produced a 50-60% performance improvement for heightmap calculation (~1.1 ms -> 0.5 ms).
In extreme cases where the height is extremely varied, this produces no observable performance benefit, but for most cases with flattish terrain, it's an improvement.
It can likely be further improved, but further performance improvements are outside the scope of this commit and will likely result in more complexity increases.
typically this is a state that only lasts for a tick or so, but it's a race condition that is regardless very commonly encountered.
If you were very unlucky, you might have noticed grass randomly dying when you were spawning or flying around, even though it was in full sky light.
opcache preloading doesn't store non-class constants, but it does store class constants. Class constants that reference non-class constants are pre-resolved into values before storing in opcache SHM, so class constants are OK to use, but non-class constants will come back as undefined.
this was degraded whenever it was I decided to make chunks always be allocated. This commit uses a fast path for light filling in subchunks which are completely clear of the heightmap, which returns the performance back to its old fast levels.
ChunkListeners are less dangerous, and also make more sense considering the usages.
Ideally we want to not have to care if a listener is a Player at all, but that's still some work away yet.
in pretty much every case, these usages really wanted to read the tag's contents anyway, which can be combined with a getTag() and instanceof call for more concise and static analysis friendly code.
In the few cases where the tag contents wasn't needed, it still wanted to check the type, which, again, can be done in a more static analysis friendly way by just using getTag() and instanceof.