In 05ed9d9a73 we stopped caching the values with ToArray to resolve a desync. But even caching the enumerable can lead to a desync, so remove the caching entirely.
----
Let's explain how the code that cached values via ToArray could desync.
Usually, the cell given by `self.Location` matches with the cell given by `self.GetTargetablePositions()`. However if the unit is moving and close to the boundary between two cells, it is possible for the targetable position to be an adjacent cell instead.
Combined with the fact hovering over the unit will evaluate `CurrentAdjacentCells` only for the local player and not everybody, the following sequence becomes possible to induce a desync:
- As the APC is moving into the last cell before unloading, the local player hovers over it. `self.Location` is the last cell, but `self.GetTargetablePositions()` gives the *previous* cell (as the unit is close to the boundary between the cells)
- The local player then caches `CurrentAdjacentCells`. The cache key of `self.Location` is the final cell, but the values are calculated for `self.GetTargetablePositions()` of an *adjacent* cell.
- When the order to unload is resolved, the cache key of `CurrentAdjacentCells` is already `self.Location` and so `CurrentAdjacentCells` is *not* updated.
- The units unload into cells based on the *adjacent* cell.
Then, for other players in the game:
- The hover does nothing for these players.
- When the order is resolved, `CurrentAdjacentCells` is out of date and is re-evaluated.
- `self.Location` and `self.GetTargetablePositions()` are both the last cell, because the unit has finished moving.
- So the cache is updated with a key of `self.Location` and values from the *same* cell.
- The units unload into cells based on the *current* cell.
As the units unload into different cells, a desync occurs. Ultimately the cause here is that cache key is insufficient - `self.Location` can have the same value but the output can differ. The function isn't a pure function so memoizing the result via `ToArray()` isn't sound.
Reverting it to cache the enumerable, which is then lazily re-evaluated reduces the scope of possible desyncs but is NOT a full solve. The cached enumerable caches the result of `Actor.GetTargetablePositions()` which isn't a fully lazy sequence. A different result is returned depending on `EnabledTargetablePositions.Any()`. Therefore, if the traits were to enable/disable inbetween, then we can still end up with different results. Memoizing the enumerable isn't sound either!
Currently our only trait is `HitShape` which is enabled based on conditions. A condition that enables/disables it based on movement would be one way to trigger this scenario. Let's say you have a unit where you toggle between two hit shapes when it is moving and when it stops moving. That would allow you to replicate the above scenario once again.
Instead of trying to come up with a sound caching mechanism in the face of a series of complex inputs, we just give up on trying to cache this information at all.
The AI would often invoke this method inside of loops, searching for a different category of queue each time. This would result in multiple searches against the trait dictionary to locate matching queues. Now we alter the method to create a lookup of all the queues keyed by category. This allows a single trait search to be performed.
UnitBuilderBotModule and BaseBuilderBotModule are updated to fetch this lookup once when required, and pass the results along to avoid calling the method more times than necessary. This improves their performance.
Updating squads is the most expensive part of SquadManagerBotModule. It involves ticking the current squad state. This usually involves finding nearby enemies and evaluating the fuzzy state machine to decide whether to interact with those enemies. Since all the AI squads for a player get ordered on the same tick, this can result in a lag spike.
To reduce the impact, we'll spread out the updates over multiple ticks. This means overall all the AI squads will still be refreshed every interval, but it'll be a rolling update rather than all at once. By spreading out the updates we avoid a lag spike from the cumulative updates of all the squads. Now the lag spike is reduced to the worst any single squad update can incur.
The StateMachine offered a feature to remember the previous state and allow reverting to it. However this feature is unused. Remove it to allow the previous states to be reclaimed by the GC earlier.
If a long path was being visualized with the path-debug command it would generate renderables for everything on the path, even for parts of the path that would be offscreen. Add some simplistic culling so the performance impact is reduced.
Several activities that queue child Move activities can get into a bad scenario where the actor is pathfinding and then gets stuck because the destination is unreachable. When the Move activity then completes, then parent activity sees it has yet to reach the destination and tries to move again. However, the actor is still blocked in the same spot as before and thus the movment finishes immediately. This causes a performance death spiral where the actor attempts to pathfind every tick. The pathfinding attempt can also be very expensive if it must exhaustively check the whole map to determine no route is possible.
In order to prevent blocked actors from running into this scenario, we introduce MoveCooldownHelper. In its default setup it allows the parent activity to bail out if the actor was blocked during a pathfinding attempt. This means the activity will be dropped rather than trying to move endlessly. It also has an option to allow retrying if pathfinding was blocked, but applies a cooldown to avoid the performance penalty. For activities such as Enter, this means the actors will still try and enter their target if it is unreachable, but will only attempt once a second now rather than every tick.
MoveAdjacentTo will now cancel if it fails to reach the destination. This fixes MoveOntoAndTurn to skip the Turn if the move didn't reach the intended destination. Any other derived classes will similarly benefit from skipping follow-up actions.
- EditorActorLayer now tracks previews on map with a SpatiallyPartitioned instead of a Dictionary. This allows the copy-paste logic to call an efficient PreviewsInCellRegion method, instead of asking for previews cell-by-cell.
- EditorActorPreview subscribes to the CellEntryChanged methods on the map. Previously the preview was refreshed regardless of which cell changed. Now the preview only regenerates if the preview's footprint has been affected.
The SupportPowerManager and WithSpriteBody trait captured the ActorInitializer in lambda expressions, which keeps it alive as long as the trait. The lambdas didn't need to capture the ActorInitializer, so rejig them to allow the ActorInitializer to be reclaimed after the traits have been created. As the TypeDictionary in the ActorInitializer can be quite large, this helps reduce memory usage.
When dealing with isometric maps, the copy paste region needs to copy the CellCoords area covered by the CellRegion. This is equivalent to the selected rectangle on screen. Using the cell region itself invokes cell->map->cell conversion that doesn't roundtrip and thus some of the selected cells don't get copied.
Also when pasting terrain + actors, we need to fix the sequencing to clear actors on the current terrain, before adjusting the height and pasting in new actors. The current sequencing means we are clearing actors after having adjusted the terrain height, and affects the wrong area.
If you edit an actor name, then delete the actor - it fails to be removed from the map in the editor. This is because the actor previews are keyed by ID. Editing their name edits their ID and breaks the stability of their hash code. This unstable hash code means the preview will now fail to be removed from collections, even though it's the "same" object.
Fix this by making the ID immutable to ensure hash stability - this means that a preview can be added and removed from collections successfully. Now when we edit the ID in the UI, we can't update the ID in place on the preview. Instead we must generate a new preview with the correct ID and swap it with the preview currently in use.
Several parts of bot module logic, often through the AIUtils helper class, will query or count over all actors in the world. This is not a fast operation and the AI tends to repeat it often.
Introduce some ActorIndex classes that can maintain an index of actors in the world that match a query based on a mix of actor name, owner or trait. These indexes introduce some overhead to maintain, but allow the queries or counts that bot modules needs to perform to be greatly sped up, as the index means there is a much smaller starting set of actors to consider. This is beneficial to the bot logic as the TraitDictionary index maintained by the world works only in terms of traits and doesn't allow the bot logic to perform a sufficiently selective lookup. This is because the bot logic is usually defined in terms of actor names rather than traits.
This is a more natural representation than int that allows removal of casts in many places that require uint. Additionally, we can change the internal representation from long to uint, making the Color struct smaller. Since arrays of colors are common, this can save on memory.