Shroud, access ProjectedCellLayer by array index over PPos index.
Performance improvement. Avoid the multiple PPos to array index
conversions in the same method call by calculating the cell
layer index once.
Background:
`Shroud.Tick` and `ProjectedCellLayer.Index(PPos puv)` shows up
in profile reports as one of the most expensive methods
(9% of CPU time).
In `Shroud.Tick` calls `ProjectedCellLayer.Index(PPos puv)` multiple
times for the same or different cell layers of the same dimension.
Improvement:
Benchmark results show an 0.5ms mean improvement in tick
time and 0.3 improvement in render time -
on a replay map of 1.12 min of play at max speed.
Render time:
render222052(bleed) render221934(this commit)
count 8144.000000 8144.000000
mean 11.410075 11.470100
std 5.004876 4.731463
min 3.450700 3.638400
25% 7.409100 7.015900
50% 12.410600 12.435900
75% 13.998100 14.242900
max 149.036200 149.656500
Tick time:
tick_time222043(bleed) tick_time221923(this commit)
count 2366.000000 2366.000000
mean 4.762923 4.275833
std 3.240976 3.206362
min 0.263900 1.653600
25% 4.145375 3.668600
50% 4.779350 4.240050
75% 5.232575 4.611775
max 85.751800 87.387100
Shroud.touchedCount to avoid Tick updates if no cells touched.
Avoids iterating over all map cells of the `touched` cell layer.
Tick time improvement of 40%+ - during at least the first two
minutes of gameplay.
During the first minutes of a game - out of every 1000 ticks
only 10-100 result in the Shroud - of any player - to be touched.
For certains player types (Neutral, Creep) less Shroud updates
are expected throughout a complete game.
Throughout a complete game human/AI players can also have no
Shroud touches during certain Ticks.
All targetlines can now be set to a custom color in yaml or set to be invisible.
All automated behaviours including scripted activities now have no visible target lines.
As proposed in the past in #13577.
Replace TraitContainer.All() that uses the custom AllEnumerator with
TraitContainer.ApplyToAllX() that takes an action as argument.
The AllEnumerator.Current function show up in profiling reports since it is
used each tick multiple times for multiple traits. The function is 'heavy'
because it creates TraitPair<T>'s temporary objects for each actor
trait combination.
In the past about 20k ITick trait pairs were present during an average
multi player game.
Using an Apply function that takes an action avoid the need to create
these temporary objects.
To be able to still use 'DoTimed' somewhat efficiently the measurement
was moved to inside the trait container method.
Results in a 25% performance improvement in accessing all traits of
a certain type.
Apply function could be used for other TraitContainer functions as well
for further improvements.
Test result for calling on a dummy trait on 20k actors a 1000 times:
1315 ms traitcontainer.AllEnumerator (current)
989 ms traitcontainer.Apply (this commit)
The Stream.ReadByte method is implemented by allocating a 1 byte buffer and calling into Read(byte[], int, int). Override ReadByte in derived classes to avoid needing to allocate this small temp buffer.
Also, fix some bugs in the stream implementations. Remove Write capability from MergedStream that didn't make sense. Add guards into SegmentStream to ensure reads and writes belonged to the segment - otherwise a reader or writer could access regions of the base stream that were outside the intended segment.
- Clone method will use the node count to create the correct capacity.
- ResolveInherits will use the node count as the suggested initial capacity.
- FromStream will now stream lines, rather than reading the whole file into memory and then splitting into lines.
- Avoid creating new strings in SpriteRenderer.Flush.
- ProductionQueue.CancelUnbuildableItems can exit early if the queue is empty. It can also use a set of names for quicker lookups.
- OpenGL.CheckGLError avoids a Enum.HasFlag call.
- VertexBuffer interface redefined to remove an IntPtr overload for SetData. This removes some unsafe code in TerrainSpriteLayer. This also allows the ThreadedVertexBuffer to use a buffer and post these calls, meaning the SetData call can now be non-blocking.
- ThreadedTexture SetData now checks the incoming array size. As the arrays sent here are usually large (megabytes) this allows us to avoid creating temp arrays in the LOH and skip Array.Copy calls on large arrays. This means the call is now blocking more often, but significantly reduces memory churn and GC Gen2 collections.
These config files often contain many repeated strings which result in different string references in memory. By using a pool, we can detect when the strings are equal and reuse an existing reference as strings are immutable.
The FromLines will now use a pool to de-duplicate strings for a single call. By allowing a pool to be provided as a parameter, we can reuse even more strings. The MapCache defines such a pool so that strings are reused across all maps in the cache for even more savings.