When handling the Nodes collection in MiniYaml, individual nodes are located via one of two methods:
// Lookup a single key with linear search.
var node = yaml.Nodes.FirstOrDefault(n => n.Key == "SomeKey");
// Convert to dictionary, expecting many key lookups.
var dict = nodes.ToDictionary();
// Lookup a single key in the dictionary.
var node = dict["SomeKey"];
To simplify lookup of individual keys via linear search, provide helper methods NodeWithKeyOrDefault and NodeWithKey. These helpers do the equivalent of Single{OrDefault} searches. Whilst this requires checking the whole list, it provides a useful correctness check. Two duplicated keys in TS yaml are fixed as a result. We can also optimize the helpers to not use LINQ, avoiding allocation of the delegate to search for a key.
Adjust existing code to use either lnear searches or dictionary lookups based on whether it will be resolving many keys. Resolving few keys can be done with linear searches to avoid building a dictionary. Resolving many keys should be done with a dictionary to avoid quaradtic runtime from repeated linear searches.
This changeset is motivated by a simple concept - get rid of the MiniYaml.Clone and MiniYamlNode.Clone methods to avoid deep copying yaml trees during merging. MiniYaml becoming immutable allows the merge function to reuse existing yaml trees rather than cloning them, saving on memory and improving merge performance. On initial loading the YAML for all maps is processed, so this provides a small reduction in initial loading time.
The rest of the changeset is dealing with the change in the exposed API surface. Some With* helper methods are introduced to allow creating new YAML from existing YAML. Areas of code that generated small amounts of YAML are able to transition directly to the immutable model without too much ceremony. Some use cases are far less ergonomic even with these helper methods and so a MiniYamlBuilder is introduced to retain mutable creation functionality. This allows those areas to continue to use the old mutable structures. The main users are the update rules and linting capabilities.
- Seal the classes, and make SourceLocation a readonly struct.
- In ToDictionary, use TryAdd to avoid a try-catch.
- In Merge, use ToList to ensure sources is only enumerated once.
Moving the key duplication check allows a redundant check on top-level nodes to be avoided. Add tests to ensure key checks are functioning as expected.
# Example Scenario
MockString:
CollectionOfStrings:
StringA: A
MockString:
CollectionOfStrings:
StringB: B
MockString:
-CollectionOfStrings:
MockString:
CollectionOfStrings:
StringC: C
MockString:
CollectionOfStrings:
StringD: D
# Previous MergePartial result
# The CollectionOfStrings is merged into a single unit, so the C and D items are dragged upwards and jump ahead of the Removal
# When this is processed, the final result removes CollectionOfStrings entirely
MockString:
CollectionOfStrings:
StringA: A
StringB: B
StringC: C
StringD: D
-CollectionOfStrings:
# New MergePartial result
# When merging nodes, we no longer allow merges to jump an intervening removal node.
# This means we can have multiple of a certain key (CollectionOfStrings in this example) which was not the case previously.
# When this is processed, the final result includes C/D but not A/B.
MockString:
CollectionOfStrings:
StringA: A
StringB: B
-CollectionOfStrings:
CollectionOfStrings:
StringC: C
StringD: D
- Stream lines in as memory rather than needing to realise a string for each line, via a new method in StreamExts.
- Use span to avoid string allocations during parsing until we want to realise the node itself, in MiniYaml.FromLines.
- Change several callsites to use the streaming extension method rather than string method where possible.
- Clone method will use the node count to create the correct capacity.
- ResolveInherits will use the node count as the suggested initial capacity.
- FromStream will now stream lines, rather than reading the whole file into memory and then splitting into lines.
These config files often contain many repeated strings which result in different string references in memory. By using a pool, we can detect when the strings are equal and reuse an existing reference as strings are immutable.
The FromLines will now use a pool to de-duplicate strings for a single call. By allowing a pool to be provided as a parameter, we can reuse even more strings. The MapCache defines such a pool so that strings are reused across all maps in the cache for even more savings.
- Add support for escaping '#' inside values
- Add support for escaping leading and trailing whitespace
And when discardCommentsAndWhitespace is set to false:
- Add proper support for comments
- Persist empty lines
Whitespace and comment support requires an explicit opt-in because
they produce MiniYamlNodes with null keys. Supporting these through
the entire game engine would require changing all yaml enumerations
to explicitly check and account for these keys with no benefit.
Comments and whitespace are now treated as real nodes during parsing,
which means that the yaml parser will throw errors if they have
incorrect indentation, even if these nodes will be discarded.