architecture
2023-09-10 ideas to try out
2023-09-10 parallelization with an event loop
2023-09-10 parsing less
2023-09-20 I measured the compiler’s performance yesterday and most of the time is actually spent on parsing, of all things
2023-09-20 going along with the philosophy of “be lazy”, we should probably be parsing fewer things then
2023-09-20
I don’t think we need to parse method bodies if we’re not emitting IR
2023-09-20 basically, out of this code:
```unrealscript
function Hug(Hat_Player OtherPlayer)
{
    PlayAnimation('Hugging'); // or something, idk UE3
}
```
parse only this:
```unrealscript
function Hug(Hat_Player OtherPlayer)
{
    /* token blob: PlayAnimation ( 'Hugging' ) ; */
}
```
omitting the entire method body and treating it as an opaque blob of tokens until we need to emit IR
2023-09-20
I don’t think we need to parse the entire class if we only care about its superclass
2023-09-20 basically, out of this code:
```unrealscript
class lqGoatBoy extends Hat_Player;

defaultproperties
{
    Model = SkeletalMesh'lqFluffyZone.SkGoatBoy'; // etc
}
```
only parse the following:
```unrealscript
class lqGoatBoy extends Hat_Player;
/* parser stops here, rest of text is ignored until needed */
```
and then only parse the rest if any class items are requested
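both ideas above come down to the same trick: remember where a region of source lives and only parse it when somebody asks. here is a minimal sketch of that, not MuScript's actual code. `skip_body`, `LazyClass`, and `parse_items` are hypothetical names, the sketch works on raw source text rather than tokens, and it assumes braces never show up inside strings, names, or comments.

```rust
use std::ops::Range;

/// Returns the byte range of the brace-delimited body that starts at `open`,
/// without looking at anything inside it except the braces themselves.
/// Assumption: no braces inside string literals, name literals, or comments.
fn skip_body(source: &str, open: usize) -> Option<Range<usize>> {
    let bytes = source.as_bytes();
    if bytes.get(open) != Some(&b'{') {
        return None;
    }
    let mut depth = 0usize;
    for (offset, &byte) in bytes[open..].iter().enumerate() {
        match byte {
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(open..open + offset + 1);
                }
            }
            _ => {}
        }
    }
    None // ran out of input before the body was closed
}

/// Hypothetical lazily parsed class: only the header is parsed eagerly.
struct LazyClass<'a> {
    name: &'a str,
    superclass: Option<&'a str>,
    /// Everything after the header's `;`, left untouched until needed.
    rest: &'a str,
    /// Filled in the first time class items are requested.
    items: Option<Vec<&'a str>>,
}

impl<'a> LazyClass<'a> {
    /// Parses `class Name [extends Super] ...;` and stops at the first `;`.
    fn parse_header(source: &'a str) -> Option<Self> {
        let end = source.find(';')?;
        let mut words = source[..end].split_whitespace();
        if !words.next()?.eq_ignore_ascii_case("class") {
            return None;
        }
        let name = words.next()?;
        let superclass = match words.next() {
            Some(word) if word.eq_ignore_ascii_case("extends") => Some(words.next()?),
            Some(_) => return None,
            None => None,
        };
        Some(Self { name, superclass, rest: &source[end + 1..], items: None })
    }

    /// Parses the rest of the class on first use and caches the result.
    fn items(&mut self) -> &[&'a str] {
        let rest = self.rest;
        self.items.get_or_insert_with(|| parse_items(rest)).as_slice()
    }
}

/// Stand-in for a real item parser: collects the word in front of every
/// top-level `{`. Only works when braces are separated by whitespace.
fn parse_items(rest: &str) -> Vec<&str> {
    let mut items = Vec::new();
    let mut depth = 0usize;
    let mut last_word = "";
    for word in rest.split_whitespace() {
        match word {
            "{" => {
                if depth == 0 {
                    items.push(last_word);
                }
                depth += 1;
            }
            "}" => depth = depth.saturating_sub(1),
            _ => last_word = word,
        }
    }
    items
}
```

asking only for `superclass` never touches `rest`, and a method body found by `skip_body` stays an opaque range until something needs to emit IR for it.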
2023-09-20
ideas I tried out
2023-10-20 lexing first
2023-09-20 something MuScript did not use to have is a separate tokenization stage
2023-09-20 implementing this taught me one important lesson: context switching is expensive
2023-10-20 I think having token data in one contiguous block of memory also helped, though it isn’t as efficient as it could be yet.
2023-10-20 the current data structure as of writing this is
```rust
struct Token {
    kind: TokenKind,
    source_range: Range<usize>,
}

struct TokenArena {
    tokens: Vec<Token>,
}
```
(with some irrelevant things omitted - things like source files are not relevant for token streams themselves)
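to make the "lexing first" idea concrete, here is a rough sketch of a separate tokenization pass that fills such an arena in one go, so a parser can later walk a contiguous `Vec<Token>` instead of bouncing between lexing and parsing. the `TokenKind` variants and the `lex` function are made up for illustration (the real token set is much larger), and the sketch assumes ASCII source.

```rust
use std::ops::Range;

/// Hypothetical, heavily reduced token set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenKind {
    Ident,
    Number,
    Punct,
}

#[derive(Debug, Clone, PartialEq)]
struct Token {
    kind: TokenKind,
    source_range: Range<usize>,
}

#[derive(Debug, Default)]
struct TokenArena {
    tokens: Vec<Token>,
}

/// Lexes the whole source up front into one contiguous vector.
/// Assumption: ASCII input; whitespace is skipped, not stored.
fn lex(source: &str) -> TokenArena {
    let bytes = source.as_bytes();
    let mut arena = TokenArena::default();
    let mut i = 0;
    while i < bytes.len() {
        let start = i;
        let byte = bytes[i];
        if byte.is_ascii_whitespace() {
            i += 1;
            continue;
        }
        let kind = if byte.is_ascii_alphabetic() || byte == b'_' {
            while i < bytes.len() && (bytes[i].is_ascii_alphanumeric() || bytes[i] == b'_') {
                i += 1;
            }
            TokenKind::Ident
        } else if byte.is_ascii_digit() {
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            TokenKind::Number
        } else {
            // Everything else is a single-character punctuation token.
            i += 1;
            TokenKind::Punct
        };
        arena.tokens.push(Token { kind, source_range: start..i });
    }
    arena
}
```

the parser then only ever indexes into `arena.tokens`, and slices the source with `source_range` when it actually needs the text of a token.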
2023-10-20 I don’t know if I’ll ever optimize this to be even more efficient than it already is, but source ranges are mostly irrelevant to the high level task of matching tokens, so maybe arranging the storage like
```rust
struct Tokens {
    kinds: Vec<TokenKind>,
    source_ranges: Vec<Range<usize>>,
}
```
could help
2023-10-20