The scenario is fairly intricate, but easy enough to understand in hindsight. Here's the rundown:
- Each code block is attached to a lexical scope - every time a { } block appears, a new scope is created by the compiler
- Each lexical scope contains some metadata: the types of each variable, what kind of structures are present, and so on
- Included in this metadata is a set of pointers that indicate where each variable is stored. Typically this points to a location on the stack
- In a single-threaded environment, this works fine: each time a lexical scope is entered during execution, we bind the variables to the stack, and proceed as normal
- In a multi-threaded environment, this ceases to work: all threads share the same metadata for a given scope, so one thread binding its variables can clobber the storage pointers another thread is still using
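To make the failure concrete, here is a rough sketch of the shared-metadata problem described above. It is written in Python purely for illustration; `ScopeMetadata`, `bind`, and `demonstrate_clobber` are hypothetical names, not anything from the real runtime, and the two "threads" are interleaved by hand so the outcome is deterministic.

```python
# Hypothetical model: one shared metadata record per lexical scope.
class ScopeMetadata:
    def __init__(self, variable_types):
        self.variable_types = dict(variable_types)  # name -> type name
        self.storage = {}                           # name -> (stack, slot index)

    def bind(self, stack):
        """Reserve one stack slot per variable and record where it lives."""
        for name in self.variable_types:
            stack.append(None)
            self.storage[name] = (stack, len(stack) - 1)

    def write(self, name, value):
        stack, slot = self.storage[name]
        stack[slot] = value

    def read(self, name):
        stack, slot = self.storage[name]
        return stack[slot]


def demonstrate_clobber():
    scope = ScopeMetadata({"x": "integer"})  # shared by both threads
    stack_a, stack_b = [], []                # each thread has its own stack

    scope.bind(stack_a)   # thread A enters the scope
    scope.write("x", 1)   # A stores x = 1 on its own stack
    scope.bind(stack_b)   # thread B enters: the pointers now target B's stack
    scope.write("x", 2)   # B stores x = 2
    seen_by_a = scope.read("x")  # A reads x again, but through B's pointers
    return seen_by_a, stack_a, stack_b
```

Thread A's value is still sitting on its own stack, but the shared metadata no longer points at it, so A ends up reading B's variable.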
Now, solving this involved two primary requirements:
- I do not want any kind of locking in the multiprocessing code
- Threads must be able to intelligently recreate hierarchies of nested scopes
These goals are somewhat contradictory; how can we examine the scope hierarchy and copy scope metadata without locking? The solution is, as often happens, obvious in hindsight, but it took me a couple of hours to really figure it out and nail it down.
The solution is that we now keep multiple copies of each scope's metadata. There is one "master copy" which is never mutated by any code. Each time a thread enters a scope, it makes its own copy of that metadata, and that copy is never passed to anyone outside the thread.
Since there is no change ever permitted to the master copy, we can read from it any time from a thread without needing to worry about locking. Since each thread has its own copy, and that copy is never shared outside the thread, we don't need to lock to use the copy of the metadata.
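A minimal sketch of that ownership rule, assuming a read-only master record and a per-thread working copy. The names (`MasterScope`, `ScopeCopy`, `run_threads`) are made up for this example, and Python stands in for whatever the runtime is actually written in. Note that no lock appears anywhere: threads only read the master and only mutate their own copy.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class MasterScope:
    """Read-only description of a scope; never mutated after construction."""
    variable_types: MappingProxyType  # name -> type name


class ScopeCopy:
    """Per-thread working copy; holds the mutable storage bindings."""
    def __init__(self, master, stack):
        self.variable_types = dict(master.variable_types)  # private copy
        self.stack = stack
        self.slots = {}
        for name in self.variable_types:
            stack.append(None)
            self.slots[name] = len(stack) - 1

    def write(self, name, value):
        self.stack[self.slots[name]] = value

    def read(self, name):
        return self.stack[self.slots[name]]


def run_threads(master, thread_count=4):
    results = [None] * thread_count

    def worker(index):
        stack = []                        # this thread's own stack
        scope = ScopeCopy(master, stack)  # reads master, mutates only the copy
        scope.write("x", index)
        results[index] = scope.read("x")  # each thread fills a distinct element

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


MASTER = MasterScope(MappingProxyType({"x": "integer"}))
```

Calling `run_threads(MASTER)` gives each thread back the value it wrote, regardless of how the threads interleave, because no two threads ever touch the same `ScopeCopy`.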
This leaves us with the question of rebuilding the scope hierarchy. For this, we take advantage of the fact that the lexical scope hierarchy is always traversed from top (most global scope) to bottom (most local scope). All we have to do is fork a copy of the scope metadata right before we enter that scope, and clean it up when we exit the scope. Since the traversal always follows this pattern, we always have a correct chain of parent/child relationships in the metadata.
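The fork-on-entry, clean-up-on-exit pattern might look something like the following. This is only a sketch under my own assumptions: `MasterScope`, `ActiveScope`, and `enter_scope` are hypothetical names, and a Python context manager stands in for whatever the runtime does at scope entry and exit. Because scopes are always entered from most global to most local, each fork can simply take the currently active fork as its parent.

```python
from contextlib import contextmanager


class MasterScope:
    """Read-only scope description (treated as immutable by convention here)."""
    def __init__(self, name, variable_names):
        self.name = name
        self.variable_names = tuple(variable_names)


class ActiveScope:
    """A thread-private fork of a master scope, linked to its parent fork."""
    def __init__(self, master, parent):
        self.master = master
        self.parent = parent
        self.values = {name: None for name in master.variable_names}

    def lookup(self, name):
        scope = self
        while scope is not None:  # walk from most local to most global
            if name in scope.values:
                return scope.values[name]
            scope = scope.parent
        raise NameError(name)

    def chain(self):
        names, scope = [], self
        while scope is not None:
            names.append(scope.master.name)
            scope = scope.parent
        return names


@contextmanager
def enter_scope(master, parent=None):
    active = ActiveScope(master, parent)  # fork right before entering
    try:
        yield active
    finally:
        active.values.clear()             # clean up the fork on exit
        active.parent = None


def demo():
    global_master = MasterScope("global", ["g"])
    function_master = MasterScope("function", ["x"])
    block_master = MasterScope("block", ["y"])
    with enter_scope(global_master) as g:
        g.values["g"] = 10
        with enter_scope(function_master, g) as f:
            f.values["x"] = 20
            with enter_scope(block_master, f) as b:
                b.values["y"] = 30
                return b.chain(), b.lookup("g"), b.lookup("x"), b.lookup("y")
```

The innermost fork can resolve `g` and `x` by walking its parent links, and nothing in the chain is visible to any other thread. The cost mentioned next is also visible here: every entry allocates a fresh `ActiveScope`.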
Although the approach works, it kind of sucks. There's a need for many copies of the scope metadata, which increases memory consumption for comparatively little benefit. Worse, it costs performance - every function invocation or flow control block now has to do a scope copy on top of the other overhead.
I haven't really had a chance to think out ways to improve this, but I'm sure over the next week or so I'll have plenty of chances to refine the approach.
For now, my outstanding bug list has several issues related to this change. In particular, I broke an awful lot of scoping code, so I have to go check every permutation of code that involves lexical scopes and ensure that things work correctly. Worse, I've introduced several memory leaks, which will probably be ugly to hunt down across a multithreaded runtime.
On the plus side, I don't have to worry about being bored [grin]