Run reducers in their own V8 HandleScope for js modules #4746
Open
joshua-spacetime wants to merge 1 commit into joshua/js-worker-queue from
Description of Changes
The JS worker is intentionally long-lived. Before this patch, we effectively had a memory leak: V8 call-local handles and some host-side call state survived and accumulated across calls on the worker. That caused gradual heap growth, extra GC work, and eventually enough slowdown and heap pressure that the isolate had to be replaced.
Although we periodically check heap statistics to decide when to replace the isolate, execution latencies can (and did) degrade dramatically before that check kicks in.
Now each reducer call is given a fresh V8 HandleScope in which to execute, instead of reusing a single global scope. The scope is dropped at the end of each run, which avoids retaining and accumulating call-local JS objects over the JS worker's lifetime.

This patch also makes end-of-call host cleanup explicit, lowers the default heap-check intervals, and limits exported heap metrics to the JS worker only. Previously, heap observability was poor for diagnosis because heap metrics from the instance pool (for procedures) could overwrite heap metrics for the worker.
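The lifetime change can be illustrated with a toy Rust sketch (plain stdlib, not the real v8 bindings): handles created during a call are owned by a call-local scope and dropped when the call returns, so nothing accumulates across calls on the long-lived worker.

```rust
// Toy model of per-call handle scoping (not the actual v8 API):
// each invocation gets a fresh scope, and everything created in
// that scope is freed when the invocation returns, instead of
// piling up in a worker-lifetime scope.

struct HandleScope {
    handles: Vec<String>, // stand-in for V8 local handles
}

impl HandleScope {
    fn new() -> Self {
        HandleScope { handles: Vec::new() }
    }
    fn create_handle(&mut self, name: &str) {
        self.handles.push(name.to_string());
    }
    fn handle_count(&self) -> usize {
        self.handles.len()
    }
}

fn run_reducer(call: usize) -> usize {
    // Fresh scope per invocation: call-local handles live here.
    let mut scope = HandleScope::new();
    scope.create_handle(&format!("args-{call}"));
    scope.create_handle(&format!("result-{call}"));
    scope.handle_count()
} // scope dropped; nothing survives into the next call

fn main() {
    for call in 0..1000 {
        // Each call sees only its own 2 handles, no matter how
        // many calls the long-lived worker has already served.
        assert_eq!(run_reducer(call), 2);
    }
    println!("per-call handles never accumulate");
}
```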
What changed
1. Add a fresh V8 HandleScope for every invocation

Each reducer, view, and procedure call now opens a nested V8 scope for the duration of that call. This preserves the existing long-lived isolate and context, but gives every invocation its own temporary handle lifetime. Call-local V8 handles now die when the invocation returns instead of sticking around until the worker exits.
As part of that refactor:

- The ArrayBuffer is now created per reducer call instead of being stored as a worker-lifetime local.

2. Make end-of-call cleanup a real boundary
The V8 host now force-clears leftover per-call host state at the end of a function call.
Specifically:
3. Lower the default heap-check cadence
The default V8 heap policy is now more aggressive about checking worker heap usage.
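A cadence policy like this can be sketched as a simple counter-plus-timer check. The struct and field names below are illustrative, not the actual config types; the sketch only shows the "check every N requests or every T of wall time, whichever comes first" behavior.

```rust
use std::time::{Duration, Instant};

// Illustrative heap-check cadence policy (names are invented,
// not the real config struct): a heap-statistics check is due
// every `request_interval` requests or every `time_interval`
// of elapsed time, whichever fires first.
struct HeapCheckPolicy {
    request_interval: u64,   // e.g. 4096 (was 65536)
    time_interval: Duration, // e.g. 5s (was 30s)
    requests_since_check: u64,
    last_check: Instant,
}

impl HeapCheckPolicy {
    fn new(request_interval: u64, time_interval: Duration) -> Self {
        HeapCheckPolicy {
            request_interval,
            time_interval,
            requests_since_check: 0,
            last_check: Instant::now(),
        }
    }

    // Called once per invocation; returns true when a heap check is due.
    fn on_request(&mut self) -> bool {
        self.requests_since_check += 1;
        let due = self.requests_since_check >= self.request_interval
            || self.last_check.elapsed() >= self.time_interval;
        if due {
            self.requests_since_check = 0;
            self.last_check = Instant::now();
        }
        due
    }
}

fn main() {
    let mut policy = HeapCheckPolicy::new(4096, Duration::from_secs(5));
    let mut checks = 0;
    for _ in 0..10_000 {
        if policy.on_request() {
            checks += 1;
        }
    }
    // 10_000 requests at one check per 4096 requests yields two
    // count-triggered checks (the time trigger won't fire in a
    // fast loop like this).
    println!("checks = {checks}");
}
```

Lowering the request interval from 65536 to 4096 makes slowdowns visible roughly 16x sooner in request terms, at the cost of more frequent heap-statistics reads.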
Defaults changed from:
- heap-check-request-interval = 65536
- heap-check-time-interval = 30s

to:

- heap-check-request-interval = 4096
- heap-check-time-interval = 5s

These settings remain configurable through the existing v8-heap-policy config.

4. Only export heap metrics for the instance-lane worker
Heap metrics now reflect only the long-lived instance lane. Specifically, heap metrics are now exported only for the worker labeled worker_kind="instance_lane".

This avoids the last-writer-wins issue from the pooled instances, while keeping the metrics focused on the worker that accumulates state over time and is most relevant for diagnosing long-run slowdowns.
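The last-writer-wins problem and the fix can be sketched in plain Rust (the types and label below are illustrative stand-ins for the actual metrics registry): if pooled procedure instances and the long-lived worker all write the same gauge, whichever reports last clobbers the reading, so the export is gated on the worker kind.

```rust
use std::collections::HashMap;

// Illustrative sketch: restrict heap-metric export to the
// long-lived instance-lane worker so pooled procedure
// instances can't overwrite its readings (last writer wins).

#[derive(PartialEq)]
enum WorkerKind {
    InstanceLane, // long-lived JS worker
    Pooled,       // short-lived procedure instances
}

struct HeapMetrics {
    // gauge keyed by label, as a stand-in for a metrics registry
    gauges: HashMap<&'static str, u64>,
}

impl HeapMetrics {
    fn new() -> Self {
        HeapMetrics { gauges: HashMap::new() }
    }

    // Only the instance-lane worker reports heap usage.
    fn report_heap_used(&mut self, kind: &WorkerKind, bytes: u64) {
        if *kind == WorkerKind::InstanceLane {
            self.gauges.insert("worker_kind=\"instance_lane\"", bytes);
        }
        // pooled instances are ignored instead of clobbering the gauge
    }
}

fn main() {
    let mut metrics = HeapMetrics::new();
    metrics.report_heap_used(&WorkerKind::InstanceLane, 300);
    // A pooled instance reporting afterwards no longer
    // overwrites the worker's reading.
    metrics.report_heap_used(&WorkerKind::Pooled, 10);
    let v = metrics.gauges["worker_kind=\"instance_lane\""];
    println!("instance_lane heap = {v}");
}
```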
API and ABI breaking changes
None
Expected complexity level and risk
2
Testing
Manually tested via the keynote-2 benchmark. The keynote benchmark will be added to CI, where it will serve as a regression test.