fix(python): use mimalloc as global allocator to fix RSS leak #7245
mansiverma897993 wants to merge 6 commits into
ACTION NEEDED: The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error, please inspect the "PR Title Check" action.
I don't think you need to check this in. What I'd be most interested in is for you to run the full benchmark suite under python/python/benchmarks for each variant (main, --feature mimalloc, --feature jemalloc) and then report the changes. That's what I did to produce the heatmap in this discussion: #1372 (comment)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mimalloc's default initial-exec TLS model doesn't fit in glibc's static TLS surplus when lance.abi3.so is dlopen'd, causing 'cannot allocate memory in static TLS block' on import. The local_dynamic_tls feature switches to the local-dynamic model. Also adds the new allocator deps to Cargo.lock so the build passes with --locked. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Enabling jemalloc without --no-default-features left the mimalloc default on too, so both defined a #[global_allocator] static GLOBAL (E0428). Gate mimalloc on not(feature = "jemalloc") so the explicit jemalloc opt-in wins. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
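The gating described in that commit can be sketched roughly as follows. This is an illustration, not the PR's actual code: the feature names follow the commit message, `tikv_jemallocator` is an assumed choice of jemalloc crate, and `selected_allocator` is a hypothetical helper added only to make the selection observable.

```rust
// Sketch only. Assumes Cargo features named "mimalloc" and "jemalloc";
// the jemalloc crate (tikv_jemallocator) is an assumption, not from the PR.

// mimalloc is the default, but it yields when jemalloc is explicitly
// enabled, so at most one `static GLOBAL` exists (avoids E0428).
#[cfg(all(feature = "mimalloc", not(feature = "jemalloc")))]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

// The explicit jemalloc opt-in wins whenever it is enabled.
#[cfg(feature = "jemalloc")]
#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

/// Hypothetical helper: reports which allocator the cfg gates picked.
/// Mirrors the precedence above: jemalloc, then mimalloc, then system.
pub fn selected_allocator() -> &'static str {
    if cfg!(feature = "jemalloc") {
        "jemalloc"
    } else if cfg!(feature = "mimalloc") {
        "mimalloc"
    } else {
        "system"
    }
}
```

With neither feature enabled, both statics are compiled out and the process falls back to the system allocator, which is the behavior the benchmark comparison against main relies on.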
I've made some updates to make mimalloc compile. It does look like it's not working on Mac. Maybe we should use jemalloc there? Or maybe there's something else we can do to fix it. Also LMK if you want guidance on how to run those benchmark suites.
Resolves #7242
This PR configures mimalloc as the global allocator in the Python library (pylance) to resolve an unbounded RSS memory leak caused by glibc per-thread arena fragmentation.

Root Cause

Under glibc, sequential merge_insert workloads on Tokio worker threads allocate temporary buffers. Glibc does not eagerly return pages from these per-thread arenas to the OS after they are freed, so RSS grows monotonically until the process is terminated by the OOM killer.

Solution

- Add the mimalloc dependency to python/Cargo.toml (with default-features = false to keep dependencies minimal).
- Set mimalloc::MiMalloc as the #[global_allocator] in python/src/lib.rs.
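
For readers unfamiliar with the mechanism, here is a minimal std-only sketch of how a `#[global_allocator]` static reroutes every heap allocation in the binary. `CountingAlloc`, `ALLOCATIONS`, and `allocation_count` are hypothetical names used for illustration; in the PR the slot is filled by `mimalloc::MiMalloc` rather than a wrapper around the system allocator.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for `mimalloc::MiMalloc`: forwards every request
/// to the system allocator and counts how many allocations were made.
pub struct CountingAlloc;

/// Number of `alloc` calls observed so far.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Delegate the actual allocation to the platform allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Exactly one static per binary may carry this attribute; every Box, Vec,
// and String in the process then allocates through it.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns the number of allocations routed through `GLOBAL` so far.
pub fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

Because the attribute applies process-wide, swapping the type of `GLOBAL` is enough to change where freed pages end up, which is why the fix needs no changes at individual allocation sites.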