A more cursed technique... =) #3
Comments
I also tried this detach technique with the process pinned to a single core, and the rate is pretty much the same as the normal way of doing it, if even a tiny bit slower. So as far as I can see it's trading increased up-front CPU usage (for GC, on another thread) against reduced memory usage.
This is some interesting work. Thanks for digging in!
Faster than not freeing them and using regular ArrayBuffers, right? I had explored putting this sort of approach together with a new subclass of ArrayBuffer called "DisposableArrayBuffer", which would be allocated much like in your approach, but I ultimately decided against it: the ability to do this without a native addon or modifying Node.js itself, and for any arbitrary ArrayBuffer, is very compelling. It also means that if you don't detach, the GC can still do its job later on as normal.
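As an aside, detaching without a native addon is possible from plain JS in recent runtimes. This is just an illustration of that point, not the technique under discussion: structuredClone with a transfer list moves the backing store into the (immediately discarded) clone and leaves the original detached.

```javascript
// Sketch: detaching an ArrayBuffer from plain JS, no native addon needed.
// structuredClone with a transfer list moves the backing store into the
// clone (discarded here) and leaves the original buffer detached.
const buf = new ArrayBuffer(16)
console.log(buf.byteLength) // 16
structuredClone(buf, { transfer: [buf] })
console.log(buf.byteLength) // 0 - buf is now detached
```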
Are you using a custom ArrayBufferAllocator? Or the default V8 one? Or something akin to what Node.js does? I wonder if that's what makes the difference here.
If you're doing that, then you don't need to use the Function constructor: you can just put the natives syntax in your code directly, even from within your benchmarks. No need to wrap it at all.
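A hedged sketch of the difference, assuming V8's %ArrayBufferDetach intrinsic. With --allow-natives-syntax set, the % call can appear inline in source; without the flag, % is a SyntaxError, which is the usual reason for hiding it behind the Function constructor. The fallback branch here is only so the sketch runs either way:

```javascript
// Assumes V8's %ArrayBufferDetach intrinsic. With --allow-natives-syntax
// it could be called inline; the Function constructor merely defers the
// parse to runtime, and still throws a SyntaxError if the flag is off.
let detach
try {
  detach = new Function('buf', '%ArrayBufferDetach(buf)')
} catch {
  // Flag not set: fall back to a transfer-based detach so this still runs.
  detach = (buf) => structuredClone(buf, { transfer: [buf] })
}
const buf = new ArrayBuffer(64)
detach(buf)
console.log(buf.byteLength) // 0
```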
I've been fiddling around with this approach and it's broken in various ways; trying to find an efficient (and safe) way to wrap external memory in v8. FYI, I think the speed improvement is likely down to the fact that I am never writing to the memory, and calloc always seems to return the same block of memory if I free it directly afterwards and run in a tight loop.
BTW, it turns out the memory leak I experienced was down to a current bug in v8 when pointer compression is enabled. Thanks to the Deno folks for documenting it!
This is what the JS benchmark looks like:
import { Bench } from 'lib/bench.js'
import { system } from 'lib/system.js'
const { wrapMemory, unwrapMemory, assert } = spin
const bench = new Bench()
let runs = 0
const size = 100 * 1024 * 1024
while (1) {
runs = 6000
for (let i = 0; i < 5; i++) {
bench.start(`new ArrayBuffer ${size}`)
for (let j = 0; j < runs; j++) {
const buf = new ArrayBuffer(size)
assert(buf.byteLength === size)
}
bench.end(runs)
}
runs = 6000
for (let i = 0; i < 5; i++) {
bench.start(`new ArrayBuffer w/unwrap ${size}`)
for (let j = 0; j < runs; j++) {
const buf = new ArrayBuffer(size)
assert(buf.byteLength === size)
unwrapMemory(buf)
assert(buf.byteLength === 0)
}
bench.end(runs)
}
runs = 180000
for (let i = 0; i < 5; i++) {
bench.start(`calloc/wrap external ${size}`)
for (let j = 0; j < runs; j++) {
const address = system.calloc(1, size)
const buf = wrapMemory(address, size, 0)
assert(buf.byteLength === size)
system.free(address)
}
bench.end(runs)
}
runs = 180000
for (let i = 0; i < 5; i++) {
bench.start(`calloc/wrap external w/unwrap ${size}`)
for (let j = 0; j < runs; j++) {
const address = system.calloc(1, size)
const buf = wrapMemory(address, size, 0)
assert(buf.byteLength === size)
system.free(address)
unwrapMemory(buf)
assert(buf.byteLength === 0)
}
bench.end(runs)
}
runs = 6000
for (let i = 0; i < 5; i++) {
bench.start(`calloc/wrap internal ${size}`)
for (let j = 0; j < runs; j++) {
const address = system.calloc(1, size)
const buf = wrapMemory(address, size, 1)
assert(buf.byteLength === size)
}
bench.end(runs)
}
runs = 6000
for (let i = 0; i < 5; i++) {
bench.start(`calloc/wrap internal w/unwrap ${size}`)
for (let j = 0; j < runs; j++) {
const address = system.calloc(1, size)
const buf = wrapMemory(address, size, 1)
assert(buf.byteLength === size)
unwrapMemory(buf)
assert(buf.byteLength === 0)
}
bench.end(runs)
}
runs = 6000000
for (let i = 0; i < 5; i++) {
const address = system.calloc(1, size)
bench.start(`wrap existing external ${size}`)
for (let j = 0; j < runs; j++) {
const buf = wrapMemory(address, size, 0)
assert(buf.byteLength === size)
}
bench.end(runs)
system.free(address)
}
runs = 6000000
for (let i = 0; i < 5; i++) {
const address = system.calloc(1, size)
bench.start(`wrap existing external w/unwrap ${size}`)
for (let j = 0; j < runs; j++) {
const buf = wrapMemory(address, size, 0)
assert(buf.byteLength === size)
unwrapMemory(buf)
assert(buf.byteLength === 0)
}
bench.end(runs)
system.free(address)
}
}
And the wrapMemory and unwrapMemory implementations from C++:
void spin::WrapMemory(const FunctionCallbackInfo<Value> &args) {
Isolate* isolate = args.GetIsolate();
uint64_t start64 = (uint64_t)Local<Integer>::Cast(args[0])->Value();
uint32_t size = (uint32_t)Local<Integer>::Cast(args[1])->Value();
void* start = reinterpret_cast<void*>(start64);
int32_t free_memory = 0;
if (args.Length() > 2) {
free_memory = (int32_t)Local<Integer>::Cast(args[2])->Value();
}
if (free_memory == 0) {
std::unique_ptr<BackingStore> backing = ArrayBuffer::NewBackingStore(
start, size, v8::BackingStore::EmptyDeleter, nullptr);
Local<ArrayBuffer> ab = ArrayBuffer::New(isolate, std::move(backing));
args.GetReturnValue().Set(ab);
return;
}
std::unique_ptr<BackingStore> backing = ArrayBuffer::NewBackingStore(
start, size, spin::FreeMemory, nullptr);
Local<ArrayBuffer> ab = ArrayBuffer::New(isolate, std::move(backing));
args.GetReturnValue().Set(ab);
}
void spin::UnWrapMemory(const FunctionCallbackInfo<Value> &args) {
Local<ArrayBuffer> ab = args[0].As<ArrayBuffer>();
ab->Detach();
}
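For readers without the spin runtime, the shape of the first two benchmark cases can be approximated in plain Node.js. This is a rough sketch: the bench helper below is a hypothetical stand-in for lib/bench.js, and a structuredClone transfer stands in for unwrapMemory (both detach the buffer).

```javascript
// Rough plain-Node approximation of the first two benchmark cases above.
// bench() is a hypothetical stand-in for lib/bench.js; structuredClone
// with a transfer list stands in for unwrapMemory (it detaches the buffer).
function bench (name, runs, fn) {
  const start = process.hrtime.bigint()
  for (let i = 0; i < runs; i++) fn()
  const ns = Number(process.hrtime.bigint() - start)
  console.log(`${name}: ${(runs / (ns / 1e9)).toFixed(0)} ops/sec`)
}

const size = 1 * 1024 * 1024

bench('new ArrayBuffer', 1000, () => {
  const buf = new ArrayBuffer(size)
  if (buf.byteLength !== size) throw new Error('bad alloc')
})

bench('new ArrayBuffer w/detach', 1000, () => {
  const buf = new ArrayBuffer(size)
  structuredClone(buf, { transfer: [buf] })
  if (buf.byteLength !== 0) throw new Error('not detached')
})
```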
Thanks for this Bryan - I didn't know this was possible. Out of interest, I ran some benchmarks of this on a custom v8 runtime I am hacking on and compared it to another technique I have been playing with. Of course, this is very dangerous and not something I would expect to see in Node.js or Deno, but the numbers are interesting all the same.
The technique I use is:
This proves to be ~30 times faster on my setup, but your detach technique does not seem to work for me in freeing up the memory for the wrapping ArrayBuffer in the hot loop, so I see memory constantly growing.
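One way to sanity-check that observation, as a hypothetical diagnostic using plain Node.js APIs rather than the custom runtime, is to sample process RSS around an allocate-and-detach hot loop:

```javascript
// Hypothetical diagnostic: sample process RSS around an allocate+detach
// loop to see whether detaching actually releases backing memory promptly.
const before = process.memoryUsage().rss
for (let i = 0; i < 100; i++) {
  const buf = new ArrayBuffer(1024 * 1024)
  structuredClone(buf, { transfer: [buf] }) // detaches buf
}
const after = process.memoryUsage().rss
console.log(`rss delta: ${((after - before) / 1024 / 1024).toFixed(1)} MiB`)
```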
This is what the JS code looks like. I had to set the --allow-natives-syntax flag on the command line, as the v8 build I am on barfs when I try to change the flags after initialising the v8 platform.
Will have a further look when I get a chance and hopefully I can share this code soon.
v8/C++ WrapMemory Function
This is all horribly dangerous of course, but I think it's fun to test the boundaries of what v8/JS can do.