Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm2c: uvwasi support #2002

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

wasm2c: uvwasi support #2002

wants to merge 11 commits into from

Conversation

talg
Copy link

@talg talg commented Sep 21, 2022

Hi friends, @keithw @sbc100

Here is a first cut at providing uvwasi bindings for wasm2c.

Inspiration and code started as part of wasm2native.

wasm2c/examples/hello-wasi - shows how we can use this with a simple application that exercises wasi.
wasm2c/examples/build-standalone - provides a script that takes an arbitrary wasm file and builds it with wasi bindings so it can run as a normal stand-alone binary.

There are a few things that still need to be done, security checks on arguments to wasi calls, some documentation, but I think it's at the point where feedback would be helpful.

@talg talg changed the title Uvwasi support uvwasi support Sep 21, 2022
@keithw
Copy link
Member

keithw commented Sep 22, 2022

This looks quite interesting! Big-picture questions:

  • Who is this for / what are some intended use-cases?
  • Is this helpful for RLBox or Firefox?
  • Does WASI have any tests? How should we make sure this is working and keeps working?
  • If the goal is similar to wasm2native, I wonder if there's a simplification where instead of using preprocessor macros to parameterize on the module name, what if instead there's just a wasm2native shell script that runs wasm2c --module-name=wasi-command or something, and then the main.c program can have that hardcoded...?
  • Does the wasm2native author have a view on us effectively upstreaming this?

Smaller stuff:

  • Instead of directly accessing the w2c_memory internal member of the instance struct, probably this should access it via the export. WASI commands are required to export their memory as "memory," so calling Z_modnameZ_memory(instance) will get you the exported memory.
  • I'm not sure we want to pollute the global namespace with all the Wasm type names (u8, s8, etc.) for anybody who includes wasm-rt.h. It might be better for uvwasi-rt.c to use standard C type names.
  • Where does "Z_wasi_unstable" come from -- are there WASI modules in the wild that import from this modname?
  • I'm a little unsure if we want to include the hello-wasi Makefile and C source here (i.e. a build system that compiles a C program into a WASI command) -- that feels like a bigger enlargement in scope and leads me to start wanting to check, e.g., whether these CFLAGS are really needed by clang "to play nice with wasm2c" (they haven't been in my experience) but that doesn't even really feel like a WABT issue. Maybe this belongs more in a wasi-sdk tutorial, or somewhere that produces WASI commands? What are your thoughts?

@talg
Copy link
Author

talg commented Sep 24, 2022

Hi Keith,

Thanks for the quick feedback and thoughtful comments!

This looks quite interesting! Big-picture questions:

  • Who is this for / what are some intended use-cases?

People who like running code that makes system calls =)

On several current projects students needed to be able to run existing libraries and applications, but couldn't, because of the lack of a WASI support in wasm2c (e.g. because said libraries/applications needed to access the filesystem). I went looking for a WASI implementation and came across wasm2native, so I forked an older version that still supported (an older version of) wasm2c, made some small changes, and students have been making use of that.

Being able to just compile and run existing code with wasi-libc and wasi makes life so much easier, and most other wasm toolchains (e.g. WAMR, Wasmtime) already supports this.

uvwasi seems to be a reasonable wasi implementation, node.js uses it, WAMR uses it, and its a third party dependency in wasm2c already.

  • Is this helpful for RLBox or Firefox?

I think there is plenty of stuff one might want to sandbox today with RLBox where you need some WASI support, and this is a reasonable solution for that.

  • Does WASI have any tests? How should we make sure this is working and keeps working?

Just started looking at that yesterday. There are a couple of wasi test suites out there in various stages of usefulness. Some of that discussion is here:

WebAssembly/WASI#9

There doesn't seem to be one answer that folks have converged on.

node.js seems to have a decent little collection of tests but the driver seems a little more complicated then I what I want, since I'm just testing a binding, and I don't think wasm2c wants to depend on node. My first thought was
to just replace their driver with a simple python test runner, and use the tests I need to give me adequate coverage.

So, not 100% clear on the path forward (I have a couple other ideas), but plan to have that worked out soon.

  • If the goal is similar to wasm2native, I wonder if there's a simplification where instead of using preprocessor macros to parameterize on the module name, what if instead there's just a wasm2native shell script that runs wasm2c --module-name=wasi-command or something, and then the main.c program can have that hardcoded...?

Yup, that sounds good. Removing pre-processor stuff and hard coding names 👍 .

Moving wasm2native into /scripts and out of examples probably makes sense, then it can just be
a thing that people can make use of, the additional parameter to name the resulting application sounds
good.

  • Does the wasm2native author have a view on us effectively upstreaming this?

I would be glad to know. I reached out to @vshymanskyy a couple days ago by email but haven't heard back yet. wasm2native abandoned wasm2c support sometime back in favor of w2c2, so I'm guessing its not a priority.

Smaller stuff:

  • Instead of directly accessing the w2c_memory internal member of the instance struct, probably this should access it via the export. WASI commands are required to export their memory as "memory," so calling Z_modnameZ_memory(instance) will get you the exported memory.

👍

  • I'm not sure we want to pollute the global namespace with all the Wasm type names (u8, s8, etc.) for anybody who includes wasm-rt.h. It might be better for uvwasi-rt.c to use standard C type names.

Sure, that seems like a good suggestion.

  • Where does "Z_wasi_unstable" come from -- are there WASI modules in the wild that import from this modname?

Honestly, it was one of the few things that was there when I grabbed this code where I was like, huh, that's gross, I should probably look into that at some point, and moved on. There is probably some backstory there
that I'm not aware of.

I just grepped through wasi-libc and it looks like its only using the preview1 symbols. I would be happy to rip out that extra pre-processor cruft.

  • I'm a little unsure if we want to include the hello-wasi Makefile and C source here (i.e. a build system that compiles a C >program into a WASI command) -- that feels like a bigger enlargement in scope and leads me to start wanting to check, >e.g., whether these CFLAGS are really needed by clang "to play nice with wasm2c" (they haven't been in my experience) >but that doesn't even really feel like a WABT issue.

None of those flags are needed for this simple example, they just got carried along with the boilerplate from a previous makefile I used for this =/

Maybe this belongs more in a wasi-sdk tutorial, or somewhere that produces WASI commands? What are your thoughts?

I think taking out the stuff that builds .c => wasm makes sense for the example as it eliminates that dependency on the wasi-sdk. I think making this simpler and keeping it as the wasi example makes sense, and then turning build-standalone into wasm2native as a script also sounds good.

keithw and others added 2 commits September 26, 2022 11:36
…ebAssembly#1999)

also: - cleanup handling of newlines
      - "init_memory"/"init_table" -> "init_memories"/"init_tables"
@sbc100 sbc100 changed the title uvwasi support wasm2c: uvwasi support Sep 28, 2022
@@ -395,6 +397,10 @@ IF (NOT WIN32)
endif ()
endif ()

if (BUILD_UVWASI)
add_subdirectory("third_party/uvwasi" )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a WITH_WASI flag .. is there some reason we can't use use that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed to me like enabling the mostly not working wasi support in the interpreter, and allowing people to build their standalone apps with uvwasi were logically separate things. I'm not attached around the issue of wether this is enabled by default, so whatever makes sense just lmk.

@sbc100
Copy link
Member

sbc100 commented Sep 28, 2022

Can you rebase now that the bulk memory change has landed?

cp ${ORIGIN_DIR}/$1 ./input.wasm
make input.c

make input.elf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this build-standalone idea. The mixing of a build.sh script and the Makefile file a find a little confusing, but I think we can iterate on it over time.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, it was just a quick way to get something working, if we want to promote to a first class feature I can just replace it with a stand-alone script.


//force pre-processor expansion of m_name
#define __module_init(m_name) Z_## m_name ##_init_module()
#define module_init(m_name) __module_init(m_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with @keithw that it would be nice to avoid these macros and assume that wasm2c was run with the correct -m flag.

Copy link
Author

@talg talg Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, they have been removed in 9703be9.

@talg
Copy link
Author

talg commented Oct 7, 2022

Can you rebase now that the bulk memory change has landed?

Sure.

@talg
Copy link
Author

talg commented Oct 7, 2022

@sbc100 sorry about the delayed response, I was filtering my git notifications into a lower priority folder because of a time in the past when there was a lot of traffic.

I have responded to all of @keithw's comments, except for the need for a test suite. I'm a little unsure of where to put this, since its a little bit out of step with the rest of the wabt tests. The wasm-c-api test script looks good, but those tests live in a third_part repo. If you have thoughts on where you would be happy with a directory of .c tests, that would be helpful, or I can put them in a submodule, whatever seems right.

I think aside from this, adding some security checks to uvwasi-rt.c, and updating the README is all I have left on my list.

Copy link
Member

@keithw keithw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could hello.wasm be included as a wat file (instead of checking the .wasm into Git)?

@@ -0,0 +1,607 @@
;;; TOOL: run-spec-wasm2c
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably a Git issue (recreating the old-spec/bulk-memory-operations/imports.txt file along with the other three files in this directory). It might be best to rebase your commits on top of the main branch (rather than using a merge commit), which will make the GitHub "Files changed" view a lot simpler.

@keithw
Copy link
Member

keithw commented Oct 7, 2022

On the testsuite, I think anything that gives confidence that the code works as intended (and will be robust to well-meaning but unintentionally buggy refactors) sounds fine to me. Maybe some options are:

  • a bunch of end-to-end tests where each one contains

    1. A "starting contents" of a directory
    2. A .wat file for a Wasm module that's a WASI command
    3. An intended "ending contents" of the directory

    and the test jig verifies that if you run the WASI command in the starting directory, you end up with the intended ending contents. I guess this only really works for testing calls that manipulate the filesystem.

  • tests against a mock/stub version of uvwasi, where the test contains:

    1. A .wat file for a Wasm module that's a WASI command
    2. Some sort of test script that defines what uvwasi calls should be getting called, and what their return values are supposed to be

    and the test jig verifies that if you run the WASI command, you get the correct sequence of calls to the mock uvwasi (the real uvwasi wouldn't be used here).

Probably the tests should include some "evil" WASI commands that try to put naughty parameters in (to make sure these are correctly caught).

@talg
Copy link
Author

talg commented Oct 8, 2022

On the testsuite, I think anything that gives confidence that the code works as intended (and will be robust to well-meaning but unintentionally buggy refactors) sounds fine to me. Maybe some options are:

  • a bunch of end-to-end tests where each one contains

    1. A "starting contents" of a directory
    2. A .wat file for a Wasm module that's a WASI command
    3. An intended "ending contents" of the directory

    and the test jig verifies that if you run the WASI command in the starting directory, you end up with the intended ending contents. I guess this only really works for testing calls that manipulate the filesystem.

  • tests against a mock/stub version of uvwasi, where the test contains:

    1. A .wat file for a Wasm module that's a WASI command
    2. Some sort of test script that defines what uvwasi calls should be getting called, and what their return values are supposed to be

    and the test jig verifies that if you run the WASI command, you get the correct sequence of calls to the mock uvwasi (the real uvwasi wouldn't be used here).

Probably the tests should include some "evil" WASI commands that try to put naughty parameters in (to make sure these are correctly caught).

If we could avoid using wat, and instead just have a C test suite that would be much less work for me to write, and
for others to maintain. My thought was just to adapt tests from the node test suite (to remove node depedencies).

https://github.com/nodejs/node/tree/main/test/wasi/c

Each test could be stand-alone, consisting of a set of calls and some assertions
to test that things are correct. The test runner could be a small variation on what
wabt/test/run-c-api-examples.py does.

For the purposes of testing with malicous parameters, we could certain add tests
that call wasi api's directly.

@keithw
Copy link
Member

keithw commented Oct 8, 2022

That sounds fine to me -- I guess the plan would be that the testsuite first compiles all these .c files to .wasm by using emscripten? If there's a pre-existing collection of WASI tests in C, I agree that's way better than writing them anew.

I don't think it's a problem to put these tests in this repo in some subdir (maybe test/wasm2c/wasi ?). Then I guess it's a question of making the appropriate edits to run-tests.py to add a new "tool" (or maybe just a new argument to run-spec-wasm2c.py ?) that compiles the C to Wasm, runs wasm2c, links against uvwasi-rt.o, runs the program, and verifies that the stdout is as expected and the program exits successfully. Which is already basically what run-spec-wasm2c.py does today, except for the C->Wasm compilation and linking against unwasi-rt.o...

(Separately, somehow we do have to deal with the build failures in CI to get this merged...)

@talg
Copy link
Author

talg commented Oct 8, 2022

That sounds fine to me -- I guess the plan would be that the testsuite first compiles all these .c files to .wasm by using emscripten? If there's a pre-existing collection of WASI tests in C, I agree that's way better than writing them anew.

Cool, sounds good. I think just using the clang/llvm compiler that comes as part of the wasi-sdk will be simpler. It seems like adding it as a dependency to the README would be the simplest way to include it.

I don't think it's a problem to put these tests in this repo in some subdir (maybe test/wasm2c/wasi ?).

Cool, will do.

Then I guess it's a question of making the appropriate edits to run-tests.py to add a new "tool" (or maybe just a new >argument to run-spec-wasm2c.py ?) that compiles the C to Wasm, runs wasm2c, links against uvwasi-rt.o, runs the program, and verifies that the stdout is as expected and the program exits successfully. Which is already basically what run-spec-wasm2c.py does today, except for the C->Wasm compilation and linking against unwasi-rt.o...

I was thinking to just copy and lightly modify wabt/test/run-c-api-examples.py e.g. run-uvwasi-tests.py,
and adding another custom build target to CMakeLists.txt just like the run-c-api-tests. The only difference being that
we can have it run a make file to build the tests before it runs them.

(Separately, somehow we do have to deal with the build failures in CI to get this merged...)

For sure.

@keithw
Copy link
Member

keithw commented Oct 8, 2022

wasi-sdk is great too, but, where do we pull that in from for the CI tests? (With emscripten I know there's an emsdk:latest Docker repo that wabt already uses...)

@talg
Copy link
Author

talg commented Oct 8, 2022

That sounds fine to me -- I guess the plan would be that the testsuite first compiles all these .c files to .wasm by using >emscripten? If there's a pre-existing collection of WASI tests in C, I agree that's way better than writing them anew.

Cool, I think I would just stick with the clang that comes with the wasi-sdk. No need to get involved with emscripten.

I don't think it's a problem to put these tests in this repo in some subdir (maybe test/wasm2c/wasi ?). Then I guess it's a question of making the appropriate edits to run-tests.py to add a new "tool" (and maybe a new argument to run-spec-wasm2c.py) that compiles the C to Wasm, runs wasm2c, links against uvwasi-rt.o, runs the program, and verifies that the output is as expected. (Which is already quite similar to what run-spec-wasm2c.py does except for the C->Wasm compilation and linking against unwasi-rt.o.)

(Separately, somehow we do have to deal with the Windows build failures in CI to get this merged...)

wasi-sdk is great too, but, where do we pull that in from for the CI tests? (With emscripten I know there's an emsdk:latest Docker repo that wabt already uses...)

https://github.com/WebAssembly/wasi-sdk/releases

@sbc100
Copy link
Member

sbc100 commented Oct 8, 2022

I'd rather not depend on emscripten or wasi-sdk here.

I think having them checked in here as wat files makes more sense, perhaps with script that can be used to update them using wasi-sdk?

Also, I don't think its important to run the entire wasi test suite, at least not initially. Just one or two basic tests to start with I think should be fine. Once we are happy with the approach we can consider expanding to a larger wasi test suite. When we do this we should co-ordinate the wasm-interp which can also run wasi and it would be good to test that method using the same test suite.

@vshymanskyy
Copy link

Does the wasm2native author have a view on us effectively upstreaming this?

Hey! I'm not opposed to this at all. Of course a proper attribution should be included.

Let me know if you have any questions!

@keithw keithw mentioned this pull request Oct 12, 2022
12 tasks
@talg
Copy link
Author

talg commented Oct 28, 2022

I'd rather not depend on emscripten or wasi-sdk here.

I think having them checked in here as wat files makes more sense, perhaps with script that can be used to update them using wasi-sdk?

Also, I don't think its important to run the entire wasi test suite, at least not initially. Just one or two basic tests to start with I think should be fine. Once we are happy with the approach we can consider expanding to a larger wasi test suite. When we do this we should co-ordinate the wasm-interp which can also run wasi and it would be good to test that method using the same test suite.

This all sounds good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants