-
-
Notifications
You must be signed in to change notification settings - Fork 219
WebAssembly support #503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hi @pdubroy, Overall this plan looks great to me, and reflects much of what we discussed yesterday. (1) Re: the goals,
I'm going to push for this one to go from a maybe to a definitely. For compatibility with existing grammars, it would then be important to implement the whitespace-skipping for syntactic rules and grammar inheritance. (I don't expect either of these things to be tricky, it's just more work that would need to get done.) (2)
Not necessarily. It's common for Smalltalk VMs to implement polymorphic inline caches with a small list of (class, method implentation) tuples per call site. This list is traversed linearly: Is the receiver an instance of this class? If so, then that's the implementation of the method. If not, is the receiver an instance of the second class? (And so on.) I remember Eliot Miranda pointing out that this was way faster than using a hash map. Assuming only a small number of rules is applied at any particular position, this scheme could work very well. (3)
Like I said yesterday, I'm all for getting nixing indirect left recursion -- it adds a lot of complexity to the implementation and on the user's side, it's difficult to think about and rarely necessary. I also think that we could do a better job of designing an API for writing semantic actions now than we did back in the day. (We've had lots of experience using it in Typescript, and it's easy to see several opportunities for improvement.) But this doesn't have to be a breaking change -- for a while, we could support both the old style and a better alternative. The old stuff could be deprecated and eventually go away. Excited about this! -- Alex |
Right! You mentioned this yesterday. You're right, there may be simpler options too. Although a true inline cache unfortunately can't be done in Wasm, because the code isn't writeable. But still, you're right that we might not even need a hash table. |
A progress update — The following test is now passing: test('basic wasm support', async t => {
const g = ohm.grammar(`
G {
}
`);
const matcher = await WasmMatcher.forGrammar(g);
matcher.setInput('1');
t.is(matcher.match(), 1);
}); This is really the most minimal thing possible, but required getting a lot of the basic machinery in place:
Next:
|
More progress:
|
More progress — I have choice, rule application, and sequence all working correctly. The following test now passes: test('wasm: rule application', async t => {
const g = ohm.grammar(`
G {
start = one two -- x
| three
two = "II" | "2"
three = "3"
}
`);
const matcher = await WasmMatcher.forGrammar(g);
t.is(matchWithInput(matcher, '12'), 1);
t.is(matchWithInput(matcher, '1II'), 1);
t.is(matchWithInput(matcher, '3'), 1);
t.is(matchWithInput(matcher, '13'), 0);
}); I also implemented a few things to make the generated code easier to debug. I'm currently inlining the code for all pexprs into a single function for the entire rule body, and I now have the ability to generate labels (using dummy My goal next week is to get parity with the simple PEG language that we implemented for our SLE '17 paper. That will enable a couple of things:
And @alexwarth suggested that we should also try it without inlining, i.e. a separate function for each kind of |
Initial support for WebAssembly, as describe in #503. Adds a new package which exposes a `WasmMatch` class. Internally, `WasmMatcher` compiles the grammar to a WebAssembly module that exports a `match` method. Working so far: - basic matching use mostly pure-PEG features (no left recursion, parameterized rules, etc.) - rudimentary CST with a few tests Not yet: - memoization - support for Ohm-specific features - clean interface for writing semantic actions
Made some good progress this week! Everything is now on main, in a separate wasm package. We can now compile real-world grammars to WebAssembly. There's also rudimentary support for building a very minimal CST, and a JavaScript helper that can walk the CST. We don't yet record rule names though, so it's not yet possible to write proper semantic actions. There's also no memoization, automatic space skipping, or any Ohm-specific features. Next week I plan to implement some kind of memoization, and implement basic support for true semantic actions. Edit: I landed a modified version of the ES5 grammar from our SLE'17 paper. It currently fails to parse a trivial example, but I will look into that soon. |
Progress this week:
|
Uh oh!
There was an error while loading. Please reload this page.
Over the years we've had a few requests for a way to use Ohm from other languages — e.g. #348 mentions Dart, Rust, and Swift. Go and Python have also come up in private discussions. This proposal discusses one way we could support that: by implementing the equivalent of the
Grammar.match
in WebAssembly.Goals
The primary goal is to be able to produce a WebAssembly module (
.wasm
) from an existing Ohm grammar. The Wasm module would export amatch
function, which could be used just like the existingGrammar.match
method.At minimum, we also need an API (or ABI) for walking the parse tree.
If possible, we'd also like to go a bit further:
Non-goals
We won't support the full Ohm API from WebAssembly (
trace
, etc.) We assume that it's acceptable to run JavaScript code to produce the WebAssembly module.Initial Plan
MVP:
and Go.no_std
.Post MVP:
trace
?Beyond:
Open questions
no_std
?The text was updated successfully, but these errors were encountered: