-
Notifications
You must be signed in to change notification settings - Fork 66
feat: Support generic input for Parse effect #1305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
def anyOf[A, S](seq: (A < (Parse & S))*)(using Frame): A < (Parse & S) = | ||
def anyOf[A, In, S](seq: (A < (Parse[In] & S))*)(using Frame): A < (Parse[In] & S) = | ||
Choice.evalWith(seq)(identity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, clause interleaving?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add it in all the functions (with A, In, S) if we plan to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So something like
def anyOf[A](seq: (A < (Parse[In] & S))*)[In, S](using Frame): A < (Parse[In] & S)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def anyOf[A](using Frame)[In, S](seq: (A < (Parse[In] & S))*): A < (Parse[In] & S)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I should do the same for every other combinators parametized with A, In, S
such as firstOf
, inOrder
, skipUntil
...?
If so, how should we handle implicit parameters depending on In
like:
def firstOf[A, In, S](
parser1: => A < (Parse[In] & S),
parser2: => A < (Parse[In] & S)
)(using Frame, StateTag[In]): A < (Parse[In] & S)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like that?
def firstOf[A](using Frame)[In, S](
parser1: => A < (Parse[In] & S),
parser2: => A < (Parse[In] & S)
)(using StateTag[In]): A < (Parse[In] & S)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I interleaved the type parameters for other methods but firstOf
and inOrder
seem to have problem with clause interleaving:
def inOrder[A, B, C](using
Frame
)[In, S](
parser1: => A < (Parse[In] & S),
parser2: => B < (Parse[In] & S),
parser3: => C < (Parse[In] & S)
)(using
StateTag[In]
): (A, B, C) < (Parse[In] & S)
//Same for other variants...
From ParseTest.scala line 655:
"consumes input sequentially" in run {
val parser = Parse.inOrder(
Parse.literal("a").andThen(1),
Parse.literal("b").andThen(2),
Parse.literal("c").andThen(3)
)
Parse.run("abc")(parser).map { result =>
assert(result == (1, 2, 3))
}
}
Error message
[error] -- [E134] Type Error: /home/fromentin/IdeaProjects/kyo/kyo-prelude/shared/src/test/scala/kyo/ParseTest.scala:655:35
[error] 655 | val parser = Parse.inOrder(
[error] | ^^^^^^^^^^^^^
[error] |None of the overloaded alternatives of method inOrder in object Parse with types
[error] | [A, B, C, D, E, F, G, H]
[error] | (using x$1: kyo.Frame)
[error] | [In, S]
[error] | (parser1: => A < (kyo.Parse[In] & S),
[error] | parser2: => B < (kyo.Parse[In] & S),
[error] | parser3: => C < (kyo.Parse[In] & S),
[error] | parser4: => D < (kyo.Parse[In] & S),
[error] | parser5: => E < (kyo.Parse[In] & S),
[error] | parser6: => F < (kyo.Parse[In] & S),
[error] | parser7: => G < (kyo.Parse[In] & S), parser8: => H < (kyo.Parse[In] & S)
[error] | )
[error] | (using x$10: kyo.Parse.StateTag[In]): (A, B, C, D, E, F, G, H) < (
[error] | kyo.Parse[In] & S)
[error] | [A, B, C, D, E, F, G]
[error] | (using x$1: kyo.Frame)
[error] | [In, S]
[error] | (parser1: => A < (kyo.Parse[In] & S),
[error] | parser2: => B < (kyo.Parse[In] & S),
[error] | parser3: => C < (kyo.Parse[In] & S),
[error] | parser4: => D < (kyo.Parse[In] & S),
[error] | parser5: => E < (kyo.Parse[In] & S),
[error] | parser6: => F < (kyo.Parse[In] & S), parser7: => G < (kyo.Parse[In] & S)
[error] | )
[error] | (using x$9: kyo.Parse.StateTag[In]): (A, B, C, D, E, F, G) < (
[error] | kyo.Parse[In] & S)
[error] | [A, B, C, D, E, F]
[error] | (using x$1: kyo.Frame)
[error] | [In, S]
[error] | (parser1: => A < (kyo.Parse[In] & S),
[error] | parser2: => B < (kyo.Parse[In] & S),
[error] | parser3: => C < (kyo.Parse[In] & S),
[error] | parser4: => D < (kyo.Parse[In] & S),
[error] | parser5: => E < (kyo.Parse[In] & S), parser6: => F < (kyo.Parse[In] & S)
[error] | )
[error] | (using x$8: kyo.Parse.StateTag[In]): (A, B, C, D, E, F) < (kyo.Parse[In]
[error] | & S)
[error] | [A, B, C, D, E]
[error] | (using x$1: kyo.Frame)
[error] | [In, S]
[error] | (parser1: => A < (kyo.Parse[In] & S),
[error] | parser2: => B < (kyo.Parse[In] & S),
[error] | parser3: => C < (kyo.Parse[In] & S),
[error] | parser4: => D < (kyo.Parse[In] & S), parser5: => E < (kyo.Parse[In] & S)
[error] | )
[error] | (using x$7: kyo.Parse.StateTag[In]): (A, B, C, D, E) < (kyo.Parse[In] &
[error] | S)
[error] | [A, B, C, D]
[error] | (using x$1: kyo.Frame)
[error] | [In, S]
[error] | (parser1: => A < (kyo.Parse[In] & S),
[error] | parser2: => B < (kyo.Parse[In] & S),
[error] | parser3: => C < (kyo.Parse[In] & S), parser4: => D < (kyo.Parse[In] & S)
[error] | )(using x$6: kyo.Parse.StateTag[In]): (A, B, C, D) < (kyo.Parse[In] & S)
[error] | [A, B, C]
[error] | (using x$1: kyo.Frame)
[error] | [In, S]
[error] | (parser1: => A < (kyo.Parse[In] & S),
[error] | parser2: => B < (kyo.Parse[In] & S), parser3: => C < (kyo.Parse[In] & S)
[error] | )(using x$5: kyo.Parse.StateTag[In]): (A, B, C) < (kyo.Parse[In] & S)
[error] | [A, B]
[error] | (using x$1: kyo.Frame)
[error] | [In, S]
[error] | (parser1: => A < (kyo.Parse[In] & S), parser2: => B < (kyo.Parse[In] & S))
[error] | (using x$4: kyo.Parse.StateTag[In]): (A, B) < (kyo.Parse[In] & S)
[error] |match arguments ((Int < kyo.Parse[Char], Int < kyo.Parse[Char], Int < kyo.Parse[Char]))
[error] |
[error] |where: A is a type variable
[error] | B is a type variable
[error] | C is a type variable
[error] | D is a type variable
[error] | E is a type variable
[error] | F is a type variable
[error] | G is a type variable
[error] | H is a type variable
Passing the output types (A, B, C...) explicitly seems to solve the problem but still a worse type inferrence 🤷
For instance this compiles:
val parser = Parse.inOrder[Int, Int, Int](
Parse.literal("a").andThen(1),
Parse.literal("b").andThen(2),
Parse.literal("c").andThen(3)
)
Note: I did not push the change for firstOf
and inOrder
due to this issue. Should we still get along with that and just pass explicitly the type parameters in the test cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure about Chunk.show
, the rest is great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution! Almost there.
def anyOf[A, S](seq: (A < (Parse & S))*)(using Frame): A < (Parse & S) = | ||
def anyOf[A, In, S](seq: (A < (Parse[In] & S))*)(using Frame): A < (Parse[In] & S) = | ||
Choice.evalWith(seq)(identity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add it in all the functions (with A, In, S) if we plan to use it.
For some reason I cannot directly reply to the following comment so I'll do it here:
Do you mean that I can remove the comment or also remove the |
me too, weird.
both? just keep |
a couple of changes, and it's good to go! thanks a lot @Iltotore! |
That's the question actually 😅. Should we keep both or just |
just |
parser6: => F < (Parse & S), | ||
parser7: => G < (Parse & S) | ||
def inOrder[A, B, C, D, E, F, G, In, S]( | ||
parser1: => A < (Parse[In] & S), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fwbrasil do you know why parser1: => A < (Parse[In] & S)
is a by-name? (it's probably just me not getting something about laziness in the last mile)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's to defer computation in the case of where the user passes an unsuspended effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example of what happens if we remove the by-name parameters:
import kyo.*
enum Expr:
case Lit(value: Int)
case Add(left: Expr, right: Expr)
val parseLit: Expr < Parse[Char] = Parse.int.map(Expr.Lit.apply)
def parenthesized: Expr < Parse[Char] = Parse.between(Parse.char('('), parseExpr, Parse.char(')'))
def parseInvokable: Expr < Parse[Char] = Parse.firstOf(parseLit, parenthesized)
def parseAdd: Expr < Parse[Char] = Parse
.separatedBy(parseInvokable, Parse.char('+'))
.map(_.reduceLeft(Expr.Add.apply))
def parseExpr: Expr < Parse[Char] = parseAdd
Exception in thread "main" java.lang.StackOverflowError
at java.base/java.lang.invoke.DirectMethodHandle.allocateInstance(DirectMethodHandle.java:520)
at kyo.Parse$package$Parse$.literal(Parse.scala:878)
at kyo.Parse$package$Parse$.char(Parse.scala:58)
at ExprTest$package$.parenthesized(ExprTest.scala:28)
at ExprTest$package$.parseInvokable(ExprTest.scala:11)
at ExprTest$package$.parseAdd(ExprTest.scala:14)
at ExprTest$package$.parseExpr(ExprTest.scala:17)
at ExprTest$package$.parenthesized(ExprTest.scala:9)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I added the by-name params to avoid stack overflows in recursive parsers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
val readAspect: Aspect[Const[Text], [C] =>> Maybe[(Text, C)], Parse] = | ||
Aspect.init(using Frame.internal) | ||
def readAspect[A]: Aspect[Const[Chunk[A]], [C] =>> Maybe[(Chunk[A], C)], Parse[A]] = | ||
localReadAspect.asInstanceOf[Aspect[Const[Chunk[A]], [C] =>> Maybe[(Chunk[A], C)], Parse[A]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this isn't safe. If there are multiple parsers in a computation handling different input types, aspects will be applied to all regardless of their type, which can produce a class cast exception. We could make Aspect
be based on tags to distinguish aspects for different input types. A workaround might be putting the aspect in the parse State
itself but then we need to handle their removal at the end of the scope and composition of nested aspects. Can you open a separate to explore a way to encode this in Aspect
? I'm afraid we'd need to fix this to merge.
congrats on your first contribution @Iltotore! this is a good improvement and the json parser is a nice example 🎉 I think the only blocker is the |
Yes the file is getting quite large. To be honest I was surprised when I saw that Parse is in prelude. I was thinking it was in its own module like However I'm not sure what benefit we get by keeping the text-only Parse. Having discussed with @ahoy-jon , at the end of the day it would be good to have a better implementation of A good example is Chumsky in Rust. |
Yeah, I think we should move it to a new
Agreed! The way I structured the effect was also meant to validate the pattern using an opaque type that composes other effects. If I remember correctly, it was the first effect using this pattern. Ideally, we should have a proper |
Should we rename already the current |
I've just posted #1327 with a solution for parametrized generic aspects.
Do you mean move to a new module? That'd be good but it isn't a requirement to merge. |
### Problem `Aspect`s can't be used in generic contexts when their inputs or outputs depend on a parametrized type. This issue was raised as part of the work to generalize `Parse` in #1305. ### Solution Instead of relying on the object identity of the `Aspect` instance, manage `Cut` activations based on a pair of the type signature of the `Aspect` via `Tag` and its allocation site via `Frame`. ### Notes - This approach doesn't keep a strict identity as before since the `Frame` can be shared in case it's available a scope with multiple `Aspect` instantiations but that's something that should happen only in Kyo's codebase since users shouldn't manage `Frame`s. - I've removed the `default` cut to simplify the implementation since it's unused.
Co-authored-by: Jonathan Winandy <jonathan.winandy@gmail.com>
f4ec4b2
to
bb38fcd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after the Aspect
fix. A nice to have would be testing multiple parsers for different input types in the same computation
Co-authored-by: Flavio Brasil <fwbrasil@users.noreply.github.com>
Do you have any specific example in mind? |
|
||
object Parse: | ||
|
||
type StateTag[A] = Tag[Var[State[A]]] | ||
|
||
private val localReadAspect: Aspect[Const[Chunk[Any]], [C] =>> Maybe[(Chunk[?], C)], Parse[Any]] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove?
The goal would be checking the failure scenario we had in the implementation without the new tag-based |
Problem
Unlike other parsing libraries such as ZIO Parser, Kyo's
Parse
effect was only made forText
/Char
input and does not support generic inputs. A common use-case are tokens to separate parsing and lexing.Solution
Changed
ParseState
to consumeChunk[In]
whereIn
is the type of the tokens the parser consumes. TheParse
is now also parametized with the input type.Here is a working example (Scala CLI, put the files in the same directory) with a JSON parser: https://gist.github.com/Iltotore/8d4315e1c668645907f6d678c7c202b3
Notes
Parse
and some combinators now require a type parameter.readAspect
to make it work with generic input. See its definition andlocalReadAspect
's.char
being superseded by their generic counterpart (e.gliteral[In](expected: In)
), I currently kept them for backward compatibility and in case we divideParseState
into specializedTextParseState
and another one for generic chunks.ParseState
forText
or keep the current representation of this PR (chunk ofChar
)?In
be contravariant inParse
?DRAFT: Some methods and comments might need to be changed before marking this PR as ready. I'd still like to receive feedback on its current state though