-
Notifications
You must be signed in to change notification settings - Fork 689
Add parsing of decimal constants (e.g., 1.02e+01) #8764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parsing of decimal constants (e.g., 1.02e+01) #8764
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good to me. Thanks a lot for this contribution.
See comments in the code. Otherwise, here are some answers and general comments.
Note : the 'is_number : string -> bool' function was somewhat duplicated in the code, it is now factored in 'lib/util.ml', not sure it's the best place though.
No opinion, maybe @ejgallego has a suggestion.
Note : Should
IZR 10200 / IZR (Z.pow_pos 10 3)
(resp.Qmake 10200 1000
) be uninterpreted as10200e-3
or102e-1
or1.02e1
?
As a naive user, I would probably expect 10.2
if in a type with a canonical form. If in Q
, without normalization, I guess I would expect to see 10200/1000
. And for IZR 10200 / IZR (Z.pow_pos 10 3)
, I would expect, I believe,10200 / 10^3
.
Note that we request at least one digit after a potential dot to avoid things like Check 42. to be parsed as "Check" "42." instead of "Check" "42" ".".
Yes, but this shall be mainly a lexer issue, I guess.
As far as I understand, this is only the interpretation infrastructure. It would remain to add the lexer, as well as providing effective primitive interpreters and/or extending the user-defined numeral notations with floating point numbers, right?
More generally, this raises the question of other kinds of printers. E.g. parsing 0xFF00
. Also, aren't there other standard than using e
for floating numbers. The e
looks pretty programming-language (confusing) jargon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, that seems like a big patch for a fairly small feature. I wonder if the lexer shouldn't just process the literals away.
8aaa6f1
to
b8aa667
Compare
Thanks for the comments.
The current behavior in this PR is 10200e-3 in both cases. So maybe we should just not register any uninterpreter to keep 10200 / 1000 and 10200 / 10^3.
Exactly, nothing specifically related to floating-point numbers in this PR.
I hope this would be rather easy to add after this PR. As far as I can see, this would require :
N.B.: If we go one day in this direction, in many languages, a leading 0 is used for octal numbers (e.g., 020 means 2*8+0 = 16). In Coq, leading 0s are simply ignored. Should we forbid them to avoid any mistake? Last question: in many languages, underscores are accepted as comments in number literals.
This sounds standard to both computer scientists (various programming languages) and control theorists (Matlab) around me. I'm not aware of other notations (aside from 0x1p+0 for radix 2). Maybe @silene would have other ideas? @ejgallego: the patch is unfortunately relatively large. @herbelin's comments enabled to simplify it a bit. I also cut it in multiple commits to make it easier to review. I would definitively be happy to simplify it further but I do not fully grasp your idea. Could you please elaborate on what you mean by "I wonder if the lexer shouldn't just process the literals away."? |
Not really. The |
@herbelin all your comments are addressed in the latest rebase. More specifically:
|
b8aa667
to
c6c6fa0
Compare
c6c6fa0
to
4b1dd5a
Compare
It would be preferable to also require a digit before the decimal point, to avoid conflict with the |
That was indeed my first thought, however it appeared a non issue in our developments mixing reals and pair projectors. Notations such as
|
4b1dd5a
to
5fa2d0b
Compare
Certainly, but it's the other way around: when using mathcomp one won't be able to use |
Thanks for the clarification! Indeed, there is one case that could lead to confusing type errors for the user, when she writes Anyway, I understand the consistency point and if there is a consensus towards requiring at least one digit before the dot, I'd be happy to modify the current PR. |
I think it would be good indeed. |
I jump to the bandwagon... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an expert here but looks good to me
Done |
I say let's merge this on Friday, would be great if @ppedrot @herbelin could comment or request time for review of course. IMHO all the features of this kind should be considered "experimental " for the release they are introduced, meaning that users don't have 100% guarantee on stability in case we want to perform some refinement later on. I suggest indeed users of this feature treat it as experimental. |
Ok, I'm about to merge, I have a last question @proux01 about the changes in CoRN. Why was the overlay needed, should we add some more migration tips to the CHANGES file? IMHO if this change is going to break user developments in any way we should try to document the change of semantics better, the current text seems a bit too concise. |
7ed0849
to
702c65b
Compare
CORN did introduce custom notations on the numeral tokens 2, 3 and 4 which were conflicting with the newly introduced one in Q.
That seems like a rare case (only happened in CORN in the CI) but you're right, I just updated CHANGES.md with that point and the experimental status you suggested in your previous comment. Let me know in case a rebase would be needed. |
Thanks @proux01 , I will merge as soon as the CI passes. I hope we don't need a rebased. |
702c65b
to
6af420b
Compare
Reviewed-by: Zimmi48 Reviewed-by: ejgallego Ack-by: gares Ack-by: herbelin Ack-by: ppedrot Ack-by: proux01
All overlays are merged so we are all set. Thanks @ejgallego and thanks to everyone who reviewed or contributed in the discussion! |
…decimal [coq] Adapt to rocq-prover#8764
'e' was not displayed when printing decimal notations in R : Require Import Reals. Check (1.23e1, 32e+1, 0.1)%R. was giving < (123-1%R, 321%R, 1-1%R) instead of < (123e-1%R, 32e1%R, 1e-1%R) This was introduced in rocq-prover#8764 (in Coq 8.10).
'e' was not displayed when printing decimal notations in R : Require Import Reals. Check (1.23e1, 32e+1, 0.1)%R. was giving < (123-1%R, 321%R, 1-1%R) instead of < (123e-1%R, 32e1%R, 1e-1%R) This was introduced in rocq-prover#8764 (in Coq 8.10).
'e' was not displayed when printing decimal notations in R : Require Import Reals. Check (1.23e1, 32e+1, 0.1)%R. was giving < (123-1%R, 321%R, 1-1%R) instead of < (123e-1%R, 32e1%R, 1e-1%R) This was introduced in rocq-prover#8764 (in Coq 8.10). (cherry picked from commit 85b3be5)
This adds parsing of decimal constant like
1.02e+01
(=10.2
). More precisely, it accepts inputs following the regexp[0-9]* [.]? [0-9]+ ([eE][+-]?[0-9]+)?
. Parsers are provided for Reals and Q. This may look like a crazy useless feature but we'll sorely need it in a future PR adding primitive floating point computation (c.f. #8276 with @erikmd).Note that we request at least one digit after a potential dot to avoid things like
Check 42.
to be parsed as "Check" "42." instead of "Check" "42" ".".Note : the 'is_number : string -> bool' function was somewhat duplicated in the code, it is now factored in 'lib/util.ml', not sure it's the best place though.
Note : Should
IZR 10200 / IZR (Z.pow_pos 10 3)
(resp.Qmake 10200 1000
) be uninterpreted as10200e-3
or102e-1
or1.02e1
?Todo:
_
as comment in numerals ?.1
and.2
Depends on: #8720(merged)Depends on: #9815(merged)Overlays: