8000 Add parsing of decimal constants (e.g., 1.02e+01) by proux01 · Pull Request #8764 · rocq-prover/rocq · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add parsing of decimal constants (e.g., 1.02e+01) #8764

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Apr 4, 2019

Conversation

proux01
Copy link
Contributor
@proux01 proux01 commented Oct 18, 2018

This adds parsing of decimal constant like 1.02e+01 (= 10.2). More precisely, it accepts inputs following the regexp [0-9]* [.]? [0-9]+ ([eE][+-]?[0-9]+)?. Parsers are provided for Reals and Q. This may look like a crazy useless feature but we'll sorely need it in a future PR adding primitive floating point computation (c.f. #8276 with @erikmd).

Note that we request at least one digit after a potential dot to avoid things like Check 42. to be parsed as "Check" "42." instead of "Check" "42" ".".

Note : the 'is_number : string -> bool' function was somewhat duplicated in the code, it is now factored in 'lib/util.ml', not sure it's the best place though.

Note : Should IZR 10200 / IZR (Z.pow_pos 10 3) (resp. Qmake 10200 1000) be uninterpreted as 10200e-3 or 102e-1 or 1.02e1?

Todo:

Depends on: #8720 (merged)
Depends on: #9815 (merged)

Overlays:

@coqbot coqbot added the needs: rebase Should be rebased on the latest master to solve conflicts or have a newer CI run. label Oct 18, 2018
@Zimmi48 Zimmi48 removed the request for review from a team October 18, 2018 15:19
@Zimmi48 Zimmi48 added kind: feature New user-facing feature request or implementation. needs: merge of dependency This PR depends on another PR being merged first. labels Oct 18, 2018
Copy link
Member
@herbelin herbelin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good to me. Thanks a lot for this contribution.

See comments in the code. Otherwise, here are some answers and general comments.

Note : the 'is_number : string -> bool' function was somewhat duplicated in the code, it is now factored in 'lib/util.ml', not sure it's the best place though.

No opinion, maybe @ejgallego has a suggestion.

Note : Should IZR 10200 / IZR (Z.pow_pos 10 3) (resp. Qmake 10200 1000) be uninterpreted as 10200e-3 or 102e-1 or 1.02e1?

As a naive user, I would probably expect 10.2 if in a type with a canonical form. If in Q, without normalization, I guess I would expect to see 10200/1000. And for IZR 10200 / IZR (Z.pow_pos 10 3), I would expect, I believe,10200 / 10^3.

Note that we request at least one digit after a potential dot to avoid things like Check 42. to be parsed as "Check" "42." instead of "Check" "42" ".".

Yes, but this shall be mainly a lexer issue, I guess.

As far as I understand, this is only the interpretation infrastructure. It would remain to add the lexer, as well as providing effective primitive interpreters and/or extending the user-defined numeral notations with floating point numbers, right?

More generally, this raises the question of other kinds of printers. E.g. parsing 0xFF00. Also, aren't there other standard than using e for floating numbers. The e looks pretty programming-language (confusing) jargon.

Copy link
Member
@ejgallego ejgallego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, that seems like a big patch for a fairly small feature. I wonder if the lexer shouldn't just process the literals away.

@proux01 proux01 force-pushed the master-parsing-decimal branch from 8aaa6f1 to b8aa667 Compare October 22, 2018 09:07
@proux01
Copy link
Contributor Author
proux01 commented Oct 22, 2018

Thanks for the comments.

If in Q, without normalization, I guess I would expect to see 10200/1000. And for IZR 10200 / IZR (Z.pow_pos 10 3), I would expect, I believe,10200 / 10^3.

The current behavior in this PR is 10200e-3 in both cases. So maybe we should just not register any uninterpreter to keep 10200 / 1000 and 10200 / 10^3.

As far as I understand, this is only the interpretation infrastructure. It would remain to add the lexer, as well as providing effective primitive interpreters and/or extending the user-defined numeral notations with floating point numbers, right?

Exactly, nothing specifically related to floating-point numbers in this PR.

More generally, this raises the question of other kinds of printers. E.g. parsing 0xFF00.

I hope this would be rather easy to add after this PR. As far as I can see, this would require :

  • modifying the lexer parsing/cLexer.ml4
  • modifying the is_numeral and numeral_of_string functions (in lib/util)
  • adding a Coq hexadecimal type like in `theories/Init/Decimal.v``
  • adding translations between string and Coq in interp/notation.ml
  • some bookkeeping in plugins/syntax/numeral.ml

N.B.: If we go one day in this direction, in many languages, a leading 0 is used for octal numbers (e.g., 020 means 2*8+0 = 16). In Coq, leading 0s are simply ignored. Should we forbid them to avoid any mistake?

Last question: in many languages, underscores are accepted as comments in number literals. 1_000_000 is indeed more readable than 1000000. Should we add this feature?

Also, aren't there other standard than using e for floating numbers. The e looks pretty programming-language (confusing) jargon

This sounds standard to both computer scientists (various programming languages) and control theorists (Matlab) around me. I'm not aware of other notations (aside from 0x1p+0 for radix 2). Maybe @silene would have other ideas?

@ejgallego: the patch is unfortunately relatively large. @herbelin's comments enabled to simplify it a bit. I also cut it in multiple commits to make it easier to review. I would definitively be happy to simplify it further but I do not fully grasp your idea. Could you please elaborate on what you mean by "I wonder if the lexer shouldn't just process the literals away."?

@silene
Copy link
Contributor
silene commented Oct 22, 2018

Maybe @silene would have other ideas?

Not really. The e notation has been in use for more than 60 years and I guess it has become a de facto standard. The only modern tool I can think of that does not use it is Mathematica, which instead uses 123*^-1, because 123e-1 is parsed as 123 * e - 1.

@proux01
Copy link
Contributor Author
proux01 commented Oct 22, 2018

@herbelin all your comments are addressed in the latest rebase. More specifically:

  • the half-way refinement of Numeral has been dropped and the decomposition is now entirely done in coqdecimal_of_rawnum
  • no new token, the "INT" token is reused, and renamed "NUMERAL" to avoid any confusion
  • no space after "-" sign
  • type sign = bool has been replaced by type sign = SPlus | SMinus

@proux01 proux01 force-pushed the master-parsing-decimal branch from b8aa667 to c6c6fa0 Compare October 30, 2018 15:07
@coqbot coqbot removed the needs: rebase Should be rebased on the latest master to solve conflicts or have a newer CI run. label Oct 30, 2018
@proux01 proux01 force-pushed the master-parsing-decimal branch from c6c6fa0 to 4b1dd5a Compare January 26, 2019 17:35
@proux01 proux01 requested review from a team as code owners January 26, 2019 17:35
@ggonthier
Copy link
Contributor

It would be preferable to also require a digit before the decimal point, to avoid conflict with the .1 and .2 symbols used for pair projectors.

@proux01
Copy link
Contributor Author
proux01 commented Jan 27, 2019

That was indeed my first thought, however it appeared a non issue in our developments mixing reals and pair projectors. Notations such as p .1 and p .2 are interpreted before numeral notations and everything works fine. Mathcomp perfectly compiles and one can even write:

Check ((2.54, .2).1)%R.
(254e-2, 2e-1).1 : R

@proux01 proux01 force-pushed 8000 the master-parsing-decimal branch from 4b1dd5a to 5fa2d0b Compare January 27, 2019 11:55
@proux01 proux01 requested a review from a team as a code owner January 27, 2019 11:55
@ggonthier
Copy link
Contributor

That was indeed my first thought, however it appeared a non issue in our developments mixing reals and pair projectors. Notations such as p .1 and p .2 are interpreted before numeral notations and everything works fine. Mathcomp perfectly compiles and one can even write:

Check ((2.54, .2).1)%R.
(254e-2, 2e-1).1 : R

Certainly, but it's the other way around: when using mathcomp one won't be able to use .1 to denote a decimal (this will likely cause an obscure typing error), and, depending on how the lexer has been modified, .1414 might also be broken, while .5 and .33 will continue to work. So to provide consistent behavior the ssreflect plugin would have to modify the lexer to enforce a leading digit before decimals, and the manual would have to document this restriction. The ssreflect maintainers would probably prefer this to be the default behavior.

@proux01
Copy link
Contributor Author
proux01 commented Jan 28, 2019

Thanks for the clarification! Indeed, there is one case that could lead to confusing type errors for the user, when she writes t .1 expecting it to be parsed as t applied to 0.1 whereas it would be parsed as fst t. This can be fixed as either t 0.1 or t .10 or t (.1).

Anyway, I understand the consistency point and if there is a consensus towards requiring at least one digit before the dot, I'd be happy to modify the current PR.

@maximedenes
Copy link
8000
Member

Anyway, I understand the consistency point and if there is a consensus towards requiring at least one digit before the dot, I'd be happy to modify the current PR.

I think it would be good indeed.

@fajb
Copy link
Contributor
fajb commented Jan 28, 2019

Note : Should IZR 10200 / IZR (Z.pow_pos 10 3) (resp. Qmake 10200 1000) be uninterpreted as 10200e-3 or 102e-1 or 1.02e1?

I jump to the bandwagon...
The current encoding (using Z.pow_pos) is not ring/micromega friendly.
Wouldn't it be possible to use the power function of the ring?
For instance,
Goal (254e2 = 254 * 10^2)%R.
Proof.
Fail ring.
Fail lra.

Copy link
Member
@gares gares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an expert here but looks good to me

@proux01
Copy link
Contributor Author
proux01 commented Apr 2, 2019
6D40
  • @proux01 please add a link to the overlays Pull Request in the description.

Done

@ejgallego
Copy link
Member

I say let's merge this on Friday, would be great if @ppedrot @herbelin could comment or request time for review of course.

IMHO all the features of this kind should be considered "experimental " for the release they are introduced, meaning that users don't have 100% guarantee on stability in case we want to perform some refinement later on.

I suggest indeed users of this feature treat it as experimental.

@ejgallego
Copy link
Member
ejgallego commented Apr 4, 2019

Ok, I'm about to merge, I have a last question @proux01 about the changes in CoRN. Why was the overlay needed, should we add some more migration tips to the CHANGES file?

IMHO if this change is going to break user developments in any way we should try to document the change of semantics better, the current text seems a bit too concise.

@proux01 proux01 force-pushed the master-parsing-decimal branch 2 times, most recently from 7ed0849 to 702c65b Compare April 4, 2019 18:24
@proux01
Copy link
Contributor Author
proux01 commented Apr 4, 2019

I have a last question @proux01 about the changes in CoRN. Why was the overlay needed,

CORN did introduce custom notations on the numeral tokens 2, 3 and 4 which were conflicting with the newly introduced one in Q.

should we add some more migration tips to the CHANGES file?

That seems like a rare case (only happened in CORN in the CI) but you're right, I just updated CHANGES.md with that point and the experimental status you suggested in your previous comment.

Let me know in case a rebase would be needed.

@ejgallego
Copy link
Member

Thanks @proux01 , I will merge as soon as the CI passes. I hope we don't need a rebased.

proux01 added a commit to proux01/ltac2 that referenced this pull request Apr 4, 2019
@proux01 proux01 force-pushed the master-parsing-decimal branch from 702c65b to 6af420b Compare April 4, 2019 21:51
@ejgallego ejgallego merged commit 6af420b into rocq-prover:master Apr 4, 2019
ejgallego added a commit that referenced this pull request Apr 4, 2019
Reviewed-by: Zimmi48
Reviewed-by: ejgallego
Ack-by: gares
Ack-by: herbelin
Ack-by: ppedrot
Ack-by: proux01
spitters added a commit to rocq-community/corn that referenced this pull request Apr 5, 2019
ppedrot added a commit to rocq-prover/ltac2 that referenced this pull request Apr 5, 2019
proux01 added a commit to proux01/QuickChick that referenced this pull request Apr 5, 2019
maximedenes added a commit to rocq-archive/stdlib2 that referenced this pull request Apr 5, 2019
@proux01
Copy link
Contributor Author
proux01 commented Apr 5, 2019

All overlays are merged so we are all set.

Thanks @ejgallego and thanks to everyone who reviewed or contributed in the discussion!

@proux01 proux01 deleted the master-parsing-decimal branch April 5, 2019 09:34
vbgl added a commit to vbgl/coq that referenced this pull request Apr 15, 2019
vbgl added a commit to vbgl/coq that referenced this pull request Apr 15, 2019
vbgl added a commit to vbgl/coq that referenced this pull request Apr 16, 2019
maximedenes pushed a commit to maximedenes/coq that referenced this pull request May 6, 2019
maximedenes pushed a commit to maximedenes/coq that referenced this pull request May 6, 2019
proux01 added a commit to proux01/rocq that referenced this pull request Jan 28, 2020
'e' was not displayed when printing decimal notations in R :

Require Import Reals.
Check (1.23e1, 32e+1, 0.1)%R.

was giving

< (123-1%R, 321%R, 1-1%R)

instead of

< (123e-1%R, 32e1%R, 1e-1%R)

This was introduced in rocq-prover#8764 (in Coq 8.10).
proux01 added a commit to proux01/rocq that referenced this pull request Jan 28, 2020
'e' was not displayed when printing decimal notations in R :

Require Import Reals.
Check (1.23e1, 32e+1, 0.1)%R.

was giving

< (123-1%R, 321%R, 1-1%R)

instead of

< (123e-1%R, 32e1%R, 1e-1%R)

This was introduced in rocq-prover#8764 (in Coq 8.10).
ppedrot pushed a commit to ppedrot/coq that referenced this pull request Jan 29, 2020
'e' was not displayed when printing decimal notations in R :

Require Import Reals.
Check (1.23e1, 32e+1, 0.1)%R.

was giving

< (123-1%R, 321%R, 1-1%R)

instead of

< (123e-1%R, 32e1%R, 1e-1%R)

This was introduced in rocq-prover#8764 (in Coq 8.10).

(cherry picked from commit 85b3be5)
@proux01 proux01 mentioned this pull request Feb 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: feature New user-facing feature request or implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0