Add parsing of decimal constants (e.g., 1.02e+01) #8764

proux01 · 2018-10-18T14:09:08Z

This adds parsing of decimal constant like 1.02e+01 (= 10.2). More precisely, it accepts inputs following the regexp [0-9]* [.]? [0-9]+ ([eE][+-]?[0-9]+)?. Parsers are provided for Reals and Q. This may look like a crazy useless feature but we'll sorely need it in a future PR adding primitive floating point computation (c.f. #8276 with @erikmd).

Note that we request at least one digit after a potential dot to avoid things like Check 42. to be parsed as "Check" "42." instead of "Check" "42" ".".

Note : the 'is_number : string -> bool' function was somewhat duplicated in the code, it is now factored in 'lib/util.ml', not sure it's the best place though.

Note : Should IZR 10200 / IZR (Z.pow_pos 10 3) (resp. Qmake 10200 1000) be uninterpreted as 10200e-3 or 102e-1 or 1.02e1?

Todo:

separate numerals and notations? (c.f. Add parsing of decimal constants (e.g., 1.02e+01) #8764 (comment))
accept _ as comment in numerals ?
add hexadecimal/octal ? (in another PR)
check that change under notations works (c.f. Add parsing of decimal constants (e.g., 1.02e+01) #8764 (comment))
require at least one digit before dot to avoid conflict with pair projectors .1 and .2
use a ring/micromega friendly encoding
update documentation
add overlays

~~Depends on: #8720~~ (merged)
~~Depends on: #9815~~ (merged)

Overlays:

herbelin

Looks very good to me. Thanks a lot for this contribution.

See comments in the code. Otherwise, here are some answers and general comments.

Note : the 'is_number : string -> bool' function was somewhat duplicated in the code, it is now factored in 'lib/util.ml', not sure it's the best place though.

No opinion, maybe @ejgallego has a suggestion.

Note : Should IZR 10200 / IZR (Z.pow_pos 10 3) (resp. Qmake 10200 1000) be uninterpreted as 10200e-3 or 102e-1 or 1.02e1?

As a naive user, I would probably expect 10.2 if in a type with a canonical form. If in Q, without normalization, I guess I would expect to see 10200/1000. And for IZR 10200 / IZR (Z.pow_pos 10 3), I would expect, I believe,10200 / 10^3.

Note that we request at least one digit after a potential dot to avoid things like Check 42. to be parsed as "Check" "42." instead of "Check" "42" ".".

Yes, but this shall be mainly a lexer issue, I guess.

As far as I understand, this is only the interpretation infrastructure. It would remain to add the lexer, as well as providing effective primitive interpreters and/or extending the user-defined numeral notations with floating point numbers, right?

More generally, this raises the question of other kinds of printers. E.g. parsing 0xFF00. Also, aren't there other standard than using e for floating numbers. The e looks pretty programming-language (confusing) jargon.

interp/constrexpr_ops.ml

interp/constrexpr_ops.mli

coqpp/coqpp_main.ml

interp/constrexpr.ml

interp/constrexpr_ops.ml

interp/constrexpr.ml

interp/constrexpr_ops.ml

interp/constrexpr.ml

ejgallego

Umm, that seems like a big patch for a fairly small feature. I wonder if the lexer shouldn't just process the literals away.

proux01 · 2018-10-22T09:16:24Z

Thanks for the comments.

If in Q, without normalization, I guess I would expect to see 10200/1000. And for IZR 10200 / IZR (Z.pow_pos 10 3), I would expect, I believe,10200 / 10^3.

The current behavior in this PR is 10200e-3 in both cases. So maybe we should just not register any uninterpreter to keep 10200 / 1000 and 10200 / 10^3.

As far as I understand, this is only the interpretation infrastructure. It would remain to add the lexer, as well as providing effective primitive interpreters and/or extending the user-defined numeral notations with floating point numbers, right?

Exactly, nothing specifically related to floating-point numbers in this PR.

More generally, this raises the question of other kinds of printers. E.g. parsing 0xFF00.

I hope this would be rather easy to add after this PR. As far as I can see, this would require :

modifying the lexer parsing/cLexer.ml4
modifying the is_numeral and numeral_of_string functions (in lib/util)
adding a Coq hexadecimal type like in `theories/Init/Decimal.v``
adding translations between string and Coq in interp/notation.ml
some bookkeeping in plugins/syntax/numeral.ml

N.B.: If we go one day in this direction, in many languages, a leading 0 is used for octal numbers (e.g., 020 means 2*8+0 = 16). In Coq, leading 0s are simply ignored. Should we forbid them to avoid any mistake?

Last question: in many languages, underscores are accepted as comments in number literals. 1_000_000 is indeed more readable than 1000000. Should we add this feature?

Also, aren't there other standard than using e for floating numbers. The e looks pretty programming-language (confusing) jargon

This sounds standard to both computer scientists (various programming languages) and control theorists (Matlab) around me. I'm not aware of other notations (aside from 0x1p+0 for radix 2). Maybe @silene would have other ideas?

@ejgallego: the patch is unfortunately relatively large. @herbelin's comments enabled to simplify it a bit. I also cut it in multiple commits to make it easier to review. I would definitively be happy to simplify it further but I do not fully grasp your idea. Could you please elaborate on what you mean by "I wonder if the lexer shouldn't just process the literals away."?

silene · 2018-10-22T10:40:19Z

Maybe @silene would have other ideas?

Not really. The e notation has been in use for more than 60 years and I guess it has become a de facto standard. The only modern tool I can think of that does not use it is Mathematica, which instead uses 123*^-1, because 123e-1 is parsed as 123 * e - 1.

proux01 · 2018-10-22T11:30:53Z

@herbelin all your comments are addressed in the latest rebase. More specifically:

the half-way refinement of Numeral has been dropped and the decomposition is now entirely done in coqdecimal_of_rawnum
no new token, the "INT" token is reused, and renamed "NUMERAL" to avoid any confusion
no space after "-" sign
type sign = bool has been replaced by type sign = SPlus | SMinus

ggonthier · 2019-01-27T00:22:22Z

It would be preferable to also require a digit before the decimal point, to avoid conflict with the .1 and .2 symbols used for pair projectors.

proux01 · 2019-01-27T11:54:50Z

That was indeed my first thought, however it appeared a non issue in our developments mixing reals and pair projectors. Notations such as p .1 and p .2 are interpreted before numeral notations and everything works fine. Mathcomp perfectly compiles and one can even write:

Check ((2.54, .2).1)%R.
(254e-2, 2e-1).1 : R

parsing/tok.ml

ggonthier · 2019-01-27T18:42:26Z

That was indeed my first thought, however it appeared a non issue in our developments mixing reals and pair projectors. Notations such as p .1 and p .2 are interpreted before numeral notations and everything works fine. Mathcomp perfectly compiles and one can even write:
Check ((2.54, .2).1)%R.
(254e-2, 2e-1).1 : R

Certainly, but it's the other way around: when using mathcomp one won't be able to use .1 to denote a decimal (this will likely cause an obscure typing error), and, depending on how the lexer has been modified, .1414 might also be broken, while .5 and .33 will continue to work. So to provide consistent behavior the ssreflect plugin would have to modify the lexer to enforce a leading digit before decimals, and the manual would have to document this restriction. The ssreflect maintainers would probably prefer this to be the default behavior.

proux01 · 2019-01-28T08:47:10Z

Thanks for the clarification! Indeed, there is one case that could lead to confusing type errors for the user, when she writes t .1 expecting it to be parsed as t applied to 0.1 whereas it would be parsed as fst t. This can be fixed as either t 0.1 or t .10 or t (.1).

Anyway, I understand the consistency point and if there is a consensus towards requiring at least one digit before the dot, I'd be happy to modify the current PR.

maximedenes · 2019-01-28T08:54:38Z

Anyway, I understand the consistency point and if there is a consensus towards requiring at least one digit before the dot, I'd be happy to modify the current PR.

I think it would be good indeed.

fajb · 2019-01-28T10:45:21Z

Note : Should IZR 10200 / IZR (Z.pow_pos 10 3) (resp. Qmake 10200 1000) be uninterpreted as 10200e-3 or 102e-1 or 1.02e1?

I jump to the bandwagon...
The current encoding (using Z.pow_pos) is not ring/micromega friendly.
Wouldn't it be possible to use the power function of the ring?
For instance,
Goal (254e2 = 254 * 10^2)%R.
Proof.
Fail ring.
Fail lra.

gares

I'm not an expert here but looks good to me

proux01 · 2019-04-02T14:00:20Z

6D40

@proux01 please add a link to the overlays Pull Request in the description.

Done

ejgallego · 2019-04-03T12:50:13Z

I say let's merge this on Friday, would be great if @ppedrot @herbelin could comment or request time for review of course.

IMHO all the features of this kind should be considered "experimental " for the release they are introduced, meaning that users don't have 100% guarantee on stability in case we want to perform some refinement later on.

I suggest indeed users of this feature treat it as experimental.

ejgallego · 2019-04-04T17:48:17Z

Ok, I'm about to merge, I have a last question @proux01 about the changes in CoRN. Why was the overlay needed, should we add some more migration tips to the CHANGES file?

IMHO if this change is going to break user developments in any way we should try to document the change of semantics better, the current text seems a bit too concise.

proux01 · 2019-04-04T18:27:10Z

I have a last question @proux01 about the changes in CoRN. Why was the overlay needed,

CORN did introduce custom notations on the numeral tokens 2, 3 and 4 which were conflicting with the newly introduced one in Q.

should we add some more migration tips to the CHANGES file?

That seems like a rare case (only happened in CORN in the CI) but you're right, I just updated CHANGES.md with that point and the experimental status you suggested in your previous comment.

Let me know in case a rebase would be needed.

ejgallego · 2019-04-04T18:39:15Z

Thanks @proux01 , I will merge as soon as the CI passes. I hope we don't need a rebased.

Reviewed-by: Zimmi48 Reviewed-by: ejgallego Ack-by: gares Ack-by: herbelin Ack-by: ppedrot Ack-by: proux01

[coq] Adapt to rocq-prover/rocq#8764

proux01 · 2019-04-05T09:28:02Z

All overlays are merged so we are all set.

Thanks @ejgallego and thanks to everyone who reviewed or contributed in the discussion!

… 1.02e+01)

…decimal [coq] Adapt to rocq-prover#8764

'e' was not displayed when printing decimal notations in R : Require Import Reals. Check (1.23e1, 32e+1, 0.1)%R. was giving < (123-1%R, 321%R, 1-1%R) instead of < (123e-1%R, 32e1%R, 1e-1%R) This was introduced in rocq-prover#8764 (in Coq 8.10).

'e' was not displayed when printing decimal notations in R : Require Import Reals. Check (1.23e1, 32e+1, 0.1)%R. was giving < (123-1%R, 321%R, 1-1%R) instead of < (123e-1%R, 32e1%R, 1e-1%R) This was introduced in rocq-prover#8764 (in Coq 8.10). (cherry picked from commit 85b3be5)

proux01 requested review from ejgallego, gares, herbelin, mattam82, ppedrot and a team as code owners October 18, 2018 14:09

coqbot added the needs: rebase Should be rebased on the latest master to solve conflicts or have a newer CI run. label Oct 18, 2018

Zimmi48 removed the request for review from a team October 18, 2018 15:19

Zimmi48 added kind: feature New user-facing feature request or implementation. needs: merge of dependency This PR depends on another PR being merged first. labels Oct 18, 2018

herbelin reviewed Oct 18, 2018

View reviewed changes

ejgallego reviewed Oct 18, 2018

View reviewed changes

proux01 force-pushed the master-parsing-decimal branch from 8aaa6f1 to b8aa667 Compare October 22, 2018 09:07

proux01 force-pushed the master-parsing-decimal branch from b8aa667 to c6c6fa0 Compare October 30, 2018 15:07

coqbot removed the needs: rebase Should be rebased on the latest master to solve conflicts or have a newer CI run. label Oct 30, 2018

proux01 force-pushed the master-parsing-decimal branch from c6c6fa0 to 4b1dd5a Compare January 26, 2019 17:35

proux01 requested review from a team as code owners January 26, 2019 17:35

proux01 force-pushed 8000 the master-parsing-decimal branch from 4b1dd5a to 5fa2d0b Compare January 27, 2019 11:55

proux01 requested a review from a team as a code owner January 27, 2019 11:55

ejgallego previously requested changes Jan 27, 2019

View reviewed changes

parsing/tok.ml Outdated Show resolved Hide resolved

gares reviewed Apr 2, 2019

View reviewed changes

proux01 force-pushed the master-parsing-decimal branch 2 times, most recently from 7ed0849 to 702c65b Compare April 4, 2019 18:24

proux01 added a commit to proux01/ltac2 that referenced this pull request Apr 4, 2019

Adapt to rocq-prover/rocq#8764

a3d5bf6

Update CHANGES.md

6af420b

proux01 force-pushed the master-parsing-decimal branch from 702c65b to 6af420b Compare April 4, 2019 21:51

ejgallego merged commit 6af420b into rocq-prover:master Apr 4, 2019

ejgallego added a commit that referenced this pull request Apr 4, 2019

Merge PR #8764: Add parsing of decimal constants (e.g., 1.02e+01)

be6f3a6

Reviewed-by: Zimmi48 Reviewed-by: ejgallego Ack-by: gares Ack-by: herbelin Ack-by: ppedrot Ack-by: proux01

spitters added a commit to rocq-community/corn that referenced this pull request Apr 5, 2019

Merge pull request #72 from proux01/master-parsing-decimal

db29aed

[coq] Adapt to rocq-prover/rocq#8764

ppedrot added a commit to rocq-prover/ltac2 that referenced this pull request Apr 5, 2019

Merge pull request #116 from proux01/master-parsing-decimal

72ab28c

[coq] Adapt to rocq-prover/rocq#8764

proux01 added a commit to proux01/QuickChick that referenced this pull request Apr 5, 2019

Adapt to rocq-prover/rocq#8764

48a2697

maximedenes added a commit to rocq-archive/stdlib2 that referenced this pull request Apr 5, 2019

Merge pull request #18 from proux01/master-parsing-decimal

87767e2

[coq] Adapt to rocq-prover/rocq#8764

proux01 deleted the master-parsing-decimal branch April 5, 2019 09:34

vbgl added a commit to vbgl/coq that referenced this pull request Apr 15, 2019

Backport PR rocq-prover#8764: Add parsing of decimal constants (e.g.,…

3543604

… 1.02e+01)

vbgl added a commit to vbgl/coq that referenced this pull request Apr 15, 2019

Backport PR rocq-prover#8764: Add parsing of decimal constants (e.g.,…

bbd783f

… 1.02e+01)

vbgl added a commit to vbgl/coq that referenced this pull request Apr 16, 2019

[CI] Update for PR rocq-prover#8764

2bfbd38

maximedenes pushed a commit to maximedenes/coq that referenced this pull request May 6, 2019

Adapt to rocq-prover#8764

0d8a4ec

maximedenes pushed a commit to maximedenes/coq that referenced this pull request May 6, 2019

Merge pull request rocq-prover/ltac2#116 from proux01/master-parsing-…

d478925

…decimal [coq] Adapt to rocq-prover#8764

erikmd mentioned this pull request Jan 27, 2020

Missing "e" when printing decimal notations in R_scope using Coq 8.10 #11467

Closed

proux01 mentioned this pull request Jan 28, 2020

Fix #11467 ('e' was not displayed when printing decimal notations in R) #11472

Merged

1 task

proux01 mentioned this pull request Feb 27, 2020

Refactor lookaheads #11695

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add parsing of decimal constants (e.g., 1.02e+01) #8764

Add parsing of decimal constants (e.g., 1.02e+01) #8764

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Add parsing of decimal constants (e.g., 1.02e+01) #8764

Add parsing of decimal constants (e.g., 1.02e+01) #8764

Uh oh!

Conversation

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!