-
Notifications
You must be signed in to change notification settings - Fork 9
Inconsistent datatype representation causes segfault #353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This is a scary bug! But not entirely surprising, since we have this long-standing issue of how the choice of "optimized" representations for datatype constructors (or, more precisely, datatype "variants" like the nil and cons variants for lists) and how that choice is limited by type abstraction of various kinds. This is actually a fairly deep and delicate problem. The underlying polymorphic type system is build on the assumption that all types are the "same", i.e. that all values have the same shape and size. This allows us to use the same algorithm for, say, list reverse or list append without taking the list element type into account. In practice, this is based on the uniform representation assumption that all values are represented physically by a "word" (e.g., 32 or 64 bits, depending on the computer architecture, or specifically the size of general-purpose registers. Then, naively, a discriminated or tagged union (like list = nil | cons) is represented by "tagged" values, like (0,x), (1,y). But it turns out that there is a way of distinguish values based not on the bits themselves, but on how they are interpreted in a particular context. This distinction is between words used as pointers, and words used as numbers or bytes or bit sequences. This distinction is normally important for garbage collection, so in languages like ML (or Lisp, or Pop-2, or ...) we make the distinction between pointers and non-pointers physically manifest by using even numbers as pointers and odd numbers as non-pointers (numbers, etc. where n represents n div 2). Then we have a way of distinguishing values that can be exploited to simplify or streamline the representations of datatype values: E.g., nil = 0, a cons cell is a pointer (to a pair)) -- this means that "tagging" is no longer necessary for lists, for example. One can get fancier and generalize this idea, as Luca Cardelli explored in the early 1980s in his implementation of "Cardelli ML". The problem is the presence of "abstract" types whose representation is unknown limits the type representation information available for calculating these optimized variant or constructor representations, and things get tricky. Hence lurking bugs like the one. I have some further thoughts on the general issue, but I won't go into them here. |
I wonder if this is a regression, since I thought that handling this issue was one of the things that FLINT uses runtime type passing for? |
It could be a regression, but since none of us understands the FLINT type passing functionality in any detail, it may require some digging and code revision to find and fix the problem. I have never learned or touched that code, but I have revised some other FLINT code that might be connected somehow. It is still the case that much of FLINT remains a "black hole" in the implementation. It needs to be re-implemented (renewed) or replaced, and I would start with a greatly simplified version of the FLINT types. We also need to think more deeply about the general problem of data constructor representation and type abstraction Might make a good Master's thesis topic. |
Version
110.99.7.1 (Latest)
Operating System
OS Version
No response
Processor
System Component
Core system
Severity
Major
Description
The CPS IR generated for
toT
is the following:It assumes that
A
is represented by an integer andB
is a box storingint t
.If a structure defines
int t
to be a boxed type, theB
constructor is represented directly by theint t
box --- instead of a box storingint t
--- causing inconsistency.Discovered in conversation with @Skyb0rg007
Transcript
Expected Behavior
No response
Steps to Reproduce
Additional Information
This bug seems to relate to the fact that
t
is a type constructor with type variables. Changing the signature fromtype 'a t
totype t
makes the compiler generate the correct code.Email address
byronzhong@cs.uchicago.edu
The text was updated successfully, but these errors were encountered: