RFC 0016: Instances vs Types #485
9b2950d
to
2704a9e
Compare
## Drawbacks

- This change introduces more structure for blocks, but pipelines are still confusing; see Alternatives.
- With that enforced structure, users might lose flexibility to define things where they want (as an implication).

Leading to an alternative:

- Leave it as it is and document that the `pipe` syntax means instantiation and everything else is a type definition. Decided against by applying the principle "Explicit modeling over hidden magic"; not mixing concepts makes things more explicit IMO.
Added this as drawback / alternative with the decision as described here.
Love it! I think this will immensely help object-oriented programmers understand what is happening.
What about the more verbose from-to syntax of pipeline steps?
### Proposed change

Allow **definitions** only outside of pipelines, allow block **instantiations** only inside of pipelines.
Suggested change:

Before: Allow **definitions** only outside of pipelines, allow block **instantiations** only inside of pipelines.

After: Allow **definitions** only outside of pipelines, allow block **instantiations** only inside of pipelines and composite blocks.
Or better: write one sentence at the top that a composite block is equivalent to a pipeline definition in this case.
Good point. I added a note for that into the future enhancements, I think we should clean it up with the packages RFC implementation. For this RFC, I would not change anything about that setup.
2704a9e
to
34af978
Compare
Luckily we just deleted them :D.

FYI @dirkriehle regarding our discussion on instances vs types in the recent JValue team meeting. If you have some time, I'd love to get a review from you for this RFC. Otherwise I'll bring it into the next JValue meeting.
Allow **definitions** only outside of pipelines, allow block **instantiations** only inside of pipelines.

Because pipelines get executed implicitly when executing a Jayvee model (and therefore pipelines get **instantiated** during runtime), it makes sense to bundle all other instantiations in them as well. This means everything outside of a pipeline is a **definition** (**type**), everything inside a pipeline is an **instance**.
Do we always instantiate as singletons within a pipeline?
I don't think so but I am also not sure I understand what you are asking. The interpreter instantiates a new object for every block reference in a pipeline I think.
I think it is a special kind of instantiation: you cannot create two instances of a block, right?
I don't know how to better frame it >.<
Do you mean in Jayvee or in the interpreter?
I think in Jayvee as well. Let's take this example:

    pipeline Demonstrator {
      MyCsvExtractor
        -> MyTableInterpreter
        -> MySqliteLoader;

      MyTableInterpreter
        -> MyPostgresLoader;
    }

In my understanding, one instance of MyTableInterpreter would be shared between those two branches of the graph.
Exactly, but within the scope of a pipeline it kinda is. The question is whether that is just an interpreter detail or part of the language as well.
However, that is not always true. Example:

    pipeline MySecondPipeline {
      MyExtractor1 -> MyTableInterpreter -> MyLoader;
      MyExtractor2 -> MyTableInterpreter -> MyLoader;
    }
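The two examples in this thread can be made concrete by modeling a pipeline as a graph keyed by block name. The sketch below is a hypothetical Python model, not the Jayvee interpreter; `build_graph` and its list-of-chains input format are assumptions made purely for illustration. It assumes that each distinct block name yields exactly one node.

```python
from collections import defaultdict


def build_graph(chains):
    """Build (nodes, successors) from pipe chains, one node per distinct block name."""
    successors = defaultdict(set)
    nodes = set()
    for chain in chains:
        nodes.update(chain)
        # Each adjacent pair in a chain is one pipe: source -> target.
        for source, target in zip(chain, chain[1:]):
            successors[source].add(target)
    return nodes, dict(successors)


# The Demonstrator pipeline: one chain fans out from MyTableInterpreter.
demonstrator = [
    ["MyCsvExtractor", "MyTableInterpreter", "MySqliteLoader"],
    ["MyTableInterpreter", "MyPostgresLoader"],
]

# MySecondPipeline: two extractors feed the same block names.
second = [
    ["MyExtractor1", "MyTableInterpreter", "MyLoader"],
    ["MyExtractor2", "MyTableInterpreter", "MyLoader"],
]

nodes1, edges1 = build_graph(demonstrator)
nodes2, edges2 = build_graph(second)
```

Under this name-keyed assumption, `MyTableInterpreter` is a single node in both pipelines: it has two outgoing pipes in the first and two incoming pipes in the second. Whether the second pipeline should instead create two separate instances is exactly the question this thread leaves open.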
Sorry, I think I lost your suggestion on how to change this RFC for this. I think this is an implementation detail of the interpreter so I am unsure what effect it should have here. Can you make that more explicit? 😅
Okay, my suggestion is to specify when an instance is "reused", as in the first example, and when it is not, as in the second example. However, I'm having trouble specifying this. Maybe leave it open then?
Ah, I would leave this out of the scope of this to focus on the description vs. instantiation semantics. Reusing existing instances on instantiation is a nuance that would make this too big I think?
Thanks for pointing me to this thread. My original thrust was to fix the oftype relationship, which I think remains broken, see email copy below. This proposal, if I understand it, makes matters worse because it cements the confusion. So it is a good time to have the fundamental discussion, though I don't think this is the best place for it (hence the email). Maybe it can serve as an example though.

This RFC is called Instances vs. types but it is only about block types? Also, it isn't really about introducing types but rather about where code may appear and where not? (In general reducing expressiveness is not a good thing; it is better to work from atomic principles that allow sound expressions so that expressiveness increases.)

I don't think it is unclear what an instance and what a type is in Jayvee. As explained below, we don't have explicit pipeline types, only implicit ones. This can be fixed easily by making the type explicit by allowing it to have a name. Same thing for blocks and block types I would think. Without giving them a name, we can't reuse them, so my assumption is that an anonymous type only ever has exactly one instance (within a running Jayvee instance).

My interpretation of code like "block CarsCSVExtractor oftype CSVExtractor" inside a pipeline description is that you are describing an M0 (to be created at runtime) object structure consisting of an outer pipeline object with several M0 block objects, of which one is called CarsCSVExtractor, whose type is CSVExtractor, and of which the url attribute is set to a specific value.

You write this is a definition of a data source but I don't understand this. It is a description of a block inside a pipeline; there is no explicit type definition here. You then say the block is instantiated. I'd think there is only exactly one block and it gets linked by pipes; it doesn't get instantiated multiple times.
If you are going to tell me that this is actually a type definition, I'd be even more confused: It should not relate to another type by oftype and it should not have value-to-attribute assignments then.

As to the proposed change, the confusion for me continues: I think it makes sense to have block types outside of pipelines and then instantiate the block types to blocks inside a concrete pipeline description. So is what follows after "block" outside of a pipeline actually a block type or just a block (instance)?

All of this to more urgently move block types like CSVInterpreter into a system library (M1 level) out of the language (M2 level), see below. (If it hasn't happened yet.)

Hi all,

I wanted to pick up the old topic of modeling levels and types and instances, as discussed in a recent team meeting. I saw that Philip wanted to do something here as well, but I think it is only about closing an old unfinished thread about value types? Go ahead, but I suggest we first finish the discussion triggered below.

Jayvee is just fabulous work and I'm very proud of you and what we have achieved so far. That said, I remain convinced that the overall typing / relationship structure isn't right or at least is unconventional. This goes back to me trying to explain it but gaining no ground about a year ago. We got today's structure because, in Felix's words, he wouldn't know how to implement it any other way than the way he did.

So I want to try again, if only to help us see the problems we might run into earlier. Also, with Johannes onboard, it may be helpful to rehash some concepts.

My basic point is that our modeling / language structure does not conform to how just about everyone else does it in language design, and I suspect that this is not a good idea, i.e. we are missing something.

First the fundamentals. I'm using the UML terminology of M levels, i.e. M0, M1, M2. M0 are the runtime objects, M1 are the model objects, and M2 are the language objects.
The M2 level defines the language. Logical M2 objects are block and blocktype, value and valuetype, etc. They are expressed as elements in the grammar and as generated classes from which ASTs are created.

Logical M1 objects are specific Jayvee programs (models). They are the written code in a .jv file or expressed as the objects in an AST after parsing the .jv file.

Logical M0 objects then are the running pipelines. They are created, as runtime objects, by the code of the classes of the objects in an AST as the AST gets interpreted.

Between the objects in one M-level, you can have regular object relationships. The relationships are what the next higher level lets you express. If the M2 level (the Jayvee language) defines a subtype relationship, then two M1-level objects can get related by a subtype relationship.

Traditionally, there is only one relationship that goes across the levels, which is an is-instance-of relationship. Any Mx-level object is logically an instance of an M(x+1)-level object. There is no other relationship across the M-levels. (It is always possible to repeat this whole architecture within the M1 level, the model level, to let users at runtime get the benefits of this flexible architecture, but this is beyond the scope of this email. Let's stick with the fundamental structure of a regular programming / modeling language.)

The M2-to-M1 level is-instance-of relationship is established when using the keywords of the language while programming. The M1-to-M0 level is-instance-of relationship is established when you declare instances on the M1 level and give them a type: The declaration may be solely on the M1 level, mixing types and instances of types, but the runtime relationship is between M1 and M0 objects.

The misery starts when people start confusing the is-instance-of relationship with the is-a relationship. Again: is-instance-of establishes an instantiation relationship between two objects on adjacent M-levels. The is-a relationship (also subtyping, generalization, etc.) is a relationship between two objects within one level (the supertype and the subtype) and can only exist if the next higher level actually created an object for this type of relationship (= introduced the concept). If there is no subtyping concept defined on M2 in the Jayvee language definition, obviously users can't express subtype relationships between M1-level objects like specific block types or value types.

Let me use these fundamentals now to go over the existing example code that I got confused over back then and that I'm still confused about today. Also, what it implies for libraries and other future additions to Jayvee.
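The difference between is-instance-of (across adjacent levels) and is-a (within one level) can be illustrated in Python, whose object model happens to offer three comparable layers: metaclasses, classes, and objects. This is only an analogy and says nothing about how Jayvee is implemented; `BlockType`, `Extractor`, and `my_cars_extractor` are hypothetical names, and `HttpExtractor` is borrowed from the discussion purely as a label.

```python
class BlockType(type):
    """M2 analogue: a language-level concept whose instances are M1 types."""


class Extractor(metaclass=BlockType):
    """M1 analogue: a user-level type, an instance of the M2 concept BlockType."""

    def __init__(self, url):
        self.url = url


class HttpExtractor(Extractor):
    """M1 analogue: is-a Extractor, a subtype relationship within one level."""


# M0 analogue: a runtime object, an instance of the M1 type HttpExtractor.
# The URL is a placeholder value.
my_cars_extractor = HttpExtractor(url="https://example.org/cars.csv")

# is-instance-of crosses exactly one level at a time.
crosses_m1_to_m0 = isinstance(my_cars_extractor, HttpExtractor)
crosses_m2_to_m1 = isinstance(HttpExtractor, BlockType)

# is-a stays within the M1 level.
stays_within_m1 = issubclass(HttpExtractor, Extractor)

# The runtime object is not an instance of the M2 concept itself.
skips_a_level = isinstance(my_cars_extractor, BlockType)
```

In this sketch the first three flags are true and `skips_a_level` is false, which mirrors the claim that instantiation relates only adjacent levels while subtyping relates two objects on the same level.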
Before I can get started, I need to address some possibly problematic syntactic shorthands we are using.

The code `pipeline CarsPipeline { ... }` contains an anonymous type specification, the type of the CarsPipeline object, as specified by the `{ ... }` code. If we want reusable pipelines, we need to introduce pipeline types and name them; this would lead to code like `pipelinetype CarsPipeline { ... };`

Please note the potential for inconsistencies using lower and upper case for the first character of names. The canonical way is to have types on M1 start with a capital letter, and objects on M1 with a lower-case letter. `pipeline CarsPipeline ...` screws with this and is likely to confuse users.

My use of oftype here is to indicate an is-instance-of relationship between the runtime M0-level object myCarsPipeline and the M1-level object CarsPipeline. This is a valid use in my book; in Java it would be expressed like: `CarsPipeline myCarsPipeline = new CarsPipeline();` I always thought of "oftype" as a shorthand for "is-of-type" or is-instance-of, indicating an instantiation relationship, not a subtyping relationship.
The first block in the CarsPipeline example is `block CarsExtractor oftype HttpExtractor { ... };` Like CarsPipeline in the example code above, CarsExtractor is an instance, not a type; hence, a better name would have been myCarsExtractor. Its type is HttpExtractor.

My understanding from back when, and I hope that this has changed, is that HttpExtractor is an M2-level object and part of the language (because we wouldn't know how to implement it differently). It really should be an M1-level object defined in a library. (We touched on this a couple of times; did we get out of that pit? In any case:)

The oftype relationship described under 1. is between a type specification on the M1 level and an object declaration on the M1 level. It can't also go across two M-levels. If we can't fix this, it should at least be two different concepts, oftype1 and oftype2.

A more canonical way, if we really wanted a library M1-level concept like HttpExtractor to be a language-level M2-level concept, would be to introduce the concept as a keyword, i.e. write `httpextractorblock myCarsExtractor { ... }; // yes ugly`
Here is the use of the oftype concept as it exists for value types: `valuetype VehicleIdentificationNumber10 oftype text { ... };` This time, a type (VIN10) is created as an instance of the M2-level concept text.

It is really confusing to me here: Think back to your normal programming. You wouldn't make VIN10 a subtype of String and you wouldn't make it an instance of String. What you want is to write `valuetype VIN { // text field with constraints };` so that you can't just stick a VIN10 into anywhere a String is expected (this would blow up quickly).

This oftype relationship between an M2-level object and an M1-level type object is broken in my book. I think there should be one definition of oftype on the M2 level, and it specifies the instantiation relationship between two M1-level objects, i.e. user-defined types and their instances.
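The point about not sticking a VIN10 into any place a String is expected can be sketched with a wrapper type that holds text without being a subtype or an instance of the string type. The following Python is an illustration only: the class name `Vin10`, the helper `shout`, the sample value, and the constraint (exactly 10 upper-case letters or digits) are all assumptions, since the actual constraints are elided in the email.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Vin10:
    """A value type that wraps text but is not substitutable for str."""

    value: str

    def __post_init__(self):
        # Hypothetical constraint: exactly 10 upper-case letters or digits.
        if not re.fullmatch(r"[A-Z0-9]{10}", self.value):
            raise ValueError(f"not a valid 10-character VIN: {self.value!r}")


def shout(text: str) -> str:
    """A function that accepts plain text only and rejects everything else."""
    if not isinstance(text, str):
        raise TypeError("expected str")
    return text.upper()


# A made-up sample value that satisfies the hypothetical constraint.
vin = Vin10("WDB1240661")
```

Because `Vin10` does not derive from `str`, passing `vin` to `shout` raises a `TypeError`, while `shout(vin.value)` works: the underlying text has to be unwrapped explicitly.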
There is one caveat here. Did you notice the intuition for text (lower-case first letter, i.e. language-level keyword) and HttpExtractor (upper-case first letter, suggesting it is specified by the user even though it isn't, if my memory is correct)? I think it is valid to somehow represent a finite set of built-in value types on the M2 level so that users can create values and value types on the M1 level using built-in value types. Example: `value myVIN oftype text; // poor modeling, just need an example`

This does not apply in my book to an ever-growing number of block types like HttpExtractor. This may be an inconsistency, but it is how other languages do it AFAIK. So maybe there is an elegant way of handling a defined finite set of the built-in value types that everyone knows.
Which leads me to the missing is-a relationship or the future goal of letting users specify subtypes. Forgive me for rehashing old examples from ADAP, but I couldn't find examples in the Jayvee example code that worked for me.

I want to be able to write valuetype Coordinate {}; I can then have value cartesianOrigin oftype CartesianCoordinate {

I don't know whether extends is a good keyword, but I do know we shouldn't overload oftype with two meanings.
With the recent addition of separate files and import statements, we can now relate types from library files to newly declared instances using oftype and newly specified types using extends.

I recognize that this is probably a hard-to-digest email and I probably made mistakes. A whiteboard is probably a better way to wrap your head around these concepts. I hope we can pick this up in an upcoming discussion.

Happy holidays ;-)

Dirk
1e39a0f
to
660ba73
Compare
660ba73
to
14636cb
Compare
…ue/jayvee into rfc-0016-instances-vs-types
No description provided.