WO2016163901A1 - An apparatus for processing an abstract syntax tree being associated with a source code of a source program - Google Patents
- Publication number
- WO2016163901A1 (application PCT/RU2015/000218)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- code
- nodes
- data structure
- virtualized
- application programming
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/47—Retargetable compilers
Definitions
- the invention relates to the field of computer technology, in particular to compilers for compiling source codes of source programs.
- compilers of different architectures can be employed.
- a compiler can process a source program written in a source language and can provide an equivalent program written in a target language.
- the majority of compilers typically accepts only one source language and provides an equivalent program only in one target language.
- a minority of compilers is designed to accept a variety of source languages and to provide equivalent programs in a variety of target languages.
- Such compilers are often referred to as compiler-compilers.
- a compiler-compiler accepts additional inputs which specify the source language, the rules of translation and other parameters, and allows a developer to change its implementation in order to reflect the specification of a new language and translation rules.
- the invention is based on the finding that a code virtualizer can be applied in order to provide virtualized code indicating a plurality of calls of a target application programming interface.
- the virtualized code can be generated upon the basis of an abstract syntax tree being associated with a source code of a source program, wherein the abstract syntax tree can e.g. be provided by a parser for parsing the source code.
- the virtualized code can be semantically equivalent to the source code of the source program and can therefore efficiently represent the source program.
- the virtualized code allows for a reduction of used memory space within a computer.
- the virtualized code enables an efficient interpretation of semantics of the source program by an interpreter e.g. used for debugging.
- the virtualized code allows for an efficient processing by an evaluator yielding an intermediate representation suitable for generating executable machine code. The evaluator can be based on the concept of staged evaluation.
- the invention relates to an apparatus for processing an abstract syntax tree being associated with a source code of a source program, the abstract syntax tree comprising a plurality of nodes, the apparatus comprising a code virtualizer being configured to associate the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and to generate a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
- the apparatus can be a compiler being configured to compile the source code of the source program.
- the predetermined mapping data structure can be a predetermined mapping table.
- the virtualized code can be an intermediate representation associated with the source code.
- the virtualized code can indicate the plurality of calls of the target application programming interface, thereby efficiently representing semantics of the source program.
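The association of nodes with calls through a mapping table can be sketched as follows. This is a minimal illustration, not the implementation described in the publication: the `Node` class, the node kinds (`Num`, `Add`, `Mul`), the method names in `MAPPING_TABLE`, and the choice to emit the virtualized code as a string are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of the abstract syntax tree (hypothetical shape)."""
    kind: str                                  # e.g. "Num", "Add", "Mul"
    children: list = field(default_factory=list)
    value: object = None                       # payload for leaf nodes


# Predetermined mapping data structure: node kind -> target API method name.
# The method names are illustrative assumptions.
MAPPING_TABLE = {
    "Num": "const",
    "Add": "add",
    "Mul": "mul",
}


def virtualize(node: Node, api: str = "api") -> str:
    """Return virtualized code as a string of nested target API calls."""
    method = MAPPING_TABLE[node.kind]          # KeyError for unmapped node kinds
    if node.children:
        args = ", ".join(virtualize(child, api) for child in node.children)
    else:
        args = repr(node.value)
    return f"{api}.{method}({args})"


# Abstract syntax tree of the expression (1 + 2) * 3
tree = Node("Mul", [
    Node("Add", [Node("Num", value=1), Node("Num", value=2)]),
    Node("Num", value=3),
])
code = virtualize(tree)
print(code)
```

Every node of the tree contributes exactly one call, so the emitted code mirrors the tree structure while referring only to the target application programming interface.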
- the target application programming interface can be associated with a domain specific language (DSL).
- the code virtualizer is further configured to associate the plurality of nodes with the plurality of calls upon the basis of a predetermined semantics data structure, the predetermined semantics data structure indicating a semantic specification of the target application programming interface.
- a semantic specification of the target application programming interface can be considered efficiently.
- the virtualized code is semantically equivalent to the source code of the source program.
- a complete semantic representation of the source program can be provided.
- the apparatus further comprises an elaborator being configured to determine a plurality of names and a plurality of types associated with the plurality of nodes upon the basis of the predetermined mapping data structure, and to append the plurality of names and the plurality of types to the plurality of nodes.
- name and type resolution upon the basis of the abstract syntax tree can be performed efficiently.
- the elaborator is further configured to determine the plurality of names and the plurality of types associated with the plurality of nodes upon the basis of a/the predetermined semantics data structure, the predetermined semantics data structure indicating a/the semantic specification of the target application programming interface.
- name and type resolution upon the basis of the abstract syntax tree can be performed more efficiently.
- the predetermined semantics data structure can be processed by the code virtualizer and the elaborator.
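A hedged sketch of an elaborator that is parameterized by both data structures is given below. The node shape, the table contents, and the representation of the semantic specification as a dictionary of signatures (`API_SIGNATURES`) are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """AST node; the two 'resolved' fields are appended by the elaborator."""
    kind: str
    children: list = field(default_factory=list)
    value: object = None
    resolved_name: Optional[str] = None
    resolved_type: Optional[str] = None


# Mapping data structure (assumed): node kind -> target API method name.
MAPPING_TABLE = {"Num": "const", "Add": "add", "Less": "less"}

# Semantics data structure (assumed): method -> (operand types, result type).
API_SIGNATURES = {
    "const": ((), "Int"),
    "add": (("Int", "Int"), "Int"),
    "less": (("Int", "Int"), "Bool"),
}


def elaborate(node: Node) -> Node:
    """Resolve names and types bottom-up and append them to the nodes."""
    for child in node.children:
        elaborate(child)
    method = MAPPING_TABLE[node.kind]
    operand_types, result_type = API_SIGNATURES[method]
    actual = tuple(child.resolved_type for child in node.children)
    if actual != operand_types:
        raise TypeError(f"{node.kind}: expected {operand_types}, got {actual}")
    node.resolved_name = method
    node.resolved_type = result_type
    return node


# Abstract syntax tree of the expression (1 + 2) < 3
tree = Node("Less", [
    Node("Add", [Node("Num", value=1), Node("Num", value=2)]),
    Node("Num", value=3),
])
elaborate(tree)
print(tree.resolved_name, tree.resolved_type)
```

In this sketch the elaborator contains no language-specific rules: extending the language with a new operation only requires new entries in the two tables.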
- the apparatus further comprises a parser being configured to parse the source code of the source program to obtain the abstract syntax tree comprising the plurality of nodes.
- the parser can be provided by a parser generator.
- the virtualized code is an object-oriented virtualized code.
- the virtualized code can be late-bound with regard to the target application programming interface.
- the apparatus further comprises an interpreter being configured to semantically interpret the virtualized code.
- the interpreter can be used for an efficient debugging of the source program.
- the interpreter can perform a late-binding with regard to the target application programming interface.
- the apparatus further comprises an evaluator being configured to generate an intermediate representation associated with the source program upon the basis of the virtualized code, the intermediate representation comprising the plurality of calls of the target application programming interface.
- the intermediate representation comprises a graph data structure indicating a graph comprising a plurality of graph nodes, the plurality of graph nodes comprising the plurality of calls of the target application programming interface.
- the staged evaluation can be implemented as described in the document WO 2015/012711 A1, which is herewith incorporated by reference in its entirety.
- the method for constructing a graph data structure can be used for providing the intermediate representation, wherein the plurality of graph nodes of the graph data structure can be identified by symbols.
- the intermediate representation is semantically equivalent to the virtualized code.
- a complete semantic representation of the virtualized code can be provided.
- the apparatus further comprises a code generator being configured to generate an executable machine code upon the basis of the intermediate representation, the executable machine code being executable by a processor of a computer.
- the apparatus further comprises an optimizer being configured to optimize the intermediate representation with regard to a predetermined performance metric of the executable machine code.
- the predetermined performance metric can indicate a runtime and/or a used memory of the executable machine code.
- the invention relates to a method for processing an abstract syntax tree being associated with a source code of a source program, the abstract syntax tree comprising a plurality of nodes, the method comprising associating, by a code virtualizer, the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and generating, by the code virtualizer, a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
- the method can be performed by the apparatus. Further features of the method directly result from the functionality of the apparatus.
- the method further comprises associating, by the code virtualizer, the plurality of nodes with the plurality of calls upon the basis of a predetermined semantics data structure, the predetermined semantics data structure indicating a semantic specification of the target application programming interface.
- a semantic specification of the target application programming interface can be considered efficiently.
- the virtualized code is semantically equivalent to the source code of the source program.
- a complete semantic representation of the source program can be provided.
- the method further comprises determining, by an elaborator, a plurality of names and a plurality of types associated with the plurality of nodes upon the basis of the predetermined mapping data structure, and appending, by the elaborator, the plurality of names and the plurality of types to the plurality of nodes.
- the method further comprises determining, by the elaborator, the plurality of names and the plurality of types associated with the plurality of nodes upon the basis of a/the predetermined semantics data structure, the predetermined semantics data structure indicating a/the semantic specification of the target application programming interface.
- the method further comprises parsing, by a parser, the source code of the source program to obtain the abstract syntax tree comprising the plurality of nodes.
- the abstract syntax tree can be provided efficiently.
- the virtualized code is an object-oriented virtualized code.
- the virtualized code can be late-bound with regard to the target application programming interface.
- the method further comprises semantically interpreting, by an interpreter, the virtualized code.
- the method further comprises generating, by an evaluator, an intermediate representation associated with the source program upon the basis of the virtualized code, the intermediate representation comprising the plurality of calls of the target application programming interface.
- the intermediate representation comprises a graph data structure indicating a graph comprising a plurality of graph nodes, the plurality of graph nodes comprising the plurality of calls of the target application programming interface.
- the intermediate representation is semantically equivalent to the virtualized code.
- a complete semantic representation of the virtualized code can be provided.
- the method further comprises generating, by a code generator, an executable machine code upon the basis of the intermediate representation, the executable machine code being executable by a processor of a computer.
- the source program can efficiently be executed by a computer.
- the method further comprises optimizing, by an optimizer, the intermediate representation with regard to a predetermined performance metric of the executable machine code.
- a performance of the executable machine code can be improved.
- the invention relates to a computer program comprising a computer program code for performing the method, when executed on a computer.
- the method can be performed in an automatic and repeatable manner.
- the apparatus can be programmably arranged to perform the computer program.
- the invention can be implemented in hardware and/or software.
- Fig. 1 shows a diagram of an apparatus for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment
- Fig. 2 shows a diagram of a method for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment
- Fig. 3 shows a diagram of a generic structure for compiling a source code of a source program into executable machine code
- Fig. 4 shows a diagram of a two-phase development approach using an integrated development environment
- Fig. 5 shows a diagram of an apparatus for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment
- Fig. 6 shows a diagram of semantic equivalences between a source code, a virtualized code, and an intermediate representation within an apparatus according to an embodiment.
- Fig. 1 shows a diagram of an apparatus 100 for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment.
- the abstract syntax tree comprises a plurality of nodes.
- the apparatus 100 comprises a code virtualizer 101 being configured to associate the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and to generate a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
- Fig. 2 shows a diagram of a method 200 for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment.
- the abstract syntax tree comprises a plurality of nodes.
- the method 200 comprises associating 201 the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and generating 203 a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
- the apparatus 100 and the method 200 allow for a semantic debugging of domain specific languages (DSLs).
- Fig. 3 shows a diagram of a generic structure for compiling a source code of a source program into executable machine code.
- the diagram relates to a structure of a programming language implementation.
- the generic structure typically comprises the following four blocks, which process the program in order: a parser 301, an elaborator 303 for name and type resolution, an optimizer 305, and a code generator 307.
- an executer 309 can be provided for executing the target executable machine code.
- the executer 309 can be realized by computer hardware or by an interpreter.
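The generic structure can be illustrated with a toy pipeline for integer expressions. This is a sketch under several assumptions: Python's own `ast` module stands in for the parser, the elaborator only checks that names are declared and that operators are supported, the optimizer performs constant folding only, and the code generator targets a hypothetical stack machine that the executer interprets.

```python
import ast


def parse(source):
    """Parser: source text -> abstract syntax tree."""
    return ast.parse(source, mode="eval").body


def elaborate(node, names):
    """Elaborator: accept int literals, declared names, and + or * only."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node
    if isinstance(node, ast.Name) and node.id in names:
        return node
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
        elaborate(node.left, names)
        elaborate(node.right, names)
        return node
    raise TypeError(f"unsupported or unresolved construct: {ast.dump(node)}")


def optimize(node):
    """Optimizer: fold operations whose operands are both constants."""
    if not isinstance(node, ast.BinOp):
        return node
    left, right = optimize(node.left), optimize(node.right)
    if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
        if isinstance(node.op, ast.Add):
            return ast.Constant(value=left.value + right.value)
        return ast.Constant(value=left.value * right.value)
    return ast.BinOp(left=left, op=node.op, right=right)


def generate(node):
    """Code generator: emit instructions for a hypothetical stack machine."""
    if isinstance(node, ast.Constant):
        return [("push", node.value)]
    if isinstance(node, ast.Name):
        return [("load", node.id)]
    op = "add" if isinstance(node.op, ast.Add) else "mul"
    return generate(node.left) + generate(node.right) + [(op, None)]


def execute(code, env):
    """Executer: run the stack machine code on the provided data."""
    stack = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        elif op == "load":
            stack.append(env[arg])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "add" else left * right)
    return stack.pop()


def compile_source(source, names):
    """Run the four blocks in order."""
    return generate(optimize(elaborate(parse(source), names)))


program = compile_source("x * (2 + 3)", names={"x"})
result = execute(program, {"x": 4})
print(program, result)
```

Note that the user can only run the program after all four blocks have completed, which is the linear ordering that the two-mode approach discussed later relaxes.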
- Compiler-compilers can automate the design of some of the blocks 301-307 to varying degrees.
- a high level of automation can e.g. be achieved for the parser 301 by using parser generators.
- a parser generator can accept a language grammar specification and, optionally, specific information about an internal program representation.
- the output of the parser 301 can be an intermediate representation (IR) of the source program, also referred to as abstract syntax tree (AST). Further approaches can be used in order to provide abstract syntax trees.
- the construction of the second block, the elaborator 303 for name and type resolution, can be less automated, because the rules and algorithms of name and type resolution may vary greatly from language to language. Nevertheless, when several languages are used jointly in one application, it can be assumed that their name and type resolution rules have much in common.
- a certain automation of the construction of the elaborator 303 can be achieved by a library of building blocks and/or modules, which can be used to design an elaborator 303 for a particular domain specific language.
- the name and type resolution for modern programming languages may be parameterized to a certain degree. While compiling a module of a source program, the elaborator 303 may not use the entire code of other program modules. Rather, it can accept a description of interfaces which the other modules satisfy.
- the interfaces can be supplied in class files, jar files, as well as Java files with source code of interfaces. These features of elaborators can be utilized in embodiments of the invention and can be additionally parameterized.
- the optimizer 305 can be built for intermediate representations of a large variety of programming languages. For example, the LLVM optimization framework can be used, which can accept languages such as C, C++, Java, and other imperative and/or object-oriented languages.
- the code generator 307 may not depend on details of the source language, may be designed for a class of intermediate representations, and may depend on primitive operations of a model of computation and/or computer architecture. It can generate code in terms of low level executable machine code or higher level code, which can use a target application programming interface that defines basic semantic notions in terms of which the semantics of the language are expressed.
- Embodiments of the invention apply techniques collectively denoted as multistage programming or simply staging. There are different techniques to implement staging, e.g. staged evaluation.
- an extension of a programming language can be difficult in particular with regard to the middle blocks.
- Programming languages, in particular domain specific languages, may not be created in one step. They may be created gradually by a stepwise addition of language features and extension of the generic structure for compilation.
- the desire for adding a feature for a programming language can e.g. arise when a new basic semantic notion is added to a model of computation and can be expressed as a new primitive operation in the target application programming interface. Then, the task of extending the programming language and the compiler can arise.
- It can comprise an extension of the language syntax and a modification of the blocks e.g. from parsing to code generation using the new operation.
- Changing language syntax can be easy when the parser 301 is implemented by using a convenient grammar definition and a parser generator.
- a modification of the other blocks 303-307 may be much more laborious, as there may be no general convenient approaches, and they may be constructed using ad-hoc decisions.
- An efficient structure of a compiler can be desired in order to make this task easy and to raise the productivity and decrease the cost of constructing and evolving new programming languages, in particular domain specific languages that can change rapidly.
- an efficient structure of an elaborator 303 for name and type resolution and of an optimizer 305 can be desired that can allow for parameterization with specific information about name and type resolution for a domain specific language and a mapping of nodes of an abstract syntax tree to operations of a backend.
- furthermore, a two-mode evaluation of the source code may not be supported by integrated development environments (IDEs), although it may be highly desirable in practice when using a generic structure for compilation.
- Common integrated development environments can provide a user interface around a compiler and/or interpreter of one or many programming languages. Since compilers may have a linear ordering of compilation steps, a user may only run and debug the program after it is compiled. Since many programs may utilize parallelized hardware and may thus comprise parts that run in parallel, it can be difficult to debug such parallelized programs.
- an approach for developing parallelized programs can be to use high-level abstractions with deterministic semantics. This determinism can enable semantic debugging as a special execution mode of the integrated development environment. Semantic debugging can be directly supported by both the compiler and the integrated development environment.
- Embodiments of the invention employ systematic approaches in order to implement a two-mode evaluation within integrated development environments.
- Fig. 4 shows a diagram of a two-phase development approach using an integrated development environment.
- the two-phase development approach can equivalently be referred to as two-mode development approach.
- Embodiments of the invention enable the usage of the two-phase development approach for a productive, cheap and easy specification of semantics of domain specific languages and for providing an extendible structure for the variety of domain specific languages.
- Embodiments of the invention realize an efficient compiler which can be parameterized in such a way that a developer of a programming language can specify not just the syntax of the language and related notions, but also semantic notions, including a specification of a target application programming interface, which can define primitive operations of the language, and a mapping of nodes of the abstract syntax tree to the target application programming interface. Furthermore, given the target application programming interface and the mapping specifications, a complete name and/or type resolution for domain specific languages can be performed, which can be based on object-oriented notions and can allow for an easy integration with object-oriented languages, such as Java.
- the frontend can automatically adapt to this change accordingly. So, new features can automatically be available within the frontend.
- the frontend can be a reflection of the semantics which are specified in the backend.
- Such a compiler can be regarded as a compiler-compiler since it can allow for a specification of semantics of a domain specific language additionally to its syntax.
- two-mode or two-phase program evaluation inside the integrated development environment can be supported. This two-mode evaluation can allow implementing a semantic debugging in addition to compilation. Thus, duplication of efforts can be eliminated, i.e. a single specification of the semantics can allow both for semantic debugging and for compilation.
- Fig. 5 shows a diagram of an apparatus 100 for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment.
- the apparatus 100 forms a possible implementation of the apparatus 100 as described in conjunction with Fig. 1.
- the apparatus 100 comprises a code virtualizer 101, an elaborator 501, a parser 503, an interpreter 505, an evaluator 507, an optimizer 509, and a code generator 511. Furthermore, an executer 513 is provided.
- a source code of a domain specific language can be parsed by the parser 503 in order to provide an abstract syntax tree.
- the elaborator 501 can perform a name and/or type resolution upon the basis of the abstract syntax tree in order to obtain an abstract syntax tree with name and type information.
- the elaborator 501 can consider a predetermined mapping data structure, e.g. a table of mapping of nodes of the abstract syntax tree to calls of a target application programming interface and/or a predetermined semantics data structure, e.g. a semantics specification of the domain specific language.
- the code virtualizer 101 can generate a virtualized code upon the basis of the abstract syntax tree with name and/or type information, wherein the virtualized code can represent a single version of code with calls of the target application programming interface.
- the code virtualizer 101 can consider the predetermined mapping data structure, e.g. the table of mapping of nodes of the abstract syntax tree to calls of the target application programming interface and/or the predetermined semantics data structure, e.g. the semantics specification of the domain specific language.
- the interpreter 505 can semantically interpret the virtualized code, e.g. for enabling a semantic debugging.
- the interpreter 505 can process provided data.
- the parser 503, the elaborator 501, the code virtualizer 101, and the interpreter 505 can form a frontend.
- the evaluator 507 can be a staged evaluator and can generate an intermediate representation upon the basis of the virtualized code.
- the optimizer 509 and the code generator 511 can jointly provide an executable machine code upon the basis of the intermediate representation.
- the executer 513 can execute the executable machine code using provided data.
- the evaluator 507, the optimizer 509, the code generator 511, and the executer 513 can form a backend.
- the backend accepts as an input the virtualized code as a virtualized representation of the source program.
- the intermediate representation can be provided by the evaluator 507 using staged evaluation by constructing a graph data structure.
- the virtualized code can be executed by the evaluator 507, e.g.
- optimization by the optimizer 509 and/or code generation by the code generator 511 can be realized using standard approaches, or they can be implemented as part of a staged evaluation process using graph rewriting rules.
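Optimization by graph rewriting can be sketched on a small graph whose nodes are identified by symbols. The graph encoding (a dictionary from symbol to operation and operands) and the single rewriting rule shown here, which replaces an operation on two constants by a constant, are assumptions for illustration and are not taken from the publication.

```python
# Graph IR (assumed encoding): symbol -> (operation, operands).
# Operands of "const" and "input" are literals; all others are symbols.
graph = {
    "s1": ("const", (1,)),
    "s2": ("const", (2,)),
    "s3": ("add", ("s1", "s2")),
    "s4": ("input", ("x",)),
    "s5": ("mul", ("s3", "s4")),
}

# Rewriting rule table: operation -> function applied to constant operands.
FOLDABLE = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def fold_constants(nodes):
    """Apply the rule 'op(const a, const b) -> const (a op b)' in one pass.

    Relies on the dictionary listing every node after its operands.
    """
    out = {}
    for sym, (op, args) in nodes.items():
        if op in FOLDABLE and all(out[a][0] == "const" for a in args):
            left, right = (out[a][1][0] for a in args)
            out[sym] = ("const", (FOLDABLE[op](left, right),))
        else:
            out[sym] = (op, args)
    return out


folded = fold_constants(graph)
print(folded["s3"], folded["s5"])
```

The pass leaves `s1` and `s2` in place even though nothing refers to them after folding; removing such unused nodes would be a separate rewriting step.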
- the interpreter 505 can be a semantic debugger which can accept the virtualized code and can execute semantic functions contained within the virtualized code. These semantic functions can be taken from the predetermined semantics data structure indicating semantic specifications of the domain specific language during code virtualization within the frontend.
- a property of the virtualized code is that it can allow both semantic and staged evaluations. Semantic evaluation can be used to implement semantic debugging. Staged evaluation can be used to generate an intermediate representation for subsequent stages of the backend.
- the virtualized code can comprise invocations of the target application programming interface.
- the respective mode of operation, e.g. the semantics of staged evaluation, of the virtualized code can be implemented by selecting a corresponding version of the target application programming interface implementation.
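Selecting the mode of operation by choosing an implementation of the target application programming interface can be sketched as follows. The method names (`const`, `add`, `mul`), the two classes, and the representation of the virtualized code as a function taking the interface as a parameter are assumptions made for this example.

```python
class SemanticApi:
    """Semantic evaluation: every call directly computes a value."""

    def const(self, value):
        return value

    def add(self, a, b):
        return a + b

    def mul(self, a, b):
        return a * b


class StagedApi:
    """Staged evaluation: every call records an instruction of the
    intermediate representation and returns the symbol identifying it."""

    def __init__(self):
        self.ir = []

    def _emit(self, op, *args):
        self.ir.append((op, args))
        return f"s{len(self.ir)}"

    def const(self, value):
        return self._emit("const", value)

    def add(self, a, b):
        return self._emit("add", a, b)

    def mul(self, a, b):
        return self._emit("mul", a, b)


def virtualized_program(api):
    """A single version of the code for (1 + 2) * 3, written only in
    terms of calls of the target API."""
    return api.mul(api.add(api.const(1), api.const(2)), api.const(3))


value = virtualized_program(SemanticApi())   # mode 1: semantic evaluation

staged = StagedApi()
root = virtualized_program(staged)           # mode 2: staged evaluation
print(value, root, staged.ir)
```

The same function body serves both modes, which is the property that removes the need to specify the semantics twice.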
- the target application programming interface can be specified in an object-oriented language, such as Java, or it can be specified in other languages that can support late-binding mechanisms.
- the target application programming interface can be generic, which means that the types of arguments and results of methods may have type parameters, as in Java, or this typing information can be erased and the target application programming interface can be specified in terms of a universal data type, such as an object in Java.
- the frontend comprises the elaborator 501 for name and/or type resolution and the code virtualizer 101. These blocks can accept as inputs the predetermined mapping data structure and/or the predetermined semantics data structure.
- the predetermined semantics data structure can indicate a specification of the target application programming interface, e.g. in form of object-oriented code or an interface object.
- the predetermined mapping data structure can indicate a mapping of the nodes of the abstract syntax tree of the source code to corresponding calls or invocations of the target application programming interface as implementations of semantic functions associated with the nodes of the abstract syntax tree.
- the frontend can be capable of performing a complete name and/or type resolution of the source code by the elaborator 501 including those language constructs that are specified by the semantics specification of the domain specific language.
- a developer of a particular programming language may only provide semantics specifications of the domain specific language. The developer need not implement name and/or type resolution algorithms for the language, including the resolution of parametrically polymorphic types, e.g. generics.
- the implementation of these algorithms may instead be provided by the apparatus 100.
- the frontend can adapt to this change accordingly. So, new features are automatically available within the frontend.
- the frontend may be a reflection of the semantics which are specified in the backend.
- a sub-division into a frontend and a backend is provided, wherein the functionalities differ from those of generic structures for compilation.
- a parser 503, which can be provided by a parser generator, can be used.
- the output of the parser 503 can be an abstract syntax tree.
- an elaborator 501 for name and/or type resolution can be used, which can perform a resolution of object-oriented languages, such as Java. It can additionally be parameterized with a mapping between primitive operations of the source language and method invocations of the target application programming interface. Given these parameters, the elaborator 501 can be capable of performing full name and/or type resolution for all language constructs given by the semantics specification of the domain specific language.
- a code virtualizer 101 can be employed, which can transform an abstract syntax tree into calls or invocations of the target application programming interface with respect to a provided mapping and semantics specification of the domain specific language.
- an interpreter 505 of the virtualized code can be used, which can allow for a semantic execution of the virtualized code without performing optimization and/or code generation.
- the interpreter 505 can be used for debugging.
- the interpreter 505 can bind interface methods of the target application programming interface, e.g. using late-binding, with a specific implementation which can allow a direct execution of all semantic functions from the semantics specification of the domain specific language.
- An evaluator 507, e.g. a staged evaluator, can be applied, which can perform a staged evaluation of the virtualized code. Internally, it can construct a graph data structure, wherein it can bind interface methods of the target application programming interface with a specific implementation for generating an intermediate representation. After the virtualized code has been executed by this kind of interpreter, the constructed intermediate representation can be semantically equivalent to the virtualized code and thus to the source code of the source program.
- a developer of a language can define a different version of a staged evaluator library.
- the staged evaluation can be implemented as described in the document WO 2015/012711 A1, which is herewith incorporated by reference in its entirety.
- the method for constructing a graph data structure can be used for providing the intermediate representation, wherein graph nodes of the graph data structure can be identified by symbols.
- program operations are represented in an object-oriented programming language by objects of classes that can form a hierarchy growing from a base node class of the graph data structure.
- New graph nodes of the graph data structure can be produced by calling factory methods associated with existing graph nodes of the graph data structure based on a factory method design pattern implemented in the graph nodes of the graph data structure.
- the graph nodes of the graph data structure can be identified by symbols. The symbols can be used as proxies of the graph nodes of the graph data structure according to a proxy design pattern.
- the target application programming interface, which is exposed by the evaluator 507, can be provided either as an interface in an object-oriented language or as a runtime interface object.
- an optimizer 509 and/or a code generator 511 can be employed.
- Embodiments of the invention allow for defining a compiler by specifying inputs of a generic structure for compilation.
- the inputs can be a specification of a target application programming interface, a semantics specification of a domain specific language, and a specification of a mapping of nodes of an abstract syntax tree to calls of the target application programming interface.
- the author of the specifications may know and understand the syntax and semantics of the specifications. This can be simpler and less error-prone than developing or modifying the specific code of an implementation, which may require understanding all internals of a compiler, even if the compiler is implemented in a high-level language.
- Embodiments of the invention enable a cheaper and more productive development and evolution of domain specific languages and corresponding compilers.
- Embodiments of the invention allow a systematic implementation of two-phase or two-mode integrated development environments and an implementation of semantic debugging.
- the code virtualization allows for alternative interpretations of the same source code, which can simplify a transition from a prototype to production-ready code.
- Fig. 6 shows a diagram of semantic equivalences between a source code PSRC, a virtualized code PVIRT, and an intermediate representation PIR within an apparatus 100 according to an embodiment.
- the apparatus 100 forms a possible implementation of the apparatus 100 as described in conjunction with Fig. 1 and Fig. 5.
- the apparatus 100 comprises a code virtualizer 101, an elaborator 501, a parser 503, an interpreter 505, an evaluator 507, an optimizer 509, and a code generator 511. Furthermore, an executer 513 is provided.
- Embodiments of the invention use a separation of blocks of an implementation of a domain specific language in two parts, a frontend and a backend.
- the functionality may differ from the functionality of a generic structure for compilation.
- the elaborator 501 can be provided with a predetermined mapping data structure of an abstract syntax tree to a target application programming interface.
- the output of the elaborator 501 can be transformed into a specific intermediate representation, referred to as virtualized code.
- the virtualized code can comprise calls of the target application programming interface, which can implement a mechanism for invocations of semantic functions or primitives of a specific domain.
- the virtualized code can be an object-oriented virtualized code, which can be late-bound with regard to different implementations of the target application programming interface.
- Each binding can implement a different mode of evaluation, wherein at least two modes may be supported. Firstly, a semantic evaluation or interpretation can be supported in order to implement semantic debugging. Secondly, a staged evaluation or generation of an intermediate representation can be supported in order to implement a compilation of the code.
- the elaborator 501 for name and/or type resolution can have two additional parameters with regard to common approaches: a predetermined semantics data structure indicating a specification of the target application programming interface, and a predetermined mapping data structure indicating a specification of a mapping from nodes of the abstract syntax tree to calls of the target application programming interface.
- the code virtualizer 101 can be provided with the same two parameters, and can yield the virtualized code.
- the staged evaluation can satisfy the following equalities:
- PSRC denotes the source code of a program P
- PVIRT = CodeVirtualization(PSRC) denotes the virtualized code representation of the program P
- DATA denotes provided data.
- PSRC(DATA), PVIRT(DATA), and PIR(DATA) denote respective evaluations of the respective code using the provided data.
- a first phase is semantic validation and a second phase is performance profiling.
- an author of a new domain specific language for a particular domain can describe domain operations using the backend target application programming interface and can then use the described integrated development environment, which is able to run code of the domain specific language in a debugger and to generate machine executable code for the target platform at runtime.
- duplication of efforts is made by programmers when developing parallelized programs.
- Embodiments of the invention support a two-phase or two-mode development workflow.
- the coupling of the frontend and the backend can make it difficult to implement alternative frontends for a given backend.
- by exposing the backend as an application programming interface it is easy to add alternative frontends for an already existing backend. It is also possible to combine many backends with a single frontend.
- An abstract syntax tree refers to a tree data structure that is used to represent an abstract syntactic structure of a source code. Each node of the abstract syntax tree can denote a construct occurring in the source code.
- An application programming interface refers to a specification of a set of subroutines, e.g. procedures, functions, and methods, which are to be called in order to interoperate with some subsystem. Usually an application programming interface is meant to be a set of interfaces in terms of an object-oriented language.
- a target application programming interface refers to a specification of a set of primitive operations, which are used in a target code. These operations can be elementary building blocks of language semantics and constitute an application programming interface (API) to a language runtime system.
- a compiler refers to a computer program that processes a source code of a source program and generates an executable machine code.
- a compiler-compiler refers to a tool which can create a parser, an interpreter, or a compiler from a specific form of a formal description of a language and machine.
- a compile time refers to notions and operations performed when a compiler runs.
- An executable machine code refers to a sequence of machine code instructions to be executed by a processor of a computer in order to perform a given task.
- a generic program refers to a program written in terms of to-be-specified-later types that are instantiated when needed for specific types provided as parameters. This approach permits writing common functions or types that may only differ in the set of types on which they operate when used, thus reducing duplication.
- An interpreter refers to a computer program that directly executes, i.e. performs, instructions written in a programming or scripting language, without previously batch-compiling them into machine language.
- An intermediate representation refers to a data structure that can be used inside a compiler for representing the source program and that allows an analysis and transformation before outputting an executable machine code.
- An intermediate representation can be a graph or a tree data structure with specific information inside the nodes. As the intermediate representation can comprise all information for evaluating the program if given the input data, the evaluation of the intermediate representation can be regarded as an equivalent way to execute the source program.
- Late binding or dynamic binding refers to a computer programming mechanism in which the method being called upon an object is looked up by name at runtime.
- OOP: object-oriented programming
- a method relates to a subroutine or procedure associated with a class. Methods can define a behavior to be exhibited by instances of the associated class at program runtime. Methods can have the special property at runtime that they have access to data stored within an instance of the class they are associated with and are thereby able to control the state of the instance.
- An object-oriented language refers to a computer programming language that supports object-oriented programming (OOP).
- Object-oriented programming refers to a programming paradigm using objects - usually instances of a class - comprising data fields and methods together with their interactions in order to design applications and computer programs. Programming techniques may include features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance.
- Parametric polymorphism refers to the property of a programming language that types expressed in it may have parameters.
- Parametrically polymorphic types refer to types in a programming language that may have parameters.
- a runtime refers to notions and operations performed when a program is executed after compilation.
- a source code refers to a textual representation of a source program using a specific programming language.
- a source program refers to a program that can be used as an input to a compiler. It can be translated into executable machine code.
- a staged code can relate to a staged program representation.
- a staged code for a source program P can be a program P' such that evaluation of P' produces an intermediate representation, which is semantically equivalent to the program P.
- the result of executing the program can be an intermediate representation (IR).
- This intermediate representation can comprise a data structure that keeps track of all operations that were used in the program along with their order.
- a target code refers to a programming language code, to which the source code is compiled. Target code may be either low level as an executable machine code or higher level as a code in a conventional programming language, including object-oriented ones.
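The staged-code definition above and the equalities referred to earlier in this list can be written compactly. The equalities themselves are not reproduced in this text, so the following is only an inferred reading, based on the stated semantic equivalence between source code, virtualized code and intermediate representation. The name StagedEvaluation is a placeholder for the evaluator 507 and is an assumption, not a term taken from the source.

```latex
\begin{align*}
P_{\mathrm{VIRT}} &= \mathrm{CodeVirtualization}(P_{\mathrm{SRC}}) \\
P_{\mathrm{IR}} &= \mathrm{StagedEvaluation}(P_{\mathrm{VIRT}}) \\
P_{\mathrm{SRC}}(\mathrm{DATA}) &= P_{\mathrm{VIRT}}(\mathrm{DATA}) = P_{\mathrm{IR}}(\mathrm{DATA})
\end{align*}
```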
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The invention relates to an apparatus (100) for processing an abstract syntax tree being associated with a source code of a source program, the abstract syntax tree comprising a plurality of nodes, the apparatus (100) comprising a code virtualizer (101) being configured to associate the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and to generate a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
Description
AN APPARATUS FOR PROCESSING AN ABSTRACT SYNTAX TREE BEING ASSOCIATED WITH A SOURCE CODE OF A SOURCE PROGRAM
TECHNICAL FIELD
The invention relates to the field of computer technology, in particular to compilers for compiling source codes of source programs.
BACKGROUND
For compiling a source code of a source program, compilers of different architectures can be employed. A compiler can process a source program written in a source language and can provide an equivalent program written in a target language. The majority of compilers typically accepts only one source language and provides an equivalent program only in one target language. A minority of compilers is designed to accept a variety of source languages and to provide equivalent programs in a variety of target languages. Such compilers are often referred to as compiler-compilers. Usually a compiler-compiler accepts additional inputs which specify the source language, the rules of translation and other parameters, and allows a developer to change its implementation in order to reflect the specification of a new language and translation rules.
Due to an increasing application of computers and software, recent trends are directed towards increasing the variety of programming languages and the development of new domain specific languages (DSLs). Often, an application is written in several different languages and new domain specific languages are defined with regard to specific parts of the application. Currently, integrated development platforms are used for dealing with the variety of languages. In US 7076772 B2, a system and method for a multi-language extensible compiler framework is described.
When defining a new language, the developer usually reuses notions and modules of the definitions of other languages. However, this approach is based on a deep knowledge of the structure and internals of the development environment, and the task of specifying a new language is not easy. Approaches for increasing productivity, reliability and maintainability of new language specifications as well as efficient compilers are highly desirable.
SUMMARY
It is an object of the invention to provide an efficient concept for processing an abstract syntax tree being associated with a source code of a source program.
This object is achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.
The invention is based on the finding that a code virtualizer can be applied in order to provide virtualized code indicating a plurality of calls of a target application programming interface. The virtualized code can be generated upon the basis of an abstract syntax tree being associated with a source code of a source program, wherein the abstract syntax tree can e.g. be provided by a parser for parsing the source code.
The virtualized code can be semantically equivalent to the source code of the source program and can therefore efficiently represent the source program. Thus, the virtualized code allows for a reduction of used memory space within a computer. Furthermore, the virtualized code enables an efficient interpretation of semantics of the source program by an interpreter, e.g. one used for debugging. In addition, the virtualized code allows for an efficient processing by an evaluator yielding an intermediate representation suitable for generating executable machine code. The evaluator can be based on the concept of staged evaluation.
According to a first aspect, the invention relates to an apparatus for processing an abstract syntax tree being associated with a source code of a source program, the abstract syntax tree comprising a plurality of nodes, the apparatus comprising a code virtualizer being configured to associate the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and to generate a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface. Thus, an efficient concept for processing an abstract syntax tree being associated with a source code of a source program is realized.
The apparatus can be a compiler being configured to compile the source code of the source program. The predetermined mapping data structure can be a predetermined mapping table. The virtualized code can be an intermediate representation associated with the source code.
The virtualized code can indicate the plurality of calls of the target application programming interface, thereby efficiently representing semantics of the source program. The target application programming interface can be associated with a domain specific language (DSL).
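As an illustration of the first aspect, the following is a minimal sketch of a code virtualizer driven by a mapping table. The node kinds, the method names in `MAPPING` and the tuple encoding of the virtualized code are all hypothetical choices made for this example; the source does not prescribe any of them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One AST node: a construct kind, an optional payload, and child nodes."""
    kind: str
    value: object = None
    children: list = field(default_factory=list)


# Hypothetical mapping table: AST node kind -> name of a target API method.
MAPPING = {"Add": "plus", "Mul": "times", "Num": "const", "Var": "lookup"}


def virtualize(node, mapping=MAPPING):
    """Turn an AST into a nested description of target API calls."""
    method = mapping[node.kind]          # KeyError if a construct is unmapped
    if node.children:
        args = [virtualize(child, mapping) for child in node.children]
    else:
        args = [node.value]              # leaf: literal value or variable name
    return (method, args)


# AST for the source expression  x * 2 + 1
ast = Node("Add", children=[
    Node("Mul", children=[Node("Var", "x"), Node("Num", 2)]),
    Node("Num", 1),
])
virtualized = virtualize(ast)
```

Here `virtualized` names only target API methods and their arguments, so it no longer depends on the syntax of the source language. Supporting a new construct in this sketch means adding one entry to the table.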
In a first implementation form of the apparatus according to the first aspect as such, the code virtualizer is further configured to associate the plurality of nodes with the plurality of calls upon the basis of a predetermined semantics data structure, the predetermined semantics data structure indicating a semantic specification of the target application programming interface. Thus, a semantic specification of the target application programming interface can be considered efficiently.
In a second implementation form of the apparatus according to the first aspect as such or any preceding implementation form of the first aspect, the virtualized code is semantically equivalent to the source code of the source program. Thus, a complete semantic representation of the source program can be provided.
In a third implementation form of the apparatus according to the first aspect as such or any preceding implementation form of the first aspect, the apparatus further comprises an elaborator being configured to determine a plurality of names and a plurality of types associated with the plurality of nodes upon the basis of the predetermined mapping data structure, and to append the plurality of names and the plurality of types to the plurality of nodes. Thus, name and type resolution upon the basis of the abstract syntax tree can be performed efficiently.
In a fourth implementation form of the apparatus according to the third implementation form of the first aspect, the elaborator is further configured to determine the plurality of names and the plurality of types associated with the plurality of nodes upon the basis of a/the predetermined semantics data structure, the predetermined semantics data structure indicating a/the semantic specification of the target application programming interface. Thus, name and type resolution upon the basis of the abstract syntax tree can be performed more efficiently. The predetermined semantics data structure can be processed by the code virtualizer and the elaborator.
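The name and type resolution of the third and fourth implementation forms might look roughly as follows. This is a sketch under assumptions: `MAPPING` stands in for the predetermined mapping data structure, `API_SPEC` for the predetermined semantics data structure, and the type names and method names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    value: object = None
    children: list = field(default_factory=list)
    name: str = None    # resolved target API method name, set by elaborate()
    type: str = None    # resolved result type, set by elaborate()


# Hypothetical mapping data structure: node kind -> target API method name.
MAPPING = {"Add": "plus", "Less": "less_than", "Num": "const"}

# Hypothetical semantics data structure:
# method name -> (expected argument types, result type).
API_SPEC = {
    "plus": (("int", "int"), "int"),
    "less_than": (("int", "int"), "bool"),
    "const": ((), "int"),
}


def elaborate(node):
    """Annotate the node and its subtree in place with resolved names and types."""
    for child in node.children:
        elaborate(child)
    node.name = MAPPING[node.kind]
    arg_types, result_type = API_SPEC[node.name]
    actual = tuple(child.type for child in node.children)
    if actual != arg_types:
        raise TypeError(f"{node.name} expects {arg_types}, got {actual}")
    node.type = result_type
    return node


# AST for  1 + 2 < 4
tree = elaborate(Node("Less", children=[
    Node("Add", children=[Node("Num", 1), Node("Num", 2)]),
    Node("Num", 4),
]))
```

After the call, every node carries a `name` and a `type`, and an ill-typed tree such as adding a number to a comparison result is rejected with a `TypeError`. Real elaboration of generics would need considerably more machinery than this lookup.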
In a fifth implementation form of the apparatus according to the first aspect as such or any preceding implementation form of the first aspect, the apparatus further comprises a parser being configured to parse the source code of the source program to obtain the abstract syntax tree comprising the plurality of nodes. Thus, the abstract syntax tree can be provided efficiently. The parser can be provided by a parser generator.
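For orientation, a parser of the kind mentioned here can be sketched by hand for a toy expression language. The grammar (integers, identifiers, `+`, `*`, parentheses) and the tuple-shaped AST nodes are assumptions for this example; a parser generator, as suggested in the text, would produce equivalent code from a grammar description. Error handling for malformed input is omitted.

```python
import re


def tokenize(source):
    """Split the source text into numbers, identifiers, and operator tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", source)


def parse(source):
    """Recursive-descent parser returning an AST of nested tuples."""
    tokens = tokenize(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def atom():
        tok = take()
        if tok == "(":
            node = expr()
            take()                       # consume the closing ")"
            return node
        return ("Num", int(tok)) if tok.isdigit() else ("Var", tok)

    def term():                          # term := atom ("*" atom)*
        node = atom()
        while peek() == "*":
            take()
            node = ("Mul", node, atom())
        return node

    def expr():                          # expr := term ("+" term)*
        node = term()
        while peek() == "+":
            take()
            node = ("Add", node, term())
        return node

    return expr()


tree = parse("x * 2 + 1")
```

Each tuple in `tree` corresponds to one node of the abstract syntax tree, with the node kind in the first position.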
In a sixth implementation form of the apparatus according to the first aspect as such or any preceding implementation form of the first aspect, the virtualized code is an object-oriented virtualized code. Thus, the virtualized code can be late-bound with regard to the target application programming interface.
In a seventh implementation form of the apparatus according to the first aspect as such or any preceding implementation form of the first aspect, the apparatus further comprises an interpreter being configured to semantically interpret the virtualized code. Thus, a direct semantic execution of the virtualized code can be realized efficiently. The interpreter can be used for an efficient debugging of the source program. The interpreter can perform a late-binding with regard to the target application programming interface.
In an eighth implementation form of the apparatus according to the first aspect as such or any preceding implementation form of the first aspect, the apparatus further comprises an evaluator being configured to generate an intermediate representation associated with the source program upon the basis of the virtualized code, the intermediate representation comprising the plurality of calls of the target application programming interface. Thus, an optimization and/or code generation upon the basis of the intermediate representation can be performed efficiently.
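The difference between the interpreter and the evaluator can be illustrated by binding the same virtualized code to two implementations of one target API. The class names, method names and the flat operation list below are assumptions made for this sketch, not the interface defined by the source.

```python
def program(api, x):
    """Virtualized form of the source expression  x * 2 + 1 ."""
    return api.plus(api.times(x, api.const(2)), api.const(1))


class InterpreterApi:
    """Binding for semantic interpretation: every call computes a value."""
    def const(self, v):
        return v

    def plus(self, a, b):
        return a + b

    def times(self, a, b):
        return a * b


class StagedApi:
    """Binding for staged evaluation: every call records an operation."""
    def __init__(self):
        self.ops = []                    # recorded operations, in call order

    def _emit(self, op, *args):
        self.ops.append((op, args))
        return len(self.ops) - 1         # index of the newly recorded operation

    def arg(self, name):
        return self._emit("arg", name)

    def const(self, v):
        return self._emit("const", v)

    def plus(self, a, b):
        return self._emit("plus", a, b)

    def times(self, a, b):
        return self._emit("times", a, b)


direct = program(InterpreterApi(), 5)    # interpreted directly: 5 * 2 + 1

staged = StagedApi()
root = program(staged, staged.arg("x"))  # staged: builds a list of operations
```

`program` is written once and never inspects which binding it receives; the method actually executed is chosen at call time. With `InterpreterApi` it yields a value that can be inspected in a debugger, while with `StagedApi` it yields a recorded sequence of operations that later stages could optimize or compile.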
In a ninth implementation form of the apparatus according to the eighth implementation form of the first aspect, the intermediate representation comprises a graph data structure indicating a graph comprising a plurality of graph nodes, the plurality of graph nodes comprising the plurality of calls of the target application programming interface. Thus, semantics of the virtualized code can be represented efficiently. The staged evaluation can be implemented as described in the document WO 2015/012711 A1, which is herewith incorporated by reference in its entirety. In particular, the method for constructing a graph data structure can be used for providing the intermediate representation, wherein the plurality of graph nodes of the graph data structure can be identified by symbols.
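A graph-shaped intermediate representation whose nodes are identified by symbols could be sketched as follows. This is not the construction from the cited document; the `Graph` and `Sym` classes, the symbol naming scheme and the `evaluate` helper are illustrative assumptions.

```python
import itertools


class Graph:
    """Graph IR: a table from symbol name to node definition (op, operands)."""
    def __init__(self):
        self.nodes = {}
        self._ids = itertools.count()

    def add(self, op, *operands):
        sym = Sym(self, f"s{next(self._ids)}")
        self.nodes[sym.name] = (op, operands)
        return sym


class Sym:
    """Symbol that identifies one graph node and stands in for it."""
    def __init__(self, graph, name):
        self.graph, self.name = graph, name

    # Factory methods: calling them on an existing node creates a new node.
    def plus(self, other):
        return self.graph.add("plus", self.name, other.name)

    def times(self, other):
        return self.graph.add("times", self.name, other.name)


def evaluate(graph, sym, data):
    """Evaluate the node behind `sym`, reading program inputs from `data`."""
    op, operands = graph.nodes[sym.name]
    if op == "arg":
        return data[operands[0]]
    if op == "const":
        return operands[0]
    left, right = (evaluate(graph, Sym(graph, o), data) for o in operands)
    return left + right if op == "plus" else left * right


g = Graph()
x = g.add("arg", "x")
result = x.times(g.add("const", 2)).plus(g.add("const", 1))   # x * 2 + 1
```

Because operands are stored as symbol names rather than object references, a node can be replaced in `g.nodes` without touching the nodes that use it. Calling `evaluate(g, result, {"x": 5})` walks the graph and computes the same value the source expression would.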
In a tenth implementation form of the apparatus according to the eighth implementation form or the ninth implementation form of the first aspect, the intermediate representation is semantically equivalent to the virtualized code. Thus, a complete semantic representation of the virtualized code can be provided.
In an eleventh implementation form of the apparatus according to the eighth implementation form to the tenth implementation form of the first aspect, the apparatus further comprises a code generator being configured to generate an executable machine code upon the basis of the intermediate representation, the executable machine code being executable by a processor of a computer. Thus, the source program can efficiently be executed by a computer.
In a twelfth implementation form of the apparatus according to the eleventh implementation form of the first aspect, the apparatus further comprises an optimizer being configured to optimize the intermediate representation with regard to a predetermined performance metric of the executable machine code. Thus, a performance of the executable machine code can be improved. The predetermined performance metric can indicate a runtime and/or a used memory of the executable machine code.
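The source does not name a particular optimization. As one possible example, the sketch below applies constant folding to a small tree-shaped intermediate representation and uses the number of arithmetic operations as an assumed stand-in for a runtime metric. The tuple encoding and both function names are hypothetical.

```python
def fold(node):
    """Constant folding: replace operations on two constants by their result."""
    op = node[0]
    if op in ("const", "arg"):
        return node
    left, right = fold(node[1]), fold(node[2])
    if left[0] == "const" and right[0] == "const":
        value = left[1] + right[1] if op == "plus" else left[1] * right[1]
        return ("const", value)
    return (op, left, right)


def cost(node):
    """Assumed performance metric: number of arithmetic operations."""
    if node[0] in ("const", "arg"):
        return 0
    return 1 + cost(node[1]) + cost(node[2])


# IR for  x * (2 + 3) + 1
ir = ("plus",
      ("times", ("arg", "x"), ("plus", ("const", 2), ("const", 3))),
      ("const", 1))
optimized = fold(ir)
```

Comparing `cost(ir)` with `cost(optimized)` shows the effect on the chosen metric; a memory-oriented metric would need a different cost function.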
According to a second aspect, the invention relates to a method for processing an abstract syntax tree being associated with a source code of a source program, the abstract syntax tree comprising a plurality of nodes, the method comprising associating, by a code virtualizer, the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and generating, by the code virtualizer, a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface. Thus, an efficient concept for processing an abstract syntax tree being associated with a source code of a source program is realized.
The method can be performed by the apparatus. Further features of the method directly result from the functionality of the apparatus.
In a first implementation form of the method according to the second aspect as such, the method further comprises associating, by the code virtualizer, the plurality of nodes with the plurality of calls upon the basis of a predetermined semantics data structure, the predetermined semantics data structure indicating a semantic specification of the target application programming interface. Thus, a semantic specification of the target application programming interface can be considered efficiently.
In a second implementation form of the method according to the second aspect as such or any preceding implementation form of the second aspect, the virtualized code is semantically equivalent to the source code of the source program. Thus, a complete semantic representation of the source program can be provided.
In a third implementation form of the method according to the second aspect as such or any preceding implementation form of the second aspect, the method further comprises determining, by an elaborator, a plurality of names and a plurality of types associated with the plurality of nodes upon the basis of the predetermined mapping data structure, and appending, by the elaborator, the plurality of names and the plurality of types to the plurality of nodes. Thus, name and type resolution upon the basis of the abstract syntax tree can be performed efficiently.
In a fourth implementation form of the method according to the third implementation form of the second aspect, the method further comprises determining, by the elaborator, the plurality of names and the plurality of types associated with the plurality of nodes upon the basis of a/the predetermined semantics data structure, the predetermined semantics data structure indicating a/the semantic specification of the target application programming interface. Thus, name and type resolution upon the basis of the abstract syntax tree can be performed more efficiently.
In a fifth implementation form of the method according to the second aspect as such or any preceding implementation form of the second aspect, the method further comprises parsing, by a parser, the source code of the source program to obtain the abstract syntax tree comprising the plurality of nodes. Thus, the abstract syntax tree can be provided efficiently.
In a sixth implementation form of the method according to the second aspect as such or any preceding implementation form of the second aspect, the virtualized code is an object-oriented virtualized code. Thus, the virtualized code can be late-bound with regard to the target application programming interface.
In a seventh implementation form of the method according to the second aspect as such or any preceding implementation form of the second aspect, the method further comprises semantically interpreting, by an interpreter, the virtualized code. Thus, a direct semantic execution of the virtualized code can be realized efficiently.
In an eighth implementation form of the method according to the second aspect as such or any preceding implementation form of the second aspect, the method further comprises generating, by an evaluator, an intermediate representation associated with the source program upon the basis of the virtualized code, the intermediate representation comprising the plurality of calls of the target application programming interface. Thus, an optimization and/or code generation upon the basis of the intermediate representation can be performed efficiently.
In a ninth implementation form of the method according to the eighth implementation form of the second aspect, the intermediate representation comprises a graph data structure indicating a graph comprising a plurality of graph nodes, the plurality of graph nodes comprising the plurality of calls of the target application programming interface. Thus, semantics of the virtualized code can be represented efficiently.
In a tenth implementation form of the method according to the eighth implementation form or the ninth implementation form of the second aspect, the intermediate representation is semantically equivalent to the virtualized code. Thus, a complete semantic representation of the virtualized code can be provided.
In an eleventh implementation form of the method according to the eighth implementation form to the tenth implementation form of the second aspect, the method further comprises generating, by a code generator, an executable machine code upon the basis of the intermediate representation, the executable machine code being executable by a processor of a computer. Thus, the source program can efficiently be executed by a computer.
In a twelfth implementation form of the method according to the eleventh implementation form of the second aspect, the method further comprises optimizing, by an optimizer, the intermediate representation with regard to a predetermined performance metric of the executable machine code. Thus, a performance of the executable machine code can be improved.
According to a third aspect, the invention relates to a computer program comprising a computer program code for performing the method, when executed on a computer. Thus, the method can be performed in an automatic and repeatable manner. The apparatus can be programmably arranged to perform the computer program.
The invention can be implemented in hardware and/or software.
BRIEF DESCRIPTION OF EMBODIMENTS
Embodiments of the invention will be described with respect to the following figures, in which:
Fig. 1 shows a diagram of an apparatus for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment;
Fig. 2 shows a diagram of a method for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment;
Fig. 3 shows a diagram of a generic structure for compiling a source code of a source program into executable machine code;
Fig. 4 shows a diagram of a two-phase development approach using an integrated development environment;
Fig. 5 shows a diagram of an apparatus for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment; and
Fig. 6 shows a diagram of semantic equivalences between a source code, a virtualized code, and an intermediate representation within an apparatus according to an embodiment.
DETAILED DESCRIPTION OF EMBODIMENTS
Fig. 1 shows a diagram of an apparatus 100 for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment. The abstract syntax tree comprises a plurality of nodes.
The apparatus 100 comprises a code virtualizer 101 being configured to associate the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and to generate a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
Fig. 2 shows a diagram of a method 200 for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment. The abstract syntax tree comprises a plurality of nodes.
The method 200 comprises associating 201 the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and generating 203 a virtualized code upon the basis of the plurality
of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
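The associating 201 and generating 203 steps can be illustrated by a minimal sketch. It assumes a tiny arithmetic language; the node classes `Num`, `Var`, and `BinOp`, the `MAPPING` table, and the API call names `const`, `load`, `add`, and `mul` are hypothetical and are not taken from the description.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical AST node types for a tiny arithmetic DSL (illustration only).
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, Var, BinOp]

# Assumed mapping data structure: (node kind, operator) -> target API call name.
MAPPING = {
    ("Num", None): "const",
    ("Var", None): "load",
    ("BinOp", "+"): "add",
    ("BinOp", "*"): "mul",
}


def virtualize(node: Node) -> str:
    """Emit virtualized code: text consisting only of target API calls."""
    if isinstance(node, Num):
        return f"api.{MAPPING[('Num', None)]}({node.value})"
    if isinstance(node, Var):
        return f"api.{MAPPING[('Var', None)]}({node.name!r})"
    if isinstance(node, BinOp):
        call = MAPPING[("BinOp", node.op)]
        return f"api.{call}({virtualize(node.left)}, {virtualize(node.right)})"
    raise TypeError(f"unmapped node: {node!r}")


# AST of the source expression x + 2 * y.
ast = BinOp("+", Var("x"), BinOp("*", Num(2), Var("y")))
virtualized = virtualize(ast)
print(virtualized)
```

In this sketch, extending the language with a new operator amounts to adding one entry to `MAPPING`, which mirrors the parameterization by a mapping data structure.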
In the following, further implementation forms and embodiments of the apparatus 100 and the method 200 are described. The apparatus 100 and the method 200 allow for a semantic debugging of domain specific languages (DSLs).
Fig. 3 shows a diagram of a generic structure for compiling a source code of a source program into executable machine code. The diagram relates to a structure of a programming language implementation.
The generic structure comprises the following blocks, wherein processing is performed in order: a parser 301, an elaborator 303 for name and type resolution, an optimizer 305, and a code generator 307. These four blocks are typically comprised by the generic structure. Furthermore, an executer 309 can be provided for executing the target executable machine code. The executer 309 can be realized by computer hardware or by an interpreter.
Compiler-compilers can automate the design of some of the blocks 301-307 to varying degrees. A high level of automation can e.g. be achieved for the parser 301 by using parser generators. A parser generator can accept a language grammar specification and, optionally, specific information about an internal program representation. The output of the parser 301 can be an intermediate representation (IR) of the source program, also referred to as abstract syntax tree (AST). Further approaches can be used in order to provide abstract syntax trees.
The construction of the second block, the elaborator 303 for name and type resolution, can be less automated, because rules and algorithms of name and type resolution may greatly vary from language to language. Nevertheless, a joint use of several languages in one application can assume that name and type resolution rules of these languages have much in common. A certain automation of the construction of the elaborator 303 can be achieved by a library of building blocks and/or modules, which can be used to
design an elaborator 303 for a particular domain specific language. The name and type resolution for modern programming languages may be parameterized to a certain degree. While compiling a module of a source program, the elaborator 303 may not use the entire code of other program modules. Rather, it can accept a description of interfaces which the other modules satisfy. For example, in a Java based environment, the interfaces can be supplied in class files, jar files, as well as Java files with source code of interfaces. These features of elaborators can be utilized in embodiments of the invention and can be additionally parameterized.
The optimizer 305 can be built for intermediate representations of a large variety of programming languages. For example, an LLVM optimization framework can be used, which can accept languages such as C, C++, Java, and other imperative and/or object-oriented languages.
The code generator 307 may not depend on details of the source language, may be designed for a class of intermediate representations, and may depend on primitive operations of a model of computation and/or computer architecture. It can generate code in terms of low level executable machine code or higher level code, which can use a target application programming interface that defines basic semantic notions in terms of which the semantics of the language are expressed.
Consequently, the specification of language syntax and related notions can be automated to a certain degree. On the other hand, the specification of semantic notions, from name and type resolution to code generation, may require additional effort. Embodiments of the invention apply techniques collectively denoted as multistage programming or simply staging. There are different techniques to implement staging, e.g. staged evaluation.
Within the generic structure for compilation, an extension of a programming language can be difficult in particular with regard to the middle blocks. Programming languages, in particular domain specific languages, may not be created in one step. They may be created gradually by a stepwise addition of language features and
extension of the generic structure for compilation. The desire for adding a feature for a programming language can e.g. arise when a new basic semantic notion is added to a model of computation and can be expressed as a new primitive operation in the target application programming interface. Then, the task of extending the programming language and the compiler can arise.
It can comprise an extension of the language syntax and a modification of the blocks e.g. from parsing to code generation using the new operation. Changing language syntax can be easy when the parser 301 is implemented by using a convenient grammar definition and a parser generator. A modification of the other blocks 303-307 may be much more laborious, as there may be no general convenient approaches, and they may be constructed using ad-hoc decisions.
An efficient structure of a compiler can be desired in order to make this task easy and to raise the productivity and decrease the cost of constructing and evolving new programming languages, in particular domain specific languages that can change rapidly. Furthermore, an efficient structure of an elaborator 303 for name and type resolution and of an optimizer 305 can be desired that can allow for parameterization with specific information about name and type resolution for a domain specific language and a mapping of nodes of an abstract syntax tree to operations of a backend.
Furthermore, a two-mode evaluation of the source code may not be supported by integrated development environments (IDEs), although it may be highly desirable in practice when using a generic structure for compilation. Common integrated development environments can provide a user interface around a compiler and/or interpreter of one or many programming languages. Since compilers may have a linear ordering of compilation steps, a user may only run and debug the program after it is compiled. Since many programs may utilize parallelized hardware and may thus comprise parts that run in parallel, it can be difficult to debug such parallelized programs. At the same time, an approach for developing parallelized programs can be to use high-level
abstractions with deterministic semantics. This determinism can enable semantic debugging as a special execution mode of the integrated development environment. Semantic debugging can be directly supported by both the compiler and the integrated development environment. Embodiments of the invention employ systematic approaches in order to implement a two-mode evaluation within integrated development environments.
Fig. 4 shows a diagram of a two-phase development approach using an integrated development environment. The two-phase development approach can equivalently be referred to as two-mode development approach.
Embodiments of the invention enable the usage of the two-phase development approach for a productive, cheap and easy specification of semantics of domain specific languages and for providing an extendible structure for the variety of domain specific languages.
Embodiments of the invention realize an efficient compiler which can be parameterized in such a way that a developer of a programming language can specify not just the syntax of the language and related notions, but also semantic notions, including a specification of a target application programming interface, which can define primitive operations of the language, and a mapping of nodes of the abstract syntax tree to the target application programming interface. Furthermore, given the target application programming interface and the mapping specifications, a complete name and/or type resolution for domain specific languages can be performed, which can be based on object-oriented notions and can allow for an easy integration with object-oriented languages, such as Java.
When a target language is extended, for example when a domain specific language implementation is extended with additional primitives and constructions, the frontend can automatically adapt to this change accordingly. So, new features can automatically be available within the frontend. The frontend can be a reflection of the semantics which are specified in the backend. Such a compiler can be regarded as a
compiler-compiler since it can allow for a specification of semantics of a domain specific language additionally to its syntax. Moreover, two-mode or two-phase program evaluation inside the integrated development environment can be supported. This two-mode evaluation can allow implementing a semantic debugging in addition to compilation. Thus, duplication of efforts can be eliminated, i.e. a single specification of the semantics can allow both for semantic debugging and for compilation.
Fig. 5 shows a diagram of an apparatus 100 for processing an abstract syntax tree being associated with a source code of a source program according to an embodiment. The apparatus 100 forms a possible implementation of the apparatus 100 as described in conjunction with Fig. 1. The apparatus 100 comprises a code virtualizer 101, an elaborator 501, a parser 503, an interpreter 505, an evaluator 507, an optimizer 509, and a code generator 511. Furthermore, an executer 513 is provided.
A source code of a domain specific language can be parsed by the parser 503 in order to provide an abstract syntax tree. The elaborator 501 can perform a name and/or type resolution upon the basis of the abstract syntax tree in order to obtain an abstract syntax tree with name and type information. The elaborator 501 can consider a predetermined mapping data structure, e.g. a table of mapping of nodes of the abstract syntax tree to calls of a target application programming interface and/or a predetermined semantics data structure, e.g. a semantics specification of the domain specific language. The code virtualizer 101 can generate a virtualized code upon the basis of the abstract syntax tree with name and/or type information, wherein the virtualized code can represent a single version of code with calls of the target application programming interface. The code virtualizer 101 can consider the predetermined mapping data structure, e.g. the table of mapping of nodes of the abstract syntax tree to calls of the target application programming interface and/or the predetermined semantics data structure, e.g. the semantics specification of the domain specific language. The interpreter 505 can semantically interpret the virtualized code, e.g. for enabling a semantic debugging. The interpreter 505 can process provided data. The parser 503, the elaborator 501, the code virtualizer 101, and the interpreter 505 can form a frontend.
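How an elaborator might derive type information from the two data structures can be sketched as follows. This is an illustration under assumptions: the tuple-based AST, the `API_SPEC` signatures, the `MAPPING` table, and the function `elaborate` are hypothetical names, and real name and type resolution for an object-oriented language is considerably more involved.

```python
# Assumed semantics data structure: API call name -> (argument types, result type).
API_SPEC = {
    "add": (("int", "int"), "int"),
    "concat": (("str", "str"), "str"),
    "length": (("str",), "int"),
}

# Assumed mapping data structure: AST node tag -> API call name.
MAPPING = {"+": "add", "++": "concat", "len": "length"}


def elaborate(node, env):
    """Return the node annotated with its resolved call and type.

    `node` is a tuple such as ("num", 2), ("var", "s"), or ("+", left, right);
    `env` maps variable names to type names.
    """
    tag = node[0]
    if tag == "num":
        return {"node": node, "type": "int"}
    if tag == "text":
        return {"node": node, "type": "str"}
    if tag == "var":
        if node[1] not in env:
            raise NameError(f"unresolved name: {node[1]}")
        return {"node": node, "type": env[node[1]]}
    # Operator node: resolve it to an API call and check it against the spec.
    call = MAPPING[tag]
    arg_types, result_type = API_SPEC[call]
    args = [elaborate(child, env) for child in node[1:]]
    actual = tuple(arg["type"] for arg in args)
    if actual != arg_types:
        raise TypeError(f"{call} expects {arg_types}, got {actual}")
    return {"node": node, "call": call, "args": args, "type": result_type}


# len(s) + 2 with s declared as a string.
typed = elaborate(("+", ("len", ("var", "s")), ("num", 2)), {"s": "str"})
print(typed["call"], typed["type"])
```

Because the type rules are read from `API_SPEC` rather than hard-coded, adding a primitive to the specification makes it available to the elaborator without changing `elaborate`.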
The evaluator 507 can be a staged evaluator and can generate an intermediate representation upon the basis of the virtualized code. The optimizer 509 and the code generator 511 can jointly provide an executable machine code upon the basis of the intermediate representation. The executer 513 can execute the executable machine code using provided data. The evaluator 507, the optimizer 509, the code generator 511, and the executer 513 can form a backend. In an embodiment, the backend accepts as an input the virtualized code as a virtualized representation of the source program. The intermediate representation can be provided by the evaluator 507 using staged evaluation by constructing a graph data structure. The virtualized code can be executed by the evaluator 507, e.g. the staged evaluator, which generates the intermediate representation of the source program. Optimization by the optimizer 509 and/or code generation by the code generator 511 can be realized using standard approaches, or they can be implemented as part of a staged evaluation process using graph rewriting rules.
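Optimization by rewriting during staged evaluation can be sketched as follows. The class `RewritingStagedApi` and its two rules (constant folding and multiplication by one) are assumptions chosen for illustration; the description does not specify which rewriting rules are used.

```python
class RewritingStagedApi:
    """Hypothetical staged API binding that rewrites while it builds the graph."""

    def __init__(self):
        self.nodes = []  # each entry: (operation name, argument tuple)

    def _emit(self, op, *args):
        self.nodes.append((op, args))
        return len(self.nodes) - 1  # the index serves as the node's symbol

    def _const_value(self, sym):
        op, args = self.nodes[sym]
        return args[0] if op == "const" else None

    def const(self, value):
        return self._emit("const", value)

    def load(self, name):
        return self._emit("load", name)

    def mul(self, a, b):
        va, vb = self._const_value(a), self._const_value(b)
        if va is not None and vb is not None:
            return self.const(va * vb)  # rule 1: fold constant operands
        if va == 1:
            return b                    # rule 2: 1 * x  ->  x
        if vb == 1:
            return a                    # rule 2: x * 1  ->  x
        return self._emit("mul", a, b)


api = RewritingStagedApi()
folded = api.mul(api.const(2), api.const(3))  # becomes a single constant node
x = api.load("x")
same = api.mul(x, api.const(1))               # returns the symbol of x unchanged
```

In this sketch no `mul` node is ever recorded for either expression. Operand nodes that are no longer referenced stay in the list; a separate pass would be needed to drop them.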
The interpreter 505 can be a semantic debugger which can accept the virtualized code and can execute semantic functions contained within the virtualized code. These semantic functions can be taken from the predetermined semantics data structure indicating semantic specifications of the domain specific language during code virtualization within the frontend. A property of the virtualized code is that it can allow both semantic and staged evaluations. Semantic evaluation can be used to implement semantic debugging. Staged evaluation can be used to generate an intermediate representation for subsequent stages of the backend. The virtualized code can comprise invocations of the target application programming interface. The respective mode of operation, e.g. the semantics of staged evaluation, of the virtualized code can be implemented by selecting a corresponding version of the target application programming interface implementation.
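The selection of an evaluation mode by choosing an API implementation can be sketched as follows. The classes `SemanticApi` and `StagedApi`, the function `p_virt`, and the call names are hypothetical; the sketch relies on Python's late binding of method calls, under the assumption that the virtualized code for `x + 2 * y` has already been generated.

```python
class SemanticApi:
    """Binding for semantic evaluation: every call computes a value directly."""

    def __init__(self, data):
        self.data = data

    def const(self, value):
        return value

    def load(self, name):
        return self.data[name]

    def add(self, a, b):
        return a + b

    def mul(self, a, b):
        return a * b


class StagedApi:
    """Binding for staged evaluation: every call records an operation."""

    def __init__(self):
        self.nodes = []

    def _emit(self, op, *args):
        self.nodes.append((op, args))
        return len(self.nodes) - 1  # symbol identifying the recorded node

    def const(self, value):
        return self._emit("const", value)

    def load(self, name):
        return self._emit("load", name)

    def add(self, a, b):
        return self._emit("add", a, b)

    def mul(self, a, b):
        return self._emit("mul", a, b)


def p_virt(api):
    """Virtualized code for x + 2 * y: nothing but target API calls."""
    return api.add(api.load("x"), api.mul(api.const(2), api.load("y")))


# Mode 1: semantic evaluation yields a data value.
value = p_virt(SemanticApi({"x": 1, "y": 3}))

# Mode 2: staged evaluation yields an intermediate representation.
staged = StagedApi()
root = p_virt(staged)
```

The same `p_virt` runs unchanged in both modes: the first call returns 7, and the second leaves a five-entry operation list in `staged.nodes` whose last entry is the `add` node identified by `root`.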
The target application programming interface can be specified in an object-oriented language, such as Java, or it can be specified in other languages that can support late-binding mechanisms. The target application programming interface can be generic, which means that the types of arguments and results of methods may have type parameters, as in Java, or this typing information can be erased and the target application programming interface can be specified in terms of a universal data type, such as an object in Java. The frontend comprises the elaborator 501 for name and/or type resolution and the code virtualizer 101. These blocks can accept as inputs the predetermined mapping data structure and/or the predetermined semantics data structure. The predetermined semantics data structure can indicate a specification of the target application programming interface, e.g. in form of object-oriented code or an interface object. The predetermined mapping data structure can indicate a mapping of the nodes of the abstract syntax tree of the source code to corresponding calls or invocations of the target application programming interface as implementations of semantic functions associated with the nodes of the abstract syntax tree. The frontend can be capable of performing a complete name and/or type resolution of the source code by the elaborator 501 including those language constructs that are specified by the semantics specification of the domain specific language.
A developer of a particular programming language need only provide semantics specifications of the domain specific language. The developer need not implement name and/or type resolution algorithms for the language, including the resolution of parametrically polymorphic types, e.g. generics. This implementation may be provided by the apparatus 100. When the semantics specification of the domain specific language is extended, i.e. the implementation of the domain specific language now supports additional primitives and/or constructions, the frontend can adapt to this change accordingly. So, new features are automatically available within the frontend. The frontend may be a reflection of the semantics which are specified in the backend. In an embodiment of the invention, a sub-division into a frontend and a backend is provided, wherein the functionalities differ from those of generic structures for compilation.
In summary, a parser 503, which can be provided by a parser generator, can be used. The output of the parser 503 can be an abstract syntax tree. Furthermore, an elaborator 501 for name and/or type resolution can be used, which can perform a resolution of object-oriented languages, such as Java. It can additionally be parameterized with a mapping between primitive operations of the source language and method invocations of the target application programming interface. Given these parameters, the elaborator 501 can be capable of performing full name and/or type resolution for all language constructs given by the semantics specification of the domain specific language. A code virtualizer 101 can be employed, which can transform an abstract syntax tree into calls or invocations of the target application programming interface with respect to a provided mapping and semantics specification of the domain specific language. Optionally, an interpreter 505 of the virtualized code can be used, which can allow for a semantic execution of the virtualized code without performing optimization and/or code generation. For example, the interpreter 505 can be used for debugging. The interpreter 505 can bind interface methods of the target application programming interface, e.g. using late-binding, with a specific implementation which can allow a direct execution of all semantic functions from the semantics specification of the domain specific language.
An evaluator 507, e.g. a staged evaluator, can be applied, which can perform a staged evaluation of the virtualized code. Internally, it can perform a construction of a graph data structure, wherein it can bind interface methods of the target application programming interface with a specific implementation for generating an intermediate representation. After executing the virtualized code by this kind of interpreter, the intermediate representation can be constructed which can semantically be equivalent with regard to the virtualized code and can thus be equivalent to the source code of the source program. In order to specify different output representations or target languages, a developer of a language can define a different version of a staged evaluator library.
The staged evaluation can be implemented as described in the document WO 2015/012711 A1, which is herewith incorporated by reference in its entirety. In particular, the method for constructing a graph data structure can be used for providing the intermediate representation, wherein graph nodes of the graph data structure can be identified by symbols. In an embodiment, program operations are represented in an object-oriented programming language by objects of classes that can form a hierarchy growing from a base node class of the graph data structure. New graph nodes of the graph data structure can be produced by calling factory methods associated with existing graph nodes of the graph data structure based on a factory method design pattern implemented in the graph nodes of the graph data structure. The symbols that identify the graph nodes can be used as proxies of the graph nodes of the graph data structure according to a proxy design pattern.
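The combination of a node class hierarchy, factory methods, and symbols acting as proxies can be sketched as follows. This is an illustration under assumptions and not the construction of the incorporated document: the classes `Node`, `Const`, `Add`, `Mul`, `Sym`, and `Graph` and their method names are hypothetical.

```python
from dataclasses import dataclass


class Node:
    """Base node class; its factory methods create new nodes in the graph."""

    def add(self, graph, me, other):
        return graph.define(Add(me, other))

    def mul(self, graph, me, other):
        return graph.define(Mul(me, other))


@dataclass
class Const(Node):
    value: int


@dataclass
class Add(Node):
    lhs: "Sym"
    rhs: "Sym"


@dataclass
class Mul(Node):
    lhs: "Sym"
    rhs: "Sym"


class Sym:
    """Proxy: identifies one node and forwards factory calls to it."""

    def __init__(self, graph, ident):
        self.graph = graph
        self.ident = ident

    def node(self):
        return self.graph.defs[self.ident]

    def add(self, other):
        return self.node().add(self.graph, self, other)

    def mul(self, other):
        return self.node().mul(self.graph, self, other)


class Graph:
    """Graph data structure: a table from symbol identifiers to nodes."""

    def __init__(self):
        self.defs = {}

    def define(self, node):
        ident = len(self.defs)
        self.defs[ident] = node
        return Sym(self, ident)


g = Graph()
two = g.define(Const(2))
three = g.define(Const(3))
total = two.add(three).mul(two)  # builds the graph for (2 + 3) * 2
```

Client code only ever holds `Sym` objects, so nodes refer to each other by symbol rather than by direct object reference. A subclass of `Node` could override a factory method to specialize the nodes it produces.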
The target application programming interface, which is exposed by the evaluator 507, can be provided either as an interface in an object-oriented language or as a runtime interface object. Optionally, an optimizer 509 and/or a code generator 511 can be employed.
Embodiments of the invention allow for defining a compiler by specifying inputs of a generic structure for compilation. The inputs can be a specification of a target application programming interface, a semantics specification of a domain specific language, and a specification of a mapping of nodes of an abstract syntax tree to calls of the target application programming interface. The author of the specifications may know and understand the syntax and semantics of the specifications. This can be simpler and less error-prone than developing or modifying specific code of an implementation, which may require understanding all internals of a compiler, even if the compiler is implemented in a high-level language.
Embodiments of the invention enable a cheaper and more productive development and evolution of domain specific languages and corresponding compilers. Embodiments of the invention allow a systematic implementation of two-phase or two-mode integrated development environments and an implementation of semantic debugging. The code virtualization allows for alternative interpretations of the same source code, which can simplify a transition from a prototype to production-ready code.
Fig. 6 shows a diagram of semantic equivalences between a source code PSRC, a virtualized code PVIRT, and an intermediate representation PIR within an apparatus 100 according to an embodiment. The apparatus 100 forms a possible implementation of the apparatus 100 as described in conjunction with Fig. 1 and Fig. 5. The apparatus 100 comprises a code virtualizer 101, an elaborator 501, a parser 503, an interpreter 505, an evaluator 507, an optimizer 509, and a code generator 511. Furthermore, an executer 513 is provided.
Embodiments of the invention use a separation of blocks of an implementation of a domain specific language in two parts, a frontend and a backend. The functionality may differ from the functionality of a generic structure for compilation. The elaborator 501 can be provided with a predetermined mapping data structure of an abstract syntax tree to a target application programming interface. The output of the elaborator 501 can be transformed into a specific intermediate representation, referred to as virtualized code.
The virtualized code can comprise calls of the target application programming interface, which can implement a mechanism for invocations of semantic functions or primitives of a specific domain. The virtualized code can be an object-oriented virtualized code, which can be late-bound with regard to different implementations of the target application programming interface. Each binding can implement a different mode of evaluation, wherein at least two modes may be supported. Firstly, a semantic evaluation or interpretation can be supported in order to implement semantic debugging. Secondly, a staged evaluation or generation of an intermediate representation can be supported in order to implement a compilation of the code. The elaborator 501 for name and/or type resolution can have two additional parameters with regard to common approaches: a predetermined semantics data structure
indicating a specification of the target application programming interface, and a predetermined mapping data structure indicating a specification of a mapping from nodes of the abstract syntax tree to calls of the target application programming interface. The code virtualizer 101 can be provided with the same two parameters, and can yield the virtualized code.
The staged evaluation can satisfy the following equalities:
PSRC(DATA) = PVIRT(DATA) = PIR(DATA), wherein
PSRC denotes the source code of a program P,
PVIRT = CodeVirtualization(PSRC) denotes the virtualized code representation of the program P,
PIR = StagedEvaluation(PVIRT) denotes the generated intermediate representation of the program P, and
DATA denotes provided data.
The above mentioned equalities can hold, wherein PSRC(DATA), PVIRT(DATA), and PIR(DATA) denote respective evaluations of the respective code using the provided data.
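The equalities can be checked on a small example. The following sketch assumes a tuple-based abstract syntax tree and two operators; the functions `run_source`, `code_virtualization`, `semantic_api`, `staged_evaluation`, and `run_ir` are hypothetical stand-ins for the corresponding blocks and are not taken from the description.

```python
import operator
from types import SimpleNamespace

SRC = ("+", ("var", "x"), ("*", ("num", 2), ("var", "y")))  # x + 2 * y
CALLS = {"+": "add", "*": "mul"}  # assumed mapping of nodes to API calls


def run_source(ast, data):
    """PSRC(DATA): reference semantics given by a direct walk over the AST."""
    tag = ast[0]
    if tag == "num":
        return ast[1]
    if tag == "var":
        return data[ast[1]]
    fn = operator.add if tag == "+" else operator.mul
    return fn(run_source(ast[1], data), run_source(ast[2], data))


def code_virtualization(ast):
    """PVIRT: a function of an API binding that performs only API calls."""
    tag = ast[0]
    if tag == "num":
        return lambda api: api.const(ast[1])
    if tag == "var":
        return lambda api: api.load(ast[1])
    left, right = code_virtualization(ast[1]), code_virtualization(ast[2])
    return lambda api: getattr(api, CALLS[tag])(left(api), right(api))


def semantic_api(data):
    """API binding for semantic evaluation over the provided data."""
    return SimpleNamespace(const=lambda v: v, load=lambda n: data[n],
                           add=operator.add, mul=operator.mul)


def staged_evaluation(p_virt):
    """PIR: the recorded operation list together with the root index."""
    ops = []

    def emit(op):
        def record(*args):
            ops.append((op, args))
            return len(ops) - 1
        return record

    api = SimpleNamespace(**{op: emit(op) for op in ("const", "load", "add", "mul")})
    return ops, p_virt(api)


def run_ir(ir, data):
    """PIR(DATA): evaluate the recorded operations in order."""
    ops, root = ir
    vals = []
    for op, args in ops:
        if op == "const":
            vals.append(args[0])
        elif op == "load":
            vals.append(data[args[0]])
        else:
            fn = operator.add if op == "add" else operator.mul
            vals.append(fn(vals[args[0]], vals[args[1]]))
    return vals[root]


DATA = {"x": 1, "y": 3}
p_virt = code_virtualization(SRC)
p_ir = staged_evaluation(p_virt)
results = (run_source(SRC, DATA), p_virt(semantic_api(DATA)), run_ir(p_ir, DATA))
print(results)
```

All three evaluations agree for this input. A test of this kind checks the equalities only for the chosen data and does not prove them in general.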
In common approaches, no support of two-phase development is provided, wherein a first phase is semantic validation and a second phase is performance profiling. In an embodiment of the invention, an author of a new domain specific language for a particular domain can describe domain operations using the backend target application programming interface and can then use the described integrated development environment being able to run code of the domain specific language in a debugger and to generate machine executable code for the target platform at runtime. In common approaches, duplication of efforts is made by programmers when developing parallelized programs. Embodiments of the invention support a two-phase or two-mode development workflow. In common approaches, the coupling of the frontend
and the backend can make it difficult to implement alternative frontends for a given backend. In embodiments of the invention, by exposing the backend as an application programming interface, it is easy to add alternative frontends for an already existing backend. It is also possible to combine many backends with a single frontend.
Throughout the description, the following definitions and acronyms are used.
An abstract syntax tree (AST) refers to a tree data structure that is used to represent an abstract syntactic structure of a source code. Each node of the abstract syntax tree can denote a construct occurring in the source code. An application programming interface (API) refers to a specification of a set of subroutines, e.g. procedures, functions, and methods, which are to be called in order to interoperate with some subsystem. Usually an application programming interface is meant to be a set of interfaces in terms of an object-oriented language. A target application programming interface refers to a specification of a set of primitive operations, which are used in a target code. These operations can be elementary building blocks of language semantics and constitute an application programming interface (API) to a language runtime system.
A compiler refers to a computer program that processes a source code of a source program and generates an executable machine code. A compiler-compiler refers to a tool which can create a parser, an interpreter, or a compiler from a specific form of a formal description of a language and machine. A compile time refers to notions and operations performed when a compiler runs. An executable machine code refers to a sequence of machine code instructions to be executed by a processor of a computer in order to perform a given task.
A generic program refers to a program written in terms of to-be-specified-later types that are instantiated when needed for specific types provided as parameters. This approach permits writing common functions or types that may only differ in the set of types on which they operate when used, thus reducing duplication. An interpreter refers to a computer program that directly executes, i.e. performs, instructions written in a programming or scripting language, without previously batch-compiling them into machine language.
An intermediate representation (IR) refers to a data structure that can be used inside a compiler for representing the source program and can allow an analysis and transformation before outputting an executable machine code. An intermediate representation can be a graph or a tree data structure with specific information inside the nodes. As the intermediate representation can comprise all information for evaluating the program if given the input data, the evaluation of the intermediate representation can be regarded as an equivalent way to execute the source program. Late binding or dynamic binding refers to a computer programming mechanism in which the method being called upon an object is looked up by name at runtime.
In object-oriented programming (OOP), a method relates to a subroutine or procedure associated with a class. Methods can define a behavior to be exhibited by instances of the associated class at program runtime. Methods can have the special property at runtime that they have access to data stored within an instance of the class they are associated with and are thereby able to control the state of the instance. An object-oriented language (OOL) refers to a computer programming language that supports object-oriented programming (OOP). Object-oriented programming refers to a programming paradigm using objects - usually instances of a class - comprising data fields and methods together with their interactions in order to design applications and computer programs. Programming techniques may include features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance. Parametric polymorphism refers to the property of a programming language that types expressed in it may have parameters. Parametrically polymorphic types refer to types in a programming language that may have parameters. A runtime refers to notions and operations performed when a program is executed after compilation.
A source code refers to a textual representation of a source program using a specific programming language. A source program refers to a program that can be used as an input to a compiler. It can be translated into executable machine code. A staged code can relate to a staged program representation. A staged code for a source program P
can be a program P' such that evaluation of P' produces an intermediate representation, which is semantically equivalent to the program P. In staged evaluation, instead of resulting data values, the result of executing the program can be an intermediate representation (IR). This intermediate representation can comprise a data structure that keeps track of all operations that were used in the program along with their order. A target code refers to a programming language code, to which the source code is compiled. Target code may be either low level as an executable machine code or higher level as a code in a conventional programming language, including object-oriented ones.
Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims
1. An apparatus (100) for processing an abstract syntax tree being associated with a source code of a source program, the abstract syntax tree comprising a plurality of nodes, the apparatus (100) comprising: a code virtualizer (101) being configured
to associate the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls, and
to generate a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
2. The apparatus (100) of claim 1, wherein the code virtualizer (101) is further configured to associate the plurality of nodes with the plurality of calls upon the basis of a predetermined semantics data structure, the predetermined semantics data structure indicating a semantic specification of the target application programming interface.
3. The apparatus (100) of any of the preceding claims, wherein the virtualized code is semantically equivalent to the source code of the source program.
4. The apparatus (100) of any of the preceding claims, further comprising: an elaborator (501) being configured to determine a plurality of names and a plurality of types associated with the plurality of nodes upon the basis of the predetermined mapping data structure, and to append the plurality of names and the plurality of types to the plurality of nodes.
5. The apparatus (100) of claim 4, wherein the elaborator (501) is further configured to determine the plurality of names and the plurality of types associated with the plurality of nodes upon the basis of a predetermined semantics data structure, the predetermined semantics data structure indicating a semantic specification of the target application programming interface.
6. The apparatus (100) of any of the preceding claims, further comprising: a parser (503) being configured to parse the source code of the source program to obtain the abstract syntax tree comprising the plurality of nodes.
7. The apparatus (100) of any of the preceding claims, wherein the virtualized code is an object-oriented virtualized code.
8. The apparatus (100) of any of the preceding claims, further comprising: an interpreter (505) being configured to semantically interpret the virtualized code.
9. The apparatus (100) of any of the preceding claims, further comprising: an evaluator (507) being configured to generate an intermediate representation associated with the source program upon the basis of the virtualized code, the intermediate representation comprising the plurality of calls of the target application programming interface.
10. The apparatus (100) of claim 9, wherein the intermediate representation comprises a graph data structure indicating a graph comprising a plurality of graph nodes, the plurality of graph nodes comprising the plurality of calls of the target application programming interface.
11. The apparatus (100) of claims 9 or 10, wherein the intermediate representation is semantically equivalent to the virtualized code.
12. The apparatus (100) of claims 9 to 11, further comprising: a code generator (511) being configured to generate an executable machine code upon the basis of the intermediate representation, the executable machine code being executable by a processor of a computer.
13. The apparatus (100) of claim 12, further comprising: an optimizer (509) being configured to optimize the intermediate representation with regard to a predetermined performance metric of the executable machine code.
14. A method (200) for processing an abstract syntax tree being associated with a source code of a source program, the abstract syntax tree comprising a plurality of nodes, the method (200) comprising: associating (201) the plurality of nodes of the abstract syntax tree with a plurality of calls of a target application programming interface upon the basis of a predetermined mapping data structure, the predetermined mapping data structure indicating semantic associations between the plurality of nodes and the plurality of calls; and generating (203) a virtualized code upon the basis of the plurality of nodes, the virtualized code indicating the plurality of calls of the target application programming interface.
15. A computer program comprising a computer program code for performing the method (200) of claim 14, when executed on a computer.
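The association step and the generation step recited in claims 1 and 14 can be pictured with the following minimal sketch. It is not the claimed implementation: it assumes Python's standard `ast` module (Python 3.9 or later for `ast.unparse`) as the parser, a plain dictionary `MAPPING` as a stand-in for the predetermined mapping data structure, and a hypothetical target application programming interface exposed under the name `api`. The classes `DirectApi` and `GraphApi` are likewise invented for illustration.

```python
import ast

# Stand-in for the mapping data structure: AST node type -> API call name.
MAPPING = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


class Virtualizer(ast.NodeTransformer):
    """Rewrites mapped binary operations into calls of the target API."""

    def __init__(self, mapping, api_name="api"):
        self.mapping = mapping
        self.api_name = api_name

    def visit_BinOp(self, node):
        self.generic_visit(node)  # virtualize the operands first
        method = self.mapping.get(type(node.op))
        if method is None:
            return node  # no association known: leave the node unchanged
        func = ast.Attribute(
            value=ast.Name(id=self.api_name, ctx=ast.Load()),
            attr=method,
            ctx=ast.Load(),
        )
        call = ast.Call(func=func, args=[node.left, node.right], keywords=[])
        return ast.copy_location(call, node)


def virtualize(source, mapping=MAPPING):
    """Parses an expression and returns the virtualized syntax tree."""
    tree = ast.parse(source, mode="eval")
    return ast.fix_missing_locations(Virtualizer(mapping).visit(tree))


class DirectApi:
    """One API implementation: computes values directly."""
    add = staticmethod(lambda a, b: a + b)
    sub = staticmethod(lambda a, b: a - b)
    mul = staticmethod(lambda a, b: a * b)


class GraphApi:
    """Another implementation: builds a nested-tuple representation."""
    add = staticmethod(lambda a, b: ("add", a, b))
    sub = staticmethod(lambda a, b: ("sub", a, b))
    mul = staticmethod(lambda a, b: ("mul", a, b))


tree = virtualize("a + b * 2")
virtualized_source = ast.unparse(tree)
code = compile(tree, "<virtualized>", "eval")
direct = eval(code, {"api": DirectApi, "a": 1, "b": 3})
graph = eval(code, {"api": GraphApi, "a": "a", "b": "b"})
```

In this sketch the virtualized code contains only calls of the target API, so the same code can be interpreted directly (`DirectApi`) or evaluated into a small graph-like structure (`GraphApi`), loosely mirroring the interpreter of claim 8 and the evaluator of claims 9 and 10.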
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201580078525.XA CN110149800B (en) | 2015-04-07 | 2015-04-07 | An apparatus for processing an abstract syntax tree associated with source code of a source program |
PCT/RU2015/000218 WO2016163901A1 (en) | 2015-04-07 | 2015-04-07 | An apparatus for processing an abstract syntax tree being associated with a source code of a source program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/RU2015/000218 WO2016163901A1 (en) | 2015-04-07 | 2015-04-07 | An apparatus for processing an abstract syntax tree being associated with a source code of a source program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016163901A1 (en) | 2016-10-13 |
Family
ID=54366491
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/RU2015/000218 WO2016163901A1 (en) | 2015-04-07 | 2015-04-07 | An apparatus for processing an abstract syntax tree being associated with a source code of a source program |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110149800B (en) |
WO (1) | WO2016163901A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111091612B (en) * | 2019-10-09 | 2023-06-02 | 武汉凌久微电子有限公司 | Method and device for generating coloring language machine code of abstract target code architecture |
CN112698819A (en) * | 2019-10-22 | 2021-04-23 | 北京信普飞科科技有限公司 | Method, device and storage medium for designing tree-oriented object programming program |
CN110825384A (en) * | 2019-10-28 | 2020-02-21 | 国电南瑞科技股份有限公司 | ST language compiling method, system and compiler based on LLVM |
CN112346730B (en) * | 2020-11-04 | 2021-08-27 | 星环信息科技(上海)股份有限公司 | Intermediate representation generation method, computer equipment and storage medium |
CN112631944A (en) * | 2020-12-31 | 2021-04-09 | 平安国际智慧城市科技股份有限公司 | Source code detection method and device based on abstract syntax tree and computer storage medium |
CN112799677B (en) * | 2021-02-05 | 2023-09-12 | 北京字节跳动网络技术有限公司 | Method, device, equipment and storage medium for hook of compiling period |
US11847436B2 (en) * | 2022-01-25 | 2023-12-19 | Hewlett Packard Enterprise Development Lp | Machine learning (ML) model-based compiler |
CN114661341B (en) * | 2022-03-28 | 2025-03-18 | 北京白海科技有限公司 | A method and device for obtaining code auxiliary information |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7076772B2 (en) | 2003-02-26 | 2006-07-11 | Bea Systems, Inc. | System and method for multi-language extensible compiler framework |
WO2015012711A1 (en) | 2013-07-23 | 2015-01-29 | Huawei Technologies Co., Ltd | Method for constructing a graph-based intermediate representation in a compiler |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005515518A (en) * | 2001-05-11 | 2005-05-26 | コンピュータ アソシエイツ シンク,インコーポレイテッド | Method and system for converting legacy software applications into modern object-oriented systems |
CN100461132C (en) * | 2007-03-02 | 2009-02-11 | 北京邮电大学 | Software security code analyzer and detection method based on source code static analysis |
CN101261604B (en) * | 2008-04-09 | 2010-09-29 | 中兴通讯股份有限公司 | Software quality evaluation apparatus and software quality evaluation quantitative analysis method |
CN101286132B (en) * | 2008-06-02 | 2010-09-08 | 北京邮电大学 | A testing method and system based on software defect mode |
JP5385102B2 (en) * | 2009-11-24 | 2014-01-08 | 株式会社野村総合研究所 | Source analysis program, preprocessor, lexer, and syntax tree analysis program |
CN102073589B (en) * | 2010-12-29 | 2013-07-03 | 北京邮电大学 | Code static analysis-based data race detecting method and system thereof |
US9449185B2 (en) * | 2011-12-16 | 2016-09-20 | Software Ag | Extensible and/or distributed authorization system and/or methods of providing the same |
US9411581B2 (en) * | 2012-04-18 | 2016-08-09 | Gizmox Transposition Ltd. | Code migration systems and methods |
CN104182267B (en) * | 2013-05-21 | 2019-10-25 | 南京中兴新软件有限责任公司 | Compilation Method, means of interpretation, device and user equipment |
2015
- 2015-04-07 WO PCT/RU2015/000218 patent/WO2016163901A1/en active Application Filing
- 2015-04-07 CN CN201580078525.XA patent/CN110149800B/en active Active
Non-Patent Citations (4)
Title |
---|
ARVIND K SUJEETH ET AL: "Composition and Reuse with Compiled Domain-Specific Languages", 1 July 2013, ECOOP 2013 OBJECT-ORIENTED PROGRAMMING, SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 52 - 78, ISBN: 978-3-642-39037-1, XP047033428 * |
ARVIND K. SUJEETH ET AL: "Delite: A Compiler Architecture for Performance-Oriented Embedded Domain-Specific Languages", ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, vol. 13, no. 4s, 1 July 2014 (2014-07-01), pages 1 - 25, XP055187083, ISSN: 1539-9087, DOI: 10.1145/2584665 * |
RICKARD E FAITH ET AL: "KHEPERA: A System for Rapid Implementation of Domain Specific Languages", USENIX,, 9 October 1997 (1997-10-09), pages 1 - 14, XP061011952 * |
ROMPF TIARK ET AL: "Scala-Virtualized: linguistic reuse for deep embeddings", HIGHER-ORDER AND SYMBOLIC COMPUTATION, KLUWER ACADEMIC PUBLISHER, NORWELL, MA, US, vol. 25, no. 1, 20 September 2013 (2013-09-20), pages 165 - 207, XP035456375, ISSN: 1388-3690, [retrieved on 20130920], DOI: 10.1007/S10990-013-9096-9 * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11954514B2 (en) | 2019-04-30 | 2024-04-09 | Automation Anywhere, Inc. | Robotic process automation system with separate code loading |
US11113095B2 (en) | 2019-04-30 | 2021-09-07 | Automation Anywhere, Inc. | Robotic process automation system with separate platform, bot and command class loaders |
US11243803B2 (en) * | 2019-04-30 | 2022-02-08 | Automation Anywhere, Inc. | Platform agnostic robotic process automation |
US11614731B2 (en) | 2019-04-30 | 2023-03-28 | Automation Anywhere, Inc. | Zero footprint robotic process automation system |
US11301224B1 (en) | 2019-04-30 | 2022-04-12 | Automation Anywhere, Inc. | Robotic process automation system with a command action logic independent execution environment |
US11334467B2 (en) | 2019-05-03 | 2022-05-17 | International Business Machines Corporation | Representing source code in vector space to detect errors |
CN110457065B (en) * | 2019-08-14 | 2023-11-07 | 中国工商银行股份有限公司 | Method and apparatus for obtaining applications compatible with multi-version systems |
CN110457065A (en) * | 2019-08-14 | 2019-11-15 | 中国工商银行股份有限公司 | For obtaining the method and device of compatible multi version systematic difference |
US11640282B2 (en) * | 2019-10-24 | 2023-05-02 | Here Global B.V. | Method, apparatus, and system for providing a broker for data modeling and code generation |
CN111240772A (en) * | 2020-01-22 | 2020-06-05 | 腾讯科技(深圳)有限公司 | Data processing method and device based on block chain and storage medium |
CN111367527B (en) * | 2020-02-18 | 2023-03-28 | 北京字节跳动网络技术有限公司 | Language processing method, device, medium and electronic equipment |
CN111367527A (en) * | 2020-02-18 | 2020-07-03 | 北京字节跳动网络技术有限公司 | Language processing method, device, medium and electronic equipment |
CN113672224A (en) * | 2021-08-20 | 2021-11-19 | 上海哔哩哔哩科技有限公司 | Method and device for generating small program page code and computer equipment |
CN114090964A (en) * | 2021-11-18 | 2022-02-25 | 北京五八信息技术有限公司 | Code processing method and device, electronic equipment and readable medium |
CN114528013A (en) * | 2021-12-27 | 2022-05-24 | 北京达佳互联信息技术有限公司 | Text generation method and device, electronic equipment, storage medium and product |
CN116974573A (en) * | 2023-07-10 | 2023-10-31 | 中国人民解放军陆军工程大学 | Compiling method for application program of fully distributed intelligent building system |
CN118296220A (en) * | 2024-03-25 | 2024-07-05 | 南通大学 | An intelligent retrieval API recommendation method based on LSTM |
Also Published As
Publication number | Publication date |
---|---|
CN110149800A (en) | 2019-08-20 |
CN110149800B (en) | 2021-12-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15788493; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 15788493; Country of ref document: EP; Kind code of ref document: A1 |