CN116719530A - Methods, devices, computer equipment and storage media for improving code performance - Google Patents
- Publication number
- CN116719530A, CN202310699678.2A
- Authority
- CN
- China
- Prior art keywords
- function
- files
- file
- optimizer
- execution time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/73—Program documentation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
Description
Technical field
The present invention relates to the field of computer technology, and specifically to a method, an apparatus, a computer device and a storage medium for improving code performance.
Background
In computer science, compilers are one of the key technologies for program optimization and have achieved remarkable results in generating efficient code. Compilers embed a large number of code optimization techniques, with loop nests as the main focus of optimization, and these techniques have been developed and refined over many years. However, the performance of the code generated by any two compilers can differ greatly because of their respective intermediate representations (IR), the loop transformations they implement and the order in which these are applied, the cost models they use, and their instruction selection and scheduling. In addition, a compiler must account for the effect on overall performance of the complex pipelines, multiple functional units and complex memory hierarchies of multi-core processors. As a result, with existing code optimization techniques the code generated by one compiler may not match the performance of the code generated by another, and it cannot be determined whether a compiler is close to optimal performance or whether there is room for improvement, so the target code that is finally generated performs poorly.
In view of this, a method that can effectively improve code performance is urgently needed.
Summary of the invention
In view of this, the present invention provides a method, an apparatus, a computer device and a storage medium for improving code performance, to solve the problem that generated code performs poorly.
In a first aspect, the present invention provides a method for improving code performance, comprising:
performing function extraction on program source code to obtain a main body file and a plurality of function files, wherein the main body file contains the main body code of the program source code and each function file contains one first function;
optimizing the main body file and the plurality of function files with each of a plurality of optimizers to obtain a plurality of folders, wherein each folder contains a plurality of first executable files, each first executable file contains one second function, the second function is obtained by optimizing a first function, and the plurality of folders correspond one-to-one to the plurality of optimizers;
executing each first executable file to determine the execution time of the second function in each first executable file;
selecting, from the plurality of optimizers, a target optimizer for each function file based on the execution time of the second function in each first executable file.
In the method for improving code performance provided by the present invention, each of the plurality of optimizers optimizes the main body file and the plurality of function files to obtain the first executable files, so that the function-level code is searched comprehensively and all possible optimization options are found; by comparing the execution times of the optimized functions, the best-performing function code is selected and linked into the executable file with the best optimization effect, which effectively improves the performance of the generated code.
In an optional implementation, the method further comprises:
compiling each of the plurality of function files with its corresponding target optimizer to generate a plurality of first redirectable files, wherein the plurality of first redirectable files correspond one-to-one to the plurality of function files;
compiling the main body file with an initial optimizer to generate a second redirectable file;
synthesizing the second redirectable file and the plurality of first redirectable files to obtain a second executable file.
In the method for improving code performance provided by the present invention, each function file is compiled with its corresponding target optimizer to generate the first redirectable files, and the second redirectable file is synthesized with the first redirectable files to obtain the second executable file. This exploits the strengths of multiple compilers while avoiding the weaknesses of any single compiler, so that the generated target code performs better.
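The three compilation and synthesis steps above can be sketched as a small Python helper that only builds the command lines, without running them. This is a minimal sketch under stated assumptions: the file names, the `-O3` level, the output name `app_opt`, and the choice of the initial compiler as the link driver are all hypothetical; the text does not specify them. Only the widely available `-c` and `-o` driver options of GCC and Clang are used.

```python
from pathlib import Path


def final_build_commands(main_file, initial_cc, chosen, output="app_opt"):
    """Return the command lines for the final build (nothing is executed).

    chosen maps each function file to the compiler driver selected for it
    (the "target optimizer"); all names here are illustrative assumptions.
    """
    commands, objects = [], []
    # Second redirectable file: main body file compiled with the initial optimizer.
    main_obj = Path(main_file).stem + ".o"
    commands.append([initial_cc, "-O3", "-c", main_file, "-o", main_obj])
    objects.append(main_obj)
    # First redirectable files: one per function file, each with its own target optimizer.
    for func_file, cc in chosen.items():
        obj = Path(func_file).stem + ".o"
        commands.append([cc, "-O3", "-c", func_file, "-o", obj])
        objects.append(obj)
    # Synthesis: link all redirectable files into the second executable file.
    commands.append([initial_cc, *objects, "-o", output])
    return commands


cmds = final_build_commands("main.c", "gcc", {"func1.c": "clang", "func2.c": "gcc"})
for cmd in cmds:
    print(" ".join(cmd))
```

Returning the commands as lists keeps the sketch side-effect free; a caller could pass each list to `subprocess.run` once the compilers are known to be installed.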
In an optional implementation, optimizing the main body file and the plurality of function files with each of the plurality of optimizers to obtain the plurality of folders comprises:
Step 1: compiling the main body file with a target optimizer to obtain a third redirectable file;
Step 2: compiling each of the plurality of function files with the target optimizer to obtain a plurality of fourth redirectable files, wherein the plurality of function files correspond one-to-one to the plurality of fourth redirectable files;
Step 3: synthesizing the third redirectable file with each of the plurality of fourth redirectable files to obtain a plurality of first executable files, wherein the plurality of first executable files correspond one-to-one to the plurality of fourth redirectable files;
Step 4: generating the folder corresponding to the target optimizer based on the plurality of first executable files;
Step 5: taking each of the plurality of optimizers in turn as the target optimizer and repeating steps 1 to 4 to obtain the plurality of folders, wherein the plurality of folders correspond one-to-one to the plurality of optimizers.
In the method for improving code performance provided by the present invention, each extracted first function is optimized separately by multiple code optimizers and its performance is evaluated as part of the whole application, so that the function-level code is searched comprehensively to find all possible optimization options. By comparing the execution times of the optimized functions, the best-performing code is selected for linking and for generating the final executable file, which effectively improves the performance of the generated code.
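Steps 1 to 5 amount to a nested loop over optimizers and function files. The sketch below only enumerates the resulting layout as a dictionary; it does not compile anything. The `<optimizer>_prof` folder pattern follows the "GCC_prof" example given in the description, while the `.o` and `.exe` suffixes and the input names are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def plan_profiling_builds(optimizers, main_file, function_files):
    """Enumerate, per optimizer, the objects and executables of steps 1 to 4."""
    plan = {}
    for opt in optimizers:  # step 5: every optimizer takes a turn as target optimizer
        folder = PurePosixPath(f"{opt}_prof")
        # Step 1: third redirectable file, built from the main body file.
        main_obj = folder / (PurePosixPath(main_file).stem + ".o")
        executables = {}
        for func_file in function_files:
            stem = PurePosixPath(func_file).stem
            func_obj = folder / f"{stem}.o"   # step 2: fourth redirectable file
            exe = folder / f"{stem}.exe"      # step 3: first executable file
            executables[func_file] = {
                "objects": [str(main_obj), str(func_obj)],
                "executable": str(exe),
            }
        plan[opt] = executables               # step 4: one folder per optimizer
    return plan


plan = plan_profiling_builds(["GCC", "Clang"], "main.c", ["f1.c", "f2.c"])
print(plan["GCC"]["f1.c"]["executable"])
```

The plan makes the two one-to-one correspondences explicit: one folder per optimizer, and inside each folder one first executable file per function file.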
In an optional implementation, performing function extraction on the program source code to obtain the plurality of function files comprises:
performing function identification on the program source code to obtain a plurality of third functions;
adding a timing function to each of the plurality of third functions to obtain a plurality of first functions;
extracting each of the plurality of first functions to obtain the plurality of function files.
In the method for improving code performance provided by the present invention, adding a timing function to each extracted function quantifies the performance of the optimized function, so that the effect of each optimizer on the function can be judged. This makes it possible to select, for each function, the optimizer with the best optimization effect.
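The idea of attaching a timing function to an identified function can be illustrated with a Python decorator. This is only an analogue: the method instruments functions in compiled source code, and the text does not say which timer it uses. The names `timed` and `kernel` and the "name seconds" output format are hypothetical.

```python
import functools
import time


def timed(func):
    """Wrap func so that every call records and prints its wall-clock time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            wrapper.timings.append(elapsed)
            # One "function_name seconds" line per call (assumed format).
            print(f"{func.__name__} {elapsed:.6f}")
    wrapper.timings = []
    return wrapper


@timed
def kernel(n):
    """Stand-in for an extracted function: sum of squares below n."""
    return sum(i * i for i in range(n))


result = kernel(1000)
```

The wrapped function keeps its original name and return value, so the surrounding program behaves as before while the measurement is collected on the side.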
In an optional implementation, the method further comprises:
constructing a plurality of execution time files according to the function names and execution times of the second functions in the first executable files corresponding to the same optimizer, wherein each execution time file contains the function names and execution times of the second functions produced by the same optimizer, and the plurality of execution time files correspond one-to-one to the plurality of optimizers.
In the method for improving code performance provided by the present invention, an execution time file is created for each optimizer and contains the function names and execution times of the second functions produced by that optimizer. The effect of each optimizer on each first function can therefore be judged, which makes it possible to select, for each function, the optimizer with the best optimization effect.
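A minimal sketch of building one execution time file per optimizer follows. The file layout (one "function_name seconds" pair per line), the file names, and the numbers are all assumptions chosen for illustration; the timings are placeholders, not measurements.

```python
import tempfile
from pathlib import Path


def write_execution_time_file(path, timings):
    """Write one 'function_name seconds' line per second function."""
    lines = [f"{name} {seconds}" for name, seconds in timings.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


# Placeholder timings, keyed first by optimizer and then by function name.
measured = {
    "GCC": {"func1": 0.42, "func2": 1.30},
    "Clang": {"func1": 0.37, "func2": 1.45},
}

out_dir = Path(tempfile.mkdtemp())
for optimizer, timings in measured.items():
    # One execution time file per optimizer (one-to-one correspondence).
    write_execution_time_file(out_dir / f"{optimizer}_times.txt", timings)

print(sorted(p.name for p in out_dir.iterdir()))
```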
In an optional implementation, the method further comprises:
Step 1: using a read function to extract the elements of a target execution time file to obtain an element list;
Step 2: splitting each element in the element list to obtain the function names and execution times of the second functions produced by the same optimizer;
Step 3: generating a first dictionary that records the execution times of the second functions, with the function names of the second functions as keys and the corresponding execution times as values;
Step 4: taking each of the plurality of execution time files in turn as the target execution time file and repeating steps 1 to 3 to obtain a plurality of first dictionaries, wherein the plurality of first dictionaries correspond one-to-one to the plurality of optimizers.
In the method for improving code performance provided by the present invention, the execution time files are processed to generate the first dictionaries, so that the effect of each optimizer on each first function can be judged. This makes it possible to select, for each function, the optimizer with the best optimization effect.
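Steps 1 to 4 map naturally onto reading lines, splitting them, and filling a dictionary. The sketch below assumes the same hypothetical file layout as before (one "function_name seconds" pair per line, files named `<optimizer>_times.txt`); the sample values are placeholders.

```python
import tempfile
from pathlib import Path


def load_time_dictionary(path):
    """Build a first dictionary {function name: execution time} from one file."""
    with open(path, encoding="utf-8") as handle:
        elements = handle.read().splitlines()   # step 1: element list
    times = {}
    for element in elements:
        if not element.strip():
            continue                            # ignore blank lines
        name, seconds = element.split()         # step 2: split each element
        times[name] = float(seconds)            # step 3: name is key, time is value
    return times


# Two sample execution time files with placeholder values.
tmp = Path(tempfile.mkdtemp())
(tmp / "GCC_times.txt").write_text("func1 0.42\nfunc2 1.30\n", encoding="utf-8")
(tmp / "Clang_times.txt").write_text("func1 0.37\nfunc2 1.45\n", encoding="utf-8")

# Step 4: repeat for every execution time file, one dictionary per optimizer.
first_dictionaries = {
    path.stem.split("_")[0]: load_time_dictionary(path)
    for path in sorted(tmp.glob("*_times.txt"))
}
print(first_dictionaries)
```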
In an optional implementation, selecting, from the plurality of optimizers, a target optimizer for each function file based on the execution time of the second function in each first executable file comprises:
screening out, from the plurality of first dictionaries, the plurality of second functions that have the same function name as the first function of a target function file, wherein the target function file is any one of the plurality of function files;
comparing the execution times of the plurality of second functions;
taking the optimizer corresponding to the second function with the shortest execution time as the target optimizer, wherein the target optimizer is used to optimize the target function file.
In the method for improving code performance provided by the present invention, the execution times of the plurality of second functions are compared and the optimizer corresponding to the second function with the shortest execution time is taken as the target optimizer. The optimizer with the best optimization effect can thus be selected for each function, which effectively improves the generated code.
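The selection rule above reduces to taking a minimum over the first dictionaries for each function name. A minimal sketch, with placeholder timings and hypothetical optimizer and function names:

```python
def select_target_optimizers(first_dictionaries):
    """For each function name, pick the optimizer with the shortest execution time."""
    function_names = set()
    for times in first_dictionaries.values():
        function_names.update(times)
    selected = {}
    for name in sorted(function_names):
        # Screen out the second functions that share this function name.
        candidates = {
            optimizer: times[name]
            for optimizer, times in first_dictionaries.items()
            if name in times
        }
        # Compare the execution times and keep the fastest optimizer.
        selected[name] = min(candidates, key=candidates.get)
    return selected


first_dictionaries = {
    "GCC": {"func1": 0.42, "func2": 1.30},
    "Clang": {"func1": 0.37, "func2": 1.45},
}
targets = select_target_optimizers(first_dictionaries)
print(targets)
```

When two optimizers tie, `min` keeps the one that appears first in the dictionary; the text does not say how ties should be broken, so this is an arbitrary choice of the sketch.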
In a second aspect, the present invention provides an apparatus for improving code performance, comprising:
a function extraction module, configured to perform function extraction on program source code to obtain a main body file and a plurality of function files, wherein the main body file contains the main body code of the program source code and each function file contains one first function;
an optimization processing module, configured to optimize the main body file and the plurality of function files with each of a plurality of optimizers to obtain a plurality of folders, wherein each folder contains a plurality of first executable files, each first executable file contains one second function, the second function is obtained by optimizing a first function, the plurality of folders correspond one-to-one to the plurality of optimizers, and the plurality of first executable files correspond one-to-one to the plurality of function files;
an execution time determination module, configured to execute each first executable file to determine the execution time of the second function in each first executable file;
a target optimizer selection module, configured to select, from the plurality of optimizers, a target optimizer for each function file based on the execution time of the second function in each first executable file.
In a third aspect, the present invention provides a computer device, comprising a memory and a processor that are communicatively connected to each other, wherein the memory stores computer instructions and the processor executes the computer instructions to perform the method for improving code performance of the first aspect or any of its corresponding implementations.
In a fourth aspect, the present invention provides a computer-readable storage medium storing computer instructions, wherein the computer instructions cause a computer to perform the method for improving code performance of the first aspect or any of its corresponding implementations.
Description of the drawings
To explain the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic flow chart of the method for improving code performance provided by the present invention;
Figure 2 is a schematic diagram of a specific embodiment of the method for improving code performance provided by the present invention;
Figure 3 is a structural block diagram of the folders in the method for improving code performance provided by the present invention;
Figure 4 is a schematic diagram of another specific embodiment of the method for improving code performance provided by the present invention;
Figure 5 is a structural block diagram of the execution time files in the method for improving code performance provided by the present invention;
Figure 6 is a schematic diagram of yet another specific embodiment of the method for improving code performance provided by the present invention;
Figure 7 is a structural block diagram of the apparatus for improving code performance provided by the present invention;
Figure 8 is a schematic diagram of the hardware structure of the computer device provided by the present invention.
Detailed description
To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
According to an embodiment of the present invention, an embodiment of a method for improving code performance is provided. It should be noted that the steps shown in the flow charts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flow charts, in some cases the steps shown or described may be performed in a different order.
This embodiment provides a method for improving code performance. Figure 1 is a flow chart of the method according to an embodiment of the present invention. As shown in Figure 1, the flow includes the following steps:
Step S101: perform function extraction on the program source code to obtain a main body file and a plurality of function files, wherein the main body file contains the main body code of the program source code and each function file contains one first function.
In this step, the program source code is an uncompiled source file written according to the specification of a programming language. The extracted main body file is structurally similar to the source file; the difference is that the function statements in the source file are replaced with calls to the functions. In other words, function extraction turns the source file into one main body file, in which function statements are replaced with function calls, and a plurality of function files.
More specifically, the present invention does not limit how function extraction is implemented, as long as it yields one main body file and a plurality of function files. For example, the evaluateInheritedAttribute function of the SgTopDownBottomUpProcessing class in the ROSE compiler infrastructure can be overridden to traverse the AST of the source file and analyze the original file, thereby identifying the function statements. Each function statement is then written, using C++ file streams, to a separate file that can be compiled independently. In this way the function statements are extracted one-to-one into the function files, and each function file contains one first function.
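The splitting of a source file into a main body file and one file per function can be illustrated on Python source with the standard `ast` module. This is a swapped-in technique for illustration only: the description uses a ROSE AST traversal on compiled-language source, not Python. The helper name `extract_functions`, the sample source, and the use of import lines to stand in for the function declarations are assumptions; decorated or nested functions are not handled.

```python
import ast


def extract_functions(source):
    """Split source into (main body text, {function name: function text})."""
    tree = ast.parse(source)
    lines = source.splitlines()
    function_files, removed = {}, set()
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            # Each top-level function goes into its own "function file".
            function_files[node.name] = ast.get_source_segment(source, node)
            removed.update(range(node.lineno, node.end_lineno + 1))
    # The main body keeps the calls and refers to the extracted functions.
    imports = [f"from {name} import {name}" for name in function_files]
    kept = [line for number, line in enumerate(lines, start=1) if number not in removed]
    main_body = "\n".join(imports + kept).strip() + "\n"
    return main_body, function_files


SOURCE = '''def square(x):
    return x * x

def cube(x):
    return x * x * x

print(square(3) + cube(2))
'''

main_body, function_files = extract_functions(SOURCE)
print(main_body)
```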
Step S102: optimize the main body file and the plurality of function files with each of a plurality of optimizers to obtain a plurality of folders, wherein each folder contains a plurality of first executable files, each first executable file contains one second function, the second function is obtained by optimizing a first function, and the plurality of folders correspond one-to-one to the plurality of optimizers.
In this step, the plurality of optimizers are different optimizers. Each optimizer applies different optimizations to the same function, so the executable files generated by optimizing the same function with different optimizers behave the same but perform differently. Here an optimizer is a compiler, for example the GNU GCC compiler, the LLVM Clang compiler, the ARM Compiler, or Polly, a loop compiler based on the polyhedral model. An executable file is a file that the operating system can load and execute.
More specifically, each optimizer produces one folder after optimizing the main body file and the plurality of function files. For example, in Figure 2, optimizer 1 optimizes the main body file, function file 1, function file 2... and function file n to obtain folder 1; as another example, the GCC compiler generates the "GCC_prof" folder. Each folder contains a plurality of first executable files; for example, folder 1 in Figure 3 contains first executable file 1, first executable file 2... and first executable file n. Each first executable file in turn contains one second function, which is obtained by optimizing a first function. The first executable files therefore correspond one-to-one to the function files: first executable file 1 corresponds to function file 1, first executable file 2 corresponds to function file 2, and first executable file n corresponds to function file n.
Step S103: execute each first executable file to determine the execution time of the second function in each first executable file.
In this step, to compare how the same function performs after being optimized by each optimizer, the execution time of the second function in each first executable file is calculated while that first executable file runs.
Step S103 is described below using Figures 2 and 3 as an example:
Step 1: Determine the n first executable files contained in folder 1, which corresponds to optimizer 1, for example first executable file 1 (containing second function 1), first executable file 2 (containing second function 2)... and first executable file n (containing second function n). Then execute each of the n first executable files and calculate the execution time of the second function in each, recorded for example as optimizer 1-(second function 1-execution time 1, second function 2-execution time 2... and second function n-execution time n);
Step 2: Using the method of step 1, calculate the execution times of the second functions in the first executable files corresponding to the other optimizers (the optimizers other than optimizer 1).
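Running every first executable file in a folder and collecting the reported times can be sketched with `subprocess`. The sketch assumes that each executable prints a final line of the form "function_name seconds" from its built-in timing function; that output format is not given in the text. To stay runnable without any compiled binaries, the Python interpreter itself serves as a stand-in executable, and the printed numbers are placeholders.

```python
import subprocess
import sys


def measure(executable_cmd):
    """Run one first executable file and parse its last 'name seconds' line."""
    completed = subprocess.run(
        executable_cmd, capture_output=True, text=True, check=True
    )
    name, seconds = completed.stdout.strip().splitlines()[-1].split()
    return name, float(seconds)


def measure_folder(commands):
    """Step 1: collect {second function name: execution time} for one optimizer."""
    times = {}
    for cmd in commands:
        name, seconds = measure(cmd)
        times[name] = seconds
    return times


# Stand-ins for the n first executable files of one folder (placeholder output).
folder_1 = [
    [sys.executable, "-c", "print('func1 0.125')"],
    [sys.executable, "-c", "print('func2 0.5')"],
]
times_optimizer_1 = measure_folder(folder_1)
print(times_optimizer_1)
```

Step 2 then consists of calling `measure_folder` once more for each remaining optimizer's folder.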
Step S104: based on the execution time of the second function in each first executable file, select a target optimizer for each function file from the plurality of optimizers.
Step S104 is described through the following example:
Step 1: From the first executable files corresponding to a target optimizer (any one of the plurality of optimizers), screen out the target first executable file corresponding to a target function file (any one of the plurality of function files), and determine the execution time of the second function in the target first executable file. The second function in the target first executable file is obtained by optimizing the first function in the target function file.
Using Figures 2 and 3 to illustrate this step: the target function file is function file 1 and the target optimizer is optimizer 1. From the first executable files corresponding to optimizer 1 (first executable file 1, first executable file 2... and first executable file n), first executable file 1 is screened out, and the execution time A of the second function in first executable file 1 is determined.
Step 2: Following step 1, select the target first executable file corresponding to the target function file from the first executable files of each optimizer. For example, target first executable file 1 is screened out from the first executable files corresponding to optimizer 1, target first executable file 2 from those corresponding to optimizer 2, and target first executable file n from those corresponding to optimizer n, which yields n target first executable files corresponding to the target function file.
Step 3: Compare the second function execution times of the target first executable files, and take the optimizer of the first executable file whose second function has the shortest execution time as the target optimizer, which is used to optimize the target function file. For example, given the second function execution time A of target first executable file 1, the second function execution time B of target first executable file 2... and the second function execution time N of target first executable file n, the execution times A, B... and N are compared. If A is the shortest, the optimizer corresponding to the first executable file with execution time A is taken as the target optimizer and is used to optimize and compile the target function file.
In some optional implementations, the method for improving code performance further includes the following steps:
Step S105: Compile each of the multiple function files with its corresponding target optimizer to generate multiple first relocatable files; the multiple first relocatable files correspond one-to-one to the multiple function files.
Specifically, as shown in Figure 4: once step S104 has selected, for each function file, the target optimizer with the best optimization effect, each function file is optimized and compiled with its corresponding target optimizer to obtain a first relocatable file. A relocatable file here is an object file used in the compilation and linking stages, where it is combined with other files into one executable file.
Step S106: Compile the main file with the initial optimizer to generate a second relocatable file.
Specifically, as shown in Figure 4: the initial optimizer may be the default optimizer. The main file, in which every function has been replaced with a function call, can be optimized and compiled by the default optimizer to generate the second relocatable file.
Step S107: Synthesize the second relocatable file and the multiple first relocatable files to obtain a second executable file.
Specifically, once the second relocatable file and the multiple first relocatable files have been obtained through optimized compilation, all the relocatable files can be linked together to obtain the optimized executable file.
In some optional implementations, the above step S102 further includes the following steps:
Step 1: Compile the main file with the target optimizer to obtain a third relocatable file.
Step 2: Compile each of the multiple function files with the target optimizer to obtain multiple fourth relocatable files; the multiple function files correspond one-to-one to the multiple fourth relocatable files.
Step 3: Synthesize the third relocatable file with each of the multiple fourth relocatable files to obtain multiple first executable files; the multiple first executable files correspond one-to-one to the multiple fourth relocatable files.
For example, given third relocatable file 1 and fourth relocatable files 1, 2 and 3, linking third relocatable file 1 with fourth relocatable file 1 yields first executable file 1, linking third relocatable file 1 with fourth relocatable file 2 yields first executable file 2, and linking third relocatable file 1 with fourth relocatable file 3 yields first executable file 3.
Step 4: Generate the folder corresponding to the target optimizer from the multiple first executable files. For example, if the target optimizer is the GCC compiler, a "GCC_prof" folder is generated, which contains the optimized first executable file that the GCC compiler produced for each function file.
Step 5: Take each of the multiple optimizers in turn as the target optimizer and repeat steps 1 to 4 to obtain multiple folders; the multiple folders correspond one-to-one to the multiple optimizers. For example, in Figure 2, optimizer 1 corresponds to folder 1, optimizer 2 corresponds to folder 2, ..., and optimizer n corresponds to folder n.
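Steps 1 to 5 can be sketched as a small planner that only lists the compile and link commands for each optimizer and does not run them. This is a minimal sketch under stated assumptions: the function name `exploration_plan`, the `.out` suffix, the file names and the use of the common `-c` and `-o` compiler-driver flags are illustrative and do not come from the patent; only the `<optimizer>_prof` folder naming follows the "GCC_prof" example.

```python
from pathlib import Path


def exploration_plan(optimizers, main_file, function_files):
    """Return {folder: [command, ...]} for every optimizer (hypothetical helper).

    optimizers maps a display name (e.g. "GCC") to a compiler command (e.g. "gcc").
    Commands are returned as argument lists; nothing is executed here.
    """
    plan = {}
    for name, cc in optimizers.items():          # step 5: repeat for each optimizer
        folder = f"{name}_prof"                  # step 4: one folder per optimizer
        main_obj = f"{folder}/{Path(main_file).stem}.o"
        # Step 1: compile the main file into the third relocatable file.
        steps = [[cc, "-c", main_file, "-o", main_obj]]
        for func_file in function_files:
            stem = Path(func_file).stem
            func_obj = f"{folder}/{stem}.o"
            exe = f"{folder}/{stem}.out"
            # Step 2: compile one function file into a fourth relocatable file.
            steps.append([cc, "-c", func_file, "-o", func_obj])
            # Step 3: link it with the main object into a first executable file.
            steps.append([cc, main_obj, func_obj, "-o", exe])
        plan[folder] = steps
    return plan


if __name__ == "__main__":
    demo = exploration_plan({"GCC": "gcc", "Clang": "clang"}, "main.c", ["f1.c", "f2.c"])
    for folder, steps in demo.items():
        print(folder)
        for command in steps:
            print("  " + " ".join(command))
```

Returning argument lists instead of running the compilers keeps the sketch independent of which toolchains are installed; each list could be handed to `subprocess.run` in a real setup.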
In some optional implementations, the above step S101 further includes the following steps:
Step S1011: Perform function identification on the program source code to obtain multiple third functions.
Step S1012: Add a timing function to each of the multiple third functions to obtain multiple first functions.
Step S1013: Extract each of the multiple first functions to obtain multiple function files.
In these steps, a third function is a function identified in and extracted from the program source code; a first function is the function obtained by adding timing calls before and after an extracted third function; and a second function is the function obtained after an optimizer has optimized a first function. The timing function may be the gettimeofday function.
Because gettimeofday timing calls are added before and after each extracted third function, the present invention can, while executing the first executable file produced by each optimizer, measure the performance data of the second function in that file, namely its execution time. This data can be used to analyze how well the optimizer optimized the first function.
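The idea of bracketing a function with timing calls can be illustrated with a Python analogue. This is not the patent's instrumentation, which inserts gettimeofday calls into the extracted source functions; the sketch swaps in `time.perf_counter` and a decorator, and the names `timed`, `execution_times` and `sum_of_squares` are hypothetical.

```python
import functools
import time

# Accumulated execution time per function name (illustrative global store).
execution_times = {}


def timed(func):
    """Wrap func so that a timestamp is taken before and after every call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()              # timing call before the body
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start  # timing call after the body
            name = func.__name__
            execution_times[name] = execution_times.get(name, 0.0) + elapsed
    return wrapper


@timed
def sum_of_squares(n):
    """Stand-in for an extracted function whose run time is of interest."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    sum_of_squares(100_000)
    print(execution_times)
```

Accumulating the elapsed time per function name means a function called several times during one run still produces a single record.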
In some optional implementations, the method further includes:
Constructing multiple execution time files from the function names and execution times of the second functions in the first executable files that correspond to the same optimizer; each execution time file contains the function names and execution times of the second functions produced by one optimizer, and the multiple execution time files correspond one-to-one to the multiple optimizers.
Specifically, once step S103 has determined the execution times of the second functions in the first executable files corresponding to one optimizer, the function names and execution times of those second functions are used to build the execution time file for that optimizer. For example, as shown in Figure 5, execution time file 1 corresponds to optimizer 1, and each line of execution time file 1 stores the function name and execution time of the second function in one of the first executable files corresponding to optimizer 1, such as second function 1 with execution time 1, second function 2 with execution time 2, and second function 3 with execution time 3.
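An execution time file of this kind could be produced as follows. The patent only states that each line holds a function name and its execution time; the space separator, the six-decimal formatting, the file name and the helper names `format_time_file` and `write_time_file` are assumptions made for illustration.

```python
def format_time_file(times):
    """Render {function_name: seconds} as one "name seconds" record per line."""
    return "".join(f"{name} {seconds:.6f}\n" for name, seconds in times.items())


def write_time_file(path, times):
    """Write the execution time file for one optimizer (hypothetical layout)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_time_file(times))


if __name__ == "__main__":
    # Made-up measurements for the functions optimized by one optimizer.
    print(format_time_file({"func_a": 0.5, "func_b": 1.25}), end="")
```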
In some optional implementations, the method further includes:
Step 1: Use a read function to extract elements from the target execution time file and obtain an element list.
Step 2: Split each element in the element list to obtain the function names and execution times of the second functions produced by the same optimizer.
Step 3: Use the function names of the second functions as keys and their execution times as values to generate a first dictionary that records the execution times of the second functions.
Step 4: Take each of the multiple execution time files in turn as the target execution time file and repeat steps 1 to 3 to obtain multiple first dictionaries; the multiple first dictionaries correspond one-to-one to the multiple optimizers.
Specifically, as shown in Figure 6, performance data, namely the execution times of the second functions, can be extracted from the execution time files of the different optimizers, and the first dictionary for each optimizer is built from the extracted function names and execution times.
In steps 1 to 4, the data extractor in the explorer must correctly pull the performance data of every function out of the execution time files and produce dictionaries that store, for each optimizer, all function names and their execution times. To do so it reads each file line by line with Python's readlines function. readlines() is a built-in method of Python file objects that reads a text file by lines: it treats each line as one string element and returns all of them as a list, so each element of the list represents one line of the file. To use it, the file is first opened with open() and readlines() is then called on the resulting file object. Each element of the returned list is split in turn to obtain a function name and its execution time, which are stored in a dictionary with the function name as the key and the time as the value. The result is one dictionary of function times per optimizer.
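The extraction described above can be sketched with `open()` and `readlines()` as follows. The whitespace-separated "name seconds" line format and the helper names `parse_time_lines` and `load_time_dict` are assumptions, since the patent does not specify the separator.

```python
def parse_time_lines(lines):
    """Split each line into (function name, execution time) and build a dict."""
    times = {}
    for line in lines:
        fields = line.split()          # also strips the trailing newline
        if len(fields) != 2:
            continue                   # skip blank or malformed lines
        name, seconds = fields
        times[name] = float(seconds)   # key: function name, value: time
    return times


def load_time_dict(path):
    """Read one execution time file with readlines() and parse it."""
    with open(path, encoding="utf-8") as handle:
        lines = handle.readlines()     # one string element per line of the file
    return parse_time_lines(lines)


if __name__ == "__main__":
    print(parse_time_lines(["func_a 0.5\n", "func_b 1.25\n"]))
```

Calling `load_time_dict` once per execution time file gives one first dictionary per optimizer.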
In some optional implementations, step S104 further includes:
Step S1041: From the multiple first dictionaries, select the multiple second functions that have the same function name as the first function of the target function file; the target function file is any one of the multiple function files.
Specifically, each first dictionary contains exactly one second function with the same function name as the first function of the target function file.
Step S1042: Compare the execution times of these second functions.
Step S1043: Take the optimizer corresponding to the second function with the shortest execution time as the target optimizer; this target optimizer is used to optimize the target function file.
Specifically, as shown in Figure 6: after the multiple first dictionaries storing all function names and execution times for the different optimizers have been generated, a best-optimizer selection algorithm picks, for each function file, the optimizer with the shortest execution time as the target optimizer for the subsequent synthesis stage. The algorithm works as follows. With n optimizers there are n first dictionaries, each keyed by function name with the execution time as the value. With m functions to choose for, the core of the algorithm is a two-level loop. The outer loop iterates over all function names in the dictionary, m iterations in total. For the selected function, the inner loop first sets the function's minimum execution time to its execution time under one optimizer, then reads the execution times under the remaining optimizers one by one and compares them. Each comparison keeps the smaller value as the minimum execution time and records the corresponding optimizer, for a total of n-1 comparisons. The algorithm thus selects, for every function, the optimizer with the shortest execution time, which is the target optimizer.
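The two-level loop can be written out directly. This is a minimal sketch that assumes, as stated above, that every first dictionary contains every function name; the function name `select_best_optimizers`, the tie-breaking rule (the earlier optimizer wins on equal times) and the sample timings are illustrative.

```python
def select_best_optimizers(time_dicts):
    """Pick, for every function, the optimizer with the shortest execution time.

    time_dicts maps optimizer name -> {function name: execution time}.
    Returns {function name: optimizer name}.
    """
    optimizers = list(time_dicts)
    first = optimizers[0]
    best = {}
    for func_name in time_dicts[first]:            # outer loop: m functions
        best_optimizer = first                     # start from one optimizer
        best_time = time_dicts[first][func_name]
        for optimizer in optimizers[1:]:           # inner loop: n-1 comparisons
            candidate = time_dicts[optimizer][func_name]
            if candidate < best_time:              # keep the smaller value
                best_time = candidate
                best_optimizer = optimizer
        best[func_name] = best_optimizer
    return best


if __name__ == "__main__":
    # Made-up timings for three optimizers and two functions.
    demo = {
        "gcc": {"f1": 1.0, "f2": 3.0},
        "clang": {"f1": 2.0, "f2": 2.5},
        "polly": {"f1": 1.5, "f2": 2.0},
    }
    print(select_best_optimizers(demo))
```

The returned mapping has the same shape as the second dictionary described next: function name as key, chosen optimizer as value.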
In some optional implementations, as shown in Figure 6, the method further includes:
Generating a second dictionary from each function file and its corresponding target optimizer. The second dictionary stores, for each function file, the target optimizer that gave the shortest execution time; its key is the function name of the first function and its value is the corresponding target optimizer.
Specifically, the compilation script generator in the explorer uses the second dictionary to generate, for each function file in turn, a compile command that uses that file's best optimizer. The script generator then packages all of these compile commands into one compilation script for synthesizing the second executable file, which the synthesizer uses later.
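A script generator of this kind might look like the sketch below. Everything concrete in it is an assumption for illustration: the helper name `generate_build_script`, the convention that each function lives in `<function name>.c`, the `-O2` flag choice, and the output name `app_optimized`. The patent only states that one compile command is produced per function file and that the commands are bundled into one script.

```python
def generate_build_script(second_dict, compiler_commands, main_file="main.c",
                          default_cc="gcc", output="app_optimized"):
    """Turn {function name: optimizer} into a shell build script (as text).

    compiler_commands maps an optimizer name to the command line prefix used
    to invoke it, e.g. {"GCC": "gcc -O2"}.
    """
    lines = ["#!/bin/sh", "set -e"]
    objects = []
    for func_name, optimizer in second_dict.items():
        obj = f"{func_name}.o"
        # One compile command per function file, using its best optimizer.
        lines.append(f"{compiler_commands[optimizer]} -c {func_name}.c -o {obj}")
        objects.append(obj)
    # The main file is compiled with the default (initial) optimizer.
    lines.append(f"{default_cc} -c {main_file} -o main.o")
    # Link every relocatable object into the final executable.
    lines.append(f"{default_cc} main.o {' '.join(objects)} -o {output}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(generate_build_script({"f1": "GCC", "f2": "Clang"},
                                {"GCC": "gcc -O2", "Clang": "clang -O2"}), end="")
```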
Technical effects:
1. The present invention optimizes every function in an application individually: it uses different code optimizers to optimize each function, measures the performance of each optimized version while the complete application runs, and selects the best-performing version of each function to generate the complete application binary. This gives compiler users a way to find the best available solution for their target application, and gives compiler writers a heuristic method for tuning compiler performance. The framework integrates the code optimizers of GNU GCC, LLVM Clang and ARM Compiler, as well as Polly, a loop optimizer based on the polyhedral model. The multi-compiler fusion optimization framework can also readily integrate newer versions and configurations of the available code optimizers and accept new code optimizers. The method exploits the strengths of multiple compilers while avoiding the weaknesses of any single one, so that the generated object code performs better.
2. The present invention optimizes each extracted function separately with multiple code optimizers and evaluates its performance as part of the whole application. The framework searches the optimization options available for each function's code, compares the results, and selects the best-performing code for linking and generation of the final executable file, thereby effectively improving the performance of the generated code.
This embodiment also provides a device for improving code performance. The device implements the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may denote a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
This embodiment provides a device for improving code performance which, as shown in Figure 7, includes:
a function extraction module, configured to perform function extraction on the program source code to obtain a main file and multiple function files; the main file contains the main body code of the program source code, and each function file contains one first function;
an optimization processing module, configured to use each of multiple optimizers to optimize the main file and the multiple function files to obtain multiple folders; each folder contains multiple first executable files, each first executable file contains one second function, the second function is obtained by optimizing a first function, the multiple folders correspond one-to-one to the multiple optimizers, and the multiple first executable files correspond one-to-one to the multiple function files;
an execution time determination module, configured to execute each first executable file to determine the execution time of the second function in each first executable file;
a target optimizer selection module, configured to select, from the multiple optimizers, a target optimizer for each function file based on the execution time of the second function in each first executable file.
In some optional implementations, the device for improving code performance further includes:
a first compilation module, configured to compile each of the multiple function files with its corresponding target optimizer to generate multiple first relocatable files; the multiple first relocatable files correspond one-to-one to the multiple function files;
a second compilation module, configured to compile the main file with the initial optimizer to generate a second relocatable file;
a synthesis module, configured to synthesize the second relocatable file and the multiple first relocatable files to obtain a second executable file.
In some optional implementations, the optimization processing module includes:
a first compilation unit, configured to compile the main file with the target optimizer to obtain a third relocatable file;
a second compilation unit, configured to compile each of the multiple function files with the target optimizer to obtain multiple fourth relocatable files; the multiple function files correspond one-to-one to the multiple fourth relocatable files;
a synthesis unit, configured to synthesize the third relocatable file with each of the multiple fourth relocatable files to obtain multiple first executable files; the multiple first executable files correspond one-to-one to the multiple fourth relocatable files;
a folder generation unit, configured to generate the folder corresponding to the target optimizer from the multiple first executable files;
a repeated execution unit, configured to take each of the multiple optimizers in turn as the target optimizer and repeatedly run the first compilation unit, the second compilation unit, the synthesis unit and the folder generation unit to obtain multiple folders; the multiple folders correspond one-to-one to the multiple optimizers.
In some optional implementations, the function extraction module includes:
a function identification unit, configured to perform function identification on the program source code to obtain multiple third functions;
a timing setting unit, configured to add a timing function to each of the multiple third functions to obtain multiple first functions;
a function extraction unit, configured to extract each of the multiple first functions to obtain multiple function files.
In some optional implementations, the device for improving code performance further includes:
an execution time file construction module, configured to construct multiple execution time files from the function names and execution times of the second functions in the first executable files that correspond to the same optimizer; each execution time file contains the function names and execution times of the second functions produced by one optimizer, and the multiple execution time files correspond one-to-one to the multiple optimizers.
In some optional implementations, the device for improving code performance further includes:
an element extraction unit, configured to use a read function to extract elements from the target execution time file and obtain an element list;
an element splitting unit, configured to split each element in the element list to obtain the function names and execution times of the second functions produced by the same optimizer;
a first dictionary generation unit, configured to use the function names of the second functions as keys and their execution times as values to generate a first dictionary that records the execution times of the second functions;
a second repeated execution unit, configured to take each of the multiple execution time files in turn as the target execution time file and repeatedly run the element extraction unit, the element splitting unit and the first dictionary generation unit to obtain multiple first dictionaries; the multiple first dictionaries correspond one-to-one to the multiple optimizers.
In some optional implementations, the target optimizer selection module includes:
a second function selection unit, configured to select, from the multiple first dictionaries, the multiple second functions that have the same function name as the first function of the target function file; the target function file is any one of the multiple function files;
an execution time comparison unit, configured to compare the execution times of these second functions;
a target optimizer selection unit, configured to take the optimizer corresponding to the second function with the shortest execution time as the target optimizer; this target optimizer is used to optimize the target function file.
Further functional descriptions of the above modules and units are the same as in the corresponding embodiments above and are not repeated here.
本发明实施例还提供一种计算机设备,具有上述图7所示的提高代码性能装置。An embodiment of the present invention also provides a computer device having the device for improving code performance shown in FIG. 7 .
请参阅图8,图8是本发明可选实施例提供的一种计算机设备的结构示意图,如图8所示,该计算机设备包括:一个或多个处理器10、存储器20,以及用于连接各部件的接口,包括高速接口和低速接口。各个部件利用不同的总线互相通信连接,并且可以被安装在公共主板上或者根据需要以其它方式安装。处理器可以对在计算机设备内执行的指令进行处理,包括存储在存储器中或者存储器上以在外部输入/输出装置(诸如,耦合至接口的显示设备)上显示GUI的图形信息的指令。在一些可选的实施方式中,若需要,可以将多个处理器和/或多条总线与多个存储器和多个存储器一起使用。同样,可以连接多个计算机设备,各个设备提供部分必要的操作(例如,作为服务器阵列、一组刀片式服务器、或者多处理器系统)。图8中以一个处理器10为例。Please refer to Figure 8. Figure 8 is a schematic structural diagram of a computer device provided by an optional embodiment of the present invention. As shown in Figure 8, the computer device includes: one or more processors 10, a memory 20, and a device for connecting The interfaces of each component include high-speed interfaces and low-speed interfaces. Various components communicate with each other using different buses and can be installed on a common motherboard or in other ways as needed. The processor may process instructions executed within the computer device, including instructions stored in or on memory to display graphical information of the GUI on an external input/output device, such as a display device coupled to the interface. In some alternative implementations, multiple processors and/or multiple buses may be used with multiple memories and multiple memories, if desired. Likewise, multiple computer devices may be connected, each device providing part of the necessary operation (eg, as a server array, a set of blade servers, or a multi-processor system). Figure 8 takes a processor 10 as an example.
处理器10可以是中央处理器,网络处理器或其组合。其中,处理器10还可以进一步包括硬件芯片。上述硬件芯片可以是专用集成电路,可编程逻辑器件或其组合。上述可编程逻辑器件可以是复杂可编程逻辑器件,现场可编程逻辑门阵列,通用阵列逻辑或其任意组合。The processor 10 may be a central processing unit, a network processor, or a combination thereof. The processor 10 may further include a hardware chip. The above-mentioned hardware chip can be an application-specific integrated circuit, a programmable logic device or a combination thereof. The above-mentioned programmable logic device may be a complex programmable logic device, a field programmable logic gate array, a general array logic or any combination thereof.
其中,所述存储器20存储有可由至少一个处理器10执行的指令,以使所述至少一个处理器10执行实现上述实施例示出的方法。The memory 20 stores instructions that can be executed by at least one processor 10, so that the at least one processor 10 executes the method shown in the above embodiment.
存储器20可以包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需要的应用程序;存储数据区可存储根据计算机设备的使用所创建的数据等。此外,存储器20可以包括高速随机存取存储器,还可以包括非瞬时存储器,例如至少一个磁盘存储器件、闪存器件、或其他非瞬时固态存储器件。在一些可选的实施方式中,存储器20可选包括相对于处理器10远程设置的存储器,这些远程存储器可以通过网络连接至该计算机设备。上述网络的实例包括但不限于互联网、企业内部网、局域网、移动通信网及其组合。The memory 20 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function; the storage data area may store data created according to the use of the computer device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some optional implementations, the memory 20 may optionally include memories remotely located relative to the processor 10 , and these remote memories may be connected to the computer device through a network. Examples of the above-mentioned networks include but are not limited to the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
存储器20可以包括易失性存储器,例如,随机存取存储器;存储器也可以包括非易失性存储器,例如,快闪存储器,硬盘或固态硬盘;存储器20还可以包括上述种类的存储器的组合。The memory 20 may include a volatile memory, such as a random access memory; the memory may also include a non-volatile memory, such as a flash memory, a hard disk or a solid state drive; the memory 20 may also include a combination of the above types of memories.
该计算机设备还包括通信接口30,用于该计算机设备与其他设备或通信网络通信。The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.
本发明实施例还提供了一种计算机可读存储介质,上述根据本发明实施例的方法可在硬件、固件中实现,或者被实现为可记录在存储介质,或者被实现通过网络下载的原始存储在远程存储介质或非暂时机器可读存储介质中并将被存储在本地存储介质中的计算机代码,从而在此描述的方法可被存储在使用通用计算机、专用处理器或者可编程或专用硬件的存储介质上的这样的软件处理。其中,存储介质可为磁碟、光盘、只读存储记忆体、随机存储记忆体、快闪存储器、硬盘或固态硬盘等;进一步地,存储介质还可以包括上述种类的存储器的组合。可以理解,计算机、处理器、微处理器控制器或可编程硬件包括可存储或接收软件或计算机代码的存储组件,当软件或计算机代码被计算机、处理器或硬件访问且执行时,实现上述实施例示出的方法。Embodiments of the present invention also provide a computer-readable storage medium. The above-mentioned method according to the embodiment of the present invention can be implemented in hardware or firmware, or can be recorded in a storage medium, or can be implemented as original storage downloaded through the network. Computer code in a remote storage medium or a non-transitory machine-readable storage medium and to be stored in a local storage medium such that the methods described herein may be stored on a computer using a general purpose computer, a special purpose processor, or programmable or special purpose hardware Such software processing on storage media. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk or a solid state drive, etc.; further, the storage medium may also include a combination of the above types of memories. It can be understood that a computer, processor, microprocessor controller or programmable hardware includes a storage component that can store or receive software or computer code. When the software or computer code is accessed and executed by the computer, processor or hardware, the above implementations are implemented. The method illustrated.
Although embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310699678.2A CN116719530A (en) | 2023-06-13 | 2023-06-13 | Methods, devices, computer equipment and storage media for improving code performance |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310699678.2A CN116719530A (en) | 2023-06-13 | 2023-06-13 | Methods, devices, computer equipment and storage media for improving code performance |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116719530A (en) | 2023-09-08 |
Family
ID=87869296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310699678.2A Pending CN116719530A (en) | 2023-06-13 | 2023-06-13 | Methods, devices, computer equipment and storage media for improving code performance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116719530A (en) |
- 2023-06-13: CN application CN202310699678.2A, published as CN116719530A, status: active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9239710B2 (en) | Programming language transformations with abstract syntax tree extensions | |
US8032873B2 (en) | Computer program code size partitioning system for multiple memory multi-processing systems | |
CN104020994B (en) | Stream process definition device and stream process based on streaming system define method | |
JP2002099312A (en) | Programmable controller and control program development support device | |
JP2024536124A (en) | Checking source code validity when updating code | |
CN109313547A (en) | Query optimizer for cpu busy percentage and code refactoring | |
CN113283613A (en) | Deep learning model generation method, optimization method, device, equipment and medium | |
CN108197027B (en) | Software performance optimization method, storable medium, computer program | |
US20160246580A1 (en) | Whole-program optimization using data from previous compilation runs | |
US20140189664A1 (en) | METHOD FOR ENABLING COMPILATION OF A COBOL SOURCE PROGRAM UTILIZING A TWO-STAGE COMPILATION PROCESS, THE COBOL SOURCE PROGRAM INCLUDING A MIX OF COBOL, C++ or JAVA STATEMENTS, AND OPTIONAL OPENMP DIRECTIVES | |
KR102041772B1 (en) | Program editing device, program editing method and program editing program stored in the storage medium | |
US8473933B2 (en) | Refactoring call sites | |
CN116541011A (en) | Compiler based on Machine Learning (ML) model | |
US8037463B2 (en) | Computer program functional partitioning system for heterogeneous multi-processing systems | |
CN107291522A (en) | A kind of compiling optimization method and system towards custom rule file | |
CN111444513A (en) | Firmware compiling optimization option identification method and device for power grid embedded terminal | |
US8196093B2 (en) | Apparatus and method for componentizing legacy system | |
JP4652680B2 (en) | Compiling method and apparatus, and compiler | |
CN107092474A (en) | Program developing method, ETL processing method and processing devices | |
CN116719530A (en) | Methods, devices, computer equipment and storage media for improving code performance | |
JP2006163686A (en) | Compiling method, compiling program, compiling device, and recording medium for compiling | |
US12045919B2 (en) | Systems and processes for multi-directional connection of directed acyclic graphs between dashboarding tools and external data tools | |
JP5686686B2 (en) | Program trace management apparatus, program trace management method and program | |
WO2021140568A1 (en) | Function generation program, function generation method, and information processing device | |
CN113031952A (en) | Method and device for determining execution code of deep learning model and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||