[Final Project] Performance Competition (substituting final exam, due: 8th of July) #168

jeehoonkang · 2020-06-01T14:42:14Z

It turned out that we cannot physically gather for the final exam, so I decided not to hold it. As a substitute, we'll have a performance competition as the final project.

For the competition, you’ll submit your entire compiler. Predefined benchmark programs will be compiled and then executed on a HiFive Unleashed (the first Linux-bootable RISC-V development board), which is sponsored by SemiFive. If any of the results are wrong, you’ll be disqualified. The geometric mean of the CPU cycle counts will be compared across students’ compilers and clang -O1.
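As a rough illustration of how a geometric mean of cycle counts behaves, here is a minimal sketch. The function name and the cycle counts are made up for illustration; this is not the grader's actual code.

```python
import math

def geometric_mean(cycles):
    """Geometric mean of positive cycle counts, computed in log space."""
    if not cycles:
        raise ValueError("need at least one measurement")
    if any(c <= 0 for c in cycles):
        raise ValueError("cycle counts must be positive")
    return math.exp(sum(math.log(c) for c in cycles) / len(cycles))

# Hypothetical cycle counts for three benchmarks (not real measurements).
mine = [2000, 80000, 400000]      # 2x slower on the first, 2x faster on the last
baseline = [1000, 80000, 800000]

print(geometric_mean(mine))
print(geometric_mean(baseline))
```

Both lists come out at roughly 40000, because a 2x slowdown on one benchmark is cancelled by a 2x speedup on another. Under a geometric mean, relative improvements count equally regardless of how long a benchmark runs.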

Please do whatever you can to reduce the number of cycles, e.g., by implementing more optimizations or by improving your asmgen with a better register allocation algorithm.

If your compiler is better than clang -O1, you’ll get A#. If your compiler is better than those of most students, you’ll get A+. Depending on the performance of your compiler, you'll get some bonus.

cyron1259 · 2020-06-03T07:24:30Z

Would there be some kind of a leaderboard so that we can compare against others' performance?

jeehoonkang · 2020-06-03T07:49:37Z

@cyron1259 Good idea! We will prepare a leaderboard soon.

hestati63 · 2020-06-04T07:53:50Z

Since the whole compiler will be run in the final competition, I want to fuzz each optimization pass, but the fuzzer does not support this. Could you add options for fuzzing the optimization passes?

Also, could you provide the command-line arguments that will be passed to the compiler in the competition?

jeehoonkang · 2020-06-04T16:41:28Z

@hestati63

On fuzzing optimizations, let's discuss here: [HW 3~6] Testing Optimization Passes #178
On the competition's specification, I will soon prepare a detailed description of the submission procedure, competition rules, leaderboard, etc. For now, let's say (1) you can implement custom optimizations on the IR; and (2) you can optimize the naive asmgen introduced in the lecture videos.

jeehoonkang · 2020-06-15T06:49:17Z

Clarification: you need to observe the LP64D calling convention: #209 (comment)

jeehoonkang · 2020-07-01T05:26:50Z

IMPORTANT UPDATE on FINAL PROJECT

The benchmark code is uploaded: kaist-cp/kecc-public@114f38c
In the bench directory, execute make run. It will build your compiler, build the benchmark programs, run them, and measure the elapsed CPU cycles. The average is your score (lower is better).
For the time being, it runs on QEMU and the measurement is not accurate. I will soon provide a gg.kaist.ac.kr submission link so that you can run the benchmarks on the SiFive HiFive Unleashed RISC-V machine.
More benchmark programs will be added in the near future.

hestati63 · 2020-07-01T06:29:42Z

Could you announce a specific date by which the benchmark code will be finalized?

cmpark0126 · 2020-07-02T14:36:55Z

IMPORTANT: you should use the la pseudo-instruction instead of a HI20/LO12 pair when obtaining the address of a global variable.
To measure your compiler's performance for the final project, we build a shared object from the generated assembly code.
However, the HI20 and LO12 relocations (%hi and %lo) cannot be used when building a shared object.
With the la instruction, the shared object can be built normally.

So, please use the la pseudo-instruction instead of the HI20/LO12 pair, as shown below:

```
# before: load the value of nonce through a %hi/%lo pair
lui     a5,%hi(nonce)
lw      a5,%lo(nonce)(a5)

# after: la yields the address of nonce, then a separate load reads the value
la      a5,nonce
lw      a5,0(a5)
```

jeehoonkang · 2020-07-02T16:38:02Z

IMPORTANT:

I just uploaded the final project grader: kaist-cp/kecc-public@542535f Please do whatever you can to improve the "[AVERAGE]" score reported by cd bench; make run (lower is better). We recommend reading driver.cpp.

@hestati63 Sorry for uploading the grader late. It's now finalized.
You'll upload your entire src directory. Please run ./scripts/make-submissions.sh; the resulting final.zip is the file you'll upload to gg (TBA).

Medowhill · 2020-07-03T09:58:49Z

Hi. Could you let us know the scores of some reference compilers (for example, gcc -O0 and gcc -O1)? Currently, it is hard to know whether my implementation performs well or not by only seeing the score. Also, if one tries to challenge gcc / clang -O1, those scores can be good targets.

jeehoonkang · 2020-07-03T09:59:48Z

@Medowhill make run-gcc will evaluate GCC with the optimization flag -O on the same benchmarks. You can easily change the Makefile to evaluate gcc -O0 and gcc -O1 as well.

Medowhill · 2020-07-03T10:03:03Z

Thank you! I didn't notice that.

hestati63 · 2020-07-05T16:41:23Z

When will the gg grader be ready?
Since QEMU uses binary translation, the cycle count appears to depend only on the number of instructions executed.

jeehoonkang · 2020-07-05T18:50:17Z

@hestati63 I'm trying to provide the grader by tomorrow. Sorry for the delay.

jeehoonkang · 2020-07-06T19:52:45Z

You can submit the final project to gg now: https://gg.kaist.ac.kr/assignment/16/
It's running on a RISC-V machine: SiFive HiFive Unleashed running Linux

jeehoonkang · 2020-07-06T19:54:00Z

FYI, gcc -O's result is as follows:

[exotic_arguments_struct_small] 52
[exotic_arguments_struct_large] 77
[exotic_arguments_struct_small_ugly] 34
[exotic_arguments_struct_large_ugly] 138
[exotic_arguments_float] 18
[exotic_arguments_double] 19
[fibonacci_recursive] 52089252
[fibonacci_loop] 1640
[two_dimension_array] 72229
[matrix_mul] 373849
[matrix_add] 53248
[graph_dijkstra] 78160627
[graph_floyd_warshall] 151599746
[fibonacci_recursive] 52089692
[fibonacci_loop] 1787
[two_dimension_array] 74329
[matrix_mul] 372201
[matrix_add] 57362
[graph_dijkstra] 79328213
[graph_floyd_warshall] 151565586
[fibonacci_recursive] 52048084
[fibonacci_loop] 1754
[two_dimension_array] 72840
[matrix_mul] 377440
[matrix_add] 55333
[graph_dijkstra] 78468837
[graph_floyd_warshall] 151555926
[fibonacci_recursive] 52089934
[fibonacci_loop] 1759
[two_dimension_array] 72798
[matrix_mul] 372444
[matrix_add] 52443
[graph_dijkstra] 75623586
[graph_floyd_warshall] 151648428
[fibonacci_recursive] 52082904
[fibonacci_loop] 1755
[two_dimension_array] 72791
[matrix_mul] 373361
[matrix_add] 54438
[graph_dijkstra] 76326790
[graph_floyd_warshall] 151566425
[fibonacci_recursive] 52048784
[fibonacci_loop] 1782
[two_dimension_array] 72896
[matrix_mul] 379304
[matrix_add] 52175
[graph_dijkstra] 76327618
[graph_floyd_warshall] 151525424
[fibonacci_recursive] 52046427
[fibonacci_loop] 1758
[two_dimension_array] 72529
[matrix_mul] 370371
[matrix_add] 53737
[graph_dijkstra] 77295262
[graph_floyd_warshall] 151636428
[fibonacci_recursive] 52042896
[fibonacci_loop] 1775
[two_dimension_array] 72726
[matrix_mul] 376777
[matrix_add] 55529
[graph_dijkstra] 75582433
[graph_floyd_warshall] 151547499
[fibonacci_recursive] 52043659
[fibonacci_loop] 1896
[two_dimension_array] 72959
[matrix_mul] 370099
[matrix_add] 53497
[graph_dijkstra] 75628374
[graph_floyd_warshall] 151557803
[fibonacci_recursive] 52043419
[fibonacci_loop] 1780
[two_dimension_array] 72684
[matrix_mul] 373757
[matrix_add] 57870
[graph_dijkstra] 79321067
[graph_floyd_warshall] 151603631
[AVERAGE] 1.06947e+06
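
To compare your own run against the numbers above, one option is to group the cycle counts by benchmark name before looking at them. The sketch below assumes the `[name] value` line format shown in the listing; the function name is hypothetical, and the sample is only a few lines copied from the listing, so its `[AVERAGE]` line does not correspond to that subset.

```python
import re
from collections import defaultdict
from statistics import median

# One result per line: "[benchmark_name] cycles"
LINE = re.compile(r"^\[(\w+)\]\s+(\S+)$")

def parse_results(text):
    """Group cycle counts by benchmark; the [AVERAGE] line is returned separately."""
    runs = defaultdict(list)
    average = None
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a result line
        name, value = m.group(1), float(m.group(2))
        if name == "AVERAGE":
            average = value
        else:
            runs[name].append(value)
    return dict(runs), average

# A few lines copied from the listing above (not a complete run).
sample = """\
[fibonacci_loop] 1640
[matrix_add] 53248
[fibonacci_loop] 1787
[matrix_add] 57362
[AVERAGE] 1.06947e+06
"""

runs, average = parse_results(sample)
for name, values in runs.items():
    print(name, median(values))
```

Using the median per benchmark is just one choice for damping run-to-run noise; it is not how the grader computes its score.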

lomotos10 · 2020-07-07T06:03:28Z

@cmpark0126 I am currently having trouble understanding the la instruction.
Does la return the address of the label, or the data inside that address?

cmpark0126 · 2020-07-07T06:25:07Z

@lomotos10 la gives you the address of the label, not the data stored at that address.

jesper-amilon · 2020-07-07T09:25:45Z

Should we use the la instruction only for the nonce object, or for all global variables?

cmpark0126 · 2020-07-07T12:02:47Z

@christofides You need to use la instruction for all global variables.

jesper-amilon · 2020-07-07T14:43:20Z

So if la loads the address, do we also need to add lw to actually load the value of the variable? I.e.:

la     a5, nonce
lw     a5,  a5

Edit: Another question: can la also be used to get the address of floating-point variables? (I assume this is the case but want to make sure.)

cmpark0126 · 2020-07-07T15:20:42Z

@christofides

Yes, you need to add a load instruction to get the value of the global variable, as shown below:
```
la     a5, nonce
lw     a5,  0(a5)
```
Yes, you can use the la instruction for a global variable whose type is floating-point.
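
To make the pattern concrete, here is a minimal sketch of an asmgen helper that emits the la-then-load sequence for any global. The helper name, its parameters, and the symbol `pi` are hypothetical and not part of kecc. For floating-point globals the address still goes into an integer register, and the value is read with flw or fld.

```python
# Load mnemonics accepted by this sketch (integer and floating-point loads).
LOAD_OPS = {"lb", "lbu", "lh", "lhu", "lw", "lwu", "ld", "flw", "fld"}

def load_global(dest, symbol, addr_reg="t0", op="lw"):
    """Return assembly lines that load the value of a global into `dest`.

    `la` puts the address of `symbol` into the integer register `addr_reg`;
    the following load reads the value at offset 0 from that address.
    """
    if op not in LOAD_OPS:
        raise ValueError(f"unsupported load instruction: {op}")
    return [
        f"{'la':<8}{addr_reg},{symbol}",
        f"{op:<8}{dest},0({addr_reg})",
    ]

# Integer global: the address register can be reused as the destination.
print("\n".join(load_global("a5", "nonce", addr_reg="a5")))

# Double-precision global (hypothetical symbol `pi`): the address goes into
# an integer register, the value into a floating-point register.
print("\n".join(load_global("fa0", "pi", addr_reg="t0", op="fld")))
```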

jeehoonkang added the assignment label Jun 1, 2020

jeehoonkang mentioned this issue Jun 1, 2020

[Homework 7] Assembly generation (due: 8th of July) #160

Closed

jeehoonkang mentioned this issue Jun 9, 2020

[Session 21] June, 9th #194

Closed

jeehoonkang self-assigned this Jun 10, 2020

jeehoonkang closed this as completed Jul 8, 2020

This was referenced Jun 3, 2022

[HW8] Allow to using @PLT reallocation flag #436

Closed

[Final Project] Performance Competition (due: 6/20) #433

Closed

jirheee added the homework - performance competition src label Dec 25, 2022

jirheee mentioned this issue Mar 10, 2023

Tips on homework assignments #467

Closed

AnHaechan mentioned this issue Mar 22, 2023

[Final Project] Performance Competition (due: 6/19) #475

Closed

This was referenced Feb 26, 2025

[Final Homework] Performance Competition (due: 6/18) #550

Open

[Homework 7] Assembly Generation (due: 6/18) #549

Open
