#### Final Year Project

High-level Synthesis Design Space Exploration

> Supervisor: Dr. Benjamin Carrion Schafer Student: LU Shanqi 13116151D



## Content

- Part1: Introduction
  - High-level Synthesis
- Part2: Methodology
  - Tool CyberWorkBench
  - Algorithm
- Part3: Graphical User Interface
  - Functions
  - Plotting widget
- Part4: Results
  - Comparison between brute force and simulated annealing



## High-level Synthesis

- A process of converting behavior descriptions to hardware implementation
- From SystemC to hardware
- Tools: CyberWorkBench, Xilinx Vivado....



## Higher Abstract Level

- Advantages
  - shorter marketing cycle
  - Increasing reusability of programming codes
- Disadvantages
  - Numerous synthesis options are available, hence hard to find optimal designs



## **Design Space Exploration**

- An exploration process aiming to find optimal designs among uncountable candidates in high level synthesis
- Multi-objective optimization problem
- One design pareto dominate another.
- One design is not inferior to another design in all objectives, additionally, there is at least one better objective.





## Objectives of this project

- To automate the process of design space exploration
- To develop a heuristic to accelerate design space exploration
- To develop a graphical user interface(GUI) to plot results dynamically



# Tool: CyberWorkBench

- Parse description language(e.g. SystemC)
   Special pragmas can be recognized by parser
- Generate constraint files(e.g. Functional Units constraint file)
- Synthesize
  - Results are store in a \*.CSV file



## Tool: CyberWorkBench

- Pragma insertion
- Key word: Cyber



 Change high-level synthesis options(also called attributes) by inserting such kind of pragmas



#### Tool: CyberWorkBench





## Problem clarification

• Treat high-level synthesis process as a black box function

$$(Area, latency) = f(x1, x2, x3....)$$

- Multiple input variables: x1,x2,x3.....
- Two objectives: area, latency
- Find more pareto optimal designs in shorter time



#### **Problem clarification**







# Algorithms

- Brute Force(exhaustion method)
  - a generate-and-test algorithm to check all possible candidates that satisfy specification of a problem
- Simulated Annealing
  - probabilistic heuristic
  - other heuristics(e.g. genetic algorithm)



• Step1: Generate a initial design randomly as the base design and synthesize it. Synthesized result is used to calculate first GCF as base state of system. Set an initial temperature for the system.

design1

array=RAM

loop=unroll all

function=inline

• Global cost function is defined as:

$$GCF = \partial A + \beta L$$



• Step 2: Generate a new design from base design by randomly modifying one attribute's value.



• Step3: Compare new GCF and previous GCF. Then, determine whether to accept the new design. Probability to accept a worse design:

$$P = e^{-\frac{\Delta GCF}{k}}$$



 Step4: If 5 better designs are consecutively generated, reduce current temperature by 10%. Change parameters in GCF for every 8 designs.

• If no exit condition is met, iterate from step2 to step 4. Exit iteration if one exit condition is met.



- Exit conditions:
  - i. Current temperature is less than threshold.
  - ii. Consecutively, more than 5 new designs are worse than previous design.
  - iii. Cannot generate new designs by changing one of the attributes.
  - iv. Synthesized designs are more than 70% of all designs.



• Why SA?

(Area, latency) = f(x1, x2, x3....)

• Some input sequence might lead to a good result. e.g.

array=RAM loop=unroll all small area and small latency
 Change one attribute at a time while maintaining other attribute combinations



# Graphical User Interface for Design Space Exploration



# MainWindow C File Library New file BLIB library FLIB library





## Qt Framework

- A cross-platform framework for developing applications
- Signals and Slots: Communication mechanism between different parts of the program
- QCustomPlot: An online open source widget for plotting
- Multithread programming: To prevent GUI freezing
- Model/View Programming: To modify data outside current program(used in file list)



- ComboBox
- Selecting files or options









- Selecting other
   benchmarks
- Selecting other libraries
- Filtering out files with wrong extensions



|                     | C Fil     | P         | Library       |               |           |        |          |           |           |
|---------------------|-----------|-----------|---------------|---------------|-----------|--------|----------|-----------|-----------|
|                     | CTH       |           | Library       |               |           |        |          |           |           |
| < C                 | New       | file B.I  | B library     | FLIB library  |           |        |          |           |           |
|                     | fir.cpp   | ~         | ASIC          | *             | Auto attr | Delete | All      | simul     | lated ann |
|                     | Nar       | me        | Size          | Туре          | Date      | 1      |          |           |           |
|                     | ۱ E       | works     |               | Folder        | 13/       | 4.8 -  |          |           |           |
|                     |           | attr_lib  | 270 by        | tes File      | 10/       | 1      |          |           |           |
|                     |           | dafina h  | 1             | KB h Eila     | 9/4       |        |          |           |           |
| New file BLIB       | library   | FLIB libr | ary           |               |           |        |          |           |           |
|                     |           |           | •             |               |           | _      | _        | _         | _         |
| r.cpp 👻             | ASIC      | <u> </u>  | Choose a file | e             |           |        |          | _         |           |
| Name C              | 170       | Look in   | C /bar        | no/clu/bonchm | orko/fir  |        | - 0      | 00        | a i       |
| Name S              | ize       | LOOK IN:  | inor          | ne/siu/benchm | arks/III  |        | • •      |           |           |
| attr lib            | 27        | 🖪 Comp    | uter Na       | me            |           | ▼ Size | Туре     | Date Mod  | lified    |
| define.h            | ~ '       | slu       | e wo          | orkspace      |           | 1 100  | Folder   | 13/4/151  | :08 AM    |
| fir_coe             | 2         | -         | fir           | _gen.cpp      |           | 1 KB   | cpp File | 12/4/15 8 | 3:26 PM   |
| fir_gen             |           |           | - m           | ain.opp       |           | 2 KB   | con File | 9/4/15 7: | 30 PM     |
| fir_in              | 6         |           | tb            | fir.cpp       |           | 4 KB   | cpp File | 9/4/15 7: | 30 PM     |
| fir_out             |           |           |               |               |           |        |          |           |           |
| fir.cpp             |           |           |               |               |           |        |          |           |           |
| fir.cpp~            |           |           |               |               |           |        |          |           |           |
| fir.h               |           |           |               |               |           |        |          |           |           |
| main.cpp            |           |           |               |               |           |        |          |           |           |
| Makefile            |           |           |               |               |           |        |          |           |           |
| ome/slu/henchma     | rks/fir/  |           |               |               |           |        |          |           |           |
| one, star benefitie |           |           |               |               |           |        |          |           |           |
|                     | _         |           |               |               |           |        |          |           |           |
| ogress              | F         |           |               |               |           |        |          |           |           |
| Run                 | Stop      |           |               |               |           |        |          |           | -         |
| home/slu/benchm     | arke/fir  | File name | fir.cpp       |               |           |        |          |           | Open      |
| ionie/stu/benchin   | 011/2/18) |           | E             | 14            |           |        |          |           | Connel    |

- Automatic attribute insertion Auto attr
- To examine input SystemC file and insert pragma automatically based on syntax.

sc\_uint<8> in\_data\_read[9]/\*attr1\*/; sc\_uint<16> coeff\_read[9]/\*attr2\*/;



- To show selection information such as which technology library has been selected.
- To Show design information dynamically while running, like that total number of designs, current number of designs and synthesis results.

| Run                                                                                                                                 | Stop | Optimize | Show all                                                                                   | Run                                                                | Stop | Optimize | Show all |  |
|-------------------------------------------------------------------------------------------------------------------------------------|------|----------|--------------------------------------------------------------------------------------------|--------------------------------------------------------------------|------|----------|----------|--|
| /home/slu/benchmarks/fir/fir.cpp<br>/eda/bin/cwb/cyber/lib/asic_45.FLIB<br>/eda/bin/cwb/cyber/lib/asic_45.BLIB<br>Method not chosen |      |          | run high-level synthesis<br>Start design space exploration<br>Total number of designs: 128 |                                                                    |      |          |          |  |
| Method: simulated annealing file information                                                                                        |      |          |                                                                                            | new design 1/128<br>Area is : 7119 Latency is 23 Throughput is 736 |      |          |          |  |
|                                                                                                                                     |      |          |                                                                                            | new design 2/128<br>Area is : 4726 Latency is 29 Throughput is 696 |      |          | 696      |  |
| Design information Clean                                                                                                            |      |          | Design information Clean                                                                   |                                                                    |      |          |          |  |

design information



- Embedded editor
- To edit synthesis command
- Standard shortcuts for editor like copy, cut and paste can work

| Show command before running Update command Through                             |                                                     |                   |  |  |  |  |
|--------------------------------------------------------------------------------|-----------------------------------------------------|-------------------|--|--|--|--|
| bdltran -c1000 -s -lfl /eda/bin/cwb/<br>asic_45.BLIB fir.IFF -Zflib_fcnt_out - | cyber/lib/asic_45.FLIB -lb /eda/b<br>Zmlib_mcnt_out | in/cwb/cyber/lib/ |  |  |  |  |
|                                                                                |                                                     |                   |  |  |  |  |



#### Plotting widget





HE HONG KONG

香港理工大學

**POLYTECHNIC UNIVERSITY** 

# Plotting widget

• Zoom out and zoom in using mouse wheel





#### Plotting widget functions

- Interactivity
- Mouse clicking



#### Plotting widget functions

• File list



#### Timer

• Elapsed time

Elapsed time:

- 25: 00: 00
- Remaining time

Remainning Time: Estimating...

Remainning Time: 00:00:00



## Progress Bar

- To show progress
- Relatively accurate for brute force
- Not useful if simulated annealing method is used





# **Conflict handling**

- Conflicts between widgets in program sometimes appear and render program's crashing
- Mechanisms to check conflicts and pop up warning messages





# Change Coordinate

• Change to area versus throughput

 $Throughput = Output \_port\_number \times \frac{1}{CP\_delay \times Latency}$ 





#### Show optimal designs

Click button Optimize Show all





#### Synthesis Results

• Three benchmarks

| Bench         | Туре | #lines | Explorable operations          | Brute | SA  |
|---------------|------|--------|--------------------------------|-------|-----|
| fir           | С    | 86     | array(2), loop(2), function(1) | 340s  | 55s |
| qsort         | С    | 119    | array(1), loop(2),function(3)  | 843s  | 64s |
| adpcm_encoder | С    | 179    | array(1),for(1),function(2)    | 130s  | 54s |



#### **Brute Force Results**





#### adpcm\_encoder



#### Simulated Annealing Results







### Comparison

- Time Comparison
- Averagely, running time of simulated annealing algorithm for these three benchmarks is 21.8 percent of brute force algorithm, which means SA algorithm's speed is 4.59 times of BF algorithm's speed.

| Bench          | Brute force | Simulated annealing | SA versus BF |
|----------------|-------------|---------------------|--------------|
| fir            | 340s        | 55s                 | 16.2%        |
| qsort          | 843s        | 64s                 | 7.6%         |
| adpcm_enconder | 130s        | 54s                 | 41.5%        |

$$average = \frac{16.2 + 7.6 + 41.5}{3} = 21.8\%$$





## Comparison

- Qualitative comparison
- In these experiments, brute force method has gone through all designs. Therefore, pareto dominance for brute force is 100%
- It has been found that simulated annealing could find 66% pareto dominated points.

| Bench          | Brute force | Simulated annealing | SA versus BF |
|----------------|-------------|---------------------|--------------|
| fir            | 5           | 2                   | 40%          |
| qsort          | 6           | 5                   | 83%          |
| adpcm_enconder | 4           | 3                   | 75%          |



average = 
$$\frac{40 + 83 + 75}{3} = 66\%$$
Part4: Results

## Conclusion

#### Achievements

- This project has completed the main goals. It has developed one heuristic (simulated annealing) for design space exploration that achieved around four times faster speed than brute force method.
- A graphical user interface that could display synthesis results dynamically was developed.

#### Limitations and future works

- Other heuristics can be developed and compared to simulated annealing
- Number of tested benchmarks is not enough
- Automatic attribute insertion could only support special language formats





