As a follow-up to a previous blog post on the out-of-the-box Vivado HLS vs HercuLeS comparison, the following table provides updated information on the performance of HercuLeS against Vivado HLS 2013.2 on Virtex-6 and Kintex-7 (XC7K70TFBG676-2) FPGA devices.
Better results (lower execution time; smaller area) have been typeset in bold. HercuLeS clearly outperforms Vivado HLS in key benchmarks such as filtering and numerical processing. As is often the case, better speed can be traded off for lower area. With 12 partial wins each, one could call this a tie.
| # | Benchmark | Description | VHLS LUTs | VHLS Regs | VHLS TET (ns) | HercuLeS LUTs | HercuLeS Regs | HercuLeS TET (ns) | Device |
|---|---|---|---|---|---|---|---|---|---|
| 1 | bitrev | Bit reversal | 67 | 39 | 72.0 | 42 | 40 | 11.6 | Virtex-6 |
| 2 | divider | Radix-2 division | 218 | 226 | 63.6 | 318 | 332 | 30.6 | Kintex-7 |
| 3 | edgedet | Edge detection | 246 | 130 | 1636.3 | 680 | 361 | 1606.4 | Virtex-6; 1 BRAM for VHLS |
| 4 | fibo | Fibonacci series | 138 | 131 | 60.2 | 137 | 197 | 102.7 | Virtex-6 |
| 5 | fir | FIR filter | 89 | 114 | 1027.1 | 606 | 540 | 393.8 | Kintex-7 |
| 6 | gcd | Greatest common divisor | 210 | 98 | 35.2 | 128 | 93 | 75.9 | Virtex-6 |
| 7 | icbrt | Cubic root approximation | 239 | 207 | 260.6 | 365 | 201 | 400.5 | Virtex-6 |
| 8 | sieve | Prime sieve of Eratosthenes | 525 | 595 | 6108.4 | 565 | 523 | 3869.5 | Virtex-6; 1 BRAM for VHLS |
NOTES:
- TET is Total Execution Time in ns.
- VHLS is short for Vivado HLS.
- Vivado HLS 2013.2 was used.
- Bold denotes smaller area and lower execution time.
- Italic denotes an inconclusive comparison.
- For the cases of edgedet and sieve, VHLS identifies a BRAM; HercuLeS does not. In these cases, HercuLeS saves a BRAM while VHLS saves on LUTs and FFs (Registers).
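To make the execution-time comparison easier to read, the TET columns can be turned into per-benchmark ratios. The following is a minimal sketch in Python, not part of the original measurements: the TET values are copied from the table above, while the names `tet_ns`, `tet_ratio`, `ratios`, and `hercules_faster` are hypothetical and introduced only for illustration. A ratio above 1 means HercuLeS has the lower total execution time for that benchmark.

```python
# (benchmark, VHLS TET in ns, HercuLeS TET in ns), copied from the table above.
tet_ns = [
    ("bitrev", 72.0, 11.6),
    ("divider", 63.6, 30.6),
    ("edgedet", 1636.3, 1606.4),
    ("fibo", 60.2, 102.7),
    ("fir", 1027.1, 393.8),
    ("gcd", 35.2, 75.9),
    ("icbrt", 260.6, 400.5),
    ("sieve", 6108.4, 3869.5),
]


def tet_ratio(vhls_ns, hercules_ns):
    """Return VHLS TET divided by HercuLeS TET (hypothetical helper).

    A value above 1 means HercuLeS finishes sooner; below 1 means VHLS does.
    """
    return vhls_ns / hercules_ns


# Ratio per benchmark, in table order.
ratios = {name: tet_ratio(vhls, herc) for name, vhls, herc in tet_ns}

# Benchmarks where HercuLeS has the lower execution time.
hercules_faster = [name for name, ratio in ratios.items() if ratio > 1.0]

for name, ratio in ratios.items():
    print(f"{name:8s} VHLS/HercuLeS TET ratio: {ratio:5.2f}")
print("HercuLeS faster on:", ", ".join(hercules_faster))
```

This looks at execution time only; the LUT and register columns would need the same treatment to reflect the area side of the trade-off.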