Vivado HLS vs HercuLeS

I have spent the last couple of days running head-to-head comparisons of Xilinx Vivado HLS against HercuLeS on digital circuits generated by HLS from input C code.

I believe that HercuLeS lived up to the challenge; it is competitive with Vivado HLS. The reader should take into account that:

  1. Both tools were used (almost) out-of-the-box. Vivado HLS was configured with no bufg inclusion and in “out_of_context” mode. This means that neither clock buffers nor I/O pins were routed.
  2. HercuLeS does not (yet) customize the generated HDL to better fit specific architectural features (DSP blocks, embedded SRL units).
  3. Vivado HLS had some TOTAL FAILURES on relatively simple codes such as a perfect number detector (positive integers equal to the sum of their proper divisors), a 1D wavelet code, and an Easter date calculation. Vivado HLS seems to have a hard time with integer modulo/remainder. The codes are available to anyone interested.
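To give a feel for what a modulo-heavy integer kernel looks like, here is a minimal sketch of an Easter date calculation using the well-known anonymous Gregorian (Meeus/Jones/Butcher) algorithm. It is an illustration only, not the benchmark used in this comparison; the function name `easter_date` and its interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical example, NOT the benchmark from the comparison.
 * Computes the month (3 = March, 4 = April) and day of Easter Sunday
 * for a Gregorian calendar year using only integer division and
 * remainder operations.
 */
void easter_date(unsigned int year, unsigned int *month, unsigned int *day)
{
    unsigned int a, b, c, d, e, f, g, h, i, k, l, m, t;

    assert(month != NULL && day != NULL);

    a = year % 19;                        /* position in the 19-year lunar cycle */
    b = year / 100;                       /* century */
    c = year % 100;                       /* year within the century */
    d = b / 4;
    e = b % 4;
    f = (b + 8) / 25;
    g = (b - f + 1) / 3;
    h = (19 * a + b - d - g + 15) % 30;
    i = c / 4;
    k = c % 4;
    l = (32 + 2 * e + 2 * i - h - k) % 7;
    m = (a + 11 * h + 22 * l) / 451;
    t = h + l + 114 - 7 * m;              /* kept non-negative for unsigned math */

    *month = t / 31;
    *day   = (t % 31) + 1;
}
```

Calling `easter_date(2013, &month, &day)` yields month 3 and day 31, that is, 31 March 2013. Nearly every statement is a division or a remainder by a constant, which is the kind of operation mix that appeared to be problematic.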

The following table provides a summary of the results:

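 # | Benchmark  | Description                 | VHLS LUTs | VHLS Regs | VHLS TET (ns) | HercuLeS LUTs | HercuLeS Regs | HercuLeS TET (ns) | Comment
---|------------|-----------------------------|-----------|-----------|---------------|---------------|---------------|-------------------|----------------
 1 | arraysum   | Array sum                   |       102 |       132 |          26.5 |           103 |            63 |              73.3 |
 2 | bitrev     | Bit reversal                |        67 |        39 |          72.0 |            42 |            40 |              11.6 |
 3 | edgedet    | Edge detection              |       246 |       130 |        1636.3 |           680 |           361 |            1606.4 | 1 BRAM for VHLS
 4 | fibo       | Fibonacci series            |       138 |       131 |          60.2 |           137 |           197 |             102.7 |
 5 | fir        | FIR filter                  |       102 |        52 |         833.4 |           217 |           140 |            2729.4 |
 6 | gcd        | Greatest common divisor     |       210 |        98 |          35.2 |           128 |            93 |              75.9 |
 7 | icbrt      | Cube root approximation     |       239 |       207 |         260.6 |           365 |           201 |             400.5 |
 8 | popcount   | Population count            |        45 |        65 |          19.4 |            53 |           102 |              26.1 |
 9 | sieve      | Prime sieve of Eratosthenes |       525 |       595 |        6108.4 |           565 |           523 |            3869.5 | 1 BRAM for VHLS
10 | sierpinski | Sierpinski triangle         |        88 |       163 |       11326.5 |           230 |           200 |           16224.9 |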

NOTES:

  • Measurements were obtained for the KC705 development board device: xc7k325t-ffg900-2
  • TET is Total Execution Time in ns.
  • VHLS is a shortened form for Vivado HLS.
  • Vivado HLS 2013.1 was used.
  • Bold denotes smaller area and lower execution time.
  • Italic denotes an inconclusive comparison.
  • For the cases of edgedet and sieve, VHLS identifies a BRAM; HercuLeS does not. In these cases, HercuLeS saves a BRAM while VHLS saves on LUTs and FFs (Registers).

Overall, HercuLeS wins about 30% of the comparisons and Vivado HLS about 70%. Not too bad for a tool like HercuLeS, which produces generic, portable, vendor-independent code. I estimate that the HercuLeS development effort is around 1-5% of that of Vivado HLS.
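The win percentages can be cross-checked against the table with a short tally. The sketch below assumes that each LUT, register and TET figure counts as a separate contest in which the lower value wins, and it ignores BRAM usage. The post does not state how wins were actually counted, so this rule, the `struct result` layout and the `count_wins` function are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One table row: {LUTs, Regs, TET in ns} for each tool. */
struct result {
    const char *name;
    double vhls[3];
    double hercules[3];
};

/* Figures copied from the results table. */
static const struct result results[] = {
    { "arraysum",   { 102, 132,    26.5 }, { 103,  63,    73.3 } },
    { "bitrev",     {  67,  39,    72.0 }, {  42,  40,    11.6 } },
    { "edgedet",    { 246, 130,  1636.3 }, { 680, 361,  1606.4 } },
    { "fibo",       { 138, 131,    60.2 }, { 137, 197,   102.7 } },
    { "fir",        { 102,  52,   833.4 }, { 217, 140,  2729.4 } },
    { "gcd",        { 210,  98,    35.2 }, { 128,  93,    75.9 } },
    { "icbrt",      { 239, 207,   260.6 }, { 365, 201,   400.5 } },
    { "popcount",   {  45,  65,    19.4 }, {  53, 102,    26.1 } },
    { "sieve",      { 525, 595,  6108.4 }, { 565, 523,  3869.5 } },
    { "sierpinski", {  88, 163, 11326.5 }, { 230, 200, 16224.9 } }
};

/* Assumed rule: per metric, the strictly lower value wins. */
void count_wins(unsigned int *vhls_wins, unsigned int *hercules_wins,
                unsigned int *ties)
{
    size_t n = sizeof results / sizeof results[0];
    size_t b, m;

    assert(vhls_wins != NULL && hercules_wins != NULL && ties != NULL);
    *vhls_wins = *hercules_wins = *ties = 0;

    for (b = 0; b < n; b++) {
        for (m = 0; m < 3; m++) {
            if (results[b].vhls[m] < results[b].hercules[m]) {
                (*vhls_wins)++;
            } else if (results[b].hercules[m] < results[b].vhls[m]) {
                (*hercules_wins)++;
            } else {
                (*ties)++;
            }
        }
    }
}
```

Under this counting rule, HercuLeS takes 10 of the 30 contests and Vivado HLS takes 20, with no ties, which is in line with the rough 30%/70% split quoted above.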

I believe that, in the near future, HercuLeS will do much better in the out-of-the-box experience, which is of high importance for drawing more software-minded engineers into the game.

Both HercuLeS and Vivado HLS have optimization features (e.g. loop unrolling). HercuLeS applies optimizations through a source-to-source C code optimizer, whereas Vivado HLS mostly relies on end-user directives. These coding aspects will be taken into account in a follow-up comparison; they also open up a much larger solution space.
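As a rough illustration of what loop unrolling means at the C source level, the sketch below shows a plain array sum next to a hand-unrolled version with an unroll factor of 2. This is a hypothetical example of the kind of rewrite a source-to-source optimizer might perform; it is not actual HercuLeS output, and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: one addition per loop iteration. */
unsigned int array_sum(const unsigned int *a, size_t n)
{
    unsigned int sum = 0;
    size_t i;

    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}

/*
 * Unrolled by a factor of 2: two independent partial sums per
 * iteration, which an HLS tool may schedule in parallel.
 * With directive-driven tools, a similar effect is usually requested
 * by annotating the original loop (in Vivado HLS, for instance,
 * "#pragma HLS unroll factor=2") rather than by rewriting the source.
 */
unsigned int array_sum_unrolled2(const unsigned int *a, size_t n)
{
    unsigned int sum0 = 0, sum1 = 0;
    size_t i;

    assert(a != NULL || n == 0);
    for (i = 0; i + 1 < n; i += 2) {
        sum0 += a[i];
        sum1 += a[i + 1];
    }
    if (i < n) {            /* leftover element when n is odd */
        sum0 += a[i];
    }
    return sum0 + sum1;
}
```

Both functions return the same result for any input, since unsigned addition is associative and commutative. The unrolled form trades extra adders for fewer loop iterations, which is one axis of the larger solution space mentioned above.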

 

3 thoughts on “Vivado HLS vs HercuLeS”

  1. Josh Monson

    A nice, simple evaluation of these HLS tools. I am a PhD student currently studying debug for HLS. Thus, the Vivado HLS failures you reported are interesting to me. At what point in the flow did Vivado HLS fail? If you still have it, I am interested in looking at the source for these benchmarks. Thanks.

  2. j9395nkavv Post author

    Dear Mr. Monson,

    thank you for your comment. I dug around and found one of the benchmarks.

    Even with Vivado HLS 2013.2 (the latest version that I have), SystemC-VHDL cosimulation does not work.

    Please find the benchmark’s code below; use -DTEST to compile:


    /*
     * Filename: perfect.c
     * Purpose : C implementation of a naive algorithm for detecting perfect
     *           numbers. A perfect number is a positive integer equal to the
     *           sum of its proper divisors. The first members of this
     *           sequence are: 6, 28, 496, 8128.
     * Author  : Nikolaos Kavvadias (C) 2010, 2011, 2012, 2013, 2014
     * Date    : 17-Apr-2010
     * Revision: 0.3.0 (17/04/10)
     *           Initial version.
     */

    #ifdef TEST
    #include <stdio.h>
    #include <stdlib.h>
    #endif

    void perfect(unsigned int value, unsigned int *isperfect)
    {
      /* 1 divides every value, so the divisor sum starts at 1. */
      unsigned int factorsum = 1, i;

      /* No proper divisor of value can exceed value/2. */
      for (i = 2; i <= value/2; i++) {
        if (value % i == 0) {
          factorsum += i;
        }
      }

      if (factorsum == value) {
        *isperfect = 1;
      } else {
        *isperfect = 0;
      }
    }

    #ifdef TEST
    int main(void)
    {
      FILE *fp;
      unsigned int i;
      unsigned int result;

      /* Write a few selected results to out.dat. */
      fp = fopen("out.dat", "w");
      for (i = 2; i <= 65535; i++) {
        perfect(i, &result);
        if (i <= 6 || i == 28 || i == 496 || i == 8128) {
          fprintf(fp, "%08x %08x\n", i, result);
        }
      }
      fclose(fp);

      /* Compare against the golden reference data. */
      printf("Comparing against reference data \n");
      if (system("diff -w out.dat perfect_test_data.txt")) {
        fprintf(stdout, "*******************************************\n");
        fprintf(stdout, "FAIL: Output DOES NOT match the golden output\n");
        fprintf(stdout, "*******************************************\n");
        return 1;
      } else {
        fprintf(stdout, "*******************************************\n");
        fprintf(stdout, "PASS: The output matches the golden output!\n");
        fprintf(stdout, "*******************************************\n");
        return 0;
      }
    }
    #endif

  3. Pingback: Vivado HLS vs HercuLeS (Kintex-7 and VDS 2013.2 update) | EDA stuff
