
7.5.  Summary of Performance Gains Through Optimization

The examples in this chapter can be distilled into some simple guidelines for obtaining the fastest possible models:

  1. Build new code so it does not generate Verilator warnings.

  2. Most warnings can be ignored in known good legacy code. However, UNOPTFLAT (and UNOPT, which was not encountered here) should be addressed, since fixing them leads to performance gains.

  3. Use -O1 or -Os for simple C++ optimization, or where build time is onerous. For maximum speed, use -O3 with profiling.

  4. Profile the generated model using gprof to identify any performance bottlenecks in the Verilog.

There is a trade-off between the increased time taken to create the model and the reduced execution time of the resulting model. Key data points from the various optimization steps are summarized in Table 7.6.

Run Description                      Build Time    Run Time  Performance
Baseline event driven simulation         1.78 s    796.84 s     1.48 kHz
Optimized event driven simulation        1.78 s    803.39 s     1.47 kHz
Baseline Verilator model                13.94 s     27.67 s    42.66 kHz
Verilator with all language fixes       13.91 s     24.85 s    47.49 kHz
Verilator g++ -Os                       26.23 s     12.24 s    96.41 kHz
Verilator g++ -O3 and profiling         85.87 s      9.13 s   129.28 kHz

Table 7.6.  Summary of ORPSoC model performance with various optimizations.
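
The run times in Table 7.6 can be turned into speedup factors relative to the baseline event driven simulation. The following is a minimal Python sketch of that arithmetic. The run times are copied from the table; the helper name `speedups` is hypothetical and used only for this illustration.

```python
# Run times in seconds, copied from Table 7.6.
run_times = {
    "Baseline event driven simulation": 796.84,
    "Optimized event driven simulation": 803.39,
    "Baseline Verilator model": 27.67,
    "Verilator with all language fixes": 24.85,
    "Verilator g++ -Os": 12.24,
    "Verilator g++ -O3 and profiling": 9.13,
}


def speedups(times, baseline_key):
    """Return each run's speedup factor relative to the named baseline run.

    Hypothetical helper for illustration: a factor above 1 means the run
    is faster than the baseline, below 1 means it is slower.
    """
    baseline = times[baseline_key]
    return {name: baseline / t for name, t in times.items()}


if __name__ == "__main__":
    factors = speedups(run_times, "Baseline event driven simulation")
    for name, factor in factors.items():
        print(f"{name:35s} {factor:7.2f}x")
```

On these figures, the largest single step comes from moving from event driven simulation to the Verilator model, with the g++ optimization levels roughly tripling that gain again.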


These results are shown graphically in Figure 7.2.


Figure 7.2.  Summary of model build and run times for ORPSoC
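
The trade-off between build time and run time can be made concrete with a break-even calculation. The sketch below uses the -Os and -O3 figures from Table 7.6 and assumes, purely for illustration, that the model is built once and then reused for repeated runs of the same length as the benchmark. The helper names `total_time` and `break_even_runs` are hypothetical.

```python
import math

# (build time, run time) in seconds, copied from Table 7.6.
OS_BUILD = (26.23, 12.24)   # Verilator g++ -Os
O3_BUILD = (85.87, 9.13)    # Verilator g++ -O3 and profiling


def total_time(build, run, n_runs=1):
    """Total wall-clock time for one build followed by n_runs runs.

    Assumption: the model is built once and every run takes the same time.
    """
    return build + n_runs * run


def break_even_runs(cheap, costly):
    """Smallest whole number of runs at which the costlier build wins overall.

    Returns None if the costlier build does not run any faster.
    """
    extra_build = costly[0] - cheap[0]
    saved_per_run = cheap[1] - costly[1]
    if saved_per_run <= 0:
        return None
    # Strictly lower total time requires n > extra_build / saved_per_run.
    return max(1, math.floor(extra_build / saved_per_run) + 1)


if __name__ == "__main__":
    for n in (1, 10, 20):
        print(n, round(total_time(*OS_BUILD, n), 2), round(total_time(*O3_BUILD, n), 2))
    print("break-even runs:", break_even_runs(OS_BUILD, O3_BUILD))
```

Under these assumptions, a single run finishes sooner overall with -Os, and the extra build time of -O3 with profiling is only repaid once the model is reused for a number of runs (or for a correspondingly longer simulation).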

