Beta Version 0.5 and our presentation at HPCAC’17 @ Stanford

We’re pleased to release beta version 0.5 of DatArcs Optimizer, which now supports static tuning mode (more information about the changes is available in the Change Log). Earlier this week we shared some results using the new version at the HPC Advisory Council Stanford Conference.
The original slides are available for download from the Conference website. The full video of the presentation is available below:

In the presentation we detailed our experience with the Phoronix Apache Web Server test running on a Packet type-2 server. The performance in the different phases is shown in the graph below, where the horizontal axis is the iteration number and the vertical axis is the normalized performance (the number of requests per second served by the web server, normalized to the baseline):

We ran the benchmark 200 times as follows:

  1. 20 iterations in baseline – We simply ran the benchmark without Optimizer in the background. The results of this phase were used to normalize the graph.
  2. 140 iterations in learning phase – We enabled Optimizer at the beginning of this phase with a clean database. The performance in the first 20 iterations (~50 minutes) of this phase was jittery because Optimizer was exploring various knob options. After around 20 iterations, the performance stabilized at around a 23% improvement over the baseline.
  3. 20 iterations in best phase – We switched Optimizer to “best” mode at the beginning of this phase, which suppresses further exploration and reduces some of the overhead required for continuous tuning. The performance improvement in this phase averaged 24% over the baseline.
  4. 20 iterations in static phase – We switched Optimizer to “static-best” mode at the beginning of this phase. This resulted in Optimizer applying the best settings it had found in the learning phase. After that, Optimizer exited and no longer consumed any CPU cycles or memory. Since dynamic tuning was turned off, the performance improvement dropped from 24% to 8.8%.
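
To make it easier to read the graph by iteration number, the phase layout above can be sketched in a few lines of Python. This is only an illustration: the function names and the phase labels are our own for this example, and the sketch says nothing about how datarcs-benchmark itself handles its 20/140/20/20 argument.

```python
from itertools import accumulate

# Phase labels used in this post; the names are chosen for this sketch only.
PHASE_NAMES = ["baseline", "learning", "best", "static"]


def phase_boundaries(spec):
    """Turn a spec such as "20/140/20/20" into (name, runs, last_iteration) tuples."""
    counts = [int(part) for part in spec.split("/")]
    ends = list(accumulate(counts))  # running totals: last iteration of each phase
    return list(zip(PHASE_NAMES, counts, ends))


def phase_of(iteration, spec):
    """Return the phase name for a 1-based iteration number."""
    if iteration < 1:
        raise ValueError("iteration numbers start at 1")
    for name, _runs, last_iteration in phase_boundaries(spec):
        if iteration <= last_iteration:
            return name
    raise ValueError("iteration is past the end of the run")


if __name__ == "__main__":
    for name, runs, last_iteration in phase_boundaries("20/140/20/20"):
        print(f"{name}: {runs} runs, ends at iteration {last_iteration}")
```

With this layout, iterations 21 to 40 are the jittery start of the learning phase, and the static phase covers iterations 181 to 200.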

The full output of the run is available below:

[root@pkt-type2 ~]# datarcs-benchmark 20/140/20/20 pts/apache
Set up benchmark suite ...
Start benchmark ...
Baseline Phase : [########################################] 100%
Setting mode to tune
Learning Phase : [########################################] 100%
Setting mode to best
Optimization Phase : [########################################] 100%
Setting mode to static-best
Static Phase : [########################################] 100%
Summary for benchmark pts/apache:
phase, runs, performance, improvement, relative_stdev
Baseline, 20, 27317, 0%, 2.141%
Learning, 140, 32752.9, 19.899%, 5.092%
Optimized, 20, 33940.3, 24.246%, 0.888%
Static, 20, 29730.5, 8.835%, 2.412%
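
The improvement column in the summary can be re-derived from the performance column as (performance / baseline - 1) * 100. The short Python sketch below does that for the four rows printed above; the parsing code and the function name are ours and are not part of DatArcs Optimizer.

```python
import csv
import io

# The summary rows exactly as printed by the run above.
SUMMARY = """\
phase, runs, performance, improvement, relative_stdev
Baseline, 20, 27317, 0%, 2.141%
Learning, 140, 32752.9, 19.899%, 5.092%
Optimized, 20, 33940.3, 24.246%, 0.888%
Static, 20, 29730.5, 8.835%, 2.412%
"""


def recompute_improvements(summary):
    """Return {phase: improvement in percent} relative to the first (baseline) row."""
    reader = csv.DictReader(io.StringIO(summary), skipinitialspace=True)
    rows = list(reader)
    baseline = float(rows[0]["performance"])
    return {
        row["phase"]: (float(row["performance"]) / baseline - 1.0) * 100.0
        for row in rows
    }


if __name__ == "__main__":
    for phase, improvement in recompute_improvements(SUMMARY).items():
        print(f"{phase}: {improvement:.3f}%")
```

Note that the learning-phase average includes the jittery first iterations, which is why it is lower than the level the performance stabilized at later in that phase.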

DatArcs to present at the next HPC Advisory Council Stanford Conference

Next week we’ll be presenting DatArcs at the HPC Advisory Council Stanford Conference and discussing the era of self-tuning servers. We believe that server tuning as we know it is about to change forever, and that DatArcs Optimizer is the tool that will help drive this trend.

We will also share some of the latest results of our upcoming beta 0.5 release.

Join us next week on Tuesday February 7, 2017 at 1:30pm: http://www.hpcadvisorycouncil.com/events/2017/stanford-workshop/agenda.php

Demo of DatArcs Optimizer beta version 0.3

This Wednesday we presented at the Downtown NYC Tech Meetup, which focused on how hardware still matters. During the presentation, we showed off DatArcs Optimizer for the first time and it felt great! You can see the video here:

How dynamic tuning can outperform manual tuning

We focused on a knob that toggles CPU cache prefetching and an example benchmark that had two phases: one that runs faster with the prefetchers, and one that runs faster without the prefetchers. During the demo on a Packet type-1 server, the benchmark took 55 seconds to complete with the prefetchers and 57 seconds without the prefetchers. We then tried running Optimizer in the background, and after 5 minutes of training, the runtime dropped to around 52 seconds. Optimizer correctly identified the two phases and applied the best setting for each phase. This demo showed how dynamic tuning can achieve better results than manual tuning.
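
The arithmetic behind this can be sketched as follows. Only the three totals (55, 57 and around 52 seconds) come from the demo; the per-phase runtimes in the sketch are hypothetical numbers picked so that they add up to those totals, and the phase names are ours. The sketch also ignores training time and any cost of switching the knob.

```python
# HYPOTHETICAL per-phase runtimes in seconds. Only the totals (55 s with the
# prefetchers, 57 s without, about 52 s with dynamic tuning) were measured in
# the demo; this split between the two phases is an assumption for illustration.
PHASE_RUNTIMES = {
    "phase_a": {"prefetch_on": 25.0, "prefetch_off": 30.0},  # faster with prefetchers
    "phase_b": {"prefetch_on": 30.0, "prefetch_off": 27.0},  # faster without them
}


def static_total(setting):
    """Total runtime when one setting is kept for the whole benchmark."""
    return sum(phase[setting] for phase in PHASE_RUNTIMES.values())


def dynamic_total():
    """Total runtime when the faster setting is chosen separately for each phase."""
    return sum(min(phase.values()) for phase in PHASE_RUNTIMES.values())


if __name__ == "__main__":
    print("always on :", static_total("prefetch_on"))
    print("always off:", static_total("prefetch_off"))
    print("per phase :", dynamic_total())
```

Whatever the real split is, a single static setting has to pay the slower runtime in one of the two phases, while per-phase tuning does not.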

Tuning Mellanox ConnectX-3 Pro cards

With the help of Mellanox we demonstrated how DatArcs Optimizer tunes a networking knob (tx-usecs) on Packet type-2 servers with Mellanox ConnectX-3 Pro network cards. We ran two benchmarks. The first ran 60% faster when tx-usecs was set to 512usecs instead of the default 16usecs. The second ran 3.3x slower when tx-usecs was changed from the default 16usecs to 512usecs. DatArcs Optimizer correctly identified which setting performed better for each application, and set tx-usecs to the appropriate value in each case.
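
As a rough illustration of the per-application choice, the sketch below picks the better tx-usecs value from relative speeds derived from the two figures above (1.6 for "60% faster", 1/3.3 for "3.3x slower", both relative to the default). It then builds, without running it, the ethtool coalescing command that one would typically use to apply such a value by hand. The benchmark labels, the function names and the interface name eth0 are assumptions for this example, and none of this describes how Optimizer works internally.

```python
# Speed relative to the default tx-usecs=16, derived from the figures quoted
# in this post. The benchmark labels are placeholders for this sketch.
RELATIVE_SPEED = {
    "benchmark_1": {16: 1.0, 512: 1.6},        # ran 60% faster at 512 usecs
    "benchmark_2": {16: 1.0, 512: 1.0 / 3.3},  # ran 3.3x slower at 512 usecs
}


def best_setting(speeds):
    """Return the tx-usecs value with the highest relative speed."""
    return max(speeds, key=speeds.get)


def coalesce_command(interface, tx_usecs):
    """Build (but do not run) an ethtool command that sets tx-usecs.

    The interface name is supplied by the caller; "eth0" below is an assumption.
    """
    return ["ethtool", "-C", interface, "tx-usecs", str(tx_usecs)]


if __name__ == "__main__":
    for name, speeds in RELATIVE_SPEED.items():
        value = best_setting(speeds)
        print(name, "->", " ".join(coalesce_command("eth0", value)))
```

Neither value is best for both benchmarks, which is the reason a single hand-picked default cannot serve both applications.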

Other results 

We also presented some exciting results for the Phoronix Apache Web Server benchmark and the Cloudsuite in-memory analytics benchmark. We’ll share these results in another post next week when we publish the download links for beta version 0.3. Stay tuned!