Uses stress-ng for benchmarking, even though the stress-ng documentation says it is not suitable for benchmarking. It was written to max out one component until it burns.
Using a real app, like Memcached or Postgres, would show more realistic numbers, closer to what people see in production.
The difference is not major: 50% utilization is closer to 80% in real load, but it breaks down faster. Stress-ng is nicely linear until 100%, while Memcached will have a hockey-stick curve at the end.
The advantage of stress-ng is that it's easy to make it run with specific CPU utilization numbers. The tests where I run some number of workers at 100% utilization are interesting since they give such perfect graphs, but I think the version where I have 24 workers and increase their utilization slowly is more realistic for showing how production CPU utilization changes.
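For anyone who wants to reproduce the two kinds of test, here is a rough sketch of how the stress-ng command lines could be generated. It only builds and prints the commands, it does not run them. The function names, the 10% step and the 60-second duration are my own assumptions, not what the original test used; the `--cpu`, `--cpu-load`, `--timeout` and `--metrics-brief` flags are standard stress-ng options.

```python
from __future__ import annotations

import shlex


def stress_ng_cmd(workers: int, load_pct: int, seconds: int = 60) -> list[str]:
    """Build one stress-ng invocation: `workers` CPU stressors at `load_pct` percent each."""
    if not 0 <= load_pct <= 100:
        raise ValueError("load_pct must be between 0 and 100")
    return [
        "stress-ng",
        "--cpu", str(workers),         # number of CPU stressor workers
        "--cpu-load", str(load_pct),   # target load per worker, in percent
        "--timeout", f"{seconds}s",    # run duration (assumed value)
        "--metrics-brief",             # print a short throughput summary at the end
    ]


def fixed_workers_sweep(workers: int = 24, step: int = 10) -> list[list[str]]:
    """Variant 1: keep 24 workers and raise their utilization step by step."""
    return [stress_ng_cmd(workers, pct) for pct in range(step, 101, step)]


def full_load_sweep(max_workers: int = 24) -> list[list[str]]:
    """Variant 2: run 1..max_workers workers, each at 100% utilization."""
    return [stress_ng_cmd(n, 100) for n in range(1, max_workers + 1)]


if __name__ == "__main__":
    for cmd in fixed_workers_sweep():
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or passed to `subprocess.run` if you want to automate the sweep.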
Fun data point though: I just ran the Phoronix nginx benchmark at three core counts and got these results:
- Pinned to 6 cores: 28k QPS
- Pinned to 12 cores: 56k QPS
- All 24 cores: 62k QPS
I'm not sure how this applies to realistic workloads where you're using all of the cores but not maxing them out, but it looks like hyperthreading only adds ~10% performance in this case.
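The arithmetic behind that ~10% figure, as a quick sketch. The QPS numbers are the three results above; reading the 12-core run as "all physical cores" and the 24-core run as "physical cores plus their hyperthreads" is my assumption about the machine's topology.

```python
# QPS from the three nginx runs above, keyed by the number of pinned cores.
results = {6: 28_000, 12: 56_000, 24: 62_000}


def relative_gain(from_cores: int, to_cores: int) -> float:
    """Fractional throughput gain when moving from one core count to another."""
    return results[to_cores] / results[from_cores] - 1.0


def qps_per_core(cores: int) -> float:
    """Throughput divided by the number of logical cores in use."""
    return results[cores] / cores


if __name__ == "__main__":
    # Doubling 6 -> 12 cores doubles throughput (linear scaling).
    print(f"6 -> 12 cores: {relative_gain(6, 12):+.1%}")
    # Doubling 12 -> 24 logical cores adds only about a tenth more.
    print(f"12 -> 24 cores: {relative_gain(12, 24):+.1%}")
    for cores in results:
        print(f"{cores} cores: {qps_per_core(cores):.0f} QPS per core")
```

So going from 6 to 12 cores scales linearly, while the last doubling adds roughly 6k QPS on top of 56k.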