# Benchmarking drives and drive systems for Sample Libraries



## colony nofi (Jan 5, 2021)

Hey all,

I've got a few storage-based tech projects that I need to carry out over the next few months (although time frames may well depend on being able to get my hands on a bunch of specific hardware...)

With these, I will have the opportunity to test some pretty esoteric gear / some extremely high speed drives both directly connected and through 10GbE to my workstation.

Over the years, I've done a decent number of tests with Kontakt (and more recently, a little with the Spitfire sample player) looking at how and why different drives perform under different circumstances.

However, I've never been as scientific / exacting as I'd like in order to put results out in public. I've come across some quite unusual results at times, but without rigorously designed tests, I'm reluctant to discuss them as they could very easily be misleading.

Given I'm away from my studios for at least another week (COVID / project-related unexpected travel circumstances), I feel like this is a good time to start a little project that looks into drives with sample libs a little deeper, with the help of others in this great community.

I'll be using the next week or two to get a lot of the thinking done.

Some of the aims may already be solved - in which case, great. We can tick them off and move on. The overall idea behind this little project is to end up with a single resource for composers / music technologists to be able to make decisions about storage for sample libraries.

So without further ado, the questions I'd like answered through all this are things like:

- What are the drive specs / benchmarks that are important to look at when choosing drives for sample libs?
- Are there interactions between different specs? I'm thinking of things like read speeds of specific file sizes. ***WHAT FILE SIZE*** is most important - or is it a range of sizes? Perhaps we need to speak to some people at NI about this?
- Are there differences in what drive specs we need to look at for loading vs real-time performance?

To my mind, there are 3 main things to concentrate on:
- R/W (concentrating on read) of different file sizes
- Random I/O of different file sizes
- And with those, how the interface (SATA, M.2 / U.2) and transport layer (PCIe, SATA, Thunderbolt, USB, IP/network) etc. affect things (and at what points, with different technologies, those things become important)
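To make that first pair of measurements concrete, here's a rough Python sketch of the kind of raw test I mean: sequential read throughput at a few block sizes, plus small random reads, against a throwaway scratch file. (The file size and block sizes are arbitrary placeholders I picked, and note that OS caching will flatter the numbers unless you use a file much bigger than RAM or drop caches between runs.)

```python
import os
import random
import tempfile
import time

def make_scratch_file(size_mb=32):
    """Create a throwaway file of random bytes to read back.

    Placeholder size -- for realistic numbers use a file larger than RAM.
    """
    path = os.path.join(tempfile.gettempdir(), "drive_bench_scratch.bin")
    with open(path, "wb") as f:
        f.write(os.urandom(size_mb * 1024 * 1024))
    return path

def sequential_read_mb_s(path, block_size):
    """Read the whole file front-to-back in block_size chunks; return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered to bypass Python's own buffering
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return (total / (1024 * 1024)) / elapsed

def random_read_iops(path, block_size=4096, ops=2000):
    """Seek to random offsets and do small reads; return operations/second."""
    size = os.path.getsize(path)
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for _ in range(ops):
            f.seek(random.randrange(0, size - block_size))
            f.read(block_size)
    return ops / (time.perf_counter() - start)

if __name__ == "__main__":
    path = make_scratch_file()
    for bs in (4 * 1024, 64 * 1024, 1024 * 1024):
        print(f"{bs // 1024:5d} KiB sequential: {sequential_read_mb_s(path, bs):8.1f} MB/s")
    print(f"    4 KiB random: {random_read_iops(path):8.0f} IOPS")
    os.remove(path)
```

Not a replacement for proper tools like ATTO or fio, obviously - just a way of showing exactly which two numbers (throughput per block size, and small-read IOPS) I think matter for samples.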

For real-world use, composers want to know about load times, as well as making sure the drive can keep up with real-time usage (after all, if I understand things correctly, Kontakt pre-loads some data and then grabs more as you are playing).

And while we are looking at Kontakt, it seems useful to look into the SINE and Spitfire samplers as well, right?

Ok - I'll leave it here for now. I'd love to hear first thoughts from others about how we might go about designing tests, and what list of software will be required. (Ultimately, for it to be successful, we need to use easily available software suites that are cross-platform, or develop a series of tests for Macs and PCs separately.)


----------



## colony nofi (Jan 6, 2021)

For reference, here's a discussion which has contributed to me thinking about getting this little project going.





> **Sample libraries on 10gbe NAS?** (vi-control.net) - "Curious if anyone is running their libraries off a NAS connected via 10gbe?"


----------



## jiten (Jan 6, 2021)

I really hope to see this thread take off because I think it is a great idea and can possibly turn into a very helpful resource or database.

I know Richard Ames (can't seem to tag him!) and @tack have done a decent amount of work with more specific focus on some of this. I just dug up tack's document here, which you may already be familiar with:








> **Kontakt Patch Load Performance** (docs.google.com) - "Kontakt Patch Loads: NVMe vs SATA SSD (Windows) tl;dr I compared a 1TB Samsung 960 PRO NVMe with a 2TB Samsung 850 EVO SATA SSD and measured patch load times of a multi consisting of all section patches from Cinematic Studio Strings. The goal was to answer the question: will the added performanc..."





Ultimately, what I care about most at the end of the day is how many notes/voices I can stream simultaneously from a given HD and system configuration for each major sample player, and how to maximize this. That was Richard Ames' approach in some of his videos and I think that's a good framework to evaluate things.

Personally, I wonder how much pure HD speed tests using tools like ATTO (even testing IOPS, R/W, etc.) actually translate to the number of notes streamed in an actual project, given the differing interactions of players, compression formats, file sizes, etc. For example, one observation in tack's document is that Kontakt doesn't go above a queue depth of 1 and is curiously inefficient with how it handles file reads. Then on top of that, file layouts vary wildly between players:

- Some Kontakt libs have many tiny NCW files (like Musical Sampling), while others have larger 2 GB NCX chunks (like the Cinematic Studio series).
- Spitfire uses its own file format and small-to-medium chunks (some in the 100-200 MB range, some under 50 MB).
- EW Play uses tons of very tiny files (1 MB or less each, in a deep folder structure).
- SINE uses yet another format/compression scheme with large monoliths.
- UVI uses, by comparison, gigantic monoliths (each lib is basically one file - one I have is 23 GB!).
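As a quick way to sanity-check those file-size claims, a rough sketch like this (the path and the bucket boundaries are just placeholders I picked) could be pointed at any installed library to see which read sizes actually dominate it:

```python
import os
from collections import Counter

def file_size_histogram(root):
    """Bucket every file under `root` by rough size class.

    Bucket boundaries are arbitrary placeholders, chosen to match the
    tiny-files / medium-chunks / monolith patterns described above.
    """
    buckets = Counter()
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            if size < 1024 * 1024:
                buckets["< 1 MB"] += 1
            elif size < 50 * 1024 * 1024:
                buckets["1-50 MB"] += 1
            elif size < 500 * 1024 * 1024:
                buckets["50-500 MB"] += 1
            else:
                buckets[">= 500 MB (monolith)"] += 1
    return buckets

if __name__ == "__main__":
    # Placeholder path -- point this at a real library folder.
    for bucket, count in file_size_histogram("/path/to/Sample Libraries").items():
        print(f"{bucket:>22}: {count}")
```

If the distributions really do differ that much between players, that alone would argue for benchmarking each player's access pattern separately rather than trusting one generic disk test.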

Unfortunately, there is a dizzying number of variables that affect my simplistic "number of notes" metric, and they are also all kind of interconnected. You mentioned most of the relevant ones for HDs. But add to that everything from the file system, OS, CPU, buffer size, the sample player you are using, buffer preload settings, multithreading options, the DAW, random drivers in your system that may be interfering, etc.

What I think would be great to see, and may have been ultimately what you were getting at, is a set of standardized benchmark projects created for the main DAWs using some of the great free sample libraries available on each player (Layers, BBCSO Discover, Project SAM, or VSL). The key then would be to focus on _relative performance within an individual system_, to keep things comparable and controlled, in a test that many people can run. Something like: "if I try to stream using an internal NVMe connected via PCIe vs. an SSD connected via USB3, while leaving everything else unchanged on my system, I can stream X% or X number more/fewer voices in Kontakt, SINE, Play, etc." DAW benchmarks exist for synths and plugins (obviously entirely CPU-focused), and they should exist for sample library streaming too!

To get a bit more specific, as a very simple starting point: e.g. you can set up, within each DAW project, ~10 tracks with a Kontakt, Spitfire, SINE, VSL, Play, Engine, or UVI instance loaded with some standard lib that a large number of people likely have or can get free. Keep stacking a chord and see how many notes you can stream without glitches for each player, _one at a time_ (each player needs to be handled separately for the reasons mentioned above). Then make some change in whatever specific thing you want to test and log the difference; maybe measure 3x and take the average.
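Just to sketch what the disk-side half of that "stack voices until it glitches" test might look like outside a DAW: simulate voices each pulling a small chunk per time slice, issued strictly one at a time to mimic the queue-depth-1 pattern from tack's document. All the numbers here (64 KiB per chunk, one chunk per 100 ms per voice) are made-up placeholders, not measured sampler behaviour:

```python
import os
import random
import tempfile
import time

CHUNK = 64 * 1024   # bytes each simulated voice needs per period (assumption)
PERIOD = 0.1        # seconds of audio each chunk covers (assumption)

def make_scratch(size_mb=32):
    """Throwaway file of random bytes to stream from (placeholder size)."""
    path = os.path.join(tempfile.gettempdir(), "voice_bench.bin")
    with open(path, "wb") as f:
        f.write(os.urandom(size_mb * 1024 * 1024))
    return path

def max_voices(path, limit=512):
    """Double the voice count until one period's reads take longer than PERIOD.

    Reads are strictly serial (queue depth 1), so this deliberately ignores
    the parallelism a smarter streaming engine could exploit.
    """
    size = os.path.getsize(path)
    voices = 1
    with open(path, "rb", buffering=0) as f:
        while voices <= limit:
            start = time.perf_counter()
            for _ in range(voices):  # one read per voice, issued one at a time
                f.seek(random.randrange(0, size - CHUNK))
                f.read(CHUNK)
            if time.perf_counter() - start > PERIOD:
                return voices        # this count fell behind real time
            voices *= 2
    return limit

if __name__ == "__main__":
    p = make_scratch()
    print(f"estimated sustainable voices (QD1, placeholder numbers): ~{max_voices(p)}")
    os.remove(p)
```

It obviously can't capture the player/DAW/CPU side of your metric, but comparing the same script across two drives on the same machine is exactly the kind of controlled relative test you describe.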

If enough people do that, contributing to a database and focusing on a few key questions, interesting patterns or truths may emerge. E.g. "people tend to get meaningfully more streaming notes out of Kontakt using NVMe over USB3 vs. SSD over USB3", etc. Or maybe it'll all be a giant waste of time, what do I know.

Granted this is just an incredibly simple starting point and doesn't account for multiple instances of each player trying to stream simultaneously or impact of different players trying to stream from the same disk simultaneously, etc.

What sorts of questions do people want to test? For me, one of particular interest is the age-old question of one massive drive vs. many smaller drives (both internal SATA/PCIe connections and external USB or TB3).

And of course, let me know if I'm hijacking your thread or taking it in a different direction than you had intended / please redirect if so!


----------



## colony nofi (Jan 7, 2021)

jiten said:


> I really hope to see this thread take off because I think it is a great idea and can possibly turn into a very helpful resource or database.
> 
> I know Richard Ames (can't seem to tag him!) and @tack have done a decent amount of work with more specific focus on some of this. I just dug up tack's document here, which you may already be familiar with:
> 
> ...


No - this is a really good first post off the rank! 
And it covers a bunch of the things that I've been mulling over.

I must admit it's quite daunting to start thinking about designing benchmarks that are dedicated to a single subsystem, as so many other things can get in the way (other parts of a build can alter results / other system components become unknown bottlenecks).

I'm going to think some more and try to respond more specifically in the next couple of days.


----------

