# Benchmarking Kontakt 3.5 vs. PLAY on a 64-bit PC



## ddas (Aug 10, 2009)

I've got an experiment going on which I'd love to get some input on. I'd like to determine, as accurately as possible, how Kontakt 3.5 compares to PLAY.

I set up a brand new custom-built PC as a softsynth machine. It is an Intel Core 2 Duo chip running at 2.8GHz, with 8GB of RAM. I have both EWQLSO-NI and EWQLSO-PLAY (Platinum Plus) loaded on to a 7200rpm SATA drive mounted internally. The PC is running the latest public beta of Windows 7 64-bit. 

Kontakt 3.5 and PLAY are running in standalone mode (though of course not at the same time) with a 128 sample buffer for low latency. I have both Kontakt and PLAY loading an identical 48-channel orchestral template that encompasses all of the keyswitch instruments (strings, WW, brass) plus some of the full orchestral patches. Only the stage mics are loaded. The sample pool for Kontakt vs. PLAY should be roughly equivalent, although I'm sure there are minor differences that EW introduced during the reprogramming for PLAY. I chose every single keyswitch instrument possible because, when composing, I love the idea of every articulation being instantly available to me without having to go back and load other patches.

On my main sequencing machine, I set up the 48-track template (see attached screenshot) and played a monophonic line on every single track. (Of course, I played in the appropriate range for each instrument to make sure samples are actually being triggered.) This is done to approximate how it might actually be used in a real-world situation. Note that far more than 48 notes of polyphony are being used, due to some instruments which trigger attack samples, release samples, and notes that slightly overlap with other notes. The actual polyphony count seems to range between 200-300 voices in both Kontakt and PLAY.

Upon triggering the sequence, neither Kontakt nor PLAY is able to play every single note of polyphony. There are dropouts and occasional glitches.

Just eyeballing it, the CPU and streaming meters for Kontakt and PLAY are roughly similar.

I'd love to run more comprehensive tests to figure out how each app is functioning, and, not being a PC guy, I'm open to any suggestions for benchmarking programs, or other things to try, to get this running as smoothly as possible.

So I open the floor...


----------



## germancomponist (Aug 10, 2009)

One suggestion: Compare the release times of the samples between Kontakt and Play. I have noticed that in Kontakt most of the release times are set too long and use more RAM than needed. I have adjusted them in Kontakt and got many more voices at the same time.


----------



## ddas (Aug 10, 2009)

germancomponist @ Mon Aug 10 said:


> One suggestion: Compare the release times of the samples between Kontakt and Play. I have noticed that in Kontakt most of the release times are set too long and use more RAM than needed. I have adjusted them in Kontakt and got many more voices at the same time.



That would be a *lot* of work, given that all of the loaded instruments are keyswitches. Not impossible, though. That's a good tip for optimizing this for maximum efficiency.


----------



## germancomponist (Aug 10, 2009)

Yepp, but test it on, let's say, one keyswitched instrument? Possible?

Thanks!


----------



## gsilbers (Aug 10, 2009)

why not see at what track count the glitches first start to occur, then see which
one gets more tracks? but also try different settings to realize the best potential
for each VI.
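
The procedure suggested here (add tracks until the first glitch, then compare the counts) can be sketched roughly as below. This is only an illustration: `max_clean_tracks` and `example_listening_test` are hypothetical names, and the "listening test" is a stand-in for a human listening to the output, since neither sampler exposes a glitch detector that this thread knows of.

```python
from typing import Callable


def max_clean_tracks(plays_cleanly: Callable[[int], bool], total_tracks: int) -> int:
    """Return the highest track count that still plays without glitches.

    Tracks are unmuted one at a time; the search stops at the first
    track count where the listening test reports a glitch.
    """
    clean = 0
    for count in range(1, total_tracks + 1):
        if not plays_cleanly(count):
            break
        clean = count
    return clean


def example_listening_test(limit: int) -> Callable[[int], bool]:
    """Hypothetical stand-in: pretend playback is clean up to `limit` tracks."""
    return lambda count: count <= limit


if __name__ == "__main__":
    # The limit of 30 is an invented value used only to exercise the loop.
    print(max_clean_tracks(example_listening_test(30), total_tracks=48))
```

Running the same loop once per sampler, with the same template and the same settings, gives two track counts that can be compared directly.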


----------



## ddas (Aug 11, 2009)

OK, first test: 

Windows 7, latest public beta
2.8GHz Intel Core 2 Duo
8GB of physical RAM
Tascam US-1641 USB Audio/MIDI Interface
Kontakt 3.5.0.025 standalone application
Buffer: 128 samples
Override instrument's preload size: 156k
CPU overload protection: off
Multiprocessor support: 2 cores (the max since this is a dual core CPU)
RAM in use: 6.17GB (out of 8GB)
48 instrument multi loaded (see screenshot in first post)

Out of 48 MIDI tracks (each playing a monophonic phrase simultaneously) I can get about 40 playing at once, with total polyphony ranging from 250-300 voices, and CPU ranging from 60-90%. If I try to play all 48, polyphony rises to above 400, CPU hits 100% regularly, and I get very bad audio glitching. Curiously, the Disk Stream meter is extremely low -- like 0-1%.

I also tried relaxed CPU overload protection, but this ended up killing voices far sooner (<200) and so I abandoned it.

Over to PLAY standalone!

Windows 7, latest public beta
2.8GHz Intel Core 2 Duo
8GB of physical RAM
Tascam US-1641 USB Audio/MIDI Interface
Play 1.2.05 standalone application
Buffer: 128 samples
Engine level: 5 (I assume equivalent to a higher instrument preload size). When I set engine level to 5, I see the following hardcoded settings: Engine memory: 512MB, Max Voices: 1024, Prime Buffer: 40k, Play Buffer: 384k
CPU overload protection: off
RAM in use: 1180MB
Equivalent 48 instrument multi loaded (exact same instruments)

Out of 48 MIDI tracks, I can get about 32 playing at once, with total polyphony around 180-210, and CPU ranging from 80-100%. If I try to play all 48, polyphony rises to 220-250, CPU hits 100-150% regularly (!), and I get very bad audio glitching. PLAY doesn't have a disk stream meter expressed as a %, but its disk meter varies between 3000-6000kB/sec.

It's worth noting that the difference in RAM use is substantial. I used Kontakt's override instrument preload size to force it to use more of the 8GB of RAM. I tried switching PLAY's Engine Level setting (the only rough equivalent I see) in order to use more of the RAM, but level=5 is its max, and when set to that, it's using 1.1GB out of 8GB in my machine. Curiously, changing PLAY's Engine level back to 2 (the default) results in 590MB of RAM use, yet the performance results are exactly the same as what I wrote above.

Exactly like Kontakt, if I use PLAY's overload protection, voices get cut off when the polyphony is nowhere near its limit, so it's not useful. I'd rather have Kontakt/PLAY using 100% of the processor and manage the polyphony limit manually than use the overload protection, which starts cutting out voices at far lower polyphony levels than the computer is capable of.

Conclusion: apples to apples, PLAY standalone is less efficient than Kontakt standalone. 
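
To put a rough number on that conclusion, here is a minimal sketch that turns the figures reported above into "voices per CPU percent". It assumes the midpoint of each reported range is representative and that the two CPU meters are comparable, neither of which is guaranteed; the `Run` class and function names are made up for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Run:
    name: str
    clean_tracks: int                # tracks playable before glitching
    voices: Tuple[int, int]          # reported polyphony range
    cpu_percent: Tuple[int, int]     # reported CPU meter range


def midpoint(value_range: Tuple[int, int]) -> float:
    return (value_range[0] + value_range[1]) / 2


def voices_per_cpu_percent(run: Run) -> float:
    """Midpoint polyphony divided by midpoint CPU load (an assumed metric)."""
    return midpoint(run.voices) / midpoint(run.cpu_percent)


# Figures taken from the two standalone tests in this post.
kontakt = Run("Kontakt 3.5.0.025", 40, (250, 300), (60, 90))
play = Run("PLAY 1.2.05", 32, (180, 210), (80, 100))

if __name__ == "__main__":
    for run in (kontakt, play):
        print(f"{run.name}: {run.clean_tracks} tracks, "
              f"{voices_per_cpu_percent(run):.2f} voices per CPU %")
    # Kontakt 3.5.0.025: 40 tracks, 3.67 voices per CPU %
    # PLAY 1.2.05: 32 tracks, 2.17 voices per CPU %
```

By this crude measure Kontakt delivers roughly 1.7 times as many voices per unit of CPU load, which agrees with the track counts (40 vs. 32) pointing the same way.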

Both engines do not handle overloads well -- once I get beyond the 40 channels in Kontakt or the 32 channels in PLAY, I get bad audio glitching in both.

Neither program seems to be having a speed issue getting the data off the drive. Kontakt gives me a Disk meter as a % and it's barely moving. PLAY gives me a Disk meter as an amount, so I don't know what its theoretical max is. Yet it seems like in both cases the bottleneck might be CPU power: in both programs, CPU use is near, at, or above 100%, and that coincides with the audio glitches and dropouts.
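
One way to give PLAY's kB/sec figure some context is to estimate how much data a single voice needs if it is streamed entirely from disk. The sketch below assumes uncompressed 24-bit, 44.1 kHz, stereo samples and 1 kB = 1000 bytes; those are assumptions for illustration, not settings read from either sampler, and both function names are hypothetical.

```python
def stream_rate_kb_per_sec(voices: int,
                           sample_rate_hz: int = 44_100,
                           bytes_per_sample: int = 3,
                           channels: int = 2) -> float:
    """Data rate needed if every voice streamed uncompressed audio from disk."""
    return voices * sample_rate_hz * bytes_per_sample * channels / 1000


def voices_for_rate(kb_per_sec: float) -> float:
    """How many fully streamed voices a given disk rate would cover."""
    return kb_per_sec / stream_rate_kb_per_sec(1)


if __name__ == "__main__":
    print(f"1 voice:    {stream_rate_kb_per_sec(1):.1f} kB/sec")
    print(f"200 voices: {stream_rate_kb_per_sec(200):.0f} kB/sec")
    print(f"6000 kB/sec covers about {voices_for_rate(6000):.0f} fully streamed voices")
```

Under those assumptions, 3000-6000kB/sec corresponds to only a couple of dozen fully streamed voices at most, so a large share of the 180-250 sounding voices is presumably being served from the preloaded RAM buffers rather than the drive. That would be consistent with the disk not being the bottleneck, though it is an estimate rather than a measurement.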

What else can I try, in either program, to measure this more scientifically or to eke better performance out of either?

Does anyone think that running each sampler in a host might perform better than in standalone? I'm a Mac guy, and on the Mac I've definitely noticed that NI products tend to run better in a host (where the host addresses the audio interface) than standalone. If so, any good, lean, mean, free hosts to recommend I try? (I may need a bit of help in learning how to route 3 different 16 channel interfaces of MIDI into the host and routing to three plug-in instances to replicate the test I've got going on here.)

Budget allowing, I'd stick a faster (quad) CPU in here to try it to see if I got better polyphony.


----------



## chimuelo (Aug 12, 2009)

Thanks for confirming Kontakt 3.5 for me.
I have only tested K3.0 using the 3GB PAE Switch and 4GB of RAM.
I have been testing apps and builds with a video archiving Mac freak here in Vegas, and we have found that fast Dual Core CPUs and Mini-ITX designs are excellent for streamers. Because of the 128MB of Sideport RAM, 4U designs using video cards are no longer necessary for audio apps or streamers.
I am using the Scope DSP platform and its GUI is similar to AutoCAD's orbital mode as it stacks modules and allows transparency that can read up to 4 levels (modules) of stacking. It is more demanding than the basic 2D designs that audio apps use, and the 128MB of Sideport RAM is more than sufficient.
We have already built the first 1U which is 2 x slaves, each with 8GB of DDR3 1333MHz RAM w/ CL9 settings.
This design is using the older 790GX Mini-ITX boards and works fine. I will be building 2 x 1U's w/ the 785G boards, and 4 x of the fastest dual cores.
I have been preparing this for VSL's VE Pro. It will cost 2900 USD for 4 x streaming slaves, which will all fit in a 2U ATA Shockrack due to their short depth.
I am waiting for VE Pro and hopefully Nicki Batz will do an in depth review of it, similar to his excellent review of VE 3.0 a couple of months back.
I am so done chasing around OSes and hardware upgrades. For 3 large I can do this build and it will last a few years of 24/7 Live performances.

Thanks Again for your tests using Windows 7 64 and K3.5. These boards are Windows 7 64bit ready and were designed with that O.S. in mind. 
Also, instead of having a certain instrument with its own CPU/Mobo, I will be spanning all instruments across 4 x SSDs for more effective Divisi of Horns, Woodwinds and Strings. I can load 40+ top shelf reverbs from my 1U DSP rack along with all of the usual synthesizers and Modular devices, so I am very excited about this. I pray VE-Pro will allow 3rd party apps. If not, I will have Bidule and FX Teleport as a worst case scenario, and that ain't too bad for live work.


----------



## JohnG (Aug 12, 2009)

Hi ddas,

This is very interesting, as I am sure many are still using the EWQLSO libraries in the Kontakt format and wondering about the benefits of the PLAY version.

However, as I thought a bit about your test, I wondered whether the comparison is 100% apples-to-apples, because of the new scripting that came with the PLAY versions of the samples.

The next question would be, I assume, "how big a difference would that make," but I don't know the answer to that either.

Also, it would be great to know which version of PLAY you used.

Thanks for running the test.


----------



## ddas (Aug 12, 2009)

Version of PLAY is noted above (it's the latest).

The scripting could potentially affect the results a little. However, since it's all "locked" and hidden, there isn't necessarily any scripting going on. I certainly don't hear any unusual behaviors when I'm using the instruments in PLAY. Also note that a lot of scripting in Kontakt, unless it is very, very complex, has a negligible CPU impact. So I'm inclined to assume that scripting is probably not making any difference.

I understand that EW has probably reprogrammed EWQLSO for PLAY so the instruments are not absolutely identical. But this is as close a comparison as I can make, and it's a real-world comparison. I literally did make sure to choose the same patches across both platforms, e.g. for the flute solo, both samplers have the patch loaded that contains all flute solo keyswitches, with stage mics only.

That's as close to apples and apples as we're going to be able to get for now...


----------



## stevenson-again (Nov 24, 2009)

> Both engines do not handle overloads well -- once I get beyond the 40 channels in Kontakt or the 32 channels in PLAY, I get bad audio glitching in both.



hi mate,

i have only just had a very cursory glance, but like i mentioned in the other thread (from which you directed me to here) you are not taking into account scripting. this will affect your performance significantly.

in my tests, i was experiencing glitches and drop outs with very very minimal polyphony and hardly any CPU stress at all. so it couldn't be lack of CPU or disk speed. it only occurred if there were certain patches working concurrently. it would require no more than 4 patches working in order to induce the crackling, and that was at a buffer setting of 256 in logic - not even in plogue, though i tried it there as well with exactly the same results.

in both cases setting the buffer to 512 removed all restrictions to polyphony (within reason), and crackling and glitching disappeared altogether.

i conclude from this that there are timing issues within the scripts that reside in kontakt certainly and most likely play as well that are outside of the amount of processing overhead available. in other words, those limits exist no matter what CPU you are using.

from everything i have read, both play and kontakt are more efficient on PC than on a mac and it is possible to run at lower buffer settings. but i think there are limits that won't be resolved by having a faster computer. the optimizations have to occur in the samplers first.


----------



## ddas (Nov 24, 2009)

The NI version of EWQLSO has no scripting at all. The PLAY version, I'm not sure. If there is scripting, it's probably pretty minimal.

I find a buffer size of 512 absolutely intolerable for real sequencing. 256 is tolerable. 128 is great. These tests were done at 128. If I get time I can redo them at 256; I expect the results would be slightly better than the original tests (above) but roughly in line with them.
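
For reference, the delay contributed by one audio buffer is simply the buffer size divided by the sample rate. The sketch below assumes a 44.1 kHz sample rate, which is not stated anywhere in this thread, and counts a single output buffer only; real round-trip latency also includes driver, interface, and MIDI overhead. The function name is made up for this example.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int = 44_100) -> float:
    """Time in milliseconds that one buffer of audio represents."""
    return buffer_samples / sample_rate_hz * 1000


if __name__ == "__main__":
    for size in (128, 256, 512, 1024):
        print(f"{size:>5} samples -> {buffer_latency_ms(size):.1f} ms")
    #   128 samples -> 2.9 ms
    #   256 samples -> 5.8 ms
    #   512 samples -> 11.6 ms
    #  1024 samples -> 23.2 ms
```

Each doubling of the buffer doubles this delay, and also doubles the time the engine has to render each block.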


----------



## stevenson-again (Nov 25, 2009)

the crackling and popping i mentioned only occurred in script heavy patches such as 'the trumpet' and LASS. if you don't use those patches then i can also run lots of polyphony at 128 in either logic or plogue without any problems.

at 128, 'the trumpet' is barely useable as a single instance with all 8-cores sitting around not doing very much. this is my experience. and is why i conclude there must be timing issues outside of clock-speed.

i compose step-time so the latency rarely bothers me, but i do suggest that if you are using these script heavy patches that cause the crackling, to run them at 256 while you compose and switch to 512 for mixing.


----------



## Waywyn (Nov 25, 2009)

I know that latency is a pain with 512 or 1024 buffer ... and I know we talked about this somewhere else David ... also thanks a lot for doing that test ... BUT:

Running such a hardcore template at a 128 buffer - and with "just" a DualCore 2.8GHz - it is no wonder that the performance is kinda bad.

I never ever managed to have an orchestra setup with my 8core 3GHz run at like 256 or so ... I always use 1024 ... but get polyphony up to like 450-480.

If you really want good performance with such a low buffer size the best way would be to have slaves, midi routing and external soundcard into a console or a ...


----------

