# What is faster: bit operations or if statements?



## thesteelydane (Jul 29, 2020)

I’ve seen a bit (pun intended) of code that used bit operations to make sure a value didn’t exceed a certain threshold, basically using sh_right(value,31) and some math.
I can accomplish the same with a simple If statement:



```
if (value >= threshold)
      value := threshold
end if
```

So why use sh_right instead? Said piece of code was being called inside a busy listener callback, but is it really that much more efficient than an if statement?


----------



## EvilDragon (Jul 29, 2020)

Bitshifts are faster, but with today's CPUs it's all superfast anyways...

The reason to use bitshifting here is so that you can inline that operation with other math, instead of spending 3 lines on it.


----------



## thesteelydane (Jul 29, 2020)

Ah I see. Thanks!


----------



## MartinH. (Jul 29, 2020)

I don't use and know nothing about KSP, but in other languages there's a clamp() function for things like that, which I find more readable than a bit shift and also is just one line.


----------



## EvilDragon (Jul 29, 2020)

Yep KSP doesn't have it. This is the equivalent of it


----------



## Ben (Jul 29, 2020)

Native bit-shifts are super fast (normally only one cycle for the calculation). Compared to that simple if checks with bigger or smaller require around 3 cycles. Comparisons with smaller-equals (<=) or bigger-equals (>=) are also more expensive need additional 1-2 cycles.

As mentioned this all is true for native code (assembly, C, C++). If you are using another language you should check how the compiler/interpreter handles these instructions.
Many script languages are not capable of true bit-operations, which makes these operations really expensive during run-time. In this case a simple "if" is not only more readable but performs better as well.


----------



## EvilDragon (Jul 29, 2020)

In KSP it's more efficient than an if


----------



## thesteelydane (Aug 2, 2020)

Well, I have a double sequencer running in a very busy listener callback, and after I replaced my if statement with sh_right and some math, CPU usage went from 2% to 1 %, so there is a noticeable difference! I don't think my sequencers can be more efficient than they are now, so I'm happy I discovered this trick!


----------



## polypx (Aug 2, 2020)

Can you explain the bitshift clamping trick?


----------



## thesteelydane (Aug 2, 2020)

Sure! Say you have a value that you want to keep between 0 and a max. So if its less than 0 you want it to be 0 and if its more than max, you want it to be max, you can do something like this: 


```
result := value*(1+(sh_right(max-value, 31) .or. sh_right(value, 31)))-(max * sh_right(max-value, 31))
```


----------



## polypx (Aug 2, 2020)

Thanks! That's remarkable. But with the three bitshifts, the .or. comparison, a subtraction, an addition, and two multiplications ... all that is still computationally more efficient than >= ?


----------



## EvilDragon (Aug 2, 2020)

Bit shifts and binary comparisons are superfast operations.

Whenever you have a branch, the CPU usually guesses what might happen, but if it guesses wrongly, you end up with slower code execution because it needs to trace things back, reload different data into registers, and so on. Oftentimes this ends up being longer than three bitshifts, binary OR, and a few adds/multiplies. 






Avoiding Branches - Chessprogramming wiki







www.chessprogramming.org


----------



## thesteelydane (Aug 2, 2020)

polypx said:


> Thanks! That's remarkable. But with the three bitshifts, the .or. comparison, a subtraction, an addition, and two multiplications ... all that is still computationally more efficient than >= ?



I don't even understand how bit shifts work, I just know what they spit out. In this case I replaced an if statement in a listener callback with this line of code, and my CPU usage dropped from 2 to 1 %, so it would seem so....

This is all anecdotal evidence of course, I have no hard data to back it up.


----------



## ism (Aug 2, 2020)

Is this kind of ultra low level optimization-by-hand really necessary in KSP?


----------



## EvilDragon (Aug 2, 2020)

It's useful in some cases, but as a must-have optimization, I think you shouldn't look at it like that. Bitshifting is useful when working with bitmasks, for example.


----------



## ism (Aug 2, 2020)

Bitmasks have semantic value in a language as primitive as KSP. And premature optimization is one if the great sins of engineering.


I’m just learning KSP myself and surprised that this kind of register level optimization is even contemplated in such an high level, abstracted scripting language. Even a 1% performance increase seems quite shocking.


----------



## EvilDragon (Aug 2, 2020)

Again I don't think these commands should be looked at through the prism of optimization. You can do nifty stuff with bitwise operations...


----------



## polypx (Aug 2, 2020)

It's probably unnecessary for KSP optimisation generally. But I DO use a lot of clamping in some of my scripts, so much so that it's almost always a called function. If that function is slightly improved, it certainly doesn't hurt to use that instead.

Plus I'm curious about the bitshift logic... it's slightly tricky to get my head around.


----------



## EvilDragon (Aug 2, 2020)

Bitshifts are fast multiplies or divisions. Divisions are more CPU expensive than bitshifts. Single bitshift left is multiply by 2, single bitshift right is divide by 2.


----------



## thesteelydane (Aug 2, 2020)

polypx said:


> It's probably unnecessary for KSP optimisation generally. But I DO use a lot of clamping in some of my scripts, so much so that it's almost always a called function. If that function is slightly improved, it certainly doesn't hurt to use that instead.
> 
> Plus I'm curious about the bitshift logic... it's slightly tricky to get my head around.



It’s a head scratcher for sure, and I suck at math, so took a while for me to get my head around. By the way, if you only need to make sure your value doesn’t get over the max, you can get rid of the .or. and the second sh_right in the example above.


----------



## polypx (Aug 3, 2020)

Actually I'm more curious about how to include the $min as a variable as well...


----------



## thesteelydane (Aug 3, 2020)

polypx said:


> Actually I'm more curious about how to include the $min as a variable as well...


I've been thinking about that too, but since I don't need it for my current project, it's not high on my list right now. I'm sure ED has a solution.


----------



## polypx (Aug 3, 2020)

EvilDragon said:


> Bitshifts are fast multiplies or divisions. Divisions are more CPU expensive than bitshifts. Single bitshift left is multiply by 2, single bitshift right is divide by 2.



_So sh_right(x, 31) is dividing by 2^31? That seems a huge number._


----------



## EvilDragon (Aug 3, 2020)

Well, yeah. But that's also the limit of value range of 32-bit variables. So you use something like that to check the sign of the number really quickly, doing:


```
sh_right(value, 31) .or. 1
```

will return -1 for negative number and 1 for positive number (and will also return 1 for zero).


----------



## Al Maurice (Aug 3, 2020)

I find bitshifting extremely helpful when used in the right cases, but not when it makes the code appear confusing and convoluted. So when you want to check between two values sometimes it's just easier to just use a simple if or with the assignment if you were using.

Alternatively you can use a subtraction operator, so anything that comes greater than 0 is generally larger and anything less smaller. Although this all depends which way you place the variables.

Also if you're thinking of reusing a value, perhaps it might be better to store that calculation in a variable and then you can check it multiple times.


----------



## Ben (Aug 3, 2020)

Al Maurice said:


> I find bitshifting extremely helpful when used in the right cases, but not when it makes the code appear confusing and convoluted.


That's what code-comments are made for 
If it makes your programm significant faster I would use these optimizations, but put easy understandable pseudo-code in comments next to it.

But: Never start optimizing code too early. Make sure you have running and correct code before you start optimizing it. The fastest code is no use if it does not work as intended and you will loose too much time of your life.


----------



## szcz (Aug 3, 2020)

I took time to do some tests.
First, the bitshifting based clamp function has some restrictions:
1. If input value is between -2147483647 and -2146483648 it returns max instead of 0.
2. If max value is 2147483647, any negative number will return max instead of 0.
This is not really a dealbreaker, but something to have in mind.

Then I did tests processing array of 10^6 numbers, filled with pseudo-random data.
testing code looked like this:

```
reset_ksp_timer
$x := 0
while($x < $A_SIZE)
    $value := %in[$x]
    call value_clamp
    %out[$x] := $result
    inc($x)
end while
set_text($info,"time: " & $KSP_TIMER)
```

I tested three functions.

```
{quoted bitshift based clamp}
$result := $value*(1+(sh_right($clamp_max - $value, 31) .or. sh_right($value, 31)))-($clamp_max * sh_right($clamp_max - $value, 31))
```


```
{crude if series}
$result := $value
if($result > $clamp_max)
    $result := $clamp_max
end if
if($result < 0)
    $result := 0
end if
```


```
{optimized if series}
$result := $value
if($result > $clamp_max)
    $result := $clamp_max
else
    if($result < 0)
        $result := 0
    end if
end if
```

Results I was getting for a single pass (rounded microseconds to milliseconds):
bitshift based function: 368-408 ms
crude if: 257-258 ms
optimized if: 228-229 ms

If I used ">=" instead of ">" in "crude if": 260-261 ms

Of course that's a 'synthetic' test.

updated: bitshift times varies a bit if I close and re-open Kontakt, or start new patch, if seems consistent.


----------



## polypx (Aug 3, 2020)

I was planning to do a test like this when I got home... thanks!


----------



## PaulieDC (Aug 3, 2020)

This is usually about the point in the conversation where I jump in and add absolutely nothing of value except for a potential laugh (28% success rate), so here's the joke that will explain programming to _anyone_:

A software developer is headed to the store, and asks his wife if she needs anything. She replies "Yes, get a loaf of bread, and if the eggs are on sale, get a dozen".

↓↓↓↓↓↓




He came back home with a dozen loaves of bread.


----------



## thesteelydane (Aug 4, 2020)

szcz said:


> I took time to do some tests.
> First, the bitshifting based clamp function has some restrictions:
> 1. If input value is between -2147483647 and -2146483648 it returns max instead of 0.
> 2. If max value is 2147483647, any negative number will return max instead of 0.
> ...



Damn, so a simple if statement is faster. So why did I see better performance with the bitshift (anecdotal of course)? My code was inside a listener callback.


----------



## EvilDragon (Aug 4, 2020)

Computers are as inconsistent as they come, even from CPU to CPU


----------



## szcz (Aug 4, 2020)

Yes, and situation is different: processing large array vs listener tick (doing different stuff in between ticks). Also CPU meter is not very accurate, I guess. You may try to push the listener to create a lot more strain. Change from BEAT to MS, give it some crazy interval, that would push the CPU into 50%+ range and then see what performs better? I'm curious.


----------



## PaulieDC (Aug 4, 2020)

EvilDragon said:


> Computers are as inconsistent as they come, even from CPU to CPU


There's so many factors like that, oy vey. Here I built this 65lb 12-bay psycho tower in 2018 with 14 cores, 128GB ram, 30 million in NVMe drives, and I try it out with a few EW libraries and it spits and crackles like a rattlesnake in dry bowl of corn flakes. 

After I discovered Studio One's poor multicore management I ditched that completely, crossgraded to Cubase 10 Pro and invested in an RME Babyface Pro. WHOA. I don't even remember what rattlesnakes sound like at this point, TOTALLY rectified the issue. Don't blow the barn on the fastest CPU on earth, make sure every last component you are running is roughly in the same league. Just a thought.


----------

