# MacBook Pro M1 - What is Memory Bandwidth?



## Prockamanisc (Oct 20, 2021)

I can't get a straight answer regarding this. What is memory bandwidth, specifically as it applies to what we do? Some resources I've found have made me think that it's only for video tasks. 

If the new M1 Pro has 200GB/s of bandwidth, and the M1 Max has 400GB/s, would that mean that applications will run twice as fast?


----------



## Technostica (Oct 20, 2021)

Memory bandwidth is the rate at which data is transferred between the CPU and RAM.
If your app is bandwidth constrained, then increasing it will help performance.
In many cases, doubling it will make no difference at all.
There are benchmarks comparing systems with dual and quad channel RAM for audio work and I don’t recall there being a massive difference.
It might be significant to some people in some scenarios, but you’d need to look at the data.
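For a rough sense of scale (back-of-the-envelope numbers of my own, not from any benchmark), raw audio streaming needs only a tiny fraction of 200GB/s:

```python
def stream_bandwidth_gb_s(voices: int, sample_rate_hz: int = 48_000,
                          bytes_per_sample: int = 4, channels: int = 2) -> float:
    """Raw bandwidth in GB/s to stream `voices` simultaneous audio streams."""
    return voices * sample_rate_hz * bytes_per_sample * channels / 1e9

# 1,000 simultaneous stereo voices of 32-bit float audio at 48 kHz
print(stream_bandwidth_gb_s(1000))  # 0.384 GB/s - a tiny fraction of 200 GB/s
```

Real DAW memory traffic is higher than the raw streaming figure (mixing, plugins, copies), but the gap to 200GB/s is still enormous.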


----------



## Technostica (Oct 20, 2021)

The M1 has about 70GB/s, so the jump to 200GB/s is very likely already overkill for audio work.
GPUs can use over a TB/s and still demand more, so the 400GB/s will be handy there.
So unless the M1 Max also has better memory latency, which seems unlikely, the M1 Pro should be fine.
That's typical behaviour across the industry, not something peculiar to Apple.


----------



## rnb_2 (Oct 20, 2021)

Technostica said:


> The M1 has about 70GB/s, so the jump to 200GB/s is very likely already overkill for audio work.
> GPUs can use over a TB/s and still demand more, so the 400GB/s will be handy there.
> So unless the M1 Max also has better memory latency, which seems unlikely, the M1 Pro should be fine.
> That's typical behaviour across the industry, not something peculiar to Apple.


Yes - I think the decider between Pro and Max for music is your RAM requirement, nothing more. If you need 64GB, then you have to pay the "Max Tax" to enable that. You'll get an extra 8-10 GPU cores (vs. the 10-core/16-core or 10-core/14-core Pros) minimum and the faster memory bandwidth, which certainly shouldn't hurt, but neither would sway me to go Max by themselves for audio work. For graphics/video/photo work, I think the base 24-core GPU Max might be worthwhile, even without upgrading past 32GB.


----------



## Nick Batzdorf (Oct 20, 2021)

I also haven't seen any GHz specs for the processing cores.

Yes, computer manufacturers would love to get away from that, plus I have a natural aversion to benchmark specs - such as memory bandwidth - that are almost always irrelevant. But the speed of the processor does actually seem to mean something for our application.

Also, is multithreading still a useful feature? It does appear to be with Intel processors. This is my trusty 12-core:


----------



## mauriziodececco (Oct 21, 2021)

Some more information to complement what has already been said: the M1 family uses the idea of "unified memory" - nothing especially new, but especially well done in their architecture.

The memory is shared between all the CPU cores, and between the CPUs and the other computing elements: GPU, Neural Engine, I/O, etc. So the memory bandwidth is shared between all these elements.

In many multicore chips, performance does not scale with the number of cores because it becomes limited by the shared memory bandwidth.

So, yes, memory bandwidth does affect performance, but more as a limit when it is not enough, or an enabler when it is, than directly; it is a very important parameter in the design of an architecture, but not the only one.
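A toy model of that limit (the per-core figures below are invented for illustration, not measured numbers): aggregate throughput stops scaling once the cores' combined demand hits the shared-bandwidth ceiling.

```python
def effective_bandwidth_gb_s(n_cores: int, per_core_demand_gb_s: float,
                             shared_bandwidth_gb_s: float) -> float:
    """Toy model: cores scale until their combined demand hits the shared ceiling."""
    return min(n_cores * per_core_demand_gb_s, shared_bandwidth_gb_s)

# Hypothetical 25 GB/s per core against a 200 GB/s shared bus:
print(effective_bandwidth_gb_s(6, 25, 200))   # 150 - still scaling with cores
print(effective_bandwidth_gb_s(10, 25, 200))  # 200 - capped; extra cores add nothing
```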

The M1's memory bandwidth wouldn't have been enough for the M1 Pro or M1 Max CPUs and GPUs.
The difference in memory bandwidth between the Pro and the Max was included to support the higher number of GPU cores and other video hardware.

So it is unlikely that the difference in memory bandwidth between the Pro and the Max will make any difference for applications that are light on the GPU, like ours, *unless* the M1 Pro's memory bandwidth is not "enough".

Anyway, any difference in performance would be visible only in applications loading all the CPU cores very hard (a situation that may occur with a DAW); I think in the coming months we will know if this is the case.

Note that on the Apple.com site there are examples of comparative performance between the M1 Pro, the M1 Max and the old Intel versions, and for Logic Pro they do *not* advertise a performance difference, so there is probably none.

Maurizio


----------



## Nick Batzdorf (Oct 22, 2021)

mauriziodececco said:


> for Logic Pro they do *not* advertise a performance difference, so there is probably none.


They do say it's optimized for the new MacBook Pros, but that could mean many things.






Logic Pro release notes - "Learn about the enhancements and improvements in the most recent versions of Logic Pro." (support.apple.com)


----------



## khollister (Oct 22, 2021)

Nick Batzdorf said:


> I also haven't seen any GHz specs for the processing cores.
> 
> Yes, computer manufacturers would love to get away from that, plus I have a natural aversion to benchmark specs - such as memory bandwidth - that are almost always irrelevant. But the speed of the processor does actually seem to mean something for our application.
> 
> Also, is multithreading still a useful feature? It does appear to be with Intel processors. This is my trusty 12-core:


Clock speed is only a useful metric when comparing CPUs with similar Instructions Per Cycle (IPC). I believe the M1's nominal clock is 3.2 GHz, but that means nothing relative to an i9 clock. It's the same reason an i9-11900K is faster than an i9-10900K at the same clock speed - higher IPC.
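As a toy illustration of clock times IPC (the IPC figures below are invented for the example, not real chip measurements):

```python
def relative_perf(clock_ghz: float, ipc: float) -> float:
    """Toy single-thread model: work per second ~ clock rate x instructions per cycle."""
    return clock_ghz * ipc

# Invented IPC figures, purely for illustration
wide_low_clock = relative_perf(3.2, 6.0)     # wide core at a modest clock
narrow_high_clock = relative_perf(5.0, 3.5)  # narrower core at a high clock
print(wide_low_clock > narrow_high_clock)    # True - higher IPC beats higher clock here
```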

And while we're on the subject of M1 performance, I noticed something really interesting today. I use a UA Apollo Solo (TB3) with my M1 and a pair of UA Apollo X's (X6 + X8) with my iMac Pro. At a 64-sample buffer at 48k, the Apollo/M1 combo reports a round-trip latency (RTL) in Logic of 4.9ms; the Apollo/iMP combo is 6.0ms. I need to disconnect the X's from the iMac and try the Solo to make sure it's not some HW variable, but the same version of the UA drivers is on both machines with no DSP plugins loaded.


----------



## khollister (Oct 22, 2021)

An update on the latency observations. As it turns out, the difference between my M1 and iMP has nothing to do with the architecture but with the specific Apollo HW. When I connected the Solo to the iMP, it reported 4.9ms - exactly the same as the M1. I presume the difference is due to more DSP in the rack units and/or slightly different converters. Kind of off topic here, but I thought I should tie off any questions.


----------



## mauriziodececco (Oct 23, 2021)

Nick Batzdorf said:


> They do say it's optimized for the new MacBook Pros, but that could mean many things.
> 
> 
> 
> ...


Yes, I mean no differences between the M1 Pro and M1 Max; of course they talk of differences with respect to the M1.


----------



## Nick Batzdorf (Oct 23, 2021)

mauriziodececco said:


> Yes, I mean no differences between the M1 Pro and M1 Max; of course they talk of differences with respect to the M1.



Hopefully the increased power translates to more Logic!


khollister said:


> difference between my M1 and iMP


Sorry, WTF is an iMP?

There should BAFL that abbreviations be OFAE.

EDIT: I figured it out - iMac Pro.

Oy!


----------



## Nick Batzdorf (Oct 23, 2021)

khollister said:


> Clock speed is only a useful metric when comparing CPUs with similar Instructions Per Cycle (IPC).


So they say, but I'm not buying it completely (based on real-world reports from random strangers on the Internet).


----------



## Dewdman42 (Oct 23, 2021)

khollister said:


> An update on the latency observations. As it turns out, the difference between my M1 and iMP has nothing to do with the architecture but with the specific Apollo HW. When I connected the Solo to the iMP, it reported 4.9ms - exactly the same as the M1. I presume the difference is due to more DSP in the rack units and/or slightly different converters. Kind of off topic here, but I thought I should tie off any questions.


Latency is not a direct result of CPU speed. Latency is related to the buffer size, period. No CPU will make a given buffer size any less latent than another CPU. The buffer size determines the latency, end of story.

For example, if the buffer size is 64 samples, that represents 1.33ms of time at a 48k sample rate.

Except that some device drivers double the buffer... so now you're up to 2.66ms. And some device drivers add other safety buffers on top of that. So it also depends on the specific audio device you are using and how its drivers are implemented. Yes, the software programming can make a difference - they are not all created equal - and it also matters whether you are using the PCI bus, USB, etc., which can force the device developer to add safety buffers (with more latency). Lynx and RME products, for example, are well regarded for very low latency and making good use of small buffers on the PCI bus.

But in any case, a faster CPU won't make a given buffer size any less latent.

Now... faster CPUs can sometimes handle smaller buffer sizes with fewer dropouts, so there is that. You may be able to routinely run a 32-sample buffer with a certain CPU, but with another CPU, no dice. But I think the specific audio device you use has a much bigger impact on your latency results.

With regard to Apple Silicon, it would be interesting to see some actual comparison tests between various chips to see which can handle a 32-sample buffer - or god forbid a 16-sample buffer - without dropouts. But the only way to really do that kind of comparison is apples to apples: change the CPU while consistently using the same audio device and drivers, so that the CPU is being compared and not the audio device.
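The buffer arithmetic above is easy to check; a quick sketch (the doubling is an example of what a driver *might* add, not any specific device's behaviour):

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Latency of one buffer in milliseconds: samples / rate."""
    return buffer_samples / sample_rate_hz * 1000

one_way = buffer_latency_ms(64, 48_000)
print(round(one_way, 2))      # 1.33 - one 64-sample buffer at 48 kHz

# A driver that double-buffers doubles it...
print(round(2 * one_way, 2))  # 2.67
```

No CPU changes these numbers; the CPU only affects whether you can run the smaller buffer without dropouts.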


----------



## mauriziodececco (Oct 26, 2021)

On the subject, there is an interesting article on AnandTech - pretty technical but very complete.
Actually, the 10 CPU cores alone *can* saturate the memory bandwidth on an M1 Pro, but not on an M1 Max.
No idea if a DAW can actually do that; that's another story.
But if I had to do something like combined video and audio editing in something like DaVinci Resolve, I would absolutely get an M1 Max.

Here is the link to the article 

Maurizio


----------



## Al Maurice (Oct 26, 2021)

I tend to agree with @Dewdman42: the important factor now comes down to the buffering technology, not the memory read or write speeds alone. So I'd check that out, to be honest - it will probably have more impact on audio and MIDI performance.


----------



## seclusion3 (Oct 26, 2021)

Let’s hope we can set Logic to its lowest 32-sample buffer setting and never speak of that setting again, regardless of the project size.


----------



## gsilbers (Oct 26, 2021)

Dumb question, but why isn’t the memory inside the chip and as small as the other components?
From images it looks like it’s on the sides and as big as the whole chip.
But they were able to fit some crazy GPU specs in there, yet memory still seems like a big clunker on the sides, and the chip has a buffer for it.


----------



## rnb_2 (Oct 26, 2021)

gsilbers said:


> Dumb question, but why isn’t the memory inside the chip and as small as the other components?
> From images it looks like it’s on the sides and as big as the whole chip.
> But they were able to fit some crazy GPU specs in there, yet memory still seems like a big clunker on the sides, and the chip has a buffer for it.


RAM is a commodity component provided by third parties with existing manufacturing capacity, so you let them make it and supply what you need. On top of that, there are limits to how big a chip anybody wants to manufacture, since die size reduces the number of processors you can get from a single wafer - the M1 Pro is a fairly large chip, and the M1 Max is bigger still.


----------



## Technostica (Oct 26, 2021)

gsilbers said:


> Dumb question, but why isn’t the memory inside the chip and as small as the other components?
> From images it looks like it’s on the sides and as big as the whole chip.
> But they were able to fit some crazy GPU specs in there, yet memory still seems like a big clunker on the sides, and the chip has a buffer for it.


RAM is fabricated on different process nodes and production lines.
Not everything can be scaled as small as the logic circuits or cache in SoCs.
Maybe RAM chips could be a lot smaller, but then the cost might be too high!
They are commodity items which are produced in vast quantities which helps with price and supply.
It can be very expensive to shrink designs as you need new tooling.
So you only see it done when the benefits outweigh the cost.
For CPUs and GPUs the benefit is there.
I doubt it is for RAM.
Apple are seemingly using industry standard memory and storage chips.
I am not sure that even they want the expense of designing and manufacturing their own.


----------

