
AI - Next steps we can take [SPINOFF for non "next steps" conversations]

Everything sounds somewhat the same, so the novelty wears off fast. I think it's currently just a fun toy. Maybe it's enough for elevator music.
It can do a wide range of styles:

... as well as Chinese opera....
 
The reason it sucks now, and may continue to suck in the future even at high levels, is that it forgoes the process for the output.

The magic is in the process, not the final output. The individual notes, dynamics and intricacies of recordings. The performance capturing that moment in time. Writing something that makes the human composer happy. The computer doesn’t care, and that’s why it’s no fun to listen to behind a gimmicky lyric song.

Yes, of course some types of music will be lost to AI. But unless the creators of these programs can give the process back to humans and creators, it'll get stuck relatively soon. Thankfully music is currently at the bottom of the AI totem pole, below text and image.

Also, giant conglomerates will sue the shit out of them for stealing all content in existence. But that's an entirely different discussion.
 
Looks like Suno is going to be killed off by competition already.

Supposedly Udio's release date is April 20th.
 
I haven't played with Suno yet, but if the Model is not very responsive to prompting, that's a problem with the training data.
This will be a little speculative. I guess the training data consists of songs/tracks and their descriptions, based on the tags and the title/comments the author provided. However, most of the time this will be something like "Epic, Orchestral, Emotional, Soundtrack". This is very limited, and a lot of tracks will share a similar description. But imagine if the tracks were described like: "The piece starts with a slow rhythm played by the taiko drums. Next the violins start to play an ostinato pattern in D minor, harmonized by cellos and doubled by basses. At bar 20 the french horn starts playing an emotional melody with the low brass and second violins playing slow chords."
The more detail in the description, the better the Model can find the individual elements of the music. If the training data gets this level of description, the Model will generate more variety and have better prompt responsiveness. It is very hard to describe and quantify music. Actually, the best descriptions for music we have right now are the notation of classical music. With a full score you can easily see what instruments are playing and what the rhythm, harmony, structure and overall emotion are. Those are not text descriptions, because notation is more of a graphical language, but they are very good descriptions.
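To make this concrete, here is a hypothetical sketch of what such a detailed per-track annotation could look like (field names and values are invented for illustration, not any real dataset schema), along with how a flat training prompt could be derived from it:

```python
# Hypothetical per-track annotation with the level of detail the post
# argues training data would need. Field names and values are invented.
detailed_caption = {
    "tags": ["Epic", "Orchestral", "Emotional", "Soundtrack"],
    "key": "D minor",
    "events": [
        {"bar": 1,  "text": "slow rhythm played by the taiko drums"},
        {"bar": 5,  "text": "violins play an ostinato pattern, harmonized by cellos, doubled by basses"},
        {"bar": 20, "text": "french horn plays an emotional melody over slow chords in the low brass and second violins"},
    ],
}

# A flat text prompt for training can then be derived from the record:
prompt = "; ".join(f"bar {e['bar']}: {e['text']}" for e in detailed_caption["events"])
```

The structured form is the point: tags alone collapse thousands of tracks into the same description, while time-stamped events keep the individual musical elements recoverable.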
What is holding back the quality of current Text-to-Music Models right now is the lack (as far as I know) of good Music-to-Text Models. Current Text-to-Image and Image-to-Text Models can train one another. You generate a random prompt, then you generate an image based on that prompt, and then a description of that image. Finally you compare the prompt with the description: if they match, you keep the Model parameters; if not, you update the parameters and start again. After you "kickstart" the training with real-world data, at some point you can continue the training only on synthesized data. And one can argue that after several iterations of this method, there is no real-world (and copyrighted) data inside the Models anymore.
The moment a good Music-to-Text Model appears, all Text-to-Music Models will skyrocket.
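The bootstrapping loop described above can be sketched in code. Everything here is a toy stand-in (the "models" are trivial functions and the parameter update is symbolic), purely to show the control flow of the prompt → generate → describe → compare cycle:

```python
import random

# Toy stand-ins for the models in the loop. In reality these would be
# large neural networks; here they are trivial functions so that the
# control flow of the bootstrapping idea is runnable.

def sample_prompt():
    """Generate a random text prompt (hypothetical prompt generator)."""
    moods = ["epic", "calm", "dark"]
    instruments = ["taiko drums", "violins", "french horn"]
    return f"{random.choice(moods)} piece featuring {random.choice(instruments)}"

def text_to_music(prompt):
    """Toy Text-to-Music model: returns 'audio' tagged with its prompt."""
    return {"audio": f"<waveform for: {prompt}>", "source_prompt": prompt}

def music_to_text(track):
    """Toy Music-to-Text model: describes the generated track.
    Here it is a perfect captioner, purely for illustration."""
    return track["source_prompt"]

def training_step(params, learning_rate=0.1):
    """One iteration of the self-supervised loop described in the post:
    prompt -> music -> description -> compare -> keep or update."""
    prompt = sample_prompt()
    track = text_to_music(prompt)
    description = music_to_text(track)
    if prompt == description:
        return params, True                 # descriptions match: keep parameters
    return params - learning_rate, False    # mismatch: (symbolically) update

params = 1.0
for _ in range(5):
    params, kept = training_step(params)
```

The interesting property is the second half of the argument: once the loop runs on synthesized prompt/description pairs, the real-world training data only acted as the kickstart.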
 
And another thing: I get the feeling Spotify etc. are going to be absolutely bombarded with AI music over the next few months. Broadcasters are probably waiting cautiously to see how things shake out legally before looking into using this stuff.

 
In my opinion, that's a valid use-case for this. Get inspired and then make your music out of it.
Especially if you're also editing the melody and rhythm, making creative choices about instrumentation, vocals, and effects, combining different Suno renders, different subgenre combinations or creative prompts, etc. If the melody is edited substantially enough, it will qualify for copyright in the US (though it's not clear exactly how much it would have to be edited... I don't think there's any set standard in non-AI copyright cases for changing an existing public domain melody to claim copyright on it, or for editing an existing copyrighted melody enough to avoid infringement and make your own copyright claim).
 
A lot of these demos coming out have the sonic imprint of audio that has been unmixed via stem separation… I wonder if there's really any way to improve on that if they're using music by established artists to train these models, given that cleanly separated parts from the actual stems aren't publicly available?
 
Where does the AI obtain its instrument and vocal sounds? Is it "stealing" from actual music previously created, or even from sample libraries (I wonder what developers think about this)?
 
Mike's right. The people I'm talking to are the evangelists, not the business side of this equation.
Then that's an entirely new level of naivety.
Generative A.I. is just replacing human work; it's not solving our most urgent problems like climate change, the housing crisis, deadly diseases, social inequities, etc.

All it does is remove humans from art creation. How does that make for a better world?
 
That's quite a statement considering today nobody knows how to get to AGI.
It's also quite the statement to assert the exact nature of the infrastructure required to run AGI

And when it happens, it will be a super monster computer. Imagine a full data center powered by its own power plant. It will be extremely expensive to run. I seriously doubt it will be used to power chat bots and generate images/music.
 
It's also quite the statement to assert the exact nature of the infrastructure required to run AGI
I was obviously speculating on that but we know for sure it's going to require a lot of compute. And unless a miracle happens, compute needs a lot of power. Either for CPUs or cooling.

Kurzweil estimated something like 10^16 flops for the human brain.

We already have exascale supercomputers that have surpassed this (10^18 flops), and AGI is nowhere to be seen. Even the most efficient of those already consumes 40 MW, which is equivalent to the energy consumed by 30,000 homes. See the attached image.

So it looks like we're going to need a lot more compute.

In a decade or so we could get to zettascale supercomputers, which would require multiple nuclear power plants to operate. BTW, that's a claim by AMD's CEO.
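As a sanity check, the arithmetic behind these figures can be laid out explicitly (a rough sketch; all numbers are the estimates quoted in this post, and the ~20 W biological brain figure is a common textbook approximation, not from the post):

```python
# Back-of-the-envelope check of the figures quoted above. All numbers
# are the post's rough estimates, not measurements.

brain_flops = 1e16       # Kurzweil's estimate for the human brain
exascale_flops = 1e18    # current exascale supercomputers
power_mw = 40            # most efficient exascale machine, per the post

# Exascale already exceeds the brain estimate by a factor of 100...
margin = exascale_flops / brain_flops

# ...yet even divided by that factor, each "brain-equivalent" of compute
# draws ~400 kW, versus roughly 20 W for a biological brain.
watts_per_brain_equivalent = power_mw * 1e6 / margin

# Naively scaling the same efficiency up 1000x to zettascale gives
# ~40 GW, i.e. dozens of ~1 GW nuclear plants; real hardware improves
# per-flop efficiency over time, hence "multiple plants" rather than dozens.
naive_zettascale_gw = power_mw * 1000 / 1000
```

So even granting the flops estimate, the efficiency gap versus biology is around four orders of magnitude, which is the core of the "a lot more compute and power" point.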


Yes, a miracle could happen with quantum computing or fusion power but today nobody really knows how this is going to turn out and in which time frame.
 

Attachments
  • 1712786454075.png (1.1 MB)
Coders or people spearheading companies like these tend to be dreamers, motivated by challenges
They're also often cloistered, unworldly nerds. Sorry, but let's be real here. Yes, they're motivated by challenges, like all of us, but to them a challenge is something like "how can we get the music we need for this production without having to interact socially with other humans?". They see music / art in general as a problem to be solved with coding.
 
I was obviously speculating on that but we know for sure it's going to require a lot of compute.
Wasn’t obvious to me. In fact, you appeared 100% certain. As you do with most of your Luddite views on AI, it has to be said.

Quantum AI may only be a decade away
 
I can’t get my SynthV vocal VST to sound this good…

But some kid with likely no (or limited) musical skills can pump this out with a few words.

I agree with what some others have said. The average person just wants background music while they do other things. If AI can make good-enough music, they’ll be happy to listen to it.

Lots of money involved here. I think it's doubtful this can be stopped, even with all the lawsuits. Music (and art in general) may end up as a hobby and nothing more for the vast majority of people. Live music will be less impacted, of course.

In the end, was social media good for kids? I think AI might be far worse. I work full time in IT, but I’m liking less and less what it’s doing to humanity.
 
I'm guessing it's more naivety, at least with the people Simon would be talking with. Coders or people spearheading companies like these tend to be dreamers, motivated by challenges rather than pure profits. Both Google and OpenAI started with very honorable mission statements. They evolved, of course, but they started as dreamers, "make the world a better place" and all that.

"don't do evil" and such.... That went well...
Even if initially (still difficult to believe) it really may be a genuine feeling, honestly it takes a fistful of dollars for them to change in just a few years.
From "help connecting people" to "help inciting hate world wide for ad money" was a really short way.

After the first decades of tech experience, I think we should learn not to trust these particular "dreamers".
Giving them too much power and letting them dodge the rules is a mistake.

The law should move swiftly with the tech while it is still developing and in its infancy, I think.
 