Conversation
iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:36:51 JST
somewhat amused that gaussian mixture models are still relevant.
a group figured out in 2021 how to train them with gradient descent instead of expectation maximization. so basically no more manual feature alignment and all that jazz, and the distributions are even better tuned.
you still pay out the ass to run gradient descent, but you don't pay out the ass for inference (because GMMs are cheap :ablobcatrainbow:)
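A minimal sketch of the idea in the post above, not the 2021 group's actual method: fitting a 1-D GMM by plain full-batch gradient ascent on the mean log-likelihood instead of EM. The name `fit_gmm_gd`, the softmax / log-sigma parameterization, the step count and the learning rate are all assumptions made for illustration.

```python
import math
import random


def fit_gmm_gd(xs, k=2, steps=1000, lr=0.1):
    """Fit a 1-D GMM by gradient ascent on the mean log-likelihood.

    Hypothetical helper for illustration, not taken from a paper or library.
    """
    lo, hi = min(xs), max(xs)
    # Spread the initial means evenly over the data range.
    mus = [lo + (hi - lo) * (j + 1) / (k + 1) for j in range(k)]
    log_sigmas = [0.0] * k  # sigma = exp(log_sigma) stays positive
    logits = [0.0] * k      # weights = softmax(logits) stay on the simplex
    n = len(xs)

    def weights():
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return [v / total for v in exps]

    for _ in range(steps):
        pis = weights()
        sigmas = [math.exp(s) for s in log_sigmas]
        g_mu, g_ls, g_lg = [0.0] * k, [0.0] * k, [0.0] * k
        for x in xs:
            # Unnormalized log(pi_j * N(x | mu_j, sigma_j)); the shared
            # -0.5*log(2*pi) term cancels in the responsibilities.
            zs = [(x - mus[j]) / sigmas[j] for j in range(k)]
            logp = [math.log(pis[j]) - log_sigmas[j] - 0.5 * zs[j] ** 2
                    for j in range(k)]
            top = max(logp)
            w = [math.exp(v - top) for v in logp]
            total = sum(w)
            for j in range(k):
                r = w[j] / total                   # responsibility of j for x
                g_mu[j] += r * zs[j] / sigmas[j]   # d logL / d mu_j
                g_ls[j] += r * (zs[j] ** 2 - 1.0)  # d logL / d log_sigma_j
                g_lg[j] += r - pis[j]              # d logL / d logit_j
        for j in range(k):
            mus[j] += lr * g_mu[j] / n
            log_sigmas[j] += lr * g_ls[j] / n
            logits[j] += lr * g_lg[j] / n

    return weights(), mus, [math.exp(s) for s in log_sigmas]


if __name__ == "__main__":
    rng = random.Random(0)
    data = ([rng.gauss(-2.0, 0.5) for _ in range(100)]
            + [rng.gauss(3.0, 0.5) for _ in range(100)])
    print(fit_gmm_gd(data))
```

The training loop is the expensive part. Inference afterwards is just evaluating k Gaussians per sample, which is the "cheap" side of the trade-off.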
iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:39:49 JST
i also rly want to know if we can use wavelets on sound instead of this mel-cepstrum shit, because wavelets are invertible (versus having to train these weird mel-cepstrum reverse synthesis things that add all kinds of artifacts.)
i read a few papers that were looking at this question for physics problems and... wavelets do very well with neural networks. in some cases they entirely replaced the need for convolution kernels, because they already convolve frequency/temporal space, and were detecting gravity wave anomalies and such with it.
wavelet->gmm->wavelet is attractive because this is potato level technology. you can run it all day on your 10 year old thonkpad.
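To make the "wavelets are invertible" point concrete, here is a small sketch of a multi-level orthonormal Haar transform and its exact inverse. Haar is only the simplest wavelet and is my choice for illustration; the post does not name one. The function names `haar_forward` and `haar_inverse` are hypothetical.

```python
import math


def haar_forward(signal):
    """Full multi-level orthonormal Haar transform.

    The input length must be a power of two. Returns
    [approximation, coarsest details, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(signal)
    s = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        # Pairwise scaled sums (approximation) and differences (detail).
        approx = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        out[:length] = approx + detail
        length = half
    return out


def haar_inverse(coeffs):
    """Undo haar_forward, rebuilding the signal from coarse to fine."""
    n = len(coeffs)
    out = list(coeffs)
    s = math.sqrt(2.0)
    length = 2
    while length <= n:
        half = length // 2
        rebuilt = []
        for a, d in zip(out[:half], out[half:length]):
            rebuilt.append((a + d) / s)
            rebuilt.append((a - d) / s)
        out[:length] = rebuilt
        length *= 2
    return out


if __name__ == "__main__":
    samples = [math.sin(0.3 * t) for t in range(16)]
    restored = haar_inverse(haar_forward(samples))
    print(max(abs(a - b) for a, b in zip(samples, restored)))
```

The round trip reproduces the input up to floating-point error, so no learned "reverse synthesis" step is needed to get back to samples. Anything that modifies the coefficients in between (a GMM, for instance) still determines what the inverse reconstructs.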
iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:42:15 JST
would love to but will likely never write the code to have it run on those larynx features too :ablobcatattention:
i don't know how to make that shit work though. thor and paul might have known. only thing that comes to mind is to try and make the larynx synth differentiable somehow, or use a genetic solver and just leave it to run on a pine quartz for a week.
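A rough sketch of the "genetic solver" option, under heavy assumptions: `toy_synth` is a made-up stand-in for a non-differentiable synth (a single sine partial with a frequency and an amplitude), not a larynx model, and `evolve` is a hypothetical elitist mutate-and-select loop. The point is only that the search needs nothing but a loss value, no gradients.

```python
import math
import random


def toy_synth(params, n=64):
    """Hypothetical black-box synth: one sine partial (freq, amp)."""
    freq, amp = params
    return [amp * math.sin(2 * math.pi * freq * t / n) for t in range(n)]


def loss(params, target):
    """Mean squared error between the synth output and the target."""
    out = toy_synth(params, len(target))
    return sum((a - b) ** 2 for a, b in zip(out, target)) / len(target)


def evolve(target, generations=200, pop_size=30, elite=6, sigma=0.2, seed=0):
    """Elitist evolutionary search; returns (best_params, best_loss_history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(1.0, 8.0), rng.uniform(0.0, 2.0)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: loss(p, target))
        history.append(loss(pop[0], target))
        parents = pop[:elite]  # survivors are carried over unchanged
        children = [[g + rng.gauss(0.0, sigma) for g in rng.choice(parents)]
                    for _ in range(pop_size - elite)]
        pop = parents + children
    best = min(pop, key=lambda p: loss(p, target))
    return best, history


if __name__ == "__main__":
    goal = toy_synth([3.0, 1.0])
    best, history = evolve(goal)
    print(best, history[0], history[-1])
```

Because the parents survive each generation, the best loss can never get worse, but nothing guarantees it reaches the global optimum. On a real synth each loss evaluation means rendering audio, which is why "leave it to run for a week" is a realistic budget.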
iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:47:09 JST
i'm probably one of the only people that wants voice cloning just to read books and text messages and very little harmful intention at all :ablobcatgooglytenor:
Miander (mian@berserker.town)'s status on Monday, 09-Oct-2023 14:33:27 JST
@icedquinn
It's kinda sad that only Spaghetti Monster AI (NNs) gets attention when there is a whole zoo of models to use and learn about.
iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 14:36:39 JST
@mian that's where the research funding is rn.
i've seen some interesting tricks done using gradient descent and such though. i'd love to have things like nvidia's model optimizer fixed up a bit to work in realistic timeframes, or a system for making decent old HMM voice machines.
being able to synthesize character voices in text chats has been an age old dream :blobcatpain: