Conversation

iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:36:51 JST

somewhat amused that gaussian mixture models are still relevant.

a group figured out in 2021 how to train them with gradient descent instead of expectation maximization. so basically no more manual feature alignment and all that jazz, and the distributions are even better tuned.

you still pay out the ass to run gradient descent, but you don't pay out the ass for inference (because GMMs are cheap :ablobcatrainbow:)
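
To make the idea concrete, here is a minimal sketch of fitting a 1-D Gaussian mixture by plain gradient descent on the negative log-likelihood instead of EM. It is not the 2021 method mentioned above, just a toy illustration. The function names, the softmax/log-std parameterization, the quantile initialization, and the learning rate are all assumptions made for this example.

```python
import math
import random


def softmax(logits):
    """Turn unconstrained logits into mixture weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def component_log_probs(x, logits, means, log_stds):
    """log(pi_k * N(x | mu_k, sigma_k^2)) for every component k."""
    out = []
    for w, mu, s in zip(softmax(logits), means, log_stds):
        z = (x - mu) / math.exp(s)
        out.append(math.log(w) - s - 0.5 * z * z - 0.5 * math.log(2 * math.pi))
    return out


def mean_nll(data, logits, means, log_stds):
    """Average negative log-likelihood of the data under the mixture."""
    total = sum(logsumexp(component_log_probs(x, logits, means, log_stds)) for x in data)
    return -total / len(data)


def gradient_step(data, logits, means, log_stds, lr):
    """One full-batch gradient descent step with hand-derived gradients."""
    k, n = len(means), len(data)
    weights = softmax(logits)
    g_logits, g_means, g_log_stds = [0.0] * k, [0.0] * k, [0.0] * k
    for x in data:
        lp = component_log_probs(x, logits, means, log_stds)
        norm = logsumexp(lp)
        for j in range(k):
            r = math.exp(lp[j] - norm)          # responsibility of component j for x
            var = math.exp(2.0 * log_stds[j])
            d = x - means[j]
            g_logits[j] -= r - weights[j]
            g_means[j] -= r * d / var
            g_log_stds[j] -= r * (d * d / var - 1.0)
    return (
        [a - lr * g / n for a, g in zip(logits, g_logits)],
        [m - lr * g / n for m, g in zip(means, g_means)],
        [s - lr * g / n for s, g in zip(log_stds, g_log_stds)],
    )


def fit_gmm(data, k=2, steps=300, lr=0.1):
    """Fit a k-component 1-D GMM; returns (logits, means, log_stds)."""
    ordered = sorted(data)
    logits = [0.0] * k
    # Assumption: start the means at evenly spaced quantiles of the data.
    means = [ordered[int((i + 0.5) / k * len(ordered))] for i in range(k)]
    log_stds = [0.0] * k
    for _ in range(steps):
        logits, means, log_stds = gradient_step(data, logits, means, log_stds, lr)
    return logits, means, log_stds


# Toy data: two well-separated clusters (made up for this demo).
rng = random.Random(0)
data = [rng.gauss(-3.0, 0.5) for _ in range(150)] + [rng.gauss(3.0, 1.0) for _ in range(150)]
logits, means, log_stds = fit_gmm(data)
print("weights:", softmax(logits))
print("means:", means)
print("stds:", [math.exp(s) for s in log_stds])
```

Training loops over the whole dataset at every step, which is the expensive part. Scoring a new point afterwards is a single call to `component_log_probs`, which is why inference stays cheap.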

In conversation about a year ago from blob.cat permalink
- iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:39:49 JST
  
  i also rly want to know if we can use wavelets on sound instead of this mel-cepstrum shit, because wavelets are invertible (versus having to train these weird mel-cepstrum reverse synthesis things that add all kinds of artifacts.)
  
  i read a few papers that were looking at this question for physics problems and... wavelets do very well with neural networks. in some cases they entirely replaced the need for convolution kernels, because they already convolve frequency/temporal space, and were detecting gravity wave anomalies and such with it.
  
  wavelet->gmm->wavelet is attractive because this is potato level technology. you can run it all day on your 10 year old thonkpad.
  
- iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:42:15 JST
  
  would love to but will likely never write the code to have it run on those larynx features too :ablobcatattention:
  
  i don't know how to make that shit work though. thor and paul might have known. only thing that comes to mind is to try and make the larynx synth differentiable somehow, or use a genetic solver and just leave it to run on a pine quartz for a week.
  
- iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 12:47:09 JST
  
  i'm probably one of the only people that wants voice cloning just to read books and text messages and very little harmful intention at all :ablobcatgooglytenor:
  
- Miander (mian@berserker.town)'s status on Monday, 09-Oct-2023 14:33:27 JST
  
  @icedquinn
  It's kinda sad that only Spaghetti Monster AI (NNs) gets attention when there is a whole zoo of models to use and learn about.
  
  iced depresso likes this.
- iced depresso (icedquinn@blob.cat)'s status on Monday, 09-Oct-2023 14:36:39 JST
  in reply to Miander
  
  @mian that's where the research funding is rn.
  
  i've seen some interesting tricks done using gradient descent and such though. i'd love to have things like NVIDIA's model optimizer fixed up a bit to work in realistic timeframes, or a system for making decent old HMM voice machines.
  
  being able to synthesize character voices in text chats has been an age old dream :blobcatpain:
  
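
On the point raised earlier in the thread that wavelets are invertible, a rough sketch with the simplest wavelet (Haar) shows what that means in practice: the forward transform followed by the inverse gives back the original samples up to floating-point error, with no learned reconstruction step. This is only an illustration. Haar is an assumption chosen for brevity, the function names are made up for this example, and real audio work would likely use a smoother wavelet family from an existing library.

```python
import math

SQRT2 = math.sqrt(2.0)


def _check_length(n):
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")


def haar_forward(signal):
    """Full multi-level orthonormal Haar transform of a power-of-two-length list."""
    _check_length(len(signal))
    coeffs = list(signal)
    length = len(coeffs)
    while length > 1:
        half = length // 2
        # Pairwise scaled sums (coarse approximation) and differences (detail).
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:length] = approx + detail
        length = half
    return coeffs


def haar_inverse(coeffs):
    """Undo haar_forward by rebuilding each level from coarse to fine."""
    _check_length(len(coeffs))
    out = list(coeffs)
    length = 2
    while length <= len(out):
        half = length // 2
        approx, detail = out[:half], out[half:length]
        for i in range(half):
            out[2 * i] = (approx[i] + detail[i]) / SQRT2
            out[2 * i + 1] = (approx[i] - detail[i]) / SQRT2
        length *= 2
    return out


# Toy signal (made up): a short sine burst.
samples = [math.sin(2.0 * math.pi * i / 8.0) for i in range(16)]
restored = haar_inverse(haar_forward(samples))
print("max reconstruction error:", max(abs(a - b) for a, b in zip(samples, restored)))
```

In a wavelet->gmm->wavelet pipeline, any model would sit between `haar_forward` and `haar_inverse`; whatever it outputs in coefficient space maps back to samples through the exact inverse.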
