Conversation
iced depresso (icedquinn@blob.cat)'s status on Sunday, 18-Feb-2024 16:08:59 JST: huh. neat.
reading this paper on sparse neural network training. they mention trying to track which parts of each layer tend to get used and which don't.
this is what an old Ukrainian scientist did when he made something called GMDH. he didn't build a huge mesh and mask it out; instead the method grows the network layer by layer, prunes it, and continues.
i discovered this because some obscure Mac developer uses this method in software he sells to financial forecasting professionals.
iced depresso (icedquinn@blob.cat)'s status on Sunday, 18-Feb-2024 16:11:11 JST: it basically takes all the input parameters and adds a whole layer of neurons. then it runs that network through your standard gradient descent stuff. it then does a 'top k' pass, so only the top k neurons (basically the ones with the lowest error on held-out data) are kept and the rest are trashed. then it repeats the process, until some stop condition.
wasn't thinking about it much, but that's probably pretty close to what this paper is doing, just approached differently, in a way that doesn't require changing how pytorch works.
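a minimal sketch of that grow/train/prune loop, just to make it concrete. this is not the paper's method and not Ivakhnenko's actual GMDH (which classically uses pairwise polynomial units): it assumes plain numpy, linear least-squares candidate neurons over random pairs of the current features, and k, n_candidates, and max_layers are illustrative guesses.

import numpy as np

rng = np.random.default_rng(0)

def fit_candidate(X_tr, y_tr, X_val, y_val):
    # fit one candidate neuron as a linear least-squares unit and
    # score it by mean squared error on held-out data
    w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    out = lambda X: X @ w
    return np.mean((out(X_val) - y_val) ** 2), out

def grow_and_prune(X_tr, y_tr, X_val, y_val, k=4, n_candidates=16, max_layers=5):
    feats_tr, feats_val = X_tr, X_val
    best_err = np.inf
    for _ in range(max_layers):
        # grow: build a whole layer of candidate neurons, each seeing a
        # random pair of the current features
        candidates = []
        for _ in range(n_candidates):
            cols = rng.choice(feats_tr.shape[1],
                              size=min(2, feats_tr.shape[1]), replace=False)
            err, out = fit_candidate(feats_tr[:, cols], y_tr,
                                     feats_val[:, cols], y_val)
            candidates.append((err, cols, out))
        # prune: keep only the top-k candidates (lowest held-out error)
        candidates.sort(key=lambda c: c[0])
        kept = candidates[:k]
        if kept[0][0] >= best_err:
            break  # stop condition: the new layer didn't improve anything
        best_err = kept[0][0]
        # the surviving neurons' outputs become the next layer's inputs
        feats_tr = np.column_stack([out(feats_tr[:, cols]) for _, cols, out in kept])
        feats_val = np.column_stack([out(feats_val[:, cols]) for _, cols, out in kept])
    return best_err

usage would look something like this on a toy regression problem:

X = rng.normal(size=(200, 6))
y = X[:, 0] * 2.0 - X[:, 3] + rng.normal(scale=0.1, size=200)
print(grow_and_prune(X[:150], y[:150], X[150:], y[150:]))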
iced depresso (icedquinn@blob.cat)'s status on Sunday, 18-Feb-2024 16:12:20 JST: @Vo yeah, it's fascinating: it's a sparse deep learning model from ages past. i think he invented it in the 90s or something and it was long forgotten.
one of those things that reminds you Slavs are very smart actually
Vo (vo@noauthority.social)'s status on Sunday, 18-Feb-2024 16:12:21 JST: @icedquinn >Pruning: GME...