It basically takes all the input parameters and adds a whole layer of neurons on top of them, then trains that network with standard gradient descent. After that it does a 'top k' pass: only the k neurons with the lowest error in testing are kept and the rest are trashed. Then it repeats the whole process until some stop condition is met.
I wasn't thinking about it much, but that's probably very similar to what this paper is doing, albeit done differently, in a way that doesn't require changing how PyTorch works.
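For anyone who wants to see the grow, train, prune loop from the comment above in concrete form, here is a rough sketch in plain Python. It is not the paper's method and not the original tool's code. Several things are assumptions made only for illustration: each candidate neuron is a tanh unit trained on its own against the target (rather than training the whole network jointly), "lowest error in testing" is taken to mean each neuron's mean squared error on a held-out set, the kept neurons' outputs become the inputs of the next round, and the stop condition is "the new layer no longer improves the best test error by `tol`". The names `grow_and_prune`, `train_neuron`, `width`, `k`, `max_layers` and `tol` are all hypothetical.

```python
import math
import random


def neuron_output(neuron, x):
    """Output of one tanh neuron; `neuron` is a (weights, bias) pair."""
    w, b = neuron
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)


def mse(neuron, xs, ys):
    """Mean squared error of a single neuron's output against the targets."""
    return sum((neuron_output(neuron, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_neuron(xs, ys, rng, steps=200, lr=0.1):
    """Fit one tanh neuron to the targets with plain batch gradient descent."""
    n_in = len(xs[0])
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_in)]
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_in
        grad_b = 0.0
        for x, y in zip(xs, ys):
            out = neuron_output((w, b), x)
            # Gradient of the squared error w.r.t. the pre-activation.
            g = 2.0 * (out - y) * (1.0 - out * out)
            for i in range(n_in):
                grad_w[i] += g * x[i]
            grad_b += g
        w = [wi - lr * gi / len(xs) for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b


def grow_and_prune(train, test, width=8, k=3, max_layers=3, tol=1e-4, seed=0):
    """Add a layer, train it, keep the top k neurons by test error, repeat."""
    rng = random.Random(seed)
    (x_train, y_train), (x_test, y_test) = train, test
    layers, best_err = [], float("inf")
    for _ in range(max_layers):
        # Grow: a fresh layer of candidate neurons on the current inputs.
        candidates = [train_neuron(x_train, y_train, rng) for _ in range(width)]
        # Prune: keep the k candidates with the lowest test error.
        candidates.sort(key=lambda n: mse(n, x_test, y_test))
        kept = candidates[:k]
        err = mse(kept[0], x_test, y_test)
        # Assumed stop condition: the new layer no longer helps.
        if best_err - err < tol:
            break
        layers.append(kept)
        best_err = err
        # The survivors' outputs become the inputs of the next round.
        x_train = [[neuron_output(n, x) for n in kept] for x in x_train]
        x_test = [[neuron_output(n, x) for n in kept] for x in x_test]
    return layers, best_err
```

Calling `grow_and_prune((x_train, y_train), (x_test, y_test))` returns the list of kept layers and the best test error seen. Targets need to sit inside tanh's range of (-1, 1) for this toy version to fit them. A real version would more likely build the layer in PyTorch and train it end to end, but the control flow would look much the same.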