Conversation
iced depresso (icedquinn@blob.cat)'s status on Thursday, 14-Mar-2024 19:46:40 JST
@tero i think this is something of an artifact of how most people are doing it. i passed by a paper suggesting neural networks *are* decision trees, and some of the really old work on sparse networks (GMDH comes to mind) made attempts to winnow connections down to a useful subset at various layers.
it's kind of the modern "google-brained" thing to just slap dense nets connecting every parameter to every parameter and throw a datacenter at the problem to make up for how inefficient this is.
Tero Keski-Valkama (tero@rukii.net)'s status on Thursday, 14-Mar-2024 19:46:41 JST
I don't think I have written it down anywhere, but deep neural networks have a unit bias. It's one of those things one just knows and doesn't even realize other people don't.
It's the reason why tree-based models generally beat deep neural networks on tabular data. For decision tree based models, one of the basic operations is a comparison, and those operations generally have meaning with tabular data. The model has correct inductive biases that respect the units of things. It doesn't sum ages to countries.
With neural networks the first operation is a weighted sum over *everything*. In tabular data that means ages, religions, countries, weights, dollars. What's 7 years plus 21 dollars? Does it make any sense? No?
That's why neural networks have a hard time with tabular data. Their inductive bias assumes that everything is in the same units.
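A minimal sketch of the contrast described above, using only NumPy and a hypothetical two-column table (ages in years, income in dollars; all names and thresholds here are illustrative, not from the original posts):

```python
import numpy as np

# Hypothetical tabular rows: [age_years, income_dollars]
X = np.array([[7.0, 21.0],
              [35.0, 50000.0]])

# A dense layer's very first operation is a weighted sum across *all*
# columns, which implicitly adds years to dollars: 0.5*7 + 0.5*21.
w = np.array([0.5, 0.5])
mixed = X @ w  # unit-mixing: the result has no meaningful unit

# A decision-tree split, by contrast, only compares one feature against
# a threshold in that feature's own units (an age against an age).
def tree_split(row, feature=0, threshold=18.0):
    return "child" if row[feature] < threshold else "adult"
```

The comparison `row[feature] < threshold` never combines columns, which is the unit-respecting inductive bias the post attributes to tree models.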