Addendum 1
A Theory for Emergence of Complex Skills in Language Models
https://arxiv.org/abs/2307.15936
* new skills emerge in language models when their parameter sets and training corpora are scaled up
* the phenomenon is poorly understood; mathematical analysis of gradient-based training is difficult
* the paper analyzes emergence using scaling laws and a simple statistical framework
* the mathematical analysis implies a strong form of inductive bias that allows the pre-trained model to learn very efficiently