@gerrymcgovern I always hate to see these stats blurred across different uses. Training the model is a one-time fixed cost. That output then runs on inference servers that do the customer-facing work. Those are still very power-hungry (hot), but the way the "each query costs... " claims really work is as an economy of scale: inference runs on a single machine, while training takes enormous clusters. More queries actually drive the per-transaction share of the training cost down.
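
A rough back-of-the-envelope sketch of that amortization, with purely hypothetical numbers just to show the shape of it (none of these figures are real measurements):

```python
# Minimal sketch of the amortization argument. All numbers are hypothetical
# placeholders, not actual training or inference energy figures.
TRAINING_COST_KWH = 10_000_000   # one-time training cost (assumed)
INFERENCE_COST_KWH = 0.003       # per-query inference cost (assumed)

def energy_per_query(total_queries: int) -> float:
    """Training is amortized across every query; inference is paid each time."""
    return TRAINING_COST_KWH / total_queries + INFERENCE_COST_KWH

for n in (1_000_000, 100_000_000, 10_000_000_000):
    print(f"{n:>14,} queries -> {energy_per_query(n):.6f} kWh/query")
```

The per-query number keeps falling toward the inference-only cost as query volume grows, which is the economy-of-scale point.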