A new LLM called phi-1, with 1.3 billion parameters, scores over 50% on the HumanEval problem set.
twitter.com/SebastienBubeck/status/1671326369626853376
GPT-4 scores 67% - but reportedly uses 1.7 trillion parameters.
twitter.com/swyx/status/1671272883379908608
How did they achieve this miracle? They trained phi-1 using textbooks rather than on the internet.
What does it mean? It means you can produce an AI that is smart enough to perform simple tasks and small enough to run on your laptop or, probably, your phone.
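The "small enough to run on your laptop" claim checks out on the back of an envelope - a sketch, assuming half-precision (fp16) weights at 2 bytes per parameter, ignoring activations and KV cache:

```python
# Rough memory footprint of a 1.3B-parameter model.
# Assumption: fp16 weights, 2 bytes per parameter; ignores runtime overhead.
params = 1.3e9
bytes_per_param = 2  # fp16
gib = params * bytes_per_param / 2**30
print(f"{gib:.1f} GiB")  # ~2.4 GiB - well within laptop (and high-end phone) RAM
```

Quantizing to 4 bits would shrink that by another 4x, which is why models of this size are plausible on phones.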
What else does it mean? It means that to score 85% on that test using the same approach as GPT-4 you'd need something like 2 quadrillion parameters, which would cost billions of dollars to train even if you could find enough data to feed it. And then years of "alignment" to get it to stop giving obviously wrong answers, because you stuffed it full of nonsense.
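The "something like 2 quadrillion" figure is roughly what a naive extrapolation gives - a sketch, assuming (and this is only an assumption, not an established scaling law) that HumanEval score grows linearly with the log of parameter count, anchored on the two figures above:

```python
import math

# Two anchor points from the post: phi-1 and the rumored GPT-4 size.
phi1_params, phi1_score = 1.3e9, 50.0
gpt4_params, gpt4_score = 1.7e12, 67.0

# Assumed model: score is linear in log10(parameters).
slope = (gpt4_score - phi1_score) / (
    math.log10(gpt4_params) - math.log10(phi1_params)
)  # ~5.5 points per 10x increase in size

# Extrapolate along that line to an 85% score.
target_score = 85.0
needed_params = 10 ** (
    math.log10(gpt4_params) + (target_score - gpt4_score) / slope
)
print(f"{needed_params:.1e}")  # on the order of a few quadrillion parameters
```

Two data points can't pin down a curve, but the order of magnitude is the point: brute-force scaling to 85% lands in quadrillion territory.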
Garbage in, garbage out.
Also, phi-1 took four days to train.
arxiv.org/pdf/2306.11644.pdf
Speaking of garbage, don't use textbooks published after 2010.