GNU social JP
GNU social JP is a Japanese GNU social server.
Conversation

Notices

  • irwin (irwin@saturation.social)'s status on Monday, 12-Aug-2024 00:55:01 JST

    UX idea for local LLMs:

    Speed and responsiveness are highly desirable when chatting with LLMs, but on edge devices we don’t have the same kind of computing horsepower at our disposal. So why don’t we use the same kind of tactics humans use in normal conversation: linguistic fillers, signals we are formulating our thoughts (like the … animation in chat), even asking clarifying questions. These tactics can be used with little computing power while the real answer is formulated in the background.
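A minimal sketch of this pattern, assuming an asyncio-style chat loop; `slow_model_answer` is a hypothetical stand-in for the expensive local model call:

```python
import asyncio

# Canned fillers: near-zero compute to emit while the real answer is computed.
FILLERS = ["Hmm, let me think...", "One moment...", "Good question..."]

async def slow_model_answer(prompt: str) -> str:
    # Hypothetical stand-in for an expensive local LLM call.
    await asyncio.sleep(0.25)
    return f"Answer to: {prompt}"

async def respond_with_fillers(prompt: str) -> list[str]:
    """Emit fillers for instant feedback until the real answer arrives."""
    shown: list[str] = []
    answer_task = asyncio.create_task(slow_model_answer(prompt))
    for filler in FILLERS:
        if answer_task.done():
            break
        shown.append(filler)         # instant, cheap feedback
        await asyncio.sleep(0.1)     # pacing, like a typing indicator
    shown.append(await answer_task)  # the real, slow answer
    return shown

messages = asyncio.run(respond_with_fillers("What is attention?"))
```

In a real UI the fillers would stream to the chat view while the model runs; here they are just collected in order.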

    • Eaton (eaton@phire.place)'s status on Monday, 12-Aug-2024 00:54:59 JST

      @irwin In theory that sounds good, but unless you’re sidestepping the LLM it simply adds to the resource-demand spiral. And if you’re sidestepping the LLM that way, you’ll need a non-LLM mechanism to make the topically relevant filler, which means that you’re using vanilla NLP techniques, which… makes the LLM a bit of an albatross.

    • Eaton (eaton@phire.place)'s status on Monday, 12-Aug-2024 03:17:04 JST
      in reply to Daniel

      @Ergo42 that was what I wanted to do early on, but it’s far beyond my relatively limited abilities, unfortunately. The real issue is that there is no real mechanism to “reach in” — you can use the LLM to generate embeddings, and techniques like function calling are a neat hack, but like many ML models the actual internals are kind of a black box.

    • Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:17:05 JST
      in reply to Eaton

      @eaton
      Well said.

      Admittedly, my experience with training and designing neural networks is limited, but wouldn't it be only slightly more computation to create a high-level meta neural network that reaches into the LLM and estimates some attributes of the current LLM context by attaching to specific neuron outputs?

      The attributes could be level of confidence through time, expected response time, etc. Then just drive a UI from that meta neural net's realtime output.

      @irwin

    • Eaton (eaton@phire.place)'s status on Monday, 12-Aug-2024 03:25:19 JST
      in reply to Daniel

      @Ergo42 @irwin this is where ontologies and knowledge bases integrated into the process come in, but (again) the challenge is that you can never “teach the LLM X” in the sense we think of it. You can get the LLM to kind-of extract kinds of words, then match those kinds of words to your ontology elements, then generate an idea of “correctness” and stuff it back into the LLM as context to drive its response… but at that point the LLM is acting as a very computationally expensive text parser.
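The extract-match-reinject loop described here can be sketched in a few lines; the tiny `ONTOLOGY` dict and the word-splitting “extractor” below are toy stand-ins for a real knowledge base and a real LLM extraction step:

```python
# Toy ontology: stand-in for a real knowledge base.
ONTOLOGY = {
    "python": "programming language",
    "llama": "animal",
    "transformer": "ML architecture",
}

def extract_terms(text: str) -> list[str]:
    # Stand-in for LLM-based term extraction: naive word splitting.
    return [word.strip(".,!?").lower() for word in text.split()]

def match_ontology(terms: list[str]) -> dict[str, str]:
    # Match extracted terms against known ontology elements.
    return {t: ONTOLOGY[t] for t in terms if t in ONTOLOGY}

def build_grounded_prompt(question: str) -> str:
    # Stuff the "correctness" facts back in as context for the LLM.
    matches = match_ontology(extract_terms(question))
    facts = "; ".join(f"{term} -> {kind}" for term, kind in matches.items())
    if not facts:
        return question
    return f"Known facts: {facts}\nQuestion: {question}"

prompt = build_grounded_prompt("Is a transformer bigger than a llama?")
```

The point of the sketch is the shape of the loop, not the components: every interesting step (extraction, matching, judging correctness) happens outside the LLM itself.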

    • Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:25:20 JST
      in reply to Eaton

      @irwin
      Exactly!

      Context matters. But maybe it's not as simple as I assume. How do you define confidence against truth when the system doesn't understand reality versus fiction?

      ... the only neural networks I've made amount to a kid playing in a sandbox next to the construction of a Dyson sphere. I can imagine the goal and that dirt goes here, but the science of AI is moving so fast I couldn't even tell you why modern LLMs are so functional.

      @eaton

    • irwin (irwin@saturation.social)'s status on Monday, 12-Aug-2024 03:25:21 JST
      in reply to Eaton and Daniel

      @Ergo42 @eaton My experience on the technical side is orders of magnitude smaller than yours; I’m just looking at them through a user-interaction lens. Level of confidence <— YES. If only once in a while Claude or ChatGPT would say “Hm, I’m not sure…” or “Take this with a grain of salt…” I mean, they do this currently with date-specific queries that are outside of their training data. But the idea of a meta neural net for UI seems so good I bet someone’s already started doing it? Apple Intelligence?
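As a UI layer, the “grain of salt” behavior needs only a confidence score from some external estimator (such as the hypothetical meta-network discussed above); the thresholds below are arbitrary illustration values:

```python
def hedged_reply(answer: str, confidence: float) -> str:
    """Prefix a model answer with a hedge when estimated confidence is low.

    `confidence` in [0, 1] is assumed to come from an external estimator;
    the 0.3 / 0.7 cutoffs are arbitrary illustration values.
    """
    if confidence < 0.3:
        return f"Hm, I'm not sure, but: {answer}"
    if confidence < 0.7:
        return f"Take this with a grain of salt: {answer}"
    return answer

replies = [hedged_reply("Paris", c) for c in (0.9, 0.5, 0.1)]
```

The hard part, as noted in the thread, is producing a confidence score worth trusting; the UI side is trivial by comparison.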

    • Eaton (eaton@phire.place)'s status on Monday, 12-Aug-2024 03:28:12 JST
      in reply to Daniel

      @Ergo42 yeah, that was my hope when I started messing with them, but (again) way beyond my side-project skills, alas. AFAICT the challenge is that the internal model of language the LLM develops is not necessarily one that matches ours, just one that generates consistently similar outputs to ours. So a lot of debugging and experimenting with one specific LLM could tease out good theories about (say) which dimension is “noun-ish” or “height-ish” but…

    • Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:28:14 JST
      in reply to Eaton

      @eaton
      I guess I was thinking that to "reach in" you could just process the inter-layer values of a select few of the LLM's layers by masking or compressing them down to a lower matrix size. Do that at a few points along the LLM, and use them as the input to a lower-parameter network you train (after the LLM is trained) to predict the LLM's time to respond.

    • Eaton (eaton@phire.place)'s status on Monday, 12-Aug-2024 03:31:34 JST
      in reply to Daniel

      @Ergo42 @irwin If the next x years result in LLMs with the same basic capability but multiple orders of magnitude more resource efficiency, I think the net result will end up positive. They’re fantastic and genuinely effective at (say) taking some correct but awkwardly assembled text — the stuff you’d be able to generate with “oldschool” methods of knowledge and response retrieval — and finessing it to read smoothly, more succinctly, match the brand voice, etc.



GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.