GNU social JP
GNU social JP is a Japanese GNU social server.
Conversation

Notices

  1. Wanderer über dem Nebelmeer (wandereruber@poa.st)'s status on Friday, 12-Jun-2026 10:38:07 JST
    Thank you, Pewdiepie. You are far too kind.
    In conversation about 23 days ago from poa.st permalink

    Attachments


    1. https://i.poastcdn.org/fd/d5/89/fdd5891be40868057b1802718df1ac23a25258f389010792165ea76005f7930f.JPG
    • verita84 (verita84_64@poster.place)'s status on Friday, 12-Jun-2026 10:38:03 JST
      in reply to
      • <- ?? Striderpod ?? ->
      @WandererUber @striderpod

      I never gave you LLM snippets.

      Go use PewDiePie's software lol.
      In conversation about 23 days ago permalink
    • Wanderer über dem Nebelmeer (wandereruber@poa.st)'s status on Friday, 12-Jun-2026 10:38:03 JST
      in reply to
      • <- ?? Striderpod ?? ->
      • verita84
      >I am quoting Claude

      >I never gave you LLM snippets.


      I think I just about had enough of this nonsense convo.


      >Go use PewDiePie's software lol.
      I am using it right now. You know why? Because (halfway down the page, mind you) your GitHub readme tells me to install gigabytes of shit in docker, and it's a bit of a hassle to make space for this in my VM for an unidentified benefit.
      In conversation about 23 days ago permalink
      Blurry Moon likes this.
    • verita84 (verita84_64@poster.place)'s status on Friday, 12-Jun-2026 10:38:04 JST
      in reply to
      • <- ?? Striderpod ?? ->
      @WandererUber @striderpod

      1. It's using llamacpp
      2. That's literally what it's called, guidance because it sucks. I am quoting Claude
      3. Again, like I said, I designed this so low-end GPUs have a shot. This is super important and should help larger GPUs too.
      In conversation about 23 days ago permalink
    • Wanderer über dem Nebelmeer (wandereruber@poa.st)'s status on Friday, 12-Jun-2026 10:38:04 JST
      in reply to
      • <- ?? Striderpod ?? ->
      • verita84
      You're really doing your software a disservice by talking nonsense like this, I gotta be real here.

      2. Please explain the technical details of what we talked about instead of copy-pasting LLM snippets at me. This is very disrespectful and you're wasting my time.

      1. If you are using llamacpp anyways, why the fuck does your docker container ship with it? That's literally my only point, dude. PewDiePie does not have to care about getting GPU performance code right because he doesn't ship any. So I could only guess you were implying you're doing it differently. Now you're telling me you also don't even modify it. So I was right! It *IS* an absurd decision, making the user install llamacpp in the docker container even if he has inference up and running elsewhere. Nobody does it like this. Not anythingllm, not openwebui, not sillytavern, not prometheus, not even fucking oobabooga.

      3. Once again, how exactly did you "design it" if the inference is done by an external codebase anyways? Please quantify this.
      In conversation about 23 days ago permalink
    • verita84 (verita84_64@poster.place)'s status on Friday, 12-Jun-2026 10:38:05 JST
      in reply to
      • <- ?? Striderpod ?? ->
      @WandererUber @striderpod

      Absurd? It all started with getting the Intel Arc to work properly on Gentoo without Docker and with newer support for images. Everyone fell short of this.

      You have to do some guidance on the backend to get models to code properly. Think of it as holding their hand. The lower the model is, say 9b, the more help it needs.

      Also, since I made the backend, I can handle large source code files on small GPU's. Nobody does that
      In conversation about 23 days ago permalink
    • Wanderer über dem Nebelmeer (wandereruber@poa.st)'s status on Friday, 12-Jun-2026 10:38:05 JST
      in reply to
      • <- ?? Striderpod ?? ->
      • verita84
      please don't vaguepost at me like this.

      >It all started with getting the Intel Arc to work properly on Gentoo without Docker and with newer support for images.
      I seriously doubt that you get significantly higher inference speeds than llamacpp or the forks that are out there, by rolling your own backend.

      >You have to do some guidance on the backend to get models to code properly. Think of it as holding their hand. The lower the model is, say 9b, the more help it needs.
      I don't appreciate being talked to this way. We're not on reddit and I'm not five years old. Besides that, Gemma4 E4B is decent at it. The benchmarks are out there, this is not some secret sauce.

      >Also, since I made the backend, I can handle large source code files on small GPU's. Nobody does that
      I can run 200k context with Q3.6-35B on beellama and use it in opencode/pi no problem. Again, please quantify what you're saying.
      In conversation about 23 days ago permalink
    • Wanderer über dem Nebelmeer (wandereruber@poa.st)'s status on Friday, 12-Jun-2026 10:38:06 JST
      in reply to
      • <- ?? Striderpod ?? ->
      • verita84
      best-in-class. Not aware of anything better than prometheus.
      I will also try @verita84_64 's kit.
      a bit buggy. I had to fix some annoyances. The biggest was that it would truncate messages and sometimes get context-size wrong. Deepseek V4 flash + opencode did it in minutes. If you need details later, just ask. I can make a patch file.

      I'm running Qwen3.6-35B-A3B-Q6 on pic rel
      In conversation about 23 days ago permalink

      Attachments


      1. https://i.poastcdn.org/d0/bb/0c/d0bb0c3ba94b6da9b40b5201dc6ff9f09d8aaabcda872f211d073000ad891822.jpg
    • verita84 (verita84_64@poster.place)'s status on Friday, 12-Jun-2026 10:38:06 JST
      in reply to
      • <- ?? Striderpod ?? ->
      @WandererUber @striderpod

      PewDiePie doesn't realize how hard agentic tooling is with quants. I am sure he doesn't have the GPU support either.
      In conversation about 23 days ago permalink
    • Wanderer über dem Nebelmeer (wandereruber@poa.st)'s status on Friday, 12-Jun-2026 10:38:06 JST
      in reply to
      • <- ?? Striderpod ?? ->
      • verita84
      He doesn't need to because, unlike you, he didn't make the decision to roll the backend into this tool. Quite frankly, your decision to do this still seems absurd to me.

      I have it running on my GPU right now.

      >agentic hard with quants
      could you quantify this?
      In conversation about 23 days ago permalink
    • <- ?? Striderpod ?? -> (striderpod@poa.st)'s status on Friday, 12-Jun-2026 10:38:07 JST
      in reply to
      So, how is it? What are your specs?
      In conversation about 23 days ago permalink

GNU social JP is a social network, courtesy of the GNU social JP administrator (GNU social JP管理人). It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

Creative Commons Attribution 3.0 All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.