GNU social JP
GNU social JP is a Japanese GNU social server.

Wanderer über dem Nebelmeer (wandereruber@poa.st)'s status on Thursday, 11-Jun-2026 11:42:55 JST

    in reply to
    • lainy
    • bronze
    • binkle
    • Wanderer über dem Nebelmeer
    • Blurry Moon
    Update:
    Running beellama (CUDA) on the same config is faster than llama-cpp Vulkan, which is already faster than vanilla llama-cpp CUDA.

    I can't use TurboQuant because it's slower. It needs cpu-moe = true, and apparently my CPU is NOT moe. It costs me ~15% t/s.

    I have not had ANY success with the dflash drafting. The logs show a lot of rejections, so maybe that's why it's slow.


    The absolute kicker, and the reason I will keep using it:
    a 3x increase in prompt processing speed, on top of the inference speed increase. I have no idea what they did to achieve this.


GNU social JP is a social network, courtesy of GNU social JP管理人. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

Creative Commons Attribution 3.0 All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.