@Moon@shitposter.club this is why I'll never purchase a Mac tbh not worth paying $2500 for something that will need $1500 in maintenance cost if anything ever happens to it
my framework has its own problems but I can take it apart in a couple minutes, was helpful when I dunked it in water a few weeks ago
@Suiseiseki There are various model files in different sizes. They need to fit inside your VRAM, so you can get normal-sized models and run them on Linux. I do in fact have a Linux box with 24GB of VRAM, but that is still not enough for the largest models. A Mac, if you spend the money, can give a model over 190GB to work with.
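(Rough back-of-the-envelope sizing, my own rule of thumb rather than anything official: weights take roughly parameter count times bytes per weight, plus some slack for the KV cache and activations.)

```c
#include <stdio.h>

/* Very rough sizing sketch (my own rule of thumb, not from any vendor doc):
 * weights take params * bytes-per-weight, and the KV cache / activations
 * add overhead on top, guessed here at ~20%. */
static double approx_gib(double params_billion, double bits_per_weight) {
    double weight_bytes = params_billion * 1e9 * (bits_per_weight / 8.0);
    return weight_bytes * 1.2 / (1024.0 * 1024.0 * 1024.0);
}

int main(void) {
    printf("70B @ 16-bit: ~%.0f GiB\n", approx_gib(70, 16)); /* ~156 GiB: needs a big unified-memory box */
    printf("70B @ 4-bit:  ~%.0f GiB\n", approx_gib(70, 4));  /* ~39 GiB: still over a 24GB card */
    printf("13B @ 4-bit:  ~%.0f GiB\n", approx_gib(13, 4));  /* ~7 GiB: fits a normal GPU fine */
    return 0;
}
```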
@Moon If the software is actually free and RAM usage is the blocker, I would suggest getting a KGPE-D16 and installing 128 GB of DDR3 ECC RAM into it (or 256GB if you want).
@lain Apache 2.0 doesn't do much to ensure the software is free and remains free, so I often need to check if certain packages licensed under Apache 2.0 are actually free software.
@Suiseiseki@Moon you probably could, given enough time and effort, run the models on asahi using free reverse engineered gpu/tpu drivers because apple doesn't sabotage re efforts for it. no such comfort awaits the nvidia user.
it's down to a number of factors that macOS uses less ram, but a common thing in OSes is memory compression. that said, arm macbooks' memory bandwidth, and i suppose also speed, is pretty crazeh
@mischievoustomato@Suiseiseki@Moon the highest end alchemist card currently available is still pretty weak for ai stuff afaik, whether battlemage changes the situation remains to be seen
@mischievoustomato@Suiseiseki@Moon from what I've seen most everything calls out to a few big libraries like pytorch and if those are supported on your hardware you should be fine.
@allison >run the models on asahi using free reverse engineered gpu/tpu drivers because apple doesn't sabotage re efforts for it
I wouldn't call those drivers free, considering that Aarch64 Macs have a bootloader that loads a huge amount of proprietary software, like peripheral software for the GPU etc, at boot time (proprietary software lovers really love that "feature"), and it's not cryptographically possible to free that software.
While apple for now doesn't dedicate all their effort to stopping half of the GPU/TPU driver from being rewritten, half of the driver being free doesn't really result in it being free as a whole.
>no such comfort awaits the nvidia user.
Actually it kind of does for the 700 series and below, with the free nouveau driver.
OpenCL support is currently pretty poor, but those GPUs do currently have a 95% free driver, considering that the Linux driver and the peripheral software are free - the remaining issue is that the VBIOSes stored in an EEPROM chip on the GPUs are proprietary - but freeing such VBIOSes seems possible with enough effort, as the 700 series and below lack cryptographic handcuffs.
@allison@Suiseiseki@mischievoustomato I git cloned llama.cpp today and ran make -j and I had a working system in less than five minutes. It took over a day to get it working on my Linux box.
@Moon@Suiseiseki So I really can't just go get a MEGA ULTRA GAMER ELITE PC with a 4090 (24 GB VRAM) and run the biggest AI stuff on it by having it share/offload some of the model to system RAM? I've never really delved into the big AI/GPU stuff.
@Moon@Suiseiseki@mischievoustomato what are the vram requirements for llama.cpp? seems like something I might want to mess with on my intel/intel rig lol
@allison@Suiseiseki@mischievoustomato it all depends on the model you run. 16GB will run a lot of stuff. 12GB will too, but it's not big enough for some useful models.
@mischievoustomato@Suiseiseki@Moon Still can't believe the spaghetti I spilled when I told him I preferred Motorola/PowerPC in our one IRL encounter lmfao
>Is the software really free if it can't be run at all on freedom respecting computers? Once I reach wizard status I may be able to answer such questions.
@pomstan Freedom 0, the freedom to run the software.
Intentionally writing software in a way that ensures it only ever runs at all with proprietary software arguably means that the software doesn't respect freedom 0, as you can't run it in freedom after all.
@mischievoustomato@Suiseiseki@Moon Even by the time I was 8 or 9 I was all in on the "damn I hate x86" train, I actually mainly used PowerPC iron at home for a good chunk of the late 2000s and early 2010s and I honestly quite enjoyed the experience
@mischievoustomato I don't see anything revolutionary about getting existing CPU and GPU soft cores, making a few changes and slapping them all into an Aarch64 SoC and paying TSMC ridiculous sums to fab them at the smallest node size.
Sure, a big improvement in processing power/power consumption resulted, but that's hardly revolutionary.
@Suiseiseki nobody is forcing you to run software in a specific way. you can reimplement proprietary libs by yourself according to the expectations expressed in that free software
@mischievoustomato@Suiseiseki@Moon It's just a way nicer looking asm and sanely designed architecture compared to x86. I actually learned a bit of x86 asm around that point and it felt incredibly dirty to me. As for power efficiency concerns, that was down to economies of scale, Apple's partners fumbled hard with the G5 by essentially making their version of the Pentium 4 and Intel at the time had a really good contingency plan in the form of the Israeli-developed Pentium M and Core lineage. Eventually Intel repeated IBM's G5 mistake with Skylake, Apple decided to switch architectures *again*, and now they have ARM64 which if you squint really hard you could almost imagine is recent POWER.
when i was 8 or 9 i was too small to know about that, and my house didn't have internet. why did you prefer it over x86? everyone seemed to prefer x86 and apple moved to x86 for efficiency/performance reasons too, no?
@mischievoustomato@Suiseiseki@Moon Apple was so furious Intel couldn't deliver what they promised with Skylake that they were forced to go back to the drawing board when the mobile parts were unveiled. AMD also came out with Ryzen around that time and started regaining their footing directly because of this embarrassment (it's also why Intel eventually brought Gelsinger back on board, not a single other person on earth could even have a prayer of a chance at fixing this mess)
@mischievoustomato@Suiseiseki I hope you're ready for Intel's Bulldozer moment. I like x86 and all, but I'm looking forward to hard cores of RISC-V and how they compete. I'm STILL a bit miffed that IA64 died before I could get my hands on it; it had flaws but it's got a good idea
@allison@Suiseiseki@mischievoustomato I was always a PowerPC skeptic. I got a G4 Mac Mini and it just didn't perform as well as their benchmarks claimed.
In the 64-bit space, you could get a top-end G5 and beat Intel. But when Apple switched their laptops to Intel, the laptops got an immediate 2-4x speedup. I tried it and it was immediately apparent that the PowerPC hype was bullshit.
@mischievoustomato@Suiseiseki@Moon Conversely, I want to leave x86 as soon as I can and I want wherever I end up next to be a soft landing with a pleasant low-level experience.
> by having it share/offload some of the model to system RAM?
You're missing the point of how expensive an operation it is for the CPU to access memory on the GPU. The latency is the problem, plus the data needs to be copied over the PCIe bus (except with newer CPUs and GPUs that improve on this). On a unified memory platform the CPU can access the same memory address space that the GPU processor operates on. No copies, no additional latency.
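To make the copy cost concrete, here's a minimal sketch using the CUDA runtime C API (my own illustration of the contrast; Apple's unified memory obviously isn't CUDA, and on discrete cards cudaMallocManaged still migrates pages over PCIe behind the scenes, which is exactly the cost physically shared memory avoids):

```c
/* Sketch: explicit copies over PCIe on a discrete GPU vs. one allocation
 * both the CPU and GPU can address. Compile against the CUDA runtime. */
#include <cuda_runtime_api.h>
#include <stdlib.h>

int main(void) {
    size_t bytes = (size_t)1 << 30;              /* say, 1 GiB of model weights */
    float *host = malloc(bytes);

    /* Discrete-GPU path: a separate device buffer, and every transfer
     * the CPU wants has to cross the PCIe bus. */
    float *dev;
    cudaMalloc((void **)&dev, bytes);
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);   /* push */
    /* ... kernels run against dev ... */
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);   /* pull */

    /* Unified/managed path: one allocation visible to both processors,
     * so there are no explicit copies in the source at all. */
    float *shared;
    cudaMallocManaged((void **)&shared, bytes, cudaMemAttachGlobal);
    /* CPU writes shared[], GPU kernels read shared[] directly. */

    cudaFree(dev);
    cudaFree(shared);
    free(host);
    return 0;
}
```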
@Moon@Suiseiseki@mischievoustomato The reasons I liked the architecture had little to do with Apple's marketing and everything to do with the architecture itself. When it comes to implementations, Apple failed hard by essentially pushing their Pentium III equivalent way past its sell by date and by getting a Pentium 4 equivalent hand delivered to them just as Dennard scaling slammed into a wall.
@feld@Suiseiseki@vic I haven't kept up with the state of the art with games. It used to not really be a problem that VRAM wasn't shared, because you just pushed, never pulled. I assume shared is better because you don't even have to copy. But for AI stuff you're pushing AND pulling.
@iska I cloned it, but I can't compile it without docker, which is kind of proprietary.
At least it looks like the source code of Ollama is there (but there's no real indication as to the license of each file, as an MIT/Expat license dumped into the root with no other details as to what it applies to doesn't mean much legally), but I'm not sure about the dependencies.
@allison@Suiseiseki@mischievoustomato I was more just annoyed that Apple seemed to be incredibly dishonest about it (until they completely flipped when they went to Intel). As far as the architecture was concerned, yes, POWER is sane
@Moon@Suiseiseki@mischievoustomato the funny thing is that the "megahertz myth" marketing stuff actually brought up a bunch of completely valid points.... which they subsequently completely ignored when they tried pushing the g5 everywhere, lmao
@TURBORETARD9000 AMD is only the slightest bit less bad than nvidia now - the VBIOS and peripheral software for their GPUs is proprietary and is cryptographically restricted to stop its replacement.
@mischievoustomato@Suiseiseki I'm heading for AMD when I can; they've been showing better performance and thermals in their recent generations, I've heard
I also know llama.cpp exists: https://github.com/ggerganov/llama.cpp - it doesn't seem to force Docker or CUDA. I did just remember wanting to run an LLM, I might use this
@Suiseiseki bleghh it's been forever since I read up on AMD/Intel history, yeah you got that much right
RISC-V implements 40 instructions in its base ISA, 163 for RV64GCS (+40 if you count the compressed instructions separately), the minimum needed to run a general purpose OS
@Moon@Suiseiseki It has some quirks yeah, but I'm still in the implementation stage atm with my visual core, so I've got a bit of work to do before I'm ready to do a deep dive beyond reading the docs and figuring out how the hell to read Verilog
@TURBORETARD9000@Suiseiseki they actually said why, which seemed to be based on actually looking at it. I just don't have knowledge in that area so I don't know if what they said was accurate. They're a pretty smart person.
@Moon@TURBORETARD9000@Suiseiseki
>RISC-V having no provision for a condition code register (status register), or a carry bit
which is... unfortunate for some specific applications
@romin@Suiseiseki@Moon Such checks are typically implemented with a compare; it adds an instruction or so, but the instructions are cheap enough that it shouldn't be an issue (?)
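For instance (my own sketch, not from the thread): an unsigned carry check in C is just a compare against one of the operands, which is roughly what compilers turn into an add plus a sltu on RISC-V, where x86 would instead read the carry flag with adc:

```c
#include <stdint.h>

/* Wide addition without a flags register: detect the carry-out of the low
 * limb by comparing the sum against one operand (sum < a  <=>  carry).
 * On RISC-V this is roughly add + sltu; x86 would usually use adc. */
void add128(uint64_t a_lo, uint64_t a_hi,
            uint64_t b_lo, uint64_t b_hi,
            uint64_t *r_lo, uint64_t *r_hi) {
    uint64_t lo = a_lo + b_lo;
    uint64_t carry = (lo < a_lo);   /* the compare standing in for a carry bit */
    *r_lo = lo;
    *r_hi = a_hi + b_hi + carry;
}
```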