@sidereal@hipsterelectron@ciggysmokebringer Since the success of free/open source software, capitalists have turned towards the subscription model for software sales. Basically, buying the software alone is not useful; you need to buy the support instead. IBM was the first to push this model, in the 70s, way before anyone else learned about it, though they did it with hw support. But hw support is no longer handled that way because most hw doesn't need that kind of setup any more. Cloud is a mainframe...
@sidereal@hipsterelectron@ciggysmokebringer "I appreciate linux but damn, it seems they have created a lot of value for the capitalists while not getting much leverage back." So I am going to have to disagree, only because I remember when every hw vendor had their own Unix system. They did that explicitly for capitalist reasons. Some still have one (IBM and Oracle), though both also now support the Linux ecosystem.
@FinalOverdrive And don't get me started on why autoconf is around in the first place. Most of the pro-cmake/ninja folks don't know about the time when you also had to deal with all of those OSes and executable formats: a.out, a.outb, coff, xcoff, elf, etc.
@FinalOverdrive We don't do enough storytelling. So many younger folks don't realize there was a time when every major hardware vendor (and even some software vendors) had their own closed source Unix OS: AIX (IBM), HP-UX (HP), SCO, IRIX (SGI), Solaris (Sun), Tru64 (DEC), etc. (not to mention embedded OSes or phone OSes).
@camelcdr Though looking at it again, ARMv9-a's SVE should be able to optimize it in a reasonable fashion, I think. But neither GCC nor LLVM is able to handle it with SVE either.
@diedofheartbreak "the energy consumption is temporary" is the same as "oh, that mine over there is just temporary and will not cause any damage". It is exactly the same as capitalists talking about long term damage: even though the mine (pollution) might be short term, it has unknown long term ecological effects. Plus renewables still have an ecological effect that most folks don't take into account, e.g. mining rare earths, damming up water.
@camelcdr With a patch I am working on, aarch64 can vectorize this, but only with -fno-vect-cost-model. The code generated is bad. It looks like GCC does not realize it could unroll the loop 4x to get reasonable code generation (or, with my patch, just 2x).
@velzie I looked into the git history and it was included in the initial checkin (into cvs, which was later converted into git) back in 1993. So it was added a long time ago and the history behind it looks to be lost. The 42 that is used by the xor is definitely a reference to `The Hitchhiker's Guide to the Galaxy`.
Blah, I forgot some of GCC's IR is not valid for pointer types :) e.g. BIT_IOR and BIT_AND are not valid for pointer types. https://gcc.gnu.org/PR117646. Luckily the fix is simple; I am just leaving it broken until tomorrow.
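A source-level analogy, not the GIMPLE from the PR: C++ also rejects `|` and `&` on pointers, so bit operations have to go through an integer type of pointer width. The helper names below (`set_tag`, `clear_tag`, `has_tag`) are made up for illustration, and the sketch assumes the pointed-to object is at least 2-byte aligned so the low bit is free.

```cpp
#include <cstdint>

// Hypothetical helpers: tag a pointer by setting its lowest bit.
// `p | 1` does not compile on a pointer, so convert to an integer first.
void* set_tag(void* p) {
    auto bits = reinterpret_cast<std::uintptr_t>(p);
    bits |= std::uintptr_t{1};           // bitwise OR on the integer, not the pointer
    return reinterpret_cast<void*>(bits);
}

// Clear the tag bit again with a bitwise AND on the integer value.
void* clear_tag(void* p) {
    auto bits = reinterpret_cast<std::uintptr_t>(p);
    bits &= ~std::uintptr_t{1};
    return reinterpret_cast<void*>(bits);
}

// Report whether the lowest bit is set.
bool has_tag(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & std::uintptr_t{1}) != 0;
}
```

The same shape applies inside a compiler: convert to a pointer-sized integer type, do the bit operation there, then convert back.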
@paul@adam_chal Well, there are different implementations of tan too. Some include a slow path and some don't. Even different versions of glibc differ. See https://sourceware.org/bugzilla/show_bug.cgi?id=15267 for discussion of the slow path. I am not 100% sure FMA is the only difference here though. But it does change the ULP error, and so does the slow path in some cases.
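For comparing two results in ULPs, a minimal sketch follows. It assumes IEEE 754 binary64 doubles and finite, non-NaN inputs; `ordered_bits` and `ulp_distance` are hypothetical names, not part of glibc or any other library.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Map a double onto an integer line where adjacent doubles differ by exactly 1.
// Assumes IEEE 754 binary64; NaN and infinity are not handled specially.
static std::int64_t ordered_bits(double x) {
    std::int64_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // well-defined way to read the bit pattern
    // Negative doubles are sign-magnitude; flip them so the ordering is monotonic.
    return bits < 0 ? std::numeric_limits<std::int64_t>::min() - bits : bits;
}

// Number of representable doubles between a and b (0 if they are equal).
std::uint64_t ulp_distance(double a, double b) {
    const std::int64_t oa = ordered_bits(a);
    const std::int64_t ob = ordered_bits(b);
    // Subtract as unsigned so a large gap cannot overflow a signed integer.
    const auto ua = static_cast<std::uint64_t>(oa);
    const auto ub = static_cast<std::uint64_t>(ob);
    return oa > ob ? ua - ub : ub - ua;
}
```

Calling `ulp_distance` on the tan results from the two machines would show whether they are 1 or 2 ULP apart.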
@paul@adam_chal I suspect the difference is because on x86_64 the default libc does not use FMA (since it is not part of x86_64v1), while on aarch64 it will use FMA (since FMA is part of armv8-a). Especially when it is a 2 ULP difference. Fused Multiply-Add can make a huge difference since the product is kept at full precision for the addition and the result is only rounded once, at the end.
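A small sketch of the single-rounding effect, using `std::fma` from `<cmath>`. The function names are made up for illustration, and the `volatile` store is only there so the compiler cannot contract the separate multiply and add into an FMA on targets that have one.

```cpp
#include <cmath>

// Multiply, round to double, then add: two roundings in total.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;   // the product is rounded to double here
    return product + c;                // second rounding happens here
}

// Fused multiply-add: a * b + c computed with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 as a double. So `mul_then_add` returns 0 while `fused_mul_add` returns -2^-54. Differences like this inside a libm polynomial are enough to move the final result by an ULP or two.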
@zwarich@resistor@chandlerc So I suspect one reason is that GCC has both global ranges and on demand ranges. Most of the time the global ranges are enough for most passes to use. And for the on demand ones, it uses the global ones as a starting point rather than recomputing those too. Plus there is a cache while doing the on demand ones, since you are just as likely to ask for one SSA name in one BB as for another SSA name.
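A toy model of that layering, to make the idea concrete. This is not GCC's ranger API: `Range`, `RangeQuery`, and the per-block "facts" are all hypothetical, and a real implementation derives those facts from the IR instead of having them handed in.

```cpp
#include <algorithm>
#include <limits>
#include <map>
#include <unordered_map>
#include <utility>

struct Range { long lo; long hi; };

// Hypothetical sketch: global ranges as the starting point, refined on demand
// per (SSA name, basic block) and cached so repeated queries are cheap.
class RangeQuery {
public:
    void set_global(int name, Range r) { global_[name] = r; }
    // A fact known on entry to block bb, e.g. from a dominating condition.
    void add_block_fact(int name, int bb, Range r) { facts_[{name, bb}] = r; }

    Range range_on_entry(int name, int bb) {
        const auto key = std::make_pair(name, bb);
        if (auto hit = cache_.find(key); hit != cache_.end())
            return hit->second;                 // cached: no recomputation
        ++computed_;
        Range r = global_range(name);           // start from the global range
        if (auto f = facts_.find(key); f != facts_.end()) {
            r.lo = std::max(r.lo, f->second.lo);   // intersect with the block fact
            r.hi = std::min(r.hi, f->second.hi);
        }
        cache_.emplace(key, r);
        return r;
    }

    int computed() const { return computed_; }  // number of cache misses so far

private:
    Range global_range(int name) const {
        auto g = global_.find(name);
        if (g != global_.end()) return g->second;
        // Unknown name: fall back to the full range of the type.
        return Range{std::numeric_limits<long>::min(),
                     std::numeric_limits<long>::max()};
    }

    std::unordered_map<int, Range> global_;
    std::map<std::pair<int, int>, Range> facts_;
    std::map<std::pair<int, int>, Range> cache_;
    int computed_ = 0;
};
```

A pass that only needs the global range never pays for the refinement, and a pass that asks about many names in many blocks only computes each pair once.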
Looking at the improvements I made for GCC 15, I will say https://gcc.gnu.org/PR19661 might have the biggest impact for embedded folks. In the case of libstdc++, the number of __cxa_atexit calls is reduced by 7 (out of 17). A nice improvement really.
@whitequark For x86_64, most structs have similar alignment between the windows and elf ABIs, if not the same. Also, if done correctly, the library's ABI would not expose any structs at all, which would make this even easier.
As far as EH is concerned, I suspect the libraries where this would work best are ones which don't make any external calls and just do stuff like encryption or some other kind of encoding/decoding, like audio or pictures or movies.