So, it turns out, Techbro Rasputin’s FOSDEM talk isn’t because he is a donor; the organisers actually thought it was a good idea.
I guess I’m not sad anymore that I’ll miss the event this year.
@futurebird This was obviously nonsense, for the same reason most voice control is: we had prior experience with it.
Before computers were common, executives had typists who would type letters for them. Initially you’d dictate to someone who would write shorthand (at the speed of speaking) and then someone (possibly the same person) would transcribe it with a typewriter. By the ‘80s, it was common to replace this with a dictaphone that you’d speak into and then the secretary would replay the tape and be able to rewind and pause, eliminating the need for shorthand.
Once computers became useful enough that every executive had one on their desk, they learned to type and found that typing their own letters was faster than dictating. A lot of these people were sufficiently well paid that having someone to type your letters as a status symbol was perfectly viable, and they still didn’t do it. A human who knows you and your style well is going to do a much better job than a computer, so serves as a good proxy for perfect computerised speech to text. The people who had access to it, and had an incentive to treat using it as a status symbol, did not use it because it was less productive than just typing.
The only people for whom it makes a difference are those who can’t use their hands, whether as a permanent disability or something transient like having them occupied performing surgery, driving, cooking, or whatever. And there the comparison point is remembering the thing you wanted to type until later. Computers are great at things that replace the need for remembering things. As was paper before it (sorry Socrates, all the cool kids use external memory, listen to Plato).
In the ‘90s there were experiments doing the same kind of ‘simulate the perfect voice command by using a human as a proxy’ thing and they all showed that it was an improvement only when the human had a lot of agency. None of the benefit came from using natural language (using jargon or restricted command sets was usually less ambiguous); all of the benefit came from a human being able to do a load of things in response to a simple command. And you can get the same benefits without adding voice control.
Humans evolved manual dexterity long before they evolved language and have a lot of spare processing available for offloading tasks that involve hands. Try reading a piece of piano music and saying the notes in your head as fast as they’re played (you can’t say them aloud that fast, but even forming them into thoughts expressed in natural language is hard).
@aral Coincidentally, I wrote to my MP the day before Starmer said this nonsense. I was writing specifically about the consultation on weakening UK copyright law to allow AI grifters to launder the commons with no repercussions, which is framed as 'we obviously want to give everything away and kill the British creative industry to make some Americans richer, help us find the most efficient way of doing that'.
The relevant part of my letter was:
Unfortunately, there seems to be a lack of understanding in government of how current 'AI' systems work. Announcements by members of the cabinet could easily be press releases from companies trying to sell things on the current hype wave. The only ray of sunshine has been the skepticism from the MOD. Much of the current hype wave surrounding generative AI is from companies run by the same people behind the Bitcoin / blockchain / web3 hype (which consumed a lot of energy, made the climate disaster worse, and failed to produce a single useful product).
There are a few places where machine learning techniques have huge value. Anomaly detection can be very useful for at-scale early diagnosis of various medical conditions, but this alone will not fix the NHS. Most of the hype has failed to create any products of real value. For example:
77% of employees report that using AI tools makes them less productive[1].
A study on Google's own workers found that using AI tools made them less productive[2].
OpenAI, the flagship company driving the hype wave, is still making massive losses[3], including losing money on the $200/month subscription plan[4].
Software written using AI has more security vulnerabilities[5].
It is not worth throwing the UK's creative sector under a bus to provide more money for these companies and their investors.
If you want a good overview of these problems, I'd recommend Pivot-to-AI[6] as a starting point. Beyond this, I'd also point out that OpenAI has been caught harvesting data from sites whose terms of use specifically prohibit it[7] (see also the LibGen article on Pivot-to-AI). Breaking the law should not be rewarded and no opt-out can work with people who do not follow the law. Opt in is the only viable solution.
The footnotes were:
[1] https://www.forbes.com/sites/torconstantino/2024/09/12/77-of-surveyed-employees-say-ai-tools-make-them-less-productive/
[2] https://redmonk.com/rstephens/2024/11/26/dora2024/
[3] https://www.cnbc.com/2024/09/27/openai-sees-5-billion-loss-this-year-on-3point7-billion-in-revenue.html
[4] https://9meters.com/technology/ai/openai-loses-money-on-200-month-pro-plan-because-people-are-using-it-too-much
[5] https://arxiv.org/html/2404.03823v1
[6] https://pivot-to-ai.com
[7] https://aoir.social/@aram/113811386580314915
I don't encourage people to send the same text to their MPs, because that gets ignored, but a set of citations like this may help push back.
@dansup Might be worth checking with the EU regulators. I would be pretty shocked if this did not violate the Digital Markets Act, and that has some fairly beefy financial penalties.
@jwildeboer A non-profit that manages this for people might help. You may even be able to set it up as a registrar so it doesn't need to integrate with third parties. It would need to provide a mechanism for buying domains, a (community-contributed) way of generating DNS records for specific things so you could say 'I use service X, set up DNS records for it, thanks', and a legal structure so that the domains it registered were fully owned by the individuals who registered them and would be returned to them if the non-profit went out of business.
Someone should point out to Trump that if each Canadian province became a US state then there would be enough blue electoral college votes to ensure that his team never got in again, but if he sold the blue coastal states to Canada then he would have enough votes to pass a constitutional amendment making him dictator for life.
@ryanc I was going for -INF, but -0 is probably better.
Housing efficiency subsidies rarely work well for the people who need them most: tenants in cheap housing. For many years, I rented somewhere with single-glazed windows and no loft insulation. The cost of fixing it would have been at least £20k (probably a lot more, given the state of the roof), which was many years of the £350/month rent I was paying. It cost a lot to heat in the winter; I was basically running the boiler all of the time. There were some insulation subsidies available to the property owner, but they weren’t the ones paying the heating bill so didn’t see the saving and, because the subsidies didn’t cover 100% of the costs, they would have spent money to save me money. The incentives were not aligned.
So I have a simple proposal: mandate minimum efficiency requirements for rented accommodation with a 3+ year compliance requirement, require letting agents and private rental listings to include the average heating cost for the year in the advert, and provide government-backed loans to cover 100% of the cost of compliance. If you don’t have the liquid funds available to pay for upgrades (or don’t want to spend them), you can get a Bank of England base rate loan that covers the cost of the upgrade. The loan accrues interest at the base rate, but you don’t have to pay any of it back until you sell the property, at which point the capital plus interest must all be repaid (you can also pay it back in advance). The same technique could be applied to heat pumps, solar panels, and so on.
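To put rough numbers on the deferred-repayment loan, here is a tiny illustration. The £20k principal, the 5% base rate, and the ten-year holding period are all made-up figures for the sketch (real base rates change over time), not part of the proposal itself:

```c
#include <math.h>
#include <stdio.h>

/*
 * Illustration only: how a deferred loan of the kind proposed above might
 * accrue.  Principal, rate, and holding period are invented numbers.
 */
int main(void)
{
    double principal = 20000.0;  /* cost of the efficiency upgrade, assumed */
    double base_rate = 0.05;     /* assumed flat annual base rate */
    int years_until_sale = 10;   /* assumed time until the property is sold */

    /* Interest accrues but nothing is repaid until the sale. */
    double owed_on_sale = principal * pow(1.0 + base_rate, years_until_sale);
    printf("Repay on sale after %d years: £%.2f\n",
           years_until_sale, owed_on_sale);
    return 0;
}
```

With those assumed numbers, the landlord repays roughly £32.6k out of the sale proceeds, while having paid nothing up front.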
I would expect that this would stimulate the economy (jobs for builders), reduce emissions (less waste from inefficient heating), and have high take-up because the incentives are now aligned. Having a well-heated house reduces the risk of damp and similar things that can dramatically lower property values, and the improvements are likely to increase property values with no up-front cost, making them attractive to landlords.
Anyone know why this wouldn’t work?
@ryanc @sophieschmieg @Lookatableflip I guess it’s not pure software, but anything running on a real computer has a hardware component. The randomness bit is pure software, using whatever it can from the environment as entropy sources, but none of the entropy sources alone (without a hardware random number generator) has enough entropy to be useful, and interrupt timings can sometimes be under attacker control (some fun attacks from the ‘90s involved sending packets at specific timing to influence the entropy collection).
@ryanc @Lookatableflip @sophieschmieg That depends a lot on the system. It will use all of the entropy sources available to the kernel. On modern systems, that typically includes at least one hardware entropy source. These are often a set of free-running ring oscillators, which then feed into some cryptographic hash function for whitening.
Without these, it will use much weaker things: the contents of the password file, the hash of the kernel binary, the cycle count at the time interrupts fire or devices are attached, and so on.
There have been some high-profile vulnerabilities from embedded devices that did things like generating private keys on first boot, with deterministic device attach time, and ended up with a handful of different private keys across the entire device fleet.
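To make the ‘mix whatever weak sources you have through a hash’ idea concrete, here is a deliberately simplified sketch. FNV-1a stands in for the cryptographic hash a real kernel would use, and the three sources here (wall-clock time, CPU time, process ID) are just examples of the kind of weak, partially predictable inputs described above; this is not a real CSPRNG.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Mix a buffer into a running FNV-1a hash (a stand-in for real whitening). */
static uint64_t fnv1a_mix(uint64_t hash, const void *data, size_t len)
{
    const unsigned char *bytes = data;
    for (size_t i = 0; i < len; i++) {
        hash ^= bytes[i];
        hash *= 1099511628211ULL;   /* FNV-1a 64-bit prime */
    }
    return hash;
}

int main(void)
{
    uint64_t seed = 14695981039346656037ULL;   /* FNV-1a offset basis */

    /* Weak, partially attacker-observable sources, mixed together. */
    time_t now = time(NULL);
    clock_t cpu = clock();
    pid_t pid = getpid();

    seed = fnv1a_mix(seed, &now, sizeof(now));
    seed = fnv1a_mix(seed, &cpu, sizeof(cpu));
    seed = fnv1a_mix(seed, &pid, sizeof(pid));

    printf("seed: %016llx\n", (unsigned long long)seed);
    return 0;
}
```

The point of the whitening step is that the output looks uniformly random even though each individual input has only a few bits of real entropy; it does nothing to add entropy that was never there, which is why the weak-source case is dangerous.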
This is a terrible take and you should really know better. It's not different than chastising people who use higher level programming languages or Dreamweaver to make a website instead of studying HTML.
I feel like you didn’t read past the quoted section before firing off a needlessly confrontational reply.
It is very different. If you give someone a low-code end-user programming environment, they have a tool that helps them to unambiguously express their intent. It lets them do so concisely, often more concisely than a general-purpose language (at the expense of generality), which empowers the user. This is a valuable thing to do.
We should all be able to agree that giving people a way to use natural language to build little apps, tools, and automations that solve problems nobody is going to build a custom solution for is a good thing.
No, I disagree with that. Give them a natural-language interface and you remove agency from them. The system, not the user, is responsible for filling in the blanks, and it does so in a way that does not let the user learn. Rather than the user using the tool badly and then improving as a result of their failures, the system fills in the blanks in arbitrary ways.
A natural-language interface and an easy-to-learn interface are not the same thing. There is enormous value in creating easy-to-learn interfaces that empower users but giving them interfaces that use natural language is not the best (or even a very good) way of doing this.
@cesarb @tthbaltazar @mjg59 Don’t confuse on-package TPMs and fTPMs. fTPMs, which run on the main core in a privileged mode, are often vulnerable to side channels. Several of the recent transient-execution attacks could leak fTPM secrets. I think most of these were patched by doing some aggressive state flushing on TPM events, but people keep finding new side channels. On-package TPMs, where the TPM is a separate component either in the same package or on the same die, are typically not vulnerable to these attacks. On the MS Surface laptops, there’s a Pluton subsystem on die, which runs the TPM stack. Pluton is one of the few Microsoft security products I have a lot of faith in (I worked with that team, they’re great): it stood up to over a decade of attacks from people with physical access and a strong financial incentive to break it.
@baltauger In linguistics, the Sapir-Whorf hypothesis, also known as the Linguistic Relativity hypothesis, argues that language constrains thought. This was the idea behind Orwell's Newspeak. The strong variant argues that you cannot think an idea that your language cannot express (the goal of Newspeak); the weak variant argues that language guides thought. The strong variant is largely discredited because it turns out that humans are really good at just making up new language for new concepts. The weak variant is supported to varying degrees.
I keep trying to persuade linguists to study it in the context of programming languages, where humans are limited in the things that they can extend because a compiler / interpreter also needs to understand the language. I think there are some very interesting research results to be found there.
@jonmsterling The right mental model for interacting with an LLM is to treat it like a person being tortured: It will say whatever is most likely to make you stop, the only trustworthy answers are ones that you can instantly validate.
A lot of the current hype around LLMs revolves around one core idea, which I blame on Star Trek:
Wouldn't it be cool if we could use natural language to control things?
The problem is that this is, at the fundamental level, a terrible idea.
There's a reason that mathematics doesn't use English. There's a reason that every professional field comes with its own flavour of jargon. There's a reason that contracts are written in legalese, not plain natural language. Natural language is really bad at being unambiguous.
When I was a small child, I thought that a mature civilisation would evolve two languages. A language of poetry, that was rich in metaphor and delighted in ambiguity, and a language of science that required more detail and actively avoided ambiguity. The latter would have no homophones, no homonyms, unambiguous grammar, and so on.
Programming languages, including the ad-hoc programming languages that we refer to as 'user interfaces', are all attempts to build languages like the latter. They allow the user to unambiguously express intent so that it can be carried out. Natural languages are not designed and end up being examples of the former.
When I interact with a tool, I want it to do what I tell it. If I am willing to restrict my use of natural language to a clear and unambiguous subset, I have defined a language that is easy for deterministic parsers to understand with a fraction of the energy requirement of a language model. If I am not, then I am expressing myself ambiguously and no amount of processing can possibly remove the ambiguity that is intrinsic in the source, except a complete, fully synchronised, model of my own mind that knows what I meant (and not what some other person saying the same thing at the same time might have meant).
The hard part of programming is not writing things in some language's syntax; it's expressing the problem in a way that lacks ambiguity. LLMs don't help here; they pick an arbitrary, nondeterministic option for the ambiguous cases. In C, compilers do this for undefined behaviour and it is widely regarded as a disaster. LLMs are built entirely out of undefined behaviour.
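A concrete example of what 'the compiler picks for you' looks like in C: signed integer overflow is undefined behaviour, so different compilers, or the same compiler at different optimisation levels, can legitimately give this code different results.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Signed overflow is undefined behaviour.  A compiler may assume it never
 * happens and fold this test to 0, or it may emit the 'obvious' wrapping
 * code and return 1.  Both are permitted; neither is something the
 * programmer unambiguously asked for.
 */
static int next_is_smaller(int x)
{
    return x + 1 < x;
}

int main(void)
{
    printf("%d\n", next_is_smaller(INT_MAX));
    return 0;
}
```

On a typical machine, an unoptimised build often prints 1 and an optimised build often prints 0: the ambiguity in the source becomes an arbitrary choice made on the programmer's behalf.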
There are use cases where getting it wrong is fine. Choosing a radio station or album to listen to while driving, for example. It is far better to sometimes listen to the wrong thing than to take your attention away from the road and interact with a richer UI for ten seconds. In situations where your hands are unavailable (for example, controlling non-critical equipment while performing surgery, or cooking), a natural-language interface is better than no interface. It's rarely, if ever, the best.
@dalias @brokengoose @LPerry2 @josh0 It's amazingly difficult to find good book data. Most things use Amazon's database (with all of the tracking that comes with) because everything else is so much worse.
Publishers seem to treat their slice of the ISBN database as something to sell, rather than something that, as part of the commons, would increase the value of the books that they sell. This means any kind of mapping between ISBNs and books is hard (and it's a many-to-many relationship since an ISBN identifies a print volume, which may be a single edition of a book or an omnibus edition that includes multiple logical books). Building any kind of meaningful ontology on top of this is really hard. Wikidata tries but is missing a lot of things.
LibraryThing provided services to libraries before being bought by Amazon, but their data is really bad. Lots of books seem to have been entered by using computer vision on the cover, so the title fields include every word on the cover, such as 'the new novel in the X series' and so on. So much value is lost to society by there being no maintained database for this. I suspect the amount that half a dozen libraries pay as a result of it not existing could completely fund its development and maintenance.
@futurebird @servelan @fivetonsflax @nazokiyoubinbou @justafrog @clayote I’ve been actively avoiding Amazon for about ten years. I first realised that they were not the cheap option when I bought some garden furniture 15 years ago. I discovered that the same seller sold the four-seat version for the price I paid them for the two-seat version on Amazon. After that, I started using Amazon just for discovery: find the thing I want to buy there and then find the place I actually buy it from elsewhere. Often, searching for the product name and seller from Amazon will take you to another shop front that charges less because it isn’t giving Amazon a cut.
For books, even Hive, which supports local book sellers, is cheaper.
Niche things that used to be Amazon-only are now often sold through eBay as well (eBay seems to have become more of a generic shop front and less of a second-hand auction site now).
@lanodan @carbontwelve Spam filtering has been a good application for machine learning for ages. I think the first Bayesian spam filters were added around the end of the last century. It has several properties that make it a good fit for ML: the occasional misclassification is an annoyance rather than a disaster, users generate a constant stream of labelled examples of both spam and non-spam just by marking messages, and the input changes quickly enough that a classifier that adapts rapidly beats hand-maintained rules.
Note that this is not the same for intrusion detection and a lot of ML-based approaches for intrusion detection have failed. It is bad if you miss a compromise and you don’t have enough examples of malicious and non-malicious data for your categoriser to adapt rapidly.
The last point is part of why it worked well in my use case and was great for Project Silica when I was at MS. They were burning voxels into glass with lasers and then recovering the data. With a small calibration step (burn a load of known-value voxels into a corner of the glass) they could build an ML classifier that worked on any set of laser parameters. It might not have worked quite as well as a well-tuned rule-based system, but they could do experiments as fast as the laser could fire with the ML approach, whereas a rule-based system needed someone to classify the voxel shapes and redo the implementation, which took at least a week. That was a huge benefit. Their data included error-correction codes, so as long as their model was mostly right, ECC would fix the rest.
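To make the Bayesian spam-filtering idea above concrete, here is a minimal sketch in the spirit of those late-'90s filters. All of the words, counts, and messages are invented for illustration; a real filter learns its counts from the mail that users mark as spam or not spam, and would do proper tokenisation rather than naive substring matching.

```c
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Toy Bayesian spam filter: score a message by summing per-word
 * log-likelihood ratios learned from (invented) training counts. */
#define VOCAB 6

static const char *words[VOCAB] = {
    "free", "winner", "meeting", "invoice", "viagra", "lunch"
};

/* How many training messages of each class contained each word (made up). */
static const double spam_count[VOCAB] = { 40, 35, 2, 5, 30, 1 };
static const double ham_count[VOCAB]  = { 5, 1, 30, 25, 0, 20 };
static const double spam_total = 100, ham_total = 100;

/* Log-odds that a message is spam, with add-one smoothing so a word
 * never seen in one class doesn't force the score to +/- infinity. */
static double spam_score(const char *message)
{
    double score = log(spam_total / ham_total);    /* prior odds */
    for (int i = 0; i < VOCAB; i++) {
        if (strstr(message, words[i]) != NULL) {   /* crude tokenisation */
            double p_spam = (spam_count[i] + 1) / (spam_total + 2);
            double p_ham = (ham_count[i] + 1) / (ham_total + 2);
            score += log(p_spam / p_ham);
        }
    }
    return score;
}

int main(void)
{
    const char *msgs[] = {
        "free viagra winner",
        "lunch meeting to discuss the invoice",
    };
    for (int i = 0; i < 2; i++) {
        printf("%-40s %s\n", msgs[i],
               spam_score(msgs[i]) > 0 ? "spam" : "ham");
    }
    return 0;
}
```

The useful property for the argument above is that updating the counts is cheap and automatic: every user action produces new training data, so the classifier keeps up with changing input without anyone rewriting rules.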
@iximeow I suspect that a big part of the 'computer people' vs 'not computer people' split is similar to the experience of eating capsicum. When you eat capsicum, it causes pain and your body then creates dopamine to counter the pain. For a lot of people, the dopamine effect is greater than the pain and so the overall experience is pleasant (there was a fascinating experiment a few years ago that fed people chillies and blocked the dopamine response: universally, everyone hated the taste of chillies, even people who loved them normally).
Everyone gets frustrated by computers doing the wrong thing for bizarre reasons (which may be a simple misnamed thing), but some people really enjoy the experience that you get after you've found and fixed the problem. Whether that joy outweighs the suffering varies a lot between people.
I miss the time when progressive tax rates meant that wealth accumulation plateaued well before individuals could afford to buy small countries. I don’t miss the fact that a small number of descendants of kings already had that much. I do miss the memory of Cromwell and Robespierre being front and centre of their minds if they tried to abuse that power.
I miss Labour and Civil Rights movements constantly gaining ground and the expectation that, even if things weren’t great for everyone now, they would be slightly better each year. I don’t miss the various forms of oppression that they fought against.
I miss corporations being accountable to governments. I don’t miss corporations like the East India Companies being de-facto governments.
I miss technology being exciting and each new advance making the world better. I don’t miss nuclear first-strike doctrine being acceptable military policy.
I miss the Geneva Convention being a thing that major powers took seriously, prosecuting war criminals who violated it. I don’t miss all of the atrocities that led to people deciding it was necessary.
I am nostalgic for a small set of slices of the last 200 years of history, which never happened at the same time.
David Chisnall (*Now with 50% more sarcasm!*)
I am Director of System Architecture at SCI Semiconductor and a Visiting Researcher at the University of Cambridge Computer Laboratory. I remain actively involved in the #CHERI project, where I led the early language / compiler strand of the research, and am the maintainer of the #CHERIoT Platform. I was on the FreeBSD Core Team for two terms, have been an LLVM developer since 2008, am the author of the GNUstep Objective-C runtime (libobjc2 and associated clang support), and am responsible for libcxxrt and the BSD-licensed device tree compiler. Opinions expressed by me are not necessarily opinions. In all probability they are random ramblings and should be ignored. Failure to ignore may result in severe boredom and / or confusion. Shake well before opening. Keep refrigerated. Warning: May contain greater than the recommended daily allowance of sarcasm.