lain (lain@fediffusion.art)'s status on Thursday, 20-Jun-2024 21:22:11 JST
> Conversational large language models are fine-tuned for both instruction-following and safety, resulting in models that obey benign requests but refuse harmful ones. While this refusal behavior is widespread across chat models, its underlying mechanisms remain poorly understood. In this work, we show that refusal is mediated by a one-dimensional subspace, across 13 popular open-source chat models up to 72B parameters in size. Specifically, for each model, we find a single direction such that erasing this direction from the model's residual stream activations prevents it from refusing harmful instructions, while adding this direction elicits refusal on even harmless instructions.
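To make "erasing" and "adding" a direction concrete, here is a minimal sketch. The quoted abstract does not spell out the arithmetic, so this assumes the usual reading: erasing means projecting the activation onto the subspace orthogonal to the direction, and adding means a plain vector sum. The function names, the toy 3-dimensional vectors, and the `scale` parameter are all made up for illustration; real residual stream activations have thousands of dimensions, and the intervention would be applied inside the model rather than on a bare list.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def ablate_direction(activation, direction):
    """Remove the component of `activation` along `direction`.

    Assumed reading of "erasing": x' = x - (x . r_hat) * r_hat.
    """
    r_hat = normalize(direction)
    coeff = dot(activation, r_hat)
    return [x - coeff * r for x, r in zip(activation, r_hat)]


def add_direction(activation, direction, scale=1.0):
    """Push `activation` along `direction`.

    Assumed reading of "adding": x' = x + scale * r.
    `scale` is a hypothetical knob, not something the quote mentions.
    """
    return [x + scale * d for x, d in zip(activation, direction)]


# Toy example: the "refusal direction" is the first axis.
activation = [3.0, 4.0, 0.0]
refusal_dir = [1.0, 0.0, 0.0]

ablated = ablate_direction(activation, refusal_dir)
steered = add_direction(activation, refusal_dir, scale=2.0)
```

After ablation the activation has no component left along the chosen direction, while everything orthogonal to it is untouched. That is the sense in which a single direction can be switched off without otherwise disturbing the representation.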
lain (lain@fediffusion.art)'s status on Friday, 17-May-2024 22:23:18 JST
I have noticed that, somewhat contrary to what I would have expected, religious / spiritual people have few problems with LLMs (in fact, they put out a lot of 'discussions with ChatGPT about faith' podcasts), while the people who are deathly afraid of them and think they will be the downfall of society are overwhelmingly materialist progressive types. I have some ideas about this, but for now it's just an observation.
Whoops! The new year happened and I missed the end of the vote!
We have three winners this time, in a three-way tie at 6 votes each: @guizzy, @kaiaskutes and @Elliptica. Congratulations! Thank you all for participating, and let's get some good generations going in 2024!
lain (lain@fediffusion.art)'s status on Wednesday, 20-Dec-2023 07:36:17 JST
> I expect the only people who are nonplussed by the power of LLMs are those with a soft spot for occultism of some sort—those who think words are magical. Let me explain.
> Let me repeat: there is so much abstract structure in our language—the patterns are so overwhelmingly clear, consistent, and objective—that by mindlessly figuring out the probability of one symbol following another, a machine can effectively reason better than the average person for a large number of cases.
Once again we're doing a week-long AI image creation contest! This time the topic is:
STORY ILLUSTRATIONS
Ever read a story and imagined what the scene would look like? Well, now you can show it to all of us! Pick a scene from any story or novel you like and create an image of it. Please tell us which story you are taking inspiration from!
The voting will start one week from now, so get your entries in before that.
Here's an example: A scene from the Yasutaka Tsutsui story "Standing Woman".