Conversation

BedastGPT (bedast@beige.party)'s status on Friday, 10-Jan-2025 12:40:37 JST BedastGPT

The enshittification of AI has led to people groaning at VLC's choice to use AI. I even saw a post cross my feed from someone looking for a replacement for VLC.
VLC is working on on-device realtime captioning. This has nothing to do with generating images or video using AI. This has nothing to do with LLMs.
This is not generative AI.
While human-generated captions would be preferred for better accuracy, this is not always possible. This means a lot of video media is inaccessible to those with hearing impairments.
What VLC is doing is something that will contribute to accessibility in a big way.
AI transcription is still not perfect. It has its problems. But this is one of those things that we should be hoping to advance.
I'm not looking to replace humans in creating captions. I think we're very far from ever being able to do this correctly without humans. But as I said, there's a ton of video content that simply does not have captions available, human-generated or not.
So long as they're not trying to manipulate the transcription using GenAI means, this is the wrong one to demonize.
#AI #Transcription #VLC #HearingImpaired #Deaf #Accessibility
In conversation about 4 months ago from beige.party permalink
- Haelwenn /элвэн/ :triskell:, djsumdog and MortSinyx like this.
- Rich Felker repeated this.
- BedastGPT (bedast@beige.party)'s status on Friday, 10-Jan-2025 20:39:31 JST BedastGPT
  in reply to
  - Hops the sausage dog
  @howisyourdog I'm not a Firefox user so I haven't really dug into the latest in being upset with Firefox making an AI plugin, but it seemed like they were making an LLM to summarize pages. These have been known to get things very wrong. I don’t know if it's on-device or if it uses ChatGPT.
  
  In conversation about 4 months ago permalink
- Hops the sausage dog (howisyourdog@cupoftea.social)'s status on Friday, 10-Jan-2025 20:39:36 JST Hops the sausage dog
  in reply to
  
  @bedast I would also add I find it quite helpful to start with a set of automatically generated captions, and then correct them. I don't do this often, but it saves me loads of time in a part-time job.
  Is this a bit like people being annoyed at Mozilla using AI for on-device browser translation, even though that's very useful? I'm not sure if that's generative, but I'd guess not.
  In conversation about 4 months ago permalink
- Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 21:05:52 JST Rich Felker
  in reply to
  
  @bedast Then don't call it AI. Call it speech to text. But if it uses a language model to more effectively predict words based on context rather than doing an analyzable mechanical local transformation, it is at least partly the "bad kind of AI" - it has the capacity to introduce biases from training data making output that "sounds right" but means the wrong thing, which is much worse than substituting nonsensical homophones now and then (which the reader will immediately recognize as mistakes). Same principle as why autocorrected text is worse than text with typos.
  
  In conversation about 4 months ago permalink
  
  tinydoctor repeated this.
- Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 21:18:48 JST Rich Felker
  in reply to
  
  @bedast Enthusiastically calling new functionality "AI" signals to your audience that you're aligned with the scams and makes them distrust you.
  This is not hard.
  If you have privacy respecting, on-device, non-plagiarized, ethically built statistical model based processing, DON'T CALL IT "AI".
  
  In conversation about 4 months ago permalink
- Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 22:14:07 JST Rich Felker
  in reply to
  - A.V.
  @varavs @bedast Then don't call it "AI".
  But also, question what harms are coming out of the predictive models. The more they force the output to sound natural and fix misrecognitions, the greater the chance they're altering meaning. Same as autocorrect vs typed text with typos and misspellings.
  
  In conversation about 4 months ago permalink
- A.V. (varavs@sigmoid.social)'s status on Friday, 10-Jan-2025 22:14:08 JST A.V.
  in reply to
  - Rich Felker
  @dalias @bedast Speech recognition has used language models for decades now. It was one of the original applications of language models, way before they scaled up to aping Shakespeare.
  But even without language models, the act of transcription is very close to generative AI, as it's the task of predicting the next text token, given the previous tokens and the encoded audio sequence.
  
  In conversation about 4 months ago permalink
- Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 22:16:20 JST Rich Felker
  in reply to
  - A.V.
  @varavs @bedast Also ask if the model is ethically and legally sound. Was it produced from professional training material with compatible license terms? Or stolen from millions of movies or YouTube videos?
  
  In conversation about 4 months ago permalink
- 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Friday, 10-Jan-2025 22:18:29 JST 翠星石
  in reply to
  
  @bedast The problem is such subtitling software is not free software, as I really doubt the license of the training works was followed (most training is done on creative works with no license) and there is no complete source code - just an object code form that nobody understands how it really works, thus such software doesn't have the 4 freedoms; https://www.gnu.org/philosophy/free-sw.en.html#four-freedoms
  
  Are you really confident such nonfree software is compatible with VLC's licenses?
  
  When I watch a video, I will never be content with slop subtitles - handcrafted .ass's is what I need.
  In conversation about 4 months ago permalink
- Rich Felker (dalias@hachyderm.io)'s status on Saturday, 11-Jan-2025 12:37:19 JST Rich Felker
  in reply to
  - LisPi
  @lispi314 @bedast It started showing diminishing returns when researchers figured out you could churn out degrees/products without needing any new ideas, just by throwing machine learning at the problem and ignoring all the potential harms from that.
  
  In conversation about 4 months ago permalink
- LisPi (lispi314@udongein.xyz)'s status on Saturday, 11-Jan-2025 12:37:20 JST LisPi
  in reply to
  - Rich Felker
  @dalias @bedast Didn't mathematical/rule-based language modeling start showing massively diminishing returns back like... two~three decades ago or is my information wrong?
  
  As far as I'm aware it would be preferable to start from a rule-based language, and then be able to specifically train a small model on a different captioned sample set of the speaker(s) to eliminate its flakiness.
  
  In conversation about 4 months ago permalink
- Paul Sutton (zleap@qoto.org)'s status on Saturday, 11-Jan-2025 17:10:29 JST Paul Sutton
  in reply to
  
  @bedast
  Sounds like a good idea to me: the tool can take a video and create captions. Your comment about humans being more accurate is also good, as surely once those captions have been created, a human can go through them. I would assume captions are stored in an external file; if this can be edited, then the human's job would be simply to edit the file and correct any minor errors.
  Any tool that can make life a little easier is surely welcome. Perhaps the important point, though, is also transparency: if you have used a tool to transcribe, this should be clearly stated, so people know how the captions have been generated.
  
  In conversation about 4 months ago permalink
- SuperDicq (superdicq@minidisc.tokyo)'s status on Sunday, 12-Jan-2025 00:00:44 JST SuperDicq
  in reply to
  
  @bedast@beige.party If it runs on the user's local device and is free software I'm all for it.
  
  In conversation about 4 months ago permalink
- NotAlexNoyle 🌻 (notalexnoyle@union.place)'s status on Sunday, 12-Jan-2025 17:35:29 JST NotAlexNoyle 🌻
  - Infoseepage
  - Hops the sausage dog
  @Infoseepage @bedast @howisyourdog it is open source software. You don't have to trust it. Go read the code
  
  In conversation about 4 months ago permalink
- Aral Balkan (aral@mastodon.ar.al)'s status on Sunday, 12-Jan-2025 19:17:17 JST Aral Balkan
  in reply to
  
  @bedast Here’s one way this could have been avoided: don’t use Silicon Valley’s jargon to begin with.
  I wonder how many people would have complained if they’d simply called it “automated captions” or even “automated on-device captions” instead of announcing it like this.
  In conversation about 4 months ago permalink
  Attachments
  1. VLC’s demo at CES titled “VLC AI Subtitles” in large letters.
    https://s3-eu-central-1.amazonaws.com/mastodon-aral/media_attachments/files/113/814/860/671/502/580/original/bde69f3fba234e3c.jpeg
