GNU social JP
GNU social JP is a Japanese GNU social server.
Conversation

Notices

  1. BedastGPT (bedast@beige.party)'s status on Friday, 10-Jan-2025 12:40:37 JST

    The enshittification of AI has led people to groan at VLC's choice to use AI. I even saw a post cross my feed from someone looking for a replacement for VLC.

    VLC is working on on-device realtime captioning. This has nothing to do with generating images or video using AI. This has nothing to do with LLMs.

    This is not generative AI.

    While it would be preferred to use human generated captions for better accuracy, this is not always possible. This means a lot of video media is inaccessible to those with hearing impairment.

    What VLC is doing is something that will contribute to accessibility in a big way.

    AI transcription is still not perfect. It has its problems. But this is one of those things that we should be hoping to advance.

    I'm not looking to replace humans in creating captions. I think we're very far from ever being able to do this correctly without humans. But as I said, there's a ton of video content that simply does not have captions available, human-generated or not.

    So long as they're not trying to manipulate the transcription using GenAI means, this is the wrong one to demonize.

    #AI #Transcription #VLC #HearingImpaired #Deaf #Accessibility

    • Haelwenn /элвэн/ :triskell:, djsumdog and MortSinyx like this.
    • Rich Felker repeated this.
    • BedastGPT (bedast@beige.party)'s status on Friday, 10-Jan-2025 20:39:31 JST
      in reply to
      • Hops the sausage dog

      @howisyourdog I'm not a Firefox user, so I haven't really dug into the latest upset over Firefox making an AI plugin, but it seemed like they were making an LLM to summarize pages. These have been known to get things very wrong. I don't know if it's on-device or if it uses ChatGPT.

    • Hops the sausage dog (howisyourdog@cupoftea.social)'s status on Friday, 10-Jan-2025 20:39:36 JST
      in reply to

      @bedast I would also add I find it quite helpful to start with a set of automatically generated captions, and then correct them. I don't do this often, but it saves me loads of time in a part-time job.

      Is this a bit like people being annoyed at Mozilla using AI for on-device browser translation, even though that's very useful? I'm not sure if that's generative, but I'd guess not.

    • Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 21:05:52 JST
      in reply to

      @bedast Then don't call it AI. Call it speech to text. But if it uses a language model to more effectively predict words based on context rather than doing an analyzable mechanical local transformation, it is at least partly the "bad kind of AI" - it has the capacity to introduce biases from training data making output that "sounds right" but means the wrong thing, which is much worse than substituting nonsensical homophones now and then (which the reader will immediately recognize as mistakes). Same principle as why autocorrected text is worse than text with typos.

      tinydoctor repeated this.
    • Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 21:18:48 JST
      in reply to

      @bedast Enthusiastically calling new functionality "AI" signals to your audience that you're aligned with the scams and makes them distrust you.

      This is not hard.

      If you have privacy respecting, on-device, non-plagiarized, ethically built statistical model based processing, DON'T CALL IT "AI".

    • Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 22:14:07 JST
      in reply to
      • A.V.

      @varavs @bedast Then don't call it "AI".

      But also, question what harms are coming out of the predictive models. The more they force the output to sound natural and fix misrecognitions, the greater the chance they're altering meaning. Same as autocorrect vs typed text with typos and misspellings.

    • A.V. (varavs@sigmoid.social)'s status on Friday, 10-Jan-2025 22:14:08 JST
      in reply to
      • Rich Felker

      @dalias @bedast Speech recognition has used language models for decades now. It was one of the original applications of language models, way before they scaled up to aping Shakespeare.

      But even without language models, the act of transcription is very close to generative AI, as it's the task of predicting the next text token, given the previous tokens and the encoded audio sequence.
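
As a rough illustration of the point above (transcription as predicting the next text token from the previous tokens plus the encoded audio), here is a minimal Python sketch of greedy autoregressive decoding. Everything named here is hypothetical: `greedy_transcribe`, `toy_scores`, and the three-entry vocabulary are stand-ins for illustration, and the "audio" is just a list of integers rather than a real encoded signal. It is not how VLC or any particular speech recognizer is implemented.

```python
from typing import Callable, Dict, List, Sequence

EOS = "<eos>"  # hypothetical end-of-sequence marker
VOCAB = ["hello", "world", EOS]  # toy vocabulary, assumed for this sketch


def greedy_transcribe(
    audio_features: Sequence[int],
    next_token_scores: Callable[[Sequence[int], List[str]], Dict[str, float]],
    max_tokens: int = 20,
) -> List[str]:
    """Emit tokens one at a time, each conditioned on the audio and on
    the tokens emitted so far, always taking the highest-scoring token."""
    tokens: List[str] = []
    for _ in range(max_tokens):
        scores = next_token_scores(audio_features, tokens)
        best = max(scores, key=lambda tok: scores[tok])
        if best == EOS:
            break
        tokens.append(best)
    return tokens


def toy_scores(audio_features: Sequence[int], previous_tokens: List[str]) -> Dict[str, float]:
    """Stand-in for a trained model: each 'audio frame' is simply the
    vocabulary index of the word that was spoken at that position."""
    position = len(previous_tokens)
    if position < len(audio_features):
        target = VOCAB[audio_features[position]]
    else:
        target = EOS
    return {tok: (1.0 if tok == target else 0.0) for tok in VOCAB}


transcript = greedy_transcribe([0, 1], toy_scores)
```

In a real system the scoring function would be a learned model, and that is where the concerns raised elsewhere in this thread apply: the more weight the scores give to the previous tokens relative to the audio, the more the output can drift toward what "sounds right".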

    • Rich Felker (dalias@hachyderm.io)'s status on Friday, 10-Jan-2025 22:16:20 JST
      in reply to
      • A.V.

      @varavs @bedast Also ask if the model is ethically and legally sound. Was it produced from professional training material with compatible license terms? Or stolen from millions of movies or YouTube videos?

    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Friday, 10-Jan-2025 22:18:29 JST
      in reply to
      @bedast The problem is such subtitling software is not free software, as I really doubt the license of the training works was followed (most training is done on creative works with no license) and there is no complete source code - just an object code form that nobody understands how it really works, thus such software doesn't have the 4 freedoms; https://www.gnu.org/philosophy/free-sw.en.html#four-freedoms

      Are you really confident such nonfree software is compatible with VLC's licenses?


      When I watch a video, I will never be content with slop subtitles - handcrafted .ass's is what I need.
    • Rich Felker (dalias@hachyderm.io)'s status on Saturday, 11-Jan-2025 12:37:19 JST
      in reply to
      • LisPi

      @lispi314 @bedast It started showing diminishing returns when researchers figured out you could churn out degrees/products without needing any new ideas, just by throwing machine learning at the problem and ignoring all the potential harms from that.

    • LisPi (lispi314@udongein.xyz)'s status on Saturday, 11-Jan-2025 12:37:20 JST
      in reply to
      • Rich Felker
      @dalias @bedast Didn't mathematical/rule-based language modeling start showing massively diminishing returns back like... two to three decades ago, or is my information wrong?

      As far as I'm aware it would be preferable to start from a rule-based language model, and then be able to specifically train a small model on a different captioned sample set of the speaker(s) to eliminate its flakiness.
    • Paul Sutton (zleap@qoto.org)'s status on Saturday, 11-Jan-2025 17:10:29 JST
      in reply to

      @bedast

      Sounds like a good idea to me: the tool can take a video and create captions. Your comment about humans being more accurate is also good, as surely once those captions have been created, a human can go through them. I would assume captions are stored in an external file; if this can be edited, then the human job would be to simply edit the file and correct any minor errors.

      Any tool that can make life a little easier is surely welcome. Perhaps the important point, though, is also transparency: if you have used a tool to transcribe, this should be clearly stated, so people know how the captions have been generated.
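
The "generate, then have a human correct the file" workflow described above could look roughly like the following Python sketch. It assumes the captions are stored as a SubRip (`.srt`) style text file, which the thread does not confirm for VLC; the helper names `parse_srt`, `apply_corrections`, and `format_srt` are hypothetical and written only for this illustration.

```python
import re
from typing import Dict, List, Tuple

# One caption cue: (index, timing line, caption text)
Cue = Tuple[int, str, str]


def parse_srt(content: str) -> List[Cue]:
    """Split SRT-style text into cues; blocks are separated by blank lines."""
    cues: List[Cue] = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed or empty blocks
        cues.append((int(lines[0]), lines[1], "\n".join(lines[2:])))
    return cues


def apply_corrections(cues: List[Cue], corrections: Dict[int, str]) -> List[Cue]:
    """Replace the text of any cue whose index a human reviewer has corrected."""
    return [(idx, timing, corrections.get(idx, text)) for idx, timing, text in cues]


def format_srt(cues: List[Cue]) -> str:
    """Serialize cues back to SRT-style text."""
    return "\n\n".join(f"{idx}\n{timing}\n{text}" for idx, timing, text in cues) + "\n"


# Assumed machine output containing a homophone error in the first cue
machine_captions = (
    "1\n00:00:01,000 --> 00:00:03,000\nHello whirled\n\n"
    "2\n00:00:04,000 --> 00:00:06,000\nSecond line\n"
)
reviewed = format_srt(apply_corrections(parse_srt(machine_captions), {1: "Hello world"}))
```

Only the text of the corrected cue changes; the timings produced by the tool are kept as they were. How to disclose that the captions were machine-generated is left out here, since that depends on where the captions are published.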

    • SuperDicq (superdicq@minidisc.tokyo)'s status on Sunday, 12-Jan-2025 00:00:44 JST
      in reply to

      @bedast@beige.party If it runs on the user's local device and is free software I'm all for it.

    • NotAlexNoyle 🌻 (notalexnoyle@union.place)'s status on Sunday, 12-Jan-2025 17:35:29 JST
      • Infoseepage
      • Hops the sausage dog

      @Infoseepage @bedast @howisyourdog it is open source software. You don't have to trust it. Go read the code

    • Aral Balkan (aral@mastodon.ar.al)'s status on Sunday, 12-Jan-2025 19:17:17 JST
      in reply to

      @bedast Here’s one way this could have been avoided: don’t use Silicon Valley’s jargon to begin with.

      I wonder how many people would have complained if they’d simply called it “automated captions” or even “automated on-device captions” instead of announcing it like this.


      Attachments


      1. https://s3-eu-central-1.amazonaws.com/mastodon-aral/media_attachments/files/113/814/860/671/502/580/original/bde69f3fba234e3c.jpeg

GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.