GNU social JP
GNU social JP is a Japanese GNU social server.

Conversation

Notices

  1. Eaton (eaton@phire.place)'s status on Tuesday, 11-Jun-2024 23:16:51 JST

    One of the signs you're Not Well is that putting a 'reading list' section on your blog starts with building a rotating-proxy Amazon scraper and an NLP-powered metadata scrubbing engine that fixes common problems: Edition names jammed onto the end of Publisher names, publication dates appended to Imprint names, subtitles that are actually author lists, etc.

    But, like… what am I supposed to do? Let BAD DATA just STAY BAD?

    *wild-eyed, twitchy stare*

    • Eaton (eaton@phire.place)'s status on Tuesday, 11-Jun-2024 23:21:41 JST

      Structured book metadata is less of a contract and more of a puzzle game, in which you try to figure out whether 'PENGUIN' appearing in the Publisher, Format, and Edition fields of a book about penguins is a sign that Penguin/Random House published it, that the metadata is duplicated, or that someone’s enthusiastic 5-year-old got ahold of the keyboard on release day.

    • Mark Llobrera (markllobrera@mastodon.social)'s status on Wednesday, 12-Jun-2024 01:32:03 JST

      @eaton It’s such a headache with my own reading log, and I view any API-pulled data as a mere starting point that needs human intervention.

    • Eaton (eaton@phire.place)'s status on Wednesday, 12-Jun-2024 01:32:03 JST, in reply to Mark Llobrera

      @markllobrera 100%. I've got things in ... DECENT shape, though not so good that I’d be comfortable doing (say) auto-generation of author-name index pages.


GNU social JP is a social network, courtesy of GNU social JP管理人. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.