GNU social JP
GNU social JP is a Japanese GNU social server.

    Colin Gordon (csgordon@discuss.systems)'s status on Wednesday, 14-May-2025 11:25:24 JST

    When you submit a paper to an ACM journal, it gets run through Turnitin (yes, really), and the editors-in-chief have to look at the report and decide if there are plagiarism concerns. Most submissions have a small percentage (~5%) of verbatim-matching text, from a wide variety of sources. The matches are usually small turns of phrase, technical phrases, affiliations, or ACM copyright text 😛 The exceptions are generally extended versions of conference papers, where obviously large chunks of the extension match the original publication.

    But recently I've noticed an uptick, so far only in the wildly out-of-scope papers that get desk rejected (mostly papers about using LLMs for NLP), in how much of a paper's text gets flagged as matching: a high percentage (~30%), still from a wide variety of sources, but in much larger chunks. A long phrase from here, most of a sentence from there, etc., from very scattered sources across far-ranging fields. This seems unlikely to come from authors picking up phrases they like from papers they actually encountered. I can't help but think these papers have a high fraction of LLM-generated text, that LLMs writing on similar topics tend to produce many of the same phrases and sentences in aggregate, and that these patterns are now getting picked up by traditional plagiarism checkers because there's so much LLM-generated text in the world.

    In conversation, about a month ago, from discuss.systems

GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.