GNU social JP
GNU social JP is a Japanese GNU social server.

    Pixelcode 🇺🇦 (pixelcode@social.tchncs.de)'s status on Monday, 18-Sep-2023 00:23:37 JST
    in reply to
    • Aral Balkan
    • :thilo:
    • Adrian Roselli
    • Kai Klostermann

    @aardrian @odddev @thilo @aral

    My approach would be this:

    If the post contains any non-Latin characters, tokenise it. For each non-Latin token, use a predefined hash table to rewrite each symbol to its Latin equivalent (if there is one). If the result is a purely Latin token, lemmatise it to check whether it is an existing word in the post's language. If so, read out that natural word instead of the non-Latin token.
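
The steps described above could be sketched roughly as follows in Python. This is only an illustration under loud assumptions: the `LOOKALIKES` table, the `KNOWN_LEMMAS` word list, the toy `lemmatise` function, and the name `readable_text` are all hypothetical stand-ins invented for this example. A real implementation would need a much larger mapping table and a proper lemmatiser and dictionary for the post's language.

```python
import re
import unicodedata

# Hypothetical lookalike table (illustrative only, far from complete).
LOOKALIKES = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
    "\u03bf": "o",  # Greek small omicron
}

# Hypothetical stand-in for a dictionary of the post's language.
KNOWN_LEMMAS = {"hello", "cat", "scope"}


def lemmatise(word):
    """Toy lemmatiser (assumption): lower-case and strip a plural 's'."""
    word = word.lower()
    if word not in KNOWN_LEMMAS and word.endswith("s") and word[:-1] in KNOWN_LEMMAS:
        return word[:-1]
    return word


def is_latin_letter(ch):
    """True if the Unicode character name marks ch as a Latin letter."""
    return unicodedata.name(ch, "").startswith("LATIN")


def readable_text(text):
    # Step 1: skip all work if every letter is already Latin.
    if all(is_latin_letter(ch) for ch in text if ch.isalpha()):
        return text
    # Step 2: tokenise into letter runs and everything else, losing nothing.
    parts = re.findall(r"[^\W\d_]+|[\W\d_]+", text)
    out = []
    for part in parts:
        if part[0].isalpha() and not all(is_latin_letter(c) for c in part):
            # Step 3: rewrite each non-Latin symbol via the hash table.
            candidate = "".join(
                c if is_latin_letter(c) else LOOKALIKES.get(c, c) for c in part
            )
            # Step 4: accept only purely Latin results that are known words.
            if all(is_latin_letter(c) for c in candidate) and lemmatise(candidate) in KNOWN_LEMMAS:
                out.append(candidate)
                continue
        out.append(part)  # leave genuine non-Latin words untouched
    return "".join(out)


# "hello cats" written with Cyrillic lookalikes for e, o and a.
print(readable_text("h\u0435ll\u043e c\u0430ts"))  # hello cats
```

A genuine Cyrillic word such as "привет" is left unchanged in this sketch, because not all of its letters have an entry in the table and the rewritten token is therefore not purely Latin.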


GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.