    David Chisnall (*Now with 50% more sarcasm!*) (david_chisnall@infosec.exchange)'s status on Monday, 23-Dec-2024 02:03:46 JST
    in reply to
    • Haelwenn /элвэн/ :triskell:
    • Mx Autumn :blobcatpumpkin:

    @lanodan @carbontwelve Spam filtering has been a good application for machine learning for ages. I think the first Bayesian spam filters were added around the end of the last century. It has several properties that make it a good fit for ML:

    • The cost of letting spam through is low, while the value of filtering most of it correctly is high.
    • There isn’t a rule-based approach that works well. You can’t write a list of properties that make something spam. You can write a list of properties that indicate something has a higher chance of being spam.
    • The problem changes rapidly. Spammers change their tactics depending on what gets through the filters, so a system that adapts on the defence works well, and you have a lot of ham-versus-spam data to drive that adaptation.
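
    As a rough illustration, here is a minimal sketch in Python of the kind of Bayesian word-frequency spam filter mentioned above; the toy corpus, the Laplace smoothing constant, and the function names are invented for the example rather than taken from any real filter.

    ```python
    from collections import Counter
    import math

    def train(messages):
        """Count, per class, how many messages each word appears in.

        `messages` is an iterable of (text, is_spam) pairs.
        """
        counts = {"spam": Counter(), "ham": Counter()}
        totals = {"spam": 0, "ham": 0}
        for text, is_spam in messages:
            label = "spam" if is_spam else "ham"
            totals[label] += 1
            counts[label].update(set(text.lower().split()))
        return counts, totals

    def spam_probability(text, counts, totals, k=1.0):
        """Combine per-word likelihoods under a naive independence assumption.

        Laplace smoothing with constant k keeps unseen words from forcing
        the estimate to exactly 0 or 1.
        """
        log_spam = math.log(totals["spam"] / (totals["spam"] + totals["ham"]))
        log_ham = math.log(totals["ham"] / (totals["spam"] + totals["ham"]))
        for word in set(text.lower().split()):
            log_spam += math.log((counts["spam"][word] + k) / (totals["spam"] + 2 * k))
            log_ham += math.log((counts["ham"][word] + k) / (totals["ham"] + 2 * k))
        return 1.0 / (1.0 + math.exp(log_ham - log_spam))

    # Toy corpus; retraining on fresh ham/spam is what lets the filter adapt
    # as spammers change tactics.
    corpus = [
        ("cheap pills buy now", True),
        ("limited offer buy cheap", True),
        ("meeting notes attached", False),
        ("lunch tomorrow?", False),
    ]
    counts, totals = train(corpus)
    print(spam_probability("buy cheap pills now", counts, totals))    # high
    print(spam_probability("notes from the meeting", counts, totals)) # low
    ```

    Retraining on a fresh stream of ham and spam is what provides the rapid adaptation the list above calls for.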

    Note that the same is not true for intrusion detection, and a lot of ML-based approaches to intrusion detection have failed. Missing a compromise is costly, and you don’t have enough examples of malicious and non-malicious data for your categoriser to adapt rapidly.

    The last point is part of why it worked well in my use case and was great for Project Silica when I was at MS. They were burning voxels into glass with lasers and then recovering the data. With a small calibration step (burn a load of known-value voxels into a corner of the glass) they could build an ML classifier that worked on any set of laser parameters. It might not have worked quite as well as a well-tuned rule-based system, but they could do experiments as fast as the laser could fire with the ML approach, whereas a rule-based system needed someone to classify the voxel shapes and redo the implementation, which took at least a week. That was a huge benefit. Their data included error-correction codes, so as long as their model was mostly right, ECC would fix the rest.
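
    As a purely hypothetical sketch of the calibration-then-classify workflow described above (the real Project Silica pipeline is not reproduced here), the following uses a nearest-centroid classifier as a stand-in: known-value calibration voxels are read back to estimate a centroid per symbol, data voxels are assigned to the nearest centroid, and residual symbol errors are left for the error-correction codes to fix. The feature count, noise model, and function names are all invented for the illustration.

    ```python
    import numpy as np

    def train_from_calibration(readouts, labels):
        """Average the readouts of the known-value calibration voxels per symbol."""
        classes = np.unique(labels)
        centroids = np.stack([readouts[labels == c].mean(axis=0) for c in classes])
        return classes, centroids

    def classify(readouts, classes, centroids):
        """Assign each voxel readout to the symbol with the nearest centroid."""
        dists = np.linalg.norm(readouts[:, None, :] - centroids[None, :, :], axis=2)
        return classes[dists.argmin(axis=1)]

    # Toy demonstration: 4 symbol classes, noisy 3-feature voxel readouts.
    rng = np.random.default_rng(0)
    true_centroids = rng.normal(size=(4, 3))

    cal_labels = np.repeat(np.arange(4), 50)          # known values burned in a corner
    cal_readouts = true_centroids[cal_labels] + rng.normal(scale=0.3, size=(200, 3))
    classes, centroids = train_from_calibration(cal_readouts, cal_labels)

    data_labels = rng.integers(0, 4, size=1000)       # the actual stored data
    data_readouts = true_centroids[data_labels] + rng.normal(scale=0.3, size=(1000, 3))
    predicted = classify(data_readouts, classes, centroids)
    print(f"raw symbol error rate: {(predicted != data_labels).mean():.3%}")  # ECC fixes the rest
    ```

    Rerunning only the small calibration step for a new set of laser parameters retrains the whole decoder, which is what made experimentation as fast as the laser could fire.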

    In conversation about 5 months ago from gnusocial.jp
