@lanodan @carbontwelve Spam filtering has been a good application for machine learning for ages. I think the first Bayesian spam filters appeared around the end of the last century. It has several properties that make it a good fit for ML:
- The cost of letting the occasional spam message through is low, while the value of filtering most of it correctly is high.
- There isn’t a rule-based approach that works well. You can’t write a list of properties that make something spam; you can only write a list of properties that indicate something is more likely to be spam.
- The problem changes rapidly. Spammers change their tactics depending on what gets through the filters, so a system that adapts on the defender’s side works well, and you have plenty of labelled ham and spam to drive that adaptation (see the sketch after this list).
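To make the Bayesian point concrete, here is a minimal sketch of a naive Bayes spam filter in Python. The whitespace tokeniser, the add-one smoothing, and the tiny training set are all illustrative assumptions, not a description of any real filter:

```python
import math
from collections import Counter

def tokenize(message: str) -> list[str]:
    # Crude lowercase/whitespace tokeniser; real filters do far more.
    return message.lower().split()

class NaiveBayesSpamFilter:
    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, message: str, label: str) -> None:
        # Every new labelled message updates the counts, which is how the
        # filter keeps adapting as spammers change tactics.
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(message))

    def spam_probability(self, message: str) -> float:
        # Log-space naive Bayes with Laplace (add-one) smoothing.
        total = sum(self.message_counts.values())
        log_odds = (math.log(self.message_counts["spam"] / total)
                    - math.log(self.message_counts["ham"] / total))
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        for word in tokenize(message):
            p_spam = (self.word_counts["spam"][word] + 1) / (
                sum(self.word_counts["spam"].values()) + len(vocab))
            p_ham = (self.word_counts["ham"][word] + 1) / (
                sum(self.word_counts["ham"].values()) + len(vocab))
            log_odds += math.log(p_spam) - math.log(p_ham)
        return 1 / (1 + math.exp(-log_odds))

spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win a free prize now", "spam")
spam_filter.train("lunch meeting moved to noon", "ham")
print(spam_filter.spam_probability("free prize inside"))  # high
print(spam_filter.spam_probability("meeting at noon"))    # low
```

The per-word probabilities are exactly the “list of properties that indicate a higher chance of being spam” from the second bullet: no single word proves anything, but the weighted evidence adds up.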
Note that the same is not true for intrusion detection, and a lot of ML-based approaches to intrusion detection have failed: missing a compromise is expensive, and you don’t have enough examples of malicious versus benign activity for a classifier to adapt rapidly.
The last point is part of why it worked well in my use case and was great for Project Silica when I was at MS. They were burning voxels into glass with lasers and then recovering the data. With a small calibration step (burn a load of known-value voxels into a corner of the glass) they could build an ML classifier that worked for any set of laser parameters. It might not have worked quite as well as a well-tuned rule-based system, but with the ML approach they could run experiments as fast as the laser could fire, whereas a rule-based system needed someone to classify the voxel shapes and redo the implementation, which took at least a week. That was a huge benefit. Their data included error-correction codes, so as long as the model was mostly right, ECC would fix the rest.
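I’m not describing Silica’s actual pipeline here, just the general calibrate-then-classify pattern: burn voxels with known symbols, fit a classifier to however this particular laser configuration renders them, decode the data voxels, and let the ECC clean up the residual errors. The nearest-centroid classifier, the two-dimensional “readout features”, and the noise level below are all made-up stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

def read_voxel_features(symbol: int, n: int) -> np.ndarray:
    # Stand-in for the optical readout: each voxel class produces a noisy
    # cluster of measurements in some feature space.
    centres = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    return centres[symbol] + 0.15 * rng.standard_normal((n, 2))

# 1. Calibration: read back voxels with known symbols and fit a classifier
#    to this specific set of laser parameters.
num_symbols = 4
calibration = {s: read_voxel_features(s, 200) for s in range(num_symbols)}
centroids = np.stack([calibration[s].mean(axis=0) for s in range(num_symbols)])

def classify(features: np.ndarray) -> np.ndarray:
    # Nearest-centroid decision; any per-configuration classifier would do.
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# 2. Decode the data voxels. The classifier only has to be mostly right:
#    the error-correcting code layered on top repairs the rest.
true_symbols = rng.integers(0, num_symbols, size=1000)
measurements = np.concatenate(
    [read_voxel_features(s, 1) for s in true_symbols])
decoded = classify(measurements)
print("raw symbol accuracy:", (decoded == true_symbols).mean())
```

The point of the pattern is the turnaround time: changing the laser parameters just means redoing step 1, which takes as long as burning and reading a corner of calibration voxels, rather than waiting a week for someone to hand-tune new rules.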