on-lain ✔ᵛᵉʳᶦᶠᶦᵉᵈ (lain@lain.com)'s status on Thursday, 25-Jul-2024 02:16:02 JST
@feld @yogthos @oz yeah, filtering out bad data and creating good data is the current main problem in the field, many think that synthetic data generation is the way to go, and both llama3 and WizardLM use it. adding 'tarpits' to random indieweb blogs will not do much.