• Dojan@lemmy.world
    9 months ago

    They don’t really parrot unless they’re overfitted.

    It’s more that they have been trained to produce a certain kind of result. One way to train them is to assign each output a score for how good it is. Doing this manually takes a lot of time (Google has been collecting human labels for years via captcha), or you could train another model to score text for you.

    The obvious problem with the latter approach is that you then need to make sure the scoring model scores roughly in line with how humans would; the technical term for this is alignment. There’s a pretty funny story about that with GPT-2, presented in a really cute animation format by Robert Miles.
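
To make the scoring idea a bit more concrete, here's a toy sketch in Python. It is nothing like how a real reward model is built: the word lists, the example sentences, and the human scores are all made up for illustration, and the "model" is just three weights fitted with gradient descent. It only shows the two steps: fit a scorer to human scores, then check whether it ranks unseen text the same way humans do.

```python
from itertools import combinations

POLITE = {"please", "thanks", "sorry"}  # hypothetical feature words
RUDE = {"stupid", "shut", "idiot"}      # hypothetical feature words

def features(text):
    """Turn text into [bias, polite word count, rude word count]."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    polite = sum(w in POLITE for w in words)
    rude = sum(w in RUDE for w in words)
    return [1.0, float(polite), float(rude)]

def score(weights, text):
    """The toy reward model's score: a weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, features(text)))

def train_reward_model(labelled, epochs=500, lr=0.05):
    """Fit the weights to human scores with plain stochastic gradient descent."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for text, human_score in labelled:
            x = features(text)
            error = score(weights, text) - human_score
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights

def agreement(weights, held_out):
    """Fraction of pairs the model ranks in the same order as the humans did."""
    same = total = 0
    for (text_a, human_a), (text_b, human_b) in combinations(held_out, 2):
        if human_a == human_b:
            continue  # humans expressed no preference for this pair
        total += 1
        model_prefers_a = score(weights, text_a) > score(weights, text_b)
        if model_prefers_a == (human_a > human_b):
            same += 1
    return same / total

# Made-up human scores: 1.0 = good, 0.0 = neutral, -1.0 = bad.
labelled = [
    ("Thanks, that helps.", 1.0),
    ("Please try again.", 1.0),
    ("That is stupid.", -1.0),
    ("Shut up, idiot.", -1.0),
    ("The sky is blue.", 0.0),
]
held_out = [
    ("Sorry, please retry.", 1.0),
    ("You idiot.", -1.0),
    ("It is raining.", 0.0),
]

weights = train_reward_model(labelled)
flipped = [-w for w in weights]  # same scorer with its sign reversed

print("agreement with humans:", agreement(weights, held_out))
print("agreement after sign flip:", agreement(flipped, held_out))
```

The second number is the interesting one: a scorer that has learned exactly the right features but has its sign reversed disagrees with the humans on every pair, so anything trained against it gets pushed toward the worst outputs. That's why you check the scoring model against human judgement instead of trusting it blindly.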