
SnoopJ
@SnoopJ@hachyderm.io

I would argue precisely the opposite: Markov language models are more complex, because building one generally means using a more structured representation of the corpus than what's done with LLMs.

I.e. when you build a Markov n-(word)gram model, you explicitly bake the concept of "word" into your representation. With LLMs, you do no such transformation, and it's generally considered a faux pas to do that kind of "feature engineering" with deep networks in general.

The LLM's data model is less complex in this sense.
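To make the contrast concrete, here's a minimal sketch of a word-level Markov bigram model (Python is an assumption on my part; the thread names no language). Note how the very first step, splitting on whitespace, hard-codes "word" into the representation before any statistics are gathered:

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    # Splitting on whitespace is the hand-engineered step: the
    # concept of "word" is baked into the representation up front.
    words = text.split()
    model = defaultdict(list)
    for prev, curr in zip(words, words[1:]):
        model[prev].append(curr)
    return model

def generate(model, start, n=10):
    # Walk the chain: sample each next word from the successors
    # observed after the current word in the corpus.
    out = [start]
    for _ in range(n):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(random.choice(successors))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat slept on the mat"
print(generate(build_bigram_model(corpus), "the"))
```

An LLM pipeline, by contrast, learns its subword tokenization from data rather than starting from a hand-chosen notion of "word".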

SnoopJ
@SnoopJ@hachyderm.io

The commenter also says that Markov models are a good way to talk about what LLMs are doing, but even this is something I disagree with.

You don't need to talk about another kind of model at all, you only need to establish that language has statistical patterns (I like Zipf's Law for this), and then point out that with obscene amounts of compute and data you can brute-force solutions to statistical problems. It's one of the original reasons for building computers, for crying out loud.
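As a rough illustration of the Zipf's Law point, here's a short sketch (again assuming Python; "corpus.txt" is a hypothetical filename standing in for any sizable text). Rank words by frequency and the counts fall off roughly as 1/rank, so rank × frequency stays roughly constant down the list:

```python
from collections import Counter

# "corpus.txt" is a placeholder: any sizable plain-text file will do.
with open("corpus.txt", encoding="utf-8") as f:
    counts = Counter(f.read().lower().split())

# Zipf's Law: the rank-r word's frequency falls off roughly as 1/r,
# so the product rank * freq should stay in the same ballpark.
for rank, (word, freq) in enumerate(counts.most_common(10), start=1):
    print(f"{rank:>2}  {word:<15} freq={freq:<8} rank*freq={rank * freq}")
```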

