Let’s just say the quiet part out loud: loading and running LLMs is not easy.
There … I said it.
Now, the issue here is that the model just spits out random text and talks with itself (I’m not a fan of watching). In short, the model keeps accepting its own output as its new input, ad nauseam, because nothing in the raw prompt tells it where its turn ends.
Here’s the solution:
Load the model in ChatML mode
That’s your answer: ChatML wraps each message in the turn markers the model was trained on, so generation stops at the end of the assistant’s turn. If that doesn’t work, load in “Instruct” mode instead.
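To make the fix concrete, here’s a minimal sketch of what ChatML formatting actually does, with no particular loader assumed (the helper names `to_chatml` and `trim_reply` are my own, not from any library). The turn markers tell the model whose turn it is, and `<|im_end|>` gives the runtime a stop token, which is exactly what kills the talking-to-itself loop:

```python
def to_chatml(system: str, user: str) -> str:
    """Wrap a prompt in ChatML turn markers so the model knows whose turn it is."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

def trim_reply(raw: str, stop: str = "<|im_end|>") -> str:
    """Cut generation at the first end-of-turn marker so any hallucinated
    follow-up turns (the model answering itself) are discarded."""
    return raw.split(stop, 1)[0].strip()

prompt = to_chatml("You are a helpful assistant.", "Why is the sky blue?")

# Simulated raw output where the model started a self-conversation:
raw_output = "Rayleigh scattering.<|im_end|>\n<|im_start|>user\nAnd sunsets?"
print(trim_reply(raw_output))  # → Rayleigh scattering.
```

Most runtimes do the trimming for you once you pick the right chat mode; the point is that without these markers, the model has no idea its turn is over.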