
David Gerard
@davidgerard@circumstances.run

currently wondering what, if anything, to write about OpenAI's open-weights LLM GPT-OSS

lots of "here's how it performs" "boo censored" sure

but

is it significant? i'm not at all convinced it is. doesn't seem to change the game.


David Gerard
@davidgerard@circumstances.run

The thing about gpt-oss is, the market for LLM-at-home is nerd experimenters - the home llm runners on r/localllama. there isn't a market as such.

there's cloud providers offering it as a model you could use

the purpose of gpt-oss seems to be marketing - pure catchup cos llama and deepseek and qwen are open weights. look, we're not left behind.

and apparently it's pretty good as a home model? if that's your bag
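
(aside: if you do want to kick the tyres at home, here's a minimal sketch - assuming Ollama's gpt-oss:20b build and Ollama's OpenAI-compatible local endpoint on the default port; the model tag, port, and prompt are illustrative, not anything from this thread)

    # minimal sketch: chat with a locally running gpt-oss through Ollama's
    # OpenAI-compatible endpoint (default: http://localhost:11434/v1).
    # assumes you've already done: ollama pull gpt-oss:20b
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # key is ignored locally

    resp = client.chat.completions.create(
        model="gpt-oss:20b",
        messages=[{"role": "user", "content": "what's the osborne effect?"}],
    )
    print(resp.choices[0].message.content)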

OpenAI censored the shit out of it because that risks bad press less than not censoring it - and breaking the censorship appears not that hard

this thread on how gpt-oss seems to have been trained is hilarious
https://xcancel.com/jxmnop/status/1953899426075816164

"this thing is clearly trained via RL to think and solve tasks for specific reasoning benchmarks. nothing else."
"and it truly is a tortured model. here the model hallucinates a programming problem about dominoes and attempts to solve it, spending over 30,000 tokens in the process"
"completely unprompted, the model generated and tried to solve this domino problem over 5,000 separate times"
i could easily write 250 words on this thing but i'm not sure they'd be ones i'd care to read either

Michael Westergaard
@michael@westergaard.social

might be a case of invoking classic tech-industry tropes. openai arguably makes the best (closed) models right now; few come even close. gpt-oss is not the best model, but it is from openai (and better than some of the older, smaller open models).

osborne effect: why bother with shitty mistral or red-china's deepseek now, if you can get the real-deal openai model soon?

eee (embrace, extend, extinguish): embrace open-weight models by releasing a slightly worse version of the best models, internally extend it as the best models, extinguish third parties' attempts to pour billions into nvidia chips to make potentially competing models

it might be too late, but just consider the ecosystem that emerged from opening up stable diffusion (mostly titty generators, let's be honest). only multimodal models come close to rivaling the number of users stable diffusion and derivatives have. it could definitely funnel effort away from the smaller, shittier independent models (looking at mistral again)

tbh, it's probably better to consolidate the effort to make the bestest t9 dictionary rather than having 21M models, but not if altman is in charge.