Dyf_Tfh@lemmy.sdf.org to Open Source@lemmy.ml • Proton's biased article on Deepseek · 15 hours ago
Those are not DeepSeek R1. They are unrelated base models, such as Meta's Llama 3 or Alibaba's Qwen, that DeepSeek “distilled” from R1.
Distillation is a common way to make a smaller model smarter by training it on the outputs of a larger one.
Ollama should never have labelled them deepseek:8B/32B. Way too many people misunderstood that.
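For anyone wondering what "distilled" means in practice, here is a toy sketch of the classic soft-label form of distillation: the small "student" model is penalised for how far its output distribution is from the large "teacher" model's. The function names, the logits, and the temperature are made up for illustration. As far as I understand it, the R1 distills were produced by fine-tuning on R1-generated samples rather than with this exact loss, so treat it as the general idea only, not DeepSeek's recipe.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Hypothetical helper: zero when the student matches the teacher exactly,
    larger the further the student's distribution drifts away.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]   # nearly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

Training the student means nudging its weights to shrink this number. The student keeps its own architecture (Llama, Qwen, ...) the whole time, which is why the result is still not R1.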