Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
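As a rough illustration of the distillation setup described above (fine-tuning a smaller dense model on reasoning data generated by a larger one), the following Python sketch shows how teacher outputs might be collected into supervised fine-tuning examples. It is not DeepSeek's actual pipeline: the names `SFTExample`, `build_distillation_set`, and `fake_teacher` are hypothetical, and the teacher is a stub standing in for a call to a large reasoning model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class SFTExample:
    """One supervised fine-tuning record: a prompt and the target text."""
    prompt: str
    completion: str


def build_distillation_set(
    prompts: Iterable[str],
    teacher_generate: Callable[[str], str],
) -> list[SFTExample]:
    """Pair each prompt with the teacher's full response (reasoning plus answer)."""
    examples = []
    for prompt in prompts:
        response = teacher_generate(prompt).strip()
        if response:  # skip prompts where the teacher produced nothing
            examples.append(SFTExample(prompt=prompt, completion=response))
    return examples


def fake_teacher(prompt: str) -> str:
    # Hypothetical stand-in for querying a large reasoning model.
    return f"Reasoning about: {prompt}\nAnswer: 42"


dataset = build_distillation_set(["What is 6 * 7?"], fake_teacher)
```

The resulting records would then be used as ordinary fine-tuning data for the smaller model. Any filtering or formatting of the teacher's output is left out here because the text above does not describe it.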
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.