Overview

  • Founded Date October 10, 1910
  • Sectors Education Training

Company Description

DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Models

DeepSeek-R1

Distilled models

The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks.

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B

DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B
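
The distillation described above amounts to supervised fine-tuning of a smaller model on responses generated by a larger one. The sketch below shows one plausible way to turn a single teacher-generated sample into a training record. It is an illustration under stated assumptions, not DeepSeek's actual pipeline: the names `SftExample` and `build_sft_example` are hypothetical, whitespace splitting stands in for a real tokenizer, and masking the prompt out of the loss is a common convention rather than something the text above confirms.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SftExample:
    """One supervised fine-tuning record (hypothetical structure)."""
    tokens: List[str]      # prompt tokens followed by teacher response tokens
    loss_mask: List[int]   # 1 where the student is trained, 0 on the prompt


def build_sft_example(prompt: str, teacher_response: str) -> SftExample:
    """Turn one teacher-generated sample into a fine-tuning record.

    Assumption: whitespace splitting replaces a real tokenizer so the
    sketch has no dependencies.
    """
    prompt_tokens = prompt.split()
    response_tokens = teacher_response.split()
    return SftExample(
        tokens=prompt_tokens + response_tokens,
        # Assumed convention: the student learns only from the teacher's
        # response, so prompt positions are excluded from the loss.
        loss_mask=[0] * len(prompt_tokens) + [1] * len(response_tokens),
    )


# Made-up teacher output; in practice it would come from the larger model.
example = build_sft_example(
    prompt="What is 2 + 3 ?",
    teacher_response="First add 2 and 3 . The answer is 5 .",
)
```

A real setup would replace the whitespace split with the student model's tokenizer and pass many such records to a fine-tuning loop.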

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.
