deepseek-r1: incentivizing reasoning capability in llms via reinforcement learning
grok 3 vs deepseek vs chatgpt
ollama deepseek
deepseek fp32