queirozf.com

Entries by tag: reinforcement-learning

Paper Summary: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning  19 Apr 2025    paper-summary language-modeling reinforcement-learning
Summary of the 2025 article "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" by DeepSeek AI.

Paper Summary: DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models  06 Apr 2025    paper-summary reinforcement-learning language-modeling
Summary of the 2024 article "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" by Shao et al.

Paper Summary: Proximal Policy Optimization Algorithms  06 Apr 2025    paper-summary reinforcement-learning language-modeling
Summary of the 2017 article "Proximal Policy Optimization Algorithms" by Schulman et al.

Paper Summary: Deep Reinforcement Learning from Human Preferences  15 Jul 2023    paper-summary reinforcement-learning rlhf
Summary of the 2017 article "Deep Reinforcement Learning from Human Preferences" by Christiano et al., also known as the RLHF article.