Summary of RL papers I have read
1 minute read ∼ Filed in: A paper note
Basic RL categories
Model-Free
Model-Based
Offline RL
Online RL
Model-Based + Offline
Model-Based + Online
Model-Free + Offline
Decision Transformer: Reinforcement Learning via Sequence Modeling (2021)
Larger models require fewer samples to reach the same performance.
(Desired return, past states, past actions) => sequence model (Decision Transformer) => actions that achieve the desired return.
This work is on offline RL.
The model is given a desired return (not a single-step reward) and tries to predict the action that achieves it.
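A minimal sketch of the input side of the idea above, to make the "(desired return, past states, actions)" sequence concrete. It only shows how a logged trajectory could be turned into return-conditioned tokens; the sequence model itself is left out. The function names (`returns_to_go`, `build_sequence`, `next_target_return`) and the tuple-based token format are assumptions made for illustration, not taken from the paper's code.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of rewards from each timestep to the end of the episode."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each entry accumulates all future rewards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def build_sequence(
    rtg: Sequence[float], states: Sequence[Any], actions: Sequence[Any]
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, state, action) per timestep into one token list.

    Hypothetical token format: a real model would embed each element instead.
    """
    tokens: List[Tuple[str, Any]] = []
    for g, s, a in zip(rtg, states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


def next_target_return(target: float, reward: float) -> float:
    """At evaluation time, lower the desired return by the reward just received."""
    return target - reward


# Toy offline trajectory with three timesteps.
rewards = [1.0, 0.0, 2.0]
states = ["s0", "s1", "s2"]
actions = ["a0", "a1", "a2"]

rtg = returns_to_go(rewards)
sequence = build_sequence(rtg, states, actions)
```

During training on offline data, the model would see sequences like `sequence` and learn to predict each action token from the tokens before it. At test time the first return-to-go is replaced by the return we want, and `next_target_return` updates it after every step.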
Model-Free + Online
Large transformer models have shown strong generalization capability on language understanding when fine-tuned with limited data.