Decision Transformer Reinforcement Learning via Sequence Modeling

Posted on April 10, 2024   1 minute read ∼ Filed in  : 

This paper mainly casts the problem of RL as conditional sequence modeling.

Where they input the expected reward and predict the best action here.

One problem is how to define the expected rewards.

The experiments show that it should be within in training dataset, but I feel there is a problem.

How to decide the expected rewards if we cannot predict future rewards?





END OF POST




Tags Cloud


Categories Cloud




It's the niceties that make the difference fate gives us the hand, and we play the cards.