Samrat Kar

exploring & experimenting

Deep Reinforcement Learning Notes

[2026-02-08 08:25:38 AM] RL is a separate branch of machine learning, distinct from both supervised and unsupervised learning.
Supervised learning models f(y|x), a label y given an input x. Unsupervised learning models f(x), the structure of the inputs alone. RL models f(a|s), the action a to take in a state s, where the value of a given state depends on the value of the state that follows it.

  1. The goal is clear.
  2. A policy is present that makes the move that maximizes the chance of reaching the goal.
  3. Learning happens while interacting with the environment or the opponent.
    V(S_t) <- V(S_t) + alpha * (V(S_{t+1}) - V(S_t))
    Here alpha is the step size (learning rate). The move is typically greedy, as the focus is on maximizing the value.
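
The update rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state names, the starting values, and the helpers `td_update` and `greedy_next_state` are hypothetical and do not come from any library or from these notes. Values are kept in a plain dictionary, and the greedy move simply picks the candidate next state with the highest current value.

```python
def td_update(values, state, next_state, alpha=0.1):
    """Apply V(S_t) <- V(S_t) + alpha * (V(S_{t+1}) - V(S_t)) in place.

    `values` maps a state to its estimated value; `alpha` is the step size.
    Returns the updated value of `state`.
    """
    values[state] += alpha * (values[next_state] - values[state])
    return values[state]


def greedy_next_state(values, candidates):
    """Pick the candidate next state with the highest estimated value."""
    return max(candidates, key=lambda s: values[s])


# Hypothetical value table: terminal states are fixed at 1.0 (win) and
# 0.0 (lose); every other state starts at a neutral 0.5.
values = {"start": 0.5, "mid": 0.5, "win": 1.0, "lose": 0.0}

# From "mid" the agent can reach "win" or "lose"; the greedy move takes "win".
chosen = greedy_next_state(values, ["win", "lose"])

# Back up the value of the chosen next state into "mid", then into "start".
new_mid = td_update(values, "mid", chosen, alpha=0.1)
new_start = td_update(values, "start", "mid", alpha=0.1)
```

With these made-up numbers, the value of "mid" moves a tenth of the way from 0.5 toward 1.0, and "start" then moves a tenth of the way toward the new value of "mid". That is how the value of a state comes to depend on the value of the state that follows it.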