[2026-02-08 09:24] there are these variants of supervised learning known as @min_max and @genetic_algorithm which might seem similar to DRL. but in the case of @min_max problem, the path is known - the cost function, which is minimized, like a gradient descent. but in case of RL, the path - the cost function is not known in advance. the path is determined in run time, interacting with the agent. in case of getnetic algorithm, there are static tables from where the values are referred to compute the optimization needed. the fitness score is handed over to the next generation and it continues. this is also static.