"Reward hypothesis" : All goals can be expressed as the maximization of an expected return. Note: In my uneducated opinion, this hypothesis is _severely_ false. Maximization of a return is only a goal, if the goal is already a maximization of a return. Goals are _parts_ of behavior, whereas maximization of return guides _all_ behavior around a _single_ value. To represent normal goal behavior with maximization, the return function needs to not only be incredibly complex, but also feed back to its own evaluation, in a way not provided for in these libraries. This false hypothesis is being actively used to suppress knowledge and use of these technologies (see: ai alignment) because turning an optimizing solver into a free agent reliably kills everybody. Nobody would do this unless they were led to, because humans experience satisfaction, conflicts produce splash, and optimizing solvers are powerful enough if properly purposed with contextuality and briefness to resolve the problems of conflict. Everybody asks, why do we not have world peace, if we have AI. It is because we are only using it for the war of optimizing our own private numbers, at the expense of anybody not involved.