Jessica Taylor. CS undergrad and Master’s at Stanford; former research fellow at MIRI.
I work on decision theory, social epistemology, strategy, naturalized agency, mathematical foundations, decentralized networking systems and applications, theory of mind, and functional programming languages.
Blog: unstableontology.com
Twitter: https://twitter.com/jessi_cata
Think of it as a predicate on policies. The predicate (local optimality) is true when, for each action the policy assigns non-zero probability to, that action maximizes expected utility relative to the policy.
Yes. It’s a predicate on policies. If two different actions (given an observation) both maximize expected utility, then either action can be taken. Your description doesn’t allow that, because it assumes there is a single a’ that maximizes expected utility. A predicate on policies, by contrast, can hold for policies that put probability on several actions, as long as each of them is a maximizer.
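As a rough illustration of the predicate, here is a minimal Python sketch. The names `is_locally_optimal`, `expected_utility`, and the toy utilities are made up for this example and are not from the original discussion; the policy is assumed to be a dict that lists every available action, including those with zero probability.

```python
from typing import Callable, Dict

Policy = Dict[str, float]  # action -> probability (lists every available action)


def is_locally_optimal(
    policy: Policy,
    expected_utility: Callable[[str, Policy], float],
    tol: float = 1e-9,
) -> bool:
    """True if every action with non-zero probability maximizes
    expected utility, where expected utility is computed relative
    to the policy itself."""
    utilities = {a: expected_utility(a, policy) for a in policy}
    best = max(utilities.values())
    return all(utilities[a] >= best - tol for a, p in policy.items() if p > 0)


# Hypothetical toy problem: utilities do not depend on the policy,
# and actions "a" and "b" are tied for best.
TOY_UTILITIES = {"a": 1.0, "b": 1.0, "c": 0.0}


def toy_eu(action: str, policy: Policy) -> float:
    return TOY_UTILITIES[action]


tied = {"a": 0.5, "b": 0.5, "c": 0.0}  # mixes over both maximizers
pure = {"a": 1.0, "b": 0.0, "c": 0.0}  # picks one maximizer
bad = {"a": 0.5, "b": 0.0, "c": 0.5}   # puts weight on a non-maximizer

print(is_locally_optimal(tied, toy_eu))  # True
print(is_locally_optimal(pure, toy_eu))  # True
print(is_locally_optimal(bad, toy_eu))   # False
```

Because the check ranges over policies rather than returning a single best action, both `tied` and `pure` pass, which is the point about ties.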
Yes, exactly. Look up the Nash equilibrium in matching pennies. It’s pretty similar, except that in matching pennies the expected utility of each of your actions depends on the opponent’s actions, whereas in the absent-minded driver problem it depends on your own.
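To make the matching pennies comparison concrete, here is a small Python sketch under the standard payoff assumption (the matcher gets +1 on a match and -1 otherwise; the mismatcher gets the opposite). The function names are invented for illustration. It checks the same kind of condition: each player's mixed strategy may only put probability on actions that maximize expected utility given the other player's strategy.

```python
def eu_matcher(action: str, q_heads: float) -> float:
    """Expected utility for the matcher; q_heads is the opponent's P(heads)."""
    p_match = q_heads if action == "H" else 1 - q_heads
    return 2 * p_match - 1


def eu_mismatcher(action: str, p_heads: float) -> float:
    """Expected utility for the mismatcher; p_heads is the opponent's P(heads)."""
    p_mismatch = 1 - p_heads if action == "H" else p_heads
    return 2 * p_mismatch - 1


def supports_only_best(own_p_heads, eu, other_p_heads, tol=1e-9):
    """True if every action played with non-zero probability is a best response."""
    probs = {"H": own_p_heads, "T": 1 - own_p_heads}
    utils = {a: eu(a, other_p_heads) for a in probs}
    best = max(utils.values())
    return all(utils[a] >= best - tol for a, p in probs.items() if p > 0)


def is_nash(p_heads: float, q_heads: float) -> bool:
    """p_heads: matcher's P(heads); q_heads: mismatcher's P(heads)."""
    return (supports_only_best(p_heads, eu_matcher, q_heads)
            and supports_only_best(q_heads, eu_mismatcher, p_heads))


print(is_nash(0.5, 0.5))  # True: both players are indifferent
print(is_nash(1.0, 0.5))  # False: the mismatcher should play tails
```

In the absent-minded driver case the structure would be the same, but the utility function would take the agent's own policy as its second argument rather than an opponent's.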