  • intelligent behavior optimizes for a value function
  • shifting emergent behavior

Let's pick a workable definition of intelligence: the process of reaching a goal through decision making. Next, a value function: given a state of the world, the function assigns it a score, say between 0 and 1. In practice we have to limit this function to not consider the complete state of the world, but in principle we would like to consider all relevant bits, which is likely all bits. Of course, the less influential the bit, the less need to include its valuation in our value function. With this definition of intelligence and a perfect value function, we can continue.
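
As a minimal sketch (the names and weights here are illustrative, not from any particular library), a value function is just a map from some subset of world state to a score in [0, 1]; every bit we leave out makes it a little less perfect:

```python
import math
from typing import Mapping

def value(state: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Score a (partial) world state between 0 and 1.

    Only the bits present in `weights` are considered; any relevant bit
    missing from `weights` is silently ignored, which is exactly how the
    function ends up imperfect.
    """
    raw = sum(weights.get(bit, 0.0) * v for bit, v in state.items())
    # Squash the raw score into [0, 1] so it reads as "how good is this state".
    return 1.0 / (1.0 + math.exp(-raw))
```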

When training an AI, our data could be bad: biased (in a technical, not sociological, sense), incomplete, or otherwise lacking bits which we humans would consider relevant to our decision making. The same can be true on the value-function side, making it imperfect: missing bits of information that we would consider in our scoring.

Here we come to the genius of GANs. We choose our value function to be perfectly simple. Examples: does the seek agent have an uninterrupted line of sight to the hide agent in a game of hide and seek? Did the Go agent hold more territory (i.e. win the game)? Was the Rubik's cube solved? Was this image taken with a camera, or was it generated? Everywhere we say win-the-game, we're really just saying the rules are clear and one player won. The scoring function is often perfect.

In life, such scores are often available too, though the scoring functions we're more interested in as humans are things like: is this split fair? Is that food healthy? Will this shirt look attractive? These are states we could obviously score, but although at least some local consensus often exists, experiences diverge too much to pin down a score of 0 or 1; someone will find that horrendous shirt beautiful, and some people are not attracted to Alicia Vikander. But that's fine: imperfect goals limit us to imperfect decisions. We're more interested in whether the plane we're in should be tilting up or down, or whether it will rain more than 5 mm tomorrow or not. There the scoring is obvious.

Now, if we take a simple and obvious goal, by which our decision can be said to lead to a winning or losing outcome, something interesting happens. Our scoring function is often imperfect: we will, by definition, take too little relevant state into account. But by taking a goal that is perfect, 0 or 1, we can make the AI generate curveball is-it-a-0-or-is-it-a-1 scenarios. By making AIs compete, one side tries to converge on generating states that score a perfect 0.5, while the other side constantly tries to rule them a 0 or a 1. Do that a million times and you have created a powerful GAN.
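
A minimal sketch of that competition, assuming PyTorch and a toy 1-D data distribution (the network shapes and hyperparameters are illustrative, not from any particular paper):

```python
import torch
import torch.nn as nn

# Toy "real" data: samples the generator must learn to imitate.
def real_batch(n=64):
    return torch.randn(n, 1) * 2.0 + 3.0

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())   # judge

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(10_000):
    # Judge: try to rule samples a clear 0 (generated) or 1 (real).
    real = real_batch()
    fake = G(torch.randn(64, 8)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator: try to produce samples the judge can no longer separate,
    # pushing the contest toward scores of 0.5.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```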

With Go, for example, the trouble is that evaluating the score of a given move is computationally hard. Although the rules are simple, considering all possible continuations, so as to be able to say whether this move wins more games than any other move, is not feasible. So we're left approximating whether a given state is winning. Humans are incredibly good at doing this cheaply. Thanks, evolution. Generating a move that makes many opposing replies a 0.5 score is hard.
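
One standard way to approximate "is this state winning?" without enumerating all continuations is a Monte Carlo rollout estimate. This is a generic sketch, not tied to any specific Go engine; `copy`, `is_terminal`, `legal_moves`, `play`, and `winner` are assumed game primitives:

```python
import random

def estimate_value(state, player, rollouts=200):
    """Approximate the probability that `player` wins from `state`
    by playing many random games to the end and averaging the results."""
    wins = 0
    for _ in range(rollouts):
        s = state.copy()
        while not s.is_terminal():
            s = s.play(random.choice(s.legal_moves()))
        wins += (s.winner() == player)
    # A cheap, noisy stand-in for the true (intractable) value of the state.
    return wins / rollouts
```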

An AI playing itself is a GAN.