Hybrid Reward Architecture beats both the AI and human world records for Ms. Pac-Man

Maluuba, an AI startup acquired by Microsoft, achieved the highest possible score (999,990) in Ms. Pac-Man, a game that is notoriously hard to beat. It used a divide-and-conquer style of reinforcement learning: each element in the game that gives a positive or negative reward is assigned its own agent, and each agent suggests the move that is best for reaching its particular local goal. A “manager” agent receives these suggestions and decides which move the player should make in order to achieve the maximum overall reward.
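To make the idea concrete, here is a minimal sketch of how a manager could combine the agents' suggestions. It is an illustration only: the post does not say how the manager weighs its inputs, so the weighted sum used here is an assumption, and the names `choose_move`, `pellet_agent`, `ghost_agent`, and `fruit_agent`, along with all the scores, are made up for the example.

```python
from typing import Dict, List, Optional

ACTIONS = ["up", "down", "left", "right"]


def choose_move(suggestions: List[Dict[str, float]],
                weights: Optional[List[float]] = None) -> str:
    """Combine per-agent move preferences and return the best-scoring move.

    Assumption: the manager adds up each agent's (weighted) score per move.
    """
    if weights is None:
        weights = [1.0] * len(suggestions)  # treat every agent equally
    totals = {action: 0.0 for action in ACTIONS}
    for weight, prefs in zip(weights, suggestions):
        for action in ACTIONS:
            totals[action] += weight * prefs.get(action, 0.0)
    # Pick the move with the highest combined score.
    return max(ACTIONS, key=lambda a: totals[a])


# Hypothetical scores: one agent per reward-giving element in the game.
pellet_agent = {"up": 0.2, "down": 0.0, "left": 1.0, "right": 0.1}   # wants the pellet on the left
ghost_agent = {"up": 0.0, "down": 0.0, "left": -5.0, "right": 0.0}   # warns of a ghost on the left
fruit_agent = {"up": 0.5, "down": 0.0, "left": 0.0, "right": 0.3}    # mildly prefers the fruit above

move = choose_move([pellet_agent, ghost_agent, fruit_agent])
print(move)
```

On its own, the pellet agent would send the player left, but the ghost agent's strong negative score for that move outweighs it, so the combined choice goes elsewhere. That is the appeal of the approach: each agent only has to learn about one simple goal, and the trade-offs are resolved in one place.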



Read the paper here

