# What is a game-playing AI?

`Games are interesting because they are too hard to solve.`

- Stuart Russell and Peter Norvig

---
How fair are the matches between game AIs and human players?

The claim: once we solve StarCraft, we will have solved real-world problems.

From "When Are We Done with Games?" (2019), N. Justesen, M. S. Debus, S. Risi.

---
When will AI exceed human performance?

There are several areas of popular interest in AI (as of 2018).

---
Commercial interest in AI game players:

* Teammates
* Enemies

"AI is faster than human game playtesting"

- Zhao

---
AI competitions use benchmarks.

# Moravec's Paradox

Things that humans find hard: no problem for AI.

Things that humans find easy: not so much.

This leads to the trend of generalization: game-playing AIs that can play multiple games.

---
DQN is a general AI for retro arcade games.

Agent57 is a state-of-the-art AI from 2020 for this benchmark:

https://www.deepmind.com/blog/agent57-outperforming-the-human-atari-benchmark

Other game AI communities:

http://www.gvgai.net/
https://github.com/GAIGResearch/GVGAI

---
What are the long-term plans for AI?

AlphaStar and OpenAI Five show that agents can be trained, without explicit hierarchical macro-actions, to reach superhuman skill in games.

Raiman, Jonathan, Susan Zhang, and Filip Wolski. "Long-term planning and situational awareness in OpenAI Five" (2019).

https://openai.com/blog/openai-five-defeats-dota-2-world-champions/

---
When will it end?

When there is an AI that a human can keep playing against over a long period of time.

Justesen, Niels, Michael S. Debus, and Sebastian Risi. "When Are We Done with Games?" (2019).
# AI game player milestones
In 1950 Claude Shannon published his work on how to program a computer to play chess.

1952 Checkers/draughts, by Christopher Strachey: logical or non-mathematical programs.

1979 Backgammon, by Hans Berliner. In 1980 his program beat the backgammon world champion.

1992 Checkers. Jonathan Schaeffer's Chinook program challenges world champion Marion Tinsley (Chinook became world champion in 1994).

1997 Chess. Deep Blue beats world chess champion Garry Kasparov.

2002 Scrabble. Brian Sheppard created a world-champion-caliber Scrabble AI.

2007 Checkers solved. Jonathan Schaeffer and colleagues show that perfect play by both sides leads to a draw.

2009 The first Mario AI competition. It ran until 2012; Nintendo did not agree with the competition using the game's graphics.

2015 Atari. "Human-level control through deep reinforcement learning" (the DQN paper).

2016 Go. DeepMind's AlphaGo beat world champion Lee Sedol.

2018 Heads-up no-limit Texas hold'em poker. The Libratus AI beat top professionals.

2019 Dota 2. "Long-term planning and situational awareness in OpenAI Five", Jonathan Raiman, Susan Zhang, Filip Wolski.

2020 Atari 57. Agent57 was able to play all 57 Atari games and outperform the human baseline.
# How might we build a game-playing AI?

What is a DQN (Deep Q-Network)?

We start by playing the game.

We ask questions:

* How do we get points?
* Which way is the ball going?
* Where are the bricks?
* How do we get the ball on the correct side?

The AI gets the pixel view of the screen (RGB), the score, and whether the game is over or not. It outputs Left, Right, or No-move.
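A minimal sketch of that loop, assuming the Gymnasium and ale-py packages (my choice of tooling, not something named in the lecture):

```python
# Pixels in, Left/Right/No-move out: the agent-game interface sketched above.
# Assumes `pip install gymnasium ale-py`; newer ale-py versions may also need
# `import ale_py; gym.register_envs(ale_py)` to register the Atari games.
import gymnasium as gym

env = gym.make("ALE/Breakout-v5")       # Breakout via the Arcade Learning Environment
obs, info = env.reset()                 # obs is an RGB pixel array, shape (210, 160, 3)

done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()  # random policy; actions include NOOP, LEFT, RIGHT
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward              # the reward is the change in game score
    done = terminated or truncated      # whether the game is over

env.close()
print("Episode score:", total_reward)
```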
---

### Reinforcement learning

> How agents can learn what to do in the absence of labeled examples of what to do.
>
> Imagine playing a new game whose rules you don't know; after a hundred or so moves, the opponent announces that you lose.

- Russell and Norvig
The key elements of a reinforcement learning algorithm are (see the sketch after this list):

* States (the pixels coming in)
* Actions (Left, Right, No-move)
* Rewards (the score points)
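To make these elements concrete, here is a toy tabular Q-learning sketch (my illustration, not the course's code; DQN replaces the table below with a neural network over pixel states):

```python
# States, actions and rewards wired together in one-step tabular Q-learning.
import random
from collections import defaultdict

ACTIONS = ["LEFT", "RIGHT", "NO_MOVE"]
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1  # learning rate, discount, exploration rate

Q = defaultdict(float)                  # Q[(state, action)] -> estimated future reward

def choose_action(state):
    # Epsilon-greedy: usually exploit the best known action, sometimes explore.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    # Nudge Q(s, a) toward reward + discounted value of the best next action.
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```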
By formalizing the problem in this way, we can operate on it and simulate how a human plays the game (with the same limited access to information).
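One standard formalization (my addition; the lecture does not spell this out) is the Markov decision process, where the agent looks for a policy that maximizes expected discounted return:

```latex
% States S: pixel frames.  Actions A: Left, Right, No-move.  Rewards r_t: score changes.
% The agent seeks a policy \pi^* maximizing the expected discounted return G_t:
G_t = \sum_{k=0}^{\infty} \gamma^k \, r_{t+k+1},
\qquad
\pi^* = \operatorname*{arg\,max}_{\pi} \, \mathbb{E}_{\pi}\left[ G_t \right]
```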