Oct 8, 2003
-------------

- New topic: learning
  - using past experience to improve behavior
- Learning
  - changing something internal to agent to improve performance
    - e.g., program, script, state, goals, utilities
- Notion of performance standard is important
  - has to be external to agent
  - Examples
    - students, question papers, and grading
- Different ways to slice learning
  - based on nature of feedback
  - based on if/what you already know (something)
- Classification based on type of feedback
  - instructive (supervised learning)
  - evaluative (reinforcement learning)
    - need to do credit assignment
  - no feedback (unsupervised learning)
- Example of supervised learning
  - learning to drive a car with an instructor
  - passing 4804 midterm, given solution sketches for "similar problems"
- Example of unsupervised learning
  - just observing and noticing patterns
  - no goal in mind, so cannot effectively use what is learnt for any purpose
- Classification based on what you already know
  - nothing (inductive learning)
  - something
    - improve it (reinforcement learning)
      e.g., improve h(s)
    - compile it into more efficient form (speedup learning)
- Reinforcement learning
  - brings tradeoff between exploration and exploitation
- More ideas for learning
  - curve fitting example
    - how many possible answers are there?
  - Lesson 1: Learning requires bias
    - e.g., polynomials
  - Lesson 2: there might still be many answers
    - choose the one that is "simplest" (called Occam's razor)
  - Lesson 3: some biases may be too limiting
    - e.g., try learning XOR with linearly separable decision surfaces
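
The three lessons from the curve fitting discussion can be sketched in a few lines of Python. This is an illustrative sketch only, not something from the lecture: the sample points, the helper names (`line`, `make_cubic`, `fits`, `separable_on_grid`), and the small integer weight grid are all assumptions made up for the example. The first part shows that several polynomials fit the same three points exactly yet disagree on a new input, so some bias (here, "prefer the lowest degree") is needed to pick one. The second part searches a small grid of linear threshold classifiers and finds one for AND but none for XOR. The grid search is a demonstration, not a proof, although XOR is in fact not linearly separable for any choice of weights.

```python
from itertools import product

# --- Lessons 1 and 2: many hypotheses fit the same data -------------------
# Hypothetical sample points, taken from y = 2x + 1.
points = [(0, 1), (1, 3), (2, 5)]

def line(x):
    """Degree-1 hypothesis: the 'simplest' candidate (Occam's razor)."""
    return 2 * x + 1

def make_cubic(c):
    """Degree-3 hypothesis: the line plus a term that is zero at x = 0, 1, 2.

    Every value of c gives a different cubic through the same three points,
    so the data alone cannot choose among them.
    """
    def cubic(x):
        return line(x) + c * x * (x - 1) * (x - 2)
    return cubic

def fits(h, data):
    """True if hypothesis h reproduces every (x, y) pair exactly."""
    return all(h(x) == y for x, y in data)

candidates = {
    "line": line,
    "cubic, c=1": make_cubic(1),
    "cubic, c=-5": make_cubic(-5),
}
for name, h in candidates.items():
    # All candidates fit the data, but they disagree at the unseen x = 3.
    print(name, "| fits:", fits(h, points), "| prediction at x=3:", h(3))

# --- Lesson 3: a bias can be too limiting ---------------------------------
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def separable_on_grid(truth_table, values=range(-3, 4)):
    """Search integer weights (w1, w2, b) for a linear threshold classifier
    'output 1 iff w1*x1 + w2*x2 + b > 0' that matches the truth table.

    Returns the first matching weight triple, or None if the grid has none.
    """
    for w1, w2, b in product(values, repeat=3):
        if all(int(w1 * x1 + w2 * x2 + b > 0) == y
               for (x1, x2), y in truth_table.items()):
            return (w1, w2, b)
    return None

print("AND weights:", separable_on_grid(AND))
print("XOR weights:", separable_on_grid(XOR))
```

Restricting the hypothesis space to polynomials is the bias of Lesson 1, preferring the line over the cubics is the "simplest" choice of Lesson 2, and the failed XOR search shows the linear bias of Lesson 3 excluding the target concept entirely.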