Oct 8, 2003
-------------

- New topic: learning
        - using past experience to improve
          behavior

- Learning 
        - changing something internal to agent
          to improve performance
        - e.g., program, script, state, goals, utilities

- Notion of performance standard is important
        - has to be external to agent

- Examples
        - students (agents), question papers (tasks), and
          grading (the external performance standard)

- Different ways to slice learning
        - based on nature of feedback
        - based on whether, and what, you already know

- Classification based on type of feedback
        - instructive (supervised learning)
        - evaluative (reinforcement learning)
                - need to do credit assignment
        - no feedback (unsupervised learning)

- Example of supervised learning
        - learning to drive a car with
          an instructor
        - passing 4804 midterm, given
          solution sketches for "similar problems"

- Example of unsupervised learning
        - just observing and noticing patterns
        - no goal in mind, so what is learnt is not
          directed at any particular purpose

- Classification based on what you already know
        - nothing (inductive learning)
        - something
            - improve it (reinforcement learning)
              e.g., improve the heuristic h(s)
            - compile it into a more efficient form
              (speedup learning)

- Reinforcement learning
        - brings a tradeoff between exploration and exploitation
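
- Sketch of the exploration/exploitation tradeoff
        - a minimal illustration, assuming a multi-armed bandit with
          Gaussian rewards and an epsilon-greedy rule (neither is
          named in these notes)
        - the names epsilon_greedy_bandit, true_means, and epsilon
          are hypothetical, chosen for this example
        - with probability epsilon the agent explores (random arm);
          otherwise it exploits the arm with the best estimate so far

```python
import random


def epsilon_greedy_bandit(true_means, epsilon, steps, seed=0):
    """Run an epsilon-greedy agent on a Gaussian multi-armed bandit.

    Assumption for this sketch: each arm pays a reward drawn from a
    normal distribution with the given mean and standard deviation 1.
    Returns (estimates, pull_counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms       # number of times each arm was pulled
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm to improve its estimate.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Evaluative feedback: a reward, not the identity of the best arm.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running average for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return estimates, counts, total
```

        - epsilon = 0 never explores and can lock onto a poor arm;
          epsilon = 1 never uses what it has learnt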

- More ideas for learning
        - curve fitting example
                - how many possible answers are there?
                  (infinitely many curves pass through
                  any finite set of points)
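
- Sketch for the curve fitting question
        - a minimal illustration, assuming three hypothetical
          observations taken from the line y = 2x + 1
        - adding c * (x - x1)(x - x2)(x - x3) to any curve that fits
          gives another curve that fits, for every value of c
        - the names add_vanishing_term, line, data, and curves are
          made up for this example

```python
def add_vanishing_term(base, xs, c):
    """Return q(x) = base(x) + c * prod(x - xi for xi in xs).

    The added product is zero at every xi, so q agrees with base on
    all the data points no matter what c is.
    """
    def q(x):
        bump = c
        for xi in xs:
            bump *= (x - xi)
        return base(x) + bump
    return q


def line(x):
    """Hypothetical 'true' curve used to generate the observations."""
    return 2 * x + 1


xs = [0, 1, 2]
data = [(x, line(x)) for x in xs]  # hypothetical observations

# Three different hypotheses, all consistent with the same data.
curves = [add_vanishing_term(line, xs, c) for c in (0, 1, -2)]

for f in curves:
    fits = all(f(x) == y for x, y in data)
    print("fits data:", fits, " prediction at x=3:", f(3))
```

        - the curves agree on the data but disagree away from it,
          so the data alone cannot pick one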

- Lesson 1: Learning requires bias
        - e.g., restrict the hypotheses to polynomials

- Lesson 2: there might still be many answers
        - choose the one that is "simplest" 
          (called Occam's razor)
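
- Sketch of Occam's razor with a polynomial bias
        - a minimal illustration, assuming "simplest" means lowest
          degree and that the data are exact (no noise); both are
          assumptions of this example, not claims from the notes
        - try degrees 0, 1, 2, ... and keep the first polynomial
          that passes through every point
        - the names interpolate and simplest_degree are hypothetical

```python
from fractions import Fraction


def interpolate(points, x):
    """Evaluate at x the Lagrange polynomial through the given points.

    Assumes the x-coordinates in points are distinct.
    """
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total


def simplest_degree(points):
    """Return the lowest polynomial degree that fits every point exactly.

    degree + 1 points determine a polynomial of that degree, so fit the
    first degree + 1 points and check the prediction on the rest.
    Returns None for an empty list of points.
    """
    for degree in range(len(points)):
        basis = points[:degree + 1]
        rest = points[degree + 1:]
        if all(interpolate(basis, x) == y for x, y in rest):
            return degree
    return None


print(simplest_degree([(0, 1), (1, 3), (2, 5), (3, 7)]))
print(simplest_degree([(0, 0), (1, 1), (2, 4), (3, 9)]))
```

        - exact arithmetic (Fraction) keeps the equality test
          meaningful; with noisy data a tolerance would be needed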

- Lesson 3: some biases may be too limiting
        - e.g., try learning XOR with a linear decision
          surface (XOR is not linearly separable)
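
- Sketch of a bias that is too limiting
        - a minimal illustration, assuming a perceptron (a learner
          restricted to linear decision surfaces); the notes do not
          name a specific learner
        - the same rule learns AND but never settles on XOR, because
          no line separates XOR's positive and negative examples
        - the names train_perceptron, AND, and XOR are hypothetical

```python
def train_perceptron(examples, epochs=1000):
    """Perceptron learning rule on 2-input boolean examples.

    Returns (weights, bias, converged), where converged is True if some
    full pass over the examples produced no mistakes.
    """
    w = [0, 0]
    b = 0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), target in examples:
            predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - predicted
            if error != 0:
                # Move the decision line toward the misclassified point.
                w[0] += error * x1
                w[1] += error * x2
                b += error
                mistakes += 1
        if mistakes == 0:
            return w, b, True
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print("AND learned:", train_perceptron(AND)[2])
print("XOR learned:", train_perceptron(XOR)[2])
```

        - raising epochs does not help on XOR; the hypothesis space
          itself has to change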