Thursday 28 September 2023

Principal Component Analysis exam notes

PCA is a tool for taking big, confusing data and making it simpler and easier to understand. It helps us find the most important directions of variation in the data.

How PCA Works:


1. Data: Think of data as information about people. For example, you might have data about their height, weight, and age.


2. Centering Data: First, we find the "center" of the data by working out the average height, weight, and age of all the people, and then subtract those averages from each person's values so the data is centered on zero.


3. Calculating Relationships: PCA then computes the covariance matrix, which measures how these things (height, weight, age) are connected to each other, i.e. whether they tend to move together or separately.


4. Orthogonal Vectors: PCA finds special directions through the data, called "orthogonal vectors" (the principal components). These vectors are like arrows pointing in mutually perpendicular directions, and each arrow captures a different aspect of the data.


   - The first arrow (the first principal component) points along the biggest spread (variance) in the data, like the main story.

   - The second arrow (the second component) points along the second biggest spread, and so on.


5. Eigenvalues and Eigenvectors: Each arrow has a number attached, called an "eigenvalue", which measures how much of the data's spread lies along that arrow. The larger the eigenvalue, the more important the arrow.


   - The arrow itself is an "eigenvector" of the covariance matrix. Its direction tells us which mix of the original features (height, weight, age) that arrow represents.


6. Reduced Data: We can re-express our data along these arrows by projecting it onto them and keeping only the most important ones. It's like having a new, simpler set of information.


Why Use PCA:


- We use PCA to understand our data better.

- It helps us see the main things that matter and ignore the less important stuff.


In simple words, PCA helps us make our data simpler by finding the most important directions (orthogonal eigenvectors with large eigenvalues) and understanding the data through them.
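
A minimal sketch of these steps in Python with NumPy (not from the notes; the height/weight/age numbers are made up, and in practice a library routine such as scikit-learn's PCA would normally be used):

```python
import numpy as np

# Each row is one person: [height_cm, weight_kg, age_years] (made-up data)
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 75.0, 35.0],
])

# Steps 1-2: center the data by subtracting the average of each column
X_centered = X - X.mean(axis=0)

# Step 3: covariance matrix - how the variables move together
cov = np.cov(X_centered, rowvar=False)

# Steps 4-5: eigenvectors (the "arrows") and eigenvalues (spread along each arrow)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort from most to least important
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Step 6: reduced data - project onto the top 2 components
X_reduced = X_centered @ eigenvectors[:, :2]
print(eigenvalues)   # importance of each direction
print(X_reduced)     # simpler, 2-column version of the data
```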


Difference between Q-Learning and Model-Based Learning - As told by ChatGPT

 Model-Based Learning:

  1. Model of the Environment: In model-based learning, the focus is on building a model or representation of the environment. The robot creates an internal map of the world, including information about states, actions, transitions, and rewards. This map helps the robot simulate its environment.

  2. Planning: Model-based learning involves planning and using the learned model to make decisions. The robot can simulate different actions and predict their outcomes without taking physical actions. This planning aspect helps it choose the best actions to reach its goals efficiently.

  3. Data-Intensive: It requires a lot of data collection and modeling effort. The robot needs to explore the environment extensively to build an accurate model, which can be data-intensive and time-consuming.

  4. Use of a Learned Model: The robot relies on its learned model to make decisions about actions, rather than relying purely on Q-values learned through trial and error.

Q-Learning:

  1. Q-Values: In Q-learning, the focus is on learning Q-values, which represent the expected cumulative rewards for taking a specific action in a particular state. The robot maintains a Q-table, where each entry corresponds to a state-action pair.

  2. Trial and Error: Q-learning is based on trial and error. The robot explores the environment by taking actions and learning from the rewards it receives. It doesn't necessarily build a detailed model of the environment.

  3. Action Selection: Q-learning involves selecting actions based on the Q-values in the Q-table. The robot chooses the action with the highest Q-value for its current state, emphasizing exploitation of learned information.

  4. Data-Efficient: Q-learning can be more data-efficient than model-based learning because it directly learns from rewards during exploration without needing to build an explicit model.

In summary, the key difference lies in how they approach learning and decision-making:

  • Model-based learning focuses on creating a detailed model of the environment and using it for simulation and planning.
  • Q-learning focuses on learning the values of actions directly from rewards through trial and error, without necessarily building a detailed model.

The choice between these approaches depends on the specific problem and the available resources for learning and exploration.
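
For reference, the trial-and-error update that the Q-learning bullets describe is usually written as a single line. A generic textbook sketch (not specific to these notes; alpha is the learning rate and gamma the discount factor):

```python
# Tabular Q-learning update; Q is a dict keyed by (state, action) pairs.
def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
```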

Model-Based Reinforcement Learning example


Source: https://www.google.com/url?sa=i&url=https%3A%2F%2Fmedium.com%2Fanalytics-vidhya%2Fmodel-based-offline-reinforcement-learning-morel-f5cd991d9fd5&psig=AOvVaw17zqlCJPf9ASKzbC2CkiVg&ust=1675731090336000&source=images&cd=vfe&ved=0CBEQjhxqFwoTCOCm6PTW__wCFQAAAAAdAAAAABAE


Example: Robot in a Maze


- Imagine a robot in a maze trying to find a treasure.


Experience:

- The robot explores the maze, moving around and gathering experience.

- It remembers which actions it took, where it went, and the rewards it received.


Model:

- The robot builds a "map" or model of the maze.

- This model includes information about where walls are, possible paths, and what might happen at each location.

- The model helps the robot understand the maze better.


Value Function:

- The robot keeps track of values for different states in the maze.

- These values represent how good it is to be in a particular state.

- For example, finding the treasure has a high value.


Policy:

- The robot uses its value function and model to create a "policy."

- A policy is like a set of rules that tell the robot which actions to take in different situations.

- It helps the robot decide where to go to maximize its rewards.


Tables:

- The robot maintains tables to store information.

- One table keeps track of its experiences.


Experience Table:

| State    | Action  | Next State | Reward |
|----------|---------|------------|--------|
| Start    | Move Up | Wall       | -1     |
| ...      | ...     | ...        | ...    |
| Treasure | Grab    | Exit       | +100   |


- Another table stores the value function, showing how good each state is.


Value Function Table:

| State    | Value |
|----------|-------|
| Start    | 0     |
| ...      | ...   |
| Treasure | 100   |


How It Works:


1. The robot starts in the maze, taking actions and learning from rewards.


2. It uses the experiences to update its model of the maze.


3. It calculates values for each state using its value function.


4. With the model and values, it creates a policy for making decisions.


5. The robot follows the policy to find the treasure efficiently.


In this example, model-based reinforcement learning helps the robot build a model of the maze, use it to make decisions, and find the treasure while keeping things simple.
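
A rough sketch of this loop in Python (purely illustrative: the maze layout, the -1 step cost, and the +100 treasure reward are assumptions adapted from the tables above, and a real model-based method would also learn transition probabilities rather than a fixed map):

```python
# Experience table turned into a model: (state, action) -> (next_state, reward)
model = {
    ("Start", "Move Up"):    ("Start", -1),       # hit a wall, stay where you are
    ("Start", "Move Right"): ("Hall", -1),
    ("Hall", "Move Right"):  ("Treasure", -1),
    ("Treasure", "Grab"):    ("Exit", 100),
}

states = {"Start", "Hall", "Treasure", "Exit"}
gamma = 0.9  # discount factor

# Value function table: state -> how good it is to be there
V = {s: 0.0 for s in states}

# Planning: repeatedly back up values through the learned model (value iteration)
for _ in range(50):
    for s in states:
        choices = [(a, nxt, r) for (st, a), (nxt, r) in model.items() if st == s]
        if choices:
            V[s] = max(r + gamma * V[nxt] for _, nxt, r in choices)

# Policy: in each state, pick the action the model says leads to the best value
policy = {}
for s in states:
    choices = [(a, nxt, r) for (st, a), (nxt, r) in model.items() if st == s]
    if choices:
        policy[s] = max(choices, key=lambda c: c[2] + gamma * V[c[1]])[0]

print(V)       # Treasure ends up with the highest value
print(policy)  # e.g. {'Start': 'Move Right', 'Hall': 'Move Right', 'Treasure': 'Grab'}
```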


Q-Learning with a simple example

Example: Student's Study Strategy with Q-Learning

Imagine a student who wants to maximize their grades by studying effectively.

Goal: get maximum grade


Q-Values:

- Q-values represent the perceived effectiveness of different study strategies.

- Each study strategy has a Q-value indicating how good it is for achieving high grades.


States:

- States represent the student's current situation, like the subject they're studying and their current knowledge level.


Actions:

- Actions are the various study strategies the student can choose from, such as "Read the textbook," "Take practice quizzes," "Watch video lectures," and "Review notes."


Q-Table:

- We maintain a Q-table to track Q-values for each state-action pair.


Initial Q-Table:


Let's start with an initial Q-table:


| State (Subject, Knowledge Level) | Action (Study Strategy) | Q-Value (Effectiveness) |
|----------------------------------|-------------------------|-------------------------|
| Math, Novice                     | Read the textbook       | 0                       |
| Math, Novice                     | Take practice quizzes   | 0                       |
| ...                              | ...                     | ...                     |
| History, Intermediate            | Review notes            | 0                       |


Learning Process:


1. Initial Values: Initially, all Q-values in the table are set to 0.


2. Studying: The student studies various subjects and chooses study strategies.


3. Grades: After each exam, the student receives a grade (reward).


4. Updating Q-Values: After receiving a grade, the student updates the Q-values for the state-action pairs involved in their study strategy.


   - For instance, if they studied math as a novice, took practice quizzes, and then received an excellent grade, they increase the Q-value for "Take practice quizzes" in the state (Math, Novice), because they have learned that this strategy is effective.


   Updated Q-Table:


   | State (Subject, Knowledge Level) | Action (Study Strategy) | Q-Value (Effectiveness) |
   |----------------------------------|-------------------------|-------------------------|
   | Math, Novice                     | Read the textbook       | 0                       |
   | Math, Novice                     | Take practice quizzes   | 100                     |
   | ...                              | ...                     | ...                     |
   | History, Intermediate            | Review notes            | 0                       |


5. Choosing Actions: As the student continues to study different subjects and use study strategies, they refer to the Q-table to decide which strategy to use. They select the strategy with the highest Q-value for their current situation.


6. Learning Continues: The student repeats this process, studying, receiving grades, and updating Q-values. Over time, they become better at selecting effective study strategies to maximize their grades.


In this Q-learning example, the student's goal is to learn the optimal study strategies for each subject and knowledge level to maximize their grades. They achieve this by updating Q-values based on their exam results and using these values to make decisions about study strategies. Over time, the student becomes more proficient at studying effectively.
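
A minimal sketch of this student example as tabular Q-learning in Python (illustrative only: the subjects, strategies, grade function, learning rate, and exploration rate below are all made-up assumptions, and real exam grades would replace the stand-in reward):

```python
import random

states = [("Math", "Novice"), ("History", "Intermediate")]
actions = ["Read the textbook", "Take practice quizzes",
           "Watch video lectures", "Review notes"]

# Q-table: (state, action) -> perceived effectiveness, all starting at 0
Q = {(s, a): 0.0 for s in states for a in actions}

def grade(state, action):
    """Stand-in for an exam result; in reality the reward comes from real grades."""
    if state == ("Math", "Novice") and action == "Take practice quizzes":
        return 100
    return 40

alpha, epsilon = 0.5, 0.2  # learning rate and exploration rate

for episode in range(100):
    state = random.choice(states)
    # Epsilon-greedy: mostly pick the best-known strategy, sometimes explore
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(actions, key=lambda a: Q[(state, a)])
    reward = grade(state, action)
    # Simple update; there is no next state because each exam ends the episode
    Q[(state, action)] += alpha * (reward - Q[(state, action)])

# After enough episodes the best strategy for (Math, Novice) is practice quizzes
print(max(actions, key=lambda a: Q[(("Math", "Novice"), a)]))
```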