1. Correct the faulty bag implementation and rerun the tests on it.
2. The custom strategy shown in Section 12.4 chooses an action with maximum reward. Implement another Select Action method that chooses an action with probability proportional to its reward.
3. Implement the strategy for decrementing action weights discussed in Section 12.5.4. Use state properties to assign initial weights to actions.
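
As a starting point for the reward-proportional selection exercise, the idea can be sketched independently of any testing framework. The Python sketch below is only an illustration: the function name `select_action_proportional`, the `reward` callback, and the `rng` parameter are hypothetical and are not part of the strategy interface from Section 12.4. It also assumes that negative rewards count as zero and that the choice falls back to a uniform one when no action has a positive reward; neither assumption comes from the text.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

A = TypeVar("A")


def select_action_proportional(
    actions: Sequence[A],
    reward: Callable[[A], float],
    rng: Optional[random.Random] = None,
) -> Optional[A]:
    """Pick one enabled action with probability proportional to its reward.

    Hypothetical helper: `actions` is the list of enabled actions and
    `reward` maps an action to a number. Returns None if nothing is enabled.
    """
    if not actions:
        return None
    chooser = rng if rng is not None else random

    # Assumption: negative rewards are clipped to zero so they act as
    # "never preferred" rather than producing invalid weights.
    weights = [max(0.0, float(reward(a))) for a in actions]

    # Assumption: if every weight is zero, fall back to a uniform choice.
    if sum(weights) <= 0.0:
        return chooser.choice(list(actions))

    # random.choices draws with replacement according to relative weights.
    return chooser.choices(list(actions), weights=weights, k=1)[0]
```

With rewards of 1 and 3 for two actions, the second should be chosen in roughly three out of four draws, whereas a maximum-reward strategy would choose it every time. Passing a seeded `random.Random` instance makes a test run repeatable.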