Thursday, April 23, 2026

Lecture 7E (2026-04-23): Natural Learning Experiences – Reinforcement and Unsupervised Learning

In this lecture, we introduce Temporal Difference (TD) Q-learning and Deep Q Networks, starting with an analogy to how ants encode estimates of reward for state–action pairs in pheromone trails in the environment (another way to store a "Q" table in a network). We then pivot to unsupervised learning, covering both clustering and multi-dimensional scaling. After briefly discussing PCA and t-SNE, we describe the deep autoencoder and show an example of its use in an MNIST-like clustering task.

Interactive demonstrations mentioned in this lecture include:

* Marginal Value Theorem Explorer (to better understand discount rate): https://tpavlic.github.io/asu-bioinspired-ai-and-optimization/optimal_foraging_theory/mvt_explorer.html
* Autoencoder Explorer: https://tpavlic.github.io/asu-bioinspired-ai-and-optimization/unsupervised_learning/autoencoder_explorer.html

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/assv5cheln8xqp2tvzj1k/IEE598-Lecture7E-2026-04-23-Natural_Learning_Experiences-Reinforcement_and_Unsupervised_Learning-Notes.pdf?rlkey=a3iyshlufzgkyfxl7gby85pe2&dl=0

An unabridged version of the whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/lgibsff4lhh0ezb1lnm41/IEE598-Lecture7E-2026-04-23-Natural_Learning_Experiences-Reinforcement_and_Unsupervised_Learning-Notes-Full.pdf?rlkey=jayejujed8ervq8zsrddi1vcr&dl=0
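As a minimal sketch of the tabular TD Q-learning discussed above: the 5-state chain environment, learning rate, and episode count here are illustrative choices (not from the lecture), but the update rule is the standard TD target, Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)].

```python
import random

# Hypothetical 5-state chain: actions move left/right; reward 1 at rightmost state.
N_STATES, ACTIONS = 5, [0, 1]          # 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1      # learning rate, discount, exploration

def step(s, a):
    """Deterministic transition; episode ends on reaching the rightmost state."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1

# The Q table (which the lecture's ant analogy stores as pheromone on trails)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for _ in range(500):                   # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = step(s, a)
        # TD target: r + gamma * max_a' Q(s', a'); no future value at terminal
        target = r + (0.0 if done else GAMMA * max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2
```

After training, the greedy policy moves right from every non-terminal state, and Q(s, right) decays geometrically (by γ) with distance from the reward, which is exactly the role the discount rate plays.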
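The Marginal Value Theorem demo above connects foraging to discount rates; a quick numerical sketch of the theorem itself, under an assumed saturating gain function g(t) = 1 − e^(−t) (my choice for illustration, not the demo's), finds the residence time where marginal gain equals the long-run average rate:

```python
import math

def mvt_residence_time(tau, lo=1e-6, hi=50.0):
    """Bisection for the MVT optimum: g'(t) = g(t) / (tau + t),
    with assumed patch gain g(t) = 1 - exp(-t) and travel time tau."""
    f = lambda t: math.exp(-t) - (1 - math.exp(-t)) / (tau + t)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # f is positive near t = 0 and negative for large t, so keep the
        # subinterval where the sign change (the root) lies
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Longer travel times between patches yield longer optimal residence times, the classic MVT prediction the explorer lets you test interactively.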
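To make the autoencoder idea concrete without any deep-learning framework, here is a deliberately tiny sketch: a linear autoencoder with a one-unit bottleneck trained by SGD on synthetic 2-D data (all data and hyperparameters are invented for illustration). A deep autoencoder, as covered in lecture, just stacks nonlinear layers around the same bottleneck.

```python
import random

random.seed(1)
# Toy 2-D data lying near the line y = x, so one latent dimension suffices.
data = [(t + random.gauss(0, 0.05), t + random.gauss(0, 0.05))
        for t in [random.uniform(-1, 1) for _ in range(200)]]

# Encoder h = w . x (bottleneck of size 1); decoder xhat = v * h.
w = [random.gauss(0, 0.1) for _ in range(2)]
v = [random.gauss(0, 0.1) for _ in range(2)]
LR = 0.05

def mse():
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        h = w[0] * x[0] + w[1] * x[1]
        total += (v[0] * h - x[0]) ** 2 + (v[1] * h - x[1]) ** 2
    return total / len(data)

err_before = mse()
for _ in range(200):                          # epochs of plain SGD
    for x in data:
        h = w[0] * x[0] + w[1] * x[1]         # encode
        e = [v[0] * h - x[0], v[1] * h - x[1]]  # reconstruction error
        # gradients of 0.5 * |e|^2 with respect to decoder v and encoder w
        gv = [e[0] * h, e[1] * h]
        gh = e[0] * v[0] + e[1] * v[1]
        gw = [gh * x[0], gh * x[1]]
        for i in range(2):
            v[i] -= LR * gv[i]
            w[i] -= LR * gw[i]
err_after = mse()
```

With a purely linear network this recovers the principal subspace (the PCA connection mentioned above); the nonlinearities of a deep autoencoder are what let it outperform PCA on data like MNIST.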


