Tuesday, April 14, 2026

Lecture 7B (2026-04-14): Feeding Forward from Neurons to Networks (SLP, RBFNN, MLP, and CNN)

In this lecture, we move from the learning foundations of the last lecture to models of neurons that can be combined to form machine learning tools. We start with the single-layer perceptron (SLP), explain where the term "weights" comes from, and describe how it can linearly separate a space. We then introduce a hidden layer of receptive field units (RFUs) and discuss how Radial Basis Function Neural Networks (RBFNNs) use Gaussian or logistic RBFs as nonlinear projections into a high-dimensional space that Cover's theorem suggests is more likely to be linearly separable. After demonstrating how RBFNNs work, we introduce Cybenko's Universal Approximation Theorem (UAT) and use it to motivate looking for other (and deeper) latent structures. That leads us to the Multi-Layer Perceptron (MLP), backpropagation, and the Convolutional Neural Network (CNN).
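As a concrete sketch of the first model in this progression, the classic perceptron learning rule can be written in a few lines of NumPy. This is a minimal illustration, not the exact notation from lecture; the learning rate, epoch count, and initialization are arbitrary choices.

```python
import numpy as np

def train_perceptron(X, y, epochs=50, lr=0.1, seed=0):
    """Single-layer perceptron: learn weights w and bias b so that
    sign(w.x + b) matches labels y in {-1, +1} (requires linearly
    separable data, per the perceptron convergence theorem)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0, 0.01, X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified point
                w += lr * yi * xi        # nudge hyperplane toward xi
                b += lr * yi
    return w, b

# logical AND is linearly separable, so a single perceptron can learn it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
```

Note that XOR, the classic counterexample, is not linearly separable, which is exactly what motivates the hidden layers of the RBFNN and MLP discussed later in the lecture.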

Interactive widgets referenced in this lecture include:

Whiteboard notes for this lecture can be found at: https://www.dropbox.com/scl/fi/t2aoepucn0swlkvisococ/IEE598-Lecture7B-2026-04-14-Feeding_Forward_from_Neurons_to_Networks-Notes.pdf?rlkey=s5pr1zdrnup2ca1nthf7zxp3n&dl=0



Lecture 7A (2026-04-09): Neural Foundations of Learning

In this lecture, we prepare to discuss artificial and spiking neural networks -- bio-inspired information processing mechanisms modeled on the central nervous system and on models of learning in psychology. We open with a discussion of the relationship between learning, memory, and neuroplasticity and then introduce a canonical model of a neuron that is the basis of the mechanisms thought to underlie neuroplasticity. We discuss the different ways in which neuroplasticity supports working, short-term, and long-term memory. We introduce Hebbian learning (and briefly mention spike-timing-dependent plasticity, STDP) as a foundational learning paradigm that, when combined with neuromodulation and specialized circuits, can implement all forms of learning described in the lecture. Those forms of learning include non-associative learning (habituation and sensitization), associative learning (classical and operant conditioning), and latent learning. We map each of those to machine learning paradigms including unsupervised learning, self-supervised learning/pre-training, reinforcement learning, and supervised learning. In the next lecture, we will directly model the canonical neuron with a single-layer perceptron and start to build statistical models based on this artificial neuron model. Interactive demonstrations mentioned in this video:
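The core of Hebbian learning ("cells that fire together wire together") can be sketched as a simple outer-product weight update. This is a toy illustration with hypothetical firing rates, not a model from the lecture notes.

```python
import numpy as np

def hebbian_update(w, pre, post, eta=0.01):
    """Hebb's rule: the change in each synaptic weight is proportional
    to the product of pre- and post-synaptic activity."""
    return w + eta * np.outer(post, pre)

# repeated pairing of the same activity pattern strengthens exactly the
# synapses between co-active units and leaves all others untouched
w = np.zeros((2, 3))               # weights from 3 pre- to 2 post-synaptic units
pre = np.array([1.0, 0.0, 1.0])    # presynaptic firing rates (hypothetical)
post = np.array([0.0, 1.0])        # postsynaptic firing rates (hypothetical)
for _ in range(100):
    w = hebbian_update(w, pre, post)
```

In this plain form the weights grow without bound, which is one reason practical variants add normalization or decay terms.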

Whiteboard notes for this lecture can be found at: https://www.dropbox.com/scl/fi/x4t0y6q9rblrn78o8ns2r/IEE598-Lecture7A-2026-04-09-Neural_Foundations_of_Learning-Notes.pdf?rlkey=im6unlrptbfppqeds2y9gpga7&dl=0



Sunday, April 5, 2026

Lecture 6B (2026-04-07): Bacterial Foraging Optimization and Ant Colony Optimization

Closing out the Swarm Intelligence unit, this lecture pivots from Particle Swarm Optimization (PSO) to two examples of stigmergic swarm optimization – Bacterial Foraging Optimization (BFO) and Ant Colony Optimization (ACO). Stigmergy is indirect communication through modifications of the environment, as in leaving chemical trails or depositing chemical gradients, as opposed to direct communication between one individual and another. BFO solves continuous optimization problems similar to PSO but uses attractants and repellents to modify the environment rather than directly informing others about discovered solutions. The repellents in BFO, along with its reproduction and elimination–dispersal phases, help to ensure that it searches globally over a space, as opposed to the more concentrated search of PSO. ACO also uses chemical coordination, but it was developed for combinatorial optimization problems. Although ACO was originally developed for the Traveling Salesman Problem (TSP), we discuss ACO first in a simpler layered model that better matches the foraging paths of real ants before briefly discussing the application to the TSP. We close with a brief mention of more complex recruitment dynamics in real ants, where trail laying plus noise can track changing feeder distributions, and of how one-on-one recruitment by some ants and bees can lead to different distributions of recruits across options (similar to changing the temperature in a softmax).
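The positive-feedback logic of pheromone trails can be sketched with a deterministic (mean-field) toy model of ants choosing between two paths of different lengths. This is an illustrative simplification, not the full ACO algorithm: the fraction of ants on each path is taken to be proportional to its pheromone level, and the evaporation and deposit parameters are arbitrary.

```python
def ant_colony_two_paths(lengths=(1.0, 2.0), iters=200,
                         evaporation=0.1, deposit=1.0):
    """Mean-field pheromone dynamics for two candidate paths: each
    iteration, pheromone evaporates, and each path is reinforced in
    proportion to the fraction of ants choosing it (proportional to its
    pheromone) times trail quality (proportional to 1/length)."""
    tau = [1.0, 1.0]                     # initial pheromone on each path
    for _ in range(iters):
        total = tau[0] + tau[1]
        p = [t / total for t in tau]     # path-choice probabilities
        tau = [t * (1 - evaporation) + p_i * deposit / L
               for t, p_i, L in zip(tau, p, lengths)]
    return tau

pheromone = ant_colony_two_paths()
```

Because the shorter path earns more deposit per traversal, its pheromone advantage compounds until nearly all ants take it, which is the stigmergic positive feedback at the heart of ACO.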

Interactive demonstrations referenced in this lecture can be found at:

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/fqm4jcfr1mkxsnz8ng61r/IEE598-Lecture6B-2026-04-07-Bacterial_Foraging_Optimization_and_Ant_Colony_Optimization-Notes.pdf?rlkey=q4omc6oyot9vrq8nnq3etx6k4&dl=0



Thursday, April 2, 2026

Lecture 5E/6A (2026-04-02): Parallel Tempering and Swarm Intelligence through Social Cohesion (Particle Swarm Optimization)

In this lecture, we finish our unit on physics-inspired ML and optimization by covering Parallel Tempering (PT), which combines multiple parallel Metropolis–Hastings MCMC samplers, each with a different temperature (rather than using an annealing schedule, as in Simulated Annealing (SA)). We then pivot toward motivating why certain problems, like optimizing the high-dimensional weights of neural networks, may not be well served by the optimization metaheuristics discussed so far in the course. We use this as an opportunity to introduce Swarm Intelligence and the Particle Swarm Optimization (PSO) algorithm, which is particularly good at finding and exploring local optima in spaces with many similarly performing local optima. We explore how PSO was inspired by the Boids model from Craig Reynolds (in computer graphics) and how it overlaps with the Vicsek model (from statistical physics). We also show how PSO really depends on social information but, under its influence, tends to very quickly purge the diversity in its solution candidates. Online interactive demonstration modules associated with this lecture can be found at:
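A minimal global-best PSO sketch makes the velocity update concrete: an inertia term, a cognitive pull toward each particle's own best, and a social pull toward the swarm's best. The parameter values, search bounds, and test function here are illustrative choices, not the exact variant from lecture.

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over [-5, 5]^dim with a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))   # positions
    v = np.zeros((n_particles, dim))             # velocities
    pbest = x.copy()                             # personal bests
    pbest_val = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_val)].copy()       # global (social) best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # inertia + cognitive pull (own best) + social pull (swarm best)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, pbest_val.min()

best, val = pso(lambda p: np.sum(p**2), dim=3)
```

The social term (the pull toward g) is what drives the rapid loss of candidate diversity mentioned above: every particle is attracted toward the same point.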

Whiteboard notes for this lecture can be found at: https://www.dropbox.com/scl/fi/7jwuytadieywwilqazjq5/IEE598-Lecture5E_6A-2026-04-02-Parallel_Tempering_and_Particle_Swarm_Optimization-Notes.pdf?rlkey=p1pr7cs241okovkgjnevvhdp5&dl=0



Tuesday, March 31, 2026

Lecture 5D (2026-03-31): Metropolis–Hastings Markov Chain Monte Carlo and Simulated Annealing/Parallel Tempering

In this lecture, we start with a reminder that the Boltzmann–Gibbs distribution is the maximal-entropy (MaxEnt) distribution of physical microstates when the average energy is fixed at a temperature at thermal equilibrium. We then motivate settings where it would be useful to sample microstates from such a distribution. First, we introduce Monte Carlo methods for parameter estimation, and then we pivot toward applications of Monte Carlo sampling for numerical integration. This leads us back to physics applications where integration using the Boltzmann–Gibbs distribution is much more practical. This gives us the opportunity to introduce Metropolis–Hastings Markov Chain Monte Carlo (MCMC) sampling, which allows for sampling from the Boltzmann–Gibbs distribution and more. After discussing connections to importance sampling (from stochastic simulation) and Bayesian/MCMC statistics, we introduce Simulated Annealing, which combines Metropolis–Hastings sampling with an annealing schedule for temperature. We close with a very brief introduction to Parallel Tempering, which swaps out the annealing schedule for parallel MCMC samplers that periodically swap states based on their relative energies. We will pick up with Parallel Tempering in the next lecture.
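The Metropolis acceptance rule, and how a cooling schedule turns the sampler into Simulated Annealing, can be sketched as follows. This is a toy one-dimensional illustration with an arbitrary geometric schedule and step size, not the formulation from the whiteboard notes.

```python
import math
import random

def metropolis_step(x, e, energy, temperature, step_size, rng):
    """One Metropolis–Hastings step targeting p(x) ~ exp(-E(x)/T)
    with a symmetric Gaussian proposal."""
    x_new = x + rng.gauss(0.0, step_size)
    e_new = energy(x_new)
    # always accept downhill moves; accept uphill with Boltzmann probability
    if e_new <= e or rng.random() < math.exp(-(e_new - e) / temperature):
        return x_new, e_new
    return x, e

def simulated_annealing(energy, x0, temps, steps_per_temp=200,
                        step_size=0.5, seed=0):
    """Run the Metropolis sampler while cooling through a schedule of
    temperatures; low final temperatures concentrate mass near minima."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    for T in temps:
        for _ in range(steps_per_temp):
            x, e = metropolis_step(x, e, energy, T, step_size, rng)
    return x, e

# minimize a double-well energy with minima at x = -1 and x = +1
schedule = [2.0 * 0.9**k for k in range(40)]   # geometric cooling (arbitrary)
x, e = simulated_annealing(lambda x: (x**2 - 1) ** 2, x0=3.0, temps=schedule)
```

Parallel Tempering replaces the cooling schedule with several such samplers run at fixed temperatures that periodically swap states, so hot chains keep exploring while cold chains refine.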

On-line simulations referenced in this lecture can be found at:

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/s5dcgqrvm4qzz4y0fs64a/IEE598-Lecture5D-2026-03-31-Markov_Chain_Monte_Carlo_Metropolis_and_Simulated_Annealing_Parallel_Tempering-Notes.pdf?rlkey=v2m33lhh7sjhwogffotbyq3k7&dl=0



Thursday, March 26, 2026

Lecture 5C (2026-03-26): Boltzmann–Gibbs and other Maximum Entropy Distributions

In this lecture, we start by reviewing the formal definition of Shannon entropy/information in both its discrete and continuous (differential entropy) forms. We then transition to discussing several different MaxEnt distributions and the constraints with which they are associated. Ultimately, this brings us to the Boltzmann–Gibbs distribution and several applications of it. Throughout the lecture, different interactive demonstrations were used (and can be accessed directly at the links below).
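The Boltzmann–Gibbs distribution itself is just a temperature-scaled softmax over negative energies, which a few lines of NumPy make concrete (a small illustration to accompany the linked demonstrations; the energy values are arbitrary):

```python
import numpy as np

def boltzmann(energies, temperature):
    """Boltzmann–Gibbs distribution: p_i ~ exp(-E_i / T).
    Subtracting the minimum energy first keeps the exponentials
    numerically stable without changing the normalized result."""
    e = np.asarray(energies, float)
    weights = np.exp(-(e - e.min()) / temperature)
    return weights / weights.sum()

# low temperature concentrates mass on the lowest-energy state;
# high temperature approaches the uniform (maximum-entropy) limit
p_cold = boltzmann([0.0, 1.0, 2.0], temperature=0.1)
p_hot = boltzmann([0.0, 1.0, 2.0], temperature=100.0)
```

Sweeping the temperature in this function reproduces the behavior shown in the Softmax Visualizer below.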

Demonstrations referenced in this lecture can be found at:

Softmax Visualizer: https://tpavlic.github.io/asu-bioinspired-ai-and-optimization/softmax/softmax_temperature_explorer.html

MaxEnt Explorer (SDM and NLP): https://tpavlic.github.io/asu-bioinspired-ai-and-optimization/maxent/maxent_demo.html

Boltzmann Distribution via Random Exchanges of Conserved Quantity: https://tpavlic.github.io/asu-bioinspired-ai-and-optimization/boltzmann_maxent/boltzmann_maxent_random_exchange.html

Beta Distribution Explorer: https://tpavlic.github.io/asu-bioinspired-ai-and-optimization/boltzmann_maxent/beta_spacings.html

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/zwdrab929yg47jm67vope/IEE598-Lecture5C-2026-03-26-Boltzmann-Gibbs_and_other_MaxEnt_Distributions-Notes.pdf?rlkey=3zka62o08gnw8z38r7lknjsqf&dl=0



Tuesday, March 24, 2026

Lecture 5B (2026-03-24): From Entropy to Maximum Entropy (MaxEnt) Methods

In this lecture, we pivot from the motivation provided by the Simulated Annealing optimization metaheuristic to thinking about how to sample microstates within the physically inspired search process. This requires us to introduce the concept of entropy, a quantity that measures the number of microstates in a coarse-grained "macrostate" description of a system. Within the constraints of a system, we seek a distribution of microstates that represents only those constraints and no additional information. This is the maximal-entropy distribution for those constraints. We provide a few formalities to make this a little more rigorous and then introduce Maximum Entropy (MaxEnt) methods, once popular in NLP, that remain popular in Species Distribution Modeling and archaeology. We will use MaxEnt to help us define the Boltzmann–Gibbs distribution (and Monte Carlo methods to sample from it) next time.
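For a finite support and a single mean constraint, the MaxEnt distribution takes an exponential-family form whose Lagrange multiplier can be found numerically. The sketch below (hypothetical function and parameter names, not code from the course) solves for the multiplier by bisection:

```python
import numpy as np

def maxent_with_mean(values, target_mean, lam_lo=-50.0, lam_hi=50.0, tol=1e-10):
    """MaxEnt distribution over a finite support with a fixed mean:
    the solution has the exponential-family form p_i ~ exp(lam * x_i),
    where lam is chosen so that E[x] matches the constraint."""
    x = np.asarray(values, float)

    def mean_at(lam):
        w = np.exp(lam * (x - x.max()))   # shift exponent for stability
        p = w / w.sum()
        return (p * x).sum()

    # E[x] is monotonically increasing in lam, so bisection converges
    lo, hi = lam_lo, lam_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_at(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = np.exp(lam * (x - x.max()))
    return w / w.sum()

# with the mean pinned at the midpoint of the support, the constraint adds
# no information, so MaxEnt recovers the uniform distribution
p = maxent_with_mean([0, 1, 2], target_mean=1.0)
```

With the support read as energy levels and the mean constraint as a fixed average energy, this same construction yields the Boltzmann–Gibbs distribution discussed next time.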

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/01pfdkj3d3ilk7wiyu79a/IEE598-Lecture5B-2026-03-24-From_Entropy_to_Maximum_Entropy_MaxEnt_Methods-Notes.pdf?rlkey=xfe1pie4sxu0qklg871czuc05&dl=0


