Thursday, March 27, 2025

Lecture 5D/6A (2025-03-27): Simulated Annealing Wrap-up and Distributed AI and Swarm Intelligence, Part 1 - Introduction to Ant Colony Optimization (ACO)

In this lecture, we wrap up our coverage of Simulated Annealing (SA) and then shift to a new unit on Swarm Intelligence, with our first topic in that unit being Ant Colony Optimization (ACO). We start the lecture with a description of Monte Carlo sampling, which leverages the Law of Large Numbers to provide a method for approximating integrals using random sampling from a high-dimensional space. This allows us to introduce the Metropolis–Hastings algorithm, a Markov Chain Monte Carlo approach to sampling from arbitrary distributions (the original Metropolis algorithm was purely a Boltzmann sampler). We then show how to use the Metropolis algorithm within Simulated Annealing, which combines it with an annealing schedule that turns an MCMC sampler into an optimizer that starts out as an explorer and finishes as an exploiter. Simulated Annealing also helps import conceptual frameworks from physics (specifically statistical mechanics) into optimization and even Machine Learning more broadly. After finishing the coverage of SA, we introduce Ant System (AS), an early version of Ant Colony Optimization (ACO), which is a combinatorial optimization metaheuristic based on the trail-laying and recruitment behaviors of some ants. We will conclude ACO next time and move on to other Swarm Intelligence algorithms.
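As a rough illustration of how a Metropolis acceptance rule and an annealing schedule fit together, here is a minimal Python sketch. It is not taken from the lecture notes: the function name simulated_annealing, the geometric cooling schedule, and the default parameter values are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(energy, neighbor, x0, t0=1.0, cooling=0.995,
                        steps=5000, seed=0):
    """Minimize `energy` starting from x0 (illustrative sketch only).

    Assumptions (not from the lecture): geometric cooling t <- cooling * t
    and a user-supplied `neighbor(x, rng)` proposal function.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t0
    for _ in range(steps):
        cand = neighbor(x, rng)
        e_cand = energy(cand)
        delta = e_cand - e
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
        # High t early on means frequent uphill moves (exploration);
        # low t later means nearly greedy descent (exploitation).
        t *= cooling
    return best_x, best_e


def quadratic(x):
    return (x - 3.0) ** 2


def step(x, rng):
    return x + rng.uniform(-0.5, 0.5)


best_x, best_e = simulated_annealing(quadratic, step, x0=0.0)
```

With these settings the temperature starts at 1 and shrinks by 0.5% per step, so the sampler behaves like a random walk at first and like a local search by the end.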

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/b0wmj4a3lnbgyrs8rmts5/IEE598-Lecture5D_6A-2025-03-27-Simulated_Annealiung_Wrap_Up_and_Distributed_AI_and_Swarm_Intelligence_Part_1_Introduction_to_Ant_Colony_Optimization_ACO-Notes.pdf?rlkey=y49aa1x0oi0k53u5x5tv9kngs&dl=0



Tuesday, March 25, 2025

Lecture 5C (2025-03-25): Toward Simulated Annealing: Introduction to Boltzmann Sampling and Monte Carlo Integration

In this lecture, we continue our march toward Simulated Annealing by ensuring that we have the necessary foundations in information theory to support discussing thermodynamics and statistical mechanics. This allows us to introduce Boltzmann sampling, which will be used in Simulated Annealing, and the computational applications from physical chemistry that first inspired the creation of Boltzmann sampling. In particular, we introduce Monte Carlo integration and start to discuss how Boltzmann sampling can greatly reduce the number of samples necessary to estimate macroscopic variables of interest to physicists as they investigate thermodynamic equations of state. This also allows us to introduce Markov Chain Monte Carlo (MCMC) methods, of which the Metropolis algorithm is recognized as the first popular example (and the foundation of Simulated Annealing).
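To make the idea of Monte Carlo integration concrete, the following Python sketch estimates a one-dimensional integral by averaging the integrand at uniformly random points. The function name mc_integrate and the test integrand are assumptions for illustration; the lecture's motivating examples are high-dimensional, where this kind of sampling is most useful.

```python
import random


def mc_integrate(f, a, b, n=100_000, seed=0):
    """Estimate the integral of f over [a, b] by uniform random sampling.

    By the Law of Large Numbers, the sample mean of f(X) with X uniform
    on [a, b] converges to the average value of f, so multiplying by the
    interval length (b - a) approximates the integral.
    """
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


# The exact value of the integral of x^2 over [0, 1] is 1/3.
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
```

The error of this plain estimator shrinks like one over the square root of the number of samples, which is why sampling from a better-targeted distribution (such as the Boltzmann distribution) can reduce the number of samples needed.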

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/0nmyk60h6xtfa335x2bpd/IEE598-Lecture5C-2025-03-25-Toward_Simulated_Annealing-Introduction_to_Boltzmann_sampling_and_Monte_Carlo_integration-Notes.pdf?rlkey=ig1tbmmyenqpt7b6vwtnzbkem&dl=0



Thursday, March 20, 2025

Lecture 5B (2025-03-20): From Maximum Entropy (MaxEnt) Methods Toward Optimization by Simulated Annealing

In this lecture, we review the concept of entropy from physics and introduce information-theoretic connections to the increase in entropy and the erasure of latent information. We use that to motivate why maximum entropy probability distributions are the "least biased" choices given a set of constraints. We give a few examples of maximum entropy distributions and take a little time to introduce Maximum Entropy (MaxEnt) methods in machine learning and data science that have been successful in Natural Language Processing (NLP) as well as many other fields, from archeology to ecology. Equipped with maximum entropy, we introduce the Boltzmann distribution as a key tool in the implementation of Simulated Annealing, and we highlight connections between the Boltzmann distribution and "softmax" (as well as "softmin") from Machine Learning/AI. We will conclude our introduction of Simulated Annealing in the next lecture.
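The connection between the Boltzmann distribution and "softmin" can be seen in a few lines of Python. This is a sketch under stated assumptions: the function name boltzmann is hypothetical, and the Boltzmann constant is absorbed into the temperature parameter.

```python
import math


def boltzmann(energies, temperature):
    """Return Boltzmann probabilities p_i proportional to exp(-E_i / T).

    This is a softmin over energies (equivalently, a softmax over the
    negated energies). Subtracting the minimum energy first does not
    change the result but avoids overflow in exp().
    """
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    z = sum(weights)  # partition function (up to a constant factor)
    return [w / z for w in weights]


hot = boltzmann([0.0, 1.0, 2.0], temperature=1000.0)   # nearly uniform
warm = boltzmann([0.0, 1.0, 2.0], temperature=1.0)
cold = boltzmann([0.0, 1.0, 2.0], temperature=0.01)    # nearly all on E = 0
```

At high temperature the distribution approaches the uniform (maximum entropy) distribution, and at low temperature it concentrates on the lowest-energy state.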

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/dv7gei6rbb0yltqekmhrb/IEE598-Lecture5B-2025-03-20-From_Maximum_Entropy_MaxEnt_Methods_Toward_Optimization_by_Simulated_Annealing-Notes.pdf?rlkey=jaifjaxazwn3u0efui0lthlky&dl=0





Tuesday, March 18, 2025

Lecture 5A (2025-03-18): Introduction to Simulated Annealing and Entropy

This lecture introduces Simulated Annealing (SA) and the physical background from thermodynamics, statistical mechanics, and information theory needed to understand its function. After describing a general outline of SA, we pivot to discussing entropy (in terms of microstates and macrostates) and why systems move toward macrostates (distributions of microstates) with higher entropy. We link this to bias in stochastic modeling, which sets us up to start discussing Maximum Entropy (MaxEnt) methods in the next lecture. We will eventually bring this discussion back to Simulated Annealing after briefly discussing both MaxEnt methods and Markov Chain Monte Carlo (MCMC) methods (starting in the next lecture).
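A small coin-flip example can illustrate the microstate/macrostate picture. In this Python sketch (an illustration, not the example from the lecture), a microstate is a specific sequence of n coin flips, a macrostate is the number of heads k, and the entropy of a macrostate is taken as the logarithm of its number of microstates. The function name macrostate_entropy is an assumption.

```python
import math


def macrostate_entropy(n, k):
    """Log of the number of microstates (flip sequences) with k heads."""
    return math.log(math.comb(n, k))


n = 10
multiplicities = [math.comb(n, k) for k in range(n + 1)]
entropies = [macrostate_entropy(n, k) for k in range(n + 1)]
# The macrostate realized by the most microstates has the highest entropy.
most_likely_k = max(range(n + 1), key=lambda k: entropies[k])
```

If every flip sequence is equally likely, the system is most often found in the macrostate with the most microstates, which is one way to see why systems drift toward higher-entropy macrostates.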

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/dabgveu9swzj6d4dcveza/IEE598-Lecture5A-2025-03-18-Introduction_to_Simulated_Annealing_and_Entropy-Notes.pdf?rlkey=3dhycmnzij3fdd3fejes1wlvk&dl=0



Tuesday, March 4, 2025

Lecture 4C (2025-03-04): Niching Methods in Multi-Modal Optimization

In this lecture, we cover popular niching/niche-preserving methods for multi-modal optimization (which are also used in multi-objective optimization), which allow for multi-modal optimization techniques to find as many local fitness peaks as possible while minimizing representation (and maintaining high selective pressure) within each peak. These niching methods include fitness sharing methods (including variations inspired by k-means clustering), clearing and crowding methods, and the popular restricted tournament selection (RTS). Throughout the lecture, the computational complexity of different methods is highlighted so that at the end of the lecture (and in detail in the PDF notes linked below) the costs and benefits of each can be compared.
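As one concrete instance of a niching method, here is a minimal Python sketch of classic fitness sharing with a triangular sharing function. The one-dimensional positions, the function name shared_fitness, and the default values of sigma_share and alpha are assumptions for illustration and may differ from the variants in the notes.

```python
def shared_fitness(positions, fitness, sigma_share=1.0, alpha=1.0):
    """Divide each raw fitness by its niche count.

    The niche count of individual i sums sh(d_ij) over all j, where
    sh(d) = 1 - (d / sigma_share) ** alpha for d < sigma_share, else 0.
    The double loop makes this O(n^2) in the population size.
    """
    shared = []
    for xi, fi in zip(positions, fitness):
        niche_count = 0.0
        for xj in positions:
            d = abs(xi - xj)
            if d < sigma_share:
                niche_count += 1.0 - (d / sigma_share) ** alpha
        # niche_count >= 1 because each individual shares with itself.
        shared.append(fi / niche_count)
    return shared


# Three crowded individuals near 0 and one isolated individual at 5,
# all with the same raw fitness.
shared = shared_fitness([0.0, 0.1, 0.2, 5.0], [1.0, 1.0, 1.0, 1.0])
```

The isolated individual keeps its full fitness, while the crowded ones are penalized, which is how sharing discourages the population from piling onto a single peak.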

Whiteboard notes for this lecture (which include a list of computational complexities that was not covered in detail during the lecture) are available at:
https://www.dropbox.com/scl/fi/gdtk02v4gy9boxgj0vdbf/IEE598-Lecture4C-2025-03-04-Niching_Methods_in_Multi-Modal_Optimization-Notes.pdf?rlkey=gvfnxo20j9m76d4hfgvyhluzm&dl=0



Thursday, February 27, 2025

Lecture 4B (2025-02-27): From DGA/PGA to Niching Methods for Multi-Modal Optimization

In this lecture, we start by introducing distributed and parallel genetic algorithms (DGA/PGA) that not only have the potential to leverage parallel hardware resources but, perhaps surprisingly, can improve the performance of canonical genetic algorithms (GA's) through population-genetic effects that turn genetic drift in *meta-populations* into a source of diversity (instead of a force that reduces diversity, as it does in a single population). In particular, we introduce the idea of "shifting-balance theory" from Sewall Wright, which is one of the early models that attempts to understand the effect of limited gene flow on drift and selective pressure in complex population structures. This allows us to revisit the power and value of diversity in population-based evolutionary algorithms for optimization, and we use that to introduce the field of multi-modal optimization. Following this idea, we close with an introduction to niching methods for multi-modal optimization, starting with fitness sharing. We will pick up with this topic in our next lecture.
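Limited gene flow between sub-populations is often implemented as an occasional migration step. The Python sketch below shows one possible migration operator; the ring topology, the "best emigrants replace worst residents" policy, and the function name migrate_ring are all assumptions for illustration, not necessarily the scheme discussed in the lecture.

```python
def migrate_ring(islands, n_migrants=1):
    """Copy each island's best individuals to the next island in a ring.

    `islands` is a list of populations; each individual is a
    (fitness, genome) tuple with higher fitness being better. Island
    sizes are preserved: incoming migrants replace the worst residents.
    """
    def by_fitness(pop):
        return sorted(pop, key=lambda ind: ind[0], reverse=True)

    # Choose all emigrants first so migration is simultaneous.
    emigrants = [by_fitness(pop)[:n_migrants] for pop in islands]
    new_islands = []
    for i, pop in enumerate(islands):
        incoming = emigrants[(i - 1) % len(islands)]
        survivors = by_fitness(pop)[:len(pop) - len(incoming)]
        new_islands.append(survivors + incoming)
    return new_islands


islands = [[(1, "a"), (2, "b")], [(5, "c"), (3, "d")], [(0, "e"), (9, "f")]]
after = migrate_ring(islands)
```

Between migration events each island would evolve independently, so drift can push different islands toward different regions of the search space before migrants mix them.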

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/dqfz3d2rienofebqpa30e/IEE598-Lecture4B-2025-02-25-From_DGA_PGA_to_Niching_Methods_for_Multi_Modal_Optimization-Notes.pdf?rlkey=fknh18uafdqh2ah2lqsfi7lcb&dl=0



Tuesday, February 25, 2025

Lecture 3C/4A (2025-02-25): Pareto Ranking and Moving from Communities to Meta-Populations (DGA, PGA)

In this lecture, we review the connections between Pareto optimality and concepts from both population genetics and community ecology. We discuss how "fitness" in multi-objective evolutionary algorithms (MOEA's) is about reducing the "distance" to the Pareto front, establishing a group of "non-dominated solutions" that are "equally fit." Thus, Pareto optimization is not only about maximizing fitness (arriving at the Pareto front) but also about completeness and diversity (i.e., maximizing spread of the non-dominated set across the Pareto front). We then introduce the notion of "Pareto ranking", which helps to formally establish a "fitness" concept that captures the "distance from Pareto front" even in cases where the Pareto front location is not known. We discuss how different notions of fitness come with different selective pressures and remind ourselves that high selective pressure can create challenges for maintaining diversity (and spread across the Pareto front). We then re-introduce fitness scaling (which we first discussed in the context of GA's) and fitness sharing/clearing. These are concepts that maximize the number of "niches" along the Pareto front while reducing the number of solution candidates within each niche. We discuss how NSGA-II and NSGA-III are dominant players in the MOEA space (and well represented in off-the-shelf tools like gamultiobj in MATLAB and pymoo in Python). We then pivot to introducing the idea of "metapopulations" (populations of populations) and how this framework lets us use genetic drift as a tool for maintaining and exploring diversity (whereas before we only viewed drift as a force for reducing diversity). We will pick up next lecture building on this idea to introduce Distributed and Parallel GA's and tools for multi-modal optimization.
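One common way to assign a Pareto rank without knowing where the Pareto front is located is to peel off successive non-dominated fronts, in the style of non-dominated sorting. The Python sketch below assumes all objectives are minimized; the function names dominates and pareto_ranks are hypothetical, and the ranking scheme presented in the lecture may differ in detail (for example, by counting dominators instead).

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_ranks(points):
    """Rank 1 is the non-dominated set; rank 2 is the non-dominated set
    once rank 1 is removed; and so on."""
    remaining = set(range(len(points)))
    ranks = [0] * len(points)
    rank = 1
    while remaining:
        front = {i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)}
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
ranks = pareto_ranks(points)
```

Here the first three points are mutually non-dominated and share rank 1, (3, 3) is dominated only by (2, 2) and gets rank 2, and (4, 4) gets rank 3. All members of a rank are treated as "equally fit", so selection pressure acts between ranks and not within them.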

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/scl/fi/kmqhx2po0o05c36vnh972/IEE598-Lecture3C_4A-2025-02-25-Pareto_Ranking_and_Moving_from_Communities_to_MetaPopulations-DGA_PGA-Notes.pdf?rlkey=dut1pok8mcusqq3sagljbz9yi&dl=0


