Wednesday, April 29, 2020

Lecture 8C: Elementary Cellular Automata, Stochastic Cellular Automata, Ising models, and Cellular Evolutionary Algorithms (2020-04-29)

In this lecture, we close out the semester's technical content with some final words about cellular automata (CA) and interacting particle systems (IPS). We keep the focus from the previous lecture on Elementary CA's (ECA's), reviewing the naming convention and mentioning some popular rules (such as rule 30, rule 110, rule 184, and rule 232). We then pivot to stochastic/probabilistic CA's, focusing on 1-D stochastic ECA's, and show that probabilistic rules generate more than just "fuzzy patterns." This lets us talk about Ising models and physical/hardware implementations of annealing machines for high-speed combinatorial optimization by metaheuristics, partially closing the loop between the end of the class and its beginning. To fully close that loop, we finish with an introduction to Cellular Evolutionary Algorithms (cEA's), which define genetic operators over spatially defined neighborhoods in a way clearly inspired by CA's (and which can certainly be viewed as a kind of CA themselves).
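
As a minimal sketch of the kind of 1-D stochastic ECA discussed here (my construction, not taken from the notes): each cell applies a deterministic Wolfram rule with probability p and otherwise holds its state, which is one simple way randomness can enter a CA update.

```python
import numpy as np

def stochastic_eca_step(state, rule, p, rng):
    """Apply an ECA rule at each cell with probability p; else keep the cell."""
    l, c, r = np.roll(state, 1), state, np.roll(state, -1)
    idx = (l << 2) | (c << 1) | r        # neighborhood (l, c, r) as 0..7
    deterministic = (rule >> idx) & 1    # Wolfram-numbered rule output
    apply_mask = rng.random(state.shape) < p
    return np.where(apply_mask, deterministic, state)

rng = np.random.default_rng(0)
state = rng.integers(0, 2, size=64)
for _ in range(32):                      # a "noisy" version of traffic rule 184
    state = stochastic_eca_step(state, rule=184, p=0.8, rng=rng)
print(state)
```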

Whiteboard notes for this lecture can be found at:
https://www.dropbox.com/s/iwrwe59u5xt2y0v/IEE598-Lecture8C-Notes.pdf?dl=0

Monday, April 27, 2020

Lecture 8B: Elementary Cellular Automata, Computation, and Experimentation (2020-04-27)

In this lecture, we continue discussing Interacting Particle Systems (IPS) and complex systems approaches for understanding adaptive properties of distributed systems, such as neural networks. We specifically focus on cellular automata (CA), starting with 2-D cases (such as models of fire spread, the Game of Life, and "Brian's Brain" from NetLogo) and then moving to Elementary Cellular Automata (ECA). We discuss some popular ECA rules, such as the traffic rule (rule 184) and the majority rule (rule 232), and how they can be combined to do density classification. We also discuss how rule 110 is Turing complete and thus can do much more.
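
As an illustrative sketch of the rule 184 + rule 232 density-classification scheme mentioned here (step counts are my rough approximation of the published construction): run the traffic rule and then the majority rule, each for about half the lattice width, and the ring settles to all 1's exactly when the initial density of 1's exceeds 1/2.

```python
import numpy as np

def eca_step(state, rule):
    """One synchronous update of an ECA on a ring, via Wolfram numbering."""
    l, c, r = np.roll(state, 1), state, np.roll(state, -1)
    idx = (l << 2) | (c << 1) | r
    return (rule >> idx) & 1

def density_classify(state):
    """Rule 184 (traffic), then rule 232 (majority), ~n/2 steps each."""
    n = len(state)
    for _ in range(n // 2):
        state = eca_step(state, 184)
    for _ in range(n // 2):
        state = eca_step(state, 232)
    return state

rng = np.random.default_rng(1)
init = (rng.random(101) < 0.7).astype(int)   # initial density around 0.7
print(density_classify(init).mean())         # expect 1.0 (all ones)
```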

Whiteboard notes for this lecture can be found at:

Wednesday, April 22, 2020

Lecture 8A: Complex Systems Approaches to Computation - Interacting Particle Systems and the Voter Model (2020-04-22)

In this lecture, we introduce thinking about computation from a complex systems perspective. This transition is motivated by the observation that physiological details of neuron-to-neuron communication (e.g., different synapse types, a variety of neurotransmitters, neuromodulatory feedback from other parts of the brain, neuropharmacology) are much better understood at the local level than at the level of emergent cognitive phenomena, and none of these details are currently being incorporated into neuromorphic approaches to artificial cognition. Complex systems methods (methods applied to systems where local-level interactions are easy to describe but large-scale, system-level phenomena emerge non-trivially from those interactions) provide one perspective for better understanding such emergence.

We then introduce Interacting Particle Systems (IPS), of which cellular automata (and both artificial neural networks, ANNs, and spiking neural networks, SNNs) can be viewed as a special case. Before jumping into cellular automata in the next lecture, we close this lecture by discussing the "voter model", which is a model of neutral evolution (i.e., evolution by drift alone with no selective/fitness effects). Asking whether the voter model reaches fixation is equivalent to asking whether the computational system of interacting particles reaches consensus. Answering this requires using a dual, time-reversed system of coalescing Markov chains, akin to doing contact tracing in an epidemiological system. The result (making use of Pólya's recurrence theorem) is that consensus (fixation) will occur in 1 or 2 dimensions but will not always occur (i.e., will fail with non-zero probability) in 3 dimensions or higher.
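
A minimal simulation of the 1-D voter model described here (details are my own assumptions): at each step a random site copies the opinion of a random neighbor, and on a finite ring the process eventually reaches fixation (consensus).

```python
import numpy as np

def voter_model_1d(n=50, seed=0):
    """1-D voter model on a ring; returns the number of steps to fixation."""
    rng = np.random.default_rng(seed)
    state = rng.integers(0, 2, size=n)
    steps = 0
    while 0 < state.sum() < n:                # not yet at consensus
        i = rng.integers(n)                   # a random site...
        j = (i + rng.choice([-1, 1])) % n     # ...copies a random neighbor
        state[i] = state[j]
        steps += 1
    return steps

print(voter_model_1d())
```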

Whiteboard notes for this lecture can be found at:

Monday, April 20, 2020

Lecture 7H: Associative Learning, Spiking Neural Networks, and Alternatives to Backpropagation (2020-04-20)

In this lecture, we start with a review of this unit on Artificial Neural Networks, starting with the original inspiration from Otto Schmitt's work on the squid neuron (inspiring switches with hysteresis, like the Schmitt trigger). We revisit the basic model of an artificial neuron as a generalized linear model (GLM) for regression and classification and discuss how the receptive fields and input layers of radial basis function neural networks (RBFNN's) transform data so that a neuron built for linear separability is able to perform nonlinear classification as well. That lets us pivot into revisiting deep neural networks (such as the multi-layer perceptron, MLP) and forms like the convolutional neural network (CNN) that also make use of receptive fields to simplify the downstream learning. We then remind ourselves of the recurrent neural network (RNN) and more exotic versions like reservoir machines (echo state networks, ESNs) that use randomly connected recurrent neural networks as filters that allow downstream neurons to learn to classify patterns in time (not just space). After being reminded of backpropagation for the training of all of these networks, we discuss how associative learning (specifically Hebbian learning and spike-timing-dependent plasticity, STDP) can be viewed as an alternative to backpropagation for training. This requires us to discuss how psychologists view associative learning as able to generate both classical and operant conditioning, which can then be used to implement supervised, unsupervised, and reinforcement learning. This lets us discuss a new result from Google, AutoML-Zero, that evolves (using genetic programming, GP, genprog) novel ML algorithms that themselves instantiate new neural networks. AutoML-Zero has "discovered" backpropagation on its own, but it has also found other ways to train weights. We end with a discussion of the hardware implementation from Boyn et al. (2017, Nature Communications) of a crossbar array of memristors that can do unsupervised pattern recognition through STDP (no backpropagation required).
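
As a hedged illustration of the Hebbian alternative to backpropagation mentioned above (my sketch, using Oja's stabilized variant rather than anything specific from the lecture), the update strengthens weights between co-active pre- and postsynaptic units and, over many samples, extracts the top principal component of the inputs.

```python
import numpy as np

def oja_update(w, x, lr=0.01):
    """One Hebbian step with Oja's normalization: dw = lr * y * (x - y * w)."""
    y = w @ x                             # postsynaptic activity (linear neuron)
    return w + lr * y * (x - y * w)

rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 2)) * np.array([3.0, 0.5])  # dominant x-direction
w = rng.normal(size=2)
for x in data:
    w = oja_update(w, x)
print(w / np.linalg.norm(w))              # close to [±1, 0], the top component
```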


Whiteboard notes for this lecture can be found at:

Wednesday, April 15, 2020

Lecture 7G: Intro. to Spiking Neural Networks and Neuromorphic Computing (2020-04-15)

In this lecture, we continue discussing associative/Hebbian learning in neural networks – starting with the inspiration from real neurons and leading to the approximation of this learning mechanism in Artificial Neural Networks (ANN). We discuss a simple pattern-recognition example that is well suited for spike-timing-dependent plasticity (STDP). We then set up the foundations for discussing true spiking neural networks (SNN), introducing some of the existing platforms (IBM TrueNorth, Intel Loihi, and SpiNNaker). We will start the next lecture with a discussion of memristor-based crossbar arrays that implement STDP, as first described in Boyn et al. (2017) in Nature Communications.
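
A minimal sketch (assumptions mine) of the pair-based STDP weight change that makes such pattern recognition possible: pre-before-post spike pairs potentiate a synapse, the reverse order depresses it, each with an exponentially decaying window.

```python
import numpy as np

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for one spike pair; dt = t_post - t_pre in ms."""
    if dt > 0:                            # pre fired before post: potentiate
        return a_plus * np.exp(-dt / tau)
    return -a_minus * np.exp(dt / tau)    # post fired before pre: depress

for dt in (-40, -10, 10, 40):
    print(dt, round(stdp_dw(dt), 5))
```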


Whiteboard notes for this lecture can be found at:

Monday, April 13, 2020

Lecture 7F: Unsupervised Learning, Multidimensional Scaling, & Hebbian/Associative Learning with Artificial Neural Networks (2020-04-13)

In this lecture, we continue our discussion of unsupervised learning methods with artificial neural networks (ANN's), with reminders of clustering, anomaly detection, and generative adversarial networks (GANs) – all of which represent a form of computational creativity. We then pivot to review the basics of principal component analysis (PCA) as an example method for multidimensional scaling (MDS). That allows us to introduce the autoencoder ("self encoder") feed-forward network, which is an example of using methods from supervised learning to generate an unsupervised learning process that implements nonlinear multidimensional scaling. We close the lecture with an introduction to Hebbian/associative learning (and motivations from spike-timing-dependent plasticity, STDP) and how Hebbian update rules can be applied to conventional artificial neural networks for pattern learning.
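
As a quick, hedged sketch of the PCA-as-MDS connection (standard numpy, nothing specific to the lecture): projecting centered data onto the top right-singular vectors gives the linear embedding that an autoencoder with linear activations effectively learns.

```python
import numpy as np

def pca(X, k):
    """Project rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                       # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # k-dimensional embedding

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))   # correlated features
print(pca(X, 2).shape)                            # (100, 2)
```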

Whiteboard notes for this lecture can be found at:

Wednesday, April 8, 2020

Lecture 7E: Reinforcement and Unsupervised Learning (2020-04-08)

In this lecture, we continue our investigation of Artificial Neural Networks (ANN), but we move from supervised learning through reinforcement learning (RL) to an introduction to unsupervised learning. We pick up where we left off with reservoir machines (e.g., Echo-State Networks, ESN's), which reduce the amount of training necessary for a complex supervised learning task. We then pivot to talk about reinforcement learning and introduce Q-learning (specifically deep Q-learning). We then close with an introduction to unsupervised learning, with a brief discussion of clustering, anomaly detection, and computational creativity provided by generative adversarial networks (GAN). This will set us up for multidimensional scaling (MDS) in the next lecture and a march toward Hebbian learning, spike-timing-dependent plasticity, and spiking neural networks (SNN's).
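
A minimal tabular Q-learning sketch (my own illustration; the deep Q-learning discussed in lecture replaces this table with a neural network that generalizes across states).

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
q_update(Q, s=0, a=1, r=1.0, s_next=2)   # one observed transition (s, a, r, s')
print(Q[0])
```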

Whiteboard notes for this lecture can be found at:

Monday, April 6, 2020

Lecture 7D: More Complex Neural Networks and their Training (2020-04-06)

In this lecture, we continue our review of artificial neural networks (ANN's) and some of the nature-inspired directions that inspired them. We move from feed-forward multi-layer perceptrons to more general topologies that skip layers or add recurrent connections (as in Recurrent Neural Networks, RNN's). This provides a brief introduction to Time Delayed Neural Networks (TDNN) and an opportunity to make connections between Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filtering from linear systems theory and signal processing. We then discuss how a generalized version of backpropagation, known as backpropagation through time (BPTT), can be used to train these networks in much the same way as multi-layer perceptrons. We then discuss a specialized RNN known as Long Short-Term Memory (LSTM) and its motivations, and we close with a brief introduction to reservoir computing ("reservoir machines", "Echo-State Networks", ESN's), which requires only a small subset of the total network to be trained and yet can do a wide variety of the tasks that other "fully trained" RNN's can do.
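
A compact echo state network sketch (my assumptions: tanh reservoir, ridge-regression readout) showing why only a small part of the network needs training: the recurrent weights stay fixed and random, and only the linear readout is fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 100, 1

# Fixed random reservoir, scaled to spectral radius < 1 (echo-state property)
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.normal(size=(n_res, n_in))

def run_reservoir(u_seq):
    """Drive the fixed reservoir with an input sequence; collect its states."""
    x, states = np.zeros(n_res), []
    for u in u_seq:
        x = np.tanh(W @ x + W_in @ np.atleast_1d(u))
        states.append(x.copy())
    return np.array(states)

# Train only the readout (ridge regression) to predict the next input
u = np.sin(np.linspace(0, 20 * np.pi, 1000))
X, y = run_reservoir(u[:-1]), u[1:]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
print(np.mean((X @ W_out - y) ** 2))     # small one-step prediction error
```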

Whiteboard notes for this lecture can be found at: https://www.dropbox.com/s/pa5ju9pxja38zxv/IEE598-Lecture7D-Notes.pdf?dl=0

Wednesday, April 1, 2020

Lecture 7C: Deep Neural Networks (backpropagation, CNNs, & insect brains) (2020-04-01)

In this lecture, we continue our introduction of supervised learning for feed-forward neural networks. We start with a re-introduction of convolutional neural networks (CNN's) and their loose inspiration from the vertebrate brain's visual cortex and cast them as a special case of multi-layer perceptrons (MLP's). We then continue our explanation of backpropagation as a method of using (stochastic) gradient descent (SGD) to train neural networks in supervised learning. We then pivot to discuss alternative deep neural network topologies potentially inspired by the architecture of other natural systems, such as the insect brain. We introduce the major neuropils of the insect brain (optic lobe, antennal lobe (and glomeruli), lateral horn, mushroom bodies (and Kenyon cells), and central complex) and the various interneurons that connect locally within and project among them. This allows us to portray the apparently random projections from the antennal lobe's glomeruli to the Kenyon cells as a kind of hashing function. The lecture ends with a discussion of the utility of thinking of conventional ANN's as being nature inspired.
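
As a loose sketch of the glomeruli-to-Kenyon-cell hashing analogy (my illustration, in the spirit of the "fly hash"): a sparse random projection followed by winner-take-all sparsification acts as a locality-sensitive hash, so similar inputs receive largely overlapping sparse tags.

```python
import numpy as np

rng = np.random.default_rng(0)
n_glomeruli, n_kenyon, k = 50, 2000, 100     # expand dimension, then sparsify

# Each "Kenyon cell" samples a small random subset of "glomeruli"
M = (rng.random((n_kenyon, n_glomeruli)) < 0.1).astype(float)

def fly_hash(x):
    """Project up, then keep only the top-k most active units."""
    activity = M @ x
    tag = np.zeros(n_kenyon, dtype=int)
    tag[np.argsort(activity)[-k:]] = 1       # winner-take-all threshold
    return tag

x = rng.random(n_glomeruli)
noisy = x + 0.01 * rng.normal(size=n_glomeruli)
print((fly_hash(x) & fly_hash(noisy)).sum(), "of", k, "tag bits shared")
```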

Whiteboard notes for this lecture can be found at:

Monday, March 30, 2020

Lecture 7B: Introduction to Neural Networks: RBF, MLP, & Backpropagation (2020-03-30)

In this lecture, we continue our introduction to artificial neural networks (ANN). We start with a review of how a model of a single neuron can be viewed as a generalized linear model (GLM) and how a single-layer (single-neuron) perceptron (SLP) can be a binary classifier for linearly separable data sets. We then discuss how radial basis function (RBF) neural networks (RBFNN) can be viewed as nonlinear transformations of data that allow for linear separability in the new space. These RBF's combine multiple neurons together, which gives us an opportunity to introduce the multi-layer perceptron (MLP) and closely related variants (like the convolutional neural network, CNN). We discuss the Universal Approximation Theorem (UAT) and what it means for why we bother with deep neural networks. We then start to introduce backpropagation as a way of training MLP's.
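
A small sketch (assumptions mine) of the RBF idea: XOR-labeled points are not linearly separable in the original space, but Gaussian similarities to two hand-picked centers make them so.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # XOR inputs
y = np.array([0, 1, 1, 0])                        # XOR labels

centers = np.array([[0, 0], [1, 1]])              # hand-picked RBF centers

def rbf_features(X, centers, gamma=1.0):
    """Map each point to its Gaussian similarity with each center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

Phi = rbf_features(X, centers)
print(np.round(Phi, 3))
# Class-0 points map near (1, 0.14) and (0.14, 1); class-1 points map to
# (0.37, 0.37), so the line phi0 + phi1 = 0.9 now separates the classes.
```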

Whiteboard notes for this lecture can be found at:

Wednesday, March 25, 2020

Lecture 7A: Introduction to Neural Networks (2020-03-25)

This lecture introduces the very basics of neural networks, from a description of a basic biological neuron to the simple single-layer perceptron that it inspired. We cover the relationships between a single-layer perceptron and generalized linear modeling (GLM). We then demonstrate how the weights of a single artificial neuron can be trained for binary classification problems (with linearly separable data).
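
A minimal sketch of the perceptron training rule demonstrated here (the toy data and learning rate are my own choices).

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Classic perceptron rule: nudge weights on each misclassified point."""
    w = np.zeros(X.shape[1] + 1)                  # weights plus bias
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append constant bias input
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            pred = int(w @ xi > 0)
            w += lr * (yi - pred) * xi            # update only when wrong
    return w

# Linearly separable toy data: label is 1 when x0 + x1 > 1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 2]])
y = np.array([0, 0, 0, 1, 1])
print(train_perceptron(X, y))
```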

Whiteboard notes for this lecture can be found at:

Monday, March 23, 2020

Lecture 6D: Distributed AI and Swarm Intelligence, Part 4 - Particle Swarm Optimization (PSO) and friends (2020-03-23)

In this lecture, we continue to cover basic Particle Swarm Optimization (PSO) and some slight variations on the basic model that help improve performance and practicality of the approach. We use the social aggregation in PSO to motivate topics that will come up in the next few lectures on opinion dynamics and related dynamical phenomena that are less related to optimization and more related to consensus and stability in distributed groups.
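
A condensed PSO sketch (hyperparameter values are illustrative assumptions) showing the inertia-weight velocity update that many of the variations discussed here tune.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):                                # objective to minimize (sphere)
    return (x ** 2).sum(axis=-1)

n, dim = 30, 2
x = rng.uniform(-5, 5, (n, dim))         # positions
v = np.zeros((n, dim))                   # velocities
pbest = x.copy()                         # personal bests
gbest = x[f(x).argmin()].copy()          # global best

w, c1, c2 = 0.7, 1.5, 1.5                # inertia, cognitive, social weights
for _ in range(100):
    r1, r2 = rng.random((2, n, dim))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = x + v
    better = f(x) < f(pbest)
    pbest[better] = x[better]
    gbest = pbest[f(pbest).argmin()].copy()

print(f(gbest))                          # near 0
```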

The whiteboard notes for this lecture can be found at: https://www.dropbox.com/s/dmzjecmuxzzv3xk/IEE598-Lecture6D-Notes.pdf?dl=0

Wednesday, March 18, 2020

Lecture 6C: Distributed AI and Swarm Intelligence, Part 3 - Classical Particle Swarm Optimization (PSO) & Its Motivations (2020-03-18)

This lecture continues our introduction to Swarm Intelligence algorithms, with a major focus on Particle Swarm Optimization (PSO) and the models of collective motion that influenced it (like Reynolds' Boids model for computer graphics, Vicsek's self-propelled particle model, and "selfish herds" in general). We contrast PSO with more "stigmergic" optimization algorithms like ant colony optimization (ACO) and bacterial foraging optimization (BFO) and view it more as an algorithm that tries to maintain cohesion of a group of individuals searching for good positions within a space.
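
A tiny sketch of a Vicsek-style alignment step (my simplification of the model referenced above): each particle adopts the mean heading of its neighbors plus noise, producing the cohesive collective motion that PSO abstracts into a search heuristic.

```python
import numpy as np

def vicsek_step(pos, theta, v0=0.03, r=1.0, eta=0.1, L=10.0, rng=None):
    """One Vicsek update: align each heading with neighbors within radius r."""
    if rng is None:
        rng = np.random.default_rng()
    d = pos[:, None, :] - pos[None, :, :]
    d -= L * np.round(d / L)                          # periodic distances
    nbr = (d ** 2).sum(-1) < r ** 2                   # neighborhood mask
    mean_sin = (nbr * np.sin(theta)[None, :]).sum(1)  # average neighbor headings
    mean_cos = (nbr * np.cos(theta)[None, :]).sum(1)
    theta = np.arctan2(mean_sin, mean_cos) + eta * rng.uniform(-np.pi, np.pi, len(theta))
    pos = (pos + v0 * np.stack([np.cos(theta), np.sin(theta)], axis=1)) % L
    return pos, theta

rng = np.random.default_rng(0)
pos = rng.uniform(0, 10, (50, 2))
theta = rng.uniform(-np.pi, np.pi, 50)
for _ in range(200):
    pos, theta = vicsek_step(pos, theta, rng=rng)
print(np.hypot(np.cos(theta).mean(), np.sin(theta).mean()))  # order parameter near 1
```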

Whiteboard notes for this lecture can be found at:

Monday, March 16, 2020

Lecture 6B: Distributed AI and Swarm Intelligence, Part 2 - Bacterial Foraging Optimization and Intro to Particle Swarm Optimization (PSO) (2020-03-16)

In this lecture, we continue our discussion of algorithms in the category of swarm intelligence. We start with a recap of Ant System (AS) – a precursor to Ant Colony Optimization (ACO) with very similar underlying mechanisms of stigmergic operation based on virtual pheromones. We then pivot to describe bacterial foraging optimization (BFO), which considers freely moving bacteria that can "tumble" whenever reaching a nutrient decline and "run" across nutrient increases. Like trail-laying ants, these bacteria can use chemicals to aggregate and spread social information. These chemicals can be attractive or repulsive. Like evolutionary algorithms, these bacteria can be eliminated based on poor local performance and increased in areas based on good local performance. After covering bacterial foraging optimization, we start to introduce particle swarm optimization (PSO), which is a similar motion-based metaheuristic meant to maintain cohesive swarms that move through a decision space. We will pick up on PSO more in the next lecture.
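
A bare-bones chemotaxis sketch (my illustration; full BFO adds the swarming, reproduction, and elimination steps summarized above): tumble to a random heading, then keep swimming while the nutrient value improves.

```python
import numpy as np

def chemotaxis(f, x, steps=200, step_size=0.1, max_swims=4, seed=0):
    """BFO-style chemotaxis loop: random tumbles, bounded improving runs."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        d = rng.normal(size=x.size)
        d /= np.linalg.norm(d)                       # tumble: new unit heading
        swims = 0
        while swims < max_swims and f(x + step_size * d) < f(x):
            x = x + step_size * d                    # run while improving
            swims += 1
    return x

f = lambda x: (x ** 2).sum()                 # nutrient landscape (minimize)
print(chemotaxis(f, np.array([3.0, -2.0])))  # approaches the origin
```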

Whiteboard lecture notes (with some corrections) can be found at:

Wednesday, March 4, 2020

Lecture 6A: Distributed AI and Swarm Intelligence, Part 1 - Ant Colony Optimization (ACO) (2020-03-04)

This lecture introduces distributed AI and swarm intelligence. After a brief discussion of the difference between these approaches and the previous evolutionary and nature-inspired metaheuristics, we cover Ant Colony Optimization (ACO)/Ant System (AS), which is an approach for combinatorial optimization over discrete decision spaces based on how trail-laying ants find the shortest path from nest to food. Because this approach involves agents communicating/coordinating through persistent modifications of the environment, it falls under the category of using "stigmergy."
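
A hedged mini-sketch of the stigmergic core of AS/ACO (parameter values are illustrative), using the classic two-path "double bridge" setup: ants choose paths in proportion to pheromone, pheromone evaporates, and shorter paths receive more reinforcement per trip.

```python
import numpy as np

rng = np.random.default_rng(0)

lengths = np.array([1.0, 2.0])    # path 0 is short, path 1 is long
tau = np.ones(2)                  # pheromone on each path
rho, Q = 0.1, 1.0                 # evaporation rate, deposit constant

for _ in range(200):
    p = tau / tau.sum()                      # choice probabilities
    choices = rng.choice(2, size=20, p=p)    # 20 ants pick paths
    tau *= (1 - rho)                         # evaporation
    for c in choices:
        tau[c] += Q / lengths[c]             # shorter path earns more pheromone

print(tau / tau.sum())            # pheromone concentrates on the short path
```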

Whiteboard notes can be found at:

Monday, March 2, 2020

Lecture 5C: From MCMC Sampling to Optimization by Simulated Annealing (2020-03-02)

In this lecture, we start with an outline of the Metropolis–Hastings (MH) Markov Chain Monte Carlo (MCMC) sampling algorithm and describe how instantiating it with the Boltzmann–Gibbs distribution (which turns it into the classical Metropolis algorithm) is equivalent to sampling from an abstract energy landscape in the "least biased" way given a kinetic energy budget. We then discuss how the acceptance likelihood ratio allows this (and other) MCMC algorithms to be used in applications where the desired relative frequency of outcomes is known but the normalization constant usually needed for a probability density is cumbersome to calculate. From there, we move from MH to Simulated Annealing (SA), which is a special case of the Metropolis algorithm with an annealing schedule. SA sets a temperature and then samples in the "least biased" (maximum entropy) way for that temperature before decreasing the temperature, in the hopes of eventually finding the minimum of an optimization objective.
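
A compact SA sketch (my assumptions: Gaussian proposals, geometric cooling) built directly on the Metropolis acceptance rule described above.

```python
import numpy as np

def simulated_annealing(f, x0, T0=1.0, cooling=0.995, steps=5000, seed=0):
    """Metropolis acceptance with a geometrically decreasing temperature."""
    rng = np.random.default_rng(seed)
    x, T = np.asarray(x0, float), T0
    for _ in range(steps):
        x_new = x + rng.normal(scale=0.1, size=x.shape)   # random proposal
        dE = f(x_new) - f(x)
        if dE < 0 or rng.random() < np.exp(-dE / T):      # Metropolis rule
            x = x_new
        T *= cooling                                      # annealing schedule
    return x

f = lambda x: (x ** 2).sum() + 2 * np.sin(5 * x).sum()    # multimodal objective
print(simulated_annealing(f, [3.0, -3.0]))
```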

Whiteboard notes from this lecture can be found at: https://www.dropbox.com/s/hprzvgrshg755ge/IEE598-Lecture5C-Notes.pdf?dl=0

Wednesday, February 26, 2020

Lecture 5B: Physics-Inspired Algorithms and Simulated Annealing (2020-02-26)

In this lecture, we continue to discuss nature-inspired algorithms that come from the non-biological, physics realm. Maximum entropy (MaxEnt) is portrayed through a number of more concrete examples as a modeling framework that generates a probability distribution constrained by what information is formally known and otherwise maximal in its degrees of freedom (thereby not introducing additional biases). It is applied to a simple Natural Language Processing (NLP) example, and then we pivot to talking about MaxEnt in terms of physical systems and how the Boltzmann–Gibbs distribution results. We then use the Boltzmann–Gibbs distribution as a motivation for the Metropolis (and then Metropolis–Hastings) Markov Chain Monte Carlo (MCMC) algorithm and preview how we will change the MH algorithm into the Simulated Annealing (SA) algorithm from Kirkpatrick et al. (1983). We will cover SA more concretely in the next lecture.
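
A minimal Metropolis sampler sketch (assumptions mine) targeting a Boltzmann–Gibbs distribution p(x) ∝ exp(-E(x)/T); note that the acceptance test uses only a ratio of unnormalized densities, so the normalization constant never appears.

```python
import numpy as np

def metropolis(E, x0, T=1.0, n=10000, seed=0):
    """Sample from p(x) proportional to exp(-E(x)/T) via Metropolis."""
    rng = np.random.default_rng(seed)
    x, samples = float(x0), []
    for _ in range(n):
        x_new = x + rng.normal(scale=0.5)
        if rng.random() < np.exp(-(E(x_new) - E(x)) / T):  # density ratio only
            x = x_new
        samples.append(x)
    return np.array(samples)

E = lambda x: 0.5 * x ** 2       # quadratic energy gives a Gaussian target
s = metropolis(E, 3.0)
print(s.mean(), s.var())         # approximately 0 and T = 1
```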

Whiteboard notes for this lecture can be found at: 

Monday, February 24, 2020

Lecture 5A: Introduction to Simulated Annealing (2020-02-24)

In this lecture, we dive into the history behind simulated annealing and how it intersects with other tools such as MaxEnt (maximum entropy) methods used in NLP and Markov Chain Monte Carlo (MCMC) sampling. We go back to the original article by Metropolis et al. on using random sampling methods to estimate the mean properties of materials at equilibrium. This allows us to introduce the Boltzmann distribution and discuss its relationship to the exponential distribution and maximum entropy justifications for using Poisson process models. We then hint at the likelihood-ratio generalization by Hastings and how these methods were ultimately co-opted for optimization by Kirkpatrick et al.

Whiteboard notes for this lecture can be found at:

Wednesday, February 19, 2020

Lecture 4B: Niching Methods for Multi-modal Optimization [audio only] (2020-02-19)

In this lecture (which unfortunately does not have a video capture due to technical problems in the room), we cover several different niching methods that evolutionary (population-based) multi-modal optimization techniques use to stabilize populations across all of the different peaks of an optimization function. We motivate why peak finding is a potentially useful optimization objective and describe how the approaches (which are primarily designed to integrate into genetic algorithms but could be used in other evolutionary algorithms) differ not only in performance but also in computational cost. Ultimately, we would like to distribute very many individuals across the many peaks, which means that we seek algorithms that scale manageably as we scale up the number of individuals. By the end of the lecture, restricted tournament selection (RTS) is discussed as being very useful in multi-modal optimization, both in terms of its good performance and its hyperparameters that allow for trading off speed and accuracy.
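
A condensed restricted tournament selection sketch (representation and parameters are my assumptions): each offspring competes only against the most similar member of a random window, which is what lets multiple peaks coexist.

```python
import numpy as np

def rts_insert(pop, fit, child, f, w=5, rng=None):
    """Offspring replaces its most similar rival in a random window if fitter."""
    if rng is None:
        rng = np.random.default_rng()
    window = rng.choice(len(pop), size=w, replace=False)
    dists = [np.linalg.norm(pop[i] - child) for i in window]
    rival = window[int(np.argmin(dists))]
    if f(child) > fit[rival]:                     # maximizing fitness
        pop[rival], fit[rival] = child, f(child)

rng = np.random.default_rng(0)
f = lambda x: float(np.sin(3 * x[0]))             # multimodal on [0, 2*pi]
pop = [rng.uniform(0, 2 * np.pi, 1) for _ in range(20)]
fit = [f(x) for x in pop]
for _ in range(500):
    parent = pop[rng.integers(len(pop))]
    child = parent + rng.normal(scale=0.1, size=1)    # mutation-only variation
    rts_insert(pop, fit, child, f, rng=rng)
print(sorted(round(x[0], 2) for x in pop))        # clusters near several peaks
```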

Whiteboard notes for this lecture can be found at:

Monday, February 17, 2020

Lecture 4A: From Multiobjective Genetic Algorithms to DGA, PGA, and Multimodal Optimization (2020-02-17)

In this lecture, we move from talking about multi-objective genetic algorithms to distributed and parallel genetic algorithms. The discussion of distributed genetic algorithms (DGA) as a metapopulation shows us how, in principle, small populations undergoing evolutionary processes can collectively move through "adaptive valleys" to explore much better than a monolithic population. This idea, in which distributed subpopulations each make use of genetic drift as a mechanism for exploration, is precisely the shifting-balance theory of Sewall Wright, a theory that has not proved especially useful in biology but may be a perfect fit for evolutionary computing in a distributed setting. We then discuss how our insights from multiobjective and distributed GA's can be used to reinvigorate approaches for single-objective, single-processor GA's and so-called "multimodal optimization" (which will be described in the next lecture).

Whiteboard notes for this lecture can be found at:

Wednesday, February 12, 2020

Lecture 3C: Multi-Objective Genetic Algorithms (MOGA and Friends), Part 2 (2020-02-12)

In this lecture, we expand upon the first two classes of multi-objective genetic algorithms (i.e., those based on random weights and those based on alternating fitness objectives) and add the third and most modern class – Pareto-ranking approaches. After reviewing the first two classes, we discuss how a new "fitness" can be used that represents how many other solutions a focal solution dominates. By selecting according to this fitness, solutions are driven toward the Pareto frontier without collapsing to any particular region of that frontier. To further improve diversity, "fitness sharing" is introduced. We close with a brief introduction to distributed GA's and multi-modal optimization and hint at a connection to Sewall Wright's "shifting-balance theory" from population genetics.
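
A small sketch (my illustration) of the domination-count idea at the heart of these Pareto-ranking approaches.

```python
import numpy as np

def dominates(a, b):
    """a dominates b if no worse in every objective and better in at least one."""
    return np.all(a <= b) and np.any(a < b)       # minimization convention

def domination_fitness(F):
    """Score each solution by how many others it dominates."""
    return np.array([sum(dominates(f, g) for g in F) for f in F])

F = np.array([[1, 4], [2, 2], [4, 1], [3, 3], [4, 4]])   # objective vectors
print(domination_fitness(F))    # selecting on this drives toward the frontier
```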

Whiteboard notes for this lecture can be found at:

Monday, February 10, 2020

Lecture 3B: Multi-Objective Genetic Algorithms (MOGA and Friends), Part 1 (2020-02-10)

This lecture briefly re-introduces the Pareto-efficient set and Pareto frontier and then describes different early GA-inspired approaches to find the Pareto-efficient set among a set of conflicting objectives. The key conceptual framework used throughout the lecture is that of a "community" of organisms, selected to live in similar environments, that inhabit different regions of niche space. The entire continuum of niche space is the Pareto frontier, and particular solutions along the frontier (different ways in which objectives can be balanced against each other) are like different species living in the community of optimizers that balance objectives differently.

This lecture also includes some comments about population size and its effect on evolution as well as how phenotypic variance can enhance evolvability.

Whiteboard notes for this lecture can be found at:

Wednesday, February 5, 2020

Lecture 3A: Introduction to Multi-criteria Decision Making and Pareto Optimality (2020-02-05)

In this lecture, we introduce the general topic of multi-criteria decision making and, more specifically, multi-objective optimization. The lecture starts with a discussion of Nash optimality (and Variational Inequalities), as formulated for the continuous-variable case. We then conclude with an introduction to Pareto-efficient sets and Pareto optimality and how these set-based approaches may, in practice, be superior to more traditional point-based optimization approaches. This lecture sets up the background we will use in upcoming lectures to build evolutionary algorithms for discovering Pareto-efficient sets.

The whiteboard notes for this lecture can be found at:
https://www.dropbox.com/s/opseeleih8aakfk/IEE598-Lecture3A-Notes.pdf?dl=0

Monday, February 3, 2020

Lecture 2B: GA Approaches Beyond Optimization - Evolutionary Programming (2020-02-03)

In this lecture, we continue to describe applications outside of typical numerical optimization that have been greatly influenced by the field of evolutionary computing. We focus here on evolutionary programming (in general) and genetic programming (specifically).
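
A toy sketch (mine, not from the lecture) of the expression-tree representation that genetic programming evolves, with a recursive evaluator; GP crossover would swap random subtrees between two such trees.

```python
import random
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(tree, x):
    """Recursively evaluate an expression tree at input x."""
    if tree == 'x':
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def random_tree(depth=2):
    """Grow a random tree: internal nodes are ops, leaves are x or constants."""
    if depth == 0:
        return random.choice(['x', random.uniform(-1, 1)])
    return (random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1))

random.seed(0)
t = random_tree()
print(t, '->', evaluate(t, 2.0))
```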

The whiteboard for this lecture was captured in PDF form at: https://www.dropbox.com/s/5b4vqjms9196z2o/IEE598-Lecture2B-Notes.pdf?dl=0

Wednesday, January 29, 2020

Lecture 2A: GA Approaches Beyond Optimization - Artificial Immune Systems (2020-01-29)

This lecture closes out our discussion of the basic Genetic Algorithm (GA) and introduces closely related approaches for problems not directly related to optimization. In this lecture, we cover Artificial Immune Systems (AIS), a topic from immunocomputing. We cover the basic clonal selection and negative selection/elimination methods for artificial immune systems (as well as some very basic biology about the immune system, both innate and adaptive/specific).
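
A bare-bones negative selection sketch (the bit-string representation and r-contiguous matching rule are my assumptions): candidate detectors that match any "self" string are eliminated, and the survivors flag non-self (anomalous) inputs.

```python
import random

def matches(detector, s, r=5):
    """r-contiguous-bits rule: any aligned r-bit window that agrees is a match."""
    return any(detector[i:i+r] == s[i:i+r] for i in range(len(s) - r + 1))

random.seed(0)
bits = lambda: ''.join(random.choice('01') for _ in range(8))
self_set = {bits() for _ in range(30)}

# Negative selection: keep only detectors that match no self string
detectors = [d for d in (bits() for _ in range(500))
             if not any(matches(d, s) for s in self_set)]

probe = bits()
print(len(detectors), any(matches(d, probe) for d in detectors))
```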

Notes taken during this lecture can be found at:
https://www.dropbox.com/s/8rexbjaefvslq7c/IEE598-Lecture2A-Notes.pdf?dl=0

Monday, January 27, 2020

Lecture 1D: Implementing the Genetic Algorithm, Part 2 (2020-01-27)

Continued coverage of details of how the Genetic Algorithm is implemented, with a focus on the selection and mutation operators. A PDF of the lecture notes can be found at: https://www.dropbox.com/s/61u8vnlsfd4ed7c/iee598-lecture1d-notes.pdf?dl=0
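
Minimal sketches (my own, with illustrative parameters) of the two operators this lecture focuses on: tournament selection and bit-flip mutation.

```python
import random

def tournament_select(pop, fitness, k=2):
    """Pick k random individuals; the fittest of them wins."""
    contenders = random.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fitness[i])]

def mutate(bits, p=0.01):
    """Flip each bit independently with probability p."""
    return [b ^ (random.random() < p) for b in bits]

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(10)] for _ in range(6)]
fitness = [sum(ind) for ind in pop]              # one-max toy fitness
print(mutate(tournament_select(pop, fitness)))
```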

Wednesday, January 22, 2020

Lecture 1C: Implementing the Genetic Algorithm, Part 1 (2020-01-22)

Introduction to details of how the Genetic Algorithm is implemented, including an introduction to the choice of hyperparameters and genetic operators that can balance selection pressure to maximize the effectiveness of the GA. A PDF of the lecture notes is linked from this blog post.
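
A tiny sketch (mine) of one-point crossover, one of the genetic operators whose rate sits among the hyperparameters discussed here.

```python
import random

def one_point_crossover(a, b):
    """Swap the tails of two parent strings at a random cut point."""
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

random.seed(0)
p1, p2 = [0] * 8, [1] * 8
print(one_point_crossover(p1, p2))
```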

Wednesday, January 15, 2020

Lecture 1B: Basics of Evolutionary Algorithms and Population Genetics (2020-01-15)

The prototypical structure of evolutionary algorithms is presented, along with the background context of direct search metaheuristics (including tabu search). Preparing for the introduction of the genetic algorithm in the next lecture, basic biological terms from population genetics are covered, culminating in a brief discussion of genetic drift and how it applies to evolutionary algorithms for optimization.
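
A schematic of the prototypical loop described above (a hedged sketch; the selection and variation operators are placeholders for those introduced in the next lectures).

```python
import random

def evolve(init, fitness, select, vary, generations=100):
    """Prototypical evolutionary algorithm: evaluate, select, vary, repeat."""
    pop = init()
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]
        parents = select(scored, len(pop))
        pop = [vary(p) for p in parents]
    return max(pop, key=fitness)

# Toy instantiation: maximize the number of ones in a bit string (one-max)
random.seed(0)
init = lambda: [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
fitness = sum
select = lambda scored, k: [max(random.sample(scored, 3))[1] for _ in range(k)]
vary = lambda ind: [b ^ (random.random() < 0.05) for b in ind]
print(fitness(evolve(init, fitness, select, vary)))
```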

Monday, January 13, 2020

Lecture 1A: Introduction to Course, Metaheuristics, and Evolutionary Algorithms (2020-01-13)

Today's lecture introduces the course, its structures, and its policies. Then a basic introduction to metaheuristics is given, leading up to a few introductory words about evolutionary algorithms. A PDF of the digital lecture notes has been linked from this blog post.
