The syllabus for the course is here.

This page lists everything else you need to know for the course. If you don't see it here, check the syllabus before asking.

Time: Tuesday & Thursday, 1:00-2:15

Where: WTS 48

Dan's office hours: Fridays 2:00 - 3:00, by Zoom.

TA: Tracey (Xinyi) Zhong. Office hours: Monday 2:45-3:45, room 225 of DL, 10HLH

The following is the list of recommended readings, notes, and homeworks for each lecture. Those for future dates are tentative.

For background, I recommend Solomon, Chapter 1. You should have seen most of this before, but I will cover the material from Section 1.4.3 in more detail in class.

Many readings come from the following resources, which are freely available online (although some will require access from inside Yale or the VPN).

- **S**: Numerical Algorithms, by Solomon.
- **BV**: Convex Optimization, by Boyd and Vandenberghe.
- **NW**: Numerical Optimization, by Nocedal and Wright.
- **LY**: Linear and Nonlinear Programming, by Luenberger and Ye.

**Jan 14**: **Introduction**

- Reading is Solomon, Sections 2.1 and 2.2.
- Recommended additional reading: https://software.intel.com/en-us/articles/how-memory-is-accessed and the beginning of https://mitmath.github.io/18337/lecture2/optimizing
- My notes
- Julia Jupyter notebook, also available in pdf

**Jan 16: Automatic Differentiation**

- Reading is NW, Chapter 8, Section 1, through "Approximating the Gradient", and
- NW, Chapter 8, Section 2, through "The Reverse Mode".
- Additional suggested reading: MIT 18.337: Automatic Differentiation and Applications
- Suggested background on differentiation: section 1.4 of Solomon; section on derivatives in Appendix A of NW (starts around A.44).
- W. Baur and V. Strassen. The complexity of partial derivatives. Theoretical Computer Science, 22(3):317–330, February 1983.
- My lecture notes, Jupyter notebook, and pdf of the Jupyter notebook.
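As a concrete illustration of forward mode (a sketch of my own in Python rather than the course's Julia, not taken from any notebook): dual numbers carry a value together with a derivative, and each arithmetic rule propagates both.

```python
import math

class Dual:
    """Dual number val + der*eps, with eps^2 = 0; der carries the derivative."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

def sin(x):
    # chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

# differentiate f(x) = x*x + sin(x) at x = 1 by seeding der = 1
f = lambda x: x * x + sin(x)
y = f(Dual(1.0, 1.0))
# y.val is f(1); y.der is f'(1) = 2 + cos(1)
```

Reverse mode instead records the computation graph and propagates derivatives backwards, which is cheaper when there are many inputs and few outputs.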

**Jan 21: Norms, accuracy, and solving linear equations by gradient descent.**

- My lecture notes
- Reading about norms: Solomon, Chapter 4.3.
- Solving linear equations by GD: Solomon, Chapter 11.1, or LY 8.2, or NW 3.3. Note that these often discuss a PSD matrix M or Q, which is our A^TA.
- Reading about forward and backward error: Solomon 2.2, or Higham's Accuracy and Stability of Numerical Algorithms, 1.5.
**First problem set out today.**
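To connect the GD readings to the A^TA remark (a sketch of my own, in Python rather than the course's Julia): minimizing ||Ax - b||^2/2 by gradient descent uses the gradient A^T(Ax - b), and a safe step size is one over the largest eigenvalue of A^TA.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

step = 1.0 / np.linalg.norm(A.T @ A, 2)   # 1 / lambda_max(A^T A)
x = np.zeros(5)
for _ in range(5000):
    x -= step * A.T @ (A @ x - b)         # gradient of ||Ax - b||^2 / 2

x_ls = np.linalg.lstsq(A, b, rcond=None)[0]  # converges to the least-squares solution
```

The rate of convergence is governed by the ratio of the smallest to largest eigenvalue of A^TA, which is why condition numbers enter the story.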

**Jan 23: Condition numbers.**

- Suggested reading: Demmel, Applied Numerical Linear Algebra, Section 2.2.
- Solomon, Section 4.3.2.
- My lecture notes

**Jan 28: Solving linear equations by LU factorization. Pivoting**

- Reading: Solomon, Chapter 3, or Trefethen & Bau Chapters 20 & 21.
- My lecture notes
- A Julia/Jupyter notebook for this and next lecture, and a pdf of it.
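For a rough sense of the algorithm (a sketch of my own, not the course notebook): LU with partial pivoting brings the largest remaining entry of each column to the diagonal before eliminating below it.

```python
import numpy as np

def lu_partial_pivot(A):
    """Return P, L, U with P @ A = L @ U, using partial pivoting."""
    A = A.astype(float).copy()
    n = A.shape[0]
    P = np.eye(n)
    L = np.eye(n)
    for k in range(n - 1):
        # pivot: bring the largest entry of column k (rows k..n-1) to the diagonal
        p = k + int(np.argmax(np.abs(A[k:, k])))
        if p != k:
            A[[k, p]] = A[[p, k]]
            P[[k, p]] = P[[p, k]]
            L[[k, p], :k] = L[[p, k], :k]
        # eliminate below the pivot, recording the multipliers in L
        for i in range(k + 1, n):
            L[i, k] = A[i, k] / A[k, k]
            A[i, k:] -= L[i, k] * A[k, k:]
    return P, L, np.triu(A)

M = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
P, L, U = lu_partial_pivot(M)
```

Partial pivoting keeps every multiplier in L at most 1 in absolute value, which is the starting point for the growth-factor discussion.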

**Jan 30: More pivoting (partial), and Positive Definite Matrices.**

- Reading: Solomon, Chapter 3, or Trefethen & Bau Chapters 22 & 23
- See also Trefethen’s class essay “Three Mysteries of Gaussian Elimination”, about which I note that Strassen’s algorithm starts to win in practice around n = 100.

**Feb 4: The Randomized Kaczmarz method of solving linear equations, and condition numbers under perturbation.**

- My lecture notes
- Sections 14.1 and 14.2 of Vishnoi's Lx=b
- A randomized Kaczmarz algorithm with exponential convergence, by Strohmer and Vershynin
- The probabilistic analysis of condition numbers comes from “Smoothed Analysis of the Condition Numbers and Growth Factors of Matrices”. The right analysis of the probability that a perturbed matrix has a small singular value comes from the very recent “Gaussian Regularization of the Pseudospectrum and Davies' Conjecture”.
- The best analysis of the growth factors of partial pivoting under perturbation comes from Arvind Sankar’s thesis.
- Julia Jupyter notebook, and in pdf.
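The Strohmer–Vershynin method is short enough to sketch (this Python version and its variable names are my own): sample row i with probability proportional to ||a_i||^2, then project the iterate onto the hyperplane a_i^T x = b_i.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5))
x_true = rng.standard_normal(5)
b = A @ x_true                       # a consistent system

probs = np.sum(A**2, axis=1)
probs /= probs.sum()                 # row i chosen with prob ||a_i||^2 / ||A||_F^2

x = np.zeros(5)
for _ in range(3000):
    i = rng.choice(len(b), p=probs)
    a = A[i]
    x += (b[i] - a @ x) / (a @ a) * a   # project onto {y : a^T y = b_i}
```

Strohmer and Vershynin show the expected squared error contracts by a factor depending only on the (scaled) condition number of A at every step.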

**Feb 6: Singular values and vectors.**

- Suggested reading: Chapter 3 (at least Sections 3.1-3.8) of Foundations of Data Science by Blum, Hopcroft and Kannan.
- My lecture notes

**Feb 11-20: Linear Programming**

My approach to linear programming is a little unusual. First, I like to treat it geometrically rather than algebraically. Second, I like to describe it in the polar form rather than the standard form. I cannot find any nice treatments of the polar form, so I suggest multiple sources and hope that one resonates with you.

- Notes from lectures I gave long ago: BAP11, BAP12, BAP13.
- The geometric treatment of Jeff Erickson, in his notes: Chapter H, Sections 1, 2, 4, 5, 7, and 8; and Chapter I, Sections 1, 2, and 3.
- LY: Sections 2.1, 2.3, 2.4, and 2.5.
- Notes by Michel Goemans, Sections 1-10

**Feb 11: Linear Programming and Convex Sets**

- Reading: BV Sections 2.1, 2.2 and 2.5, and 4.3 (not 4.3.1 or 4.3.2)
- My lecture notes

**Feb 13: Linear Programming: the simplex method and duality, in the Polar**

**Feb 18: Linear Programming: strong duality, feasibility, Farkas' Lemma.**

**Feb 20: Linear Programming: fast first-order methods, and overview of what's known**

**Feb 25: Introduction to Convex Programming. Convex Functions**

- Suggested Reading: BV Sections 3.1 and 3.2
- my lecture notes

**Feb 27: Gradient descent for convex functions.**

- The results that I will present appear as:
- Theorem 1 in Section 8.2 of LY, and
- the analysis in Section 9.3.1 of BV.

- my lecture notes

**March 2:**

- In-class midterm (10% of grade)

**March 24 & 26: Lagrange multipliers and the KKT conditions for optimality.**

- Recommended reading: Chapter 5 of BV, and Chapter 10 of Undergraduate Convexity by Lauritzen (available through the Yale Library, inside the VPN)
- March 24:
- Mostly based on Sections 10.1 - 10.3 of Undergraduate Convexity by Lauritzen
- my lecture notes

- March 26:
- Mostly based on Section 10.4 of Lauritzen, Sections 5.1-5.2 of BV and Section 5.9 of BV (which relies on sections 2.4, 2.6, 3.6 and 4.6)
- my lecture notes, and what I wrote in class

**March 31: Polynomial time algorithms for convex programming.**

- Recommended reading:
- Lauritzen's book, Section 10.5.
- Section 4.1 of the book Combinatorial Optimization, Theory and Algorithms, by Korte and Vygen. Available as a pdf from Springer when inside Yale VPN.
- Lecture notes by Ryan O'Donnell: mostly lecture 9, but lecture 8 could also be helpful.
- my lecture notes, and what I wrote in class

- References to algorithms for solving convex optimization problems that I mentioned in class include:
- The book Geometric Algorithms and Combinatorial Optimization, by Grötschel, Lovasz, and Schrijver. (available inside Yale VPN).
- The paper "Solving convex programs by random walks" by Bertsimas and Vempala.
- The paper "A Faster Cutting Plane Method and its Implications for Combinatorial and Convex Optimization" by Lee, Sidford and Wong.

**April 2: Introduction to NP.**

- For a clean and humorous introduction to NP, I recommend sections 12.1-12.5 of Jeff Erickson's book, but where Erickson uses the boolean satisfiability problem, I will use solvability of polynomial equations.
- For an introduction to NP hardness in the context of optimization problems, I recommend Vavasis's survey: Complexity Issues in Global Optimization.
- For a list of many NP-hard problems, I suggest A compendium of NP optimization problems
- my lecture notes, and what I wrote in class
- Note that the Zoom meeting was recorded.

**April 7: MaxCut and some NP-hard problems.**

- We will use the hardness of MaxCut to prove that the following problems are NP-hard:
- Maximizing a convex function over a convex polytope.
- Finding the maximum norm point in a polytope.
- Approximating the p-norm of a matrix, for p > 2.

- An introduction to MaxCut can be found on Wikipedia, and some discussion of our first goal is in this blog post.
- The result for matrix p-norms appears in Section 6 of the paper Approximating Matrix p-norms by Bhaskara and Vijayaraghavan. (warning: there are a few typos in that section, which I think I have fixed in my notes).
- my lecture notes, and what I wrote in class

**April 9: Exact Cover -> Sparse solutions to linear equations, and solving most linear equations, k-Color, NMF, and low-rank matrix completion.**

- I can't find good references for most of this. The papers I reference are:
- Peeters, "Orthogonal Representations over Finite Fields and the Chromatic Number of Graphs"
- Shitov, "A short proof that NMF is NP-hard" (warning: I haven't checked it carefully)
- Vavasis, "On the complexity of nonnegative matrix factorization"
- For more on NMF, I recommend "Computational Limits for Matrix Completion" by Hardt, Meka, Raghavendra, and Weitz.

- my lecture notes

**April 14: Sparse (exact) solutions to linear systems: Orthogonal Matching Pursuit**

- Recommended reading: Sections 4.2 and 4.3 of Moitra's Algorithmic Aspects of Machine Learning
- my lecture notes
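A bare-bones sketch of orthogonal matching pursuit (mine, not Moitra's; Python in place of the course's Julia): greedily pick the column most correlated with the residual, then re-fit by least squares on the chosen support.

```python
import numpy as np

def omp(A, b, k):
    """Greedy orthogonal matching pursuit for a k-sparse solution of Ax = b."""
    residual, support = b.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))  # column most correlated with residual
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)  # re-fit on the support
        residual = b - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 100))
A /= np.linalg.norm(A, axis=0)       # unit-norm columns
x_true = np.zeros(100)
x_true[[3, 17, 60]] = [1.0, -2.0, 1.5]
b = A @ x_true
x_hat = omp(A, b, 3)
```

Because the residual is kept orthogonal to the selected columns, no column is ever picked twice; the interesting question, addressed in the reading, is when the greedy choices find the true support.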

**April 16: Sparse (approximate) solutions to linear systems: LASSO**

- We will cover Theorem 1.1 from the paper Stable signal recovery from incomplete and inaccurate measurements, by Candès, Romberg, and Tao.
- If there is time, we will do some experiments.
- my lecture notes, what we wrote in class
- my lecture notes, what we wrote in class
- The jupyter notebook, and a pdf of it.
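The theorem concerns the minimizer itself rather than how to compute it. One standard way to compute LASSO-type minimizers, sketched here purely as my own illustration, is iterative soft-thresholding (ISTA) for min ||Ax - b||^2/2 + lam ||x||_1:

```python
import numpy as np

def ista(A, b, lam, steps=3000):
    """Iterative soft-thresholding for min ||Ax - b||^2 / 2 + lam * ||x||_1."""
    t = 1.0 / np.linalg.norm(A.T @ A, 2)     # step size 1/L for the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        z = x - t * A.T @ (A @ x - b)        # gradient step on ||Ax - b||^2 / 2
        x = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)  # soft-threshold
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 80))
x_true = np.zeros(80)
x_true[[5, 30]] = [3.0, -2.0]
b = A @ x_true                               # noiseless, underdetermined system
x_hat = ista(A, b, lam=0.1)
```

The soft-threshold step is exactly the proximal operator of the l1 penalty, which is what makes the nonsmooth term tractable.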

**April 21: Approximating MaxCut**

- Recommended reading: the famous paper of Goemans and Williamson
- Sections 6.1 and 6.2 from The Design of Approximation Algorithms by Shmoys and Williamson
- I will also cover the local search algorithm described in the lecture notes of Chandra Chekuri
- my lecture notes, what we wrote in class
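The local search algorithm is only a few lines (my own Python rendering of the idea in Chekuri's notes, so treat the details as illustrative): flip any vertex with more uncut than cut incident edges; at a local optimum each vertex has at least half its edges cut, so the cut contains at least half of all edges.

```python
def local_search_maxcut(n, edges):
    """Flip vertices while it grows the cut; returns sides with cut >= |edges| / 2."""
    side = [0] * n
    improved = True
    while improved:
        improved = False
        for v in range(n):
            cut = sum(1 for a, b in edges if v in (a, b) and side[a] != side[b])
            uncut = sum(1 for a, b in edges if v in (a, b) and side[a] == side[b])
            if uncut > cut:          # flipping v gains (uncut - cut) edges
                side[v] ^= 1
                improved = True
    return side

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # a 4-cycle plus one chord
side = local_search_maxcut(4, edges)
cut_size = sum(side[a] != side[b] for a, b in edges)
```

Every flip increases the cut by at least one edge, so the total number of flips, and hence the running time, is bounded by the number of edges.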

**April 23: Matrix completion by nuclear norm minimization, and other stuff I want to say in the last class.**

- Section 7 of Ankur Moitra's book.
- my lecture notes, what we wrote in class

**May 6:** Final Exam -- CANCELLED