Last time | Next time |
Today:
************************* I just wanted to point out to all of you that CSC-485/585 will be offered this summer session. This is the "Theory of Computation" course, which is required for the Computer Science major. Theory can be a difficult course for some students, and being able to focus on that one course alone can be a great advantage for many. I have designed an online version of the course which has worked out quite well over the past 2.5 semesters (I know it sounds like I am bragging, but it really is a good online course, I think). The summer schedule is available and registration is open, so if you are interested, please sign up.

If you are curious about what the course covers, I can summarize it as follows: we search for the answer to one simple question: "What is computable?" Or, to put it another way, "Are there problems that we cannot solve with any computer?" That is, we search for a simple model of computation which is equivalent in computational power to your friendly neighborhood computer, and we try to determine whether there are problems that cannot be solved using said model (spoiler alert - there are!). We then find a way to actually prove that some problems are, in fact, unsolvable with ANY computational device - and I do not mean weird metaphysical questions like "What is the meaning of life?" or "Can God create a stone that she herself can't lift?" or stuff like that - I mean clear problems that can be stated in straightforward mathematical terms. Along the way, as we search for our model and proof technique, we discover models for other families of problems that are quite useful in day-to-day problem solving, and we discover properties about them which we can use in our repertoire of programming skills.

If you have any questions, please let me know. Also, please keep in mind that, as a required course, CSC 485 often fills up very quickly during the Fall/Spring semesters. If you have any questions about the course or anything else that I can help you with, feel free to contact me at newellg@nku.edu *****************************
Chapter 7 is the last material covered for your exam. While we introduced a fair number of techniques, we didn't do as much analysis as we might have. So today we do some analysis. The following ideas/algorithms are featured in these three sections (7.2-7.4):
A few initial comments:
Now to the algorithms.
Now find the smallest entry in the two rows (other than the one already used), in a new column (representing a new node). One need only search in new columns, so each successive search is reduced by one in terms of columns, but augmented by one in terms of rows - so the first search is $1(n-1)$ comparisons; then $2(n-2)$; then $3(n-3)$; and so on.
If one is clever, one can minimize the amount of sorting (in fact, only finding the minimum is necessary - which is $O(n)$). If one is not clever, one sorts $n-1$ lists of lengths \[ \{k(n-k)\}_{k=1}^{n-1}, \] each of which requires $m \log_2(m)$ comparisons (where $m = k(n-k)$), for a total of \[ \sum_{k=1}^{n-1} k(n-k)\log_2(k(n-k)). \]
This is actually a little worse than cubic.
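The "worse than cubic" claim can be checked numerically: if the total were cubic, the ratio of the sum to $n^3$ would level off, but it keeps growing. A small sketch (the function name `sort_cost` is mine):

```python
from math import log2

def sort_cost(n):
    """Total comparisons to sort the n-1 candidate lists of
    lengths k*(n-k), k = 1..n-1, at m*log2(m) comparisons each."""
    return sum(k * (n - k) * log2(k * (n - k)) for k in range(1, n))

# If the cost were cubic, cost/n^3 would approach a constant;
# instead the ratio keeps climbing (by roughly a log factor).
for n in (16, 64, 256):
    print(n, sort_cost(n) / n**3)
```

The ratio grows without bound (like $\log n$), confirming the super-cubic behavior.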
So there's an $n^2$ sort (if the graph is undirected, you can get by with half that - but it's still on the order of $n^2$); then one refills the matrix and checks that every row is covered. Can you guess the worst-case situation?
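This sort-all-the-edges strategy is essentially Kruskal's algorithm. A minimal sketch, using union-find for the cycle/coverage check (an assumption on my part - the notes describe checking row coverage in the matrix instead):

```python
def kruskal(n, edges):
    """Minimum spanning tree by sorting all edges up front.
    edges: list of (weight, u, v) with vertices 0..n-1."""
    parent = list(range(n))

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):     # the O(n^2 log n) sort dominates
        ru, rv = find(u), find(v)
        if ru != rv:                  # skip edges that would close a cycle
            parent[ru] = rv
            tree.append((u, v, w))
    return tree
```

For a connected graph on $n$ vertices, the returned tree always has exactly $n-1$ edges.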
I won't go into the details, but let's look at an example.
As I showed, this algorithm reduces the problem in size with each vertex visited. A vertex is trimmed from the graph, with weights updated to incorporate the short-cuts through that node.
This is a greedy algorithm, and it starts by choosing the nearest node (because the weights are all positive, you can't do any better than the direct route to the nearest node by going through another node - that can only increase your distance).
This should remind us of linear recurrence relations: for a simple graph, \[ C_n = C_{n-1} + (n-1) \] and $C_1=0$.
That is, we're counting only the work needed to find the minimum among the other $n-1$ nodes still to visit (call the chosen node $k$).
But then the weights to other nodes may need to be adjusted to account for shortcuts through the chosen node. Each of that node's adjacencies (of which we need to consider at most $n-2$) must be checked for improvement, comparing a sum (the distance to $k$ plus the weight of the adjacency) against the current shortest distance to each other node.
So in a sort of worst-case, count-everything way, we might have
\[ C_n = C_{n-1} + 3(n-1) \] and $C_1=0$.
This is still quadratic. Solving it is easy:
\[ 0,\ 3,\ 3+3(2)=9,\ 9+3(3)=18,\ 18+3(4)=30,\ \ldots \] That is, $3(0, 1, 3, 6, 10, \ldots)$, or \[ C_n = 3\,\frac{n(n-1)}{2}. \]
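The greedy process described above - repeatedly finalize the nearest unvisited node, then check for shortcuts through it - is Dijkstra's algorithm. A minimal array-based sketch consistent with the quadratic analysis (no priority queue; names are mine):

```python
INF = float("inf")

def dijkstra(W, s):
    """Shortest distances from s, given an adjacency matrix W
    (W[i][j] = positive edge weight, INF if no edge)."""
    n = len(W)
    dist = [INF] * n
    dist[s] = 0
    visited = [False] * n
    for _ in range(n):
        # find the nearest unvisited node: the (n-1)-comparison step
        k = min((i for i in range(n) if not visited[i]),
                key=lambda i: dist[i])
        visited[k] = True
        # relax: check each shortcut through k against current distances
        for j in range(n):
            if not visited[j] and dist[k] + W[k][j] < dist[j]:
                dist[j] = dist[k] + W[k][j]
    return dist
```

Each pass does one minimum search plus at most $n-1$ relaxation checks, matching the $C_n = C_{n-1} + 3(n-1)$-style count above.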
Each of these can be formulated as a recursive scheme: we trim a vertex from the graph each time a vertex is written (since it is never re-written). In each case, it's simply a matter of writing all $n$ vertices, and deciding how to choose the order.
Again,
\[ C_n=C_{n-1}+n-1 \] with a worst case of $\frac{n(n-1)}{2}$.
Similarly for breadth first.
You should know how to carry out each of these algorithms, and be aware of the problems that can arise (e.g. with Euler paths).