Tip:
Highlight text to annotate it
X
In this video I'll talk about various aspects of the course, the topics that
we'll cover, the kinds of skills you can expect to acquire, the kind of background
that I expect, the supporting materials and the available tools for self
assessment. Let's start with the specific topics that this course is going to cover.
The course material corresponds to the first half of the ten week Stanford
course. It's taken by all computer science undergraduates, as well as many of our
graduate students. There will be five high level topics, and at times these will
overlap. The five topics are first of all, the vocabulary for reasoning about
algorithm performance, the design and conquer algorithm design paradigm,
randomization and algorithm design, primitives for reasoning about graphs, and
the use and implementation of basic data structures. The goal is to provide an
introduction to and basic literacy in each of these topics. Much, much more could be
said about each of them, than we'll have time for here. The first topic is the
shortest, and probably also the driest. But it's a prerequisite for thinking
seriously about the design and analysis of algorithms. The key concept here is big O
notation, which, conceptually, is a modeling choice about the granularity with
which we measure a performance metric like the running time of an algorithm. It turns
out that the sweet spot for clear high level thinking about algorithm design, is
to ignore constant factors and [inaudible] terms. And to concentrate on how well
algorithm performance scales with large input sizes. Big O notation is the way to
mathematize this sweet spot. Now, there's no one silver bullet in algorithm design.
No single problem solving method that's guaranteed to unlock all of the
computational problems that you're likely to face. That said, there are a few
general algorithm design techniques. High level approaches to algorithm design that
find successful application across a range of different domains. These relatively
widely applicable techniques are the backbone of a general algorithms course
like this one. In this course, we'll only have time to deeply explore one such
algorithm design paradigm, namely that of the divide and conquer algorithms. In the
sequel course as we'll discuss, there's two other major algorithms on paradigms to
get covered. But for now, divide and conquer algorithm, the idea is to first
break the problem into smaller problems which then gets solved recursively, and
then to somehow quickly combine the solutions to the sub problems into one for
the original problem that you actually care about. So for example, in the last
video. We saw two algorithms of this sort, two divide and conquer algorithms from
multiplying two large integers. In later videos we will see a number of different
applications. We'll see how to design fast divide and conquer algorithms for problems
ranging from sorting to matrix multiplication to nearest neighbor-type
problems and computation of geometry. In addition, we'll cover some powerful
methods for reasoning about the running time of recursive algorithms like these.
As for the third topic. A randomized algorithm is one that, in some sense,
flips coins while it executes. That is, a randomized algorithm will actually have
different executions if you run it over and over again on a fixed input. It turns
out, and this is definitely not intuitive, that allowing randomization internal to an
algorithm, often leads to simple, elegant, and practical solution to various
computational problems. The canonical example is randomized quick sort, and that
algorithm and analysis we will cover in detail in a few lectures. Randomized
formality testing is another killer application that we'll touch on. And we'll
also discuss a randomized approach to [inaudible]. [inaudible], and finally
we'll discuss how randomization is used to reason about half functions. Has maths.
One of the themes of this course, and one of the concrete skills that I hope you
take away from the course, is, literacy with a number of computational primitives
for operating on data, that are so fast, that they're, in some sense, essentially
free. That is, the amount of time it take to invoke one of these computational
primitives is barely more than the amount of time you're already spending just
examining or reading the input. When you have a primitive which is so fast, that
the running time is barely more than what it takes to read the input, you should be
ready to apply it. For example, in a preprocessing step, whenever it seems like
it might be helpful. It should just be there on the shelf waiting to be applied
at will. Sorting is one [inaudible] example of a very fast, almost [inaudible]
primitive of this form. But there are ones that operate on more complex data as well.
So recall that a graph of data structure that has, on the one hand, vertices, and
on the other hand, edges. Which connects pair of vertices. Graphs model among any
other things. Different types of networks. So even the graphs are much more
complicated than you are raised. There's still a number of reasons we [inaudible]
for reasoning about their structure. In this possible focus on primitives for
competing connectivity information. We also talk on how some primitives have been
used to investigate the structure of information in social networks. Finally,
data structures are often a crucial ingredient in the design of fast
algorithms. A data structure's responsible for organizing data in a way that supports
fast queries. Different data structures support different types of queries. I'll
assume that you're familiar with the structures that you typically encounter in
a basic programming class including rays and vectors. Lists, stats, and. Cubes.
Hopefully, you've seen at some point, both trees and heaps, or you're willing to read
a bit about them outside of the course, but we'll also include a brief review of
each of those data structures as we go along. There's two extremely useful data
structures that we'll discuss in detail. The first is balanced binary search trees.
These data structures dynamically maintain an ordering on a set of elements, while
supporting a large number of queries that run in time are logarithmic in the size of
the set. The second data structure we'll talk a, a fair bit about is hash tables or
hash maps, which keep track of a dynamics set, while supporting extremely fast
insert and lookup queries. We'll talk about some canonical uses of such data
structures, as well as what's going on under the hood in a typical [inaudible],
implementation. Of such a date of structure. >> There's a number of
important concepts in the designing analysis of algorithms that we won't have
time to cover in this five week course. Some of these will be covered in the
sequel course, Designing Analysis of Algorithms II, which corresponds to the
second half of Stanford's ten week course on this topic. The first part of this
sequel course focuses on two more algorithms designed paradigms. First of
all, the design analysis of Gree Algorithms with applications to minimum
spanning trees, scheduling, and information theoretic coding. And
secondly, the design analysis of dynamic programming algorithms with example
applications being in genome sequence alignment and the shortest path protocols
in communication networks. The second part of the sequel course concerns NP complete
problems, and what to do about them. Now, NP complete problems are problems that,
assuming a famous mathematical trajectory you might have heard of, which is the
peanut equal to NP conjecture. Our problem is it cannot be solved under this
conjecture by any computationally efficient algorithm. We'll discuss the
theory of NP completeness, and, with a focus on what it means for you as an
algorithm designer. We'll also talk about several ways to approach [inaudible]
problems, including fast algorithms that correctly solve special cases. Fast
heuristics with provable performance guarantees, and exponential time.
Algorithms that are qualitatively faster than brute force search, of course there
are plenty of important topics that can't be fit into either of these. Two five week
courses. Depending on the demand, there might well be further courses on more
advanced topics. Following this course is going to involve a fair amount of time and
effort on your part. So it's only reasonable to ask what can you hope to get
out of it, what skills will you learn. Well. Primarily, you know, even though
this isn't a programming class per se, it should make you a better programmer.
You'll get lots of practice describing and reasoning about algorithms, you'll learn
algorithm design paradigms, so really high level problem-solving strategies that are
relevant for many different problems across different domains, and tools for
predicting the performance of such algorithms. You'll learn several extremely
fast subroutines for processing data and several useful data structures for
organizing data that can be deployed directly. In your own programs. Second,
while this is not a math class per se, we'll wind up doing a fair amount of
mathematical analysis. And this in turn will sharpen you mathematical analytical
skills. You might ask, why is mathematics relevant for a class in the design and
analysis of algorithms, seemingly more of a programming class. Well let me be clear.
I am totally uninterested in merely telling you facts or regurgitating code
that you can already find on the web or in any number of good programming books. My
goal here in this class, and the way I think I can best supplement the resources
Is that you probably already have access to is to explain why. >> Algorithms are
the way they are. Why we analyze the algorithms in the way that we do, why
there is super fast algorithms already super fast, and so on. And it turns out
that good algorithmic ideas usually require nontrivial mathematical analysis
to understand properly. You'll acquire fundamental insights into the specific
algorithms and data structures that we discuss in the course. And hopefully, many
of these insights will prove useful, more generally, in your other work. Third and
perhaps the most relevant for those of you who work in some other discipline this
course should help you learn how to think algorithmically. Indeed after studying
algorithms it's hard enough not to see them pretty much everywhere whether you
are riding an elevator, watching a flock of birds, buying and selling stocks out of
your portfolio even watching an infant learn. As I said in the previous video
algorithm thinking is becoming increasingly useful and prevalent if you
are outside of computer science and technology like in biology, statistics and
economics. Fourth, if you're interested in feeling like a card carrying computer
scientist, in some sense, then you'll definitely want basic literacy in all of
the topics that we'll be covering. Indeed, one of the things that makes studying
algorithms so fun, is, it really feels like you're studying a lot of the greatest
hits from the last 50 years of computer science. So, after this class, no longer
will you feel excluded at that computer science cocktail party when someone cracks
a joke about Dexter's Algorithm. Now you'll know exactly what they mean.
Finally, there's no question that studying this material is helpful for technical
interview questions. To be clear, my sole goal here is to teach you algorithms, not
to prepare you for interviews, per se. But over the years, countless students of mine
have regaled me with stories about how mastering the concepts in this class
enabled them to ace every technical question they were ever asked. I told you,
this is fundamental stuff. So, what do I expect from you? Well, honestly, the
answer is nothing. After all isn't the whole point of a free online class like
this one that anyone can take it and devote as much effort to it as they like.
So that said, as a teacher it's still useful to have one or more canonical
students in mind. And I thought I'd go ahead and be transparent with you about
how I'm thinking about these lectures. Who I have in mind that I'm teaching to. So
again, please don't feel discouraged if you don't conform to this canonical
student template. I'm happy to have the opportunity to teach you about algorithms
no matter who you are. So first, I have in mind someone who knows at least some
programming. For example, consider the previous lecture. We talked about a
[inaudible] approach to multiplying two numbers and I mentioned how in certain
mathematical expression, back then we labeled it star and circled it in green.
How that expression naturally translated into a recursive algorithm. In particular,
I was certainly assuming that you had some familiarity with the recursive programs.
If you feel comfortable with my statement in that lecture, if you feel like you
could code up a recursive integer multiplication algorithm based on the high
level outline that I gave you, then you should be in good shape for this course.
You should be good to go. If you weren't comfortable with that statement, well, you
might not be comfortable with the relatively high conceptual level at which
we discuss program in this course. But I encourage to watch the next several videos
anyway, to see if you get enough out of them to make it worth your while. [sound].
Now, while I'm aiming these lectures at people who know some programming, I'm not
making any assumptions whatsoever about exactly which programming languages. Any
standard imperative language. You know, something like C, Java or Python is
totally fine for this course. Now, to make these lectures accessible to as many
programmers as possible. And to be honest, you know, also to promote thinking about
programming at a relatively abstract conceptual level. I won't describing
algorithms in any particular programming language. Rather, when I discuss the
algorithms, I'll use only [inaudible] pseudo-code, or often simply English. My
inductive hypothesis is that you are capable of translating such a high level
description into a working program in your favorite programming language. In fact, I
strongly encourage everyone watching these lectures to do such a translation of all
of the algorithms that we discussed. This will ensure your comprehension, and
appreciation of them. Indeed, many professional computer scientists and
programmers don't feel that they really understand an algorithm until they've
coded it up. Many of the course's assignments will have a problem in which
we ask you to do precisely this. Put another way, if you're looking for a sort
of coding cookbook, code that you can copy and paste directly into your own programs.
Without necessarily understanding how it works, then this is definitely not the
course for you. There are several books out there that cater to programmers
looking for such coding cook books. Second, for these lecture I have in mind
someone who has at least a modest amount of mathematical experience though perhaps
with a fair bit of accumulated rust. Concretely I expect you to be able to
recognize a logical argument that is a proof. In addition, two methods of proof
that I hope you've seen before are proofs by induction and proofs by contradiction.
I also need you to be familiar with basic mathematical notation, like the standard
quantifier and submission symbols. A few of the lectures on randomized algorithms
and hashing will go down much easier for you if you've seen [inaudible] probability
at some point in your life. But beyond these basics, the lectures will be self
contained. You don't even need to know any calculus. Say, for a single simple
[inaudible] that magically pops up. The analyses of the randomized quick sort
algorithm. I imagine that many of you have studied math in the past, but you could
use a refresher, you're a bit rusty. And there's plenty of free resources out there
on the web, and I encourage you to explore and find some that you like. But one that
I want to particularly recommend is a great set of free lecture notes. It's
called Mathematics for Computer Science. It's authored by Eric Lehman and Tom
Layden, and it's quite easy to find on the web if you just do a web search. And those
notes cover all of the prerequisites that we'll need, in addition to tons of other
stuff. In the spirit of keeping this course as widely accessible as possible,
we're keeping the required supporting materials to an absolute minimum. Lectures
are meant to be self-contained and we'll always provide you with the lecture notes
in PowerPoint and PDF format. Once in a while, we'll also provide some additional
lecture notes. No textbook is required for this class. But that said, most of the
material that we'll study is well covered in a number of excellent algorithms books
that are out there. So I'll single out four such books here. The first three I
mention because they all had a significant influence on the way that I both think
about. And teach algorithms. So it's natural to acknowledge that debt here. One
very cool thing about the second book, the one by [inaudible], is that the authors
have made a version of it available online for free. And again, if you search on the
authors' names and the textbook title, you should have no trouble coming up with it
with a web search. Similarly, that's the reason I've listed the forth book because
those authors have likewise made essentially a complete version of that
book available online and it's a good match for the material that we're going to
cover here. If you're looking for more details about something covered in this
class, or simply a different explanation than the one that I give you, all of these
books are gonna be good resources for you. There are also a number of excellent
algorithm textbooks that, that I haven't put on this list. I encourage to explore
and find you own favorite. >> In our assignments, we'll sometimes ask you to
code up an algorithm and use it to solve a concrete problem that is too large to
solve by hand. Now, we don't care what program and language and development
environment you use to do this as we're only going to be asking you for the final
answer. Thus, we're not requiring anything specific, just that you are able to write
and execute programs. If you need help or advice about how to get set up with a
suitable coding environment, we suggest that you ask other students for help via
the course discussion forum. Finally, let's talk a bit more about assessment.
Now this course doesn't have official grades per se, but we will be assigning
weekly homeworks. Now we're going to assign homeworks for three different
reasons. The first is just for self-assessment. It's to give you the
opportunity to test your understanding of the material so that you can figure out
which topics you've mastered and which ones that you haven't. The second reason
we do it is to impose some structure on the course, including deadlines, to
provide you with some additional motivation to work through all the topics.
Deadlines also have a very important side effect that synchronizes a lot of the
students in the class. And this of course makes the course discussion forum a far
more effective tool for students to seek and provide help in understanding the
course material. The final reason that we give homework is to satisfy those of you
who, on top of learning the course material, are looking to challenge
yourself intellectually. [sound]. Now, this class has tens of thousands of
students. So it's obviously essential that the assignments can be graded
automatically. Now, we're currently only in the 1.0 generation of free online
courses such as this one. So the available tools for auto graded assessment are
currently rather primitive. So, we'll do the best we can, but I have to be honest
with you. It's difficult, or maybe even impossible to test deep understanding of
the design and analysis of algorithms, using the current set of tools. Thus,
while the lecture content. In this online course is in no way watered down from the
original Stanford version. The required assignments and exams we'll give you, are
not as demanding as those that are given in the on campus version of the course. To
make up for this fact, we'll occasionally propose optional algorithm design
problems, either in a video or via supplementary assignment. We don't have
the ability to grade these, but we hope that you'll find them interesting and
challenging, and that you'll discuss possible solutions with other students via
the course discussion forum. So I hope this discussion answered most of the
questions you have about the course. Lets move on to the real reason that we're all
here, to learn more about algorithms.