Coursera - Design and analysis of algorithms i - 1.2 about the course

In this video I'll talk about various aspects of the course, the topics that we'll cover, the kinds of skills you can expect to acquire, the kind of background that I expect, the supporting materials and the available tools for self assessment. Let's start with the specific topics that this course is going to cover. The course material corresponds to the first half of the ten week Stanford course. It's taken by all computer science undergraduates, as well as many of our graduate students. There will be five high level topics, and at times these will overlap. The five topics are first of all, the vocabulary for reasoning about algorithm performance, the design and conquer algorithm design paradigm, randomization and algorithm design, primitives for reasoning about graphs, and the use and implementation of basic data structures. The goal is to provide an introduction to and basic literacy in each of these topics. Much, much more could be said about each of them, than we'll have time for here. The first topic is the shortest, and probably also the driest. But it's a prerequisite for thinking seriously about the design and analysis of algorithms. The key concept here is big O notation, which, conceptually, is a modeling choice about the granularity with which we measure a performance metric like the running time of an algorithm. It turns out that the sweet spot for clear high level thinking about algorithm design, is to ignore constant factors and [inaudible] terms. And to concentrate on how well algorithm performance scales with large input sizes. Big O notation is the way to mathematize this sweet spot. Now, there's no one silver bullet in algorithm design. No single problem solving method that's guaranteed to unlock all of the computational problems that you're likely to face. That said, there are a few general algorithm design techniques. High level approaches to algorithm design that find successful application across a range of different domains. These relatively widely applicable techniques are the backbone of a general algorithms course like this one. In this course, we'll only have time to deeply explore one such algorithm design paradigm, namely that of the divide and conquer algorithms. In the sequel course as we'll discuss, there's two other major algorithms on paradigms to get covered. But for now, divide and conquer algorithm, the idea is to first break the problem into smaller problems which then gets solved recursively, and then to somehow quickly combine the solutions to the sub problems into one for the original problem that you actually care about. So for example, in the last video. We saw two algorithms of this sort, two divide and conquer algorithms from multiplying two large integers. In later videos we will see a number of different applications. We'll see how to design fast divide and conquer algorithms for problems ranging from sorting to matrix multiplication to nearest neighbor-type problems and computation of geometry. In addition, we'll cover some powerful methods for reasoning about the running time of recursive algorithms like these. As for the third topic. A randomized algorithm is one that, in some sense, flips coins while it executes. That is, a randomized algorithm will actually have different executions if you run it over and over again on a fixed input. It turns out, and this is definitely not intuitive, that allowing randomization internal to an algorithm, often leads to simple, elegant, and practical solution to various computational problems. The canonical example is randomized quick sort, and that algorithm and analysis we will cover in detail in a few lectures. Randomized formality testing is another killer application that we'll touch on. And we'll also discuss a randomized approach to [inaudible]. [inaudible], and finally we'll discuss how randomization is used to reason about half functions. Has maths. One of the themes of this course, and one of the concrete skills that I hope you take away from the course, is, literacy with a number of computational primitives for operating on data, that are so fast, that they're, in some sense, essentially free. That is, the amount of time it take to invoke one of these computational primitives is barely more than the amount of time you're already spending just examining or reading the input. When you have a primitive which is so fast, that the running time is barely more than what it takes to read the input, you should be ready to apply it. For example, in a preprocessing step, whenever it seems like it might be helpful. It should just be there on the shelf waiting to be applied at will. Sorting is one [inaudible] example of a very fast, almost [inaudible] primitive of this form. But there are ones that operate on more complex data as well. So recall that a graph of data structure that has, on the one hand, vertices, and on the other hand, edges. Which connects pair of vertices. Graphs model among any other things. Different types of networks. So even the graphs are much more complicated than you are raised. There's still a number of reasons we [inaudible] for reasoning about their structure. In this possible focus on primitives for competing connectivity information. We also talk on how some primitives have been used to investigate the structure of information in social networks. Finally, data structures are often a crucial ingredient in the design of fast algorithms. A data structure's responsible for organizing data in a way that supports fast queries. Different data structures support different types of queries. I'll assume that you're familiar with the structures that you typically encounter in a basic programming class including rays and vectors. Lists, stats, and. Cubes. Hopefully, you've seen at some point, both trees and heaps, or you're willing to read a bit about them outside of the course, but we'll also include a brief review of each of those data structures as we go along. There's two extremely useful data structures that we'll discuss in detail. The first is balanced binary search trees. These data structures dynamically maintain an ordering on a set of elements, while supporting a large number of queries that run in time are logarithmic in the size of the set. The second data structure we'll talk a, a fair bit about is hash tables or hash maps, which keep track of a dynamics set, while supporting extremely fast insert and lookup queries. We'll talk about some canonical uses of such data structures, as well as what's going on under the hood in a typical [inaudible], implementation. Of such a date of structure. >> There's a number of important concepts in the designing analysis of algorithms that we won't have time to cover in this five week course. Some of these will be covered in the sequel course, Designing Analysis of Algorithms II, which corresponds to the second half of Stanford's ten week course on this topic. The first part of this sequel course focuses on two more algorithms designed paradigms. First of all, the design analysis of Gree Algorithms with applications to minimum spanning trees, scheduling, and information theoretic coding. And secondly, the design analysis of dynamic programming algorithms with example applications being in genome sequence alignment and the shortest path protocols in communication networks. The second part of the sequel course concerns NP complete problems, and what to do about them. Now, NP complete problems are problems that, assuming a famous mathematical trajectory you might have heard of, which is the peanut equal to NP conjecture. Our problem is it cannot be solved under this conjecture by any computationally efficient algorithm. We'll discuss the theory of NP completeness, and, with a focus on what it means for you as an algorithm designer. We'll also talk about several ways to approach [inaudible] problems, including fast algorithms that correctly solve special cases. Fast heuristics with provable performance guarantees, and exponential time. Algorithms that are qualitatively faster than brute force search, of course there are plenty of important topics that can't be fit into either of these. Two five week courses. Depending on the demand, there might well be further courses on more advanced topics. Following this course is going to involve a fair amount of time and effort on your part. So it's only reasonable to ask what can you hope to get out of it, what skills will you learn. Well. Primarily, you know, even though this isn't a programming class per se, it should make you a better programmer. You'll get lots of practice describing and reasoning about algorithms, you'll learn algorithm design paradigms, so really high level problem-solving strategies that are relevant for many different problems across different domains, and tools for predicting the performance of such algorithms. You'll learn several extremely fast subroutines for processing data and several useful data structures for organizing data that can be deployed directly. In your own programs. Second, while this is not a math class per se, we'll wind up doing a fair amount of mathematical analysis. And this in turn will sharpen you mathematical analytical skills. You might ask, why is mathematics relevant for a class in the design and analysis of algorithms, seemingly more of a programming class. Well let me be clear. I am totally uninterested in merely telling you facts or regurgitating code that you can already find on the web or in any number of good programming books. My goal here in this class, and the way I think I can best supplement the resources Is that you probably already have access to is to explain why. >> Algorithms are the way they are. Why we analyze the algorithms in the way that we do, why there is super fast algorithms already super fast, and so on. And it turns out that good algorithmic ideas usually require nontrivial mathematical analysis to understand properly. You'll acquire fundamental insights into the specific algorithms and data structures that we discuss in the course. And hopefully, many of these insights will prove useful, more generally, in your other work. Third and perhaps the most relevant for those of you who work in some other discipline this course should help you learn how to think algorithmically. Indeed after studying algorithms it's hard enough not to see them pretty much everywhere whether you are riding an elevator, watching a flock of birds, buying and selling stocks out of your portfolio even watching an infant learn. As I said in the previous video algorithm thinking is becoming increasingly useful and prevalent if you are outside of computer science and technology like in biology, statistics and economics. Fourth, if you're interested in feeling like a card carrying computer scientist, in some sense, then you'll definitely want basic literacy in all of the topics that we'll be covering. Indeed, one of the things that makes studying algorithms so fun, is, it really feels like you're studying a lot of the greatest hits from the last 50 years of computer science. So, after this class, no longer will you feel excluded at that computer science cocktail party when someone cracks a joke about Dexter's Algorithm. Now you'll know exactly what they mean. Finally, there's no question that studying this material is helpful for technical interview questions. To be clear, my sole goal here is to teach you algorithms, not to prepare you for interviews, per se. But over the years, countless students of mine have regaled me with stories about how mastering the concepts in this class enabled them to ace every technical question they were ever asked. I told you, this is fundamental stuff. So, what do I expect from you? Well, honestly, the answer is nothing. After all isn't the whole point of a free online class like this one that anyone can take it and devote as much effort to it as they like. So that said, as a teacher it's still useful to have one or more canonical students in mind. And I thought I'd go ahead and be transparent with you about how I'm thinking about these lectures. Who I have in mind that I'm teaching to. So again, please don't feel discouraged if you don't conform to this canonical student template. I'm happy to have the opportunity to teach you about algorithms no matter who you are. So first, I have in mind someone who knows at least some programming. For example, consider the previous lecture. We talked about a [inaudible] approach to multiplying two numbers and I mentioned how in certain mathematical expression, back then we labeled it star and circled it in green. How that expression naturally translated into a recursive algorithm. In particular, I was certainly assuming that you had some familiarity with the recursive programs. If you feel comfortable with my statement in that lecture, if you feel like you could code up a recursive integer multiplication algorithm based on the high level outline that I gave you, then you should be in good shape for this course. You should be good to go. If you weren't comfortable with that statement, well, you might not be comfortable with the relatively high conceptual level at which we discuss program in this course. But I encourage to watch the next several videos anyway, to see if you get enough out of them to make it worth your while. [sound]. Now, while I'm aiming these lectures at people who know some programming, I'm not making any assumptions whatsoever about exactly which programming languages. Any standard imperative language. You know, something like C, Java or Python is totally fine for this course. Now, to make these lectures accessible to as many programmers as possible. And to be honest, you know, also to promote thinking about programming at a relatively abstract conceptual level. I won't describing algorithms in any particular programming language. Rather, when I discuss the algorithms, I'll use only [inaudible] pseudo-code, or often simply English. My inductive hypothesis is that you are capable of translating such a high level description into a working program in your favorite programming language. In fact, I strongly encourage everyone watching these lectures to do such a translation of all of the algorithms that we discussed. This will ensure your comprehension, and appreciation of them. Indeed, many professional computer scientists and programmers don't feel that they really understand an algorithm until they've coded it up. Many of the course's assignments will have a problem in which we ask you to do precisely this. Put another way, if you're looking for a sort of coding cookbook, code that you can copy and paste directly into your own programs. Without necessarily understanding how it works, then this is definitely not the course for you. There are several books out there that cater to programmers looking for such coding cook books. Second, for these lecture I have in mind someone who has at least a modest amount of mathematical experience though perhaps with a fair bit of accumulated rust. Concretely I expect you to be able to recognize a logical argument that is a proof. In addition, two methods of proof that I hope you've seen before are proofs by induction and proofs by contradiction. I also need you to be familiar with basic mathematical notation, like the standard quantifier and submission symbols. A few of the lectures on randomized algorithms and hashing will go down much easier for you if you've seen [inaudible] probability at some point in your life. But beyond these basics, the lectures will be self contained. You don't even need to know any calculus. Say, for a single simple [inaudible] that magically pops up. The analyses of the randomized quick sort algorithm. I imagine that many of you have studied math in the past, but you could use a refresher, you're a bit rusty. And there's plenty of free resources out there on the web, and I encourage you to explore and find some that you like. But one that I want to particularly recommend is a great set of free lecture notes. It's called Mathematics for Computer Science. It's authored by Eric Lehman and Tom Layden, and it's quite easy to find on the web if you just do a web search. And those notes cover all of the prerequisites that we'll need, in addition to tons of other stuff. In the spirit of keeping this course as widely accessible as possible, we're keeping the required supporting materials to an absolute minimum. Lectures are meant to be self-contained and we'll always provide you with the lecture notes in PowerPoint and PDF format. Once in a while, we'll also provide some additional lecture notes. No textbook is required for this class. But that said, most of the material that we'll study is well covered in a number of excellent algorithms books that are out there. So I'll single out four such books here. The first three I mention because they all had a significant influence on the way that I both think about. And teach algorithms. So it's natural to acknowledge that debt here. One very cool thing about the second book, the one by [inaudible], is that the authors have made a version of it available online for free. And again, if you search on the authors' names and the textbook title, you should have no trouble coming up with it with a web search. Similarly, that's the reason I've listed the forth book because those authors have likewise made essentially a complete version of that book available online and it's a good match for the material that we're going to cover here. If you're looking for more details about something covered in this class, or simply a different explanation than the one that I give you, all of these books are gonna be good resources for you. There are also a number of excellent algorithm textbooks that, that I haven't put on this list. I encourage to explore and find you own favorite. >> In our assignments, we'll sometimes ask you to code up an algorithm and use it to solve a concrete problem that is too large to solve by hand. Now, we don't care what program and language and development environment you use to do this as we're only going to be asking you for the final answer. Thus, we're not requiring anything specific, just that you are able to write and execute programs. If you need help or advice about how to get set up with a suitable coding environment, we suggest that you ask other students for help via the course discussion forum. Finally, let's talk a bit more about assessment. Now this course doesn't have official grades per se, but we will be assigning weekly homeworks. Now we're going to assign homeworks for three different reasons. The first is just for self-assessment. It's to give you the opportunity to test your understanding of the material so that you can figure out which topics you've mastered and which ones that you haven't. The second reason we do it is to impose some structure on the course, including deadlines, to provide you with some additional motivation to work through all the topics. Deadlines also have a very important side effect that synchronizes a lot of the students in the class. And this of course makes the course discussion forum a far more effective tool for students to seek and provide help in understanding the course material. The final reason that we give homework is to satisfy those of you who, on top of learning the course material, are looking to challenge yourself intellectually. [sound]. Now, this class has tens of thousands of students. So it's obviously essential that the assignments can be graded automatically. Now, we're currently only in the 1.0 generation of free online courses such as this one. So the available tools for auto graded assessment are currently rather primitive. So, we'll do the best we can, but I have to be honest with you. It's difficult, or maybe even impossible to test deep understanding of the design and analysis of algorithms, using the current set of tools. Thus, while the lecture content. In this online course is in no way watered down from the original Stanford version. The required assignments and exams we'll give you, are not as demanding as those that are given in the on campus version of the course. To make up for this fact, we'll occasionally propose optional algorithm design problems, either in a video or via supplementary assignment. We don't have the ability to grade these, but we hope that you'll find them interesting and challenging, and that you'll discuss possible solutions with other students via the course discussion forum. So I hope this discussion answered most of the questions you have about the course. Lets move on to the real reason that we're all here, to learn more about algorithms.