Bubble Sort

[BUBBLE SORT] [JACKSON STEINKAMP HARVARD UNIVERSITY] [THIS IS CS50. CS50TV] Bubble Sort is an example of a sorting algorithm-- that is, a procedure for sorting a set of elements in ascending or descending order. For instance, if you wanted to sort an array consisting of the numbers {3, 5, 2, 9}, a correct implementation of Bubble Sort would return the sorted array {2, 3, 5, 9} in ascending order. Now, I'm going to explain in pseudocode how the algorithm works. Let's say we're sorting a list of 5 integers--3, 2, 9, 6, and 5. The algorithm starts by looking at the first two elements, 3 and 2, and checking if they're out of order relative to each other. They are--3 is greater than 2. To be in ascending order, they should be the other way around. So, we swap them. Now the list looks like this: {2, 3, 9, 6, 5}. Next, we look at the second and third elements, 3 and 9. They're in the correct order relative to each other. That is, 3 is less than 9 so the algorithm doesn't swap them. Next, we look at 9 and 6. They're out of order. So, we need to swap them because 9 is greater than 6. Lastly, we look at the last two integers, 9 and 5. They're out of order, so they must be swapped. After the first complete pass through the list, it looks like this: {2, 3, 6, 5, 9}. Not bad. It's almost sorted. But we need to run through the list again to get it completely sorted. Two is less than 3, so we don't swap them. Three is less than 6, so we don't swap them. Six is greater than 5. We swapped. Six is less than 9. We don't swap. After the second pass through, it looks like this: {2, 3, 5, 6, 9}. Perfect. Now, let's write it in pseudocode. Basically, for each element in the list, we need to look at it and the element directly to its right. If they are out of order relative to each other--that is, if the element on the left is greater than the one on the right--we should swap the two elements. We do this for every element of the list, and we've made one pass through. Now we just have to do the pass-through enough times to ensure the list is fully, properly sorted. But how many times do we have to pass through the list to guarantee that we're done? Well, the worst-case scenario is if we have a completely backwards list. Then it takes a number of pass-throughs equal to the number of elements n-1. If this doesn't make sense intuitively, think of a simple case--the list {2, 1}. This is going to take one pass-through to sort correctly. {3, 2, 1}--The worst case is that with 3 elements sorted backwards, it's going to take 2 iterations to sort. After one iteration, it's {2, 1, 3}. The second yields the sorted array {1, 2, 3}. So you know you never have to go through the array, in general, more than n-1 times, where n is the number of elements in the array. It's called Bubble Sort because the largest elements tend to 'bubble-up' to the right pretty quickly. In fact, this algorithm has very interesting behavior. After m iterations through the entire array, the rightmost m elements are guaranteed to be sorted into their correct place. If you want to see this for yourself, we can try it on a completely backwards list {9, 6, 5, 3, 2}. After one pass through the entire list, [sound of writing] {6, 9, 5, 3, 2}, {6, 5, 9, 3, 2}, {6, 5, 3, 9, 2}, {6, 5, 3, 2, 9} the rightmost element 9 is in its proper place. After the second pass-through, the 6 will have 'bubbled-up' to the second rightmost place. The two elements on the right--6 and 9-- will be in their correct places after the first two pass-throughs. So, how can we use this to optimize the algorithm? Well, after one iteration through the array we don't actually need to check the rightmost element because we know it's sorted. After two iterations, we know for sure the rightmost two elements are in place. So, in general, after k iterations through the full array, checking the last k elements is redundant since we know they're in the correct location already. So if you're sorting an array of n elements, on the first iteration--you'll have to sort all of the elements--the first n-0. On the second iteration, you'll have to look at all of the elements but the last-- the first n-1. Another optimization might be to check if the list is already sorted after each iteration. If it's already sorted, we don't need to make any more iterations through the list. How can we do this? Well, if we don't make any swaps on a pass-through of the list, it's clear that the list was already sorted because we didn't swap anything. So we definitely don't have to sort again. Perhaps you could initialize a flag variable called 'notSorted' to false and change it to true if you have to swap any elements on one iteration through the array. Or similarly, make a counter to count how many swaps you make on any given iteration. At the end of an iteration, if you didn't swap any of the elements, you know the list is already sorted and you're done. Bubble Sort, like other sorting algorithms, can be tweaked to work for any elements which have an ordering method. That is, given two elements, you have a way to say if the first one is greater than, equal to, or less than the second. For instance, you could sort letters of the alphabet by saying that a < b, b < c, etc. Or you could sort days of the week where Sunday is less than Monday which is less than Tuesday. Bubble Sort is by no means a very efficient or fast sorting algorithm. Its worst-case runtime is Big O of n² because you have to make n iterations through a list, checking all n elements each pass-through, n x n = n². This run time means that as the number of elements you're sorting increases, the run time increases quadratically. But if efficiency isn't a major concern of your program or if you're only sorting a small number of elements, you might find Bubble Sort useful because it's one of the simplest sorting algorithms to understand and to code. It's also a great way to get experience with translating a theoretical algorithm into actual functioning code. Well, that's Bubble Sort for you. Thanks for watching. CS50.TV