Tip:
Highlight text to annotate it
X
Peace be upon you
Today we'll talk about another technique
an optimization technique like Dynamic Programming, and strongly related to it
This technique is called Greedy Algorithm or Greedy Technique
Quickly, let's remember what Dynamic Programming is, to figure out with what Greedy is
Dynamic Programming is also optimization: minimization or maximization, according to the problem in hand
What does it minimize or maximize over? - Over all the possible solutions we have
So, we always had: a recursive state, this state is a root from which substates come out
These substates may include child1, child2, child3, .. etc
e.g., in the Longest-Increasing-Subsequence Problem, we had 2 choices,
either to leave the item currently existing and recurse with the next one,
or take it and recurse with the next one
So, I had 2 choices, right?
And supposedly, we didn't know what the optimal solution is, so we were just trying everything,
leave the item or take the item, leave, or take, leave or take .. etc
and the number of states was exponential, so we would memorize, .. etc
Greedy refers to the English word "greedy", greedy in the sense that it wants to make something called "Greedy Choice"
What is Greedy Choice about? .. In the above example, you have several substates you choose one from
Greedy Choice, on the contrary, is about deciding "somehow" that my choice is childX
E.g., assume an example where each state had 3 children
Greedy Choice would say, for example, the second child (state) is the optimal
then after going to the state of the second child, [it'd choose] the first child is the optimal
then after going to the first child, it'd choose, for example, the first is the optimal, .. etc
Each time it makes one Greedy Choice
So, DP considers all choices, while Greedy considers all choices, but chooses only one choice to continue with
DP recurses on each choice and comes back, trying everything
but Greedy has just only one way
Greedy has two kinds: one that finds a Global Optimal Solution, and one that does not
The one that does not is called approximate
Finding the global optimal solutions means that it's like DP. DP considers everything and tells you the optimal of all
If Greedy makes one Greedy Choice per step.
Until it reaches the base case, and its answer would be then proved to the optimal, then this is the Global Optimal Solution
The other one is called approximate
In contests, only the Global Optimal Solution kind is the one that matters
because, assume the problem wants the minimum answer. So, if you solution produces 54 instead of 50
the judge would only want 54
So, in contests we'd only care about the Global-Optimal one.
But in real life, both are important, and the approximate is even more
as many real-life problems are hard, so we only try to find a close-enough approximate solution
You'll find people talking about intuition. We'll talk about "intuition" and "proof"
It's about having a hunch of which state is the answer
More info on the topic: Greedy Choice makes the choice that "looks the best at the moment"
That's why they call its choice "Local Optimal Choice"
Standing at some state, it considers some stuff and chooses according to them
As we mentioned before, if the collection of Local Optimal Choices is proved to be global, then we have our solution
You must have proof, in case globally optimal ...
That is, if you claim your solution using Greedy, the substate you choose each time, is optimal
you must have proof for that, especially in a contest
While we're in a contest, mostly time is strict. So, what most contestants try to do is to find an intuition
to have a strong observation, to have a strong sense that the solution is X, and does not prove
Instead, would generate several examples to challenge his idea
After contest, many contestants don't care to know the proof of the solutions
It's VERY IMPORTANT to think after contest why your greedy was correct
To be able to prove, you need to use a proving technique. There're different proof techniques
Just as you want to be good in programming, you need to be good as well in proving techniques
Spare some time of your life to study different proof techniques
Although Greedy is very related to DP solutions (there's usually both a DP and a Greedy solution to the problem),
in many cases Greedy is the only solution, and DP won't work
because state of the DP is [Incomprehensible], so it's worthless and you have to think Greedy
Today, we'll give two examples to illustrate the topic
One of the most famous classical examples taught in textbooks is called activity selection problem
Someones has a number of activities, and each activity starts at some time and ends at some time
and they do just one activity at any moment
So, there's an activity starting @ 2 ending @ 4.5, and another starts @ 5 ending @ 7 .. etc
So, these activities overlap. So, they can't do all of them
So, they want to choose a subset of these activities such that:
1- There's no overlap. 2- The number of elements in the subset is as big as possible
Last info: An activity can start at the end of another activity
That is, for example, an activity that starts @ 10 and ends @ 15 can start after an activity that started @ 7 and ended @ 10
But, an activity that @ starts @ 10, can start after an activity that ended @ 11, ok?
This problem has a naive DP solution, just like any DP problem we solve
When you think, you'll find this is clearly a subset-style one
That is, any activity is after all either 0 or 1, just like previous problems
So, "Take-or-leave" idea is just applied to every activity
But, you have a little problem: Now, if I pick the first activity: Previously, we used to go to the next
Now, you have 2 problems you should notice:
Assume the first activity was 10-11, and the second one was 6-7, and third was 9-12
You need to sort these activities to be able to deal with. This's the first thing
You may sort them according to their start time, or alternatively you can sort them according to their finish time
The 2nd problem: Previously, in Knapsack and such, we would pick -> go to next, then leave -> go to next
Now, if I leave, I can go to the next one easily. But if I pick the current state, can I go to the next one (then pick or leave it, .. etc)?
No, wait a second. You may be intersecting with the next one
So, the correct thing is not to go to the next right away
The correct thing is that you'll need to loop to pick the next one with which you don't overlap
So, it's a pick-or-leave, just 3-line code, you either take the state or leave it
but if you take it, you'll iterate a little to find the next event to start with
Of course, this is possible as you'd sorted the activities in the beginning
Now, we had 2 choice: pick or leave
The question is: Can I, in another way, choose an activity (I have n activities), and say this is part of the optimal solution, remove it, and then have n - 1 activities to solve as an independent subproblem?
That is, MAKE ONE CHOICE. Can I do that? If I can, then I've found the Greedy Solution
Let's start thinking about that
First thing one may think about: start to get an intuition, try to guess the solution
So, first solution one may think about: what about counting with each activity how many activities intersect with it?
So, for example, if I (an activity) intersect with 2 activities. So, if you take me, there're 2 activities you won't be able to take, and so on.
So, we'll do this move, and start picking the activities according to the activity that has the lowest number of intersections
Then, when I take this activity, I'll remove the activities that intersected with it and recurse with the rest
So, that was a nice idea. But, is it correct?
The answer is No
Look at this example
In this photo, each line represents an activity with a starting time and ending time
The correct solution is these 4 blue things
Now, what activities have the smallest number of intersections in these?
The red one intersects with 2 only
If you take the red one as the first activity, you'll need to remove the 2 upper blue ones
And the optimal then would be the activity here and the activity here
So, your solution consists of 3, while the correct one consists of 4
So, our idea is wrong
Let's think of another intuition
What about trying to make a selection such that the number of the remaining activities is as big as possible?
However, the idea of choosing an activity that intersects with smallest number of activities does so, doesn't it?
Yes, it does. But I'll do it in another way
I'll try to pick an activity. Let's say for example, I'll sort them according to start time
So, the activity in the left, if I pick it, it may help me a little
The answer: This idea is wrong.
You're telling me to sort the activities according to the start time, right?
Imagine for instance, that the first activity starts the earliest and ends the latest
So, for example, it starts @ 1 and ends @ 12 such that if I do this activity, I won't be able to do anything else
So, that's a total failure
But, wait! what about sorting them according to finish time and not start time?
Then what? Then, pick the activity that finishes the earliest
That is, pick the first-ending activity, remove the activities intersecting with it
I'm currently trying to do something we call Boundary Thinking
trying to get an activity from one of the sides (the beginning or the end), then solve for the rest
The idea makes sense. If you try it on several test cases, you'll find it working
However, you'll still be worried whether that solution is correct or no, and you can't know
As I told you, if we're in a contest, just try it on several test cases. If it works, code it
Otherwise, from inside of your idea, you try to get the proof of it
Let me show you the proof of this idea to realize the importance of proofs
I started with this example as the solution makes sense, but you'd still be afraid of your solution
Let's assume, that these activities are sorted by their finish time, and then numbered 1 to n
So, activity 1 is the one that finishes the earliest, then 2, .. etc
Let's assume we have a subset A of these activities and they're the optimal solution
And A is sorted according to finish time
Now, we're claiming that the first activity in A would be activity 1 and we'd still be able to get the optimal solution. That's we said before, right?
That after we sort according to finish time, the activity you'll pick is activity 1, then remove the overlapping ones with it and recurse over the new subproblem?
So, from your claim, if you have a set of activities number 1 to n, and you chose a subset from them,
the first activity of this subset would be 1, right?
Now, let's see how we'll prove it, and try to focus as these techniques are [Incomprehensible] general techniques
Let's assume that the first activity is called K
Now, you're assuming that subset A is optimal subset
So, I'm saying let's consider the first item in it, as my proof or my sense depended on the idea of the first one
K has 2 possible values: it either be 1. So, it's the first activity, and in this case that'd be as our claim
And it either be something other than 1
So, the first item is either 1 or NOT 1. If it's 1, then I'm correct.
If the first time is NOT 1, I wanna prove that I can remove the first item of subset A, and replace it with 1 without having overlap in A
And if no overlapping occurs, then I've managed to make a new subset whose length is the same as the optimal subset I have
Now we're examining the possibility that K != 1
Let's make another subset called B, in which we'll remove the item K and replace it with 1
The question: Now, would there be overlapping in subset B, and so we have to remove 1?
The answer is: Impossible. There can be no overlapping
because we've agreed that item 1 has the smallest finish time, and so its finish time is smaller than the finish time of the first item in A
and since it's smaller than it, the activities in B are also disjoint, no two will overlap in it
So, since the elements in B are disjoint, AND you've just removed an item from A and replaced it with an item, then B is the optimal solution
So, the proof is all about: If we have an optimal subset, the first item of it is either the smallest one, or not the smallest one but I can remove it and replace it with the smallest one and still no overlap occurs. So, we have an optimal solution
So, this proof makes you sure that you solution is surely correct
I've explained the proof, but usually I won't be explaining proofs as much as I'll be talking about the intuition to increase it for you
But you MUST get used to proofs
When you'll start reading proofs, you'll find it boring, cumbersome, you'll be sick of it. All of this is normal
especially for people with Computer-Science background, as proofs are not easy for them as for Mathematics
but keep reading the proof, understand it, increase your proving techniques, and you'll be better with time
Let's talk about another example, a much easier one
Remember the Knapsack problem? We have a knapsack of size, for example, 20
and we would enter house, take the items or not, and we wanted to get the items with highest benefit, each time had a price and weight
In the Fractional Knapsack: Previously, we used to think as 0 or 1, take the item or don't
Now, they tell you you can divide the item into small pieces, and pick the piece you want from them
So, it's something like, when you're entering an apartment which has 5-kilo rice, where you can either steal the whole sack or take just 1 kilo
Previously, the problem was about undividable items, like a TV for example
In this problem for example, imagine you entered an apartment which has 10-kilo rice, and you know these 10 kilos cost 30 L.E.
and you found 5-kilo cheese and they cost 50 L.E
and I'm telling you that your bag can carry only 7 kilo. What would the optimal solution be?
The optimal solution here consists of the 5-kilo cheese, and 2 kilos from the rice
What would the DP solution be like?
Just like the DP solution of the ordinary Knapsack, or actually it's just like the coin change problem
When you pick something, for example, 1 kilo of the items, you'll try to recurse on yourself and see if you'll pick more or that's enough, and try to maximize
And what about the Greedy solution?
Let me ask you a question. If you have in this sack just 1 R. That is, you went to the apartment and you are not allowed to take anything other than 1 kilo
from which items would you take?
would you take from the thing with the highest price? Surely, the answer is no
Why?
Let me edit this example a little
or never mind
Now if you take from the item with the highest price, note that you'll take only 1 kilo from it
you won't take the whole price, you'll take a kilo only
So, you don't care about the item with the highest price. You only care about the item with the highest per-kilo price
If I'd edit this example to illustrate the point more, I'd make the 20 kilo of the first type for 60 L.E
So, following the sense, you'd take from the one with the highest price: take a kilo from the 60-L.E one
However, the kilo of the first type is for 60 / 20 = 3 L.E
So, you benefit only 3 L.E. from it. However, from the below one, you'd gain 50 / 5 = 10 L.E.
So, it's most beneficial to me if I pick from the second one
So, if I'm allowed to take 1 kilo only, with some intelligence, I wouldn't keep trying whether I should take 1 kilo from the first or the second
Just take 1 kilo from the highest one (highest per-kilo price)
And if you can take 2 kilos? just take 1 kilo from the highest one, then you'll have a new subproblem
And in the new subproblem, the second item would be 1 reduced. So, instead of 5 kilos for 50 L.E., it'd be 4 kilos for 40 L.E.
And the take the highest one from it, .. etc
What I'm saying here is that we've developed a Greedy Solution here
The greedy solution says: "Go take from the highest per-kilo type till it runs out.
Then take from the next highest one till it runs outs, and so on"
So, the solution would be:
Sort items according to ratio of Money / Weight
Then take from the first item as much as you can. And when finished go to the next one, take as much as you can, and so on
And for complexity: Sorting is O(NlogN), and the processing is O(N)
So, what did we do in this example?
The highest one was this one, and we have up to 7 kilo. So, we took from it 5 kilos and it ran out
And now we have 2 kilos left, take them from the other one
Note that we got our intuition in this problem a lot faster, and believed in it so fast
However, the proof of this problem is a full page
That is, the problem's idea may be too easy, however, its proof is cumbersome
And the problem or the idea can be very cumbersome, but has a very easy proof
So, in all cases you need to read proofs
So, that's for "Greedy". Greedy is important and shows up in lots of places
So, it needs work. It's a hard technique as it depends on observations
It needs you to think a lot, needs you to proof. So, try to work on it well
Peace be upon you :)