Tip:
Highlight text to annotate it
X
So now that we understand the way in which optimal solutions must be composed
of optimal solutions to smaller subproblems.
We're well positioned to turn that into a recurrence, and then a dynamic
programming solution for the knapsack problem.
To compile our discussion from last time into a recurrence, let me just introduce
a little notation. So by capital v sub I x, what I mean is
the maximum value you can get of a solution that satisfies two constraints.
First of all, it should make use only of the first I items.
Second of all, the total size of the items that you pick should be at most x.
The analogue of this notation in the independent set problem was G sub I, the
graph with the first I vertices. We're now using two indices instead of
one because sub-problems have two different ways in which they can get
smaller. Recall case two of our though
experiments, when we look at a smaller sub-problem there was both one less item
and there was also less capacity. So when thinking about a sub-problem we
need to keep track of both. We need to keep track of how many items
are we allowed to use and we need to keep track of what's the total size that we're
allowed to use. So let's express our conclusion to the
last video in this notation. What did we figure out last video?
We figured out that the optimal solution has to have one of two forms.
Either you just inherit the optimal solution from that tech instance with one
less prop, one less item, that was the first case, or you take the optimal
solution to the sub-problem which has one less item and less capacity by the weight
of the current item and you extend it to an optimal solution for the current
sub-problem using the ith item. So V sub I, X.
The optimal value for the current sub problem.
It's the better of the two possibilities, so we take a max.
The first possibility is just the optimal solution to the, with the same capacity
in one pure item, we're using that solution.
The second one is you commit to using item I, you gain value vi.
And you combine that with the optimal solution using the first I minus one
items in the reduced capacity of X minus the weight of the current item.
So one quick edge case. If the weight of the ith item is bigger
than our permitted capacity x, then of course we can't use it, and then we're
forced to use the first case. We're forced to be in case one.
This then is our recurrence for the knapsack problem.
So that completes step one where we think about the optimal solution structure and
use that to devise a recurrence. Now, in the step two, we're going to
precisely nail down what exactly are the subproblems we have to care about.
Well in our maximum waited independence set example on path graphs remember the
reasoning every time we recursed every time we needed to look up a, a sub
problem solution it was obtained by plucking verticies off of the right of
the graph and for that reason all we ever cared about was the various prefixes of
the graph. In one dimension we got something similar
going on here whenever we look at smaller sub problems it's always with one fewer
item we're always sort of deleting the last item before we do our look up.
So again, we need to worry about all possible prefixes of the items.
So sub-problems, of the first I items for all values of I.
For the knapsack problem however there's a second sense in which sub-problems can
be smaller. And remember case two of our thought
experiment, when we want to know the optimal solution
that's guaranteed to use the current item I.
We have to reduce the capacity before looking up the corresponding optimal
solution of a sub-problem. That is we're not just peeling off items,
but we're also sort of peeling off parts of the knapsack capacity.
Now, here is where we're going to use our assumption, that I told you at the
beginning, at our input, that sizes are all integers, and that the
knapsack capacity is an integer. So, the knapsack capacity starts at
capital W. Every time we peel off some integer
amount of capacity from it. So at all sub problems, we're going to be
working with integral knapsack capacities.
So therefore, in the worst case, you know,
what are the various sub problems that might arise?
Well, perhaps, each choice of a residual capacity, 012, all the way up to the
original knapsack capacity, capital W. So now we're in great shape.
In step two, we figured out exactly what subproblems we care about.
In step one, we figured out a formula by which we can solve bigger subproblems
given the solutions of smaller ones, so all that remains now is to write down a
table and fill it in systematically using the recurrence, beginning with the
smaller subproblems, ending up with the largest subproblems.
So here's this pseudo-code. It's again going to be very simple.
In this algorithm, the array A that we're going to be filling in has two
dimensions, in contrast to one dimension in the wave independence set problem.
That's because our sub-problems have two different indices.
We have to keep track of both of which set of items we're allowed to use, and
also what capacity we need to respect. So the two independent varables indexing
sub problems forces us to have a 2D array that we're going to go through now in a
double four loop. So in the init-, initialization step, we
fill in all the entries, where I equals zero,
where there's no items there, of course, the optimal solution value is zero, and
then we just go through the sub problem systematically.
When we get to a sub-problem we use the recurrence that we developed to compute
the entry in that table using, of course, the entries that we've previously
computed in earlier iterations. So for a given sub-problem where you're
allowed to use the first I items and the residual capacity is X.
What do we know from my thought experiment?
This can only be one of two things. We're just going to do brute force search
through the two possibilities. The two possibilities where we have case
one where we just inherit the optimal solution with one fewer item,
and the same capacity. Or if we're going to actually use item I,
then we compose that with the optimal solution from the first I items.
That leaves enough space for item I. And in that case, we get the value V sub
I for including this Ith item. So this code doesn't quite make sense in
the case where the weight of the current item, wy, is strictly bigger than the,
the capacity, x. that would give us a negative array
index, which, of course, is a no-no. But in that
case, you know, amongst friends we just interpret it as ignoring the second case
of the max. So, you know?
I'm not going to write down the extra code.
But what you would write down is. If WI is strictly bigger than x, then you
just define a of I, x to be equal to a of ai-1, x.
And one crucial point, is that, you know, by the time we need to solve subproblem
with a given I and a given X, we have at our disposal the solutions to all of the
subproblems that we need. In particular, in the previous iteration
of the outer for loop, we computed the solutions to the two relevant subproblems
for evaluating this recurrence. They're there waiting for us available
for constant time lookup. So after this double four loop completes,
sitting there waiting for us in A, in the entry N comma capital W, is the solution
that we wanted. That's the max value solution that can
use any items that it wants and that can use the entire original knapsack capacity
capital W. That's what we want to return.
So that's it. That's the whole dynamic programming
solution for the knapsack problem. Just to make sure this makes sense, in
the next quiz, I want to ask you to analyze the running time of this dynamic
programming algorithm. Alright, so the correct answer is the
second one. And the running time analysis is as
straightforward as it was for the max weight independent set problem.
We just count up the number of sub-problems and look at how much work we
do in each. So how many sub-problems are there?
Where they're indexed by both I and x, there are n plus one choices for I, there
are capital w plus one choices for x. So that gives us theta of n times w
different sub-problems. And all we do for each is a single
comparison amongst previously computed solutions.
So we just do constant work per sub-problem,
giving us the overall running time of n times capital W.
So, a couple other quick notes, which play out in exactly the same way as for
the Maximum Independence Set problem. So I'm not going to discuss the details
here. So first of all, correctness.
Again, it's exactly the same template as in the previous dynamic programming
algorithm, and in our divide and conquer algorithms.
You just go by induction on the problem size.
And, you use the case analysis or a thought experiment to justify, formally,
the inductive step. So this algorithm suffers from a similar
drawback to the max independent set algorithm for path graphs that we looked
at. Namely it computes the value of an
optimal solution not the optimal solution itself.
It returns a number, not an actual subset of items.
But, just like with the independence set problem, you can reconstruct an optimal
solution from the filled in array just by tracing backward.
So the intuition is that you begin with the biggest sub-problem, and you look at
which of the two cases was used to fill in that entry.
If case one was used, you know you should exclude the last item.
If case two was used, you know you should include the last item.
And also, which of those two cases tells you which sub-problem you should trace
back to, to continue the process. So I encourage you to spell out the
details of this reconstruction algorithm in the privacy of your own home.
That'll give you some nice practice with this reconstruction aspect of the dynamic
programming paradigm.