[quote]stokedporcupine8 wrote:
blithe wrote:
Hmm. I wrote out the problem I was struggling with, but realized that I need to add an additional twist before the problem becomes clear. I posted it anyway, but I suspect that unless one is already familiar with the Lagrange method, the end steps will be a little unclear.
Alright, so in a simple case of a Lagrange, I would use it for something like:
Max z = xy
subject to the constraint x + y = 1
the setup for the Lagrange is as follows:
L = xy - v(x + y - 1)
where v would typically be lambda. I then take the first order derivatives with respect to x, y, and v respectively:
y - v = 0
x - v = 0
x + y - 1 = 0
and then solving for that system of equations:
y = v = x
x + x = 1
x = 1/2
y = 1/2
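To sanity-check that answer numerically, here is a minimal brute-force sketch (pure Python, no libraries; the constraint is used to eliminate y, and x is scanned over a fine grid):

```python
# Brute-force check: maximize z = x*y subject to x + y = 1.
# Parameterize the constraint by x and scan a fine grid over [0, 1].
best_x, best_z = None, float("-inf")
steps = 10000
for i in range(steps + 1):
    x = i / steps
    y = 1 - x          # enforce the constraint x + y = 1
    z = x * y
    if z > best_z:
        best_x, best_z = x, z

print(best_x, 1 - best_x, best_z)  # -> 0.5 0.5 0.25
```

The scan agrees with the Lagrange solution x = y = 1/2, with z = 1/4.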
I find the values of x and y that maximize z subject to the constraint. Essentially, this says that at the point of optimization the two gradients point in the same direction, so the gradient of the objective equals some scalar multiple of the gradient of the constraint; you solve for the x and y where this occurs, and that is the point of optimization.
Alright. So moving on to what confuses me, a case like:
Max z = (x + 100)y
subject to x + y = 1
and nonnegativity, or x >= 0, and y >= 0, which is conventionally written as -x <= 0, and -y <= 0
Geometrically, it is obvious that this will be a “corner solution;” the max value that can occur here will be when x = 0 and y = 1, or at the corner of the constraint set touching the y axis.
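The corner claim can be checked with the same kind of brute-force scan (a sketch in pure Python; the constraint eliminates y, and x in [0, 1] automatically satisfies both nonnegativity conditions):

```python
# Brute-force check: maximize z = (x + 100)*y subject to
# x + y = 1, x >= 0, y >= 0. Scan x over [0, 1].
best_x, best_z = None, float("-inf")
steps = 10000
for i in range(steps + 1):
    x = i / steps
    y = 1 - x               # constraint; x, y both in [0, 1], so nonnegativity holds
    z = (x + 100) * y
    if z > best_z:
        best_x, best_z = x, z

print(best_x, 1 - best_x, best_z)  # -> 0.0 1.0 100.0
```

The objective (x + 100)(1 - x) is strictly decreasing on [0, 1], so the maximum sits at the corner (0, 1) with z = 100, as claimed.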
Setting up the Lagrange here requires an additional twist for nonnegativity:
L = (x + 100)y - v(x + y - 1) + tx + ry
and there’s a handy set of rules for solving this, but I haven’t quite mastered them at this point. Anyway, I’m concerned that in situations like this, the intuitive portion of the Lagrange no longer holds true. The grads no longer point in the same direction…how could it work?
Sorry for my short reply before, this is actually, I think, an interesting question. You are correct in stating as you have that “the Lagrange essentially says that the direction of the gradient of the objective function will be the same as some scalar times the gradient of the constraint.” This isn’t essentially what it says, it’s exactly what it says. So, let’s look at your own example:
L = xy - v(x + y - 1)
where v would typically be lambda. I then take the first order derivatives with respect to x, y, and v respectively:
y - v = 0
x - v = 0
x + y - 1 = 0
More descriptively, this is what you get:
d/dx(xy) - (v)d/dx(x + y - 1)=0
d/dy(xy) - (v)d/dy(x + y - 1)=0
x + y - 1 = 0
Or, just calling these things f(x,y), g(x,y)
d/dxf(x,y) - (v)d/dxg(x,y)=0
d/dyf(x,y) - (v)d/dyg(x,y)=0
g(x,y)=0
Now, we can clearly write the first two equations as:
d/dxf(x,y) = (v)d/dxg(x,y)
d/dyf(x,y) = (v)d/dyg(x,y)
But taken together, component by component, this just says, of course,
grad(f) = (v)grad(g)
which is what you said.
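One can check this gradient condition at the solution of the first example. A minimal sketch (the partial derivatives are hand-computed here, not symbolic):

```python
# At the optimum (x, y) = (1/2, 1/2) of max xy s.t. x + y = 1:
# grad f = (df/dx, df/dy) = (y, x), and grad g = (1, 1).
x, y = 0.5, 0.5
grad_f = (y, x)        # = (0.5, 0.5)
grad_g = (1.0, 1.0)

# grad f = v * grad g holds with a single scalar v for both components:
v = grad_f[0] / grad_g[0]
assert v == grad_f[1] / grad_g[1]
print(v)  # -> 0.5
```

The single scalar v = 1/2 is exactly the Lagrange multiplier from the first-order conditions y - v = 0, x - v = 0.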
Now, I’ve written all this out, and made such a big deal out of it, because it should be clear that given the conditions for the nonnegativity problem you can’t perform this little derivation. So consider your second problem:
Setting up the Lagrange here requires an additional twist for nonnegativity:
L = (x + 100)y - v(x + y - 1) + tx + ry
If you take the first derivatives of this with respect to x, y, and v, you will get three equations, but unlike the case above you will NOT be able to derive that setting the first partial derivatives of this equation equal to zero is “essentially” setting the gradients of f and g equal.
What’s the point? Well, the point is you were EXACTLY right. Setting the gradients of f and g equal when there is a nonnegativity constraint doesn’t work, and in fact you’re not doing that when you use the formulas for this case. Now, if your teacher presented this all to you as if you were still “just” setting the gradients equal, then either they themselves are confused or they misled you by omission.
If this all has left you with a new question–namely, just how the hell do the “new” conditions get me the maximization–that’s somewhat more complicated. I, since I’ve never had to deal with this sort of nonnegativity constraint in a maximization problem, don’t really quite understand them myself. I’m sure I could look into it more and figure out just what’s going on, but you can do that just as well as I. I’ve found this pdf, which addresses just where these maximization conditions for nonnegativity come from: http://www.hks.harvard.edu/nhm/notes2006/You,%20the%20Kuhn-Tucker%20conditions,%20and%20You.pdf . I can’t attest to its clarity or accuracy, but maybe it’s a start.
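For what it’s worth, the conditions in question (the Kuhn–Tucker conditions: stationarity, complementary slackness, and sign constraints on the multipliers) can at least be verified at the candidate corner (0, 1). A minimal sketch, with the multipliers solved out by hand rather than by a general solver:

```python
# Kuhn-Tucker check for: max (x + 100)y  s.t.  x + y = 1, x >= 0, y >= 0,
# with L = (x + 100)y - v(x + y - 1) + t*x + r*y.
# Stationarity:   dL/dx = y - v + t = 0,   dL/dy = (x + 100) - v + r = 0
# Compl. slack.:  t*x = 0,  r*y = 0,  with t >= 0, r >= 0.
x, y = 0.0, 1.0                   # the candidate corner solution

r = 0.0                           # y > 0 forces r = 0 via r*y = 0
v = (x + 100) + r                 # from dL/dy = 0
t = v - y                         # from dL/dx = 0

assert x + y == 1                 # equality constraint holds
assert t * x == 0 and r * y == 0  # complementary slackness
assert t >= 0 and r >= 0          # multiplier signs check out
print(v, t, r)  # -> 100.0 99.0 0.0
```

All the conditions are satisfiable with nonnegative multipliers (t = 99, r = 0), which is what certifies (0, 1) as the constrained maximizer; note that v = 100 while grad f at (0, 1) is (1, 100), so grad(f) is indeed NOT a scalar multiple of grad(g) = (1, 1) here.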
In any event I hope I’ve at least cleared up your original question–namely what setting the gradients equal has to do with maximization with nonnegativity–even if I’ve left you with a new question. [/quote]
Very thorough explanation.