Lagrange Optimization Question

I’m fiddling around with the Lagrange function for optimization this semester, and I’ve got a bit of a conceptual roadblock that I was hoping one of the learned people on the board might be able to help me out with.

So in the simplest case, the Lagrange essentially says that the direction of the gradient of the objective function will be the same as some scalar times the gradient of the constraint. Cool beans. What happens when there’s a corner solution? The gradient is obviously not in the same direction then…am I not allowed to use a Lagrange for corner solutions? Do I check corner solutions manually or something?

If anyone knows what I’m talking about, I would really appreciate some advice. Thanks.

Foam Roll.

What’s an example of a problem that you’re struggling with?

Please explain what you mean by a “corner solution”.

Hmm. I wrote out the problem I was struggling with, but realized that I need to add an additional twist before the problem becomes clear. I posted it anyway, but I suspect that unless one is already familiar with the Lagrange the end steps will be a little unclear.

Alright, so in a simple case of a Lagrange, I would use it for something like:

Max z = xy

subject to the constraint x + y = 1

the setup for the Lagrange is as follows:

L = xy - v(x + y - 1)

where v would typically be lambda. I then take the first order derivatives with respect to x, y, and v respectively:

y - v = 0
x - v = 0
x + y - 1 = 0

and then solving for that system of equations:

y = v = x
x + x = 1
x = 1/2
y = 1/2

I find the values of x and y that maximize z subject to the constraint. It essentially says the gradients point in the same direction at the point of optimization, so one is a scalar multiple of the other; you solve for the x and y where this occurs, and you’ll have the point of optimization.
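For what it’s worth, a quick brute-force check (plain Python, just substituting the constraint and scanning, my own sanity check rather than part of the method) agrees with the Lagrange answer:

```python
# Maximize z = x*y subject to x + y = 1 by substituting y = 1 - x
# and scanning x over [0, 1]. The Lagrange solution predicts x = y = 1/2.

def objective(x):
    y = 1.0 - x          # enforce the constraint x + y = 1
    return x * y

# scan x in small steps and keep the best point
best_x = max((i / 1000.0 for i in range(1001)), key=objective)

print(best_x, objective(best_x))   # -> 0.5 0.25
```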

Alright. So moving on to what confuses me, a case like:

Max z = (x + 100)y

subject to x + y = 1
and nonnegativity, or x >= 0, and y >= 0, which is conventionally written as -x <= 0, and -y <= 0

Geometrically, it is obvious that this will be a “corner solution;” the max value that can occur here will be when x = 0 and y = 1, or at the corner of the constraint, touching the y axis.

Setting up the Lagrange here requires an additional twist for nonnegativity:

L = (x + 100)y - v(x + y - 1) + tx + ry

and there’s a handy set of rules for solving this, but I haven’t quite mastered them at this point. Anyway, I’m concerned that in situations like this, the intuitive portion of the Lagrange no longer holds true. The grads no longer point in the same direction…how could it work?
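Incidentally, a quick brute-force scan in plain Python (substituting y = 1 - x, purely a sanity check on my part) confirms that the corner really is the max:

```python
# Maximize z = (x + 100) * y subject to x + y = 1 with x, y >= 0.
# Substituting y = 1 - x reduces it to a 1-D problem on x in [0, 1].

def objective(x):
    y = 1.0 - x
    return (x + 100.0) * y

best_x = max((i / 1000.0 for i in range(1001)), key=objective)
best_y = 1.0 - best_x

print(best_x, best_y)   # -> 0.0 1.0
```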

I don’t see why you’ve thrown in tx + ry. Can you not just solve in the same manner as the first problem and then throw out the negative solutions for x and y you obtain?

Yeah, that’s what I’ve done when I’ve gotten results that were fubar. Essentially I look at it, realize it must be a corner solution, and then calculate the values of x and y at the corner. The way I’ve set it up is the right way to do it though. There’s a whole system of equations that results from the setup, and if they’re all simultaneously satisfied I’m supposed to get the right answer.

I still don’t understand what role “+ tx + ry” plays in ensuring that you obtain nonnegative x and y values though. I’ll take a look at this stuff a bit later when I’ve woken up more, hopefully then I’ll be more help.

Thanks. If you’re curious, I’ll post up what role they play when I figure all this out. First though…coffee and chores.

Shit, I thought this was a thread dedicated to ZZ Top.

Fuck math.

Just wondering, are we speaking of the Lagrangian method from microeconomics? I remember Lagrange from intermediate micro, yet this seems to be a deeper (or different) look into the method.

What’s with all the math threads popping up lately?

Yes. The Lagrangian has many applications I’m sure, but I am using it for microeconomics.

If I recall correctly, this PDF had some information comparing corner solutions and interior solutions from a Lagrangian standpoint.

The link Cr Powerlinate posted up looks useful. I’m very familiar with lagrangians, but from a physics/pure math perspective. I’ve never really seen these problems where the constraints are inequalities and/or nonnegatives, but if you’re still scratching your head I’ll try to think about it more.

Lagrange vs. parabolic trajectories, who would win? Discuss…

[quote]Cr Powerlinate wrote:
If I recall correctly, this PDF had some information comparing corner solutions and interior solutions from a Lagrangian standpoint.[/quote]

Thanks. This actually covers a lot of what I’m talking about, but more from the application standpoint than the theoretical. In the PDF that you posted, what confuses me is the intuition behind the Lagrange in the corner solution at b = 0, c <= 48.

At that point, the direction of the grad is not the same as the direction of the budget constraint’s grad, so it confuses me that the Lagrange is able to work.

My understanding of the Lagrange is that for equalities it finds points of tangency between the constraint and objective functions, and that for inequalities it either finds the constraints binding or finds them not binding and drops them from the equation. None of that really deals with the intuition behind corner solutions.
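Just to make this concrete for myself, here’s a quick plain-Python check (my own scratch work, using the setup L = (x + 100)y - v(x + y - 1) + tx + ry from my second example) that the full set of first-order and slackness conditions really is satisfied at the corner (0, 1):

```python
# Check the first-order conditions from
# L = (x + 100)*y - v*(x + y - 1) + t*x + r*y at the corner (0, 1).
x, y = 0.0, 1.0

# y = 1 > 0, so complementary slackness (r*y = 0) forces r = 0.
r = 0.0
# dL/dy = (x + 100) - v + r = 0  ->  v = x + 100 + r
v = x + 100.0 + r
# dL/dx = y - v + t = 0  ->  t = v - y
t = v - y

# All conditions hold: multipliers nonnegative, slackness satisfied.
print(v, t, r)                                            # -> 100.0 99.0 0.0
print(t >= 0 and r >= 0 and t * x == 0 and r * y == 0)    # -> True
```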

[quote]blithe wrote:
Hmm. I wrote out the problem I was struggling with, but realized that I need to add an additional twist before the problem becomes clear. I posted it anyway, but I suspect that unless one is already familiar with the Lagrange the end steps will be a little unclear.

Alright, so in a simple case of a Lagrange, I would use it for something like:

Max z = xy

subject to the constraint x + y = 1

the setup for the Lagrange is as follows:

L = xy - v(x + y - 1)

where v would typically be lambda. I then take the first order derivatives with respect to x, y, and v respectively:

y - v = 0
x - v = 0
x + y - 1 = 0

and then solving for that system of equations:

y = v = x
x + x = 1
x = 1/2
y = 1/2

I find the values of x and y that maximize z subject to the constraint. It essentially says the gradients point in the same direction at the point of optimization, so one is a scalar multiple of the other; you solve for the x and y where this occurs, and you’ll have the point of optimization.

Alright. So moving on to what confuses me, a case like:

Max z = (x + 100)y

subject to x + y = 1
and nonnegativity, or x >= 0, and y >= 0, which is conventionally written as -x <= 0, and -y <= 0

Geometrically, it is obvious that this will be a “corner solution;” the max value that can occur here will be when x = 0 and y = 1, or at the corner of the constraint, touching the y axis.

Setting up the Lagrange here requires an additional twist for nonnegativity:

L = (x + 100)y - v(x + y - 1) + tx + ry

and there’s a handy set of rules for solving this, but I haven’t quite mastered them at this point. Anyway, I’m concerned that in situations like this, the intuitive portion of the Lagrange no longer holds true. The grads no longer point in the same direction…how could it work?

[/quote]

Sorry for my short reply before, this is actually, I think, an interesting question. You are correct in stating as you have that “the Lagrange essentially says that the direction of the gradient of the objective function will be the same as some scalar times the gradient of the constraint.” This isn’t essentially what it says, it’s exactly what it says. So, let’s look at your own example:

[quote]
L = xy - v(x + y - 1)

where v would typically be lambda. I then take the first order derivatives with respect to x, y, and v respectively:

y - v = 0
x - v = 0
x + y - 1 = 0[/quote]

More descriptively, this is what you get:

d/dx(xy) - (v)d/dx(x + y - 1) = 0
d/dy(xy) - (v)d/dy(x + y - 1) = 0
x + y - 1 = 0

Or, just calling these things f(x,y) = xy and g(x,y) = x + y - 1:

d/dx f(x,y) - (v)d/dx g(x,y) = 0
d/dy f(x,y) - (v)d/dy g(x,y) = 0
g(x,y) = 0

Now, the first two equations can be written as:

d/dx f(x,y) = (v)d/dx g(x,y)
d/dy f(x,y) = (v)d/dy g(x,y)

That is, each component of the gradient of f equals v times the corresponding component of the gradient of g. But componentwise equality is just vector equality, so

grad(f) = vgrad(g)

which is what you said.
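As a quick numeric sanity check of this (my own addition), at your solution (1/2, 1/2) the two gradients are indeed scalar multiples of one another:

```python
# Check grad(f) = v * grad(g) at the interior solution (1/2, 1/2),
# with f(x, y) = x*y and g(x, y) = x + y - 1.

x, y = 0.5, 0.5
grad_f = (y, x)        # (df/dx, df/dy) of f = xy
grad_g = (1.0, 1.0)    # (dg/dx, dg/dy) of g = x + y - 1
v = x                  # the multiplier: v = x = y = 1/2 from the system

print(grad_f == (v * grad_g[0], v * grad_g[1]))   # -> True
```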

Now, I’ve written all this out, and made such a big deal out of it, because it should be clear that given the conditions for the nonnegativity problem you can’t perform this little derivation. So consider your second problem:

[quote]
Setting up the Lagrange here requires an additional twist for nonnegativity:

L = (x + 100)y - v(x + y -1) + tx + ry[/quote]

If you take the first derivatives of this with respect to x, y, and v, you will get three equations, but unlike the case above you will NOT be able to derive that setting the first partial derivatives of this equation equal to zero is “essentially” setting the gradients of f and g equal.

What’s the point? Well, the point is you were EXACTLY right. Setting the gradients of f and g equal when there is a nonnegativity constraint doesn’t work, and in fact you’re not doing that when you use the formulas for this case. Now, if your teacher presented this all to you as if you were still “just” setting the gradients equal, then either they themselves are confused or they misled you by omission.

If this all has left you with a new question–namely, just how the hell do the “new” conditions get me the maximization–that’s somewhat more complicated. I, since I’ve never had to deal with this sort of nonnegativity constraint in a maximization problem, don’t really quite understand them myself. I’m sure I could look into it more and figure out just what’s going on, but you can do that just as well as I. I’ve found this pdf, which addresses just where these maximization conditions for nonnegativity come from: http://www.hks.harvard.edu/nhm/notes2006/You,%20the%20Kuhn-Tucker%20conditions,%20and%20You.pdf . I can’t attest to its clarity or accuracy, but maybe it’s a start.

In any event I hope I’ve at least cleared up your original question–namely what setting the gradients equal has to do with maximization with nonnegativity–even if I’ve left you with a new question.
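To make the geometric picture concrete, here’s a little numeric sketch of mine (using the multiplier values that fall out of the first-order conditions for your second problem): at the corner, grad(f) is not v*grad(g) alone, but v*grad(g) minus the multipliers on the active nonnegativity constraints.

```python
# Numeric illustration at the corner (x, y) = (0, 1): the stationarity
# conditions of L = (x + 100)*y - v*(x + y - 1) + t*x + r*y rearrange to
#     grad(f) = v*grad(g) - (t, r)
# so grad(f) is a combination of ALL the active constraint gradients,
# not a scalar multiple of grad(g) alone.

grad_f = (1.0, 100.0)        # (df/dx, df/dy) = (y, x + 100) at (0, 1)
grad_g = (1.0, 1.0)          # gradient of g(x, y) = x + y - 1
v, t, r = 100.0, 99.0, 0.0   # multiplier values solving the system

combo = (v * grad_g[0] - t, v * grad_g[1] - r)
print(combo == grad_f)                              # -> True
print(grad_f == (v * grad_g[0], v * grad_g[1]))     # -> False
```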

ps,

Sorry for the typos, and hopefully no one will jump my bones too badly for my rough-and-ready conclusion that ‘grad(f) = vgrad(g)’. I know I skipped some steps.

Yeah, the pdf you posted was pretty helpful. It didn’t answer the question, but I think it gave me some tools that will allow me to either figure it out myself or better understand it when I ask my prof about it. Thanks a lot.