Answer:
This is a typical example of a constrained max/min problem. The method used to solve this problem
is called the method of Lagrange multipliers. Let’s generalize the situation:
Given: a function f(x, y, z) and a constraint that we can write as g(x, y, z) = 0.
Goal: Find min or max of f(x, y, z) for (x, y, z) satisfying g(x, y, z) = 0.
To get a “visual grasp” of the concept of Lagrange multipliers, one can think about the following
problem:
Take a balloon (here approximated by a perfect sphere centered at the origin) and a box (think of
a cube for example). We want to find the maximum radius of the balloon (this is the function to
maximize) that can fit inside the box (this is the constraint). We start inflating the balloon and we
realize that the maximum radius is obtained when the balloon touches the box. At the touching
point(s) the surface of the balloon and that of the box are tangent to each other!
This simple experiment is not a special case. In fact, in general, if P0 = (x0, y0, z0) is a point on
the level surface given by the constraint at which f attains a max/min, then at this point the level
surface of the constraint is tangent to the level surface of f passing through P0.
If the two surfaces are tangent at P0, then their normal vectors at P0 are parallel to each other.
In particular their gradients at P0 are parallel, that is
\[
\nabla f(P_0) = \lambda \nabla g(P_0) \tag{3.1}
\]
for some parameter λ. This parameter is called the Lagrange multiplier.
We discovered that the max/min points of a function f(x, y, z) constrained by g(x, y, z) = 0 are
found among the solutions (x, y, z, λ) of the system
\[
\begin{aligned}
\nabla f(x, y, z) - \lambda \nabla g(x, y, z) &= 0 \\
g(x, y, z) &= 0.
\end{aligned}
\]
Notice that this system contains four equations and four unknowns:
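\[
\begin{aligned}
\frac{\partial}{\partial x} f(x, y, z) - \lambda \frac{\partial}{\partial x} g(x, y, z) &= 0 \\
\frac{\partial}{\partial y} f(x, y, z) - \lambda \frac{\partial}{\partial y} g(x, y, z) &= 0 \\
\frac{\partial}{\partial z} f(x, y, z) - \lambda \frac{\partial}{\partial z} g(x, y, z) &= 0 \\
g(x, y, z) &= 0,
\end{aligned}
\tag{3.2}
\]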
but in general it is not a linear system!
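As a sanity check on a system of this form, here is a minimal Python sketch. The choices f = x^2 + y^2 + z^2 and g = x + y + z − 1 are assumptions made purely for illustration (they are not taken from the problem above), and the helper names grad and lagrange_residual are hypothetical. For this particular choice the system happens to be linear and can be solved by hand; the code only verifies numerically that the hand-computed candidate satisfies all four equations.

```python
# Illustrative (assumed) example: minimize f = x^2 + y^2 + z^2
# on the plane g = x + y + z - 1 = 0.

def f(x, y, z):
    return x**2 + y**2 + z**2

def g(x, y, z):
    return x + y + z - 1.0

def grad(func, p, h=1e-6):
    """Central-difference approximation of the gradient of func at point p."""
    out = []
    for i in range(len(p)):
        up = list(p)
        dn = list(p)
        up[i] += h
        dn[i] -= h
        out.append((func(*up) - func(*dn)) / (2 * h))
    return out

def lagrange_residual(p, lam):
    """Left-hand sides of the four equations: grad f - lam * grad g, then g."""
    gf = grad(f, p)
    gg = grad(g, p)
    return [a - lam * b for a, b in zip(gf, gg)] + [g(*p)]

# Solving by hand: 2x = 2y = 2z = lam together with x + y + z = 1
# gives x = y = z = 1/3 and lam = 2/3.
candidate = (1 / 3, 1 / 3, 1 / 3)
lam = 2 / 3
residual = lagrange_residual(candidate, lam)

# All four entries should be zero up to finite-difference rounding error.
print(max(abs(r) for r in residual))
```

A point that lies on the constraint but is not a max/min, such as (1, 0, 0), leaves a clearly nonzero residual in the first three equations, which is how the system singles out the candidate points.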
One can present the method of Lagrange multipliers in a more efficient (but less illuminating) way.
Define the new function
\[
L(x, y, z, \lambda) = f(x, y, z) - \lambda g(x, y, z).
\]
The critical points of L solve the vector equation
\[
\nabla L(x, y, z, \lambda) = 0.
\]
But remember that the variables are now (x, y, z, λ), so we need to take four partial derivatives of
L. Doing so yields (3.2) again!
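Written out, the four partial derivatives of L can be sketched as follows. This uses only the definition of L above; note that the derivative with respect to λ returns the constraint up to a sign, which does not change the equation.

```latex
\[
\begin{aligned}
\frac{\partial L}{\partial x} &= \frac{\partial f}{\partial x} - \lambda \frac{\partial g}{\partial x} = 0, \\
\frac{\partial L}{\partial y} &= \frac{\partial f}{\partial y} - \lambda \frac{\partial g}{\partial y} = 0, \\
\frac{\partial L}{\partial z} &= \frac{\partial f}{\partial z} - \lambda \frac{\partial g}{\partial z} = 0, \\
\frac{\partial L}{\partial \lambda} &= -g(x, y, z) = 0.
\end{aligned}
\]
```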