# Statistical Mechanics – Lagrange Multipliers

Before we explore the Gibbs entropy further, it is necessary to introduce a technique called the method of Lagrange multipliers. The following is a sketch proof, one I hope will be satisfactory for the average amateur physicist!

Consider some function $f$ of the variables $\{x_1,x_2,...,x_\Omega\}$. At a stationary point, the function’s differential $df$ is equal to 0:

$\displaystyle df=\frac{\partial f}{\partial x_1}dx_1+\frac{\partial f}{\partial x_2}dx_2+...+\frac{\partial f}{\partial x_\Omega}dx_\Omega=0$

Since the variables are independent, this equation holds in general only if each partial derivative is equal to 0. But what if we want to extremise $f$ subject to some constraint of the form

$g(x_1,x_2,...,x_\Omega)=0$

where $g$ is some known function? This equation provides a relation between the variables, so they are not all independent. The constraint can, at least in principle, be rearranged to give one variable in terms of the others; hence only $\Omega-1$ of the variables are independent. So we cannot say for certain that, for $df$ to be 0, each partial derivative of $f$ must be 0: we have to treat the problem more carefully.

We have the constraint $g=0$ everywhere, so we can say for certain that its differential $dg$ is 0 everywhere too:

$\displaystyle dg=\frac{\partial g}{\partial x_1}dx_1+\frac{\partial g}{\partial x_2}dx_2+...+\frac{\partial g}{\partial x_\Omega}dx_\Omega=0$

This equation can be rearranged to give one of the variable differentials, say the last one for the sake of argument (here $\partial x_\Omega/\partial g$ is shorthand for the reciprocal of $\partial g/\partial x_\Omega$):

$\displaystyle dx_\Omega=-\frac{\partial x_\Omega}{\partial g}\Bigg(\frac{\partial g}{\partial x_1}dx_1+\frac{\partial g}{\partial x_2}dx_2+...+\frac{\partial g}{\partial x_{\Omega-1}}dx_{\Omega-1}\Bigg)$

We substitute this into the equation for $df$:

$\displaystyle \frac{\partial f}{\partial x_1}dx_1+\frac{\partial f}{\partial x_2}dx_2+...-\frac{\partial f}{\partial x_\Omega}\frac{\partial x_\Omega}{\partial g}\Bigg(\frac{\partial g}{\partial x_1}dx_1+\frac{\partial g}{\partial x_2}dx_2+...+\frac{\partial g}{\partial x_{\Omega-1}}dx_{\Omega-1}\Bigg)=0$

Here, we introduce the new variable

$\displaystyle \lambda=\frac{\partial f}{\partial x_\Omega}\frac{\partial x_\Omega}{\partial g}$

The quantity $\lambda$ is called a Lagrange multiplier. The above becomes

$\displaystyle \Bigg[\frac{\partial f}{\partial x_1}-\lambda\frac{\partial g}{\partial x_1}\Bigg]dx_1+\Bigg[\frac{\partial f}{\partial x_2}-\lambda\frac{\partial g}{\partial x_2}\Bigg]dx_2+...+\Bigg[\frac{\partial f}{\partial x_{\Omega-1}}-\lambda\frac{\partial g}{\partial x_{\Omega-1}}\Bigg]dx_{\Omega-1}=0$

These $\Omega-1$ variables are independent, so we can say with certainty that

$\displaystyle \frac{\partial f}{\partial x_i}-\lambda\frac{\partial g}{\partial x_i}=0$

for $i=1,2,...,\Omega-1$. What about the last variable? From the definition of $\lambda$,

$\displaystyle \frac{\partial f}{\partial x_\Omega}\frac{\partial x_\Omega}{\partial g}=\lambda$

Multiplying both sides by $\partial g/\partial x_\Omega$ and rearranging,

$\displaystyle \frac{\partial f}{\partial x_\Omega}-\lambda\frac{\partial g}{\partial x_\Omega}=0$

So we can extend the identity above to include $i=\Omega$ too:

$\displaystyle \frac{\partial f}{\partial x_i}-\lambda\frac{\partial g}{\partial x_i}=0\ \forall\ i$
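As a quick sanity check of this identity, here is a small numerical sketch using a hypothetical example (not part of the derivation above): $f(x,y)=x^2+y^2$ constrained to $g(x,y)=x+y-2=0$, whose constrained minimum sits at $(1,1)$ (substitute $y=2-x$ and minimise).

```python
# Hypothetical example: f(x, y) = x^2 + y^2 with g(x, y) = x + y - 2 = 0.
# The constrained minimum is at (1, 1); we check that a single lambda
# satisfies df/dx_i - lambda * dg/dx_i = 0 for every i.

def grad_f(x, y):
    return (2 * x, 2 * y)      # (df/dx, df/dy)

def grad_g(x, y):
    return (1.0, 1.0)          # (dg/dx, dg/dy)

x0, y0 = 1.0, 1.0              # the constrained extremum
lam = grad_f(x0, y0)[0] / grad_g(x0, y0)[0]  # fix lambda using i = 1

# The identity must then hold for every component with that one lambda:
residuals = [df_i - lam * dg_i
             for df_i, dg_i in zip(grad_f(x0, y0), grad_g(x0, y0))]
print(lam, residuals)  # → 2.0 [0.0, 0.0]
```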

Now consider the function

$h=f-\lambda g$

where $h=h(x_1,x_2,...,x_\Omega,\lambda)$. At stationary points,

$dh=df-\lambda dg-g d\lambda = 0$

Since the differentials $dx_i$ and $d\lambda$ are all independent, for this to be true we require both

$df-\lambda dg=0$

$g=0$

So we recover both the equations relating the partial derivatives that we derived above, and the constraint itself. This is a set of $\Omega+1$ simultaneous equations in $\Omega+1$ unknowns (since $\lambda$ has been added to the mix). The conclusion: to extremise a function $f$ subject to the constraint $g=0$, we can instead extremise the function $f-\lambda g$ subject to no constraints.
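To make this conclusion concrete, here is a minimal sketch using sympy on a hypothetical example: extremising $f(x,y)=xy$ subject to $g(x,y)=x+y-4=0$, by instead finding the stationary points of $h=f-\lambda g$ with no constraints at all.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda')
f = x * y                      # function to extremise
g = x + y - 4                  # constraint g = 0
h = f - lam * g                # the unconstrained auxiliary function

# Stationary points of h in all Omega + 1 variables (here x, y, lambda):
eqs = [sp.diff(h, v) for v in (x, y, lam)]
sols = sp.solve(eqs, (x, y, lam), dict=True)
# The single stationary point is x = y = 2 (with lambda = 2), which is
# indeed the constrained maximum of xy on the line x + y = 4.
print(sols)
```

Note that $\partial h/\partial\lambda=0$ is exactly the constraint $g=0$, which is why no separate constraint handling is needed.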

It can be shown (though I will not!) that multiple constraints can be accounted for in a similar way, by adding additional terms to our function $h$. For instance, say we have two constraints, $g=0$ and $k=0$. Then we would extremise the function

$h=f-\lambda g-\mu k$

where $\mu$ is our new Lagrange multiplier.
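The two-constraint case can be sketched the same way, again with a hypothetical example: $f=x^2+y^2+z^2$ subject to $g=x+y-2=0$ and $k=z-1=0$.

```python
import sympy as sp

x, y, z, lam, mu = sp.symbols('x y z lambda mu')
f = x**2 + y**2 + z**2
g = x + y - 2                  # first constraint
k = z - 1                      # second constraint
h = f - lam * g - mu * k       # one multiplier per constraint

unknowns = (x, y, z, lam, mu)
sols = sp.solve([sp.diff(h, v) for v in unknowns], unknowns, dict=True)
# The single stationary point is x = y = z = 1, with lambda = mu = 2,
# the minimum of x^2 + y^2 + z^2 on the intersection of the two planes.
print(sols)
```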

How does this pertain to the Gibbs entropy?

The Gibbs entropy $S_G$ is a function of the $\Omega$ variables $\{p_\alpha\}$. We want to find the probabilities $\{p_\alpha\}$ that maximise $S_G$. So far, we've purported to know absolutely nothing about the system we're describing. In reality, we are not completely ignorant: we can measure certain macroscopic quantities, which manifest themselves as averages over all possible microstates. We can convey what we know about the system through constraints.

We will put a constraint to use in the next post.
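As a hedged preview (the next post does this properly), here is what the method gives when the only constraint on the Gibbs entropy $S_G=-\sum_\alpha p_\alpha\ln p_\alpha$ (taking $k_B=1$ for simplicity) is normalisation, $g=\sum_\alpha p_\alpha-1=0$:

```python
import math

# Setting each partial derivative of h = S_G - lambda * g to zero gives
#   -ln(p_a) - 1 - lambda = 0  =>  p_a = exp(-1 - lambda)  for every a,
# i.e. every microstate gets the same probability. Normalisation,
# Omega * exp(-1 - lambda) = 1, then fixes the multiplier:
Omega = 4
lam = math.log(Omega) - 1
p = math.exp(-1 - lam)         # the common probability of each microstate

print(p)  # ≈ 1 / Omega
```

With no information beyond normalisation, maximising $S_G$ assigns every microstate the same probability $1/\Omega$, recovering the microcanonical result.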