Statistical Mechanics – Interpreting the Gibbs Entropy

To a great extent, I’ve treated the Gibbs entropy as the means to an end. Maximising it gave us a set of state occupation probabilities \{p_\alpha\} with which we could say what the system did on average; that is, on a macroscopic level. At the end of the first post, I even gave a cowardly disclaimer saying that we know the Gibbs entropy, whatever it is, is maximal at equilibrium, so don’t ask any more questions! I realise that ‘the ends justify the means’ is largely unsatisfactory; at no point did I explicitly address what the Gibbs entropy means. Here’s the story so far.

We considered some system which could occupy any one of a set of microstates \{\alpha\}. The microstates of the system were characterised by the values taken by certain properties of the system in that state.

It was decided from the outset that we were to a great extent ignorant of the system; the exact microstate of the system could not realistically be known, so our description of the system would be inherently probabilistic. As such, properties of the system like its energy would be treated as randomly distributed variables.

We therefore introduced a set of numbers \{p_\alpha\}, defined such that p_\alpha was the probability of the system occupying microstate \alpha.

The task was to find a ‘reasonable’ way of assigning the probabilities \{p_\alpha\}, subject to information available to us. For example, knowledge about the system could be conveyed through measurements of the mean values of properties of the system, such as its energy:

\displaystyle \langle E\rangle=\sum_{\alpha} p_\alpha E_\alpha

Constraint equations of this form gave us tiny clues about what the fairest assignment of the \{p_\alpha\} might be.

The problem was that there were many assignments of the set \{p_\alpha\} consistent with this constraint equation; the probabilities were underconstrained. Systems often have infinitely many microstates available to them, so no matter how many constraint equations you collect, you will never have enough to uniquely determine \{p_\alpha\}.
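To make the underconstraint concrete, here is a minimal sketch. The three-level system and its energies are my own illustrative assumption, not something taken from the earlier posts. It lists three very different normalised assignments of \{p_\alpha\} that all give the same mean energy.

```python
import numpy as np

# Hypothetical three-level system (energies chosen for illustration only).
E = np.array([0.0, 1.0, 2.0])

# Three quite different, normalised assignments of {p_alpha}.
candidates = {
    "uniform": np.array([1 / 3, 1 / 3, 1 / 3]),
    "middle only": np.array([0.0, 1.0, 0.0]),
    "extremes only": np.array([0.5, 0.0, 0.5]),
}

# <E> = sum_alpha p_alpha E_alpha for each candidate.
mean_energies = {name: float(p @ E) for name, p in candidates.items()}

for name, mean in mean_energies.items():
    print(f"{name:14s} <E> = {mean:.3f}")
```

All three satisfy the constraint equally well, so the measurement of \langle E\rangle alone cannot choose between them.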

At this point I conjured a function called the Gibbs entropy, denoted by S_G. It was defined by the summation

\displaystyle S_G=-\sum_\alpha p_\alpha \ln p_\alpha

I claimed that the most reasonable assignment of the set of probabilities \{p_\alpha\} would be that which maximised this function, subject to constraints.
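As a rough illustration of what this function does, the sketch below evaluates S_G for three hypothetical distributions over levels with energies 0, 1 and 2 (my own choice of example), each of which has mean energy 1. The helper name `gibbs_entropy` is mine, and 0 ln 0 is taken to be 0.

```python
import numpy as np

def gibbs_entropy(p):
    """S_G = -sum_alpha p_alpha ln p_alpha, taking 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    # Adding 0.0 turns a possible -0.0 into a plain 0.0.
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) + 0.0)

# Hypothetical distributions over levels with energies 0, 1, 2; each has <E> = 1.
s_uniform = gibbs_entropy([1 / 3, 1 / 3, 1 / 3])
s_middle = gibbs_entropy([0.0, 1.0, 0.0])
s_extremes = gibbs_entropy([0.5, 0.0, 0.5])

print(f"uniform:       S_G = {s_uniform:.4f}")
print(f"middle only:   S_G = {s_middle:.4f}")
print(f"extremes only: S_G = {s_extremes:.4f}")
```

Of these three, the uniform assignment has the largest S_G, and the assignment that commits to a single state has S_G = 0.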

This was essentially a statement of the second law of thermodynamics; that, when a system is not acted on by external agency, the system will evolve such that the Gibbs entropy increases. Phrased another way, the equilibrium state of a system is that with maximal Gibbs entropy, subject to constraints.

But why is it that the Gibbs entropy should increase? What is it exactly that’s increasing?


We found for an isolated system (one about which nothing can be known), the equilibrium state is that in which all microstates are equiprobable:

\displaystyle p_\alpha=\frac{1}{\Omega}

where \Omega is the total number of microstates available to the system. This seemed to make sense. If nothing is known about a system, the only reasonable assignment of the probabilities is to say that all states are equiprobable.

Here, we can introduce the notion of quantifying ignorance. Exactly how ignorant are we of an isolated system? Well, the least informative thing you could tell anyone about the system is that it’s just as likely to be in one state as another. You are ‘maximally ignorant’ about an isolated system.

Since the \{p_\alpha\} for the isolated system were those which maximised the Gibbs entropy subject to no constraints at all, we have to conclude that this is the global maximum of the Gibbs entropy. That is, the Gibbs entropy is as big as it could ever be when we are maximally ignorant about the system. The maximum value assumed by S_G is

\displaystyle S_G^{max}=-\sum_\alpha \frac{1}{\Omega} \ln\Big(\frac{1}{\Omega}\Big)

S_G^{max}=\ln\Omega
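A quick numerical sanity check, under assumptions of my own: I take \Omega = 6 and compare the uniform distribution against a thousand randomly drawn normalised distributions. Each of the \Omega terms in the sum equals (1/\Omega)\ln\Omega, so the uniform value should be \ln\Omega, and none of the random draws should beat it.

```python
import numpy as np

def gibbs_entropy(p):
    """S_G = -sum_alpha p_alpha ln p_alpha, taking 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) + 0.0)

omega = 6  # hypothetical number of microstates
uniform = np.full(omega, 1.0 / omega)
s_max = gibbs_entropy(uniform)

# Random normalised distributions, drawn from a flat Dirichlet distribution.
rng = np.random.default_rng(0)
trials = rng.dirichlet(np.ones(omega), size=1000)
s_trials = np.array([gibbs_entropy(p) for p in trials])

print(f"uniform S_G      = {s_max:.6f}")
print(f"ln(omega)        = {np.log(omega):.6f}")
print(f"largest random S = {s_trials.max():.6f}")
```

This is evidence rather than proof, of course; it only samples a finite number of distributions.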


Now consider the probability distribution for the closed system. We found that, rather than all states being equiprobable,

\displaystyle p_\alpha (\beta)=\frac{e^{-\beta E_\alpha}}{\sum_\alpha e^{-\beta E_\alpha}}

where E_\alpha is the energy of the system in microstate \alpha and the thermodynamic \beta is the parameter we associated with the ‘coldness’ of the system.
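Here is a minimal sketch of how these probabilities might be computed. The function name `boltzmann` and the four energy levels are illustrative assumptions of mine. Subtracting the lowest energy before exponentiating does not change the ratio, but it keeps the exponentials from overflowing.

```python
import numpy as np

def boltzmann(E, beta):
    """p_alpha = exp(-beta E_alpha) / sum_alpha exp(-beta E_alpha)."""
    E = np.asarray(E, dtype=float)
    # Shifting by E.min() cancels between numerator and denominator.
    weights = np.exp(-beta * (E - E.min()))
    return weights / weights.sum()

E = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical energy levels

for beta in (0.0, 1.0, 5.0):
    print(f"beta = {beta:3.1f}  p = {np.round(boltzmann(E, beta), 4)}")
```

At \beta = 0 every state is equally likely, and as \beta grows the probability piles up in the low-energy states.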

Consider the limit in which the system’s temperature reaches absolute 0. The lowest energy level E_0, which can be taken to be 0 without loss of generality, has an occupation probability of

\displaystyle p_0 (\infty)=\frac{e^0}{e^0+e^{-\infty}+e^{-\infty}+...}

p_0 (\infty)=1

whereas all other states with E_\alpha>0 have occupation probability

\displaystyle p_\alpha (\infty)=\frac{e^{-\infty}}{e^0+e^{-\infty}+e^{-\infty}+...}

p_\alpha (\infty)=0

Hence the microstate of the system in this limit is known exactly – at absolute zero, the system must be in its lowest energy microstate. We have reached the case of ‘minimum ignorance’. Note also that in this limit, the Gibbs entropy is given by

\displaystyle S_G=-\sum_\alpha p_\alpha\ln p_\alpha

S_G=-1\cdot\ln 1-0\cdot\ln 0-0\cdot \ln 0-...

S_G=0

where we have used

\displaystyle \lim_{x\to 0^+}x\ln x=0

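Since the Gibbs entropy is a sum of non-negative terms (each -p_\alpha\ln p_\alpha\geq 0 because 0\leq p_\alpha\leq 1), it can never be negative, so when S_G reaches 0 it must be at its global minimum:

\displaystyle S_G^{min}=0
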
That is, the Gibbs entropy is as small as it could ever be when we are minimally ignorant about the system.
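The cold limit can also be watched numerically. The sketch below assumes a hypothetical four-level system with a non-degenerate ground state at E_0 = 0, and tracks the ground-state probability and S_G as \beta grows; the helper names are mine.

```python
import numpy as np

def boltzmann(E, beta):
    """Canonical probabilities for energies E at coldness beta."""
    E = np.asarray(E, dtype=float)
    weights = np.exp(-beta * (E - E.min()))
    return weights / weights.sum()

def gibbs_entropy(p):
    """S_G = -sum_alpha p_alpha ln p_alpha, taking 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) + 0.0)

E = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical levels, ground state E_0 = 0
betas = [0.0, 1.0, 5.0, 50.0]

ground_probs = [float(boltzmann(E, b)[0]) for b in betas]
entropies = [gibbs_entropy(boltzmann(E, b)) for b in betas]

for b, p0, s in zip(betas, ground_probs, entropies):
    print(f"beta = {b:5.1f}  p_0 = {p0:.6f}  S_G = {s:.3e}")
```

If the ground state were degenerate, the probability would instead be shared among the lowest states and S_G would tend to the logarithm of the degeneracy rather than to 0.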

So we note that

  • the Gibbs entropy reaches a global maximum when we are maximally ignorant
  • the Gibbs entropy reaches a global minimum when we are minimally ignorant

Therefore the Gibbs entropy has a strong correlation with our lack of knowledge of the system’s microstate.

To add weight to this argument, consider how the Gibbs entropy of a closed system varies with temperature:

\displaystyle \frac{\partial S_G}{\partial\beta}=\frac{\partial}{\partial \beta}(\beta U+\ln Z)

where U is the system’s mean energy and Z is its partition function, defined by

\displaystyle Z=\sum_\alpha e^{-\beta E_\alpha}

Using the product and chain rules appropriately,

\displaystyle \frac{\partial S_G}{\partial \beta}=U+\beta \frac{\partial U}{\partial \beta}+\frac{\partial \ln Z}{\partial \beta}

We learnt in a previous post that the third term will actually cancel with the first, hence
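For reference, that cancellation can be read off directly from the definition of Z, using p_\alpha=e^{-\beta E_\alpha}/Z and U=\sum_\alpha p_\alpha E_\alpha:

\displaystyle \frac{\partial \ln Z}{\partial \beta}=\frac{1}{Z}\frac{\partial Z}{\partial \beta}=\frac{1}{Z}\sum_\alpha (-E_\alpha) e^{-\beta E_\alpha}

\displaystyle \frac{\partial \ln Z}{\partial \beta}=-\sum_\alpha p_\alpha E_\alpha=-U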

\displaystyle \frac{\partial S_G}{\partial \beta}=\beta\frac{\partial U}{\partial \beta}

for the closed system. In the last post, we evaluated the derivative of the system’s mean energy with respect to \beta, and the answer tells us

\displaystyle \frac{\partial S_G}{\partial \beta}=-\beta\text{Var}(E)

where \text{Var}(E) is the variance of the system’s energy. So, assuming \beta is positive,

\displaystyle \frac{\partial S_G}{\partial \beta}<0

Hence the Gibbs entropy of a system increases monotonically with decreasing ‘coldness’. This means that, as the system evolves from a regime of total knowledge to total ignorance, the Gibbs entropy increases steadily with no wiggles or discontinuities. We might now be unashamed to say: the Gibbs entropy is a measure of the extent to which the microstate of the system is uncertain.
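If you'd like to check the slope formula rather than take it on trust, here is a rough finite-difference sketch. The five energy levels, the value of \beta and the step size h are all arbitrary choices of mine. It compares a numerical derivative of S_G with -\beta\text{Var}(E), and confirms that S_G falls steadily as \beta rises.

```python
import numpy as np

def boltzmann(E, beta):
    """Canonical probabilities for energies E at coldness beta."""
    E = np.asarray(E, dtype=float)
    weights = np.exp(-beta * (E - E.min()))
    return weights / weights.sum()

def gibbs_entropy(p):
    """S_G = -sum_alpha p_alpha ln p_alpha, taking 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) + 0.0)

E = np.array([0.0, 0.5, 1.3, 2.0, 3.7])  # hypothetical energy levels
beta, h = 0.8, 1e-5

# Central finite difference for dS_G/dbeta.
numerical = (gibbs_entropy(boltzmann(E, beta + h))
             - gibbs_entropy(boltzmann(E, beta - h))) / (2 * h)

# Prediction from the text: -beta * Var(E).
p = boltzmann(E, beta)
variance = float(p @ E**2 - (p @ E) ** 2)
predicted = -beta * variance

# Entropy sampled over a range of positive beta values.
beta_grid = np.linspace(0.1, 5.0, 50)
s_grid = np.array([gibbs_entropy(boltzmann(E, b)) for b in beta_grid])

print(f"numerical dS/dbeta = {numerical:.8f}")
print(f"-beta * Var(E)     = {predicted:.8f}")
```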


The second law of thermodynamics therefore states, in some sense, that as a system evolves towards equilibrium, we become less and less certain of its microstate. I hope you will agree that this is a plausible premise.

As a specific example, take a look at the system shown in the animation below. It shows a collection of 100 identical circular bodies undergoing elastic collisions in two dimensions. Initially, only one particle in the system is moving, and at lightning speed; every other particle is at rest.

Consider first the behaviour of each particle. Each particle in this super-system can be considered a little closed system of its own. The microstate of each particle is characterised by its speed; that is, the words ‘microstate’ and ‘speed’ are, in this discussion, synonymous.

The graph in the top-right shows the instantaneous distribution of speeds of the particles. The graph below it shows the time-averaged distribution of speeds, which we can interpret as the probability distribution of each particle’s microstates.

Initially the probability distribution has an extremely sharp peak at the zero-speed microstate, since all but one of the particles are stationary. Each particle is all but certain to be found in the zero-speed microstate. So each particle is essentially at absolute zero – the initial state of each particle is of low entropy.

As the system evolves, energy is inevitably dissipated throughout the system, and the instantaneous distribution of speeds begins to fluctuate crazily. You can watch as its average, the microstate probability distribution, broadens, the initially sharp peak of the zero-speed microstate softening. Each particle quickly gains access to a large range of its microstates. This broadening of the probability distribution yields an ever-increasing value of S_G – the final, equilibrium state of each particle is of high entropy.
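The animation itself can't be reproduced here, but the broadening can be imitated with a much cruder toy model. To be clear, this is a swapped-in stand-in of my own and not the simulation shown: instead of discs colliding in two dimensions, randomly chosen pairs of particles pool their energy and split it at random. I then bin the particle energies (bins of width 0.5, an arbitrary choice) and compute S_G of the resulting histogram before and after.

```python
import numpy as np

def gibbs_entropy(p):
    """S_G = -sum_alpha p_alpha ln p_alpha, taking 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) + 0.0)

rng = np.random.default_rng(1)
n_particles = 100
total_energy = 100.0  # arbitrary units

# Initially one particle carries all of the energy; the rest are at rest.
energy = np.zeros(n_particles)
energy[0] = total_energy

bins = np.linspace(0.0, total_energy, 201)  # energy bins of width 0.5

def histogram_entropy(energy):
    """S_G of the binned distribution of particle energies."""
    counts, _ = np.histogram(energy, bins=bins)
    return gibbs_entropy(counts / counts.sum())

s_initial = histogram_entropy(energy)

# Toy 'collisions': a random pair pools its energy and splits it at random.
for _ in range(5000):
    i, j = rng.choice(n_particles, size=2, replace=False)
    pooled = energy[i] + energy[j]
    share = rng.random()
    energy[i], energy[j] = share * pooled, (1.0 - share) * pooled

s_final = histogram_entropy(energy)

print(f"initial S_G = {s_initial:.3f}")
print(f"final   S_G = {s_final:.3f}")
```

Each exchange conserves energy, yet the histogram spreads out from a single sharp spike and its entropy rises, which is the same qualitative behaviour as in the animation.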

What about the system as a whole, the collection of particles? Recall we established that the Gibbs entropy is additive; the entropy of a composite system is the sum of the entropies of its constituent parts. So as the entropy of each single-particle subsystem increases, so too must the entropy of the entire system.
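Additivity is easy to verify for a small case, with one caveat worth stating loudly: the sketch below assumes the two subsystems are statistically independent, so that the joint probability is just the product of the individual ones. The two distributions are made up for illustration.

```python
import numpy as np

def gibbs_entropy(p):
    """S_G = -sum_alpha p_alpha ln p_alpha, taking 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) + 0.0)

# Hypothetical microstate distributions for two independent subsystems.
p_a = np.array([0.7, 0.2, 0.1])
p_b = np.array([0.5, 0.5])

# Independence assumption: p(alpha, alpha') = p_a(alpha) * p_b(alpha').
joint = np.outer(p_a, p_b).ravel()

s_a = gibbs_entropy(p_a)
s_b = gibbs_entropy(p_b)
s_joint = gibbs_entropy(joint)

print(f"S_a + S_b = {s_a + s_b:.6f}")
print(f"S_joint   = {s_joint:.6f}")
```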

The simulation above is an example of empirical evidence. We can watch as the system evolves from a state in which energy is localised to a single particle to one in which it is spread throughout the system’s entirety. This process inevitably leads to a loss of knowledge about the system. Initially, we were certain that all but one of the particles were stationary, stuck resolutely in a single microstate. But as each particle comes to equilibrium with its environment (the other particles), it gains access to many of its microstates. It might be moving very slowly, or very quickly, or (most likely) somewhere in-between.

If you’re interested, the function the lower graph is tending towards is the 2-dimensional Maxwellian distribution. It looks a little different to the ‘traditional’ Maxwellian, which predicts the distribution of speeds of a familiar, three-dimensional gas, on account of its non-zero gradient at the origin. Other than that, their forms are similar.
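For comparison, here are the two speed distributions written out. The particle mass m is a symbol I'm introducing for this aside, and \beta is the same coldness as before. In two dimensions,

\displaystyle f_{2D}(v)=m\beta\, v\, e^{-\beta m v^2/2}

while in three dimensions,

\displaystyle f_{3D}(v)=4\pi\Big(\frac{m\beta}{2\pi}\Big)^{3/2} v^2\, e^{-\beta m v^2/2}

Both are normalised over 0\leq v<\infty. Differentiating at the origin gives f_{2D}'(0)=m\beta, which is non-zero, whereas f_{3D}'(0)=0 because of the extra factor of v.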

Here’s the spooky bit. Take a look at this animation. There is nothing in the laws of physics that prevents this process from happening. Every collision you see here obeys the conservation of energy and of linear momentum. But by chance, the particles conspire to donate all of their energy to one lucky particle, and in doing so eerily come to a halt. The system has evolved from a state of high entropy to one of low entropy spontaneously.

It’s for this reason that the second law of thermodynamics is often subtly amended to say: the (Gibbs) entropy of a system ‘almost always’ increases towards a maximal value. The previous animation shows that it’s possible for a system, even one close to equilibrium, to fluctuate in such a way that its entropy spontaneously decreases, if only for a short time. The point is that it’s incredibly unlikely to happen (incredibly not being a big enough word).


Finally, it’s important not to forget the expression given to us by the first law of thermodynamics, which says that the change in Gibbs entropy of a system is

\displaystyle dS_G=\beta\, dQ

where dQ is the energy added to the system as heat. By heating the system, we increase its mean energy and hence ‘unlock’ its higher-energy microstates. This allows the system to occupy a greater number of its microstates with appreciable probability, broadening the microstate probability distribution. Hence the certainty with which we can specify the system’s microstate is reduced; the system’s entropy has increased.
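Here is a small numerical sketch of that statement, under assumptions I should spell out: the energy levels (made up for the example) are held fixed so that no work is done and dQ equals the change in mean energy, and the relation is taken in the dimensionless form dS_G = \beta\, dQ. Heating is modelled as a slight decrease in \beta.

```python
import numpy as np

def boltzmann(E, beta):
    """Canonical probabilities for energies E at coldness beta."""
    E = np.asarray(E, dtype=float)
    weights = np.exp(-beta * (E - E.min()))
    return weights / weights.sum()

def gibbs_entropy(p):
    """S_G = -sum_alpha p_alpha ln p_alpha, taking 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return float(-np.sum(p[nonzero] * np.log(p[nonzero])) + 0.0)

E = np.array([0.0, 0.5, 1.3, 2.0, 3.7])  # hypothetical, fixed energy levels

# Heating the system slightly lowers its coldness.
beta_cold, beta_hot = 1.0005, 0.9995
p_cold, p_hot = boltzmann(E, beta_cold), boltzmann(E, beta_hot)

# With the levels fixed no work is done, so the heat added is the change in U.
dQ = float(p_hot @ E - p_cold @ E)
dS = gibbs_entropy(p_hot) - gibbs_entropy(p_cold)
beta_mid = 0.5 * (beta_cold + beta_hot)

print(f"dQ          = {dQ:.6e}")
print(f"dS_G        = {dS:.6e}")
print(f"beta * dQ   = {beta_mid * dQ:.6e}")
```

The added heat is positive, the entropy change is positive, and the two agree once the heat is multiplied by \beta.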


Really, this post should have come first. It is much more satisfactory to define a function which quantifies our ignorance of the system, take its tendency to increase as axiomatic and construct statistical mechanics from there. Instead, I introduced the Gibbs entropy in an artificial way and took its maximisation as read on the basis of evidence (and ‘fairness’).

Maximisation of the Gibbs entropy is not the standard approach in ‘deriving’ statistical mechanics; instead, the principle of indifference (the principle that all microstates of an isolated system are equiprobable) is usually taken to be axiomatic. The equations describing the canonical ensemble can be constructed by considering some small part of an isolated system using this postulate.

When we look at some real systems, we’ll revisit the Gibbs entropy; entropy is better understood in context. Our first example system will be one of the simplest imaginable – a collection of identical particles, each of which can access two distinct energy levels.
