Control, Perception, and Entropy--a tutorial

This tutorial takes off from an article in Physics Today, Sept 1993, pp. 32-38, "Boltzmann's entropy and time's arrow" by Joel L. Lebowitz. It was originally posted to CSG-L (now CSGnet), the mailing list of the Control Systems Group, which is devoted to the study of W. T. Powers's theory of Perceptual Control Systems.

Boltzmann's view of entropy is one of three classical views, the others being those of Clausius and of Gibbs. Clausius entropy applies only to macroscopic systems in equilibrium, and Gibbs entropy to a statistical ensemble of microstates corresponding to a specific macrostate, whereas Boltzmann entropy applies to any describable system. Boltzmann entropy agrees with the other two when they are applicable, and is therefore a more general concept of entropy than the others, which I will ignore in the rest of this posting. I will give an introduction to the Boltzmann view of entropy as I understand it from the Lebowitz article, and then suggest how this implies that a control system is a cooling system, and why the energy received from or supplied to the environment by the perceiving side of the control system must be small compared to the energy involved in the disturbance and the output side of the control system. I shall also argue that there is a natural relationship between Boltzmann entropy and Shannon information, beyond the formal similarity of the equations that describe them.


Where does the concept of "entropy" apply?

Entropy is a measure of disorder or lack of structure. Many people understand that entropy tends to increase over time, but are not aware that this is true only in a closed system. In an open system that can import energy from and export energy to its environment, entropy can increase or decrease, and structure can be maintained indefinitely. An increase in entropy means that the system becomes less ordered or structured. Any system has a maximum entropy, beyond which it cannot increase. In common language, in a system at maximum entropy, everything is random. Wherever there is structure, entropy is less than it might be for that system.

Entropy is DEFINED over a DELIMITED system, not a closed one. A closed system is one in which the elements do not interact with anything outside the system, whereas a delimited system consists of a well defined set of elements, which may interact in any way with the world outside the system provided they do not lose their identity. A delimited system may be open or closed. It's all perception, so the interests of the observer do matter, in agreement with the comment in a message that induced me to write the original posting from which this is derived: "To me, order/disorder is a purely subjective concept." It is, and moreover, it depends on the perceptual functions used to observe the system. This is why I commend the Lebowitz paper as a way of seeing how physical entropy might relate to communicative information.

Entropy is a measure that has a value at an instant in time, and it can be defined over limited parts of structures that are intimately intertwined with other parts of the same structure, such as the electrons in a metal, or the radiation field in a gas. If there are two equal-sized defined systems of the same degree of disorder, the entropy of the two lumped together is twice the entropy of one. Entropy can decrease only if the system exports some to the world outside, increasing the ordering of the system at the expense of disordering the world outside. This can be done only if there is a non-equilibrium energy flow through the open system under consideration, from the world outside to a different part of the world outside.

A living control system is just such an open system, getting its energy ultimately from the sun and depositing its entropy in the form of less organized waste products. The business of a living control system is to create and maintain structure--to keep entropy less than its maximum within the living body--and it can do this for as long as it can maintain an energy flow through itself. A living system maintains its entropy at a more or less constant level, lower than would be the case for its components if it were not alive. When a living organism dies, its components decay and are scattered around the world, raising their entropy to a level consistent with that of the world as a whole.

A describable system: 2 balls in a box

A describable system, which any physical system is, can be described in terms of a phase space. The dimensions of the phase space are those variables whose values make a difference to the aspects of the system we are interested in. One way of looking at a system constructed of simple elastically interacting balls (an idealized gas, in other words) is to record the location and velocity of each ball in space, a 6-dimensional vector. This 6-vector can be represented by a point in a 6-D space, which is the phase space for that ball. For a gas of N balls, the phase space has 6N dimensions. If the ball constitutes a "closed system" in the sense of no energy transfer, the velocity components of this vector will not change as time goes on, at least until the ball bounces off a wall of the closed container, if one exists.

If there are walls, any such bounces do not cause the ball to gain or lose energy. All that happens is that the velocity component perpendicular to the wall reverses its sign, and the point moves to a new point in the velocity subspace at the same distance from the origin. The ball's energy is determined by its location in the 3-D velocity subspace--in fact by the ball's radial distance from the origin in that subspace, which does not change in the bounce, by definition of "closed system".
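As a rough sketch of this bookkeeping (Python, not part of the original posting; the numbers are arbitrary), the state of a single ball can be held as a 6-component vector of position and velocity, and a bounce off a wall modelled as a sign reversal of the velocity component perpendicular to that wall:

    import numpy as np

    # State of one ball: 3 position components followed by 3 velocity components.
    state = np.array([0.2, 0.5, 0.1,   1.0, -2.0, 0.5])

    def bounce_off_wall(state, axis):
        """Elastic bounce off a wall perpendicular to the given axis (0, 1, or 2):
        the corresponding velocity component reverses sign; position is unchanged."""
        new_state = state.copy()
        new_state[3 + axis] = -new_state[3 + axis]
        return new_state

    def speed(state):
        """Radial distance from the origin of the 3-D velocity subspace."""
        return np.linalg.norm(state[3:])

    print(speed(state), speed(bounce_off_wall(state, axis=1)))
    # identical: the bounce moves the phase point around a sphere of constant
    # radius in the velocity subspace, so the energy is unchanged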

In a gravitational field, or if the ball were magnetic and the box permeated by a non-uniform magnetic field, the energy would also depend on the location. For a given energy, the radius of the sphere in the velocity subspace would depend on the location of the phase point in the 3-D position subspace. The ball would slow or speed up depending on where in the box it happened to be. But we will ignore such complications for the time being. Trivial, so far.

Now let's add another ball, making a 2-molecule ideal gas. The phase space now has 12 dimensions. Ignoring position-dependent fields, the total energy in the system is still represented by the radial distance of the point from the origin in the (now 6-D) velocity subspace. What happens when the balls meet? They bounce off each other, losing no energy overall, though each of the six velocity components changes, and the system moves to a new point in the 12-D phase space. Since the system as a whole is still closed, no energy is gained or lost in the collision: the point in the 6-D velocity subspace moves to a new position at the same distance from the origin, and stays there until the balls bounce off each other or off a wall again.
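Here is a minimal sketch of such a collision (Python; it assumes unit-mass balls and an arbitrary line of centres, and is only an illustration, not one of the Java examples). For equal masses, an elastic collision exchanges the velocity components along the line joining the centres, and the radial distance of the phase point from the origin of the 6-D velocity subspace is unchanged:

    import numpy as np

    def collide(v1, v2, centre_line):
        """Elastic collision of two equal-mass balls: the velocity components
        along the unit vector joining their centres are exchanged."""
        n = centre_line / np.linalg.norm(centre_line)
        u1, u2 = np.dot(v1, n), np.dot(v2, n)
        return v1 + (u2 - u1) * n, v2 + (u1 - u2) * n

    v1 = np.array([1.0, 0.0, 0.0])
    v2 = np.array([-0.5, 2.0, 0.0])
    w1, w2 = collide(v1, v2, centre_line=np.array([1.0, 1.0, 0.0]))

    radius_before = np.sqrt(np.sum(v1**2) + np.sum(v2**2))
    radius_after  = np.sqrt(np.sum(w1**2) + np.sum(w2**2))
    print(radius_before, radius_after)   # equal: the phase point stays on the same shell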

In the bounce, with high likelihood, vector components much larger than average will be reduced, and components much smaller than average will be enhanced, so the location of the point in phase space is most likely to be found not very close to the axes of the velocity subspace. More typically, the individual velocity vector lengths will be distributed around some intermediate value. The top-left Java example shows such a 2-ball space in 2 dimensions.
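A quick Monte Carlo check of that claim for the 2-ball case (a Python sketch, not one of the Java examples; the sample size is arbitrary): sampling phase points uniformly over the constant-energy shell of the 6-D velocity subspace, only a small fraction of them put nearly all of the energy in one ball, i.e. lie close to a 3-D axis subspace:

    import numpy as np

    rng = np.random.default_rng(0)

    # Points uniformly distributed on the unit sphere of the 6-D velocity
    # subspace (normalised Gaussian vectors): 2-ball states of equal total energy.
    V = rng.normal(size=(100_000, 6))
    V /= np.linalg.norm(V, axis=1, keepdims=True)

    # Fraction of the total energy carried by ball 1 (its three components).
    share = np.sum(V[:, :3]**2, axis=1)

    # How often is ball 1 nearly stopped, carrying under 1% of the energy?
    print(np.mean(share < 0.01))   # a small fraction: possible, but not typical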

Can there be order and disorder in such a 2-ball system? It depends on how you choose to look at the system. There might be something special about a state in which one ball was stationary in the centre of the box, all the energy being concentrated in the other ball. In the phase space, such a situation would be represented by a point that lies in a particular 3-D subspace of the whole 6-D velocity space. There might be something special about a state in which the two balls travelled together as a pair, in which case the three velocity components for one ball would be defined by those of the other ball. Measure one, and you know them both. Or there might be something special about a state in which the balls exactly mirrored each other's motion (as they would if they were equal gravitational masses in outer space).

The particular subspace is defined by the observer, not by the momentary behaviour of the balls. Each particular tightly defined subspace is untypical, in Boltzmann's sense, and in an everyday psychological sense. Even though any precise position for the system's phase point is as probable as any other position in the phase space, almost all of the equiprobable points do not lie in or very close to any predefined position. It is much like a bridge hand in which the deal gives each player 13 cards of one suit. That hand is no less probable than any other, but we see it as untypical because we have previously labelled the cards so that we perceive certain relations to exist among them. An observer is likely to ponder the possibility that the dealer might have cheated.

It is more probable that a bridge deal will be "typical" in having each player receive cards of at least three suits. There are only 24 different deals in which the four players each hold all 13 cards of one suit, but vastly more deals in which each player has cards of at least three suits. The deal is "typical" because it is a member of the large class rather than of the small class. For the two-ball system, it is most probable (typical) that its phase point will be in a region of the phase space that represents both balls as moving. A state in which one ball is nearly stopped and the other carries all the energy can happen, but it is not typical.
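The disparity is easy to make concrete (a Python calculation added here for illustration): the number of pure-suit deals is just the number of ways of assigning the four suits to the four players, 4! = 24, out of 52!/(13!)^4 possible deals:

    from math import comb, factorial

    # Total number of distinct bridge deals: choose 13 cards for each player in turn
    # (the fourth player simply gets the remaining 13 cards).
    total_deals = comb(52, 13) * comb(39, 13) * comb(26, 13)

    # Deals in which every player holds all 13 cards of a single suit:
    # the number of ways to assign the four suits to the four players.
    pure_suit_deals = factorial(4)

    print(f"total deals:     {total_deals:.3e}")
    print(f"pure-suit deals: {pure_suit_deals}")
    print(f"fraction:        {pure_suit_deals / total_deals:.3e}")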

Another aspect of typicality in the ideal gas (which bears on the problem of time-asymmetry) is that if the phase point defined by (L, V) is typical, then the phase point defined by (L, -V) is also typical (L and V are the location and velocity vectors defining the position of the point in the location and velocity subspaces). A typical situation does not become less typical if all of the balls bounce off the wall. The same applies if any subset of the V vector components has a sign reversal. Replacing the phase point (L, V) by (L, -V) is to reverse time. A brief snapshot of the gas described by the two different phase points would show no characteristic difference between them. The difference is in the detail of which ball or molecule is going in which direction, but this ordinarily does not matter.

What does matter is that if the gas is in a small atypical region of the phase space, collisions between the balls are more likely to move the phase point into a typical region than the reverse. It is more likely that, when the cards are shuffled and dealt again after a pure-suit deal, each hand will receive cards of at least three suits than that the reverse will happen. Similarly, if the gas is in an atypical region of the phase space at time t, it will most probably be in a more typical region at time t+delta t, and have come from a less typical region at time t-delta t.

If all the velocity vectors were replaced by their inverses, the gas would revert to its less typical prior state. This is most improbable, even in the two-ball "gas." Although the state (L, V) as observed and its mirror image (L, -V) are of identical typicality, the two microstates are not equally likely to occur in practice. One changes over time from less typical to more typical states, and the other becomes less and less typical over time. However, a casual observer shown a brief snapshot of the two could not tell the difference between them.
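A small numerical illustration of that last point (Python; the number of balls and the velocity distribution are arbitrary assumptions): reversing all the velocities changes the microstate, but nothing measurable from a single snapshot (the energy, the positions, the distribution of speeds) is any different:

    import numpy as np

    rng = np.random.default_rng(1)

    # A snapshot of N unit-mass balls: positions L and velocities V.
    N = 1000
    L = rng.uniform(0.0, 1.0, size=(N, 3))
    V = rng.normal(0.0, 1.0, size=(N, 3))

    # Time reversal replaces the phase point (L, V) by (L, -V).
    V_reversed = -V

    # The shell radius in the velocity subspace is identical...
    print(np.sqrt(np.sum(V**2)), np.sqrt(np.sum(V_reversed**2)))

    # ...and so is the distribution of speeds.
    print(np.allclose(np.sort(np.linalg.norm(V, axis=1)),
                      np.sort(np.linalg.norm(V_reversed, axis=1))))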

More complex: 2 balls in a swarm of others

Notice that none of the above description requires that there be a box confining the balls. If there is no box, then there will be only one bounce of the two balls against each other, or none, and the situation is less interesting. Let's remove the box, but add a lot more balls that individually do not interest us. Call them "nondescript" because we will not describe them within the phase space. They may, however, interact with the two balls that interest us. We will still look only at the original two balls, and the position of their phase point in their 12-D phase space. Now the balls can encounter each other OR any of the nondescript other balls.

The phase space description of the two balls is unchanged, but the behaviour of the phase point is different. If the two balls bounce off one another, the point moves to another position at the same distance from the origin in the velocity subspace. But if they bounce off a nondescript ball, the radius of the shell in which they live may change. If the nondescript ball is moving very fast, the speed of the interesting ball will probably be increased in the collision. The phase point of the 2-ball system will move further from the origin in the 6-D velocity subspace, and the open 2-ball system will have gained energy. Conversely, if the nondescript ball happens to be moving very slowly, the interesting ball is likely to lose speed and the nondescript ball to gain it. The remaining three Java examples show the two balls in spaces containing various numbers of other balls.
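A one-collision sketch of such an energy gain (Python; the velocities and the line of centres are made-up numbers): a slow interesting ball meets a fast nondescript ball, and the interesting ball, considered as part of the open 2-ball system, ends up farther from the origin of the velocity subspace:

    import numpy as np

    def collide(v1, v2, centre_line):
        """Elastic collision of two equal-mass balls (exchange of the velocity
        components along the line joining their centres)."""
        n = centre_line / np.linalg.norm(centre_line)
        u1, u2 = np.dot(v1, n), np.dot(v2, n)
        return v1 + (u2 - u1) * n, v2 + (u1 - u2) * n

    interesting = np.array([0.3, 0.2, 0.0])     # a slow interesting ball
    nondescript = np.array([4.0, -1.0, 0.5])    # a fast environmental ball

    speed_before = np.linalg.norm(interesting)
    interesting_after, nondescript_after = collide(interesting, nondescript,
                                                   centre_line=np.array([1.0, 0.0, 0.0]))
    print(speed_before, np.linalg.norm(interesting_after))
    # larger afterwards: the open system of interesting balls has gained energy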

Temperature

Temperature is proportional to the mean energy per degree of freedom in a system. In our 2-ball system, the total energy is equal to the sum of the energies associated with each of the velocity components, or 0.5*sum(vi^2), where vi is the velocity on axis i and the balls are assumed to be of unit mass. The temperature, T, of the system is not affected by how many balls are in the gas, only by this average, and so we can write mean(0.5*vi^2) = kT/2; in other words, the energy per degree of freedom in the system is kT/2. The proportionality constant k is known as "Boltzmann's constant."

In typical regions of the phase space, the velocity vectors are distributed more or less evenly around some intermediate value, very few being either very large or very small. When one of the interesting balls collides with a faster ball from the environment, it gains energy from the collision. This increases the temperature of the system of interesting balls, and decreases the temperature of the system of environmental balls. When an interesting ball collides with a slower environmental ball, conversely, the temperature of the interesting system is lowered and that of the environment raised. The radius of the sphere centred on the zero-velocity origin of the velocity subspace is sqrt(sum(vi^2)), which means that the temperature of the system is proportional to the square of the length of the velocity component of the phase vector, divided by the dimensionality of that part of the phase space.
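As a rough numerical sketch of this (Python, with k set to 1 and arbitrary starting velocities), the temperature of a set of balls can be estimated from the mean squared velocity component, and a collision that pushes energy into the set raises that estimate:

    import numpy as np

    k = 1.0   # Boltzmann's constant, set to 1 in these arbitrary units

    def temperature(V):
        """Temperature from the mean energy per degree of freedom:
        mean(0.5*vi^2) = k*T/2, so T = mean(vi^2)/k for unit-mass balls."""
        return np.mean(V**2) / k

    rng = np.random.default_rng(0)
    V = rng.normal(0.0, 1.0, size=(2, 3))   # the two interesting balls
    print(temperature(V))

    V[0] *= 2.0                             # a collision with a fast environmental ball
    print(temperature(V))                   # boosts one ball: the 2-ball system is hotter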


Typical and atypical description states: the measurement of entropy

Now let's add a lot of interesting balls, bringing the set to size N. The phase space for this set now has 6N dimensions. Again, we don't care whether there are other uninteresting balls with which they interact. The phase space has typical and untypical regions, defined by the observer (exactly as any CEV is defined by the perceptual functions of the observer). One thing to note is as before: if a point is in a typical region, it will almost never move into an untypical region as a consequence of reversing some or all of its components in the velocity subspace.

To see that the typicality of a region of the phase space depends on the observer, imagine an observer looking at the phase space plot through the N-dimensional equivalent of a scrambled fibre-optic pipe. In a fibre-optic pipe, a large number of glass fibres are placed together so that at each end of the pipe their ends form a single plane onto which a pattern can be focussed. If the alignment of the fibres were the same at both ends, a pattern focussed onto one end of the pipe would be seen as the same pattern glowing on the surface at the other end of the pipe. But if the fibres are scrambled, what is seen at the other end is a random hodge-podge of light and dark. There is, however, a configuration of what looks like random dots that, when entered at the front end, emerges as a straight line at the back end.

Patterns of random dots are "typical" of dot patterns in general, in that almost all such patterns of dots look alike; but a straight-line pattern of dots is untypical, there being relatively few ways that the dots can be so arranged. A random procedure for locating the dots could produce a straight line, but it would be most unlikely to do so. A "random" pattern at one end of the pipe that produces a straight line at the other is a most untypical random pattern, and the observer who saw the line would so assess it.

Returning to the system of balls, the observer's definition of what constitutes a particular "atypical" structure is like the scrambling of the fibres in the fibre-optic pipe. The pipe may be straight through, and the observer may define a "natural" atypical configuration, such as that half the balls have zero velocity and are arranged in a repeated regular pattern of locations (a crystal), while the rest of the balls have high velocity and are not specifically related in their locations. Or the observer may define some arbitrary pattern that another observer would see as quite typical.

Each point in the phase space defines a possible "microstate" of the space. Any volume of the phase space defines a "macrostate." Any configuration with a phase point that the observer saw as conforming to the arbitrary pattern would belong to the macrostate for the pattern, and any point that did not so conform would belong to the larger "typical" macrostate of the phase space.

Boltzmann's entropy is concerned with the volume of the phase space that might be considered typical of the present state of the system. For example, in the 2-ball case in which one ball was motionless or nearly so, it would presumably not matter which of the balls was stopped and which was moving, so the region of phase space typical of the "one-stopped" state would at least include two thin slices near the two 3-D subspaces of the 6-D velocity space which represent one of the balls as stopped. In the Physics Today article, the "one-stopped" state would be an example of a "macrostate," whereas the specific phase point associated with one such condition would be a "microstate."

To be accurate, it should be noted that macrostates do not normally have discrete boundaries. It would be more proper to treat a microstate not as being "inside" or "outside" a specified macrostate, but as having a specified fuzzy set membership in the macrostate. The "volume" of the macrostate would then be the integral over the phase space of the membership function of the microstates. But we will continue to treat this integral as if it were the volume of a well-defined region of the phase space.

The value of Boltzmann's entropy is proportional to the log of the volume of the region typical of the current state. It does not depend on whether the system is open or closed (whether there are "nondescript" balls, in the ideal gas example). It does depend on the system being delimited (e.g., in the ideal gas example, knowing which balls are being described). If the phase space is continuous, there may not be a natural scale of measurement for the volume of the typicality region, but if it is discrete, the natural unit is the volume of one discrete cell. However, since the value of the entropy is a logarithmic function of the size of the typicality region, changing the scale unit only adds or subtracts a fixed quantity from all entropies measured using that unit. It makes no difference when we consider changes in the entropy of a defined system, as is usually the case.
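In the plain-text notation used above, this says S = k*log(W), where W is the volume of the region typical of the current state, measured in whatever cells we choose. A small sketch (Python; the volumes are illustrative numbers only) shows the two properties just claimed, and the additivity mentioned earlier: changing the cell size only shifts every entropy by the same constant, so differences are unaffected, and the typicality volumes of two independent systems multiply, so their entropies add:

    import math

    k = 1.0   # Boltzmann's constant in arbitrary units

    def boltzmann_entropy(volume, cell=1.0):
        """Entropy proportional to the log of the typicality volume, in units of one cell."""
        return k * math.log(volume / cell)

    W = 1e12   # volume typical of the current state (illustrative)

    # Changing the measurement unit shifts all entropies by the same constant...
    print(boltzmann_entropy(W, cell=1.0), boltzmann_entropy(W, cell=0.1))
    # ...so entropy differences do not depend on the unit.
    print(boltzmann_entropy(W, cell=1.0) - boltzmann_entropy(1e9, cell=1.0))
    print(boltzmann_entropy(W, cell=0.1) - boltzmann_entropy(1e9, cell=0.1))

    # Two independent systems of the same disorder: volumes multiply, entropies add.
    print(boltzmann_entropy(W * W), 2 * boltzmann_entropy(W))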

If the system is closed, the region of phase space available to it is limited to a shell of constant radius in the velocity subspace. That shell has a volume, the log of which represents the maximum entropy possible for the system (in a high-dimensional system, nontypical subspaces have a vanishingly small total contribution to the volume). If the system is open, the shell to which its phase point is confined can expand or contract, depending on whether energy is on balance transferred into or out of the system. As the shell radius changes, so does the maximum entropy possible for the system, and if (as is highly probable) the system is in a state of near-maximum entropy, its entropy will increase as it gains energy (gets hotter).
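A rough way to see why (a Python sketch; it assumes unit-mass balls and ignores the exact sphere-volume constants, which cancel in entropy differences): the shell radius is R = sqrt(sum(vi^2)) = sqrt(2E), and the volume of the shell in the 3N-dimensional velocity subspace grows roughly as R^(3N), so the maximum possible entropy grows with the energy:

    import math

    def max_entropy(E, N, k=1.0):
        """Log of the energy shell's volume in the 3N-dimensional velocity
        subspace, up to an additive constant that cancels in differences.
        R = sqrt(2*E) for unit masses, and the volume grows roughly as R**(3*N)."""
        R = math.sqrt(2.0 * E)
        return k * 3 * N * math.log(R)

    N = 100
    print(max_entropy(E=10.0, N=N))
    print(max_entropy(E=20.0, N=N))   # more energy (hotter): a larger shell,
                                      # hence a higher maximum possible entropy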

Notice that nowhere in the definition of the Boltzmann entropy does the construct "probability" appear; it appears only consequentially. At least in the ideal gas, any point in phase space is as likely as any other to be occupied. Hence the probability of a point being in any particular macrostate is proportional to the size of the region typical of that macrostate relative to the total phase space volume. In systems with attractor dynamics, this statement does not hold, and probability and entropy go their separate ways. A low-entropy condition is highly probable in a system with attractor dynamics--or in the environment of a control system.

As Lebowitz's article points out, Boltzmann's entropy is numerically consistent with other definitions of entropy if the situation is appropriate for the application of the other definitions, such as Gibbs, which depends on probabilities, or Clausius, which applies to equilibrium systems. Time's arrow, as described in the Physics Today article, depends on the fact that it is easier to get out of a tiny volume of the phase space into a larger volume than to find a way from an arbitrary point in the large volume back into the tiny volume. Taking that to the extreme, for a specific example, suppose that there were a macrostate for which the typical region consisted of just one microstate (L, V). Then, if after some time the system had evolved to a microstate (L', V') in a larger macrostate and all the V vector components were reversed exactly, the system would eventually evolve back to the original microstate. But if any of the reversals were inexact, or if some of the components were left alone, the system would not return to the original microstate. The incomplete reversal would leave the system in the same evolved macrostate, but in a microstate outside the original small volume.

Entropy (almost) always increases in a closed system. Entropy does not always increase in an open system. Imagine a system of balls that I can observe and influence. I, being outside the open system of concern, can select the balls of interest and deliberately place them in any state I desire, in location and velocity. I can "control" them, at least insofar as I can perceive their exact locations and velocities. I can have them move all in parallel at the same velocity (a flow regime), or stand on top of one another, or whatever. These structures have microstates with relatively small typicality regions. I, the observer-manipulator, have reduced the entropy of the system of interest, putting the phase point in a non-typical subspace of the descriptive space I have defined by my perceptual functions. I can keep the structure in such a low-entropy state, provided I can continue to observe and correct its departures from this non-typical region under the influence of the "nondescript" parts of the universe. I can control my perception of it.

I argue that Boltzmann entropy can be applied to any descriptive space, and in particular it can become Shannon entropy or uncertainty under appropriate conditions. Why is this so? Consider my control of the balls of the open system; I define a state in which I wish them to be. I can force them into this state only as accurately as I can perceive them. The region "typical" of my perception consists of all those microstates I cannot distinguish. The less uncertain I am about the location and velocity of all the balls, the smaller this typicality region and the lower the entropy of the ball system as I see it.
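As a closing sketch of that idea (Python, with made-up numbers): suppose I can resolve each of the 6N phase-space coordinates of the balls only to within some cell width. All microstates within one such cell are indistinguishable to me, so the typicality region of my perception has a volume of roughly (cell width)^(6N), and sharpening my perception shrinks that region and lowers the entropy of the ball system as I see it, just as it lowers my Shannon uncertainty about where the phase point actually is:

    import math

    def entropy_as_seen(resolution, dims, k=1.0):
        """Entropy of the region of microstates the observer cannot distinguish:
        each of `dims` phase-space coordinates is known only to within
        `resolution`, so the typicality volume is resolution**dims and
        S = k * dims * log(resolution). Only differences matter, so the
        negative values in these units are harmless."""
        return k * dims * math.log(resolution)

    dims = 6 * 10                        # 10 balls, 6 phase-space coordinates each
    print(entropy_as_seen(0.1, dims))    # coarse perception
    print(entropy_as_seen(0.01, dims))   # sharper perception: a smaller typicality
                                         # region, lower entropy as perceived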