# [SciPy-user] Maximum entropy distribution for Ising model - setup?

James Coughlan coughlan at ski.org
Sat Oct 28 12:18:45 CDT 2006

Martin,

I hope this makes sense. If not, let's take this discussion offline.

Best,

James

1.) "I want to find a combination of the individual spins and
spin products that maximizes the entropy"

No, you want to find a *probability distribution* of spins that
maximizes the entropy. Remember, the entropy of a random variable is
defined for a distribution of the variable (e.g. configuration of
spins), not for a particular value of the variable.
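To make the point concrete, here is a minimal sketch in plain Python (the 3-spin setup and the names `entropy`, `uniform`, and `delta` are my own illustration, not anything from your problem). The entropy takes a whole distribution over configurations as its argument, never a single configuration:

```python
import itertools
import math

def entropy(probs):
    """Shannon entropy H = -sum_x P(x) log P(x), in nats; 0 log 0 is taken as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# All 2**N configurations of N = 3 spins, each spin being -1 or +1.
configs = list(itertools.product([-1, 1], repeat=3))

# Uniform distribution over the 8 configurations: entropy log(8),
# the largest value possible on 8 states.
uniform = [1.0 / len(configs)] * len(configs)

# Distribution concentrated on a single configuration: entropy 0.
delta = [1.0] + [0.0] * (len(configs) - 1)

h_uniform = entropy(uniform)
h_delta = entropy(delta)
```

Both `uniform` and `delta` assign probabilities to the same 8 configurations; only the distribution differs, and so does the entropy.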

2.) "I thought the whole magic of running maximum entropy is not just that
you end up with a set of hi and Jij that give you a probability
distribution that matches the expectation values that you asked for (to
within some tolerance), but that you also choose those hi and Jij such
that the distribution's entropy is maximized"

True -- see the summary below.

3.) "I think many of the Jij will be quite small, but I want that to
come out of the model. I don't have any way to justify setting some of
them to zero ahead of time."

No problem, as long as your empirical values are measured accurately
enough (and assuming the maximum entropy framework makes sense for your
application).

4.) Maxent summary:

Case 1. A single scalar empirical statistic

Given a variable x (e.g. the states of N spins) and an empirical statistic
<f(x)>_{emp}, the maximum entropy distribution whose statistic matches the
empirical statistic is:

P(x) = e^{a*f(x)} / Z(a)

where Z(a) = \sum_x e^{a*f(x)} normalizes P(x), and a is chosen such that
the model expectation <f(x)> = \sum_x P(x) f(x) equals <f(x)>_{emp}.
Entropy is defined as H = -\sum_x P(x) log P(x).
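As a rough sketch of how a could be found numerically, assuming the state space is small enough to enumerate: the example below uses a made-up statistic (the total magnetization of 4 spins) and a made-up empirical value of 1.0, and solves for a by bisection. The helper names `model_mean` and `solve_multiplier` are hypothetical; a root finder from scipy.optimize would do the same job.

```python
import itertools
import math

def model_mean(a, configs, f):
    """<f(x)> under P(x) = exp(a * f(x)) / Z(a), by brute-force enumeration."""
    weights = [math.exp(a * f(x)) for x in configs]
    z = sum(weights)  # partition function Z(a)
    return sum(w * f(x) for w, x in zip(weights, configs)) / z

def solve_multiplier(target, configs, f, lo=-10.0, hi=10.0, tol=1e-12):
    """Bisection on a; valid because <f(x)> is nondecreasing in a
    (its derivative with respect to a is the variance of f)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model_mean(mid, configs, f) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

configs = list(itertools.product([-1, 1], repeat=4))
f = sum        # assumed statistic: total magnetization s_1 + ... + s_4
f_emp = 1.0    # made-up empirical value of <f(x)>
a = solve_multiplier(f_emp, configs, f)
```

For this particular statistic the spins come out independent, so <f(x)> = 4 tanh(a) and the answer can be checked against a = atanh(0.25).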

Proof (sketch): Use a Lagrange multiplier to maximize the entropy subject
to the constraint <f(x)> = <f(x)>_{emp}:

E = H + a(<f(x)> - <f(x)>_{emp}), where a is the Lagrange multiplier.

Maximize E wrt P(x) and a (with P(x) also constrained to sum to 1) to get
P(x) = const * e^{a*f(x)}, where we call const 1/Z(a) since it depends on
a and is fixed by normalization, and where a is chosen to satisfy
<f(x)> = <f(x)>_{emp}.
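Spelled out, the steps of the sketch look roughly like this. Two things here are my additions: the second multiplier \mu, which enforces normalization, and the sign of a, which is chosen so that the result matches P(x) = e^{a*f(x)} / Z(a) as written in Case 1.

```latex
\begin{align*}
E &= -\sum_x P(x)\log P(x)
     + a\Big(\sum_x P(x) f(x) - \langle f(x)\rangle_{emp}\Big)
     + \mu\Big(\sum_x P(x) - 1\Big), \\
\frac{\partial E}{\partial P(x)} &= -\log P(x) - 1 + a f(x) + \mu = 0
  \;\Longrightarrow\; P(x) = e^{\mu - 1}\, e^{a f(x)}, \\
\sum_x P(x) = 1 &\;\Longrightarrow\; e^{\mu - 1} = \frac{1}{Z(a)},
  \qquad Z(a) = \sum_x e^{a f(x)}, \\
\frac{\partial E}{\partial a} = 0 &\;\Longrightarrow\;
  \langle f(x)\rangle = \frac{d \log Z(a)}{da} = \langle f(x)\rangle_{emp}.
\end{align*}
```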

Case 2. More than one scalar empirical statistic

Now f(x) can be a vector function of x, with one component for each
empirically observed scalar value. In your spin example, it is a vector
with many components: f(s) = (s_1, ..., s_N; s_1 s_2, ..., s_1 s_N,
s_2 s_3, ..., s_{N-1} s_N), i.e. every spin and every pairwise product
s_i s_j with i < j, in which case the Lagrange multipliers are the h_i
and J_ij's.

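A similar result is obtained: P(x) = e^{a . f(x)} / Z(a), where a is now
a vector of Lagrange multipliers (the h_i and J_ij in your example), one
per component of f(x), and a . f(x) is their dot product. Each multiplier
is chosen so that the corresponding model expectation matches its
empirical value.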
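For a handful of spins, the vector case can be sketched by brute force. The example below is only an illustration under stated assumptions: N = 3, the target moments are generated from made-up "true" parameters (so they are guaranteed to be realizable), and the multipliers are found by plain gradient ascent, which is one simple choice among several. All function names are hypothetical. For realistic N, enumerating all 2^N states is not feasible and you would need sampling or a dedicated maxent solver.

```python
import itertools
import math

N = 3
configs = list(itertools.product([-1, 1], repeat=N))
pairs = list(itertools.combinations(range(N), 2))   # index pairs i < j

def features(s):
    """f(s) = (s_1, ..., s_N; s_i * s_j for i < j)."""
    return list(s) + [s[i] * s[j] for i, j in pairs]

def model_moments(theta):
    """Expectation of f(s) under P(s) = exp(theta . f(s)) / Z(theta)."""
    feats = [features(s) for s in configs]
    weights = [math.exp(sum(t * f for t, f in zip(theta, fs))) for fs in feats]
    z = sum(weights)  # partition function
    return [sum(w * fs[k] for w, fs in zip(weights, feats)) / z
            for k in range(len(theta))]

def fit(target, steps=5000, lr=0.2):
    """Gradient ascent on the log-likelihood: theta += lr * (target - model)."""
    theta = [0.0] * len(target)
    for _ in range(steps):
        model = model_moments(theta)
        theta = [t + lr * (e - m) for t, e, m in zip(theta, target, model)]
    return theta

# Made-up "true" parameters (h_1, h_2, h_3, J_12, J_13, J_23), used only to
# produce target moments that some distribution of this form can match.
theta_true = [0.2, -0.1, 0.3, 0.5, -0.3, 0.1]
target = model_moments(theta_true)

theta_fit = fit(target)
h_fit = theta_fit[:N]                      # fitted h_i
J_fit = dict(zip(pairs, theta_fit[N:]))    # fitted J_ij, keyed by (i, j)
```

With real data, `target` would be replaced by the measured averages <s_i> and <s_i s_j>. Small J_ij then come out of the fit on their own, without being set to zero ahead of time.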