The dynamics of a rigid body (that
is, an idealized body that does not deform) are governed by its inertial
parameters, which consist of the mass $m\in\mathbb{R}$, center of mass
$c\in\mathbb{R}^3$, and inertia matrix $I\in\mathbb{R}^{3\times 3}$
(not to be confused with the identity matrix, which we denote as $\mathbf{1}$).
This post is aimed at those who have seen the inertial parameters before (for
example, in the Newton-Euler
equations), but
who do not necessarily know all of their properties offhand. In particular, we
focus on the set of inertial parameters that are realizable by a real physical
object—that is, those that correspond to a valid (i.e., non-negative) mass
density function. For a paper on this topic, I quite enjoy this
one by Wensing, Kim, and Slotine, which
informed a lot of my own understanding of rigid body inertial parameters.
Density
The density of a rigid body is
described by a non-negative mass density function
$\rho:\mathbb{R}^3\to\mathbb{R}_+$, where $\mathbb{R}_+$ denotes the set of
non-negative real numbers. The density function can be thought of as an
unnormalized probability distribution over three-dimensional space (we’ll come
back to this analogy shortly), which assigns an infinitesimal mass value to
each point in the body’s volume.
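The inertial parameters are related to the density function by the integrals
$$
m = \int_{\mathbb{R}^3} \rho(r)\,dr, \tag{1}
$$
$$
mc = \int_{\mathbb{R}^3} \rho(r)\,r\,dr, \tag{2}
$$
$$
I = \int_{\mathbb{R}^3} \rho(r)\,(r^\times)^T r^\times\,dr, \tag{3}
$$
where we are integrating over the position $r\in\mathbb{R}^3$, and
$$
r^\times = \begin{bmatrix} x \\ y \\ z \end{bmatrix}^\times
= \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix} \tag{4}
$$
forms a skew-symmetric
matrix.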
The right-hand
side of (4) is also sometimes called the cross-product matrix
because $a^\times b = a\times b$ for any
$a,b\in\mathbb{R}^3$. Since $r^\times$ is skew-symmetric, it
satisfies $(r^\times)^T=-r^\times$ by definition, so it is also
common to see (3) written as
$$
I = -\int_{\mathbb{R}^3} \rho(r)\,r^\times r^\times\,dr.
$$
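The two properties of the cross-product matrix used above are easy to check numerically. The following is a minimal sketch using NumPy (an assumption on our part; the post itself does not prescribe any library), and the helper name `skew` is our own.

```python
import numpy as np

def skew(r):
    """Return the 3x3 cross-product (skew-symmetric) matrix of a 3-vector."""
    x, y, z = r
    return np.array([
        [0.0, -z, y],
        [z, 0.0, -x],
        [-y, x, 0.0],
    ])

a = np.array([1.0, 2.0, 3.0])
b = np.array([-0.5, 4.0, 0.25])
A = skew(a)

# Multiplying by the cross-product matrix reproduces the cross product.
cross_matches = np.allclose(A @ b, np.cross(a, b))

# Skew-symmetry: the transpose equals the negation.
is_skew_symmetric = np.allclose(A.T, -A)

# Hence the two integrands (r^x)^T r^x and -r^x r^x agree.
forms_agree = np.allclose(A.T @ A, -A @ A)
```

All three flags evaluate to `True` for any choice of `a` and `b`.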
The inertia matrix is always taken with respect to a particular reference
point; in (3) we simply used the origin. Expressed about a general
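reference point $p\in\mathbb{R}^3$, the inertia matrix is
$$
I_p = -\int_{\mathbb{R}^3} \rho(r)\,(r-p)^\times (r-p)^\times\,dr. \tag{5}
$$
It is often convenient to use the center of mass $c$ as the reference
point, in which case we have
$$
I_c = -\int_{\mathbb{R}^3} \rho(r)\,\Delta r^\times \Delta r^\times\,dr, \tag{6}
$$
where $\Delta r=r-c$.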
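To make the integrals concrete, here is a small sketch that evaluates them for a density made of a handful of point masses, so that each integral becomes a finite sum. The function names (`inertial_parameters`, `inertia_about`) and the example values are our own illustrative choices, and NumPy is assumed.

```python
import numpy as np

def skew(r):
    """Cross-product matrix of a 3-vector, as in (4)."""
    x, y, z = r
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def inertial_parameters(masses, points):
    """Mass, center of mass, and inertia matrix about the origin, per (1)-(3)."""
    masses = np.asarray(masses, dtype=float)
    points = np.asarray(points, dtype=float)
    m = masses.sum()
    c = (masses[:, None] * points).sum(axis=0) / m
    I = sum(mi * skew(ri).T @ skew(ri) for mi, ri in zip(masses, points))
    return m, c, I

def inertia_about(masses, points, p):
    """Inertia matrix about the reference point p, per (5)."""
    return sum(-mi * skew(ri - p) @ skew(ri - p) for mi, ri in zip(masses, points))

# Hypothetical example: four point masses that are not coplanar.
masses = [1.0, 2.0, 0.5, 1.5]
points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 1.0],
])

m, c, I = inertial_parameters(masses, points)
I_c = inertia_about(masses, points, c)  # inertia about the center of mass, per (6)
```

Here `m` is 5 and `c` is `[0.7, 0.5, 0.3]`; passing the origin as `p` to `inertia_about` reproduces `I`.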
Probability Distribution Analogy
The quantity $mc$ in (2) is known as the first moment of
mass. This is just the (unnormalized) mean of the density function, so we can write
$$
\mathbb{E}[r] = mc,
$$
where $\mathbb{E}[\cdot]$ denotes the expected value under the distribution
$\rho$. If we take $m=1$ such that $\rho$ is a proper (normalized) probability
distribution, then $\mathbb{E}[r]=c$ and the covariance matrix is
$$
\Sigma = \mathbb{E}\big[(r-\mathbb{E}[r])(r-\mathbb{E}[r])^T\big]
= \int_{\mathbb{R}^3} \rho(r)\,\Delta r\,\Delta r^T\,dr, \tag{7}
$$
which encodes the spread of the mass distribution about the center of mass.
More generally, we can define the quantity
$$
S = \int_{\mathbb{R}^3} \rho(r)\,r r^T\,dr, \tag{8}
$$
which is known as the second moment matrix and does not require a normalized
distribution. The second moment matrix has a one-to-one relationship with the
inertia matrix, and is also taken about a particular reference point. In
(8), the reference point is the origin. If we use the center of mass
as the reference point instead, we get
$$
S_c = \int_{\mathbb{R}^3} \rho(r)\,\Delta r\,\Delta r^T\,dr,
$$
which corresponds to the covariance matrix from (7).
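The analogy can be checked directly: for a set of point masses, the second moment matrix about the center of mass, divided by the total mass, should match a mass-weighted covariance of the point positions. The sketch below assumes NumPy and uses `np.cov` with `aweights` as the weighted covariance; the point set is a made-up example.

```python
import numpy as np

# Hypothetical example: four point masses.
masses = np.array([1.0, 2.0, 0.5, 1.5])
points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 1.0],
])

m = masses.sum()
c = masses @ points / m  # center of mass (the normalized mean)

# Second moment matrix about the origin: sum_i m_i r_i r_i^T, per (8).
S = np.einsum("i,ij,ik->jk", masses, points, points)

# Second moment matrix about the center of mass.
delta = points - c
S_c = np.einsum("i,ij,ik->jk", masses, delta, delta)

# Mass-weighted covariance of the positions (normalized by the total mass).
cov = np.cov(points.T, aweights=masses, ddof=0)

matches_covariance = np.allclose(S_c / m, cov)
matches_shift = np.allclose(S_c, S - m * np.outer(c, c))
```

The second check confirms that moving the reference point from the origin to the center of mass subtracts $mcc^T$ from the second moment matrix.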
Physical Consistency
Since the density ρ is non-negative everywhere (as you cannot have a
negative mass), it immediately follows from (1) and (6),
respectively, that $m\geq 0$ and $I_c\succeq 0$;
that is, the mass must be non-negative and the inertia matrix taken about the
center of mass must be positive semidefinite (which means all of its
eigenvalues are non-negative; we will see
below that $I_c\succeq 0$
implies $I_p\succeq 0$ for any $p\in\mathbb{R}^3$).
However, if $m=0$, then $\rho$ must be zero everywhere and therefore
$I_c=0$ as well. To avoid this special case we will require that the
mass is strictly positive, so our two conditions are
$$
m>0,\quad I_c\succeq 0. \tag{9}
$$
Some authors prefer to also restrict Ic to be strictly positive
definite, which serves to exclude idealized zero-volume bodies like point
masses, lines, or planes. We do not make this restriction.
Although the two conditions in (9) are together
known in the literature as physical consistency, they are, confusingly,
not quite sufficient to ensure that the inertial parameters correspond
to a non-negative mass density; we need one more condition on the inertia matrix.
Triangle Inequality
A valid inertia matrix (expressed about any reference point; here we use the
origin for simplicity) must also satisfy the triangle inequality, which
states that none of its eigenvalues is larger than the sum of the other two. To
see this, we will make use of the identity
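$$
(r^\times)^T r^\times = r^T r\,\mathbf{1}_3 - r r^T
= \operatorname{tr}(r r^T)\,\mathbf{1}_3 - r r^T,
$$
where $\mathbf{1}_3$ is the 3×3 identity matrix and $\operatorname{tr}(\cdot)$
denotes the matrix trace. Substituting into (3), we get
$$
I = \int_{\mathbb{R}^3} \rho(r)\big(\operatorname{tr}(r r^T)\,\mathbf{1}_3 - r r^T\big)\,dr
= \operatorname{tr}(S)\,\mathbf{1}_3 - S. \tag{10}
$$
Taking the trace of both sides gives
$$
\operatorname{tr}(I) = 3\operatorname{tr}(S) - \operatorname{tr}(S) = 2\operatorname{tr}(S). \tag{11}
$$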
Now let $\lambda_1\geq\lambda_2\geq\lambda_3\geq 0$ be the eigenvalues of $I$ and
let $v\in\mathbb{R}^3$ be the normalized (i.e., unit-length) eigenvector
corresponding to $\lambda_1$, such that $\lambda_1=v^T I v$. Substituting in (10), we get
$$
\lambda_1 = v^T\big(\operatorname{tr}(S)\,\mathbf{1}_3 - S\big)v
= \operatorname{tr}(S) - v^T S v \leq \operatorname{tr}(S), \tag{12}
$$
where the inequality follows because the second moment matrix is always
positive semidefinite (this is easy to see from its definition (8))
and therefore $v^T S v\geq 0$ for any $v$.
Finally, recalling that the trace of a matrix equals the sum of its eigenvalues,
we can combine (11) and (12) to obtain
$$
\operatorname{tr}(I) = \lambda_1+\lambda_2+\lambda_3 = 2\operatorname{tr}(S) \geq 2\lambda_1,
$$
which we rearrange to obtain the triangle inequality:
$$
\lambda_1 \leq \lambda_2+\lambda_3.
$$
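As a numerical illustration of the argument above, the sketch below builds an inertia matrix from a random cloud of point masses via $I=\operatorname{tr}(S)\mathbf{1}_3-S$ and checks both the trace relation and the triangle inequality. It assumes NumPy; the function name `satisfies_triangle_inequality`, the tolerance, and the random data are our own choices.

```python
import numpy as np

def satisfies_triangle_inequality(I, tol=1e-9):
    """Check that the largest eigenvalue of I is at most the sum of the other two."""
    eigs = np.linalg.eigvalsh(I)  # sorted in ascending order
    return bool(eigs[2] <= eigs[0] + eigs[1] + tol)

# Hypothetical body: 20 random point masses.
rng = np.random.default_rng(0)
masses = rng.uniform(0.1, 1.0, size=20)
points = rng.normal(size=(20, 3))

# Second moment matrix and the inertia matrix it induces.
S = np.einsum("i,ij,ik->jk", masses, points, points)
I = np.trace(S) * np.eye(3) - S

trace_relation_holds = bool(np.isclose(np.trace(I), 2.0 * np.trace(S)))
realizable_ok = satisfies_triangle_inequality(I)

# A positive definite matrix that nevertheless violates the inequality (1 > 0.2 + 0.2).
violating_ok = satisfies_triangle_inequality(np.diag([1.0, 0.2, 0.2]))
```

The last case shows why positive definiteness alone is not enough: `np.diag([1.0, 0.2, 0.2])` has strictly positive eigenvalues but cannot come from any non-negative density.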
Full Physical Consistency
A set of physically consistent inertial parameters whose inertia matrix $I$ also
satisfies the triangle inequality is called fully physically
consistent. Full physical consistency is a necessary and
sufficient condition for the inertial parameters to be realizable by some
non-negative mass density $\rho$.
Pseudo-Inertia Matrix
We can gather the inertial parameters into the 4×4 pseudo-inertia matrix
$$
\Pi = \begin{bmatrix} S & mc \\ mc^T & m \end{bmatrix}
= \int_{\mathbb{R}^3} \rho(r)\,\tilde{r}\tilde{r}^T\,dr,
$$
where $\tilde{r}=[r^T,1]^T$ is the homogeneous representation of
the point $r$.
It turns out that necessary and sufficient conditions for a set of inertial
parameters to be fully physically consistent are
$$
m>0,\quad \Pi\succeq 0.
$$
These conditions are convenient
because they are convex, and can therefore be included as constraints in convex
optimization problems for
parameter identification or robust
constraint verification. (For the purposes
of numerical optimization, we can replace the constraint $m>0$ with
$m\geq\epsilon$ for some small $\epsilon>0$, since strict inequalities cannot be
enforced numerically.)
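A direct numerical check of these conditions might look like the following sketch. It assumes NumPy, takes $I$ about the origin, and recovers $S$ from $I$ as $S=\tfrac{1}{2}\operatorname{tr}(I)\mathbf{1}_3-I$; the function names and the tolerance are our own, and an eigenvalue test is only one of several ways to check positive semidefiniteness.

```python
import numpy as np

def pseudo_inertia(m, c, I):
    """Assemble the 4x4 pseudo-inertia matrix from m, c, and I (I taken about the origin)."""
    S = 0.5 * np.trace(I) * np.eye(3) - I
    Pi = np.zeros((4, 4))
    Pi[:3, :3] = S
    Pi[:3, 3] = m * c
    Pi[3, :3] = m * c
    Pi[3, 3] = m
    return Pi

def is_fully_physically_consistent(m, c, I, tol=1e-9):
    """Check m > 0 and that the pseudo-inertia matrix is positive semidefinite."""
    Pi = pseudo_inertia(m, c, I)
    return bool(m > 0 and np.linalg.eigvalsh(Pi).min() >= -tol)

origin = np.zeros(3)

# Uniform solid unit cube with unit mass, centered at the origin: I = (1/6) * identity.
cube_ok = is_fully_physically_consistent(1.0, origin, np.eye(3) / 6.0)

# Violates the triangle inequality, so S is not positive semidefinite.
triangle_ok = is_fully_physically_consistent(1.0, origin, np.diag([1.0, 0.2, 0.2]))

# Same inertia as the cube, but with a center of mass too far from the origin.
offset_ok = is_fully_physically_consistent(1.0, np.array([1.0, 0.0, 0.0]), np.eye(3) / 6.0)
```

Only the first case passes. The third fails even though its inertia matrix alone looks fine, which shows that the condition on the pseudo-inertia matrix couples the center of mass to the inertia.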
Why do m>0 and Π≽0 imply full physical consistency?
The Schur
complement theorem tells us that if $m>0$, then
$$
\Pi\succeq 0 \iff S_c\succeq 0,
$$
where $S_c=S-mcc^T$ (see (14)
below). We already saw that a positive semidefinite second moment matrix yields
a fully physically consistent inertia matrix in the previous section on the
triangle inequality. Here we need to prove the other direction; that is, if
S is not positive semidefinite, then I must not be fully
physically consistent (again, we will use the origin as the reference point
here, but the same logic holds for any reference point, including the center of
mass).
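Let us assume that $I\succeq 0$ but
$S\not\succeq 0$, which means that there exists some unit-length
vector $v\in\mathbb{R}^3$ such that $v^T S v<0$. Using
(10), we have
$$
v^T I v = v^T\big(\operatorname{tr}(S)\,\mathbf{1}_3 - S\big)v
= \operatorname{tr}(S) - v^T S v > \operatorname{tr}(S).
$$
We also know that $v^T I v\leq\lambda_1$, and therefore
$\lambda_1>\operatorname{tr}(S)$. Combined with (11), we have
$$
2\lambda_1 > 2\operatorname{tr}(S) = \operatorname{tr}(I) = \lambda_1+\lambda_2+\lambda_3.
$$
Rearranging the above inequality reveals that $\lambda_1>\lambda_2+\lambda_3$,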
showing that the triangle inequality is not satisfied and therefore I is
not fully physically consistent.
Changing the Reference Frame
Parallel Axis Theorem
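We can manipulate the equation for the inertia matrix expressed about an arbitrary
point $p\in\mathbb{R}^3$ from (5) to obtain
$$
\begin{aligned}
I_p &= -\int_{\mathbb{R}^3} \rho(r)\,(\Delta r-\Delta p)^\times(\Delta r-\Delta p)^\times\,dr \\
&= I_c - m\,\Delta p^\times\Delta p^\times,
\end{aligned} \tag{13}
$$
where $\Delta p=p-c$ and we have used the fact that
$\int_{\mathbb{R}^3}\rho(r)\,\Delta r^\times\,dr=0$. This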
result is known as the parallel axis
theorem, and is used to
translate the inertia matrix to and from the center of mass.
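Notice that (13) implies that
$I_p\succeq I_c$ for any reference point $p\in\mathbb{R}^3$
(because $-\Delta p^\times\Delta p^\times=(\Delta p^\times)^T\Delta p^\times\succeq 0$),
with equality if and only if $p=c$. This means that it is easier
(i.e., less energy is required) to rotate a rigid body about its center of mass
than about any other point.
To translate between two arbitrary points $p$ and $q$, we have
$$
I_p = I_q + m\,\Delta q^\times\Delta q^\times - m\,\Delta p^\times\Delta p^\times,
$$
where $\Delta q=q-c$. The analogous rule for the second moment
matrix is
$$
S_p = S_c + m\,\Delta p\,\Delta p^T = S_q - m\,\Delta q\,\Delta q^T + m\,\Delta p\,\Delta p^T. \tag{14}
$$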
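The translation rule for the inertia matrix can be verified against a direct computation. The sketch below assumes NumPy and models the body as a few point masses; `shift_inertia` and `inertia_about` are hypothetical helper names, and the reference points are arbitrary.

```python
import numpy as np

def skew(r):
    """Cross-product matrix of a 3-vector."""
    x, y, z = r
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def inertia_about(masses, points, p):
    """Inertia matrix of a set of point masses about the reference point p."""
    return sum(-mi * skew(ri - p) @ skew(ri - p) for mi, ri in zip(masses, points))

def shift_inertia(I_q, m, c, q, p):
    """Move an inertia matrix from reference point q to reference point p."""
    dq, dp = skew(q - c), skew(p - c)
    return I_q + m * dq @ dq - m * dp @ dp

# Hypothetical example body and reference points.
masses = np.array([1.0, 2.0, 0.5, 1.5])
points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 1.0],
])
m = masses.sum()
c = masses @ points / m
q = np.array([0.3, -1.0, 2.0])
p = np.array([-2.0, 0.5, 0.1])

I_q = inertia_about(masses, points, q)
I_p_shifted = shift_inertia(I_q, m, c, q, p)
I_p_direct = inertia_about(masses, points, p)

# I_p - I_c should be positive semidefinite for any p.
I_c = inertia_about(masses, points, c)
gap_eigs = np.linalg.eigvalsh(I_p_direct - I_c)
```

`I_p_shifted` and `I_p_direct` agree to numerical precision, and the eigenvalues in `gap_eigs` are non-negative.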
Full Spatial Transformations
The parallel axis theorem handles translations, but suppose we want to
transform the inertial parameters by a general spatial transformation (i.e.,
translation and rotation) from frame {a} to frame {b}. Let
$$
T_{ba} = \begin{bmatrix} R_{ba} & p_b^{ab} \\ 0^T & 1 \end{bmatrix} \in SE(3)
$$
be the homogeneous transformation matrix that maps points from {a} to
{b}, where $R_{ba}\in SO(3)$ is the rotation and
$p_b^{ab}\in\mathbb{R}^3$ is the position of the origin of {a} with
respect to {b} expressed in the coordinates of {b}. To map the inertial
parameters from {a} to {b}, we simply represent them as the
pseudo-inertia matrix $\Pi_a$ in {a} and apply the “sandwich” rule
$$
\Pi_b = T_{ba}\,\Pi_a\,T_{ba}^T \tag{15}
$$
to obtain their representation $\Pi_b$ in {b}.
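We can use (15) to obtain the parallel-axis theorem rule
for $S$ in (14) by applying
the pure translation
$$
T_{pc} = \begin{bmatrix} \mathbf{1}_3 & -\Delta p \\ 0^T & 1 \end{bmatrix}
$$
to the pseudo-inertia matrix expressed about the center of mass:
$$
\Pi_p = T_{pc}\begin{bmatrix} S_c & 0 \\ 0^T & m \end{bmatrix}T_{pc}^T
= \begin{bmatrix} S_c + m\,\Delta p\,\Delta p^T & -m\,\Delta p \\ -m\,\Delta p^T & m \end{bmatrix},
$$
where $-\Delta p$ is the location of the center of mass with respect to $p$.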
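The sandwich rule can be sanity-checked by transforming the underlying points directly and rebuilding the pseudo-inertia matrix in the new frame. This is a sketch under our own assumptions: NumPy, a point-mass body, a rotation about the z-axis, and hypothetical helper names.

```python
import numpy as np

def pseudo_inertia_from_points(masses, points):
    """Pseudo-inertia matrix of point masses: sum_i m_i * r~_i r~_i^T."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return np.einsum("i,ij,ik->jk", masses, homog, homog)

def rotation_z(theta):
    """Rotation matrix for an angle theta about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical body, expressed in frame {a}.
masses = np.array([1.0, 2.0, 0.5, 1.5])
points_a = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 1.0],
])

# Homogeneous transform mapping points from {a} to {b}.
R_ba = rotation_z(0.7)
p_ab = np.array([0.5, -1.0, 2.0])
T_ba = np.eye(4)
T_ba[:3, :3] = R_ba
T_ba[:3, 3] = p_ab

# Sandwich rule.
Pi_a = pseudo_inertia_from_points(masses, points_a)
Pi_b = T_ba @ Pi_a @ T_ba.T

# Direct computation: transform every point, then rebuild the matrix.
points_b = points_a @ R_ba.T + p_ab
Pi_b_direct = pseudo_inertia_from_points(masses, points_b)
```

The two results agree, and the bottom-right entry of `Pi_b` is still the total mass, as expected for a rigid transformation.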
More on the Inertia Matrix
We will conclude with a few more interesting properties of the inertia and
second moment matrices.
S vs. I
The second moment matrix S encodes the spread of the mass distribution
while I encodes its resistance to rotation. To help understand this,
consider the simple example shown below (borrowed from Chapter 2
of my PhD thesis).
[Figure: Two point masses distributed along the x-axis. The z-axis points out of the
page.]
This system consists of two point masses, each with mass 0.5, placed at
±1 unit distance from the origin along the x-axis. The inertia and second
moment matrices for this system are
$$
I = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},\quad
S = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
$$
where $S$ shows that the mass is spread along the x-axis and $I$
shows that this spread resists rotation about the y- and z-axes.
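These two matrices are quick to reproduce. The sketch below assumes NumPy and computes $S$ as a sum over the two point masses, then obtains $I$ from the relation $I=\operatorname{tr}(S)\mathbf{1}_3-S$.

```python
import numpy as np

# Two point masses of 0.5 each at x = +1 and x = -1.
masses = np.array([0.5, 0.5])
points = np.array([
    [1.0, 0.0, 0.0],
    [-1.0, 0.0, 0.0],
])

# Second moment matrix about the origin: sum_i m_i r_i r_i^T.
S = np.einsum("i,ij,ik->jk", masses, points, points)

# Inertia matrix about the origin.
I = np.trace(S) * np.eye(3) - S
```

Here `S` comes out as `diag(1, 0, 0)` and `I` as `diag(0, 1, 1)`: there is no resistance to rotation about the x-axis, because both masses lie on it.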
From I to S
Given $I$, we can recover $S$ by rearranging (10)
and substituting in (11) to obtain
$$
S = \tfrac{1}{2}\operatorname{tr}(I)\,\mathbf{1}_3 - I.
$$
What happens when $I$ does not satisfy the triangle inequality? Let
$$
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
$$
which is positive semidefinite but has eigenvalues $\lambda_1=1$ and
$\lambda_2=\lambda_3=0$, so the triangle inequality is not satisfied. Using the above equation, the corresponding second moment matrix is
$$
S = \frac{1}{2}\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
$$
which is clearly not positive semidefinite and is therefore invalid.
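The same computation in code, as a minimal NumPy sketch (the function name `second_moment_from_inertia` is our own):

```python
import numpy as np

def second_moment_from_inertia(I):
    """Recover the second moment matrix S from the inertia matrix I."""
    return 0.5 * np.trace(I) * np.eye(3) - I

# Violates the triangle inequality: eigenvalues 1, 0, 0.
I_bad = np.diag([1.0, 0.0, 0.0])
S_bad = second_moment_from_inertia(I_bad)
min_eig_bad = np.linalg.eigvalsh(S_bad).min()

# Satisfies the triangle inequality: eigenvalues 1, 1, 0.
I_good = np.diag([0.0, 1.0, 1.0])
S_good = second_moment_from_inertia(I_good)
min_eig_good = np.linalg.eigvalsh(S_good).min()
```

`S_bad` has a negative eigenvalue of -0.5, whereas `S_good` is `diag(1, 0, 0)`, which is positive semidefinite.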
Bigger S, Bigger I
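Another interesting (and intuitive) property is that if
$S'\succeq S$, then $I'\succeq I$. This means that
when the mass distribution is more spread out, the body’s resistance to
rotation is increased. To prove this fact, let $\Delta S=S'-S\succeq 0$ and consider the relationship
$$
I'-I = \operatorname{tr}(\Delta S)\,\mathbf{1}_3 - \Delta S
\succeq \big(\operatorname{tr}(\Delta S)-\lambda_{\max}(\Delta S)\big)\mathbf{1}_3 \succeq 0,
$$
where $\lambda_{\max}(\Delta S)\geq 0$ is the largest eigenvalue of
$\Delta S$ (the last step holds because the remaining eigenvalues of $\Delta S$ are non-negative), which shows that $I'\succeq I$.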
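A quick numerical spot check of this property, assuming NumPy and using randomly generated positive semidefinite matrices in place of real second moment matrices (a sketch, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)

# Random positive semidefinite S, and a random positive semidefinite increment.
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))
S = A @ A.T
delta_S = B @ B.T
S_prime = S + delta_S  # so S' - S is positive semidefinite

def inertia_from_second_moment(S):
    """Inertia matrix corresponding to a second moment matrix."""
    return np.trace(S) * np.eye(3) - S

I = inertia_from_second_moment(S)
I_prime = inertia_from_second_moment(S_prime)

# I' - I should be positive semidefinite.
min_eig = np.linalg.eigvalsh(I_prime - I).min()
```

The smallest eigenvalue of `I_prime - I` is non-negative (up to rounding), consistent with $I'\succeq I$.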
Common Inertia Matrices
Expressions for the inertia matrix of common shapes with uniform density are
available on Wikipedia. We have also derived the inertia matrix for
ellipsoidal and
cuboid shells in previous blog
posts.
Thanks to Philippe Nadeau for reading a draft of this post.