Rigid Body Inertial Parameters

The dynamics of a rigid body (that is, an idealized body that does not deform) are governed by its inertial parameters, which consist of the mass $m\in\mathbb{R}$, center of mass $\bm{c}\in\mathbb{R}^3$, and inertia matrix $\bm{I}\in\mathbb{R}^{3\times3}$ (not to be confused with the identity matrix, which we denote as $\bm{1}$).

This post is aimed at those who have seen the inertial parameters before (for example, in the Newton-Euler equations), but who do not necessarily know all of their properties offhand. In particular, we focus on the set of inertial parameters that are realizable by a real physical object—that is, those that correspond to a valid (i.e., non-negative) mass density function. For a paper on this topic, I quite enjoy this one by Wensing, Kim, and Slotine, which informed a lot of my own understanding of rigid body inertial parameters.

Density

The density of a rigid body is described by a non-negative mass density function $\rho:\mathbb{R}^3\to\mathbb{R}_+$, where $\mathbb{R}_+$ denotes the set of non-negative real numbers. The density function can be thought of as an unnormalized probability distribution over three-dimensional space (we’ll come back to this analogy shortly), which assigns an infinitesimal mass value to each point in the body’s volume.

The inertial parameters are related to the density function by the integrals

\begin{align}
m &= \int_{\mathbb{R}^3} \rho(\bm{r})\,d\bm{r},\label{1} \\
m\bm{c} &= \int_{\mathbb{R}^3} \rho(\bm{r})\bm{r}\,d\bm{r},\label{2} \\
\bm{I} &= \int_{\mathbb{R}^3} \rho(\bm{r})(\bm{r}^{\times})^T\bm{r}^{\times}\,d\bm{r},\label{3}
\end{align}

where we are integrating over the position $\bm{r}\in\mathbb{R}^3$, and

\begin{equation}\label{4}
\bm{r}^{\times} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}^{\times} = \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix}
\end{equation}

forms a skew-symmetric matrix. The right-hand side of $\eqref{4}$ is also sometimes called the cross-product matrix because $\bm{a}^{\times}\bm{b}=\bm{a}\times\bm{b}$ for any $\bm{a},\bm{b}\in\mathbb{R}^3$. Since $\bm{r}^{\times}$ is skew-symmetric, it satisfies $(\bm{r}^{\times})^T=-\bm{r}^{\times}$ by definition, so it is also common to see $\eqref{3}$ written as

\begin{equation*}
\bm{I} = -\int_{\mathbb{R}^3} \rho(\bm{r})\bm{r}^{\times}\bm{r}^{\times}\,d\bm{r}.
\end{equation*}

The inertia matrix is always taken with respect to a particular reference point; in $\eqref{3}$ we simply used the origin. Expressed about a general reference point $\bm{p}\in\mathbb{R}^3$, the inertia matrix is

\begin{equation}\label{5}
\bm{I}_p = -\int_{\mathbb{R}^3}\rho(\bm{r})(\bm{r}-\bm{p})^{\times}(\bm{r}-\bm{p})^{\times}\,d\bm{r}.
\end{equation}

It is often convenient to use the center of mass $\bm{c}$ as the reference point, in which case we have

\begin{equation}\label{6}
\bm{I}_c = -\int_{\mathbb{R}^3}\rho(\bm{r})\Delta\bm{r}^{\times}\Delta\bm{r}^{\times}\,d\bm{r},
\end{equation}

where $\Delta\bm{r}=\bm{r}-\bm{c}$.
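As a rough numerical illustration of $\eqref{1}$ to $\eqref{6}$, the sketch below replaces the density with a finite set of point masses, so each integral becomes a sum. The function names (`skew`, `inertial_parameters`, `inertia_about`) and the example masses and positions are assumptions made for illustration only.

```python
import numpy as np

def skew(r):
    """Cross-product matrix r^x, so that skew(a) @ b equals np.cross(a, b)."""
    x, y, z = r
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def inertial_parameters(masses, points):
    """Mass, center of mass, and inertia matrix about the origin."""
    masses = np.asarray(masses, dtype=float)
    points = np.asarray(points, dtype=float)
    m = masses.sum()
    c = masses @ points / m
    # Discrete version of (3): sum of m_i (r_i^x)^T r_i^x
    I = sum(mi * skew(ri).T @ skew(ri) for mi, ri in zip(masses, points))
    return m, c, I

def inertia_about(masses, points, p):
    """Inertia matrix about the reference point p, a discrete version of (5)."""
    masses = np.asarray(masses, dtype=float)
    points = np.asarray(points, dtype=float)
    p = np.asarray(p, dtype=float)
    return -sum(mi * skew(ri - p) @ skew(ri - p) for mi, ri in zip(masses, points))

# Hypothetical example: three point masses on the coordinate axes.
masses = [1.0, 2.0, 3.0]
points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
m, c, I = inertial_parameters(masses, points)
I_c = inertia_about(masses, points, c)  # inertia about the center of mass
```

Passing `p = 0` to `inertia_about` reproduces the inertia matrix about the origin, and passing the center of mass gives $\bm{I}_c$.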

Probability Distribution Analogy

The quantity $m\bm{c}$ in $\eqref{2}$ is known as the first moment of mass. This is just the mean of the density function, so we can write

\begin{equation*}
\mathbb{E}[\bm{r}] = m\bm{c},
\end{equation*}

where $\mathbb{E}[\cdot]$ denotes the expected value under the distribution $\rho$. If we take $m=1$ such that $\rho$ is a proper (normalized) probability distribution, then $\mathbb{E}[\bm{r}]=\bm{c}$ and the covariance matrix is

\begin{equation}\label{7}
\begin{aligned}
\bm{\Sigma} &= \mathbb{E}[(\bm{r}-\mathbb{E}[\bm{r}])(\bm{r}-\mathbb{E}[\bm{r}])^T] \\
&= \int_{\mathbb{R}^3}\rho(\bm{r})\Delta\bm{r}\Delta\bm{r}^T\,d\bm{r},
\end{aligned}
\end{equation}

which encodes the spread of the mass distribution about the center of mass.

More generally, we can define the quantity

\begin{equation}\label{8}
\bm{S} = \int_{\mathbb{R}^3} \rho(\bm{r})\bm{r}\bm{r}^T\,d\bm{r},
\end{equation}

which is known as the second moment matrix and does not require a normalized distribution. The second moment matrix has a one-to-one relationship with the inertia matrix, and is also taken about a particular reference point. In $\eqref{8}$, the reference point is the origin. If we use the center of mass as the reference point instead, we get

\begin{equation*}
\bm{S}_c = \int_{\mathbb{R}^3}\rho(\bm{r})\Delta\bm{r}\Delta\bm{r}^T\,d\bm{r},
\end{equation*}

which corresponds to the covariance matrix from $\eqref{7}$.
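To make the analogy concrete, here is a small sketch that treats a set of point masses as a weighted sample and compares $\bm{S}_c$ with a weighted covariance. The masses and positions are made-up values, and the use of `np.cov` with `aweights` and `ddof=0` (which normalizes by the total weight) is a choice made for this illustration.

```python
import numpy as np

# Hypothetical point masses standing in for the density rho.
masses = np.array([1.0, 2.0, 3.0, 4.0])
points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [1.0, 1.0, 1.0]])

m = masses.sum()
c = masses @ points / m  # center of mass = weighted mean

# Second moment about the origin: sum of m_i r_i r_i^T
S = np.einsum("i,ij,ik->jk", masses, points, points)

# Second moment about the center of mass
delta = points - c
S_c = np.einsum("i,ij,ik->jk", masses, delta, delta)

# Weighted covariance, normalized by the total mass (rows of points.T are x, y, z)
Sigma = np.cov(points.T, aweights=masses, ddof=0)
```

For this unnormalized case, `S_c` equals `m * Sigma`; with $m=1$ the two coincide, which matches $\eqref{7}$.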

Physical Consistency

Since the density $\rho$ is non-negative everywhere (as you cannot have a negative mass), it immediately follows from $\eqref{1}$ and $\eqref{6}$, respectively, that $m\geq0$ and $\bm{I}_c\succcurlyeq\bm{0}$; that is, the mass must be non-negative and the inertia matrix taken about the center of mass must be positive semidefinite (which means all of its eigenvalues are non-negative; we will see below that $\bm{I}_c\succcurlyeq\bm{0}$ implies $\bm{I}_p\succcurlyeq\bm{0}$ for any $\bm{p}\in\mathbb{R}^3$).

However, if $m=0$, then $\rho$ must be zero everywhere and therefore $\bm{I}_c=\bm{0}$ as well. To avoid this special case we will require that the mass is strictly positive, so our two conditions are

\begin{align}\label{9}
m &> 0, & \bm{I}_c &\succcurlyeq \bm{0}.
\end{align}

Some authors prefer to also restrict $\bm{I}_c$ to be strictly positive definite, which serves to exclude idealized zero-volume bodies like point masses, lines, or planes. We do not make this restriction.

Although the two conditions in $\eqref{9}$ are together known in the literature as physical consistency, they are, confusingly, not quite sufficient to ensure that the inertial parameters correspond to a non-negative mass density; we need one more condition on the inertia matrix.

Triangle Inequality

A valid inertia matrix (expressed about any reference point; here we use the origin for simplicity) must also satisfy the triangle inequality, which states that none of its eigenvalues is larger than the sum of the other two. To see this, we will make use of the identity

\begin{equation*}
(\bm{r}^{\times})^T\bm{r}^{\times} = \bm{r}^T\bm{r}\bm{1}_3 - \bm{r}\bm{r}^T = \mathrm{tr}(\bm{r}\bm{r}^T)\bm{1}_3 - \bm{r}\bm{r}^T,
\end{equation*}

where $\bm{1}_3$ is the $3\times3$ identity matrix and $\mathrm{tr}(\cdot)$ denotes the matrix trace. Substituting into $\eqref{3}$, we get

\begin{equation}\label{10}
\begin{aligned}
\bm{I} &= \int_{\mathbb{R}^3} \rho(\bm{r})(\mathrm{tr}(\bm{r}\bm{r}^T)\bm{1}_3 - \bm{r}\bm{r}^T)\,d\bm{r} \\
&= \mathrm{tr}\biggl(\int_{\mathbb{R}^3} \rho(\bm{r})\bm{r}\bm{r}^T\,d\bm{r}\biggr)\bm{1}_3 - \int_{\mathbb{R}^3} \rho(\bm{r})\bm{r}\bm{r}^T\,d\bm{r} \\
&= \mathrm{tr}(\bm{S})\bm{1}_3 - \bm{S}.
\end{aligned}
\end{equation}

Taking the trace of both sides of $\eqref{10}$, we get

\begin{equation}\label{11}
\mathrm{tr}(\bm{I}) = 2\,\mathrm{tr}(\bm{S}).
\end{equation}

Now let $\lambda_1\geq\lambda_2\geq\lambda_3\geq0$ be the eigenvalues of $\bm{I}$ and let $\bm{v}\in\mathbb{R}^3$ be the normalized (i.e., unit-length) eigenvector corresponding to $\lambda_1$, such that $\lambda_1=\bm{v}^T\bm{I}\bm{v}$. Substituting in $\eqref{10}$, we get

\begin{equation}\label{12}
\begin{aligned}
\lambda_1 &= \bm{v}^T(\mathrm{tr}(\bm{S})\bm{1}_3 - \bm{S})\bm{v} \\
&= \mathrm{tr}(\bm{S}) - \bm{v}^T\bm{S}\bm{v} \\
&\leq \mathrm{tr}(\bm{S}),
\end{aligned}
\end{equation}

where the inequality follows because the second moment matrix is always positive semidefinite (this is easy to see from its definition $\eqref{8}$) and therefore $\bm{v}^T\bm{S}\bm{v}\geq0$ for any $\bm{v}$.

Finally, recalling that the trace of a matrix equals the sum of its eigenvalues, we can combine $\eqref{11}$ and $\eqref{12}$ to obtain

\begin{equation*}
\mathrm{tr}(\bm{I}) = \lambda_1 + \lambda_2 + \lambda_3 = 2\,\mathrm{tr}(\bm{S})\geq 2\lambda_1,
\end{equation*}

which we rearrange to obtain the triangle inequality:

\begin{equation*}
\lambda_1 \leq \lambda_2 + \lambda_3.
\end{equation*}
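A minimal sketch of how one might test the triangle inequality numerically is given below. The function name and the tolerance `tol` are assumptions made for illustration; the tolerance only absorbs floating-point round-off.

```python
import numpy as np

def satisfies_triangle_inequality(I, tol=1e-9):
    """Check that the largest eigenvalue of I is at most the sum of the other two."""
    I = np.asarray(I, dtype=float)
    # eigvalsh assumes a symmetric matrix and returns eigenvalues in ascending order
    lam3, lam2, lam1 = np.linalg.eigvalsh(I)
    return bool(lam1 <= lam2 + lam3 + tol)

# Hypothetical examples: 4 <= 2 + 3 holds, but 3 <= 1 + 1 does not.
ok = satisfies_triangle_inequality(np.diag([2.0, 3.0, 4.0]))
bad = satisfies_triangle_inequality(np.diag([1.0, 1.0, 3.0]))
```

Both example matrices are positive definite, so the second one shows that positive definiteness alone does not guarantee the triangle inequality.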

Full Physical Consistency

Physically consistent inertial parameters for which $\bm{I}$ also satisfies the triangle inequality are called fully physically consistent. Full physical consistency is a necessary and sufficient condition for the inertial parameters to be realizable by some non-negative mass density $\rho$.

Pseudo-Inertia Matrix

We can gather the inertial parameters into the $4\times4$ pseudo-inertia matrix

\begin{equation*}
\bm{\Pi} = \begin{bmatrix} \bm{S} & m\bm{c} \\ m\bm{c}^T & m \end{bmatrix} = \int_{\mathbb{R}^3} \rho(\bm{r})\tilde{\bm{r}}\tilde{\bm{r}}^T\,d\bm{r},
\end{equation*}

where $\tilde{\bm{r}} = [\bm{r}^T,1]^T$ is the homogeneous representation of the point $\bm{r}$.

It turns out that necessary and sufficient conditions for a set of inertial parameters to be fully physically consistent are

\begin{align*}
m &> 0, & \bm{\Pi} &\succcurlyeq \bm{0}.
\end{align*}

These conditions are convenient because they are convex, and can therefore be included as constraints in convex optimization problems for parameter identification or robust constraint verification. (For the purposes of numerical optimization, we can replace the constraint $m>0$ with $m\geq\epsilon$ for some small $\epsilon>0$, since numerical solvers cannot enforce strict inequalities.)
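The two conditions can be checked numerically with a few lines of NumPy. The sketch below assumes that $\bm{I}$ and $\bm{c}$ are expressed about the origin of the same frame, and recovers $\bm{S}$ from $\bm{I}$ via $\bm{S}=\tfrac{1}{2}\mathrm{tr}(\bm{I})\bm{1}_3-\bm{I}$, which follows from $\eqref{10}$ and $\eqref{11}$. The function names and the tolerance `eps` are hypothetical.

```python
import numpy as np

def pseudo_inertia(m, c, I):
    """Assemble the 4x4 pseudo-inertia matrix from m, c, and I (about the origin)."""
    c = np.asarray(c, dtype=float)
    I = np.asarray(I, dtype=float)
    S = 0.5 * np.trace(I) * np.eye(3) - I  # second moment recovered from I
    Pi = np.zeros((4, 4))
    Pi[:3, :3] = S
    Pi[:3, 3] = m * c
    Pi[3, :3] = m * c
    Pi[3, 3] = m
    return Pi

def is_fully_physically_consistent(m, c, I, eps=1e-9):
    """Check m >= eps and Pi positive semidefinite (up to round-off eps)."""
    Pi = pseudo_inertia(m, c, I)
    return bool(m >= eps and np.linalg.eigvalsh(Pi).min() >= -eps)

# Hypothetical examples with unit mass.
good = is_fully_physically_consistent(1.0, [0.0, 0.0, 0.0], np.diag([2.0, 3.0, 4.0]))
bad_triangle = is_fully_physically_consistent(1.0, [0.0, 0.0, 0.0], np.diag([1.0, 1.0, 3.0]))
bad_offset = is_fully_physically_consistent(1.0, [2.0, 0.0, 0.0], np.diag([2.0, 3.0, 4.0]))
```

The last example fails because the center of mass is too far from the origin for the given inertia matrix, so $\bm{S}-m\bm{c}\bm{c}^T$ has a negative eigenvalue. In an optimization setting the same conditions would be imposed as a linear matrix inequality rather than checked after the fact.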

Why do $m>0$ and $\bm{\Pi}\succcurlyeq\bm{0}$ imply full physical consistency? The Schur complement theorem tells us that if $m>0$, then

\begin{equation*}
\bm{\Pi}\succcurlyeq\bm{0} \iff \bm{S}_c\succcurlyeq\bm{0},
\end{equation*}

where $\bm{S}_c=\bm{S}-m\bm{c}\bm{c}^T$ (see $\eqref{14}$ below). We already saw that a positive semidefinite second moment matrix yields a fully physically consistent inertia matrix in the previous section on the triangle inequality. Here we need to prove the other direction; that is, if $\bm{S}$ is not positive semidefinite, then $\bm{I}$ must not be fully physically consistent (again, we will use the origin as the reference point here, but the same logic holds for any reference point, including the center of mass).

Let us assume that $\bm{I}\succcurlyeq\bm{0}$ but $\bm{S}\not\succcurlyeq\bm{0}$, which means that there exists some unit-length vector $\bm{v}\in\mathbb{R}^3$ such that $\bm{v}^T\bm{S}\bm{v}<0$. Using $\eqref{10}$, we have

\begin{equation*}
\begin{aligned}
\bm{v}^T\bm{I}\bm{v} &= \bm{v}^T(\mathrm{tr}(\bm{S})\bm{1}_3 - \bm{S})\bm{v} \\
&= \mathrm{tr}(\bm{S}) - \bm{v}^T\bm{S}\bm{v} \\
&> \mathrm{tr}(\bm{S}).
\end{aligned}
\end{equation*}

We also know that $\bm{v}^T\bm{I}\bm{v}\leq\lambda_1$, and therefore $\lambda_1>\mathrm{tr}(\bm{S})$. Combined with $\eqref{11}$, we have

\begin{equation*}
2\lambda_1 > 2\,\mathrm{tr}(\bm{S}) = \mathrm{tr}(\bm{I}) = \lambda_1+\lambda_2+\lambda_3.
\end{equation*}

Rearranging the above equation reveals that $\lambda_1>\lambda_2+\lambda_3$, showing that the triangle inequality is not satisfied and therefore $\bm{I}$ is not fully physically consistent.

Changing the Reference Frame

Parallel Axis Theorem

We can manipulate the equation for the inertia matrix expressed about an arbitrary point $\bm{p}\in\mathbb{R}^3$ from $\eqref{5}$ to obtain

\begin{equation}\label{13}
\begin{aligned}
\bm{I}_p &= -\int_{\mathbb{R}^3}\rho(\bm{r})(\bm{r}-\bm{p})^\times(\bm{r}-\bm{p})^\times\,d\bm{r} \\
&= -\int_{\mathbb{R}^3}\rho(\bm{r})(\Delta\bm{r}-\Delta\bm{p})^\times(\Delta\bm{r}-\Delta\bm{p})^\times\,d\bm{r} \\
&= -\int_{\mathbb{R}^3}\rho(\bm{r})(\Delta\bm{r}^\times\Delta\bm{r}^\times-\Delta\bm{r}^\times\Delta\bm{p}^\times - \Delta\bm{p}^\times\Delta\bm{r}^\times+\Delta\bm{p}^\times\Delta\bm{p}^\times)\,d\bm{r} \\
&= \bm{I}_c - m\Delta\bm{p}^\times\Delta\bm{p}^\times,
\end{aligned}
\end{equation}

where $\Delta\bm{p}=\bm{p}-\bm{c}$ and we have used the fact that $\int_{\mathbb{R}^3}\rho(\bm{r})\Delta\bm{r}^{\times}\,d\bm{r}=\bm{0}$. This result is known as the parallel axis theorem, and is used to translate the inertia matrix to and from the center of mass.

Notice that $\eqref{13}$ implies that $\bm{I}_p\succcurlyeq\bm{I}_c$ for any reference point $\bm{p}\in\mathbb{R}^3$, with equality if and only if $\bm{p}=\bm{c}$. This means that it is easier (i.e., less energy is required) to rotate a rigid body about its center of mass than any other point.

To translate between two arbitrary points $\bm{p}$ and $\bm{q}$, we have

\begin{equation*}
\bm{I}_p = \bm{I}_q + m\Delta\bm{q}^\times\Delta\bm{q}^\times - m\Delta\bm{p}^\times\Delta\bm{p}^\times,
\end{equation*}

where $\Delta\bm{q}=\bm{q}-\bm{c}$. The analogous rule for the second moment matrix is

\begin{equation}\label{14}
\begin{aligned}
\bm{S}_p &= \bm{S}_c + m\Delta\bm{p}\Delta\bm{p}^T \\
&= \bm{S}_q - m\Delta\bm{q}\Delta\bm{q}^T + m\Delta\bm{p}\Delta\bm{p}^T.
\end{aligned}
\end{equation}
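The translation rule for the inertia matrix is short enough to write out directly. The following is a sketch under the assumption that all points are expressed in the same coordinate frame; `translate_inertia` and the example values are hypothetical.

```python
import numpy as np

def skew(r):
    """Cross-product matrix r^x."""
    x, y, z = r
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def translate_inertia(I_q, m, c, q, p):
    """Move the inertia matrix from reference point q to reference point p."""
    c, q, p = (np.asarray(v, dtype=float) for v in (c, q, p))
    dq = skew(q - c)
    dp = skew(p - c)
    return np.asarray(I_q, dtype=float) + m * dq @ dq - m * dp @ dp

# Hypothetical body: mass 2, center of mass at (1, 0, 0), I_c = diag(1, 2, 3).
m = 2.0
c = np.array([1.0, 0.0, 0.0])
I_c = np.diag([1.0, 2.0, 3.0])

# From the center of mass to the origin, and back again.
I_origin = translate_inertia(I_c, m, c, q=c, p=np.zeros(3))
I_back = translate_inertia(I_origin, m, c, q=np.zeros(3), p=c)
```

Here `I_origin` works out to `diag(1, 4, 5)`: the offset along the $x$-axis adds $m\|\Delta\bm{p}\|^2 = 2$ to the $y$ and $z$ entries and leaves the $x$ entry unchanged, consistent with $\bm{I}_p\succcurlyeq\bm{I}_c$.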

Full Spatial Transformations

The parallel axis theorem handles translations, but suppose we want to transform the inertial parameters by a general spatial transformation (i.e., translation and rotation) from frame $\{a\}$ to frame $\{b\}$. Let

\begin{equation*}
\bm{T}_{ba} = \begin{bmatrix} \bm{R}_{ba} & \bm{p}^{ab}_b \\ \bm{0}^T & 1 \end{bmatrix} \in SE(3),
\end{equation*}

be the homogeneous transformation matrix that maps points from $\{a\}$ to $\{b\}$, where $\bm{R}_{ba}\in SO(3)$ is the rotation and $\bm{p}^{ab}_b\in\mathbb{R}^3$ is the position of the origin of $\{a\}$ with respect to $\{b\}$ expressed in the coordinates of $\{b\}$. To map the inertial parameters from $\{a\}$ to $\{b\}$, we simply represent them as the pseudo-inertia matrix $\bm{\Pi}_a$ in $\{a\}$ and apply the “sandwich” rule

\begin{equation}\label{15}
\bm{\Pi}_b = \bm{T}_{ba}\bm{\Pi}_a\bm{T}_{ba}^T
\end{equation}

to obtain their representation $\bm{\Pi}_b$ in $\{b\}$.

We can use $\eqref{15}$ to obtain the parallel-axis theorem rule for $\bm{S}$ in $\eqref{14}$ by applying the pure translation

\begin{equation*}
\bm{T}_{pc} = \begin{bmatrix} \bm{1}_3 & -\Delta\bm{p} \\ \bm{0}^T & 1 \end{bmatrix}
\end{equation*}

to the pseudo-inertia matrix expressed about the center of mass

\begin{equation*}
\bm{\Pi}_c = \begin{bmatrix} \bm{S}_c & \bm{0} \\ \bm{0}^T & m \end{bmatrix},
\end{equation*}

which yields

\begin{equation*}
\begin{aligned}
\begin{bmatrix} \bm{S}_p & -m\Delta\bm{p} \\ -m\Delta\bm{p}^T & m \end{bmatrix}
&= \begin{bmatrix} \bm{1}_3 & -\Delta\bm{p} \\ \bm{0}^T & 1 \end{bmatrix}\begin{bmatrix} \bm{S}_c & \bm{0} \\ \bm{0}^T & m \end{bmatrix}\begin{bmatrix} \bm{1}_3 & \bm{0} \\ -\Delta\bm{p}^T & 1 \end{bmatrix} \\
&= \begin{bmatrix} \bm{S}_c + m\Delta\bm{p}\Delta\bm{p}^T & -m\Delta\bm{p} \\ -m\Delta\bm{p}^T & m \end{bmatrix},
\end{aligned}
\end{equation*}

where $-\Delta\bm{p}$ is the location of the center of mass with respect to $\bm{p}$.
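One way to convince yourself of the sandwich rule is to check it against point masses: transforming the points first and then building $\bm{\Pi}$ should match transforming $\bm{\Pi}$ directly. The sketch below does this for an arbitrary example transform; the function names, the rotation (90 degrees about $z$), and the translation are all assumptions chosen for illustration.

```python
import numpy as np

def pseudo_inertia_from_points(masses, points):
    """Pi = sum of m_i * r~_i r~_i^T, with r~ the homogeneous point [r, 1]."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return np.einsum("i,ij,ik->jk", masses, homog, homog)

def transform_pseudo_inertia(T_ba, Pi_a):
    """Apply the sandwich rule Pi_b = T_ba Pi_a T_ba^T."""
    return T_ba @ Pi_a @ T_ba.T

# Hypothetical point masses expressed in frame {a}.
masses = np.array([1.0, 2.0, 3.0])
points_a = np.array([[1.0, 0.0, 0.0],
                     [0.0, 2.0, 0.0],
                     [0.0, 0.0, 3.0]])

# Example transform: rotate 90 degrees about z, then translate.
R_ba = np.array([[0.0, -1.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])
p_ab = np.array([1.0, 2.0, 3.0])
T_ba = np.eye(4)
T_ba[:3, :3] = R_ba
T_ba[:3, 3] = p_ab

# Route 1: transform Pi directly. Route 2: transform the points, then build Pi.
Pi_b = transform_pseudo_inertia(T_ba, pseudo_inertia_from_points(masses, points_a))
points_b = points_a @ R_ba.T + p_ab
Pi_b_direct = pseudo_inertia_from_points(masses, points_b)
```

The two routes agree, and the bottom-right entry of `Pi_b` is still the total mass, as expected since rigid transformations do not change $m$.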

More on the Inertia Matrix

We will conclude with a few more interesting properties of the inertia and second moment matrices.

$\bm{S}$ vs. $\bm{I}$

The second moment matrix $\bm{S}$ encodes the spread of the mass distribution while $\bm{I}$ encodes its resistance to rotation. To help understand this, consider the simple example shown below (borrowed from Chapter 2 of my PhD thesis).

Figure: A system of two point masses distributed along the $x$-axis. The $z$-axis points out of the page.

This system consists of two point masses, each with mass $0.5$, placed at $\pm1$ unit distance from the origin along the $x$-axis. The inertia and second moment matrices for this system are

\begin{align*}
\bm{I} &= \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, & \bm{S} &= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\end{align*}

where $\bm{S}$ shows that the mass is spread along the $x$-axis and $\bm{I}$ shows that this spread resists rotation about the $y$- and $z$-axes.

From $\bm{I}$ to $\bm{S}$

Given $\bm{I}$, we can recover $\bm{S}$ by rearranging $\eqref{10}$ and substituting in $\eqref{11}$ to obtain

\begin{equation*}
\bm{S} = (1/2)\mathrm{tr}(\bm{I})\bm{1}_3 - \bm{I}.
\end{equation*}

What happens when $\bm{I}$ does not satisfy the triangle inequality? Let

\begin{equation*}
\bm{I} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\end{equation*}

which is positive semidefinite but has eigenvalues $\lambda_1=1$ and $\lambda_2=\lambda_3=0$, so the triangle inequality is not satisfied. Using the above equation, the corresponding second moment matrix is

\begin{equation*}
\bm{S} = (1/2)\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\end{equation*}

which is clearly not positive semidefinite and is therefore invalid.
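The conversion between the two matrices is a one-liner in each direction. The sketch below uses the relation $\bm{S}=\tfrac{1}{2}\mathrm{tr}(\bm{I})\bm{1}_3-\bm{I}$ together with $\eqref{10}$; the function names are hypothetical.

```python
import numpy as np

def second_moment_from_inertia(I):
    """S = (1/2) tr(I) 1_3 - I."""
    I = np.asarray(I, dtype=float)
    return 0.5 * np.trace(I) * np.eye(3) - I

def inertia_from_second_moment(S):
    """I = tr(S) 1_3 - S, as in (10)."""
    S = np.asarray(S, dtype=float)
    return np.trace(S) * np.eye(3) - S

# The invalid inertia matrix from the example above.
I_invalid = np.diag([1.0, 0.0, 0.0])
S_invalid = second_moment_from_inertia(I_invalid)
smallest_eigenvalue = np.linalg.eigvalsh(S_invalid).min()
```

For this example `S_invalid` is `diag(-0.5, 0.5, 0.5)`, so its smallest eigenvalue is negative. Checking the sign of that eigenvalue is equivalent to checking the triangle inequality on $\bm{I}$.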

Bigger $\bm{S}$, Bigger $\bm{I}$

Another interesting (and intuitive) property is that if $\bm{S}'\succcurlyeq\bm{S}$, then $\bm{I}'\succcurlyeq\bm{I}$. This means that when the mass distribution is more spread out, then the body’s resistance to rotation is increased. To prove this fact, consider the relationship

\begin{equation*}
\bm{I}'-\bm{I} = \mathrm{tr}(\Delta\bm{S})\bm{1}_3 - \Delta\bm{S},
\end{equation*}

where $\Delta\bm{S}=\bm{S}'-\bm{S}\succcurlyeq\bm{0}$. Given any vector $\bm{u}\in\mathbb{R}^3$, we have

\begin{equation*}
\begin{aligned}
\bm{u}^T(\bm{I}'-\bm{I})\bm{u} &= \mathrm{tr}(\Delta\bm{S})\|\bm{u}\|_2^2 - \bm{u}^T\Delta\bm{S}\bm{u} \\
&\geq \mathrm{tr}(\Delta\bm{S})\|\bm{u}\|_2^2 - \lambda_{\max}(\Delta\bm{S})\|\bm{u}\|_2^2 \\
&\geq 0,
\end{aligned}
\end{equation*}

where $\lambda_{\max}(\Delta\bm{S})\geq0$ is the largest eigenvalue of $\Delta\bm{S}$, which shows that $\bm{I}'\succcurlyeq\bm{I}$.
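As a quick sanity check (not a proof), the property can be tested on randomly generated matrices. In the sketch below, $\bm{S}$ and $\Delta\bm{S}$ are built as $\bm{A}\bm{A}^T$ and $\bm{B}\bm{B}^T$ so that both are positive semidefinite by construction; the random seed and helper name are arbitrary choices for illustration.

```python
import numpy as np

def inertia_from_second_moment(S):
    """I = tr(S) 1_3 - S."""
    return np.trace(S) * np.eye(3) - S

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

S = A @ A.T                 # a positive semidefinite second moment matrix
S_prime = S + B @ B.T       # S' - S = B B^T is positive semidefinite

gap = inertia_from_second_moment(S_prime) - inertia_from_second_moment(S)
min_eig = np.linalg.eigvalsh(gap).min()  # should be non-negative up to round-off
```

The eigenvalues of `gap` are the pairwise sums of the eigenvalues of $\Delta\bm{S}$, which is another way to see why they cannot be negative.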

Common Inertia Matrices

Expressions for the inertia matrix of common shapes with uniform density are available on Wikipedia. We have also derived the inertia matrix for ellipsoidal and cuboid shells in previous blog posts.

Thanks to Philippe Nadeau for reading a draft of this post.