The propagation of point-like particles, aka missiles, is simulated in the gravitational force field of static centers of mass, aka planets, on the surface of a unit sphere.
In this document, we describe the theoretical background in detail and list all equations necessary to work efficiently with this library. If you prefer a more hands-on approach, you can skip this section and jump directly to the description of the API. There, you will find everything necessary to use the library and to run your simulations.
We also provide a minimalistic game extension that is built on top of the API. Read its documentation here if you are interested in using it as a building block for your game server.
Missiles do not generate a force field themselves but move in \(\vec\nabla V\), whereas planets are static and stay fixed. Let \((\vec q, \vec p) \in \mathbb{R}^3 \otimes \mathbb{R}^3\) be the phase-space coordinates of a single missile, where \(\vec q\) is the spatial position, which is sufficient to evaluate \(V\). The Hamiltonian of a single missile with mass \(m\) then reads:
\[ H(\vec{q}, \vec{p}) = \frac{p^2}{2m} + V(\vec{q}) + \lambda g(\vec{q}) \,, \]
where \(g(\vec q)\) is the constraint of the manifold
\begin{align*} g(\vec q) &= \frac{1}{2} \left( q^2 - 1 \right) = 0 \,, \\ \vec\nabla g(\vec q) &= \vec{q} \,. \end{align*}
NB: Differentiating the constraint \(g(\vec q) = 0\) w.r.t. time yields the identity \(\vec q \perp \vec p\), and differentiating a second time yields a physical interpretation of the Lagrange multiplier \(\lambda = p^2 - \vec{q} \cdot \vec\nabla V\): for a particle at rest, the projection of the gravitational force along the particle's position vector has to be compensated in order to fulfill the constraint of our manifold. If the particle has a finite momentum, this constraining force can be reduced (note that \(\operatorname{sign}(\vec{q} \cdot \vec\nabla V) > 0\)) until the gravitational force pulls the particle onto a circular orbit (\(\lambda = 0\)), and it eventually turns from a pulling into a pushing force if the momentum becomes even larger.
The conservative potential \(V(\vec q)\) is the linear superposition of the contributions of each planet \(i\) at position \(\vec{y}_i \in \mathbb{R}^3\),
\[ V(\vec q) = \sum\limits_{i=1}^n V_d(\sigma_i) \,, \]
with
\[ \sigma_i = \arccos(\vec{q} \cdot \vec{y}_i) \,, \]
where \(0 \le \sigma_i \le \pi \) is the (shortest) distance between \(\vec q\) and \(\vec{y}_i\) on the manifold. On a sphere, the shortest path between two points runs along an orthodrome, also known as a great circle. Without loss of generality, the kinematics of a missile in the force field of a single planet can thus be reduced to the movement on a (static) unit circle. Further, let's assume that the missile is at the azimuth \(\varphi = \sigma \in [0, 2\pi)\) and is attracted by the planet at \(\varphi = 0\). Following the line of sight of the missile on the manifold, the planet is seen periodically at distances \(\sigma, 2\pi + \sigma, \ldots\) in one direction and at \(2\pi - \sigma, 4\pi - \sigma, \ldots\) in the opposite direction.
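For illustration, the geodesic distance \(\sigma_i\) can be evaluated directly from this formula. The following minimal sketch (in C, with unit vectors as plain double[3] arrays; the helper name is ours and not part of the library API) clamps the dot product to guard against round-off pushing it outside \([-1, 1]\):

```c
#include <math.h>

/* Geodesic distance sigma = arccos(q . y) between two unit vectors.
 * The dot product is clamped to [-1, 1] to guard against round-off.
 * Illustrative helper only, not part of the library API. */
static double great_circle_distance(const double q[3], const double y[3])
{
    double c = q[0] * y[0] + q[1] * y[1] + q[2] * y[2];
    if (c > 1.0)
        c = 1.0;
    else if (c < -1.0)
        c = -1.0;
    return acos(c); /* in [0, pi] */
}
```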
We implement two different types of gravitational potentials \(V_d(\sigma_i)\),
\begin{align*} V_{2\mathrm{D}}(\sigma_i) = 2 \sum\limits_{j=0}^\infty V^{(j)}_{2\mathrm{D}}(\sigma_i) &= 2 \sum\limits_{j=0}^\infty \big[ \ln(2\pi j + \sigma_i) + \ln(2\pi (j+1) - \sigma_i) - 2 \ln \pi (2j+1) \big] \\ &= 2 \ln \sin \frac{\sigma_i}{2}, \\ V_{3\mathrm{D}}(\sigma_i) = \frac{1}{4\pi} \sum\limits_{j=0}^\infty V^{(j)}_{3\mathrm{D}}(\sigma_i) &= -\frac{1}{4\pi} \sum\limits_{j=0}^\infty \left[ \frac{1}{2\pi j + \sigma_i} + \frac{1}{2\pi (j+1) - \sigma_i} - \frac{2}{\pi (2j+1)} \right] \\ &= -\frac{2 \psi\!\left( \frac{1}{2} \right) - \psi\!\left(\frac{\sigma_i}{2 \pi} \right) - \psi\!\left(1 - \frac{\sigma_i}{2 \pi} \right)}{8 \pi^2} \end{align*}
with the digamma function \(\psi\). The former contributes force fields \(\sim \sigma_i^{-1}\) that correspond to a 2D field solely embedded on the manifold, whereas the latter is motivated by the canonical \(\sim \sigma_i^{-2}\) behavior of the unconstrained 3D case. Both potentials are gauged by an additional constant offset s.t. \(V_{2\mathrm{D}}(\pi) = V_{3\mathrm{D}}(\pi) = 0\), which keeps each series convergent without interfering with the respective force fields.
In the figure below, we show \(V^{(j)}_{3\mathrm{D}}(\sigma_i)\) for \(j=0,1,2\).
Note that these potentials are only physically meaningful for \(0 \le \sigma_i < 2\pi\). (In fact, by definition of the inverse cosine function, the interval actually reads \(0 \le \sigma_i \le \pi\).) Furthermore, since \(\sigma_i = \pi\) is the largest possible distance between two points on a unit sphere, any reasonable potential \(V(\sigma_i)\) has to obey the symmetry relation \(V(\sigma_i) = V(2\pi - \sigma_i)\).
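To make the gauge and the truncation explicit, here is a small sketch (in C, with helper names of our own choosing; this is not the library's internal implementation) that evaluates \(V_{2\mathrm{D}}\) via its closed form and \(V_{3\mathrm{D}}\) by truncating its series after \(J+1\) summands:

```c
#include <math.h>

/* Closed form of the 2D potential, V_2D(sigma) = 2 ln sin(sigma / 2).
 * Illustrative sketch, not the library's internal implementation. */
static double potential_2d(double sigma)
{
    return 2.0 * log(sin(0.5 * sigma));
}

/* 3D potential truncated after J + 1 summands,
 * V_3D(sigma) = -(1 / 4 pi) sum_j [ 1 / (2 pi j + sigma)
 *                                 + 1 / (2 pi (j + 1) - sigma)
 *                                 - 2 / (pi (2 j + 1)) ]. */
static double potential_3d(double sigma, unsigned J)
{
    double sum = 0.0;
    for (unsigned j = 0; j <= J; j++) {
        sum += 1.0 / (2.0 * M_PI * j + sigma)
             + 1.0 / (2.0 * M_PI * (j + 1) - sigma)
             - 2.0 / (M_PI * (2 * j + 1));
    }
    return -sum / (4.0 * M_PI);
}
```

Both return exactly zero at \(\sigma_i = \pi\), reflecting the gauge discussed above.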
The aforementioned gauging of the potentials makes the evaluation of the escape velocity, \(p_\pi\), at a given distance to a planet, \(\delta\), straightforward:
\[ p_\pi = \sqrt{-2m \, V_d(\delta)} \,, \]
where we define \(p_\pi\) as the minimal initial momentum necessary to asymptotically reach a distance \(\sigma = \pi\) when launched at a distance \(\delta\). In layman's terms: the only way not to shoot yourself in the face is to launch a missile faster than \(p_\pi\); then it will hit you in the back. Obviously, these definitions only hold for a single planet.
Tip: In practical applications \(p_\pi\) can be used as a natural unit for velocities and time. For example, velocities can be given in multiples of \(p_\pi\) and durations in multiples of the orbital period if missiles are launched with \(2p_\pi\). Use grvx_v_esc() and grvx_orb_period() to get \(p_\pi\) and the orbital period, respectively, of the given potential.
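As a concrete illustration of the escape-velocity formula above (a hedged sketch with \(m=1\) and the 2D potential; the helper name is ours, real code should call grvx_v_esc()):

```c
#include <math.h>

/* Escape velocity p_pi = sqrt(-2 m V_d(delta)) with m = 1 and the 2D
 * potential V_2D(delta) = 2 ln sin(delta / 2) <= 0 for 0 < delta <= pi.
 * Illustrative only -- the library exposes this via grvx_v_esc(). */
static double escape_velocity_2d(double delta)
{
    const double v2d = 2.0 * log(sin(0.5 * delta));
    return sqrt(-2.0 * v2d);
}
```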
The force fields are given by the gradients of the respective potentials:
\[ -\vec\nabla_{\!q} V(\vec{q}) = -\sum\limits_{i=1}^n \frac{V'_d(\sigma_i)}{\sin \sigma_i} \, \vec{y}_i = \sum\limits_{i=1}^n f_d(\sigma_i) \, \vec{y}_i \]
where we introduced the abbreviations
\begin{align*} V'_{2\mathrm{D}}(\sigma_i) &= \operatorname{cot} \frac{\sigma_i}{2} \\ V'_{3\mathrm{D}}(\sigma_i) &= (\pi - \sigma_i) \sum\limits_{j=0}^\infty \frac{2j + 1}{\left[ (2j+1)^2 \pi^2 - (\pi - \sigma_i)^2 \right]^2} \end{align*}
and
\begin{align*} f_{2\mathrm{D}}(\sigma_i) &= -\frac{1}{1 - \cos \sigma_i} \\ f_{3\mathrm{D}}(\sigma_i) &= \lim\limits_{J \to \infty} \sum\limits_{j=0}^J f^{(j)}_{3\mathrm{D}}(\sigma_i) \\ &= -\frac{1}{\operatorname{sinc}(\sigma_i - \pi)} \, \lim\limits_{J \to \infty} \sum\limits_{j=0}^J \frac{2j + 1}{\left[ (2j+1)^2 \pi^2 - (\sigma_i - \pi)^2 \right]^2} \end{align*}
with \(d \in \{2\mathrm{D}, 3\mathrm{D}\}\) and \(\operatorname{sinc} \bullet = (\sin \bullet) / \bullet\). Below, we show both types of scaling factors \(f_d\) as functions of \(\sigma_i\). Note that we use different scales of the ordinate to compensate for the effects of our arbitrarily chosen coupling constants. Furthermore, note that \(f_d\) is a scaling factor and not the force field. (For a single planet, \(n=1\), however, it is equivalent to the magnitude of the force field.) In particular, \(f_d(\sigma_i) < 0 \; \forall \, \sigma_i\), whereas the projection of \(\vec\nabla_{\!q} V(\vec{q})\) onto the manifold can become zero, e.g., if \(n=1\) and \(\vec{q} \parallel \vec{y}_i\).
In the visualization of \(d = 3\mathrm{D}\) above, we approximated the infinite sum with \(J=100\) summands. Below, we show the ratio of truncated summations w.r.t. \(J=0\). In general, we see that the sum converges fast and that in most cases taking \(\mathrm{O}(10)\) summands is sufficient. In particular, the deviation is largest where the force field is weakest and the missile kinematics are most likely dominated by other planets; even there, taking the first two leading terms, \(J=1\), already gives an approximation that deviates less than \(2\,\%\) from the asymptotic result, \(J \to \infty\).
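A truncated evaluation of \(f_{3\mathrm{D}}\) might look as follows (a sketch in C with the truncation order as a parameter; not the library's internal code):

```c
#include <math.h>

/* sinc(x) = sin(x) / x with the usual limit at x = 0. */
static double sinc(double x)
{
    return (fabs(x) < 1e-12) ? 1.0 : sin(x) / x;
}

/* Scaling factor f_3D(sigma), truncated after J + 1 summands.  Per the
 * discussion above, J of order 10 is typically sufficient.  Sketch only. */
static double f_3d(double sigma, unsigned J)
{
    const double x = sigma - M_PI;
    double sum = 0.0;
    for (unsigned j = 0; j <= J; j++) {
        const double k = 2.0 * j + 1.0;
        const double d = k * k * M_PI * M_PI - x * x;
        sum += k / (d * d);
    }
    return -sum / sinc(x);
}
```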
For a given universe with a single planet, there are circular orbits which keep a constant distance \(r = \sigma_1\) to the planet (aka small circles) as well as a constant velocity \(v_\mathrm{orb}\). By placing the planet at one of the poles, \(\vec{y}_1 = (0, 0, 1)^\top\), we obtain a simple ODE from the Hamiltonian:
\[ \begin{pmatrix} \ddot x_1 \\ \ddot x_2 \\ \ddot x_3 \end{pmatrix} = f_d \!\left( \arccos(x_3) \right) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \mp \lambda \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \]
where \(r = \arccos x_3\) is the distance to the planet. Here, a constant distance is equivalent to \(\dot x_3 = \ddot x_3 = 0\) and thus:
\[ \ddot x_3 = 0 \qquad \Leftrightarrow \qquad \lambda = -\frac{f_d \!\left( \arccos(x_3) \right)}{|x_3|} = v_\lambda^2 = \text{const.} \]
Hence, a possible solution reads:
\[ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} \cos\phi_0 \, \sin v_\lambda t \\ \cos\phi_0 \, \cos v_\lambda t \\ \sin\phi_0 \end{pmatrix}. \]
The velocity \(v_\mathrm{orb}\) for a given force field \(f_d\) and distance \(r\) can thus be obtained by setting \(\phi_0 = \pi/2 - r\) and evaluating:
\[ v_{\mathrm{orb},d} \equiv \dot x = v_\lambda \cos \phi_0 = \sqrt{ \frac{-f_d(r)}{|\cos r|}} \, \sin r \,. \]
For convenience, we provide a helper function grvx_v_scrcl() which returns \(v_\mathrm{orb}\) if provided with \(r\).
Note that for \(d=2\mathrm{D}\) the relation simplifies significantly and allows for a straightforward evaluation of \(v_{\mathrm{orb},2\mathrm{D}}\):
\[ v_{\mathrm{orb},2\mathrm{D}} = \sqrt{ \frac{1 + \cos r}{|\cos r|}}. \]
Below, we show \(v_{\mathrm{orb},d}(r)\) for both types of potentials: Naïvely, one might expect a behavior similar to the \(\sim 1/\sqrt{r}\) scaling we know from our canonical \(\sim 1/r\) potential on Earth. In curved space, however, free particles propagate on great circles, and thus increasingly higher velocities are needed as the small circle approaches a great circle at a distance of \(r = \pi / 2\). At the maximal distance \(r = \pi\), the small circle degenerates into an (unstable) fixed point. For \(r \to 0\), \(v_{\mathrm{orb},2\mathrm{D}}\) approaches \(\sqrt{2}\), whereas \(v_{\mathrm{orb},3\mathrm{D}}\) diverges.
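For reference, the closed 2D expression translates directly into code (a sketch; the helper name is ours, real code should call grvx_v_scrcl()):

```c
#include <math.h>

/* Velocity on a small circle of radius r around a single planet for the 2D
 * potential, v_orb = sqrt(-f_d(r) / |cos r|) * sin(r), which collapses to
 * sqrt((1 + cos r) / |cos r|) for f_2D(r) = -1 / (1 - cos r).
 * Illustrative only -- the library exposes this via grvx_v_scrcl(). */
static double v_orbit_2d(double r)
{
    return sqrt((1.0 + cos(r)) / fabs(cos(r)));
}
```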
We use a Strang splitting \(\phi_t = \phi_{t/2}^{[1]} \circ \phi_t^{[2]} \circ \phi_{t/2}^{[1]}\) to integrate the Hamiltonian system \(H = H^{[1]} + H^{[2]}\) with
\begin{align*} H^{[1]} &= \frac{p^2}{2m} + \lambda g(\vec{q}) \\ H^{[2]} &= V(\vec{q}) + \lambda g(\vec{q}) \,. \end{align*}
Without loss of generality, we choose the scale of a time-step and the coupling constant of the potential such that we can set \(m=1\) during simulation, i.e.,
\begin{align*} \frac{t}{m} &\mapsto t , \\ Vm &\mapsto V . \end{align*}
By using this convention, the integration steps read:
\begin{align*} \begin{pmatrix} \vec q \\ \vec p \end{pmatrix} &\overset{\phi_t^{[1]}}{\longleftarrow} \begin{pmatrix} \frac{1}{p} \, \vec p \, \sin pt + \vec q \, \cos pt \\ -p \, \vec q \, \sin pt + \vec p \, \cos pt \end{pmatrix} = \begin{pmatrix} \vec q \\ \vec p \end{pmatrix} + \begin{pmatrix} \vec p \, t \operatorname{sinc} pt - 2 \vec q \sin^2 pt/2 \\ -\vec q p^2 \, t \operatorname{sinc} pt - 2 \vec p \sin^2 pt/2 \end{pmatrix} , \\ \begin{pmatrix} \vec q \\ \vec p \end{pmatrix} &\overset{\phi_t^{[2]}}{\longleftarrow} \begin{pmatrix} \vec q \\ \vec p \end{pmatrix} + \begin{pmatrix} 0 \\ \vec a t \end{pmatrix} \end{align*}
with \(\vec a = (\vec q \cdot \vec\nabla_{\!q} V(\vec q)) \, \vec q - \vec\nabla_{\!q} V(\vec q)\) and \(\operatorname{sinc}(\bullet) = (\sin \bullet) / \bullet\). (\(\phi_t^{[1]}\) propagates the missile on a great circle as if no gravitational forces were present and \(\phi_t^{[2]}\) corrects the momentum.)
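A bare-bones sketch of one Strang step on \((\vec q, \vec p)\) with \(m=1\) could look like this (in C; grad_v is a placeholder callback standing in for the evaluation of \(\vec\nabla_{\!q} V\), the function names are ours, and the compensated Higham summation mentioned below is omitted):

```c
#include <math.h>

typedef void (*grad_v_fn)(const double q[3], double grad[3]);

/* phi^[1]_t: free propagation along a great circle (no gravity). */
static void phi1(double q[3], double p[3], double t)
{
    const double pn = sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
    const double c = cos(pn * t);
    const double s = sin(pn * t);
    const double sc = (pn > 0.0) ? s / pn : t; /* t * sinc(p t) */
    for (int k = 0; k < 3; k++) {
        const double qk = q[k], pk = p[k];
        q[k] = qk * c + pk * sc;
        p[k] = pk * c - qk * pn * s;
    }
}

/* phi^[2]_t: momentum kick a = (q . grad V) q - grad V. */
static void phi2(double q[3], double p[3], double t, grad_v_fn grad_v)
{
    double g[3];
    grad_v(q, g);
    const double qg = q[0] * g[0] + q[1] * g[1] + q[2] * g[2];
    for (int k = 0; k < 3; k++)
        p[k] += (qg * q[k] - g[k]) * t;
}

/* One Strang step phi_t = phi_{t/2}^[1] o phi_t^[2] o phi_{t/2}^[1]. */
static void strang_step(double q[3], double p[3], double t, grad_v_fn grad_v)
{
    phi1(q, p, 0.5 * t);
    phi2(q, p, t, grad_v);
    phi1(q, p, 0.5 * t);
}
```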
Alternatively, the phase space can be parametrized with \((\hat q = \vec q, \hat p = \vec p / p, p)\) where \(\hat\bullet\) are unit vectors and \(p\) is the momentum magnitude:
\begin{align*} \begin{pmatrix} \hat q \\ \hat p \\ p \end{pmatrix} &\overset{\phi_t^{[1]}}{\longleftarrow} \begin{pmatrix} +\hat q \, \cos pt + \hat p \, \sin pt \\ -\hat q \, \sin pt + \hat p \, \cos pt \\ p \end{pmatrix} = \begin{pmatrix} \hat q \\ \hat p \\ p \end{pmatrix} + \begin{pmatrix} -2 \hat q \sin^2 pt/2 + \hat p \sin pt \\ -2 \hat p \sin^2 pt/2 - \hat q \sin pt \\ 0 \end{pmatrix} , \\ \begin{pmatrix} \hat q \\ \hat p \\ p \end{pmatrix} &\overset{\phi_t^{[2]}}{\longleftarrow} \begin{pmatrix} \hat q \\ \left| p \hat p + \vec a t \right|^{-1} \, \left( p \hat p + \vec a t \right) \\ |p\hat p + \vec a t| \end{pmatrix} , \end{align*}
Here, the update step for \(\hat p\) can become error-prone if \(\left| p \hat p + \vec a t \right| \ll 1\). This is the case if either \(p \hat p \approx -\vec a t\) but \(p \gg 0\), or if \(\mathcal{O}(p) = \mathcal{O}(|\vec a t |) \ll 1\). In both cases, keeping the old value of \(\hat p\) but setting \(p = 0\) gives a reasonable approximation and, in the latter case, pins the missile onto a fixed point. However, we prefer the first formulation of \(\phi_t^{[1]}\) and \(\phi_t^{[2]}\) operating on \((\vec p, \vec q)\) since it lends itself to an efficient numerical implementation using Higham's summation scheme.
The Strang splitting \(\phi_t = \phi_{t/2}^{[1]} \circ \phi_t^{[2]} \circ \phi_{t/2}^{[1]}\) is a second-order, symmetric, and symplectic integration scheme. We offer several symmetric composition schemes during compilation, each raising the integration order by evaluating \(\phi_t\) in multiple stages, and refer to them as pXsY, where X and Y are the new integration order and the number of stages, respectively:

p2s1
: Vanilla Strang splitting.

p4s3
: Triple Jump. See DOI:10.1103/PhysRevLett.63.9 or related publications for more information.

p4s5
: Suzuki's Fractal. See DOI:10.1016/0375-9601(90)90962-N for more information.

p6s9
: Kahan & Li (1997). See DOI:10.1090/S0025-5718-97-00873-9 for more information.

p8s15
: Suzuki & Umeno (1993). See DOI:10.1007/978-3-642-78448-4_7 for more information.

Since the Strang splitting is symplectic, its compositions are symplectic as well.
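As an example of how such a composition works, the p4s3 Triple Jump chains three Strang steps with step fractions \(\gamma_1 = \gamma_3 = 1/(2 - 2^{1/3})\) and \(\gamma_2 = 1 - 2\gamma_1\) (the standard coefficients from the cited literature). The sketch below reuses strang_step() from the example above and is, again, not part of the library API:

```c
#include <math.h>

/* Fourth-order "Triple Jump" (p4s3): three Strang steps with fractions
 * gamma1 = gamma3 = 1 / (2 - 2^(1/3)) and gamma2 = 1 - 2 * gamma1 < 0.
 * Coefficients from the cited literature; illustrative sketch only. */
static void triple_jump_step(double q[3], double p[3], double t,
                             grad_v_fn grad_v)
{
    const double g1 = 1.0 / (2.0 - cbrt(2.0));
    const double g2 = 1.0 - 2.0 * g1;
    strang_step(q, p, g1 * t, grad_v);
    strang_step(q, p, g2 * t, grad_v);
    strang_step(q, p, g1 * t, grad_v);
}
```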
Asymptotically, increasing the integration order will eventually outperform decreasing the time-step size. However, for practical applications and finite integration times, one should always carefully benchmark simulation speed and accuracy w.r.t. time-step size and integration order.
Our API is designed asymmetrically w.r.t. coordinate systems. Internally, our algorithms operate on Cartesian coordinates whereas the initialization of planets and missiles has to be given in spherical coordinates. Doing so takes the burden off the caller to correctly embed vectors onto the manifold and makes the physical interpretation of parameters easier.
A point on the surface of a unit sphere has two degrees of freedom which we refer to as latitude, \(\phi\), and longitude, \(\lambda\). In Cartesian coordinates such a point is given by
\[ \vec{x} = \begin{pmatrix} \cos\phi \, \sin\lambda \\ \cos\phi \, \cos\lambda \\ \sin\phi \end{pmatrix} . \]
Similarly, a velocity vector is given by
\[ \vec{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \dot\phi \begin{pmatrix} -\sin\phi \, \sin\lambda \\ -\sin\phi \, \cos\lambda \\ \cos\phi \end{pmatrix} + \dot\lambda \, \cos \phi \begin{pmatrix} \phantom{-}\cos\lambda \\ -\sin\lambda \\ 0 \end{pmatrix} , \]
and is always orthogonal to \(\vec x\) at the same latitude and longitude. We define \(\dot\phi\) and \(\dot\lambda \, \cos\phi\) as the latitudinal velocity, grvx_vlat(), and the (scaled) longitudinal velocity, grvx_vlon(), respectively. Adding both in quadrature yields the magnitude of the Cartesian velocity,
\[ v = \sqrt{\dot\phi^2 + (\dot\lambda \, \cos \phi)^2} \,. \]
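A sketch of this embedding (in C; the function name is ours, the actual initialization happens through the library API):

```c
#include <math.h>

/* Embed latitude phi, longitude lambda, the latitudinal velocity vlat and
 * the (scaled) longitudinal velocity vlon into Cartesian position x and
 * velocity v, following the two equations above.  Illustrative sketch. */
static void embed(double phi, double lambda, double vlat, double vlon,
                  double x[3], double v[3])
{
    x[0] = cos(phi) * sin(lambda);
    x[1] = cos(phi) * cos(lambda);
    x[2] = sin(phi);

    v[0] = vlat * (-sin(phi) * sin(lambda)) + vlon * cos(lambda);
    v[1] = vlat * (-sin(phi) * cos(lambda)) + vlon * (-sin(lambda));
    v[2] = vlat * cos(phi);
}
```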
Typically, missiles are launched at the rim of a planet, where rim refers to a small circle centered at the given planet. The velocity vector of a missile launched at a distance \(\sigma\) to \((\phi, \lambda)\) is given by
\[ \vec v = v \, \boldsymbol{R}_{\phi, \lambda} \begin{pmatrix} \cos \sigma \, \sin \psi \\ \cos \sigma \, \cos \psi \\ -\sin \sigma \end{pmatrix}, \]
where the rotation matrix \(\boldsymbol{R}_{\phi, \lambda}\) rotates \(\vec{\mathrm{e}}_z = (0, 0, 1)^\top\) to latitude \(\phi\) and longitude \(\lambda\),
\[ \boldsymbol{R}_{\phi, \lambda} = \begin{pmatrix} -\cos\lambda & -\sin\phi \, \sin\lambda & \cos\phi \, \sin\lambda \\ \phantom{-}\sin\lambda & -\sin\phi \, \cos\lambda & \cos\phi \, \cos\lambda \\ 0 & \cos\phi & \sin\phi \end{pmatrix} \]
and \(\psi\) is the initial longitude prior to the rotation. This velocity vector is located at a point with latitude \(\phi' \neq \phi\) and longitude \(\lambda' \neq \lambda\),
\begin{align*} \phi' &= \arcsin x_3 \,, \\ \lambda' &= \operatorname{atan2}(x_1, x_2) \,, \end{align*}
where \((x_1, x_2, x_3)^\top\) is given by
\[ \vec x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \boldsymbol{R}_{\phi, \lambda} \begin{pmatrix} \sin \sigma \, \sin \psi \\ \sin \sigma \, \cos \psi \\ \cos \sigma \end{pmatrix}. \]
Hence, an equivalent parametrization of \(\vec v\) is
\[ \vec{v} = \dot\phi \begin{pmatrix} -\sin\phi' \, \sin\lambda' \\ -\sin\phi' \, \cos\lambda' \\ \cos\phi' \end{pmatrix} + \dot\lambda \, \cos\phi' \begin{pmatrix} \phantom{-}\cos\lambda' \\ -\sin\lambda' \\ 0 \end{pmatrix}. \]
This parametrization allows for a straightforward extraction of the latitudinal and (scaled) longitudinal velocities:
\[ \dot\phi = \vec v \cdot \begin{pmatrix} -\sin\phi' \, \sin\lambda' \\ -\sin\phi' \, \cos\lambda' \\ \cos\phi' \end{pmatrix}, \]
\[ \dot\lambda \, \cos\phi' = \vec v \cdot \begin{pmatrix} \phantom{-}\cos\lambda' \\ -\sin\lambda' \\ 0 \end{pmatrix}. \]
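Putting the last few equations together, a launch and the subsequent extraction of the velocity components could be sketched as follows (in C; names and signature are ours and do not mirror the library API):

```c
#include <math.h>

/* Launch a missile with speed v0 at distance sigma and azimuth psi from a
 * planet at (phi, lambda): build R_{phi,lambda}, rotate position and
 * velocity, and extract (phi', lambda') as well as the latitudinal and
 * (scaled) longitudinal velocities.  Illustrative sketch only. */
static void launch(double phi, double lambda, double sigma, double psi,
                   double v0, double *vlat, double *vlon)
{
    const double R[3][3] = {
        { -cos(lambda), -sin(phi) * sin(lambda), cos(phi) * sin(lambda) },
        {  sin(lambda), -sin(phi) * cos(lambda), cos(phi) * cos(lambda) },
        {  0.0,          cos(phi),               sin(phi)               },
    };
    const double u[3] = { sin(sigma) * sin(psi),   /* position, pre-rotation */
                          sin(sigma) * cos(psi),
                          cos(sigma) };
    const double w[3] = { cos(sigma) * sin(psi),   /* velocity, pre-rotation */
                          cos(sigma) * cos(psi),
                         -sin(sigma) };
    double x[3], v[3];

    for (int r = 0; r < 3; r++) {
        x[r] = R[r][0] * u[0] + R[r][1] * u[1] + R[r][2] * u[2];
        v[r] = v0 * (R[r][0] * w[0] + R[r][1] * w[1] + R[r][2] * w[2]);
    }

    const double phi2 = asin(x[2]);           /* phi'    */
    const double lam2 = atan2(x[0], x[1]);    /* lambda' */

    *vlat = v[0] * (-sin(phi2) * sin(lam2))
          + v[1] * (-sin(phi2) * cos(lam2))
          + v[2] * cos(phi2);
    *vlon = v[0] * cos(lam2) - v[1] * sin(lam2);
}
```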
Let's assume that an event happening somewhere in the universe is observed from a planet at \((\phi_\mathrm{p}, \lambda_\mathrm{p})\) with a finite angular resolution \(\delta \alpha\). Hence, the exact position \(\alpha\) of the observation \(\vec{x}_\mathrm{obs}\) on a small circle with radius \(\sigma\) around the planet is unknown:
\[ \vec{x}_\mathrm{obs} = \boldsymbol{R}_{\phi_\mathrm{p}, \lambda_\mathrm{p}} \begin{pmatrix} \sin \sigma \, \sin (\alpha + \delta \alpha) \\ \sin \sigma \, \cos (\alpha + \delta \alpha) \\ \cos \sigma \end{pmatrix}, \]
with \(\sigma = \arccos( \vec{x}_\mathrm{obs} \cdot \vec{x}_\mathrm{p})\). Consequently, the absolute error between the perturbed measurement \(\vec{x}_\mathrm{obs} = \vec{x}_\mathrm{obs}(\alpha + \delta \alpha)\) and the ground-truth event \(\vec{x}_\mathrm{gt} = \vec{x}_\mathrm{obs}(\alpha + 0)\),
\[ \left| \vec{x}_\mathrm{obs} - \vec{x}_\mathrm{gt} \right| = 2 \sin \left( \frac{\delta \alpha}{2} \right) \, \sin \sigma, \]
scales linearly with \(\sin \sigma = \sqrt{ 1 - ( \vec{x}_\mathrm{obs} \cdot \vec{x}_\mathrm{p} )^2}\). We implement this kind of perturbation and expose it via grvx_perturb_measurement().
NB: The mapping between the position on a small circle \(\alpha\) and the measurement \((\phi_\mathrm{obs}, \lambda_\mathrm{obs})\) is given by \(\alpha = \operatorname{atan2}(y_1, y_2)\) with
\begin{align*} y_1 &= \cos \phi_\mathrm{obs} \, \sin(\lambda_\mathrm{p} - \lambda_\mathrm{obs}) \\ y_2 &= \cos \phi_\mathrm{p} \, \sin \phi_\mathrm{obs} - \sin \phi_\mathrm{p} \, \cos \phi_\mathrm{obs} \, \cos(\lambda_\mathrm{p} - \lambda_\mathrm{obs}) \end{align*}
where we have defined \(\vec{y} = (y_1, y_2, y_3)^\top\) as
\[ \vec y \equiv \boldsymbol{R}^{-1}_{\phi_\mathrm{p}, \lambda_\mathrm{p}} \vec{x}_\mathrm{obs} = \boldsymbol{R}^\top_{\phi_\mathrm{p}, \lambda_\mathrm{p}} \vec{x}_\mathrm{obs} \,. \]
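For reference, this mapping in code (a sketch; the helper name is ours, and the library applies the perturbation itself via grvx_perturb_measurement()):

```c
#include <math.h>

/* Position angle alpha of a measurement (phi_obs, lambda_obs) on the small
 * circle around a planet at (phi_p, lambda_p), alpha = atan2(y1, y2), with
 * y1 and y2 as defined above.  Illustrative sketch only. */
static double small_circle_angle(double phi_p, double lambda_p,
                                 double phi_obs, double lambda_obs)
{
    const double dl = lambda_p - lambda_obs;
    const double y1 = cos(phi_obs) * sin(dl);
    const double y2 = cos(phi_p) * sin(phi_obs)
                    - sin(phi_p) * cos(phi_obs) * cos(dl);
    return atan2(y1, y2);
}
```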
TL;DR: We don't expose an implementation of the adjoint method (yet).
The adjoint method is an effective tool to find the gradient of a scalar loss \(F\) w.r.t. a set of parameters of a differential equation. (Have a look at Bradley's tutorial about the adjoint method, "PDE-constrained optimization and the adjoint method" (2010), to learn more!) In our case, \(F\) could be the integrated error between the actual and a predicted trajectory, where the latter is the result of an exact simulation based on spurious planet positions \(\vec{y}_i\),
\[ F(\vec q) = \int\limits_0^T \!\mathrm{d}t \, f(\vec q, t) = \int\limits_0^T \!\mathrm{d}t \, \sum\limits_{k=1}^M \delta(t-t_k) \, \varepsilon_k(\vec q(t)), \]
where \(\varepsilon_k\) with \(k = 1, \ldots, M\) is the residual of the \(k\)-th measurement at time \(t = t_k\).
The adjoint method gives a recipe to find the trajectory of a new variable \(\vec a \in \mathbb{R}^6\), referred to as "the adjoint", that can be used to find the exact gradient
\[ \vec\nabla_{y_i}F = \int\limits_T^0 \!\mathrm{d}t \, \left( \frac{\partial \vec h}{\partial \vec{y}_i} \right)^\top \vec{a}(t) \,, \]
where \(\vec h\) summarizes the dynamics of our Hamiltonian system:
\[ \vec h(\vec q, \vec p, \vec y_1, \ldots, \vec y_n) = \begin{pmatrix} \vec p \\ -\vec\nabla_{\!q} V - \vec q \left( p^2 - \vec q \cdot \vec\nabla_{\!q} V \right) \end{pmatrix}. \]
The gradient \(\vec\nabla_{y_i}F\) can be used in an optimization framework, such as gradient descent, to infer the genuine planet configuration starting only from an initial guess. Clearly, the same method can be used to solve similar problems, e.g., finding optimal initial conditions for a missile's launch to hit a certain target.
The adjoint \(\vec{a}(t)\) is itself the solution of an ODE:
\[ \dot{\vec{a}}(t) = -\left( \frac{\partial \vec h}{\partial (\vec q, \vec p)} \right)^{\!\top} \vec a(t) + \left( \frac{\partial f}{\partial (\vec q, \vec p)} \right)^{\!\top} \]
with the initial condition \(\vec{a}(T) = 0\).
Note that solving this new ODE and the integral used for obtaining the gradient requires the adjoint to be solved backwards in time. Luckily, our solver for finding \((\vec q, \vec p)\) is symmetric such that we can avoid memory-intensive caching (aka checkpointing) by simply simulating a missile with inverted momentum!
The three Jacobians needed for finding the gradients are
\begin{align*} \left( \frac{\partial f}{\partial (\vec q, \vec p)} \right)^{\!\top} &= \sum_{k=1}^M \begin{pmatrix} \delta(t-t_k) \, \vec\nabla_{\!q} \varepsilon_k(\vec q(t)) \\ \boldsymbol{0} \end{pmatrix}, \\ \left( \frac{\partial \vec h}{\partial (\vec q, \vec p)} \right)^{\!\top} &= \begin{pmatrix} \boldsymbol{0} & -\vec\nabla_{\!q} \otimes \vec\nabla_{\!q} V - \boldsymbol{1} \left( p^2 - \vec q \cdot \vec\nabla_{\!q} V \right) + \vec q \otimes \vec\nabla V + \left( \vec q \otimes \vec q \right) \left( \vec\nabla_{\!q} \otimes \vec\nabla_{\!q} V \right) \\ \boldsymbol{1} & -2 \, \vec q \otimes \vec p \end{pmatrix} \\ &= \begin{pmatrix} \boldsymbol{0} & -\boldsymbol{1} p^2 \\ \boldsymbol{1} & -2 \, \vec q \otimes \vec p \end{pmatrix} - \sum_{i=1}^n \begin{pmatrix} \boldsymbol{0} & f_d(\sigma_i) \cos \sigma_i + \vec{\rho}_{i,d} \otimes \vec{y}_i \\ \boldsymbol{0} & \boldsymbol{0} \end{pmatrix}, \\ \left( \frac{\partial \vec h}{\partial \vec{y}_i} \right)^{\!\top} &= \begin{pmatrix} \boldsymbol{0} \\ -\vec\nabla_{\!q} \otimes \vec\nabla_{\!y_i} V + \left( \vec q \otimes \vec q \right) \left( \vec\nabla_{\!q} \otimes \vec\nabla_{\!y_i} V \right) \end{pmatrix}^{\!\top} \\ &= \begin{pmatrix} \boldsymbol{0} \\ f_d(\sigma_i) - \vec{\rho}_{i,d} \otimes \vec q \end{pmatrix}^{\!\top}, \end{align*}
where we introduced
\[ \vec{\rho}_{i,d} = f_d(\sigma_i) \, \vec q + f'_d(\sigma_i) \, \underbrace{\frac{\vec{y}_i - \vec q \cos \sigma_i}{\sin \sigma_i}}_{\vec{q}_{\!\perp}} \]
and already inserted the results of the Jacobians
\begin{align*} \frac{\partial}{\partial \vec q} \left( \vec\nabla_{\!q} V \right) &\equiv \vec\nabla_{\!q} \otimes \vec\nabla_{\!q} V = \sum_{i=1}^n \frac{f'_d(\sigma_i)}{\sin \sigma_i} \, \vec{y}_i \otimes \vec{y}_i \,, \\ \frac{\partial}{\partial \vec{y}_i} \left( \vec\nabla_{\!q} V \right) &\equiv \vec\nabla_{\!q} \otimes \vec\nabla_{\!y_i} V = -f_d(\sigma_i) + \frac{f'_d(\sigma_i)}{\sin \sigma_i} \vec{y}_i \otimes \vec q \,. \end{align*}
Note that \(\vec{q}_{\!\perp} \perp \vec q\), \(\vec{q}_{\!\perp} \perp (\vec q \times \vec{y}_i)\) and \(q_{\!\perp} = 1\). Numerically stable solutions of \(\vec{q}_{\!\perp} = \vec{q}_{\!\perp}^{(j)}\) are given by \(j = \operatorname{argmax} \vec q\) and
\begin{align*} \vec{q}_{\!\perp}^{(1)} &\sim \begin{pmatrix} q_2 c_2 + q_3 c_3 \\ -q_1 c_2 \\ -q_1 c_3 \end{pmatrix} & \vec{q}_{\!\perp}^{(2)} &\sim \begin{pmatrix} -q_2 c_1 \\ q_1 c_1 + q_3 c_3 \\ -q_2 c_3 \end{pmatrix} & \vec{q}_{\!\perp}^{(3)} &\sim \begin{pmatrix} -q_3 c_1 \\ -q_3 c_2 \\ q_1 c_1 + q_2 c_2 \end{pmatrix} \end{align*}
with \( \vec c = \vec q \times (\vec q \times \vec y_i)\).
We conjecture that the newly introduced ODEs should be symmetrized, e.g., using the (implicit) trapezoidal rule, to stabilize the evaluation for long trajectories. Although implicit in general, the ODE for the gradient
\[ \vec\nabla_{y_i} F = \int\limits_T^0 \!\mathrm{d}t \, \left( f_d(\sigma_i) - \vec q \otimes \vec{\rho}_{i,d} \right) \vec{a}_p \]
becomes explicit by using the Sherman-Morrison formula. However, the same does not hold for the adjoint ODE, which makes the application of the adjoint method numerically costly, in particular for \(d=3\mathrm{D}\) where the derivative \(f'_{3\mathrm{D}}\) has to be found as well.
In this light, approximating the gradient with a finite-difference (FD) scheme appears favorable if the number of parameters, \(3\times n\), is small. We therefore refrain from implementing the adjoint method for the time being and encourage practitioners to implement a parallelized FD scheme on top of our API. Besides the fact that a similar parallelization of the adjoint method is not straightforward to implement, an FD scheme also allows the direct extraction of the gradients w.r.t. two spherical coordinates per planet instead of three Cartesian coordinates.
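A central-difference scheme along these lines could be sketched as follows (in C; loss() stands for a user-supplied forward simulation returning \(F\) for a given parameter vector, and both its name and its signature are placeholders rather than library API):

```c
/* Central finite-difference approximation of dF / dtheta_k for one scalar
 * planet parameter (e.g. a planet's latitude or longitude).  Each parameter
 * requires two independent forward simulations, which can run in parallel.
 * Illustrative sketch with a hypothetical loss() callback. */
static double fd_gradient(double (*loss)(const double *params, int n_params),
                          double *params, int n_params, int k, double h)
{
    const double theta = params[k];

    params[k] = theta + h;
    const double f_plus = loss(params, n_params);

    params[k] = theta - h;
    const double f_minus = loss(params, n_params);

    params[k] = theta; /* restore the original parameter */
    return (f_plus - f_minus) / (2.0 * h);
}
```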
You now have all the information necessary to efficiently deal with our API. Go there to find out how to use it in practice for fun and profit.