Math & AI
This page explains the mathematics used in the visualization and how AI concepts inform its design. No model is trained; we only borrow the ideas.
Probability distributions
From the CMS event data we have particles with position (x, y, z) and momentum
(px, py, pz) at each event_index. We treat event_index as a
discrete time label. To get a probability density we aggregate over many particles (and optionally
over nearby events): we assign a scalar value proportional to the density of particles in a region,
e.g. via a simple kernel or histogram. So at each point in (reduced) space we have a number
ρ ≥ 0 such that (after normalization) it behaves like a probability density.
In the standard mode we use this density directly: we map ρ to
brightness or to a radius (e.g. circle size) so that “more probability” means brighter or larger.
Vector field transformation (VSPD)
In VSPD we convert the scalar probability into a vector field. One simple approach is to take the gradient of the (smoothed) density, so that vectors point in the direction of steepest increase of probability, with magnitude proportional to the rate of change. Another is to use the momentum field: at each cell we average the momenta of the particles in that cell (or nearby) and display the result as a vector. Either way we obtain a field v(x, y) with both direction and magnitude.
In the demo we derive vectors from the data: e.g. from position differences between consecutive event indices (displacement) or from momentum (px, py), then scale and smooth them so the arrows show “probability flow” direction and strength. The exact mapping is chosen so the animation is deterministic and reproducible.
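Both constructions can be sketched in a few lines (a minimal NumPy illustration; the smoothing scheme and grid layout are assumptions, not the demo's exact mapping):

```python
import numpy as np

def smooth(rho, passes=2):
    """Crude low-pass filter: repeated 5-point neighbor averaging."""
    out = rho.astype(float)
    for _ in range(passes):
        p = np.pad(out, 1, mode="edge")
        out = (p[:-2, 1:-1] + p[2:, 1:-1] +
               p[1:-1, :-2] + p[1:-1, 2:] + p[1:-1, 1:-1]) / 5.0
    return out

def gradient_field(rho):
    """Vectors pointing toward the steepest increase of the smoothed density."""
    gy, gx = np.gradient(smooth(rho))  # np.gradient returns (d/drow, d/dcol)
    return gx, gy

def momentum_field(px, py, ix, iy, shape):
    """Average momentum per cell: particle i with momentum (px[i], py[i])
    falls into grid cell (iy[i], ix[i])."""
    vx, vy, n = np.zeros(shape), np.zeros(shape), np.zeros(shape)
    np.add.at(vx, (iy, ix), px)
    np.add.at(vy, (iy, ix), py)
    np.add.at(n, (iy, ix), 1)
    n[n == 0] = 1  # avoid division by zero in empty cells
    return vx / n, vy / n
```

The smoothing pass is what keeps the arrows stable from frame to frame; without it the gradient amplifies histogram noise.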
Discrete time evolution (Δt)
Time is represented by event_index. The step between two consecutive event indices is our discrete Δt. We do not use a continuous differential equation in the browser; we advance by one event index at a time. So:
- Particle positions (or densities) at step n are updated to step n + 1 using the data at event_index = n + 1.
- Vector fields are recomputed at each step from the data at that step (and optionally smoothed in space or over a short window of event indices).
This keeps the visualization deterministic: same data and same step index always give the same frame.
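The update rule can be written as a pure function of the step index (a minimal sketch; the events dictionary and (x, y) tuple layout are hypothetical, chosen only to illustrate the determinism):

```python
def frame(events, n):
    """Build frame n purely from the data at event_index n (and n + 1 for
    displacement vectors). Same events and same n always give the same frame.
    Hypothetical layout: events maps an event_index to a list of (x, y) tuples.
    """
    pts = events[n]
    nxt = events.get(n + 1, pts)  # last step: zero displacement
    vectors = [(bx - ax, by - ay) for (ax, ay), (bx, by) in zip(pts, nxt)]
    return pts, vectors

events = {0: [(0.0, 0.0), (1.0, 1.0)],
          1: [(0.5, 0.0), (1.0, 2.0)]}
pts, vecs = frame(events, 0)  # vecs == [(0.5, 0.0), (0.0, 1.0)]
```

Because `frame` reads only the data for the requested step, replaying the animation (or seeking to an arbitrary step) reproduces the exact same pixels.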
AI concepts used (conceptually)
We do not train any model here. The following AI/ML ideas are used only as concepts to design the pipeline:
- Dimensionality reduction — The raw data are 3D position + 3D momentum (and energy). For 2D visualization we project onto (x, y) or (px, py). We “reduce” dimensions to what the screen can show, similar in spirit to PCA or t-SNE reducing high-dimensional data to 2D for visualization.
- Vector smoothing — To avoid noisy arrows we smooth the vector field (e.g. local averaging or low-pass filter). This is analogous to regularization in ML: we trade a bit of fidelity for stability and interpretability.
- Pattern recognition — By visualizing both scalar and vector views we make it easier for the eye to recognize patterns (e.g. jets, flows, clusters). The idea is the same as in ML: represent data in a way that highlights structure; here the “model” is just the deterministic mapping from CERN data to pixels and arrows.
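The projection in the first bullet amounts to selecting columns (a minimal sketch; the row layout (x, y, z, px, py, pz) is an assumption about the data, not the actual CMS record format):

```python
import numpy as np

# Each row: (x, y, z, px, py, pz) (assumed layout for illustration).
particles = np.array([
    [0.1,  0.2, 1.5,  0.9, -0.3, 2.0],
    [0.4, -0.1, 0.8, -0.5,  0.7, 1.1],
])

xy = particles[:, 0:2]    # spatial view for the 2D screen
pxpy = particles[:, 3:5]  # momentum view
```

Unlike PCA or t-SNE, this projection is fixed and invertible-in-spirit: the same columns are dropped every frame, which preserves the determinism described above.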