Population-Based Modeling#
This page explains the isochrone-based population modeling approach used in brutus for fitting stellar clusters and coeval populations. For individual field star fitting, see Grid-Based Fitting.
Overview#
Stellar clusters offer powerful constraints because all member stars share the same:
Age (coeval formation)
Metallicity (common birth cloud)
Distance (spatial coherence)
Extinction (similar sightline)
brutus exploits these constraints by fitting a single isochrone to all cluster members simultaneously, rather than fitting each star independently. This dramatically reduces degeneracies and improves parameter precision.
The key challenge is field contamination: not all stars in the field of view are cluster members. brutus handles this using a mixture model that probabilistically separates cluster members from field interlopers.
For full details on the cluster model, see Speagle et al. (2025) Appendix D.
The Likelihood Function#
Population Parameters#
The population is described by six parameters:
where:
[Fe/H]: Metallicity (dex)
log(age): Logarithm of age in years (e.g., 9.0 = 1 Gyr)
A_V: V-band extinction (magnitudes)
R_V: Extinction curve shape parameter
d: Distance (parsecs)
f_field: Field contamination fraction (0 to 1)
Cluster Likelihood#
For each star, the cluster membership likelihood compares observed photometry to isochrone predictions:
The photometric likelihood uses a chi-square formulation:
The parallax likelihood (if available) constrains distance:
Outlier Likelihood#
Field contaminants are modeled with an adaptive outlier distribution. By default, brutus uses a chi-square outlier model that assigns probability based on how poorly the data fits any reasonable model:
where \(\chi^2_{\rm max}\) is the chi-square value at a cumulative probability threshold (default 99.9%), and \(k_i\) is the number of photometric bands plus parallax if available. This adaptive threshold is more conservative than a uniform outlier model, retaining borderline members.
Mixture Model#
The mixture model combines cluster and outlier probabilities at each grid point:
where the weights are:
\(w_c = P_{\rm mem} \times (1 - f_{\rm field})\) — probability of being a true cluster member
\(w_o = 1 - w_c\) — probability of being a field star or outlier
Here \(P_{\rm mem}\) is the external membership probability (e.g., from proper motion analysis) and \(f_{\rm field}\) is the fitted field fraction.
Marginalization#
After applying the mixture model, brutus marginalizes over stellar parameters (mass \(M\) and secondary mass fraction \(q\)):
This integral is computed numerically over a grid of (EEP, SMF) points, with Jacobian corrections for the non-uniform mass spacing along the isochrone.
Total Likelihood#
The total population likelihood is the product over all stars:
Binary Star Modeling#
brutus includes binary stars through the secondary mass fraction (SMF or \(q\)) parameter:
where \(q = 0\) is a single star and \(q = 1\) is an equal-mass binary.
Binary photometry is computed by adding the fluxes of both components:
The default SMF grid uses 21 uniformly-spaced values from 0.0 to 1.0.
Note
Binary modeling is restricted to main-sequence stars (EEP < 480) to avoid unphysical configurations like two red giants in a close binary.
Basic Usage#
The isochrone_population_loglike() function computes the log-likelihood for a set of population parameters:
import numpy as np
from brutus.core import Isochrone, StellarPop
from brutus.analysis import isochrone_population_loglike
# Set up population model with specific filters
filters = ['Gaia_G_MAW', 'Gaia_BP_MAWf', 'Gaia_RP_MAW',
'2MASS_J', '2MASS_H', '2MASS_Ks']
iso = Isochrone()
pop = StellarPop(iso, filters=filters)
# Example observed data (N_stars=100, N_filters=6)
# Flux densities in units of 10**(-0.4 * mag)
flux = np.random.rand(100, 6) * 1e-3 # Replace with real data
flux_err = flux * 0.02 # 2% errors
parallax = np.full(100, 2.0) # 2 mas (500 pc)
parallax_err = np.full(100, 0.1) # 0.1 mas errors
# Population parameters: [feh, loga, av, rv, dist, field_frac]
theta = np.array([0.0, 9.0, 0.3, 3.3, 500.0, 0.05])
# Compute log-likelihood
lnl = isochrone_population_loglike(
theta,
stellarpop=pop,
obs_phot=flux,
obs_err=flux_err,
parallax=parallax,
parallax_err=parallax_err,
)
print(f"Log-likelihood: {lnl:.2f}")
Using with Samplers#
The likelihood function is designed for use with external MCMC or optimization codes.
Optimization (Point Estimate)#
import numpy as np
from scipy.optimize import minimize
from brutus.core import Isochrone, StellarPop
from brutus.analysis import isochrone_population_loglike
# Set up model (as above)
filters = ['Gaia_G_MAW', 'Gaia_BP_MAWf', 'Gaia_RP_MAW',
'2MASS_J', '2MASS_H', '2MASS_Ks']
iso = Isochrone()
pop = StellarPop(iso, filters=filters)
# Your observed data
# flux, flux_err, parallax, parallax_err = load_your_data()
def neg_lnlike(theta):
return -isochrone_population_loglike(
theta, pop, flux, flux_err,
parallax=parallax,
parallax_err=parallax_err
)
# Initial guess: [feh, loga, av, rv, dist, field_frac]
theta0 = np.array([0.0, 9.0, 0.3, 3.3, 1000.0, 0.05])
result = minimize(neg_lnlike, theta0, method='Nelder-Mead')
print(f"Best-fit parameters: {result.x}")
MCMC Sampling (Full Posterior)#
import numpy as np
import emcee
from brutus.core import Isochrone, StellarPop
from brutus.analysis import isochrone_population_loglike
# Set up model
filters = ['Gaia_G_MAW', 'Gaia_BP_MAWf', 'Gaia_RP_MAW',
'2MASS_J', '2MASS_H', '2MASS_Ks']
iso = Isochrone()
pop = StellarPop(iso, filters=filters)
# Your observed data
# flux, flux_err, parallax, parallax_err = load_your_data()
def lnprior(theta):
feh, loga, av, rv, dist, f_field = theta
# Uniform priors with bounds
if not (-2.5 < feh < 0.5):
return -np.inf
if not (6.0 < loga < 10.5):
return -np.inf
if not (0.0 <= av < 5.0):
return -np.inf
if not (2.0 < rv < 5.0):
return -np.inf
if not (100 < dist < 10000):
return -np.inf
if not (0.0 <= f_field < 0.5):
return -np.inf
return 0.0
def lnprob(theta):
lp = lnprior(theta)
if not np.isfinite(lp):
return -np.inf
return lp + isochrone_population_loglike(
theta, pop, flux, flux_err,
parallax=parallax,
parallax_err=parallax_err
)
# Initialize walkers (use many more walkers than parameters)
ndim = 6
nwalkers = 128 # At least 2*ndim, but more is better
theta0 = np.array([0.0, 9.0, 0.3, 3.3, 1000.0, 0.05])
p0 = theta0 + 1e-3 * np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob)
sampler.run_mcmc(p0, 5000, progress=True)
# Extract samples (discard burn-in)
samples = sampler.get_chain(discard=1000, thin=10, flat=True)
Nested Sampling#
import numpy as np
import dynesty
from brutus.core import Isochrone, StellarPop
from brutus.analysis import isochrone_population_loglike
# Set up model
filters = ['Gaia_G_MAW', 'Gaia_BP_MAWf', 'Gaia_RP_MAW',
'2MASS_J', '2MASS_H', '2MASS_Ks']
iso = Isochrone()
pop = StellarPop(iso, filters=filters)
# Your observed data
# flux, flux_err, parallax, parallax_err = load_your_data()
def prior_transform(u):
"""Transform unit cube to parameter space."""
theta = np.zeros(6)
theta[0] = -2.5 + 3.0 * u[0] # feh: [-2.5, 0.5]
theta[1] = 6.0 + 4.5 * u[1] # loga: [6.0, 10.5]
theta[2] = 5.0 * u[2] # av: [0, 5]
theta[3] = 2.0 + 3.0 * u[3] # rv: [2, 5]
theta[4] = 100 + 9900 * u[4] # dist: [100, 10000] pc
theta[5] = 0.5 * u[5] # f_field: [0, 0.5]
return theta
def lnlike(theta):
return isochrone_population_loglike(
theta, pop, flux, flux_err,
parallax=parallax,
parallax_err=parallax_err
)
sampler = dynesty.NestedSampler(lnlike, prior_transform, ndim=6)
sampler.run_nested()
results = sampler.results
Per-Object Membership Probabilities#
The cluster_prob parameter specifies the prior probability that each star is a cluster member (before considering photometry). This can be:
A scalar (same for all stars):
cluster_prob=0.95A per-object array from external analysis (e.g., proper motions)
# From proper motion / radial velocity analysis
# membership_prob = compute_kinematic_membership(proper_motions, radial_velocities)
membership_prob = np.random.uniform(0.8, 1.0, size=100) # Example
lnl = isochrone_population_loglike(
theta, pop, flux, flux_err,
cluster_prob=membership_prob, # shape (N_stars,)
parallax=parallax,
parallax_err=parallax_err,
)
This allows incorporating kinematic membership information while still fitting for additional photometric field contamination via f_field.
Diagnostics#
Use return_components=True to inspect intermediate results:
lnl, components = isochrone_population_loglike(
theta, pop, flux, flux_err,
return_components=True
)
# Available diagnostic outputs
print(f"Total log-likelihood: {components['lnl_total']:.2f}")
# Per-object likelihoods (identify problem stars)
lnl_per_star = components['lnl_per_object'] # shape (N_stars,)
worst_stars = np.argsort(lnl_per_star)[:5]
print(f"Worst-fit stars: {worst_stars}")
# Cluster vs outlier likelihoods (check mixture)
# These have shape (N_grid_points, N_objects)
lnl_cluster = components['lnl_cluster']
lnl_outlier = components['lnl_outlier']
lnl_mixture = components['lnl_mixture']
# The isochrone grid used
grid = components['isochrone_grid']
Performance Considerations#
Timing Benchmarks#
The following benchmarks were measured on a typical workstation (2024) using 3 Gaia bands. All times are per evaluation of the full log-likelihood.
Pipeline stage breakdown (100 stars, default grid):
Stage |
Time (ms) |
Fraction |
|---|---|---|
Grid generation (fixed cost) |
15 |
17% |
Cluster loglike |
50-90 |
54% |
Mixture model |
25 |
15% |
Marginalization |
23 |
14% |
Outlier loglike |
<1 |
<1% |
Total |
~120 |
Scaling with number of stars:
N_stars |
Time per eval (ms) |
Per-star cost (ms) |
|---|---|---|
10 |
40 |
4.0 |
50 |
65 |
1.3 |
100 |
120 |
1.2 |
200 |
200 |
1.0 |
500 |
600 |
1.2 |
Grid generation is a fixed cost (~15 ms), so per-star cost decreases for larger samples. Adding more photometric bands increases cost modestly.
Grid Resolution and Convergence#
The default grid uses 1000 EEP points \(\times\) 21 SMF values. After applying mass bounds and binary constraints (EEP < 480 for binaries), the effective grid size is typically 7,000-10,000 points per evaluation.
EEP convergence (measured as \(\Delta \ln \mathcal{L}\) vs 5000-point reference, 100 stars):
N_EEP |
Grid size |
\(\Delta \ln \mathcal{L}\) |
Time (ms) |
|---|---|---|---|
200 |
1,368 |
-1.4 |
11 |
500 |
3,451 |
-0.4 |
26 |
1000 (default) |
6,916 |
-0.2 |
75 |
2000 |
13,862 |
-0.05 |
170 |
5000 (reference) |
34,581 |
0 |
529 |
EEP convergence is fast: 500 points are within \(|\Delta \ln \mathcal{L}| < 0.5\) of the reference, and 1000 points are essentially converged.
SMF convergence (measured vs 31-point uniform reference, 100 stars):
SMF config |
Grid size |
\(\Delta \ln \mathcal{L}\) |
Time (ms) |
|---|---|---|---|
Singles only (N=1) |
1,934 |
+0.8 |
14 |
7 uniform |
7,046 |
+12.6 |
67 |
15 uniform |
13,862 |
+3.8 |
165 |
21 uniform (default) |
18,974 |
+1.7 |
269 |
31 uniform (reference) |
27,494 |
0 |
370 |
SMF convergence is slower than EEP — at least 15-21 uniformly-spaced points are needed for \(|\Delta \ln \mathcal{L}| < 4\) per 100 stars. Uniform spacing outperforms non-uniform grids at equal point counts.
Custom grids can be specified for faster iteration during development:
# Coarser grid for testing (~5x faster)
coarse_eep = np.linspace(202, 808, 200)
coarse_smf = np.linspace(0, 1, 7)
lnl = isochrone_population_loglike(
theta, pop, flux, flux_err,
eep_grid=coarse_eep,
smf_grid=coarse_smf,
)
Parallelization
The likelihood function itself is not internally parallelized. For MCMC, parallelize at the sampler level:
from multiprocessing import Pool
# Using variables from MCMC example above
with Pool(processes=8) as pool:
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, pool=pool)
sampler.run_mcmc(p0, 5000, progress=True)
Limitations#
Single population: Assumes one coeval population plus field. Multiple populations (e.g., two clusters) require extensions.
No proper motion modeling: Cluster membership from kinematics must be provided externally via
cluster_prob.Binary simplifications: Only photometric binaries; no orbital dynamics or mass transfer.
Extinction uniformity: All cluster members share the same \(A_V\). Differential reddening requires extensions.
Technical Notes#
Mixture-before-marginalization: brutus applies the mixture model (cluster vs outlier) at each grid point before marginalizing over stellar parameters. This is the mathematically correct approach for contaminated populations:
The alternative—marginalizing first, then mixing—can produce biased results because it compares integrated cluster probabilities against point outlier probabilities. See Appendix D of Speagle et al. (2025) for details.
See Also#
Grid-Based Fitting - Individual star fitting with
BruteForceStellar Models and Photometry - MIST models and available filters
Prior Distributions - Prior probability distributions
References#
Speagle et al. (2025), “Deriving Stellar Properties, Distances, and Reddenings using Photometry and Astrometry with BRUTUS”, arXiv:2503.02227 (see Appendix D for cluster model details)