Analysis Module (brutus.analysis)#
The analysis module implements the statistical inference machinery for brutus. It provides high-level fitting functions that combine stellar models with Bayesian inference to estimate parameters from photometric and astrometric data.
Key Components:
Individual Star Fitting:
BruteForceclass performs grid-based Bayesian inference for field starsPopulation Analysis: Functions for fitting coeval stellar populations (clusters) with mixture models
Photometric Offsets: Tools for empirical calibration of systematic photometric errors
Line-of-Sight Dust: Functions for 3-D dust mapping from stellar ensembles
Design Philosophy:
The module implements the complete Bayesian inference workflow:
Model Evaluation: Compute likelihoods over stellar model grids
Prior Application: Incorporate Galactic priors on stellar properties and dust
Optimization: Find best-fit parameters in distance-extinction space
Marginalization: Integrate over nuisance parameters
Sampling: Generate posterior samples for uncertainty quantification
Typical Usage Patterns:
For individual field stars with photometry and parallax:
from brutus.analysis import BruteForce
from brutus.core import StarGrid
from brutus.data import load_models
from brutus.utils import inv_magnitude
import numpy as np
import h5py
# Load pre-computed grid
models, labels, label_mask = load_models('grid_mist_v9.h5')
grid = StarGrid(models, labels)
# Initialize fitter
fitter = BruteForce(grid)
# Observed data (Nstars=1 in this example)
# BruteForce.fit expects linear flux densities ("maggies"), NOT magnitudes.
# Convert observed magnitudes + errors to maggies via inv_magnitude.
mag = np.array([[16.5, 15.2, 14.8, 13.5, 13.1]]) # shape (1, 5)
mag_err = np.array([[0.01, 0.01, 0.02, 0.03, 0.03]])
flux, flux_err = inv_magnitude(mag, mag_err) # maggies = 10**(-0.4*mag)
mask = np.array([[True, True, True, True, True]])
obj_ids = np.array([[1]])
parallax = np.array([2.5]) # mas
parallax_err = np.array([0.1]) # mas
# Fit and save results to HDF5
output_file = fitter.fit(
data=flux, data_err=flux_err, data_mask=mask,
data_labels=obj_ids, save_file='results.h5',
parallax=parallax, parallax_err=parallax_err,
Ndraws=250
)
# Read posterior samples from HDF5
with h5py.File(output_file, 'r') as f:
dist = f['samps_dist'][0] # (Ndraws,) distances in kpc
av = f['samps_red'][0] # (Ndraws,) A_V values
print(f"Distance: {np.median(dist)*1000:.0f} pc")
print(f"Extinction: {np.median(av):.2f} mag")
For stellar clusters with MCMC:
from brutus.analysis import isochrone_population_loglike
from brutus.core import Isochrone, StellarPop
import emcee
import numpy as np
# Initialize population model
iso = Isochrone()
pop = StellarPop(isochrone=iso)
# Observed cluster data (N stars, M filters)
obs_flux = np.array([...]) # shape (N, M)
obs_err = np.array([...]) # shape (N, M)
# Define log-likelihood for MCMC
# theta = [feh, loga, av, rv, dist]
def lnprob(theta):
return isochrone_population_loglike(
theta,
stellarpop=pop,
obs_phot=obs_flux,
obs_err=obs_err,
cluster_prob=0.9 # External membership prior
)
# Run MCMC
ndim, nwalkers = 5, 32
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob)
sampler.run_mcmc(initial_pos, 5000, progress=True)
# Extract posterior samples
samples = sampler.get_chain(discard=1000, thin=10, flat=True)
Key Methodological Features:
Brute-force grid evaluation: Systematically explores parameter space without convergence issues
Multi-stage optimization: Fast magnitude-space → accurate flux-space → Bayesian posterior
Mixture-before-marginalization: Mathematically correct treatment of field contamination in clusters
Importance sampling: Efficient posterior sampling weighted by probability
See Also:
Grid-Based Fitting - Understanding the brute-force fitting algorithm
Population-Based Modeling - Cluster population fitting methodology
Empirical Calibrations - Empirical calibration procedures
Understanding Results - Interpreting posterior distributions
Individual Star Fitting#
- class brutus.analysis.BruteForce(star_grid, verbose=True)[source]#
Bases:
objectBayesian parameter estimation for individual stars using grid-based models.
This class performs brute-force fitting over a pre-computed stellar model grid to estimate stellar parameters, distances, and extinction. It uses the StarGrid infrastructure for model management and applies Bayesian priors for robust inference.
- Parameters:
- models#
Magnitude coefficients from the grid.
- Type:
- models_labels#
Labels for each model in the grid.
- Type:
structured numpy.ndarray
- labels_mask#
Mask indicating which labels are grid parameters (True) vs predictions (False).
- Type:
See also
brutus.core.StarGridStellar model grid infrastructure
brutus.data.load_modelsLoad pre-computed grids
loglike_gridCompute log-likelihoods
logpost_gridCompute log-posteriors
Notes
The BruteForce fitter uses a two-stage approach:
Likelihood computation (loglike_grid): Optimizes distance and extinction for each grid point to find maximum likelihood
Posterior computation (logpost_grid): Integrates over distance and extinction uncertainty using Monte Carlo, applying priors for Galactic structure, dust maps, and astrometry
The fitter automatically handles: - Grid parameter vs. prediction distinction - Age weighting for proper sampling - Grid spacing corrections - Parallax constraints - Galactic structure priors - Dust map priors
Examples
Basic usage with a pre-loaded grid:
>>> from brutus.core import StarGrid >>> from brutus.analysis.individual import BruteForce >>> from brutus.data import load_models >>> >>> # Load grid >>> models, labels, params = load_models('grid_mist_v9.h5') >>> grid = StarGrid(models, labels, params) >>> >>> # Initialize fitter >>> fitter = BruteForce(grid) >>> >>> # Fit data >>> results = fitter.fit( ... data, data_err, data_mask, ... data_labels, save_file='results.h5', ... parallax=parallax_data, ... data_coords=coordinates ... )
- property nmodels#
Number of models in the grid.
- property nfilters#
Number of filters in the grid.
- get_sed_grid(indices=None, av=None, rv=None, return_flux=False)[source]#
Compute SEDs for multiple grid points simultaneously.
This is the grid-based batch computation method, distinct from StarGrid.get_seds() which handles single star synthesis.
- Parameters:
indices (array-like, optional) – Grid indices to compute SEDs for. If None, uses all models.
av (array-like or float, optional) – A(V) values for each model. If None, defaults to 0.
rv (array-like or float, optional) – R(V) values for each model. If None, defaults to 3.3.
return_flux (bool, optional) – If True, return fluxes instead of magnitudes. Default is False.
- Returns:
seds (numpy.ndarray of shape (Nmodels, Nbands)) – Computed SEDs.
rvecs (numpy.ndarray of shape (Nmodels, Nbands)) – Reddening vectors.
drvecs (numpy.ndarray of shape (Nmodels, Nbands)) – Differential reddening vectors with respect to Rv.
See also
brutus.core.sed_utils._get_sedsUnderlying SED computation
StarGrid.get_sedsSingle star SED generation
Notes
This method performs batch SED computation for multiple grid points simultaneously, which is more efficient than calling StarGrid.get_seds() repeatedly.
The scale factor relates to distance as \(s = 1/d^2\) where d is distance in parsecs. The reddening is applied as:
\[\begin{split}m(\\lambda) = m_0(\\lambda) + A_V \\cdot [r_0(\\lambda) + R_V \\cdot dr(\\lambda)]\end{split}\]where \(r_0\) and \(dr\) are the reddening vector components.
- loglike_grid(data, data_err, data_mask, avlim=(0.0, 20.0), av_gauss=(0.0, 1000000.0), rvlim=(1.0, 8.0), rv_gauss=(3.32, 0.18), av_init=None, rv_init=None, dim_prior=True, ltol=0.03, ltol_subthresh=0.01, init_thresh=0.005, parallax=None, parallax_err=None, return_vals=False, indices=None, **kwargs)[source]#
Compute log-likelihood over the stellar model grid.
This is a wrapper around the module-level loglike_grid function that uses the instance’s model grid.
- Parameters:
data (~numpy.ndarray of shape (Nfilt)) – Measured flux densities.
data_err (~numpy.ndarray of shape (Nfilt)) – Measurement errors.
data_mask (~numpy.ndarray of shape (Nfilt)) – Binary mask for valid data.
indices (array-like, optional) – Subset of model indices to use. If None, uses all models.
avlim (tuple, optional) – (min, max) bounds on A(V). Default is (0.0, 20.0).
av_gauss (tuple, optional) – (mean, std) for Gaussian prior on A(V). Default is (0.0, 1e6).
rvlim (tuple, optional) – (min, max) bounds on R(V). Default is (1.0, 8.0).
rv_gauss (tuple, optional) – (mean, std) for Gaussian prior on R(V). Default is (3.32, 0.18).
parallax (float, optional) – Parallax measurement in mas.
parallax_err (float, optional) – Parallax error in mas.
return_vals (bool, optional) – If True, return full results including covariances. Default is False.
**kwargs – Passed to optimization functions.
- Returns:
If return_vals=False –
- lnlnumpy.ndarray
Log-likelihoods for each grid point
- Ndimint
Number of dimensions (filters)
- chi2numpy.ndarray
Chi-squared values
If return_vals=True – Also includes scale, av, rv, icov_sar arrays
See also
logpost_gridCompute log-posteriors from likelihoods
_optimize_fit_magMagnitude-space optimization
_optimize_fit_fluxFlux-space optimization
Notes
This method optimizes (distance, Av, Rv) for each grid point by:
Initial magnitude-space fit for numerical stability
Iterative flux-space refinement until convergence
Optional parallax constraint during optimization
The optimization uses priors on Av and Rv but not on distance/scale (distance priors are applied in logpost_grid).
- logpost_grid(results, parallax=None, parallax_err=None, coord=None, Nmc_prior=100, lnprior=None, wt_thresh=0.001, cdf_thresh=0.002, max_models=50000, precision_shrinkage=0.0, subsample_mode='representative', lngalprior=None, lndustprior=None, dustfile=None, dlabels=None, avlim=(0.0, 20.0), rvlim=(1.0, 8.0), mem_lim=8000.0, rstate=None, apply_av_prior=True, R_solar=8.2, Z_solar=0.025, **kwargs)[source]#
Compute log-posterior over the stellar model grid.
This is a wrapper around the module-level logpost_grid function.
- Parameters:
results (tuple) – Results from loglike_grid with return_vals=True.
logpost_grid. (Other parameters are passed to)
- Return type:
Results from logpost_grid.
- fit(data, data_err, data_mask, data_labels, save_file, phot_offsets=None, parallax=None, parallax_err=None, Nmc_prior=50, avlim=(0.0, 20.0), av_gauss=None, rvlim=(1.0, 8.0), rv_gauss=(3.32, 0.18), lnprior=None, wt_thresh=0.001, cdf_thresh=0.002, Ndraws=250, apply_agewt=True, apply_grad=True, lngalprior=None, lndustprior=None, dustfile=None, apply_dlabels=True, data_coords=None, logl_dim_prior=True, ltol=0.03, ltol_subthresh=0.01, logl_initthresh=0.005, mag_max=50.0, merr_max=0.25, rstate=None, save_dar_draws=True, running_io=True, mem_lim=8000.0, max_models=50000, precision_shrinkage=0.0, subsample_mode='representative', verbose=True, R_solar=8.2, Z_solar=0.025)[source]#
Fit all input models to the input data to compute log-posteriors.
This is the main interface for fitting stellar parameters using grid-based Bayesian inference. Results are saved to an HDF5 file.
- Parameters:
data (numpy.ndarray of shape (Ndata, Nfilt)) – Observed flux densities for each object.
data_err (numpy.ndarray of shape (Ndata, Nfilt)) – Associated errors on the flux densities.
data_mask (numpy.ndarray of shape (Ndata, Nfilt)) – Binary mask (0/1) indicating whether each measurement is valid.
data_labels (numpy.ndarray of shape (Ndata, Nlabels)) – Labels for each object to be stored in the output file.
save_file (str) – Path to the output HDF5 file. The ‘.h5’ extension will be added if not present.
phot_offsets (numpy.ndarray of shape (Nfilt,), optional) – Multiplicative photometric offsets applied to data and errors.
parallax (numpy.ndarray of shape (Ndata,), optional) – Parallax measurements in mas for each object.
parallax_err (numpy.ndarray of shape (Ndata,), optional) – Errors on parallax measurements. Required if parallax is provided.
Nmc_prior (int, optional) – Number of Monte Carlo samples for prior integration. Default is 50.
avlim (tuple, optional) – (min, max) bounds on A(V). Default is (0.0, 20.0).
av_gauss (tuple, optional) – (mean, std) for Gaussian prior on A(V). If provided, this is used instead of the distance-reddening prior during fitting.
rvlim (tuple, optional) – (min, max) bounds on R(V). Default is (1.0, 8.0).
rv_gauss (tuple, optional) – (mean, std) for Gaussian prior on R(V). Default is (3.32, 0.18) based on Schlafly et al. (2016).
lnprior (numpy.ndarray of shape (Nmodel,), optional) – Log-prior for each model. If not provided, defaults to Kroupa IMF prior on initial mass (for MIST models) or PS1 luminosity function prior (for Bayestar models).
wt_thresh (float, optional) – Threshold wt_thresh * max(weight) for model selection. Default is 1e-3.
cdf_thresh (float, optional) – CDF threshold for model selection (used if wt_thresh is None). Default is 2e-3.
Ndraws (int, optional) – Number of posterior draws to save per object. Default is 250.
apply_agewt (bool, optional) – Whether to apply age weighting from MIST models. Default is True.
apply_grad (bool, optional) – Whether to apply grid spacing corrections. Default is True.
lngalprior (callable, optional) – Galactic structure prior function. If not provided, uses the default Galactic model from Green et al. (2014).
lndustprior (callable, optional) – Dust prior function with signature
f(avs, dustmap, coord, distance=None). If not provided and dustfile is given, useslogp_extinction.dustfile (str or ~brutus.dust.Bayestar, optional) – 3D dust map for extinction priors. Can be a file path (string) to a Bayestar HDF5 file, which will be loaded automatically, or a pre-loaded
Bayestarobject.apply_dlabels (bool, optional) – Whether to pass model labels to Galactic prior. Default is True.
data_coords (numpy.ndarray of shape (Ndata, 2), optional) – Galactic (l, b) coordinates for each object in degrees. Required if using default Galactic prior.
logl_dim_prior (bool, optional) – Whether to apply dimensional correction to log-likelihood. Default is True.
ltol (float, optional) – Convergence tolerance for likelihood optimization. Default is 3e-2.
ltol_subthresh (float, optional) – Sub-threshold for convergence. Default is 1e-2.
logl_initthresh (float, optional) – Initial likelihood threshold for model culling. Default is 5e-3.
mag_max (float, optional) – Maximum allowed magnitude for valid data. Default is 50.0.
merr_max (float, optional) – Maximum allowed magnitude error. Default is 0.25.
rstate (numpy.random.RandomState, optional) – Random state for reproducibility.
save_dar_draws (bool, optional) – Whether to save distance, A(V), and R(V) draws. Default is True.
running_io (bool, optional) – If True, writes results to disk after each object (safer for long runs). If False, accumulates results in memory and writes at end (faster for slow filesystems). Default is True.
mem_lim (float, optional) – Memory limit in MB for Monte Carlo sampling. Default is 8000.0.
verbose (bool, optional) – Whether to print progress to stderr. Default is True.
max_models (int, optional) – When the number of models selected for an object exceeds this, the models are subsampled (see
subsample_mode) to bound memory and runtime. Default is 50000.precision_shrinkage (float, optional) – Fractional shrinkage applied to the off-diagonal terms of the 3x3 (scale, A(V), R(V)) precision matrix before Monte Carlo sampling, which stabilizes strongly-correlated fits. 0 disables it; ~0.03 is a reasonable nonzero value. Default is 0.0.
subsample_mode (str, optional) – Strategy used to thin models when the selection exceeds
max_models:'representative'(Gumbel-max sampling weighted by likelihood) or'topk'(the highest-likelihood models). Default is'representative'.R_solar (float, optional) – Solar galactocentric radius in kpc, used by the Galactic structure prior. Default is 8.2.
Z_solar (float, optional) – Solar height above the Galactic midplane in kpc, used by the Galactic structure prior. Default is 0.025.
- Returns:
Path to the output HDF5 file.
- Return type:
Notes
The output HDF5 file contains the following datasets:
labels: Object labels (Ndata, Nlabels)model_idx: Resampled model indices (Ndata, Ndraws)ml_scale: Maximum likelihood scale factors (Ndata, Ndraws)ml_av: Maximum likelihood A(V) values (Ndata, Ndraws)ml_rv: Maximum likelihood R(V) values (Ndata, Ndraws)ml_cov_sar: Covariance matrices (Ndata, Ndraws, 3, 3)obj_log_post: Log-posteriors (Ndata, Ndraws)obj_log_evid: Log-evidence per object (Ndata,)obj_chi2min: Minimum chi-squared per object (Ndata,)obj_Nbands: Number of bands used per object (Ndata,)mc_ess: Monte Carlo effective sample size (Ndata, Ndraws). For each posterior draw, the ESS of the MC integration over (distance, Av, Rv) for that draw’s source model. Low ESS indicates poor overlap between the Gaussian proposal and the target posterior.
If save_dar_draws=True, also includes:
samps_dist: Distance draws in kpc (Ndata, Ndraws)samps_red: A(V) draws (Ndata, Ndraws)samps_dred: R(V) draws (Ndata, Ndraws)samps_logp: Log-weights for draws (Ndata, Ndraws)
Examples
>>> from brutus.analysis import BruteForce >>> from brutus.core import StarGrid >>> from brutus.data import load_models >>> >>> # Load model grid >>> models, labels, params = load_models('grid.h5') >>> grid = StarGrid(models, labels, params) >>> fitter = BruteForce(grid) >>> >>> # Fit photometry >>> results_file = fitter.fit( ... phot, phot_err, phot_mask, obj_labels, ... save_file='results.h5', ... parallax=parallax, parallax_err=parallax_err, ... data_coords=coords ... )
Population Analysis#
- brutus.analysis.isochrone_population_loglike(theta, stellarpop, obs_phot, obs_err, parallax=None, parallax_err=None, cluster_prob=0.95, dim_prior=True, outlier_model_func=None, smf_grid=None, eep_grid=None, mini_bound=0.08, eep_binary_max=480.0, return_components=False, mask=None, **outlier_kwargs)[source]#
Compute log-likelihood for coeval stellar population using isochrone fitting.
Uses the mathematically correct mixture-before-marginalization approach: 1. Generate isochrone grid over (mass, SMF) 2. Compute cluster and outlier likelihoods at each grid point 3. Apply mixture model at each grid point 4. Marginalize over (mass, SMF) with proper geometric factors 5. Sum over all objects
- Parameters:
theta (array-like, shape (6,)) – Population parameters: [feh, loga, av, rv, dist, field_frac] where field_frac is the field contamination fraction (0 to 1).
stellarpop (StellarPop object) – StellarPop model from core.populations module with get_seds method
obs_phot (array-like, shape (N_objects, N_filters)) – Observed flux densities in units of 10**(-0.4 * mag)
obs_err (array-like, shape (N_objects, N_filters)) – Flux density errors in same units
parallax (array-like, shape (N_objects,), optional) – Parallax measurements (mas)
parallax_err (array-like, shape (N_objects,), optional) – Parallax errors (mas)
cluster_prob (float, optional) – Prior probability of cluster membership. Default 0.95
dim_prior (bool, optional) – Use chi-square (True) or normal (False) likelihood. Default True
outlier_model_func (callable, optional) – Custom outlier model function
smf_grid (array-like, optional) – Secondary mass fraction grid for binary modeling
eep_grid (array-like, optional) – EEP grid for isochrone evaluation
mini_bound (float, optional) – Minimum initial mass for isochrone. Default 0.08
eep_binary_max (float, optional) – Maximum EEP for binary modeling. Default 480.0
return_components (bool, optional) – Return intermediate results for debugging. Default False
mask (array-like, shape (N_objects, N_filters), optional) – Data validity mask
**outlier_kwargs (dict) – Additional arguments passed to outlier model
- Returns:
lnl_total (float) – Total log-likelihood summed over all objects
components (dict, optional) – Intermediate results (if return_components=True) containing: - ‘lnl_total’: same as primary return value - ‘lnl_per_object’: array of per-object likelihoods - ‘isochrone_grid’: the generated grid dictionary - ‘lnl_cluster’: cluster likelihood array - ‘lnl_outlier’: outlier likelihood array - ‘lnl_mixture’: mixed likelihood array
See also
generate_isochrone_population_gridStep 1 - Grid generation
compute_isochrone_cluster_loglikeStep 2 - Cluster likelihood
compute_isochrone_outlier_loglikeStep 3 - Outlier likelihood
apply_isochrone_mixture_modelStep 4 - Mixture model
marginalize_isochrone_gridStep 5 - Marginalization
brutus.core.populations.StellarPopStellar population model
Notes
This function implements the complete mixture-before-marginalization workflow:
\[\begin{split}\\ln L(\\theta) = \\sum_i \\ln \\left[ \\int \\int \\left( w_c P_c(d_i|m, s, \\theta) + w_o P_o(d_i) \\right) dm \\, ds \\right]\end{split}\]where: - \(\\theta = [{\\rm Fe/H}, \\log {\\rm age}, A_V, R_V, d, f_{\\rm field}]\) are population parameters - \(m\) is stellar mass - \(s\) is secondary mass fraction (SMF) - \(w_c, w_o\) are mixture weights - \(P_c, P_o\) are cluster and outlier likelihoods - \(d_i\) is data for object i
The function is designed for use with MCMC samplers like emcee or nested sampling codes like dynesty. It handles errors gracefully by returning -∞ for failed computations.
Examples
Use with emcee for MCMC sampling:
>>> def lnprob(theta): ... if not in_prior_bounds(theta): ... return -np.inf ... return isochrone_population_loglike(theta, pop, flux, err) >>> >>> sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob) >>> sampler.run_mcmc(p0, nsteps)
Extract intermediate results for diagnostics:
>>> lnl, components = isochrone_population_loglike( ... theta, pop, flux, err, return_components=True ... ) >>> print(f"Per-object likelihoods: {components['lnl_per_object']}")
- brutus.analysis.generate_isochrone_population_grid(stellarpop, feh, loga, av, rv, dist, smf_grid=None, eep_grid=None, mini_bound=0.08, eep_binary_max=480.0, corr_params=None)[source]#
Generate isochrone population grid over (mass, SMF) parameter space.
- Parameters:
stellarpop (StellarPop object) – StellarPop model from core.populations module with get_seds method
feh (float) – Stellar population parameters (metallicity, log age, extinction, distance)
loga (float) – Stellar population parameters (metallicity, log age, extinction, distance)
av (float) – Stellar population parameters (metallicity, log age, extinction, distance)
rv (float) – Stellar population parameters (metallicity, log age, extinction, distance)
dist (float) – Stellar population parameters (metallicity, log age, extinction, distance)
smf_grid (array-like, optional) – Secondary mass fraction grid. Default is 21 uniform points from 0.0 to 1.0
eep_grid (array-like, optional) – EEP grid for isochrone evaluation. Default is 1000 points from 202 to 808
mini_bound (float, optional) – Minimum initial mass for evaluation. Default 0.08 solar masses
eep_binary_max (float, optional) – Maximum EEP for binary modeling. Default 480.0
corr_params (array-like, optional) – Empirical correction parameters [dtdm, drdm, msto_smooth, feh_scale]
- Returns:
grid – Dictionary containing: - ‘photometry’: array, shape (N_total_points, N_filters) - model photometry - ‘masses’: array, shape (N_total_points,) - stellar masses - ‘smf_values’: array, shape (N_total_points,) - SMF values for each point - ‘mass_jacobians’: array, shape (N_total_points,) - mass grid spacing - ‘smf_jacobians’: array, shape (N_total_points,) - SMF grid spacing - ‘grid_info’: dict with SMF grid structure information
- Return type:
See also
compute_isochrone_cluster_loglikeUse this grid for likelihood computation
StellarPop.get_sedsUnderlying SED generation
Notes
The grid is constructed by:
Looping over SMF values (binary mass ratios)
For each SMF, computing isochrone along EEP dimension
Extracting masses from the isochrone
Computing geometric jacobians (grid spacings) for proper integration
Filtering invalid models (NaN photometry from impossible binaries)
The jacobians are critical for proper marginalization - they represent the geometric factors \(dm\) and \(d({\\rm SMF})\) in the integral:
\[\begin{split}P({\\rm data}) = \\int \\int P({\\rm data}|m, {\\rm SMF}) \\, dm \\, d({\\rm SMF})\end{split}\]Binary models are only computed for EEP ≤ eep_binary_max (typically main sequence) to avoid unphysical binary configurations.
- brutus.analysis.compute_isochrone_cluster_loglike(obs_flux, obs_err, isochrone_grid, parallax=None, parallax_err=None, distance=None, dim_prior=True, mask=None)[source]#
Compute cluster membership likelihood using existing photometry infrastructure.
- Parameters:
obs_flux (array-like, shape (N_objects, N_filters)) – Observed flux densities
obs_err (array-like, shape (N_objects, N_filters)) – Flux errors
isochrone_grid (dict) – Isochrone grid from generate_isochrone_population_grid()
parallax (array-like, shape (N_objects,), optional) – Parallax measurements (mas)
parallax_err (array-like, shape (N_objects,), optional) – Parallax errors (mas)
distance (float, optional) – Population distance (pc). Required if parallax provided
dim_prior (bool, optional) – Whether to use chi-square (True) or normal (False) likelihood
mask (array-like, shape (N_objects, N_filters), optional) – Data mask (1=use, 0=skip)
- Returns:
lnl_cluster – Cluster membership log-likelihood for each grid point and object. Invalid models (NaN photometry) are assigned NaN likelihood.
- Return type:
array-like, shape (N_grid_points, N_objects)
See also
brutus.utils.photometry.phot_loglikePhotometric likelihood function
compute_isochrone_outlier_loglikeComplementary outlier likelihood
generate_isochrone_population_gridCreates the isochrone_grid input
Notes
The likelihood includes both photometric and parallax components:
\[\begin{split}\\ln L_{\\rm cluster} = \\ln L_{\\rm phot} + \\ln L_{\\rm parallax}\end{split}\]where the photometric likelihood uses either chi-square (dim_prior=True) or Gaussian (dim_prior=False) formulation.
Invalid models with NaN photometry are preserved as NaN in the output to be properly handled during marginalization (they contribute zero probability via logsumexp).
- brutus.analysis.compute_isochrone_outlier_loglike(obs_flux, obs_err, isochrone_grid=None, parallax=None, parallax_err=None, dim_prior=True, outlier_model_func=None, **outlier_kwargs)[source]#
Compute outlier likelihood with stellar-parameter-aware interface.
- Parameters:
obs_flux (array-like, shape (N_objects, N_filters)) – Observed flux densities
obs_err (array-like, shape (N_objects, N_filters)) – Flux errors
isochrone_grid (dict, optional) – Isochrone grid containing stellar parameters for potential dependence
parallax (array-like, shape (N_objects,), optional) – Parallax measurements (mas)
parallax_err (array-like, shape (N_objects,), optional) – Parallax errors (mas)
dim_prior (bool, optional) – Use chi-square (True) or uniform (False) outlier model
outlier_model_func (callable, optional) – Custom outlier model function
**outlier_kwargs (dict) – Additional arguments for outlier model
- Returns:
lnl_outlier – Outlier likelihood for each grid point and object
- Return type:
array-like, shape (N_grid_points, N_objects)
- brutus.analysis.apply_isochrone_mixture_model(lnl_cluster, lnl_outlier, cluster_prob, field_fraction)[source]#
Apply mixture model at each grid point: mixture before marginalization.
For each (grid_point, object) pair: P(data|mass,SMF) = P_cluster * P(data|cluster) + P_outlier * P(data|outlier)
- Parameters:
lnl_cluster (array-like, shape (N_grid_points, N_objects)) – Cluster membership likelihoods
lnl_outlier (array-like, shape (N_grid_points, N_objects)) – Outlier model likelihoods
cluster_prob (float) – Prior probability of cluster membership (external)
field_fraction (float) – Fraction of cluster stars that are field contaminants (fitted parameter)
- Returns:
lnl_mixture – Mixed log-likelihoods for each grid point and object
- Return type:
array-like, shape (N_grid_points, N_objects)
See also
marginalize_isochrone_gridNext step after mixture application
compute_isochrone_cluster_loglikeCluster likelihood component
compute_isochrone_outlier_loglikeOutlier likelihood component
Notes
The mixture model is applied as:
\[\begin{split}P({\\rm data}|m, {\\rm SMF}) = w_c \\cdot P_c + w_o \\cdot P_o\end{split}\]where: - \(w_c = P_{\\rm cluster} \\cdot (1 - f_{\\rm field})\) - \(w_o = 1 - w_c\) - \(P_{\\rm cluster}\) is the prior probability (cluster_prob) - \(f_{\\rm field}\) is the field contamination fraction
This is computed in log-space using logsumexp for numerical stability:
\[\begin{split}\\ln L_{\\rm mix} = \\ln(\\exp(\\ln L_c + \\ln w_c) + \\exp(\\ln L_o + \\ln w_o))\end{split}\]The key distinction from traditional approaches is that this mixture is applied before marginalization over stellar parameters, which is mathematically correct for contaminated populations.
- brutus.analysis.marginalize_isochrone_grid(lnl_mixture, mass_jacobians, smf_jacobians)[source]#
Marginalize mixed likelihoods over (mass, SMF) grid with geometric jacobians.
Performs: P(data|population_params) = ∫∫ P(data|mass,SMF) dm d(SMF)
- Parameters:
lnl_mixture (array-like, shape (N_grid_points, N_objects)) – Mixed likelihoods at each grid point
mass_jacobians (array-like, shape (N_grid_points,)) – Mass grid spacing (geometric factors for integration)
smf_jacobians (array-like, shape (N_grid_points,)) – SMF grid spacing (geometric factors for integration)
- Returns:
lnl_marginalized – Marginalized log-likelihoods for each object
- Return type:
array-like, shape (N_objects,)
See also
apply_isochrone_mixture_modelPrevious step before marginalization
generate_isochrone_population_gridProvides the jacobians
Notes
Performs the integration:
\[\begin{split}P({\\rm data}|\\theta) = \\int \\int P({\\rm data}|m, {\\rm SMF}, \\theta) \\, dm \\, d({\\rm SMF})\end{split}\]numerically using a grid-based approach:
\[\begin{split}\\ln P \\approx \\ln \\sum_{i,j} \\exp(\\ln L_{i,j}) \\cdot \\Delta m_i \\cdot \\Delta({\\rm SMF})_j\end{split}\]where: - \(\\ln L_{i,j}\) is the mixed likelihood at grid point (i,j) - \(\\Delta m_i\) is the mass grid spacing (mass_jacobians) - \(\\Delta({\\rm SMF})_j\) is the SMF grid spacing (smf_jacobians)
Invalid models (with NaN likelihood) are converted to -∞ before the logsumexp operation, so they contribute zero probability.
The jacobians represent geometric integration weights and are crucial for obtaining unbiased parameter estimates.
Photometric Offsets#
- class brutus.analysis.PhotometricOffsetsConfig(min_bands_used: int = 4, min_bands_unused: int = 3, n_bootstrap: int = 300, uncertainty_method: str = 'bootstrap_iqr', progress_interval: int = 10, use_vectorized_bootstrap: bool = True, random_seed: int | None = None, validate_inputs: bool = True)[source]#
Bases:
objectConfiguration class for photometric offsets computation.
This class encapsulates all configuration parameters and provides sensible defaults with the ability to customize behavior.
- Parameters:
min_bands_used (int, optional) – Minimum number of bands required for objects where the current band was used in fitting. Default is 4 (equivalent to >3+1).
min_bands_unused (int, optional) – Minimum number of bands required for objects where the current band was not used in fitting. Default is 3.
n_bootstrap (int, optional) – Number of bootstrap realizations for uncertainty estimation. Default is 300.
uncertainty_method (str, optional) – Method for uncertainty estimation. Options: - ‘bootstrap_std’: Standard deviation of bootstrap medians - ‘bootstrap_iqr’: Scaled interquartile range of bootstrap medians Default is ‘bootstrap_iqr’.
progress_interval (int, optional) – Print progress every N iterations. Set to 0 for no progress. Default is 10.
use_vectorized_bootstrap (bool, optional) – Use vectorized bootstrap implementation for better performance. Default is True.
random_seed (int, optional) – Random seed for reproducible results. Default is None.
validate_inputs (bool, optional) – Perform input validation. Default is True.
See also
photometric_offsetsMain function using this configuration
Notes
The configuration defaults are chosen to balance statistical robustness with computational efficiency:
min_bands_used=4: Ensures robust likelihood reweighting
min_bands_unused=3: Minimum for meaningful photometric constraints
n_bootstrap=300: Sufficient for stable uncertainty estimates
bootstrap_iqr: More robust to outliers than standard deviation
Examples
>>> config = PhotometricOffsetsConfig( ... min_bands_used=5, ... n_bootstrap=500, ... random_seed=42 ... ) >>> offsets, errors, n_used = photometric_offsets( ... phot, err, mask, models, idxs, avs, rvs, dists, ... config=config ... )
- brutus.analysis.photometric_offsets(phot: ndarray, err: ndarray, mask: ndarray, models: ndarray, idxs: ndarray, reds: ndarray, dreds: ndarray, dists: ndarray, sel: ndarray | None = None, weights: ndarray | None = None, mask_fit: ndarray | None = None, old_offsets: ndarray | None = None, dim_prior: bool = True, prior_mean: ndarray | None = None, prior_std: ndarray | None = None, verbose: bool = True, config: PhotometricOffsetsConfig | None = None, rng: Generator | None = None) Tuple[ndarray, ndarray, ndarray][source]#
Compute multiplicative photometric offsets between data and models.
This function computes photometric offsets that account for systematic differences between observed photometry and model predictions. The offsets are computed by comparing model/data flux ratios across a sample of objects, with proper uncertainty estimation and optional prior constraints.
- Parameters:
phot (np.ndarray of shape (n_objects, n_filters)) – Observed flux densities for all objects.
err (np.ndarray of shape (n_objects, n_filters)) – Associated flux errors for all objects.
mask (np.ndarray of shape (n_objects, n_filters)) – Binary mask (0/1) indicating observed bands for each object.
models (np.ndarray of shape (n_models, n_filters, n_coeffs)) – Magnitude polynomial coefficients for generating reddened photometry.
idxs (np.ndarray of shape (n_objects, n_samples)) – Model indices fit to each object.
reds (np.ndarray of shape (n_objects, n_samples)) – A(V) reddening values for each object and sample.
dreds (np.ndarray of shape (n_objects, n_samples)) – R(V) reddening curve shape values for each object and sample.
dists (np.ndarray of shape (n_objects, n_samples)) – Distance values (kpc) for each object and sample.
sel (np.ndarray of shape (n_objects), optional) – Boolean selection of objects to use. Default uses all objects.
weights (np.ndarray of shape (n_objects, n_samples), optional) – Sample weights for each object. Default uses uniform weights.
mask_fit (np.ndarray of shape (n_filters), optional) – Boolean mask indicating which filters were used in fitting. Default assumes all filters were used.
old_offsets (np.ndarray of shape (n_filters), optional) – Previous offsets to remove before computing new ones. Default is no previous offsets.
dim_prior (bool, optional) – Whether to apply dimensionality prior in likelihood reweighting. Default is True.
prior_mean (np.ndarray of shape (n_filters), optional) – Gaussian prior means for offsets. Must be provided with prior_std.
prior_std (np.ndarray of shape (n_filters), optional) – Gaussian prior standard deviations for offsets.
verbose (bool, optional) – Whether to print progress information. Default is True.
config (PhotometricOffsetsConfig, optional) – Configuration object with analysis parameters. Default uses standard configuration.
rng (np.random.Generator, optional) – Random number generator for reproducible results. Default creates new generator.
- Returns:
offsets (np.ndarray of shape (n_filters)) – Multiplicative photometric offsets (model/data ratios).
offset_errors (np.ndarray of shape (n_filters)) – Uncertainties on the photometric offsets.
n_objects_used (np.ndarray of shape (n_filters)) – Number of objects used to compute each offset.
Examples
>>> import numpy as np >>> from brutus.analysis.offsets import photometric_offsets >>> >>> # Mock data for demonstration >>> n_obj, n_filt, n_samp = 100, 5, 50 >>> phot = np.random.uniform(0.1, 10, (n_obj, n_filt)) >>> err = 0.1 * phot >>> mask = np.random.choice([0, 1], (n_obj, n_filt), p=[0.1, 0.9]) >>> >>> # Mock fitted parameters >>> models = np.random.random((1000, n_filt, 3)) >>> idxs = np.random.randint(0, 1000, (n_obj, n_samp)) >>> reds = np.random.uniform(0, 2, (n_obj, n_samp)) >>> dreds = np.random.uniform(2.5, 4.5, (n_obj, n_samp)) >>> dists = np.random.uniform(0.1, 10, (n_obj, n_samp)) >>> >>> # Compute offsets >>> offsets, errors, n_used = photometric_offsets( ... phot, err, mask, models, idxs, reds, dreds, dists ... ) >>> print(f"Computed offsets: {offsets}")
See also
PhotometricOffsetsConfigConfiguration options
brutus.core.sed_utils.get_sedsModel SED generation
brutus.utils.photometry.phot_loglikeLikelihood computation
brutus.analysis.individual.BruteForceSource of fitted parameters
Notes
The photometric offset for each band is computed as:
Generate model SEDs for all fitted objects and posterior samples
Scale by distance: \(F_{\\rm model} = F_0 / d^2\)
Compute flux ratios: \(r = F_{\\rm model} / F_{\\rm obs}\)
Reweight samples: For bands used in fitting, recompute \(P(M|D_{-i})\) excluding band i to avoid circularity
Bootstrap: Resample objects and models with weights, compute median
Uncertainty: From bootstrap distribution (IQR or std)
Apply priors (optional): Bayesian combination with prior
The reweighting in step 4 is critical: if a band was used in the original fit, including it in offset computation would create a circular dependency. Instead, we recompute posteriors excluding that band.
The offsets should be applied as:
\[\begin{split}F_{\\rm corrected} = F_{\\rm observed} \\times {\\rm offset}\end{split}\]For iterative refinement, provide old_offsets from previous iteration.
References
The bootstrap methodology follows standard non-parametric uncertainty estimation. The likelihood reweighting approach ensures unbiased estimates for bands included in the original fit.
Line-of-Sight Dust#
- brutus.analysis.los_clouds_priortransform(u, rlims=(0.0, 6.0), dlims=(4.0, 19.0), pb_params=(-3.0, 0.7, -inf, 0.0), s_params=(-3.0, 0.3, -inf, 0.0), dust_template=False, nlims=(0.2, 2))[source]#
Transform unit cube samples to physical parameters for LOS dust fitting.
The “prior transform” for the LOS fit that converts from draws on the N-dimensional unit cube to samples from the prior. Used in nested sampling methods. Assumes uniform priors for distance and reddening and a (truncated) log-normal in outlier fraction.
- Parameters:
u (array_like, shape (Nparams,)) – The Nparams values drawn from the unit cube. Contains the portion of outliers P_b, followed by the foreground smoothing sfore and background smoothing sback, followed by the foreground reddening fred, followed by a series of (dist, red) pairs for each “cloud” along the LOS.
rlims (tuple of float, optional) – The reddening bounds within which we’d like to sample. Default is (0., 6.), which assumes reddening is in units of A_V.
dlims (tuple of float, optional) – The distance bounds within which we’d like to sample. Default is (4., 19.), which assumes distance is in units of distance modulus.
pb_params (tuple of float, optional) – Mean, standard deviation, lower bound, and upper bound for a truncated log-normal distribution used as a prior for the outlier model. The default is (-3., 0.7, -np.inf, 0.), which corresponds to a mean of 0.05, a standard deviation of a factor of 2, a lower bound of 0, and an upper bound of 1.
s_params (tuple of float, optional) – Mean, standard deviation, lower bound, and upper bound for a truncated log-normal distribution used as a prior for the smoothing along the reddening axis (in %). The default is (-3., 0.3, -np.inf, 0.), which corresponds to a mean of 0.05, a standard deviation of a factor of 1.35, a lower bound of 0, and an upper bound of 1.
dust_template (bool, optional) – Whether or not to use a spatial distribution for the dust based on a particular template. If true, dust along the line of sight will be in terms of rescalings of the template rather than A_V. Default is False.
nlims (tuple of float, optional) – Lower and upper bounds for the uniform prior for the rescaling applied to the Planck spatial reddening template. Default is (0.2, 2.).
- Returns:
x – The transformed physical parameters in order: [pb, s_fore, s_back, A_V_fore, d1, A_V1, d2, A_V2, …] where di are sorted in increasing order.
- Return type:
ndarray, shape (Nparams,)
See also
los_clouds_loglike_samplesLikelihood function for these parameters
kernel_gaussGaussian smoothing kernel
kernel_lorentzLorentzian smoothing kernel
Notes
The prior distributions are:
Outlier fraction (P_b): Truncated log-normal with median ~0.05
Smoothing (s_fore, s_back): Truncated log-normal with median ~0.05
Foreground extinction: Uniform over rlims
Cloud distances: Uniform over dlims, sorted in increasing order
Cloud extinctions: Uniform over rlims, ordered by distance
The sorting ensures clouds are ordered by increasing distance, which is required for the cumulative extinction calculation.
The log-normal priors on outlier fraction and smoothing concentrate probability near zero while allowing occasional larger values.
Examples
>>> import numpy as np >>> # For 1-cloud model: pb, s0, s, fred, dist1, red1 (6 parameters) >>> u = np.random.uniform(0, 1, 6) >>> params = los_clouds_priortransform(u) >>> print(f"Outlier fraction: {params[0]:.3f}") >>> print(f"Cloud distance (DM): {params[4]:.1f}") >>> print(f"Cloud extinction (A_V): {params[5]:.2f}")
- brutus.analysis.los_clouds_loglike_samples(theta, dsamps, rsamps, kernel='gauss', rlims=(0.0, 6.0), template_reds=None, Ndraws=25, additive_foreground=False, monotonic=True)[source]#
Compute log-likelihood for multi-cloud extinction model along line-of-sight.
Compute the log-likelihood for the cumulative reddening along the line of sight (LOS) parameterized by theta, given a set of input reddening and distance draws. Assumes a uniform outlier model in distance and reddening across our binned posteriors.
- Parameters:
theta (array_like, shape (Nparams,)) – A collection of parameters that characterizes the cumulative reddening along the LOS. Contains the fraction of outliers P_b followed by the fractional reddening smoothing for the foreground s0 and background s followed by the foreground reddening fred followed by a series of (dist, red) pairs for each “cloud” along the LOS.
dsamps (array_like, shape (Nobj, Nsamps)) – Distance samples for each object. Follows the units used in theta.
rsamps (array_like, shape (Nobj, Nsamps)) – Reddening samples for each object. Follows the units in theta.
kernel (str or callable, optional) – The kernel used to weight the samples along the LOS. If a string is passed, a pre-specified kernel will be used. Options include ‘lorentz’, ‘gauss’, and ‘tophat’. Default is ‘gauss’.
rlims (tuple of float, optional) – The reddening bounds within which we’d like to sample. Default is (0., 6.), which assumes reddening is in units of A_V.
template_reds (array_like, shape (Nobj,), optional) – Reddenings for each star based on a spatial dust template. If not provided, the same reddening value in a given distance bin will be fit to all stars. If provided, a rescaled version of the individual reddenings will be fit instead.
Ndraws (int, optional) – The number of draws to use for each star. Default is 25.
additive_foreground (bool, optional) – Whether the foreground is treated as just another value or added to all background values. Default is False.
monotonic (bool, optional) – Whether to enforce monotonicity in the fits so that the values must get larger with distance. Default is True.
- Returns:
loglike – The computed log-likelihood.
- Return type:
Examples
>>> import numpy as np >>> # Generate synthetic stellar samples >>> nstars = 10 >>> nsamps = 50 >>> dsamps = np.random.uniform(6, 12, (nstars, nsamps)) # distance modulus >>> rsamps = np.random.uniform(0, 2, (nstars, nsamps)) # A_V extinction >>> >>> # 1-cloud model parameters: [pb, s0, s, fred, dist1, red1] >>> theta = [0.1, 0.05, 0.05, 0.2, 8.0, 0.5] >>> >>> # Compute likelihood >>> loglike = los_clouds_loglike_samples(theta, dsamps, rsamps) >>> print(f"Log-likelihood: {loglike:.2f}")
- brutus.analysis.kernel_tophat(reds, kp)[source]#
Compute weighted log-probabilities using a Top-Hat kernel.
The kernel defines uniform dispersion of extinction values around the cloud mean: samples within
mean +/- half_bin_widthare equally likely, and samples outside are assigned zero probability.- Parameters:
- Returns:
logw – Log-weights for each sample.
- Return type:
ndarray, shape (Nsamps,)
Examples
>>> import numpy as np >>> reds = np.array([0.1, 0.3, 0.5, 0.7, 0.9]) >>> kp = (0.5, 0.2) # mean=0.5, half-width=0.2, so range [0.3, 0.7] >>> logw = kernel_tophat(reds, kp) >>> # Samples within [0.3, 0.7] should have higher weights
- brutus.analysis.kernel_gauss(reds, kp)[source]#
Compute weighted log-probabilities using a Gaussian kernel.
The kernel defines Gaussian dispersion of extinction values around the cloud mean: samples are weighted by a normal distribution centered on the mean extinction with the given standard deviation.
- Parameters:
- Returns:
logw – Log-weights for each sample.
- Return type:
ndarray, shape (Nsamps,)
Examples
>>> import numpy as np >>> reds = np.array([0.1, 0.3, 0.5, 0.7, 0.9]) >>> kp = (0.5, 0.1) # mean=0.5, std=0.1 >>> logw = kernel_gauss(reds, kp) >>> # Sample at 0.5 should have highest weight
- brutus.analysis.kernel_lorentz(reds, kp)[source]#
Compute weighted log-probabilities using a Lorentzian kernel.
The kernel defines heavy-tailed dispersion of extinction values around the cloud mean: samples are weighted by a Cauchy/Lorentzian distribution centered on the mean extinction, allowing larger deviations from the mean than a Gaussian kernel.
- Parameters:
- Returns:
logw – Log-weights for each sample.
- Return type:
ndarray, shape (Nsamps,)
Examples
>>> import numpy as np >>> reds = np.array([0.1, 0.3, 0.5, 0.7, 0.9]) >>> kp = (0.5, 0.1) # mean=0.5, HWHM=0.1 >>> logw = kernel_lorentz(reds, kp) >>> # Sample at 0.5 should have highest weight, with heavy tails