Tutorial 5: Fitting Individual Stars with BruteForce

Tutorial 5: Fitting Individual Stars with BruteForce#

This tutorial demonstrates how to fit individual stars using brutus’s BruteForce class, which performs fast Bayesian inference over pre-computed model grids.

Topics Covered#

Data preparation (flux units, parallaxes, coordinates)
BruteForce class setup and configuration
Running fits with various options
Result visualization and interpretation

Prerequisites#

This tutorial requires the following brutus data files:

grid_mist_v9.h5 - MIST model grid
offsets_mist_v9.txt - Photometric offsets
Orion_l209.1_b-19.9.h5 - Example Orion field data (included in tutorials/)
bayestar2019_v1.h5 (optional) - 3D dust map

If you don’t have these files, run the download cell below.

# Download required data files (if not already available locally)
from tutorial_utils import find_brutus_data_file
from brutus.data import fetch_grids, fetch_offsets

# Download grid and offsets if needed
fetch_grids()       # Downloads MIST model grid
fetch_offsets('mist_v9')  # Downloads photometric offsets

# Check dustmap availability (large file, ~2 GB)
try:
    dustmap_path = find_brutus_data_file('bayestar2019_v1.h5')
    print(f"Dustmap found: {dustmap_path}")
except FileNotFoundError:
    from brutus.data import fetch_dustmaps
    fetch_dustmaps()  # Downloads Bayestar dust map
    print("Dustmap downloaded.")

# Note: The Orion field data is included in the tutorials/ directory.

# Imports and setup
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

from tutorial_utils import (
    setup_tutorial,
    find_brutus_data_file,
    save_figure as _save_fig,
    print_section,
    load_orion_data,
)

info = setup_tutorial(5, title="Tutorial 05: Fitting Individual Stars with BruteForce")
plots_dir = info['plot_dir']


def save_figure(fig, name):
    """Save figure to this tutorial's plot directory."""
    _save_fig(fig, 5, name)

1. Data Preparation for BruteForce#

BruteForce requires specific data formats:

Photometry in flux units (maggies = 10^(-0.4 * magnitude))
Parallaxes in mas (milliarcseconds) from Gaia
Galactic coordinates (l, b) for applying priors

Let’s load and examine the Orion field data.

from brutus.utils import inv_magnitude, magnitude

# Load example Orion field data
print("Loading Orion field data...")
data = load_orion_data()

print(f"\nLoaded {len(data['phot'])} sources")
print(f"  Photometry shape: {data['phot'].shape}")
print(f"  Filters: Pan-STARRS (grizy) + 2MASS (JHKs) = 8 bands")
print(f"  Valid parallaxes: {np.sum(np.isfinite(data['parallax']))}")
print(f"  Coordinate range: l=[{data['coords'][:, 0].min():.1f}, {data['coords'][:, 0].max():.1f}], "
      f"b=[{data['coords'][:, 1].min():.1f}, {data['coords'][:, 1].max():.1f}] deg")

Loading Orion field data...

Loaded 1642 sources
  Photometry shape: (1642, 8)
  Filters: Pan-STARRS (grizy) + 2MASS (JHKs) = 8 bands
  Valid parallaxes: 1201
  Coordinate range: l=[204.5, 204.9], b=[-19.4, -19.0] deg

# Add systematic photometric uncertainties in quadrature
# The MIST isochrone corrections have systematic uncertainties of ~0.02 mag
# in the optical and ~0.03 mag in the NIR (Speagle et al. 2025, Table 5).
# These should be added to the measurement errors before fitting.

# Systematic uncertainties per band (mag)
sys_err_mag = np.array([0.02, 0.02, 0.02, 0.02, 0.02,   # PS grizy
                        0.03, 0.03, 0.03])                # 2MASS JHKs

# Convert to flux space and add in quadrature
sys_err_flux = data['phot'] * sys_err_mag[np.newaxis, :] * np.log(10) / 2.5
data['err'] = np.sqrt(data['err']**2 + sys_err_flux**2)

# Show the effect
mag_err_after = 2.5 / np.log(10) * data['err'] / np.where(data['phot'] > 0, data['phot'], 1)
filt_names = ['PS_g', 'PS_r', 'PS_i', 'PS_z', 'PS_y', '2MASS_J', '2MASS_H', '2MASS_Ks']
print("Effective magnitude errors after adding systematics:")
for i, name in enumerate(filt_names):
    valid = data['mask'][:, i]
    if np.sum(valid) > 0:
        print(f"  {name:>10s}: min={np.min(mag_err_after[valid, i]):.4f}, "
              f"median={np.median(mag_err_after[valid, i]):.4f} mag")

2. BruteForce Class Setup#

BruteForce performs fast stellar parameter inference by:

Evaluating likelihood over a pre-computed grid of stellar models
Marginalizing analytically over nuisance parameters (distance, extinction)
Incorporating priors (Galactic structure, 3D dust, parallax)

Let’s load the MIST model grid and initialize BruteForce.

from brutus.analysis import BruteForce
from brutus.core import StarGrid
from brutus.data import load_models, filters

# Load MIST model grid
print("Loading MIST model grid...")
grid_file = find_brutus_data_file('grid_mist_v9.h5')

# Define filters (Pan-STARRS + 2MASS)
filt = filters.ps[:-2] + filters.tmass  # Skip PS w and open bands
print(f"\nUsing filters: {filt}")

# Load models
models, labels, label_mask = load_models(grid_file, filters=filt)

print(f"\nLoaded {len(models):,} models")
print(f"  Model shape: {models.shape}")
print(f"  Parameter fields: {list(labels.dtype.names)}")
print(f"  Key parameters:")
print(f"    Mass range: [{labels['mini'].min():.2f}, {labels['mini'].max():.1f}] M_sun")
print(f"    [Fe/H] range: [{labels['feh'].min():.2f}, {labels['feh'].max():.2f}]")
print(f"    log(age) range: [{labels['loga'].min():.2f}, {labels['loga'].max():.2f}]")

# Create StarGrid and initialize BruteForce
grid = StarGrid(models, labels)
bf = BruteForce(grid)

print(f"\nBruteForce initialized with {len(filt)} bands")

Loading MIST model grid...

Using filters: ['PS_g', 'PS_r', 'PS_i', 'PS_z', 'PS_y', '2MASS_J', '2MASS_H', '2MASS_Ks']

Reading entire dataset (49 filters) once...

Extracting 8 requested filters from memory...

Loaded 613,530 models
  Model shape: (613530, 8, 3)
  Parameter fields: ['mini', 'feh', 'eep', 'loga', 'logl', 'logt', 'logg', 'agewt']
  Key parameters:
    Mass range: [0.50, 2.0] M_sun
    [Fe/H] range: [-3.00, 0.45]
    log(age) range: [6.46, 10.14]
Loaded StarGrid with 613,530 models, 8 filters, 8 labels
BruteForce initialized with 613,530 models
  Grid parameters (8): mini, feh, eep, loga, logl, logt, logg, agewt

BruteForce initialized with 8 bands

3. Running BruteForce Fits#

Now let’s fit all sources from the Orion field, demonstrating various fitting options.

Key considerations:

Photometric offsets calibrate models to match observed photometry
3D dust maps provide extinction priors
Parallax constraints from Gaia improve distance estimates
Galactic priors based on position help constrain parameters

# Load photometric offsets (calibration corrections)
from brutus.data import load_offsets

print("Loading photometric offsets...")
try:
    offsets_file = find_brutus_data_file('offsets_mist_v9.txt')
    offsets_mist = load_offsets(offsets_file, filters=filt)
    print("\nPhotometric offsets (multiplicative flux):")
    for band, offset in zip(filt, offsets_mist):
        print(f"  {band}: {offset:.4f}")
except FileNotFoundError:
    offsets_mist = np.ones(len(filt))
    print("  Offsets file not found, using unity offsets")

# Select sources with sufficient photometric coverage
good_sources = np.sum(data['mask'], axis=1) >= 4  # At least 4 bands
idx_fit = np.where(good_sources)[0]

print(f"\nSelected {len(idx_fit)} sources for fitting")
print(f"  Criteria: >=4 bands detected")
print(f"  Average bands per source: {np.mean(np.sum(data['mask'][idx_fit], axis=1)):.1f}")

# Try to load 3D dust map
try:
    dustfile = find_brutus_data_file('bayestar2019_v1.h5')
    print("\n3D dust map (Bayestar) available")
except FileNotFoundError:
    dustfile = None
    print("\n3D dust map not available (will use flat prior)")

Note: brutus recommends at least 4 photometric bands for reliable fits. Users can disable the 3D dust map prior (making it flat over A(V)) by not passing the dustfile argument to bf.fit().

# Run BruteForce fits
# Save results to tutorials/ directory for use by later tutorials
tutorials_dir = Path(__file__).parent if '__file__' in dir() else Path('.')
output_file = tutorials_dir / 'Orion_l209.1_b-19.9_mist.h5'

import time

print(f"Running BruteForce fits on {len(idx_fit)} sources...")
print("This may take several minutes depending on your hardware.")

t0 = time.time()

bf.fit(
    data=data['phot'][idx_fit],
    data_err=data['err'][idx_fit],
    data_mask=data['mask'][idx_fit],
    data_labels=idx_fit,
    save_file=str(output_file),
    data_coords=data['coords'][idx_fit],
    parallax=data['parallax'][idx_fit],
    parallax_err=data['parallax_err'][idx_fit],
    phot_offsets=offsets_mist,
    dustfile=dustfile,
    Ndraws=500,
    Nmc_prior=20,
    logl_dim_prior=True,
    save_dar_draws=True,
    running_io=False,
    verbose=True
)

elapsed = time.time() - t0
print(f"\nFitting complete in {elapsed:.1f} seconds ({elapsed/60:.1f} minutes).")
print(f"Results saved to {output_file}")

Show code cell source

Hide code cell source

# Load and analyze results
import h5py

with h5py.File(output_file, 'r') as f:
    chi2 = f['obj_chi2min'][:]
    nbands = f['obj_Nbands'][:]
    model_idx = f['model_idx'][:]
    ml_scale = f['ml_scale'][:]
    ml_av = f['ml_av'][:]
    ml_rv = f['ml_rv'][:]
    log_post = f['obj_log_post'][:]
    dists = f['samps_dist'][:]
    reds = f['samps_red'][:]
    dreds = f['samps_dred'][:]

print("Loaded fitting results:")
print(f"  chi2 values: {chi2.shape}")
print(f"  Model indices: {model_idx.shape}")
print(f"  ML scale/AV/RV: {ml_scale.shape}")
print(f"  Distance samples: {dists.shape}")

# Compute summary statistics
chi2_per_band = chi2 / nbands  # shown below as a diagnostic only
# Fit quality is judged by the goodness-of-fit p-value (dof = Nbands - 3
# free params: scale, A(V), R(V)), NOT chi2/Nbands -- the latter is not
# variance-stabilizing across different band counts (see CLAUDE.md).
from scipy.stats import chi2 as chi2_dist
dof = np.maximum(nbands - 3, 1)
pvals = chi2_dist.sf(chi2, dof)
mean_dists = np.mean(dists, axis=1)
mean_av = np.mean(reds, axis=1)

# Create fit quality visualization
fig, axes = plt.subplots(2, 3, figsize=(15, 10))

# Panel 1: Chi2 distribution
ax = axes[0, 0]
ax.hist(chi2_per_band, bins=30, alpha=0.7, color='blue', edgecolor='darkblue')
ax.axvline(1.0, color='red', ls='--', lw=2, label='Expected (chi2/n = 1)')
ax.axvline(np.median(chi2_per_band), color='green', ls='--', lw=2, label=f'Median = {np.median(chi2_per_band):.2f}')
ax.set_xlabel('chi2/band')
ax.set_ylabel('Number of Sources')
ax.set_title('Goodness of Fit')
ax.set_xlim(0, np.percentile(chi2_per_band, 99))
ax.legend()
ax.grid(True, alpha=0.3)

# Panel 2: Distance distribution
ax = axes[0, 1]
ax.hist(mean_dists, bins=30, alpha=0.7, color='green', edgecolor='darkgreen')
ax.axvline(np.median(mean_dists), color='red', ls='--', lw=2, label=f'Median = {np.median(mean_dists):.2f} kpc')
ax.set_xlabel('Distance (kpc)')
ax.set_ylabel('Number of Sources')
ax.set_title('Distance Distribution')
ax.set_xlim(0, np.percentile(mean_dists, 99))
ax.legend()
ax.grid(True, alpha=0.3)

# Panel 3: Extinction distribution
ax = axes[0, 2]
ax.hist(mean_av, bins=30, alpha=0.7, color='brown', edgecolor='darkred')
ax.axvline(np.median(mean_av), color='blue', ls='--', lw=2, label=f'Median = {np.median(mean_av):.2f} mag')
ax.set_xlabel('A(V) (mag)')
ax.set_ylabel('Number of Sources')
ax.set_title('Extinction Distribution')
ax.set_xlim(0, np.percentile(mean_av, 99))
ax.legend()
ax.grid(True, alpha=0.3)

# Panel 4: Distance vs extinction
ax = axes[1, 0]
scatter = ax.scatter(mean_dists, mean_av, alpha=0.5, s=20,
                    c=chi2_per_band, cmap='viridis', vmin=0, vmax=3)
ax.set_xlabel('Distance (kpc)')
ax.set_ylabel('A(V) (mag)')
ax.set_title('Distance-Extinction Relation')
ax.set_xlim(0, np.percentile(mean_dists, 99))
ax.set_ylim(0, np.percentile(mean_av, 99))
plt.colorbar(scatter, ax=ax, label='chi2/band')
ax.grid(True, alpha=0.3)

# Panel 5: Stellar parameters (using best-fit model per source)
ax = axes[1, 1]

best_draw = np.argmax(log_post, axis=1)
best_model_per_source = model_idx[np.arange(len(chi2)), best_draw]
unique_masses = []
for idx in best_model_per_source:
    if idx < len(labels):
        unique_masses.append(labels[idx]['mini'])

if unique_masses:
    ax.hist(unique_masses, bins=np.logspace(-0.5, 1, 30),
           alpha=0.7, color='purple', edgecolor='darkviolet')
    ax.set_xlabel('Initial Mass (M_sun)')
    ax.set_ylabel('Number of Sources')
    ax.set_title('Mass Distribution')
    ax.set_xscale('log')
    ax.grid(True, alpha=0.3)

# Panel 6: Fit statistics
ax = axes[1, 2]
ax.axis('off')

# Calculate statistics
good_fits = pvals > 1e-3
stats_text = f"""
Fit Statistics:

Total sources: {len(chi2)}
Good fits (p > 1e-3): {good_fits.sum()} ({100*good_fits.sum()/len(chi2):.1f}%)

Distance:
  Mean: {np.mean(mean_dists):.2f} kpc
  Median: {np.median(mean_dists):.2f} kpc
  Std: {np.std(mean_dists):.2f} kpc

Extinction:
  Mean: {np.mean(mean_av):.2f} mag
  Median: {np.median(mean_av):.2f} mag
  Std: {np.std(mean_av):.2f} mag

chi2/band:
  Mean: {np.mean(chi2_per_band):.2f}
  Median: {np.median(chi2_per_band):.2f}

R(V):
  Mean: {np.mean(dreds):.2f}
  Std: {np.std(dreds):.2f}
"""

ax.text(0.05, 0.95, stats_text, transform=ax.transAxes,
       fontsize=11, va='top', family='monospace')

plt.suptitle('BruteForce Fitting Results', fontsize=16, fontweight='bold')
save_figure(fig, 'fitting_results')
plt.show()

print("\nResult analysis complete")

4. Visualizing Individual Results#

Let’s examine individual star fits in detail to understand the posterior distributions and parameter correlations.

# Select a typical example star that passes a reasonable chi2 cut
good_fit = pvals > 1e-3  # p-value threshold for a good fit

# Use the first source that passes the chi2 cut
if good_fit[0]:
    star_idx = 0
else:
    star_idx = np.where(good_fit)[0][0]

data_idx = idx_fit[star_idx]

print(f"Selected example star (index {star_idx}):")
print(f"  p-value = {pvals[star_idx]:.3g}")
print(f"  N_bands = {nbands[star_idx]}")
if np.isfinite(data['parallax'][data_idx]):
    plx_snr = data['parallax'][data_idx] / data['parallax_err'][data_idx]
    print(f"  Parallax = {data['parallax'][data_idx]:.3f} +/- {data['parallax_err'][data_idx]:.3f} mas (SNR = {plx_snr:.1f})")
else:
    print(f"  Parallax = not available")
print(f"  Distance = {mean_dists[star_idx]:.2f} +/- {np.std(dists[star_idx]):.3f} kpc")
print(f"  A(V) = {mean_av[star_idx]:.2f} mag")

4.1 Visualizing a Single Star#

brutus provides built-in plotting functions for visualizing individual fitting results. We demonstrate three key plots for the selected example star.

from brutus.plotting import dist_vs_red, posterior_predictive, cornerplot

print(f"Visualizing star {star_idx}")
print(f"  p-value = {pvals[star_idx]:.3g}")
print(f"  Distance = {mean_dists[star_idx]:.2f} +/- {np.std(dists[star_idx]):.3f} kpc")
print(f"  A(V) = {mean_av[star_idx]:.2f} mag")

Visualizing star 17
  chi2/n = 1.003
  Distance = 1.20 +/- 0.046 kpc
  A(V) = 0.48 mag

  Saved: /mnt/c/Users/joshs/Dropbox/GitHub/brutus/tutorials/plots/tutorial_05/dist_vs_red.png

../_images/2a5c332cedc062dac5f7bc9184b4029266dc21510449c5bd839a0a51e0f5e021.png

  Saved: /mnt/c/Users/joshs/Dropbox/GitHub/brutus/tutorials/plots/tutorial_05/posterior_predictive.png

../_images/4f065a2f648bf8de3b55c55f1c77d10db498a79be45241a08abc1874c731bf86.png

5. Impact of the 3D Dust Map Prior#

The Bayestar 3D dust map provides distance-dependent extinction constraints that can significantly affect the inferred stellar parameters. When a dust map is available, BruteForce evaluates the extinction prior at each candidate distance, preferring (distance, A_V) combinations consistent with the cumulative dust profile along the line of sight.

To quantify this effect, we re-fit the same sources without the dust map prior and compare the resulting distance and extinction posteriors. Sources in highly extincted regions (like Orion) should show the largest differences.

Show code cell source

Hide code cell source

# Re-fit the same sources WITHOUT the dust map prior
if dustfile is not None:
    output_file_nodust = tutorials_dir / 'Orion_l209.1_b-19.9_mist_nodust.h5'

    print(f"Running comparison fit WITHOUT dust map prior on {len(idx_fit)} sources...")
    print("This may take several minutes depending on your hardware.")

    t0 = time.time()

    bf.fit(
        data=data['phot'][idx_fit],
        data_err=data['err'][idx_fit],
        data_mask=data['mask'][idx_fit],
        data_labels=idx_fit,
        save_file=str(output_file_nodust),
        data_coords=data['coords'][idx_fit],
        parallax=data['parallax'][idx_fit],
        parallax_err=data['parallax_err'][idx_fit],
        phot_offsets=offsets_mist,
        dustfile=None,  # <-- No dust map
        Ndraws=500,
        Nmc_prior=20,
        logl_dim_prior=True,
        save_dar_draws=True,
        running_io=False,
        verbose=True
    )

    elapsed = time.time() - t0
    print(f"\nComparison fit complete in {elapsed:.1f} seconds ({elapsed/60:.1f} minutes).")
    print(f"Saved to {output_file_nodust}")

    # --- Load both result sets ---
    with h5py.File(output_file, 'r') as f:
        reds_dust = f['samps_red'][:]

    with h5py.File(output_file_nodust, 'r') as f:
        reds_nodust = f['samps_red'][:]

    mean_av_dust = np.mean(reds_dust, axis=1)
    mean_av_nodust = np.mean(reds_nodust, axis=1)

    # --- Single-panel A(V) comparison ---
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(mean_av_nodust, mean_av_dust, s=30, alpha=0.7,
               c='firebrick', edgecolor='k', lw=0.5)
    ax.plot([0, 6], [0, 6], 'k--', lw=1, alpha=0.5, label='1:1')
    ax.set_xlabel('A(V) (mag) -- No Dust Prior')
    ax.set_ylabel('A(V) (mag) -- With Dust Prior')
    ax.set_title('Effect of 3D Dust Map Prior on Extinction')
    ax.set_xlim(0, 6)
    ax.set_ylim(0, 6)
    ax.legend()
    ax.grid(True, alpha=0.3)
    ax.set_aspect('equal')

    save_figure(fig, 'dust_prior_comparison')
    plt.show()

    print("\nDust prior comparison complete")
else:
    print("Dust map not available -- skipping comparison.")
    print("The dust prior comparison requires bayestar2019_v1.h5")

6. Advanced BruteForce Options#

BruteForce.fit() accepts many parameters beyond the basics shown earlier. Understanding these options lets you tune performance, control output size, and adjust the statistical model to match your science case.

The parameters fall into several categories:

Likelihood control: logl_dim_prior switches between a standard Gaussian log-likelihood and a chi-square dimensionality-corrected version that penalizes over-fitting (or under-fitting) relative to the number of observed bands.
Prior adjustments: apply_agewt adds an age-based weight from the MIST evolutionary tracks (accounting for time spent in each evolutionary phase). apply_grad applies a gradient-based correction for non-uniform grid spacing. av_gauss lets you impose a Gaussian prior on A(V) instead of using the distance-reddening relation from a 3D dust map.
Output thresholds: wt_thresh and cdf_thresh control which models are kept in the posterior sampling. Raising these thresholds reduces output file size at the cost of potentially trimming the tails of the posterior.

The cell below prints a reference table of these advanced parameters. It does not run a fit (which would be too slow for an interactive tutorial).

# Advanced BruteForce.fit() parameter reference
# This cell documents parameters -- it does NOT run a fit.

advanced_params = [
    # (parameter, default, description)
    ("logl_dim_prior", "True",
     "Use chi-square dimensionality prior on the likelihood.\n"
     "True  -> penalizes models based on chi2 relative to DOF (recommended).\n"
     "False -> standard Gaussian log-likelihood (no DOF correction)."),

    ("apply_agewt", "True",
     "Apply age-based weighting from MIST evolutionary tracks.\n"
     "Accounts for the different amounts of time a star spends\n"
     "in each evolutionary phase (e.g., main sequence vs. giant branch)."),

    ("apply_grad", "True",
     "Apply gradient-based correction for non-uniform grid spacing.\n"
     "Corrects for the fact that model grid points are not evenly\n"
     "distributed in parameter space (mass, age, metallicity)."),

    ("av_gauss", "None",
     "Gaussian prior on A(V) as a (mean, std) tuple.\n"
     "Example: av_gauss=(0.5, 0.2) centers extinction at A(V)=0.5 mag.\n"
     "If None, uses the distance-reddening relation from the dust map."),

    ("wt_thresh", "1e-3",
     "Weight threshold for model selection: keeps models with\n"
     "weight > wt_thresh * max(weight). Lower values keep more\n"
     "models (broader posterior tails) but increase output size."),

    ("cdf_thresh", "2e-3",
     "CDF threshold for model selection. Models contributing less\n"
     "than this fraction of cumulative probability are discarded.\n"
     "Used as fallback if wt_thresh is None."),

    ("Ndraws", "250",
     "Number of posterior samples saved per object.\n"
     "More draws give smoother posteriors but larger output files.\n"
     "Typical range: 100 (quick) to 500 (publication quality)."),

    ("Nmc_prior", "50",
     "Number of Monte Carlo samples for prior integration.\n"
     "Higher values improve accuracy of the distance/extinction\n"
     "marginalization at the cost of computation time."),

    ("rv_gauss", "(3.32, 0.18)",
     "Gaussian prior on R(V) as a (mean, std) tuple.\n"
     "Default based on Schlafly et al. (2016).\n"
     "Set to None to use a flat prior within rvlim bounds."),

    ("running_io", "True",
     "If True, writes each object to disk immediately after fitting.\n"
     "Safer for long runs (recoverable on crash) but slower on\n"
     "network filesystems. Set False to batch-write at the end."),
]

# Print formatted table
print("Advanced BruteForce.fit() Parameters")
print("=" * 72)

for param, default, desc in advanced_params:
    print(f"\n  {param}  (default: {default})")
    print("  " + "-" * 68)
    for line in desc.split("\n"):
        print(f"    {line}")

print("\n" + "=" * 72)
print("\nExample usage with advanced options:")
print("""
    bf.fit(
        data=flux, data_err=flux_err, data_mask=mask,
        data_labels=obj_ids, save_file='results.h5',
        # --- advanced options ---
        logl_dim_prior=True,      # recommended for robust fitting
        apply_agewt=True,         # weight by evolutionary timescales
        apply_grad=True,          # correct for grid spacing
        av_gauss=(0.3, 0.1),     # informative A(V) prior
        rv_gauss=(3.32, 0.18),   # Schlafly+2016 R(V) prior
        wt_thresh=1e-3,           # keep top 0.1% of models
        cdf_thresh=2e-3,          # CDF cutoff
        Ndraws=250,               # posterior samples per source
        Nmc_prior=50,             # MC prior samples
        running_io=True,          # write results incrementally
    )
""")

Advanced BruteForce.fit() Parameters
========================================================================

  logl_dim_prior  (default: True)
  --------------------------------------------------------------------
    Use chi-square dimensionality prior on the likelihood.
    True  -> penalizes models based on chi2 relative to DOF (recommended).
    False -> standard Gaussian log-likelihood (no DOF correction).

  apply_agewt  (default: True)
  --------------------------------------------------------------------
    Apply age-based weighting from MIST evolutionary tracks.
    Accounts for the different amounts of time a star spends
    in each evolutionary phase (e.g., main sequence vs. giant branch).

  apply_grad  (default: True)
  --------------------------------------------------------------------
    Apply gradient-based correction for non-uniform grid spacing.
    Corrects for the fact that model grid points are not evenly
    distributed in parameter space (mass, age, metallicity).

  av_gauss  (default: None)
  --------------------------------------------------------------------
    Gaussian prior on A(V) as a (mean, std) tuple.
    Example: av_gauss=(0.5, 0.2) centers extinction at A(V)=0.5 mag.
    If None, uses the distance-reddening relation from the dust map.

  wt_thresh  (default: 1e-3)
  --------------------------------------------------------------------
    Weight threshold for model selection: keeps models with
    weight > wt_thresh * max(weight). Lower values keep more
    models (broader posterior tails) but increase output size.

  cdf_thresh  (default: 2e-3)
  --------------------------------------------------------------------
    CDF threshold for model selection. Models contributing less
    than this fraction of cumulative probability are discarded.
    Used as fallback if wt_thresh is None.

  Ndraws  (default: 250)
  --------------------------------------------------------------------
    Number of posterior samples saved per object.
    More draws give smoother posteriors but larger output files.
    Typical range: 100 (quick) to 500 (publication quality).

  Nmc_prior  (default: 50)
  --------------------------------------------------------------------
    Number of Monte Carlo samples for prior integration.
    Higher values improve accuracy of the distance/extinction
    marginalization at the cost of computation time.

  rv_gauss  (default: (3.32, 0.18))
  --------------------------------------------------------------------
    Gaussian prior on R(V) as a (mean, std) tuple.
    Default based on Schlafly et al. (2016).
    Set to None to use a flat prior within rvlim bounds.

  running_io  (default: True)
  --------------------------------------------------------------------
    If True, writes each object to disk immediately after fitting.
    Safer for long runs (recoverable on crash) but slower on
    network filesystems. Set False to batch-write at the end.

========================================================================

Example usage with advanced options:

    bf.fit(
        data=flux, data_err=flux_err, data_mask=mask,
        data_labels=obj_ids, save_file='results.h5',
        # --- advanced options ---
        logl_dim_prior=True,      # recommended for robust fitting
        apply_agewt=True,         # weight by evolutionary timescales
        apply_grad=True,          # correct for grid spacing
        av_gauss=(0.3, 0.1),     # informative A(V) prior
        rv_gauss=(3.32, 0.18),   # Schlafly+2016 R(V) prior
        wt_thresh=1e-3,           # keep top 0.1% of models
        cdf_thresh=2e-3,          # CDF cutoff
        Ndraws=250,               # posterior samples per source
        Nmc_prior=50,             # MC prior samples
        running_io=True,          # write results incrementally
    )

7. Summary and Key Takeaways#

This tutorial has demonstrated how to fit individual stars using BruteForce:

Key Steps#

Data Preparation
- Convert magnitudes to flux (maggies)
- Ensure proper error arrays and masks
- Include parallax and coordinates when available
Model Setup
- Load pre-computed model grids (MIST)
- Apply photometric offsets for calibration
- Initialize BruteForce with models
Running Fits
- Configure priors (Galactic, dust, parallax)
- Set sampling parameters (Ndraws, Nmc_prior)
- Save results to HDF5 for analysis
Result Analysis
- Examine chi-squared distributions for fit quality
- Analyze parameter distributions and correlations
- Visualize individual SEDs and posteriors

Best Practices#

Data Quality: Require SNR > 5 and >= 4 bands per source
Photometric Offsets: Always apply appropriate calibrations
Priors: Use 3D dust maps and parallax when available
Validation: Check chi-squared/n values (should be ~1 for good fits)
Uncertainties: Use posterior samples for proper error estimation

Next Steps#

Tutorial 6: Cluster Analysis and Population Fitting
Tutorial 7: 3D Dust Mapping
Tutorial 8: Photometric Calibration

print("Tutorial 5 Complete!")
print("="*60)

tutorials_dir = Path(__file__).parent if '__file__' in dir() else Path('.')

print("\nOutput files:")
for h5_file in sorted(tutorials_dir.glob('Orion_l209.1_b-19.9_mist*.h5')):
    print(f"  - {h5_file.name}")

print("\nGenerated plots:")
for plot_file in sorted(plots_dir.glob('*.png')):
    print(f"  - {plot_file.name}")