Tutorial 0: Data Setup and Environment Verification

This is the starting point for the brutus tutorial series. Before diving into stellar modeling, fitting, or dust mapping, you need to ensure your environment is correctly configured and understand how brutus manages its data files.

Topics Covered

  1. Environment check – Verify Python version, brutus installation, and key dependencies

  2. Filter system – Understand the photometric filter groups available in brutus

  3. Data download functions – Learn how to fetch model grids, tracks, isochrones, and dust maps

  4. Loading models and offsets – Understand the data structures returned by the loaders

  5. Data status check – Verify which tutorial data files are available locally

Important

This notebook is designed to run without any downloaded data files. Cells that require data are guarded with availability checks and will print informative messages when data is missing. This lets you understand the full data pipeline before committing to any downloads.
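The guard pattern described above can be sketched in a few lines. This is only an illustration using the standard library; the helper name `guarded` and the file path are hypothetical and are not part of brutus or `tutorial_utils`.

```python
from pathlib import Path


def guarded(path, action):
    """Run action(path) only if the file exists; otherwise report what is missing."""
    path = Path(path)
    if not path.is_file():
        print(f"  Skipping: {path.name} not found (download it to run this cell)")
        return None
    return action(path)


# Hypothetical path that does not exist, so the action is never called.
result = guarded("no_such_dir_for_demo/hypothetical_grid.h5",
                 lambda p: p.stat().st_size)
```

A cell written this way prints a short message and moves on instead of raising an exception when the data has not been downloaded.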

# Setup: imports and plot style
import sys
import warnings
from pathlib import Path

import numpy as np

warnings.filterwarnings("ignore")

# Tutorial utilities
from tutorial_utils import (
    setup_tutorial,
    find_brutus_data_file,
    check_data_requirements,
    print_section,
)

# Run the standardized tutorial bootstrap
info = setup_tutorial(0, title="Tutorial 00: Data Setup and Environment Verification")
Tutorial 00: Data Setup and Environment Verification
====================================================

Checking data requirements for Tutorial 0
=========================================

  All required files available

1. Environment Check

Let’s verify that Python, brutus, and the key scientific dependencies are installed and importable.

print_section("Python")
print(f"  Python version : {sys.version}")
print(f"  Executable     : {sys.executable}")

print_section("brutus")
try:
    import brutus
    print(f"  brutus version : {brutus.__version__}")
    print(f"  Location       : {Path(brutus.__file__).parent}")
except ImportError as e:
    print(f"  brutus is NOT installed: {e}")
    print("  Install with: pip install astro-brutus")

print_section("Key Dependencies")
deps = [
    "numpy", "scipy", "matplotlib", "h5py",
    "numba", "pooch", "astropy",
]
for dep in deps:
    try:
        mod = __import__(dep)
        version = getattr(mod, "__version__", "unknown")
        print(f"  {dep:15s} : {version}")
    except ImportError:
        print(f"  {dep:15s} : NOT INSTALLED")

# Optional dependencies
print_section("Optional Dependencies")
opt_deps = ["healpy", "tqdm"]
for dep in opt_deps:
    try:
        mod = __import__(dep)
        version = getattr(mod, "__version__", "unknown")
        print(f"  {dep:15s} : {version}")
    except ImportError:
        print(f"  {dep:15s} : not installed (optional)")
Python
======
  Python version : 3.13.5 | packaged by Anaconda, Inc. | (main, Jun 12 2025, 16:17:47) [GCC 11.2.0]
  Executable     : /home/joshspeagle/miniconda3/bin/python

brutus
======
  brutus version : 1.0.0
  Location       : /mnt/c/Users/joshs/Dropbox/GitHub/brutus/src/brutus

Key Dependencies
================
  numpy           : 2.2.6
  scipy           : 1.16.1
  matplotlib      : 3.10.6
  h5py            : 3.14.0
  numba           : 0.61.2
  pooch           : v1.8.2
  astropy         : 7.1.0

Optional Dependencies
=====================
  healpy          : 1.18.1
  tqdm            : 4.67.1
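If importing a heavy package just to read its version is undesirable, the same check can be done without importing it. The sketch below uses only the standard library; the helper name `dependency_status` is hypothetical. One caveat: `importlib.metadata.version` expects the distribution name, which can differ from the import name (for example, brutus is installed as `astro-brutus`), so a package can be importable yet report "unknown" here.

```python
import importlib.util
from importlib import metadata


def dependency_status(name):
    """Return (installed, version) for a top-level module without importing it."""
    if importlib.util.find_spec(name) is None:
        return False, None
    try:
        return True, metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this exact name.
        return True, "unknown"


for dep in ["numpy", "scipy", "healpy"]:
    installed, version = dependency_status(dep)
    label = version if installed else "NOT INSTALLED"
    print(f"  {dep:15s} : {label}")
```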

2. Filter System

brutus supports a wide range of photometric filter systems. The full list of available filters is defined in brutus.data.filters. Filters are organized into groups by survey or instrument.

The MIST grids support all filters below, while the Bayestar grids only support Pan-STARRS grizy and 2MASS JHKs.

from brutus.data.filters import FILTERS
from brutus.data import filters

print_section("Available Filter Groups")

# Build a summary table of filter groups
filter_groups = {
    "Gaia (DR3)": filters.gaia,
    "SDSS": filters.sdss,
    "Pan-STARRS": filters.ps,
    "DECam": filters.decam,
    "Bessell": filters.bessell,
    "2MASS": filters.tmass,
    "VISTA": filters.vista,
    "UKIDSS": filters.ukidss,
    "WISE": filters.wise,
    "Tycho": filters.tycho,
    "Hipparcos": filters.hipp,
    "Kepler": filters.kepler,
    "TESS": filters.tess,
}

print(f"\n  {'Group':<16s} {'Count':>5s}   Filters")
print(f"  {'-'*16} {'-'*5}   {'-'*50}")
total = 0
for group_name, group_filters in filter_groups.items():
    n = len(group_filters)
    total += n
    filt_str = ", ".join(group_filters)
    print(f"  {group_name:<16s} {n:>5d}   {filt_str}")

print(f"  {'-'*16} {'-'*5}")
print(f"  {'TOTAL':<16s} {total:>5d}")
print(f"\n  FILTERS list length: {len(FILTERS)}")
Available Filter Groups
=======================

  Group            Count   Filters
  ---------------- -----   --------------------------------------------------
  Gaia (DR3)           3   Gaia_G_MAW, Gaia_BP_MAWf, Gaia_RP_MAW
  SDSS                 5   SDSS_u, SDSS_g, SDSS_r, SDSS_i, SDSS_z
  Pan-STARRS           7   PS_g, PS_r, PS_i, PS_z, PS_y, PS_w, PS_open
  DECam                6   DECam_u, DECam_g, DECam_r, DECam_i, DECam_z, DECam_Y
  Bessell              5   Bessell_U, Bessell_B, Bessell_V, Bessell_R, Bessell_I
  2MASS                3   2MASS_J, 2MASS_H, 2MASS_Ks
  VISTA                5   VISTA_Z, VISTA_Y, VISTA_J, VISTA_H, VISTA_Ks
  UKIDSS               5   UKIDSS_Z, UKIDSS_Y, UKIDSS_J, UKIDSS_H, UKIDSS_K
  WISE                 4   WISE_W1, WISE_W2, WISE_W3, WISE_W4
  Tycho                2   Tycho_B, Tycho_V
  Hipparcos            1   Hipparcos_Hp
  Kepler               2   Kepler_D51, Kepler_Kp
  TESS                 1   TESS
  ---------------- -----
  TOTAL               49

  FILTERS list length: 49
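Because the Bayestar grids cover only a subset of the filters, it can help to check a requested filter list against the supported set before loading a grid. The following is a minimal sketch in plain Python. The `BAYESTAR_FILTERS` list is an assumption assembled from the "Pan-STARRS grizy and 2MASS JHKs" statement and the filter names printed above, and `split_supported` is a hypothetical helper, not a brutus function.

```python
# Assumed Bayestar-compatible subset, built from the filter names listed above.
BAYESTAR_FILTERS = [
    "PS_g", "PS_r", "PS_i", "PS_z", "PS_y",
    "2MASS_J", "2MASS_H", "2MASS_Ks",
]


def split_supported(requested, supported):
    """Split requested filters into (supported, unsupported), preserving order."""
    ok = [f for f in requested if f in supported]
    missing = [f for f in requested if f not in supported]
    return ok, missing


ok, missing = split_supported(
    ["PS_g", "SDSS_u", "2MASS_J", "WISE_W1"], BAYESTAR_FILTERS
)
print("usable :", ok)
print("dropped:", missing)
```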

3. Data Download Functions

brutus data files are hosted on Harvard Dataverse and managed locally via Pooch, which keeps its cache at ~/.cache/astro-brutus/. Each fetch_* function downloads a specific category of data and accepts a target_dir argument; note that the signatures printed later in this section show target_dir='.' (the current directory) as the default. Files are verified by SHA256 hash and only downloaded once.

The available download functions are:

  Function           What it downloads                           Typical file
  ----------------   -----------------------------------------   -------------------------
  fetch_nns()        Neural network for bolometric corrections   nn_c3k.h5
  fetch_tracks()     EEP evolutionary tracks                     MIST_1.2_EEPtrk.h5
  fetch_isos()       Isochrone tables                            MIST_1.2_iso_vvcrit0.0.h5
  fetch_grids()      Pre-computed model grids for fitting        grid_mist_v9.h5
  fetch_offsets()    Photometric offset calibrations             offsets_mist_v9.txt
  fetch_dustmaps()   3D dust extinction maps                     bayestar2019_v1.h5

from brutus.data import (
    fetch_nns,
    fetch_tracks,
    fetch_isos,
    fetch_grids,
    fetch_offsets,
    fetch_dustmaps,
)
import inspect

print_section("Download Function Signatures")

fetch_funcs = [
    ("fetch_nns", fetch_nns),
    ("fetch_tracks", fetch_tracks),
    ("fetch_isos", fetch_isos),
    ("fetch_grids", fetch_grids),
    ("fetch_offsets", fetch_offsets),
    ("fetch_dustmaps", fetch_dustmaps),
]

for name, func in fetch_funcs:
    sig = inspect.signature(func)
    print(f"\n  brutus.data.{name}{sig}")
    # Extract first line of docstring (tolerate functions with no docstring)
    doc_first = (inspect.getdoc(func) or "").split("\n")[0]
    print(f"    -> {doc_first}")

# Show the cache path
print()
try:
    import pooch
    cache_path = pooch.os_cache("astro-brutus")
    print(f"  Cache location: {cache_path}")
except ImportError:
    print("  Cache location: (pooch not installed)")

# -----------------------------------------------------------------------
# Uncomment the lines below to download data files.
# Files are cached in the Pooch directory and only downloaded once.
# -----------------------------------------------------------------------

# fetch_nns()          # ~50 MB  -- Neural network (needed by most tutorials)
# fetch_tracks()       # ~60 MB  -- EEP tracks (Tutorials 1, 3)
# fetch_isos()         # ~200 MB -- Isochrones (Tutorials 2, 4, 6)
# fetch_grids()        # ~2 GB   -- MIST model grid (Tutorials 3, 5, 8)
# fetch_offsets()      # ~1 KB   -- Photometric offsets (Tutorials 5, 6, 8)
# fetch_dustmaps()     # ~4 GB   -- Bayestar 3D dust map (Tutorials 4, 7)
Download Function Signatures
============================

  brutus.data.fetch_nns(target_dir='.', model='c3k')
    -> Download pre-trained neural network model files (used for fast spectral energy

  brutus.data.fetch_tracks(target_dir='.', track='MIST_1.2_vvcrit0.0')
    -> Download EEP (Equivalent Evolutionary Point) track files to target directory.

  brutus.data.fetch_isos(target_dir='.', iso='MIST_1.2_vvcrit0.0')
    -> Download isochrone files to target directory.

  brutus.data.fetch_grids(target_dir='.', grid='mist_v9')
    -> Downloads pre-computed stellar model grids (used for fast stellar

  brutus.data.fetch_offsets(target_dir='.', grid='mist_v9')
    -> Downloads photometric offset files (used to calibrate systematic

  brutus.data.fetch_dustmaps(target_dir='.', dustmap='bayestar19')
    -> Download 3D dust extinction map files to target directory.

  Cache location: /home/joshspeagle/.cache/astro-brutus
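To make the "verified by SHA256 hash and only downloaded once" behavior concrete, here is a minimal standard-library sketch of the idea. It is not Pooch's implementation; the helper names `sha256_of` and `needs_download` are hypothetical, and the demo file is a throwaway created in a temporary directory.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large grids never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path, known_hash):
    """True if the file is absent or its hash does not match the expected one."""
    path = Path(path)
    return not path.is_file() or sha256_of(path) != known_hash


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    missing_before = needs_download(demo, "0" * 64)   # file not there yet
    demo.write_bytes(b"brutus")
    good_hash = hashlib.sha256(b"brutus").hexdigest()
    cached_ok = not needs_download(demo, good_hash)   # present and intact
    corrupted = needs_download(demo, "0" * 64)        # present, wrong hash

print(missing_before, cached_ok, corrupted)
```

A cached file that passes the hash check is reused; a missing or mismatched file triggers a fresh download.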

4. Loading Models

Once downloaded, pre-computed stellar model grids are loaded with load_models(). This function returns three arrays:

  • models: Shape (Nmodel, Nfilt, 3) – photometric coefficients (unreddened magnitude, reddening vector, R_V dependence) for each model and filter.

  • labels: Structured array of shape (Nmodel,) – stellar parameters for each model (initial mass, [Fe/H], EEP, log(age), etc.).

  • label_mask: Structured array of shape (1,) – boolean mask indicating which labels are grid inputs vs. derived predictions.
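For readers new to NumPy structured arrays, the sketch below builds small synthetic stand-ins for labels and label_mask and shows how the fields are accessed. The values are made up and only four of the label names are used; nothing here reads a real brutus grid.

```python
import numpy as np

names = ("mini", "feh", "eep", "loga")

# Synthetic labels: one record per model, one named field per parameter.
labels = np.zeros(3, dtype=[(n, "f8") for n in names])
labels["mini"] = [0.5, 1.0, 2.0]
labels["feh"] = [-1.0, 0.0, 0.25]

# Synthetic mask: a single record of booleans, True for grid inputs.
label_mask = np.zeros(1, dtype=[(n, "?") for n in names])
for n in ("mini", "feh", "eep"):
    label_mask[n] = True

grid_inputs = [n for n in label_mask.dtype.names if label_mask[n][0]]
derived = [n for n in label_mask.dtype.names if not label_mask[n][0]]

# Boolean selection on a field works just as it does on a plain array.
solar_like = labels[np.abs(labels["feh"]) < 0.1]

print("grid inputs:", grid_inputs)
print("derived    :", derived)
print("solar-like :", solar_like["mini"])
```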

from brutus.data import load_models

print_section("load_models() Demo")

# Try to find and load a grid file
grid_file = None
for fname in ["grid_mist_v9.h5", "grid_bayestar_v5.h5"]:
    try:
        grid_file = find_brutus_data_file(fname)
        print(f"  Found grid file: {fname}")
        break
    except FileNotFoundError:
        continue

if grid_file is not None:
    # Load with a small set of filters for demonstration
    demo_filters = ["PS_g", "PS_r", "PS_i", "2MASS_J", "2MASS_Ks"]
    models, labels, label_mask = load_models(
        grid_file, filters=demo_filters, verbose=True
    )

    print(f"\n  models shape     : {models.shape}")
    print(f"    -> (Nmodel={models.shape[0]}, Nfilt={models.shape[1]}, Ncoef={models.shape[2]})")
    print(f"  labels dtype     : {labels.dtype.names}")
    print(f"  labels shape     : {labels.shape}")
    print(f"  label_mask dtype : {label_mask.dtype.names}")
    print(f"  label_mask values: {dict(zip(label_mask.dtype.names, label_mask[0]))}")

    # Show label ranges
    print_section("Label Ranges", char="-")
    for name in labels.dtype.names:
        vals = labels[name]
        finite = vals[np.isfinite(vals)]
        if len(finite) > 0:
            print(f"  {name:8s}: min={finite.min():10.4f}  max={finite.max():10.4f}")
        else:
            print(f"  {name:8s}: (no finite values)")
else:
    print("  No grid file found locally. Showing expected output structure:")
    print()
    print("  models shape     : (Nmodel, Nfilt, 3)")
    print("    -> Nmodel ~ hundreds of thousands of stellar models")
    print("    -> Nfilt  = number of requested photometric filters")
    print("    -> Ncoef  = 3 coefficients per filter (magnitude, reddening, R_V dependence)")
    print("  labels dtype     : ('mini', 'feh', 'eep', 'loga', 'logl', 'logt', 'logg', 'agewt')")
    print("  label_mask       : True for grid inputs (mini, feh, eep), False for derived params")
    print()
    print("  To download a grid file, run:")
    print("    from brutus.data import fetch_grids")
    print("    fetch_grids()")
load_models() Demo
==================
  Found grid file: grid_mist_v9.h5
Reading entire dataset (49 filters) once...
Extracting 5 requested filters from memory...
  models shape     : (613530, 5, 3)
    -> (Nmodel=613530, Nfilt=5, Ncoef=3)
  labels dtype     : ('mini', 'feh', 'eep', 'loga', 'logl', 'logt', 'logg', 'agewt')
  labels shape     : (613530,)
  label_mask dtype : ('mini', 'feh', 'eep', 'loga', 'logl', 'logt', 'logg', 'agewt')
  label_mask values: {'mini': np.True_, 'feh': np.True_, 'eep': np.True_, 'loga': np.False_, 'logl': np.False_, 'logt': np.False_, 'logg': np.False_, 'agewt': np.False_}

Label Ranges
------------
  mini    : min=    0.5000  max=    2.0000
  feh     : min=   -3.0000  max=    0.4500
  eep     : min=  202.0000  max=  808.0000
  loga    : min=    6.4619  max=   10.1400
  logl    : min=   -1.4623  max=    3.4528
  logt    : min=    3.4458  max=    4.1537
  logg    : min=   -0.2019  max=    4.9059
  agewt   : min=    0.0096  max=514949633.5270
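To illustrate how three coefficients per filter could turn into a reddened magnitude, the sketch below assumes a linear form, m = m0 + A_V * (r0 + R_V * dr), applied to a synthetic array of the same layout. Both the formula and the function name `reddened_mags` are assumptions for illustration; this tutorial does not state the exact combination brutus uses, so consult the library documentation before relying on it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in with the documented layout (Nmodel, Nfilt, 3).
models = rng.normal(size=(4, 5, 3))


def reddened_mags(models, av, rv):
    """Assumed linear reddening law: m = m0 + av * (r0 + rv * dr)."""
    mag0 = models[:, :, 0]   # unreddened magnitude
    r0 = models[:, :, 1]     # reddening vector
    dr = models[:, :, 2]     # change in reddening vector with R_V
    return mag0 + av * (r0 + rv * dr)


mags = reddened_mags(models, av=0.5, rv=3.3)
print(mags.shape)
```

With av=0 the function returns the unreddened magnitudes unchanged, which is a quick sanity check on any formula of this kind.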

5. Loading Photometric Offsets

Photometric offsets correct for systematic differences between observed and synthetic photometry. They are loaded with load_offsets(), which returns a 1D array of multiplicative flux corrections (values close to 1.0).

from brutus.data import load_offsets

print_section("load_offsets() Demo")

# Try to find an offsets file
offset_file = None
for fname in ["offsets_mist_v9.txt", "offsets_bs_v5.txt"]:
    try:
        offset_file = find_brutus_data_file(fname)
        print(f"  Found offset file: {fname}")
        break
    except FileNotFoundError:
        continue

if offset_file is not None:
    demo_filters = ["PS_g", "PS_r", "PS_i", "PS_z", "PS_y",
                    "2MASS_J", "2MASS_H", "2MASS_Ks"]
    offsets = load_offsets(offset_file, filters=demo_filters, verbose=False)

    print(f"  offsets shape: {offsets.shape}")
    print()
    print(f"  {'Filter':<12s} {'Offset':>8s} {'Correction':>12s}")
    print(f"  {'-'*12} {'-'*8} {'-'*12}")
    for filt, off in zip(demo_filters, offsets):
        pct = 100.0 * (off - 1.0)
        print(f"  {filt:<12s} {off:8.5f} {pct:+11.3f}%")
else:
    print("  No offset file found locally. Showing expected output structure:")
    print()
    print("  offsets shape: (Nfilt,)")
    print("  Values are multiplicative flux corrections, typically within several percent of 1.0")
    print("  Applied to DATA (not models): corrected_flux = flux * offset")
    print()
    print("  To download offset files, run:")
    print("    from brutus.data import fetch_offsets")
    print("    fetch_offsets()")
load_offsets() Demo
===================
  Found offset file: offsets_mist_v9.txt
  offsets shape: (8,)

  Filter         Offset   Correction
  ------------ -------- ------------
  PS_g          1.00000      +0.000%
  PS_r          0.95000      -5.000%
  PS_i          0.96000      -4.000%
  PS_z          0.94000      -6.000%
  PS_y          0.95000      -5.000%
  2MASS_J       0.98000      -2.000%
  2MASS_H       1.04000      +4.000%
  2MASS_Ks      1.03000      +3.000%
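As a worked example of what a multiplicative offset does, the sketch below applies three of the offsets listed above to made-up fluxes, following the relation corrected_flux = flux * offset quoted in the fallback message of the cell. Scaling the flux errors by the same factor, and the conversion to an equivalent magnitude shift, are assumptions added for illustration.

```python
import numpy as np

filters = ["PS_g", "PS_r", "2MASS_H"]
offsets = np.array([1.00, 0.95, 1.04])   # values from the table above
flux = np.array([10.0, 20.0, 30.0])      # made-up fluxes, arbitrary units
flux_err = np.array([0.1, 0.2, 0.3])     # made-up uncertainties

# Offsets are applied to the data, not to the models.
corrected_flux = flux * offsets
corrected_err = flux_err * offsets       # assumption: errors scale with the flux

# Equivalent magnitude shift: positive means the corrected source is fainter.
delta_mag = -2.5 * np.log10(offsets)

for f, c, dm in zip(filters, corrected_flux, delta_mag):
    print(f"  {f:<10s} corrected flux = {c:7.3f}   delta mag = {dm:+.4f}")
```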

6. Data Status Check

The following cell checks which data files are available locally for each tutorial in the series. This helps you plan which files to download depending on which tutorials you want to run.

print_section("Data Availability Summary")
print()

tutorial_titles = {
    0: "Data Setup",
    1: "Individual Stars",
    2: "Populations",
    3: "Grids & Performance",
    4: "Galactic Priors",
    5: "Fitting Individual",
    6: "Cluster Analysis",
    7: "Dust Mapping",
    8: "Photometric Calibration",
    9: "Utilities",
    10: "Plotting",
    11: "Results",
}

all_ok = True
for tut_num in range(12):
    ok, missing = check_data_requirements(tut_num, verbose=False)
    title = tutorial_titles.get(tut_num, "")
    status = "READY" if ok else f"MISSING {len(missing)} file(s)"
    marker = "[+]" if ok else "[-]"
    print(f"  {marker} Tutorial {tut_num:2d}: {title:<28s} {status}")
    if not ok:
        all_ok = False
        for f in missing:
            print(f"        -> {f}")

print()
if all_ok:
    print("  All tutorial data files are available.")
else:
    print("  Some files are missing. Use the fetch_* functions above to download them.")
    print("  You only need to download files for the tutorials you plan to run.")
Data Availability Summary
=========================

  [+] Tutorial  0: Data Setup                   READY
  [+] Tutorial  1: Individual Stars             READY
  [+] Tutorial  2: Populations                  READY
  [+] Tutorial  3: Grids & Performance          READY
  [+] Tutorial  4: Galactic Priors              READY
  [+] Tutorial  5: Fitting Individual           READY
  [+] Tutorial  6: Cluster Analysis             READY
  [+] Tutorial  7: Dust Mapping                 READY
  [+] Tutorial  8: Photometric Calibration      READY
  [+] Tutorial  9: Utilities                    READY
  [+] Tutorial 10: Plotting                     READY
  [+] Tutorial 11: Results                      READY

  All tutorial data files are available.
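The idea behind a per-tutorial availability check can be sketched with a small lookup table and a file-existence test. The `REQUIREMENTS` mapping below is hypothetical (the real lists live in `tutorial_utils` and may differ), and `missing_files` is an illustrative helper, not the actual `check_data_requirements`.

```python
import tempfile
from pathlib import Path

# Hypothetical requirement table, for illustration only.
REQUIREMENTS = {
    1: ["nn_c3k.h5", "MIST_1.2_EEPtrk.h5"],
    7: ["bayestar2019_v1.h5"],
}


def missing_files(tutorial, data_dir, requirements=REQUIREMENTS):
    """Return the required files for a tutorial that are absent from data_dir."""
    data_dir = Path(data_dir)
    return [
        name for name in requirements.get(tutorial, [])
        if not (data_dir / name).is_file()
    ]


with tempfile.TemporaryDirectory() as tmp:
    # Pretend only the neural network file has been downloaded.
    (Path(tmp) / "nn_c3k.h5").touch()
    status = {t: missing_files(t, tmp) for t in (0, 1, 7)}

for tut, missing in status.items():
    print(f"  Tutorial {tut:2d}: {'READY' if not missing else missing}")
```

A tutorial with no entry in the table is treated as needing no data, which matches the way the summary above reports tutorials without external files as ready.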

Summary

This tutorial covered the essential setup for working with brutus:

  1. Environment verification – Confirmed Python version, brutus installation, and key dependencies (numpy, scipy, numba, h5py, pooch, astropy, etc.).

  2. Filter system – brutus supports 49 filters across 13 survey/instrument groups (Gaia, Pan-STARRS, SDSS, DECam, 2MASS, WISE, and more). MIST grids support all filters; Bayestar grids support Pan-STARRS grizy + 2MASS.

  3. Data download functions – Six fetch_* functions download specific categories of data from Harvard Dataverse via Pooch, with SHA256 verification and automatic caching.

  4. Loading models – load_models() returns a (Nmodel, Nfilt, 3) array of photometric coefficients plus structured arrays of stellar labels and masks.

  5. Loading offsets – load_offsets() returns multiplicative flux corrections for calibrating systematic photometric differences.

  6. Data status – Checked availability of required files for all tutorials.

Next Steps

  • Tutorial 1: Individual star models with EEPTracks and StarEvolTrack

  • Tutorial 2: Stellar population models with Isochrone and StellarPop

  • Tutorial 3: Grid generation and performance optimization

  • Tutorial 4: Galactic priors (IMF, distance, extinction, metallicity)

  • Tutorial 5: Fitting individual sources with BruteForce

  • Tutorial 6: Cluster analysis with isochrone fitting

  • Tutorial 7: 3D dust mapping

  • Tutorial 8: Photometric calibration and offsets

  • Tutorial 9: Utility functions (photometry, likelihoods, sampling)

  • Tutorial 10: Plotting and visualization

  • Tutorial 11: Loading and interpreting BruteForce results