Data Module (brutus.data)

The data module manages all external data dependencies for brutus, including MIST stellar evolution grids, isochrones, 3-D dust maps, neural network weights, and photometric calibration offsets. It uses pooch for automatic downloading and caching of data files.

Data Dependencies:

brutus requires several types of external data:

MIST Grids: HDF5 files with stellar evolutionary tracks (~750 MB)
MIST Isochrones: Tabulated isochrones for population synthesis (~250 MB)
Dust Maps: HEALPix 3-D extinction maps (Bayestar19, ~2 GB)
Neural Networks: Trained weights for bolometric corrections (~250 KB)
Photometric Offsets: Empirical calibration tables (<1 KB)

Automatic Data Management:

Data files are automatically downloaded on first use and cached locally. The default cache location is the directory returned by pooch.os_cache('astro-brutus') (~/.cache/astro-brutus/ on Linux). Set the ASTRO_BRUTUS_DATA_DIR environment variable to override it.
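The lookup order described above can be sketched as follows. This is a minimal illustration, not the code brutus itself runs: the helper name resolve_data_dir is hypothetical, and the fallback used when pooch is not installed is only the Linux default path.

```python
import os
from pathlib import Path


def resolve_data_dir(env=None):
    """Return the directory where data files would be cached.

    Hypothetical helper: the environment variable wins, otherwise the
    platform cache directory reported by pooch is used.
    """
    env = os.environ if env is None else env
    override = env.get("ASTRO_BRUTUS_DATA_DIR")
    if override:
        return Path(override).expanduser()
    try:
        import pooch  # optional here; only needed for the default location
        return Path(pooch.os_cache("astro-brutus"))
    except ImportError:
        # Assumed fallback: the Linux default quoted in the text above.
        return Path.home() / ".cache" / "astro-brutus"


print(resolve_data_dir())
```

Passing a dictionary as env makes the override easy to exercise without touching the real environment.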

Typical Usage:

First-time setup (download all data files):

from brutus.data import fetch_grids, fetch_isos, fetch_dustmaps

# Download MIST stellar evolution grids
fetch_grids()  # Downloads default grids for common filter sets

# Download MIST isochrones
fetch_isos()

# Download 3-D dust maps
fetch_dustmaps(dustmap='bayestar19')

Loading data for fitting:

from brutus.data import load_models
from brutus.core import StarGrid

# Load pre-computed grid (third return is the ancillary-label mask)
models, labels, label_mask = load_models('grid_mist_v9.h5')
grid = StarGrid(models, labels)

# Grid is now ready for use with BruteForce fitter

Custom data locations:

# Load grid from custom path
models, labels, label_mask = load_models('/path/to/my_custom_grid.h5')

Data File Formats:

Grids: HDF5 with datasets for models, labels, reddening coefficients
Isochrones: HDF5 with structured arrays
Dust Maps: HEALPix FITS files (via healpy)
Neural Networks: HDF5 files with layer weights (e.g. nn_c3k.h5)

Storage Requirements:

Minimal installation (grid + isochrones): ~1 GB
Full installation (with dust maps): ~3 GB
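The totals above follow from the per-file sizes listed under Data Dependencies. As a rough sketch, the same arithmetic can be combined with a free-space check before downloading. The dictionary keys, the helper names, and the 10% safety margin are illustrative assumptions; the sizes are the approximate values quoted in this page.

```python
import shutil

# Approximate sizes in MB, taken from the Data Dependencies list.
SIZES_MB = {
    "grids": 750,
    "isochrones": 250,
    "dustmaps": 2000,
    "neural_networks": 0.25,  # ~250 KB
    "offsets": 0.001,         # <1 KB, used as an upper bound
}


def required_mb(components):
    """Sum the approximate download size of the named components."""
    return sum(SIZES_MB[name] for name in components)


def has_room(path, components, margin=1.1):
    """Check that `path` has enough free space, with an assumed margin."""
    free_mb = shutil.disk_usage(path).free / 1e6
    return free_mb >= margin * required_mb(components)


minimal = required_mb(["grids", "isochrones"])
full = required_mb(SIZES_MB)
print(f"minimal: ~{minimal / 1000:.1f} GB, full: ~{full / 1000:.1f} GB")
print("room for full install here:", has_room(".", SIZES_MB))
```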

See Also:

Installation - Setting up data files
Grid-Based Fitting - Creating custom grids
Quick Start Guide - Basic data loading examples

Data Downloading

brutus.data.fetch_grids(grid='mist_v9', target_dir='.')

Downloads pre-computed stellar model grids (used for fast stellar parameter inference and photometric fitting) to target directory.

Parameters:

target_dir (str, optional) – The target directory where the file should be downloaded. If not specified, files will be downloaded to the current directory. Default is “.”.
grid (str, optional) – The desired grid file. Available options:
- 'mist_v9' (default): MIST v1.2 with empirical corrections (v9)
- 'mist_v8': MIST v1.2 with empirical corrections (v8)
- 'bayestar_v5': Bayestar models (v5)

Returns:

file_path – Path to the downloaded stellar model grid file.

Return type:

pathlib.Path

Raises:

ValueError – If the specified grid file does not exist in the registry.
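The ValueError behaviour can be pictured as a lookup in a name-to-file registry. The sketch below is an assumption about the general pattern, not the brutus implementation: resolve_grid_file and GRID_FILES are hypothetical names, and only grid_mist_v9.h5 is a file name that appears in this documentation (the other two are assumed to follow the same pattern).

```python
# Hypothetical registry mapping grid names to file names.
GRID_FILES = {
    "mist_v9": "grid_mist_v9.h5",
    "mist_v8": "grid_mist_v8.h5",          # assumed file name
    "bayestar_v5": "grid_bayestar_v5.h5",  # assumed file name
}


def resolve_grid_file(grid="mist_v9"):
    """Return the file name for `grid`, or raise ValueError if unknown."""
    try:
        return GRID_FILES[grid]
    except KeyError:
        options = ", ".join(sorted(GRID_FILES))
        raise ValueError(
            f"Unknown grid {grid!r}; available options: {options}"
        ) from None


print(resolve_grid_file())
try:
    resolve_grid_file("mist_v7")
except ValueError as err:
    print(err)
```

Callers of fetch_grids can wrap the call in the same try/except ValueError pattern to report a misspelled grid name.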

Examples

>>> from brutus.data import fetch_grids
>>> grid_path = fetch_grids(target_dir='./data/DATAFILES/')
>>> print(f"Downloaded model grid to: {grid_path}")

>>> # Download older version
>>> old_grid = fetch_grids(target_dir='./data/DATAFILES/', grid='mist_v8')

brutus.data.fetch_isos(target_dir='.', iso='MIST_1.2_vvcrit0.0')

Download isochrone files to target directory.

Parameters:

target_dir (str, optional) – The target directory where the file should be downloaded. If not specified, files will be downloaded to the current directory. Default is “.”.
iso (str, optional) – The desired isochrone file. Available options:
- 'MIST_1.2_vvcrit0.0' (default): Non-rotating MIST v1.2 isochrones
- 'MIST_1.2_vvcrit0.4': Rotating MIST v1.2 isochrones

Returns:

file_path – Path to the downloaded isochrone file.

Return type:

pathlib.Path

Raises:

ValueError – If the specified isochrone file does not exist in the registry.

Examples

>>> from brutus.data import fetch_isos
>>> iso_path = fetch_isos(target_dir='./data/DATAFILES/')
>>> print(f"Downloaded isochrones to: {iso_path}")

>>> # Download rotating models
>>> rotating_path = fetch_isos(target_dir='./data/DATAFILES/', iso='MIST_1.2_vvcrit0.4')

brutus.data.fetch_tracks(target_dir='.', track='MIST_1.2_vvcrit0.0')

Download EEP (Equivalent Evolutionary Point) track files to target directory.

Parameters:

target_dir (str, optional) – The target directory where the file should be downloaded. If not specified, files will be downloaded to the current directory. Default is “.”.
track (str, optional) – The desired track file. Available options:
- 'MIST_1.2_vvcrit0.0' (default): Non-rotating MIST v1.2 tracks

Returns:

file_path – Path to the downloaded evolutionary track file.

Return type:

pathlib.Path

Raises:

ValueError – If the specified track file does not exist in the registry.

Examples

>>> from brutus.data import fetch_tracks
>>> track_path = fetch_tracks(target_dir='./data/DATAFILES/')
>>> print(f"Downloaded tracks to: {track_path}")

brutus.data.fetch_dustmaps(target_dir='.', dustmap='bayestar19')

Download 3D dust extinction map files to target directory.

Parameters:

target_dir (str, optional) – The target directory where the file should be downloaded. If not specified, files will be downloaded to the current directory. Default is “.”.
dustmap (str, optional) – The desired dust map file. Available options:
- 'bayestar19' (default): Bayestar dust map from Green et al. (2019)

Returns:

file_path – Path to the downloaded dust map file.

Return type:

pathlib.Path

Raises:

ValueError – If the specified dust map file does not exist in the registry.

Examples

>>> from brutus.data import fetch_dustmaps
>>> dust_path = fetch_dustmaps(target_dir='./data/DATAFILES/')
>>> print(f"Downloaded dust map to: {dust_path}")

brutus.data.fetch_offsets(target_dir='.', grid='mist_v9')

Downloads photometric offset files (used to calibrate systematic differences between observed and model photometry) to target directory.

Parameters:

target_dir (str, optional) – The target directory where the file should be downloaded. If not specified, files will be downloaded to the current directory. Default is “.”.
grid (str, optional) – The associated model grid for the offsets. Available options:
- 'mist_v9' (default): Offsets for MIST v1.2 with corrections (v9)
- 'mist_v8': Offsets for MIST v1.2 with corrections (v8)
- 'bayestar_v5': Offsets for Bayestar models (v5)

Returns:

file_path – Path to the downloaded photometric offset file.

Return type:

pathlib.Path

Raises:

ValueError – If the specified offset file does not exist in the registry.

Examples

>>> from brutus.data import fetch_offsets
>>> offset_path = fetch_offsets(target_dir='./data/DATAFILES/')
>>> print(f"Downloaded offsets to: {offset_path}")

brutus.data.fetch_nns(target_dir='.', model='c3k')

Download pre-trained neural network model files (used for fast spectral energy distribution prediction) to target directory.

Parameters:

target_dir (str, optional) – The target directory where the file should be downloaded. If not specified, files will be downloaded to the current directory. Default is “.”.
model (str, optional) – The desired neural network model file. Available options:
- 'c3k' (default): Network trained on C3K spectral models

Returns:

file_path – Path to the downloaded neural network file.

Return type:

pathlib.Path

Raises:

ValueError – If the specified neural network file does not exist in the registry.

Examples

>>> from brutus.data import fetch_nns
>>> nn_path = fetch_nns(target_dir='./data/DATAFILES/')
>>> print(f"Downloaded neural network to: {nn_path}")

Data Loading

brutus.data.load_models(filepath, filters=None, labels=None, include_ms=True, include_postms=True, include_binaries=False, verbose=True)

Loads pre-computed stellar model grids with photometric coefficients for multiple filters and stellar parameters. Models can be filtered by evolutionary phase and binary status.

Parameters:

filepath (str) – The filepath of the stellar model file (typically .h5 format).
filters (iterable of str with length Nfilt, optional) – List of filters that will be loaded. If not provided, will default to all available filters. See the internally-defined FILTERS variable for more details on filter names. Any filters that are not available will be skipped over.
labels (iterable of str with length Nlabel, optional) – List of labels associated with the set of imported stellar models. Any labels that are not available will be skipped over. The default set is ['mini', 'feh', 'eep', 'smf', 'loga', 'logl', 'logt', 'logg', 'Mr', 'agewt'].
include_ms (bool, optional) – Whether to include objects on the Main Sequence. Applied as a cut on eep <= 454 when ‘eep’ is included. Default is True.
include_postms (bool, optional) – Whether to include objects evolved off the Main Sequence. Applied as a cut on eep > 454 when ‘eep’ is included. Default is True.
include_binaries (bool, optional) – Whether to include unresolved binaries. Applied as a cut on secondary mass fraction (‘smf’) when it has been included. Default is False. If set to False, ‘smf’ is not returned as a label.
verbose (bool, optional) – Whether to print progress messages. Default is True.

Returns:

models (~numpy.ndarray of shape (Nmodel, Nfilt, Ncoef)) – Array of models comprised of coefficients in each band used to describe the photometry as a function of reddening, parameterized in terms of A_V. Each model contains coefficients for:
- Unreddened magnitude
- Reddening vector for R_V = 0
- Change in reddening vector as a function of R_V
labels (structured ~numpy.ndarray with dimensions (Nmodel, Nlabel)) – A structured array with the labels corresponding to each model. Contains stellar parameters like initial mass, metallicity, age, etc.
label_mask (structured ~numpy.ndarray with dimensions (1, Nlabel)) – A structured array that masks ancillary labels associated with predictions (rather than those used to compute the model grid).
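To make the role of the three coefficients concrete, the sketch below evaluates reddened magnitudes for a single model. It assumes the coefficients are stored in the order listed above (unreddened magnitude, reddening vector at R_V = 0, change with R_V) and that they combine linearly as mag0 + A_V * (r0 + R_V * dr). The function name and the numerical values are made up for illustration; check the brutus source for the exact convention.

```python
def reddened_mags(coeffs, av, rv):
    """Evaluate magnitudes in each filter for one model.

    coeffs: sequence of (mag0, r0, dr) tuples, one per filter
            (assumed coefficient order).
    av:     extinction A_V.
    rv:     reddening-law parameter R_V.
    """
    return [mag0 + av * (r0 + rv * dr) for mag0, r0, dr in coeffs]


# Illustrative coefficients for two filters of a single model.
coeffs = [(5.0, 1.0, 0.1), (4.5, 0.8, 0.05)]

print(reddened_mags(coeffs, av=0.0, rv=3.0))  # no dust: unreddened magnitudes
print(reddened_mags(coeffs, av=0.5, rv=3.0))  # fainter in both bands
```

With A_V = 0 the result reduces to the unreddened magnitudes, which is a quick sanity check on the coefficient ordering.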

Raises:

ValueError – If neither main sequence nor post-main sequence models are included (i.e. include_ms and include_postms are both False).
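The evolutionary-phase cut described for include_ms and include_postms can be sketched as a boolean selection on EEP. This is only an illustration of the documented rule (eep <= 454 for the Main Sequence, eep > 454 afterwards); select_by_phase is a hypothetical helper, and the real function operates on arrays loaded from the grid file.

```python
MS_EEP_MAX = 454  # Main Sequence boundary quoted in the parameter list


def select_by_phase(eep, include_ms=True, include_postms=True):
    """Return one boolean per model saying whether it is kept."""
    if not (include_ms or include_postms):
        raise ValueError(
            "At least one of include_ms and include_postms must be True."
        )
    keep = []
    for value in eep:
        on_ms = value <= MS_EEP_MAX
        keep.append((on_ms and include_ms) or (not on_ms and include_postms))
    return keep


eep = [200, 454, 455, 600]
print(select_by_phase(eep, include_postms=False))  # [True, True, False, False]
```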

Notes

label_mask is a boolean structured array that distinguishes the labels used to generate the grid from ancillary labels derived from it. For example, if luminosity is predicted from mass, age, and metallicity, the mask is False for luminosity. StarGrid uses it internally to determine which parameters to marginalize over during fitting.

Examples

Basic usage with default settings:

>>> from brutus.data import load_models, fetch_grids
>>> fetch_grids()  # Download data (first time only)
>>> models, labels, label_mask = load_models('grid_mist_v9.h5')
>>> print(f"Loaded {len(labels)} stellar models")
>>> print(f"Available labels: {labels.dtype.names}")

Loading specific filters only:

>>> models, labels, mask = load_models(
...     'grid_mist_v9.h5',
...     filters=['g', 'r', 'i', 'z', 'y']
... )

Loading only main sequence stars:

>>> models, labels, mask = load_models(
...     'grid_mist_v9.h5',
...     include_ms=True,
...     include_postms=False
... )

Using with StarGrid for fitting:

>>> from brutus.core import StarGrid
>>> from brutus.analysis import BruteForce
>>> models, labels, mask = load_models('grid_mist_v9.h5')
>>> grid = StarGrid(models, labels, mask)
>>> fitter = BruteForce(grid)

brutus.data.load_offsets(filepath, filters=None, verbose=True)

Loads multiplicative photometric offsets used to calibrate systematic differences between observed and synthetic photometry.

Parameters:

filepath (str) – The filepath of the photometric offsets file (typically .txt format).
filters (iterable of str with length Nfilt, optional) – List of filters that will be loaded. If not provided, will default to all available filters. See the internally-defined FILTERS variable for more details on filter names. Any filters that are not available will be skipped over.
verbose (bool, optional) – Whether to print a summary of the offsets. Default is True.

Returns:

offsets – Array of constants that the data are multiplied by to account for offsets (i.e. multiplicative flux offsets). Values are typically close to 1.0, with deviations indicating systematic differences.

Return type:

~numpy.ndarray of shape (Nfilt,)

Notes

The offset file should contain two columns: filter names and offset values. Filters not found in the file will be assigned an offset of 1.0 (no correction).
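A minimal sketch of the two-column format and the 1.0 default is shown below. It assumes whitespace-separated columns and '#' comment lines, which this page does not specify. parse_offsets and apply_offsets are hypothetical helpers, and the filter names and values are illustrative.

```python
import io


def parse_offsets(text, filters):
    """Read 'name value' pairs and return one offset per requested filter."""
    table = {}
    for line in io.StringIO(text):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed comment lines
        name, value = line.split()[:2]
        table[name] = float(value)
    # Filters missing from the file get 1.0, i.e. no correction.
    return [table.get(name, 1.0) for name in filters]


def apply_offsets(fluxes, offsets):
    """Multiply observed fluxes by their per-filter offsets."""
    return [flux * offset for flux, offset in zip(fluxes, offsets)]


sample = """# filter offset
g 1.02
r 0.99
"""

offsets = parse_offsets(sample, ["g", "r", "i"])
corrected = apply_offsets([10.0, 20.0, 30.0], offsets)
print(offsets)    # [1.02, 0.99, 1.0]
print(corrected)
```

Because the offsets are multiplicative in flux, a filter with offset 1.0 passes through unchanged.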

Examples

>>> import numpy as np
>>> from brutus.data import load_offsets
>>> offsets = load_offsets('./data/DATAFILES/offsets_mist_v9.txt')
>>> print(f"Loaded offsets for {len(offsets)} filters")

>>> # Load specific filters
>>> gri_offsets = load_offsets('./data/DATAFILES/offsets_mist_v9.txt',
...                            filters=['g', 'r', 'i'])

>>> # Check which filters have significant offsets
>>> significant = np.abs(offsets - 1.0) > 0.01
>>> print(f"Filters with >1% offsets: {np.sum(significant)}")

brutus.data.find_nn_file(possible_names=('nn_c3k.h5', 'nnMIST_BC.h5'))

Find the neural network model file in standard locations.

Searches for the neural network HDF5 file used for bolometric correction predictions, checking the local data/DATAFILES directory first, then the pooch cache directory.

Parameters:

possible_names (tuple of str, optional) – Filenames to search for, tried in order. Default is ("nn_c3k.h5", "nnMIST_BC.h5"). The first file found is returned.

Returns:

path – Absolute path to the neural network file.

Return type:

pathlib.Path

Raises:

FileNotFoundError – If none of the candidate files can be found in any searched location.
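The search described above can be sketched with the standard library. The helper name find_first_file is hypothetical, and the loop order (directories outermost, candidate names inside) is an assumption about how "local directory first, then cache" combines with "names tried in order". The temporary directories stand in for data/DATAFILES and the pooch cache.

```python
import tempfile
from pathlib import Path


def find_first_file(search_dirs, possible_names=("nn_c3k.h5", "nnMIST_BC.h5")):
    """Return the first existing candidate, searching directories in order."""
    for directory in search_dirs:
        for name in possible_names:
            candidate = Path(directory) / name
            if candidate.is_file():
                return candidate.resolve()
    raise FileNotFoundError(
        f"None of {possible_names} found in {[str(d) for d in search_dirs]}"
    )


# Stand-ins for the local data directory and the cache directory.
with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as cache:
    (Path(cache) / "nn_c3k.h5").touch()
    found = find_first_file([local, cache])
    print(found.name)  # nn_c3k.h5
```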

Submodules

For advanced users who need access to internal implementations:

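brutus.data.download: data downloading utilities for brutus.

This module contains the functions for downloading stellar evolution models, isochrones, dust maps, and other data files required by brutus: fetch_grids, fetch_isos, fetch_tracks, fetch_dustmaps, fetch_offsets, and fetch_nns. Each is also exposed directly as brutus.data.<name> and documented under Data Downloading. All downloads use the Pooch library for robust data management.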


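brutus.data.loader: data loading utilities for brutus.

This module contains the functions for loading stellar evolution models, photometric offsets, and other data files into memory for use in stellar fitting and analysis: find_nn_file, load_models, and load_offsets. Each is also exposed directly as brutus.data.<name> and documented under Data Loading.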

