Core Concepts
[!NOTE] For information about Rompy's formatting and logging system, see formatting_and_logging.
For details on using the command line interface, see cli.
This section delves into the fundamental components that make up Rompy's architecture. If you're new to Rompy, start with the User Guide before diving into these concepts.
Rompy is a modular library with configuration and execution separated by design. The core framework consists of two primary concepts:
ModelRun
Bases: RompyBaseModel
A model run.
It is intended to be model agnostic. It deals primarily with how the model is to be run, i.e. the period of the run and where the output goes. The actual configuration of the run is provided by the config object.
Further explanation is given in the rompy.core.BaseConfig docstring.
Source code in rompy/model.py
Attributes
model_type
class-attribute
instance-attribute
period
class-attribute
instance-attribute
period: TimeRange = Field(TimeRange(start=datetime(2020, 2, 21, 4), end=datetime(2020, 2, 24, 4), interval='15M'), description='The time period to run the model')
output_dir
class-attribute
instance-attribute
config
class-attribute
instance-attribute
config: Union[CONFIG_TYPES] = Field(default_factory=BaseConfig, description='The configuration object', discriminator='model_type')
delete_existing
class-attribute
instance-attribute
run_id_subdir
class-attribute
instance-attribute
staging_dir
property
The directory where the model is staged for execution
returns
staging_dir : str
Functions
generate
Generate the model input files
returns
staging_dir : str
Source code in rompy/model.py
zip
Zip the input files for the model run
This function zips the input files for the model run and returns the name of the zip file. It also cleans up the staging directory leaving only the settings.json file that can be used to reproduce the run.
returns
zip_fn : str
Source code in rompy/model.py
run
Run the model using the specified backend configuration.
This method uses Pydantic configuration objects that provide type safety and validation for all backend parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| backend | BackendConfig | Pydantic configuration object (LocalConfig, DockerConfig, etc.) | required |
| workspace_dir | Optional[str] | Path to generated workspace directory (optional) | None |
Returns:
| Type | Description |
|---|---|
| bool | True if execution was successful, False otherwise |
Raises:
| Type | Description |
|---|---|
| TypeError | If backend is not a BackendConfig instance |
Examples:
    from rompy.backends import LocalConfig, DockerConfig

    # Local execution
    model.run(LocalConfig(timeout=3600, command="python run.py"))

    # Docker execution
    model.run(DockerConfig(image="swan:latest", cpu=4, memory="2g"))
Source code in rompy/model.py
postprocess
Postprocess the model outputs using the specified processor.
This method uses entry points to load and execute the appropriate postprocessor. Available processors are automatically discovered from the rompy.postprocess entry point group.
Built-in processors:
- "noop": A placeholder processor that does nothing but returns success
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| processor | str | Name of the postprocessor to use (default: "noop") | 'noop' |
| **kwargs | | Additional processor-specific parameters | {} |
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | Dictionary with results from the postprocessing |
Raises:
| Type | Description |
|---|---|
| ValueError | If the specified processor is not available |
Source code in rompy/model.py
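The discovery step described above can be sketched with the standard library alone. This is a minimal illustration, not Rompy's implementation: the helper names `discover_processors` and `load_processor` are hypothetical, and only the entry point group name `rompy.postprocess` and the `ValueError` contract come from the text.

```python
from importlib.metadata import entry_points


def discover_processors(group="rompy.postprocess"):
    """Map entry point names to entry point objects for one group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selection API.
        selected = eps.select(group=group)
    else:
        # Older interpreters return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def load_processor(name, group="rompy.postprocess"):
    """Load the named processor, or raise ValueError if it is not registered."""
    available = discover_processors(group)
    if name not in available:
        raise ValueError(
            f"Unknown processor {name!r}; available: {sorted(available)}"
        )
    # Importing happens only here, so unused plugins are never loaded.
    return available[name].load()
```

Looking the name up before calling `load()` keeps the error message informative: the caller sees which processors are actually installed rather than an import failure.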
pipeline
Run the complete model pipeline (generate, run, postprocess) using the specified pipeline backend.
This method executes the entire model workflow from input generation through running the model to postprocessing outputs. It uses entry points to load and execute the appropriate pipeline backend from the rompy.pipeline entry point group.
Built-in pipeline backends:
- "local": Execute the complete pipeline locally using the existing ModelRun methods
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_backend | str | Name of the pipeline backend to use (default: "local") | 'local' |
| **kwargs | | Additional backend-specific parameters. Common parameters include: run_backend (backend to use for the run stage, for local pipeline); processor (processor to use for postprocessing, for local pipeline); run_kwargs (additional parameters for the run stage); process_kwargs (additional parameters for postprocessing) | {} |
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | Dictionary with results from the pipeline execution |
Raises:
| Type | Description |
|---|---|
| ValueError | If the specified pipeline backend is not available |
Source code in rompy/model.py
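The generate, run, postprocess sequence of a local pipeline might look roughly like the sketch below. Everything here is an assumption made for illustration: `StubModel` stands in for a ModelRun, `local_pipeline` is a hypothetical function, the keys of the returned dictionary are invented, and skipping postprocessing after a failed run is a design choice of this sketch rather than documented behaviour.

```python
from typing import Any, Dict, Optional


class StubModel:
    """Hypothetical stand-in exposing the three documented stages."""

    def generate(self) -> str:
        return "staging/run1"

    def run(self, backend: Any, **kwargs: Any) -> bool:
        return True

    def postprocess(self, processor: str = "noop", **kwargs: Any) -> Dict[str, Any]:
        return {"success": True, "processor": processor}


def local_pipeline(
    model: Any,
    run_backend: Any,
    processor: str = "noop",
    run_kwargs: Optional[Dict[str, Any]] = None,
    process_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Run the three stages in order and collect their results."""
    results: Dict[str, Any] = {
        "staging_dir": model.generate(),  # stage 1: write the input files
        "run_success": False,
        "postprocess": None,
    }
    # Stage 2: execute the model with the chosen backend.
    results["run_success"] = model.run(run_backend, **(run_kwargs or {}))
    if not results["run_success"]:
        return results  # assumption: do not postprocess a failed run
    # Stage 3: hand the outputs to the named processor.
    results["postprocess"] = model.postprocess(processor, **(process_kwargs or {}))
    return results


summary = local_pipeline(StubModel(), run_backend=object())
```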
BaseConfig
Bases: RompyBaseModel
Base class for model templates.
The template class provides the object that is used to set up the model configuration. When implemented for a given model, it can range in complexity to suit the application.
In its most basic form, as implemented in this base object, it consists of a path to a cookiecutter template, with the class providing the context for the {{config}} values in that template. Note that any {{runtime}} values are filled from the ModelRun object.
If the template is a git repo, the checkout parameter can be used to specify a branch or tag, and it will be cloned and used.
If the object is callable, it will be called prior to rendering the template. This mechanism can be used to perform tasks such as fetching external data, or providing additional context to the template beyond the arguments provided by the user.
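The split between {{config}} and {{runtime}} values, and the call that happens before rendering, can be illustrated with a dependency-free sketch. It swaps cookiecutter's Jinja2 rendering for a small regular expression substitution, so it only shows the data flow. `SketchConfig`, `render`, the dotted placeholder form, and the argument passed to the callable are all assumptions.

```python
import re

# Matches placeholders such as "{{ config.grid }}" or "{{runtime.run_id}}".
PLACEHOLDER = re.compile(r"\{\{\s*(config|runtime)\.(\w+)\s*\}\}")


class SketchConfig:
    """Hypothetical callable config supplying the `config` context."""

    def __init__(self, **values):
        self.values = dict(values)

    def __call__(self, runtime):
        # Called before rendering: a hook for fetching data or adding context.
        self.values["n_steps"] = len(runtime["times"])
        return self.values


def render(template, config, runtime):
    """Fill config values from the config object and runtime values from the run."""
    config_context = config(runtime) if callable(config) else config.values
    context = {"config": config_context, "runtime": runtime}

    def substitute(match):
        namespace, key = match.group(1), match.group(2)
        return str(context[namespace][key])

    return PLACEHOLDER.sub(substitute, template)


template = "grid={{ config.grid }} steps={{config.n_steps}} run={{ runtime.run_id }}"
rendered = render(template, SketchConfig(grid="coarse"), {"run_id": "run1", "times": [0, 1, 2]})
```

Here `n_steps` never appears in the user's arguments; it is added by the call, which is the kind of extra context the callable mechanism is meant to provide.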
Source code in rompy/core/config.py
Attributes
template
class-attribute
instance-attribute
template: Optional[str] = Field(description='The path to the model template', default=DEFAULT_TEMPLATE)
At a high level, ModelRun orchestrates the entire model execution process including generation, execution, and post-processing, while configuration objects are responsible for defining the model setup.
Core Component Categories
Grid Components
Grids define the spatial domain of models. Rompy provides several grid types:
BaseGrid
Bases: RompyBaseModel
Representation of a grid in geographic space.
This is the base class for all Grid objects. The minimum representation of a grid is two NumPy arrays representing the vertices or nodes of some structured or unstructured grid, its bounding box, and a boundary polygon. No knowledge of the grid connectivity is expected.
Source code in rompy/core/grid.py
Attributes
Functions
bbox
Returns a bounding box for the spatial grid
This function returns a list [ll_x, ll_y, ur_x, ur_y], where ll_x, ll_y (ur_x, ur_y) are the lower left (upper right) x and y coordinates of the bounding box of the model domain.
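As a minimal sketch of that return convention, assuming the grid nodes are available as plain sequences of x and y coordinates (the standalone function below is illustrative and not the method itself):

```python
from typing import List, Sequence


def bbox(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Return [ll_x, ll_y, ur_x, ur_y] for the given node coordinates."""
    return [min(x), min(y), max(x), max(y)]


# Four arbitrary nodes of an unstructured grid.
x = [115.0, 116.5, 116.0, 115.2]
y = [-33.0, -32.5, -31.8, -32.1]
box = bbox(x, y)  # [115.0, -33.0, 116.5, -31.8]
```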
Source code in rompy/core/grid.py
boundary
Returns the convex hull boundary polygon from the grid.
Parameters
tolerance : float
Simplify polygon shape based on maximum distance from original geometry, see https://shapely.readthedocs.io/en/stable/manual.html#object.simplify.
Returns
polygon : shapely.Polygon
See https://shapely.readthedocs.io/en/stable/manual.html#Polygon
Source code in rompy/core/grid.py
boundary_points
Returns array of coordinates from boundary polygon.
Parameters
tolerance : float
Simplify polygon shape based on maximum distance from original geometry, see https://shapely.readthedocs.io/en/stable/manual.html#object.simplify.
spacing : float
If specified, points are returned evenly spaced along the boundary at the specified spacing, otherwise all points are returned.
Returns:
points : tuple
Tuple of x and y coordinates of the boundary points.
Source code in rompy/core/grid.py
plot
Plot the grid
Source code in rompy/core/grid.py
RegularGrid
Bases: BaseGrid
Regular grid in geographic space.
This object provides an abstract representation of a regular grid in some geographic space.
Source code in rompy/core/grid.py
Attributes
grid_type
class-attribute
instance-attribute
x0
class-attribute
instance-attribute
y0
class-attribute
instance-attribute
rot
class-attribute
instance-attribute
dx
class-attribute
instance-attribute
dx: Optional[float] = Field(default=None, description='Spacing between grid points in the x direction')
dy
class-attribute
instance-attribute
dy: Optional[float] = Field(default=None, description='Spacing between grid points in the y direction')
nx
class-attribute
instance-attribute
ny
class-attribute
instance-attribute
Functions
generate
generate() -> RegularGrid
Generate the grid from the provided parameters.
Source code in rompy/core/grid.py
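How the attributes x0, y0, rot, dx, dy, nx and ny could combine into node coordinates is sketched below. The conventions are assumptions, not taken from the source: rot is treated as degrees counter-clockwise about the origin (x0, y0), and nx and ny are treated as node counts. The function `regular_grid_nodes` is hypothetical.

```python
import math


def regular_grid_nodes(x0, y0, dx, dy, nx, ny, rot=0.0):
    """Return (xs, ys) as ny-by-nx nested lists of node coordinates.

    Assumes `rot` is in degrees, counter-clockwise, applied about (x0, y0).
    """
    cos_r = math.cos(math.radians(rot))
    sin_r = math.sin(math.radians(rot))
    xs, ys = [], []
    for j in range(ny):
        row_x, row_y = [], []
        for i in range(nx):
            # Offsets in the unrotated grid frame.
            local_x, local_y = i * dx, j * dy
            # Rotate the offsets, then translate to the grid origin.
            row_x.append(x0 + local_x * cos_r - local_y * sin_r)
            row_y.append(y0 + local_x * sin_r + local_y * cos_r)
        xs.append(row_x)
        ys.append(row_y)
    return xs, ys


xs, ys = regular_grid_nodes(x0=0.0, y0=0.0, dx=1.0, dy=2.0, nx=3, ny=2)
```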
Data Components
Data objects represent and handle input data for models:
DataBlob
Bases: DataBase
Data source for model ingestion.
Generic data source for files that either need to be copied to the model directory
or linked if link is set to True.
Source code in rompy/core/data.py
Attributes
model_type
class-attribute
instance-attribute
model_type: Literal['data_blob', 'data_link'] = Field(default='data_blob', description='Model type discriminator')
source
class-attribute
instance-attribute
source: AnyPath = Field(description='URI of the data source, either a local file path or a remote uri')
link
class-attribute
instance-attribute
link: bool = Field(default=False, description='Whether to create a symbolic link instead of copying the file')
Functions
get
Copy or link the data source to a new directory.
Parameters
destdir : str | Path
The destination directory to copy or link the data source to.
Returns
Path
The path to the copied file or created symlink.
Source code in rompy/core/data.py
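The copy-or-link behaviour can be sketched with the standard library. This is a simplified illustration restricted to local files, whereas the source field above also accepts remote URIs. The function name `fetch` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def fetch(source, destdir, link=False):
    """Copy `source` into `destdir`, or symlink it when `link` is True."""
    source = Path(source)
    destdir = Path(destdir)
    destdir.mkdir(parents=True, exist_ok=True)
    target = destdir / source.name
    if link:
        # Link to the absolute path so the symlink survives directory changes.
        target.symlink_to(source.resolve())
    else:
        shutil.copy2(source, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "wind.nc"
    src.write_text("data")
    copied = fetch(src, Path(tmp) / "copy")
    linked = fetch(src, Path(tmp) / "link", link=True)
    copied_is_link = copied.is_symlink()   # False: an independent copy
    linked_is_link = linked.is_symlink()   # True: points back at the source
    linked_content = linked.read_text()
```

Linking avoids duplicating large input files, at the cost of the staged run depending on the source remaining in place.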
DataGrid
Bases: DataPoint
Data object for gridded source data.
Generic data object for xarray datasets with gridded spatial dimensions.
Note
The fields filter_grid and filter_time trigger updates to the crop filter from
the grid and time range objects passed to the get method. This is useful for data
sources that are not defined on the same grid as the model grid or the same time
range as the model run.
Source code in rompy/core/data.py
Attributes
model_type
class-attribute
instance-attribute
source
class-attribute
instance-attribute
source: Union[SOURCE_TYPES] = Field(description='Source reader, must return an xarray gridded dataset in the open method', discriminator='model_type')
Functions
plot
plot(param, isel={}, model_grid=None, cmap='turbo', figsize=None, fscale=10, borders=True, land=True, coastline=True, **kwargs)
Plot the grid.
Source code in rompy/core/data.py
SourceBase
Bases: RompyBaseModel, ABC
Abstract base class for a source dataset.
Source code in rompy/core/source.py
Attributes
model_type
class-attribute
instance-attribute
model_type: Literal['base_source'] = Field(description='Model type discriminator, must be overriden by a subclass')
Functions
open
Return the filtered dataset object.
Parameters
variables : list, optional
List of variables to select from the dataset.
filters : Filter, optional
Filters to apply to the dataset.
Notes
The kwargs are only a placeholder in case a subclass needs to pass additional arguments to the open method.
Source code in rompy/core/source.py
SourceFile
Bases: SourceBase
Source dataset from file to open with xarray.open_dataset.
Source code in rompy/core/source.py
Attributes
model_type
class-attribute
instance-attribute
uri
class-attribute
instance-attribute
kwargs
class-attribute
instance-attribute
variable
class-attribute
instance-attribute
SourceIntake
Bases: SourceBase
Source dataset from intake catalog.
note
The intake catalog can be prescribed either by the URI of an existing catalog file
or by a YAML string defining the catalog. The YAML string can be obtained from
calling the yaml() method on an intake dataset instance.
Source code in rompy/core/source.py
Attributes
model_type
class-attribute
instance-attribute
dataset_id
class-attribute
instance-attribute
catalog_uri
class-attribute
instance-attribute
catalog_uri: Optional[str | Path] = Field(default=None, description='The URI of the catalog to read from')
catalog_yaml
class-attribute
instance-attribute
catalog_yaml: Optional[str] = Field(default=None, description='The YAML string of the catalog to read from')
kwargs
class-attribute
instance-attribute
kwargs: dict = Field(default={}, description='Keyword arguments to define intake dataset parameters')
Functions
check_catalog
check_catalog() -> SourceIntake
Source code in rompy/core/source.py
Boundary Components
Boundary conditions specify model forcing at domain edges:
BoundaryWaveStation
Bases: DataBoundary
Wave boundary data from station datasets.
Note
The tolerance behaves differently with sel_methods idw and nearest; in idw
sites without enough neighbours within tolerance are masked, whereas in nearest
an exception is raised (see wavespectra documentation for more details).
Note
Be aware that when using idw, missing values will be returned for sites with fewer
than 2 neighbours within tolerance in the original dataset. This is acceptable for
land-masked areas but could cause problems at an open boundary location. To
avoid this, either use nearest or increase tolerance to include more neighbours.
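The difference between the two selection methods can be made concrete with a small scalar sketch. It does not call wavespectra and it interpolates a single number rather than a spectrum. The function `select_site`, the use of NaN as the mask value, and the `ValueError` exception type are assumptions for illustration.

```python
import math


def select_site(sites, target, tolerance, method="idw"):
    """Estimate a value at `target` from (x, y, value) station tuples."""
    nearby = []
    for x, y, value in sites:
        distance = math.hypot(x - target[0], y - target[1])
        if distance <= tolerance:
            nearby.append((distance, value))

    if method == "nearest":
        if not nearby:
            # No station within tolerance is treated as an error.
            raise ValueError("No station within tolerance of the target")
        return min(nearby, key=lambda item: item[0])[1]

    if method == "idw":
        if len(nearby) < 2:
            return math.nan  # masked: fewer than 2 neighbours within tolerance
        for distance, value in nearby:
            if distance == 0.0:
                return value  # target coincides with a station
        weights = [1.0 / distance for distance, _ in nearby]
        weighted = sum(w * value for w, (_, value) in zip(weights, nearby))
        return weighted / sum(weights)

    raise ValueError(f"Unknown method {method!r}")


stations = [(0.0, 0.0, 1.0), (2.0, 0.0, 3.0), (10.0, 10.0, 9.0)]
midpoint = select_site(stations, (1.0, 0.0), tolerance=1.5)            # 2.0
isolated = select_site(stations, (9.0, 9.0), tolerance=1.5)            # nan
fallback = select_site(stations, (9.0, 9.0), tolerance=1.5, method="nearest")  # 9.0
```

The isolated point shows the open boundary risk described in the note: idw silently returns a missing value where nearest still finds the single station in range.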
Source code in rompy/core/boundary.py
Attributes
grid_type
class-attribute
instance-attribute
grid_type: Literal['boundary_wave_station'] = Field(default='boundary_wave_station', description='Model type discriminator')
source
class-attribute
instance-attribute
source: Union[SOURCE_TYPES] = Field(description='Dataset source reader, must return a wavespectra-enabled xarray dataset in the open method', discriminator='model_type')
sel_method
class-attribute
instance-attribute
sel_method: Literal['idw', 'nearest'] = Field(default='idw', description='Wavespectra method to use for selecting boundary points from the dataset')
buffer
class-attribute
instance-attribute
buffer: float = Field(default=2.0, description='Space to buffer the grid bounding box if `filter_grid` is True')
Functions
model_post_init
get
get(destdir: str | Path, grid: RegularGrid, time: Optional[TimeRange] = None) -> str
Write the selected boundary data to a netcdf file.
Parameters
destdir : str | Path
Destination directory for the netcdf file.
grid : RegularGrid
Grid instance to use for selecting the boundary points.
time: TimeRange, optional
The times to filter the data to, only used if self.crop_data is True.
Returns
outfile : Path
Path to the netcdf file.
Source code in rompy/core/boundary.py
SourceWavespectra
Bases: SourceBase
Wavespectra dataset from wavespectra reader.
Source code in rompy/core/source.py
Spectrum Components
Spectral representations for wave models:
LogFrequency
Bases: RompyBaseModel
Logarithmic wave frequencies.
Frequencies are defined according to:
$$f_{i+1} = \gamma f_{i}$$
Note
The number of frequency bins nbin is always kept unchanged when provided. This
implies other parameters may be adjusted so nbin bins can be defined. Specify
f0, f1 and finc and let nbin be calculated to avoid those values changing.
Note
Choose finc=0.1 for a 10% increment between frequencies that satisfies the DIA.
Examples
    from rompy.core.spectrum import LogFrequency

    LogFrequency(f0=0.04, f1=1.0, nbin=34)
    LogFrequency(f0=0.04, f1=1.0, finc=0.1)
    LogFrequency(f0=0.04, nbin=34, finc=0.1)
    LogFrequency(f1=1.0, nbin=34, finc=0.1)
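The relation between f0, f1, finc and nbin can be worked through without Rompy. The sketch below assumes that gamma equals 1 + finc, which matches the description of finc=0.1 as a 10% increment, and that the frequency array holds nbin + 1 values. Both helper functions are hypothetical, and rounding to the nearest integer is a choice made here, not necessarily what init_options does.

```python
import math


def log_frequencies(f0, finc, nbin):
    """Return nbin + 1 frequencies with f[i + 1] = gamma * f[i]."""
    gamma = 1.0 + finc  # assumption: the increment is relative
    return [f0 * gamma**i for i in range(nbin + 1)]


def nbin_from_range(f0, f1, finc):
    """Number of bins needed to span f0..f1 at the given increment."""
    return round(math.log(f1 / f0) / math.log(1.0 + finc))


freqs = log_frequencies(f0=0.04, finc=0.1, nbin=34)
nbin = nbin_from_range(f0=0.04, f1=1.0, finc=0.1)
```

With these inputs the last frequency lands slightly above 1.0 rather than exactly on it, which illustrates why one of the four parameters has to be adjusted when the other three are fixed.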
Source code in rompy/core/spectrum.py
Attributes
model_type
class-attribute
instance-attribute
f0
class-attribute
instance-attribute
f1
class-attribute
instance-attribute
finc
class-attribute
instance-attribute
nbin
class-attribute
instance-attribute
nbin: Optional[int] = Field(default=None, description='Number of frequency bins, one less the size of frequency array', gt=0)
Functions
init_options
init_options() -> LogFrequency
Set the missing frequency parameters.
Source code in rompy/core/spectrum.py
Architecture Patterns
Configuration Validation
All Rompy configurations use Pydantic models, providing type safety and validation:
- Automatic validation of configuration parameters
- Clear error messages for invalid configurations
- Serialization/deserialization capabilities for reproducibility
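The three properties listed above can be illustrated with a dependency-free sketch. It deliberately uses standard library dataclasses in place of Pydantic, so the validation is hand-written rather than derived from type annotations. `GridSettings` and its fields are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GridSettings:
    """Hypothetical configuration validated on construction."""

    nx: int
    ny: int
    dx: float
    dy: float

    def __post_init__(self):
        for name in ("nx", "ny"):
            value = getattr(self, name)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")
        for name in ("dx", "dy"):
            value = getattr(self, name)
            if not isinstance(value, (int, float)) or value <= 0:
                raise ValueError(f"{name} must be a positive number, got {value!r}")

    def to_json(self) -> str:
        """Serialize to a stable JSON string for reproducibility."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "GridSettings":
        """Rebuild the configuration; validation runs again on load."""
        return cls(**json.loads(text))


settings = GridSettings(nx=3, ny=2, dx=0.5, dy=0.5)
restored = GridSettings.from_json(settings.to_json())
```

Because loading goes through the constructor, a hand-edited or corrupted settings file fails with the same error message as an invalid in-code configuration.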
Plugin Architecture
Rompy's plugin system allows for extensibility:
- Model configurations via rompy.config entry points
- Execution backends via rompy.run entry points
- Post-processors via rompy.postprocess entry points
Backend Abstraction
Execution backends abstract the computational environment:
- Local execution for development
- Docker execution for containerized workflows
- HPC execution for high-performance computing
- Cloud execution for scalable computing
Best Practices
Configuration Design
- Use Type Safety: Leverage Pydantic models for configuration validation
- Modular Configuration: Keep components modular and reusable
- Serialization: Ensure configurations are fully serializable for reproducibility
- Documentation: Document configuration options and default values
Model Integration
- Template-based Generation: Use cookiecutter templates for model input generation
- Environment Agnostic: Design models to run in different computational environments
- Data Abstraction: Abstract data sources to support multiple input formats
Next Steps
- For detailed configuration options, see Configuration Deep Dive
- To understand the overall architecture, see Architecture Overview
- For practical examples of using these concepts, see Progressive Tutorials
- To learn about implementing different models, see Models