rompy.schism.data.SCHISMDataBoundary

pydantic model rompy.schism.data.SCHISMDataBoundary

This class is used to extract ocean boundary data from a gridded dataset at all open boundary nodes.

Show JSON schema
{
   "title": "SCHISMDataBoundary",
   "description": "This class is used to extract ocean boundary data from a griddd dataset at all open\nboundary nodes.",
   "type": "object",
   "properties": {
      "model_type": {
         "const": "boundary",
         "default": "data_boundary",
         "description": "Model type discriminator",
         "title": "Model Type",
         "type": "string"
      },
      "id": {
         "choices": [
            "elev2D",
            "uv3D",
            "TEM_3D",
            "SAL_3D",
            "bnd"
         ],
         "default": "bnd",
         "description": "SCHISM th id of the source",
         "title": "Id",
         "type": "string"
      },
      "source": {
         "description": "Source reader, must return an xarray gridded dataset in the open method",
         "discriminator": {
            "mapping": {
               "csv": "#/$defs/SourceTimeseriesCSV",
               "datamesh": "#/$defs/SourceDatamesh",
               "file": "#/$defs/SourceFile",
               "intake": "#/$defs/SourceIntake",
               "wavespectra": "#/$defs/SourceWavespectra"
            },
            "propertyName": "model_type"
         },
         "oneOf": [
            {
               "$ref": "#/$defs/SourceTimeseriesCSV"
            },
            {
               "$ref": "#/$defs/SourceDatamesh"
            },
            {
               "$ref": "#/$defs/SourceFile"
            },
            {
               "$ref": "#/$defs/SourceIntake"
            },
            {
               "$ref": "#/$defs/SourceWavespectra"
            }
         ],
         "title": "Source"
      },
      "link": {
         "default": false,
         "description": "Whether to create a symbolic link instead of copying the file",
         "title": "Link",
         "type": "boolean"
      },
      "filter": {
         "anyOf": [
            {
               "$ref": "#/$defs/Filter"
            },
            {
               "type": "null"
            }
         ],
         "description": "Optional filter specification to apply to the dataset"
      },
      "variables": {
         "description": "variable name in the dataset",
         "items": {
            "type": "string"
         },
         "title": "Variables",
         "type": "array"
      },
      "coords": {
         "anyOf": [
            {
               "$ref": "#/$defs/DatasetCoords"
            },
            {
               "type": "null"
            }
         ],
         "default": {
            "t": "time",
            "x": "longitude",
            "y": "latitude",
            "z": null,
            "s": null
         },
         "description": "Names of the coordinates in the dataset"
      },
      "crop_data": {
         "default": true,
         "description": "Update crop filter from Time object if passed to get method",
         "title": "Crop Data",
         "type": "boolean"
      },
      "buffer": {
         "default": 0.0,
         "description": "Space to buffer the grid bounding box if `filter_grid` is True",
         "title": "Buffer",
         "type": "number"
      },
      "time_buffer": {
         "default": [
            0,
            1
         ],
         "description": "Number of source data timesteps to buffer the time range if `filter_time` is True",
         "items": {
            "type": "integer"
         },
         "title": "Time Buffer",
         "type": "array"
      },
      "spacing": {
         "anyOf": [
            {
               "type": "number"
            },
            {
               "const": "parent",
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Spacing between points along the grid boundary to retrieve data for. If None (default), points are defined from the the actual grid object passed to the `get` method. If 'parent', the resolution of the parent dataset is used to define the spacing.",
         "title": "Spacing"
      },
      "sel_method": {
         "default": "interp",
         "description": "Xarray method to use for selecting boundary points from the dataset",
         "enum": [
            "sel",
            "interp"
         ],
         "title": "Sel Method",
         "type": "string"
      },
      "sel_method_kwargs": {
         "additionalProperties": true,
         "default": {},
         "description": "Keyword arguments for sel_method",
         "title": "Sel Method Kwargs",
         "type": "object"
      },
      "data_type": {
         "const": "boundary",
         "default": "boundary",
         "description": "Model type discriminator",
         "title": "Data Type",
         "type": "string"
      },
      "data_grid_source": {
         "anyOf": [
            {
               "$ref": "#/$defs/DataGrid"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "DataGrid source for boundary data"
      }
   },
   "$defs": {
      "DataGrid": {
         "additionalProperties": false,
         "description": "Data object for gridded source data.\n\nGeneric data object for xarray datasets that with gridded spatial dimensions\n\nNote\n----\nThe fields `filter_grid` and `filter_time` trigger updates to the crop filter from\nthe grid and time range objects passed to the get method. This is useful for data\nsources that are not defined on the same grid as the model grid or the same time\nrange as the model run.",
         "properties": {
            "model_type": {
               "const": "grid",
               "default": "grid",
               "description": "Model type discriminator",
               "title": "Model Type",
               "type": "string"
            },
            "id": {
               "default": "data",
               "description": "Unique identifier for this data source",
               "title": "Id",
               "type": "string"
            },
            "source": {
               "description": "Source reader, must return an xarray gridded dataset in the open method",
               "discriminator": {
                  "mapping": {
                     "csv": "#/$defs/SourceTimeseriesCSV",
                     "datamesh": "#/$defs/SourceDatamesh",
                     "file": "#/$defs/SourceFile",
                     "intake": "#/$defs/SourceIntake",
                     "wavespectra": "#/$defs/SourceWavespectra"
                  },
                  "propertyName": "model_type"
               },
               "oneOf": [
                  {
                     "$ref": "#/$defs/SourceTimeseriesCSV"
                  },
                  {
                     "$ref": "#/$defs/SourceDatamesh"
                  },
                  {
                     "$ref": "#/$defs/SourceFile"
                  },
                  {
                     "$ref": "#/$defs/SourceIntake"
                  },
                  {
                     "$ref": "#/$defs/SourceWavespectra"
                  }
               ],
               "title": "Source"
            },
            "link": {
               "default": false,
               "description": "Whether to create a symbolic link instead of copying the file",
               "title": "Link",
               "type": "boolean"
            },
            "filter": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/Filter"
                  },
                  {
                     "type": "null"
                  }
               ],
               "description": "Optional filter specification to apply to the dataset"
            },
            "variables": {
               "anyOf": [
                  {
                     "items": {
                        "type": "string"
                     },
                     "type": "array"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": [],
               "description": "Subset of variables to extract from the dataset",
               "title": "Variables"
            },
            "coords": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/DatasetCoords"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": {
                  "t": "time",
                  "x": "longitude",
                  "y": "latitude",
                  "z": null,
                  "s": null
               },
               "description": "Names of the coordinates in the dataset"
            },
            "crop_data": {
               "default": true,
               "description": "Update crop filters from Grid and Time objects if passed to get method",
               "title": "Crop Data",
               "type": "boolean"
            },
            "buffer": {
               "default": 0.0,
               "description": "Space to buffer the grid bounding box if `filter_grid` is True",
               "title": "Buffer",
               "type": "number"
            },
            "time_buffer": {
               "default": [
                  0,
                  0
               ],
               "description": "Number of source data timesteps to buffer the time range if `filter_time` is True",
               "items": {
                  "type": "integer"
               },
               "title": "Time Buffer",
               "type": "array"
            }
         },
         "required": [
            "source"
         ],
         "title": "DataGrid",
         "type": "object"
      },
      "DatasetCoords": {
         "additionalProperties": false,
         "description": "Coordinates representation.",
         "properties": {
            "t": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": "time",
               "description": "Name of the time coordinate",
               "title": "T"
            },
            "x": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": "longitude",
               "description": "Name of the x coordinate",
               "title": "X"
            },
            "y": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": "latitude",
               "description": "Name of the y coordinate",
               "title": "Y"
            },
            "z": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Name of the z coordinate",
               "title": "Z"
            },
            "s": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Name of the site coordinate",
               "title": "S"
            }
         },
         "title": "DatasetCoords",
         "type": "object"
      },
      "Filter": {
         "additionalProperties": false,
         "properties": {
            "sort": {
               "anyOf": [
                  {
                     "additionalProperties": true,
                     "type": "object"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": {},
               "title": "Sort"
            },
            "subset": {
               "anyOf": [
                  {
                     "additionalProperties": true,
                     "type": "object"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": {},
               "title": "Subset"
            },
            "crop": {
               "anyOf": [
                  {
                     "additionalProperties": true,
                     "type": "object"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": {},
               "title": "Crop"
            },
            "timenorm": {
               "anyOf": [
                  {
                     "additionalProperties": true,
                     "type": "object"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": {},
               "title": "Timenorm"
            },
            "rename": {
               "anyOf": [
                  {
                     "additionalProperties": true,
                     "type": "object"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": {},
               "title": "Rename"
            },
            "derived": {
               "anyOf": [
                  {
                     "additionalProperties": true,
                     "type": "object"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": {},
               "title": "Derived"
            }
         },
         "title": "Filter",
         "type": "object"
      },
      "SourceDatamesh": {
         "additionalProperties": false,
         "description": "Source dataset from Datamesh.\n\nDatamesh documentation: https://docs.oceanum.io/datamesh/index.html",
         "properties": {
            "model_type": {
               "const": "datamesh",
               "default": "datamesh",
               "description": "Model type discriminator",
               "title": "Model Type",
               "type": "string"
            },
            "datasource": {
               "description": "The id of the datasource on Datamesh",
               "title": "Datasource",
               "type": "string"
            },
            "token": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "description": "Datamesh API token, taken from the environment if not provided",
               "title": "Token"
            },
            "kwargs": {
               "additionalProperties": true,
               "default": {},
               "description": "Keyword arguments to pass to `oceanum.datamesh.Connector`",
               "title": "Kwargs",
               "type": "object"
            }
         },
         "required": [
            "datasource",
            "token"
         ],
         "title": "SourceDatamesh",
         "type": "object"
      },
      "SourceFile": {
         "additionalProperties": false,
         "description": "Source dataset from file to open with xarray.open_dataset.",
         "properties": {
            "model_type": {
               "const": "file",
               "default": "file",
               "description": "Model type discriminator",
               "title": "Model Type",
               "type": "string"
            },
            "uri": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "format": "path",
                     "type": "string"
                  }
               ],
               "description": "Path to the dataset",
               "title": "Uri"
            },
            "kwargs": {
               "additionalProperties": true,
               "default": {},
               "description": "Keyword arguments to pass to xarray.open_dataset",
               "title": "Kwargs",
               "type": "object"
            },
            "variable": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Variable to select from the dataset",
               "title": "Variable"
            }
         },
         "required": [
            "uri"
         ],
         "title": "SourceFile",
         "type": "object"
      },
      "SourceIntake": {
         "additionalProperties": false,
         "description": "Source dataset from intake catalog.\n\nnote\n----\nThe intake catalog can be prescribed either by the URI of an existing catalog file\nor by a YAML string defining the catalog. The YAML string can be obtained from\ncalling the `yaml()` method on an intake dataset instance.",
         "properties": {
            "model_type": {
               "const": "intake",
               "default": "intake",
               "description": "Model type discriminator",
               "title": "Model Type",
               "type": "string"
            },
            "dataset_id": {
               "description": "The id of the dataset to read in the catalog",
               "title": "Dataset Id",
               "type": "string"
            },
            "catalog_uri": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "format": "path",
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The URI of the catalog to read from",
               "title": "Catalog Uri"
            },
            "catalog_yaml": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The YAML string of the catalog to read from",
               "title": "Catalog Yaml"
            },
            "kwargs": {
               "additionalProperties": true,
               "default": {},
               "description": "Keyword arguments to define intake dataset parameters",
               "title": "Kwargs",
               "type": "object"
            }
         },
         "required": [
            "dataset_id"
         ],
         "title": "SourceIntake",
         "type": "object"
      },
      "SourceTimeseriesCSV": {
         "additionalProperties": false,
         "description": "Timeseries source class from CSV file.\n\nThis class should return a timeseries from a CSV file. The dataset variables are\ndefined from the column headers, therefore the appropriate read_csv kwargs must be\npassed to allow defining the columns. The time index is defined from column name\nidentified by the tcol field.",
         "properties": {
            "model_type": {
               "const": "csv",
               "default": "csv",
               "description": "Model type discriminator",
               "title": "Model Type",
               "type": "string"
            },
            "filename": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "format": "path",
                     "type": "string"
                  }
               ],
               "description": "Path to the csv file",
               "title": "Filename"
            },
            "tcol": {
               "default": "time",
               "description": "Name of the column containing the time data",
               "title": "Tcol",
               "type": "string"
            },
            "read_csv_kwargs": {
               "additionalProperties": true,
               "default": {},
               "description": "Keyword arguments to pass to pandas.read_csv",
               "title": "Read Csv Kwargs",
               "type": "object"
            }
         },
         "required": [
            "filename"
         ],
         "title": "SourceTimeseriesCSV",
         "type": "object"
      },
      "SourceWavespectra": {
         "additionalProperties": false,
         "description": "Wavespectra dataset from wavespectra reader.",
         "properties": {
            "model_type": {
               "const": "wavespectra",
               "default": "wavespectra",
               "description": "Model type discriminator",
               "title": "Model Type",
               "type": "string"
            },
            "uri": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "format": "path",
                     "type": "string"
                  }
               ],
               "description": "Path to the dataset",
               "title": "Uri"
            },
            "reader": {
               "description": "Name of the wavespectra reader to use, e.g., read_swan",
               "title": "Reader",
               "type": "string"
            },
            "kwargs": {
               "additionalProperties": true,
               "default": {},
               "description": "Keyword arguments to pass to the wavespectra reader",
               "title": "Kwargs",
               "type": "object"
            }
         },
         "required": [
            "uri",
            "reader"
         ],
         "title": "SourceWavespectra",
         "type": "object"
      }
   },
   "additionalProperties": false,
   "required": [
      "source"
   ]
}
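
As a rough illustration of the schema above, the following sketch builds a configuration dictionary using only field names that appear in the schema and checks a few of its constraints with plain Python. The file path `hycom.nc` and the variable name `surf_el` are hypothetical placeholders, and the `check_config` helper is invented for this example; it is not part of rompy and does not replace pydantic validation.

```python
import json

# Constraints copied from the JSON schema above.
REQUIRED = ["source"]
SEL_METHODS = {"sel", "interp"}
SOURCE_TYPES = {"csv", "datamesh", "file", "intake", "wavespectra"}


def check_config(config: dict) -> list:
    """Return a list of problems found in a SCHISMDataBoundary-style config."""
    problems = []
    for key in REQUIRED:
        if key not in config:
            problems.append(f"missing required field: {key}")
    source = config.get("source", {})
    if source.get("model_type") not in SOURCE_TYPES:
        problems.append("source.model_type is not a known discriminator")
    if config.get("sel_method", "interp") not in SEL_METHODS:
        problems.append("sel_method must be 'sel' or 'interp'")
    return problems


# Hypothetical example values; only the keys come from the schema.
config = {
    "id": "elev2D",
    "source": {"model_type": "file", "uri": "hycom.nc"},
    "variables": ["surf_el"],
    "coords": {"t": "time", "x": "longitude", "y": "latitude"},
    "sel_method": "interp",
    "time_buffer": [0, 1],
}

problems = check_config(config)
print(json.dumps(config, indent=2))
print("problems:", problems)
```

A dictionary of this shape would typically be passed to the model constructor or loaded from a YAML/JSON configuration file, where pydantic performs the full validation.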


field data_grid_source: DataGrid | None = None

DataGrid source for boundary data

field data_type: Literal['boundary'] = 'boundary'

Model type discriminator

field id: str = 'bnd'

SCHISM th id of the source

field sel_method: Literal['sel', 'interp'] = 'interp'

Xarray method to use for selecting boundary points from the dataset
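
The two options correspond to xarray's `sel` and `interp` methods. The difference can be sketched in one dimension with plain Python: selection picks the value at an existing grid coordinate (here the nearest one, which in xarray requires `method="nearest"`), while interpolation blends the two bracketing values. The helper names and sample numbers below are made up for illustration, and how rompy forwards `sel_method_kwargs` to xarray is not shown here.

```python
from bisect import bisect_left


def select_nearest(xs, values, x):
    """Return the value at the grid coordinate closest to x."""
    i = min(range(len(xs)), key=lambda k: abs(xs[k] - x))
    return values[i]


def interp_linear(xs, values, x):
    """Linearly interpolate between the two grid coordinates that bracket x."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the coordinate range")
    j = bisect_left(xs, x)
    if xs[j] == x:
        return values[j]
    x0, x1 = xs[j - 1], xs[j]
    w = (x - x0) / (x1 - x0)
    return values[j - 1] * (1 - w) + values[j] * w


# Hypothetical source grid longitudes and elevations.
lons = [150.0, 150.5, 151.0]
elev = [0.10, 0.30, 0.20]

boundary_lon = 150.2  # a boundary node that falls between grid points
print(select_nearest(lons, elev, boundary_lon))
print(interp_linear(lons, elev, boundary_lon))
```

Interpolation usually gives smoother forcing along the boundary, whereas selection preserves the source values unchanged.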

field time_buffer: list[int] = [0, 1]

Number of source data timesteps to buffer the time range if filter_time is True
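
One plausible reading of this field, assumed here rather than confirmed by the source, is that the first entry pads the start of the requested time range and the second pads the end, each counted in whole source timesteps. Under that assumption, the default `[0, 1]` keeps one extra source timestep after the end of the run. The `buffered_time_slice` helper below is hypothetical and only illustrates the idea.

```python
from datetime import datetime, timedelta


def buffered_time_slice(times, start, end, time_buffer=(0, 1)):
    """Return source timesteps covering [start, end], padded by whole timesteps."""
    inside = [i for i, t in enumerate(times) if start <= t <= end]
    if not inside:
        return []
    # Pad on each side, clamping to the available source data.
    first = max(inside[0] - time_buffer[0], 0)
    last = min(inside[-1] + time_buffer[1], len(times) - 1)
    return times[first:last + 1]


# Hypothetical 3-hourly source data for one day.
times = [datetime(2023, 1, 1) + timedelta(hours=3 * k) for k in range(8)]
selected = buffered_time_slice(
    times, datetime(2023, 1, 1, 6), datetime(2023, 1, 1, 12)
)
for t in selected:
    print(t.isoformat())
```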

field variables: list[str] [Optional]

Variable names in the dataset

boundary_ds(grid: SCHISMGrid, time: TimeRange | None) → Dataset

Generate SCHISM boundary dataset from source data.

This function extracts and formats boundary data for SCHISM from a source dataset. For 3D models, it handles vertical interpolation to the SCHISM sigma levels.
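
The vertical interpolation step can be sketched in plain Python under simplifying assumptions: a pure sigma grid where sigma runs from -1 at the bed to 0 at the surface, depths measured positive downwards, and linear interpolation with end values held constant outside the source range. This is not rompy's implementation, SCHISM's real vertical grid can be more involved, and the function names and sample profile are invented for illustration.

```python
def sigma_to_depth(sigma_levels, bottom_depth):
    """Convert sigma levels (-1 at the bed, 0 at the surface) to depths in metres."""
    return [-s * bottom_depth for s in sigma_levels]


def interp_profile(src_depths, src_values, target_depths):
    """Linearly interpolate a profile onto target depths.

    Values outside the source depth range are held at the nearest end value.
    """
    out = []
    for d in target_depths:
        if d <= src_depths[0]:
            out.append(src_values[0])
        elif d >= src_depths[-1]:
            out.append(src_values[-1])
        else:
            for k in range(1, len(src_depths)):
                if d <= src_depths[k]:
                    d0, d1 = src_depths[k - 1], src_depths[k]
                    w = (d - d0) / (d1 - d0)
                    out.append(src_values[k - 1] * (1 - w) + src_values[k] * w)
                    break
    return out


# Hypothetical source temperature profile on fixed depth levels (metres).
src_depths = [0.0, 10.0, 50.0, 100.0]
src_temp = [20.0, 18.0, 12.0, 8.0]

# Three sigma levels at a boundary node where the water is 40 m deep.
sigma = [-1.0, -0.5, 0.0]
depths = sigma_to_depth(sigma, bottom_depth=40.0)
temps = interp_profile(src_depths, src_temp, depths)
print(list(zip(depths, temps)))
```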

Parameters:
  • grid (SCHISMGrid) – The SCHISM grid to extract boundary data for

  • time (Optional[TimeRange]) – The time range to filter data to, if crop_data is True

Returns:

Dataset formatted for SCHISM boundary input

Return type:

xr.Dataset

get(destdir: str | Path, grid: SCHISMGrid, time: TimeRange | None = None) → str

Write the selected boundary data to a netcdf file.

Parameters:
  • destdir (str | Path) – Destination directory for the netcdf file.

  • grid (SCHISMGrid) – Grid instance to use for selecting the boundary points.

  • time (TimeRange, optional) – The times to filter the data to, only used if self.crop_data is True.

Returns:

outfile – Path to the netcdf file.

Return type:

Path

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.