Quality Control Package

HydroServer Quality Control Guide

HydroServer Python Client - Quality Control

The hydroserverpy quality control package provides methods for performing quality control operations on observation datasets. This guide shows how to retrieve data for quality control and how to run quality control operations on it. Quality-controlled data can be uploaded back to HydroServer as a new datastream with an appropriate processing level.

Quality Control Guide with Examples

To perform quality control operations, you must connect to HydroServer.

from hydroserverpy import HydroServer

# Initialize HydroServer connection with credentials.
hs_api = HydroServer(
    host='https://playground.hydroserver.org',
    username='user@example.com',
    password='******'
)

Select a datastream you want to perform quality control on and fetch its observations. You can optionally include result qualifier information with the fetched observations.

from datetime import datetime

...

# Get a Datastream
datastream = hs_api.datastreams.get(uid='00000000-0000-0000-0000-000000000000')

# Get Observations of a Datastream between two timestamps
observations_df = datastream.get_observations(
    include_quality=True,
    start_time=datetime(year=2023, month=1, day=1),
    end_time=datetime(year=2023, month=12, day=31)
)

Once you have a DataFrame of observations, you will need to initialize a quality control session.

from hydroserverpy import HydroServerQualityControl

...

# Initialize quality control session.
hs_quality_control = HydroServerQualityControl(
    datastream_id=datastream.uid,
    observations=observations_df
)

You are now ready to begin performing quality control operations on the dataset. You can access the modified observations using the observations property. Your work will not be saved back to HydroServer until you upload the quality controlled observations to a new datastream.

# Get quality controlled observations DataFrame.
quality_controlled_observations = hs_quality_control.observations

Example: Find Gaps

# Find gaps in observations given an expected 15 minute interval.
hs_quality_control.find_gaps(
    time_value=15,
    time_unit='m'
)

Submodules

hydroserverpy.quality.service module

class hydroserverpy.quality.service.FilterOperation(value)

Bases: Enum

Enumeration for filter operations.

E = 'E'
GT = 'GT'
GTE = 'GTE'
LT = 'LT'
LTE = 'LTE'

class hydroserverpy.quality.service.HydroServerQualityControl(datastream_id, observations)

Bases: object

Quality control operations for HydroServer observations.

Parameters:
  • datastream_id (Union[UUID, str]) – The ID of the datastream.

  • observations (pd.DataFrame) – DataFrame containing ‘timestamp’ and ‘value’ columns.

add_points(points, index=None)

Adds new points to the observations, optionally at specified indices.

Parameters:
  • points (List[List[Union[str, float]]]) – List of points to be added.

  • index (Optional[List[int]]) – Optional list of indices at which to insert the points.

Return type:

None

change_values(index_list, operator, value)

Changes the values of observations based on the specified operator and value.

Parameters:
  • index_list (List[int]) – List of indices for which values will be changed.

  • operator (str) – The operation to perform (‘MULT’, ‘DIV’, ‘ADD’, ‘SUB’, ‘ASSIGN’).

  • value (Union[int, float]) – The value to use in the operation.

Return type:

None
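
To make the operator names concrete, the following is a minimal plain-Python sketch of how such an operation could be applied to selected indices. It is independent of hydroserverpy: the helper `change_values_sketch` and the `OPERATIONS` mapping are hypothetical, it works on a plain list rather than a DataFrame, and it returns a copy instead of modifying data in place.

```python
import operator

# Operator names come from the parameter description above;
# the mapping to Python functions is an assumption of this sketch.
OPERATIONS = {
    "MULT": operator.mul,
    "DIV": operator.truediv,
    "ADD": operator.add,
    "SUB": operator.sub,
    "ASSIGN": lambda current, new: new,
}


def change_values_sketch(values, index_list, op_name, operand):
    """Return a copy of values with the operation applied at the given indices."""
    result = list(values)
    apply = OPERATIONS[op_name]
    for i in index_list:
        result[i] = apply(result[i], operand)
    return result


values = [1.0, 2.0, 3.0, 4.0]
scaled = change_values_sketch(values, [1, 2], "MULT", 10)
assigned = change_values_sketch(values, [0], "ASSIGN", -9999)
print(scaled)
print(assigned)
```

Only the listed indices change; every other value is carried over untouched.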

datastream_id: Union[UUID, str]

delete_points(index_list)

Deletes points from the observations at the specified indices.

Parameters:

index_list (List[int]) – List of indices for which points will be deleted.

Return type:

None

drift_correction(start, end, gap_width)

Applies drift correction to the values of observations within the specified index range.

Parameters:
  • start (int) – Start index of the range to apply drift correction.

  • end (int) – End index of the range to apply drift correction.

  • gap_width (float) – The width of the drift gap to correct.

Returns:

DataFrame after applying drift correction.

Return type:

pd.DataFrame
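
One common way to correct sensor drift is to spread the total drift linearly over the affected range, in proportion to elapsed time. The sketch below illustrates that idea in plain Python. It is an assumption about what a linear drift correction looks like, not a copy of the library's implementation; `drift_correction_sketch` is a hypothetical helper that takes separate timestamp and value lists.

```python
from datetime import datetime, timedelta


def drift_correction_sketch(timestamps, values, start, end, gap_width):
    """Linearly distribute gap_width over values[start..end] by elapsed time."""
    if not 0 <= start < end < len(values):
        raise ValueError("start must be before end, and both must be valid indices")
    total = (timestamps[end] - timestamps[start]).total_seconds()
    corrected = list(values)
    for i in range(start, end + 1):
        elapsed = (timestamps[i] - timestamps[start]).total_seconds()
        # The first point is unchanged; the last point receives the full gap_width.
        corrected[i] = values[i] + gap_width * (elapsed / total)
    return corrected


t0 = datetime(2023, 1, 1)
timestamps = [t0 + timedelta(hours=h) for h in range(5)]
values = [10.0, 10.0, 10.0, 10.0, 10.0]
corrected = drift_correction_sketch(timestamps, values, start=0, end=4, gap_width=2.0)
print(corrected)
```

With a `gap_width` of 2.0 spread over four hours, each hourly point is raised by a further 0.5.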

fill_gap(gap, fill, interpolate_values)

Fills identified gaps in the observations with placeholder values and optionally interpolates the values.

Parameters:
  • gap (Tuple[int, str]) – Tuple containing the time value and unit for identifying gaps.

  • fill (Tuple[int, str]) – Tuple containing the time value and unit for filling gaps.

  • interpolate_values (bool) – Whether to interpolate values for the filled gaps.

Returns:

DataFrame of points that filled the gaps.

Return type:

pd.DataFrame
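
The relationship between the gap threshold, the fill interval, and interpolation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `fill_gap_sketch` is hypothetical, it takes `timedelta` objects where the method above takes (value, unit) tuples, and the `-9999` placeholder value is assumed.

```python
from datetime import datetime, timedelta

NO_DATA = -9999  # placeholder value; an assumption of this sketch


def fill_gap_sketch(timestamps, values, gap, fill, interpolate_values):
    """Return (timestamp, value) points inserted into gaps wider than `gap`."""
    filled = []
    for i in range(len(timestamps) - 1):
        left, right = timestamps[i], timestamps[i + 1]
        if right - left <= gap:
            continue  # spacing is within the expected interval
        t = left + fill
        while t < right:
            if interpolate_values:
                # Linear interpolation between the two observations around the gap.
                fraction = (t - left) / (right - left)
                value = values[i] + fraction * (values[i + 1] - values[i])
            else:
                value = NO_DATA
            filled.append((t, value))
            t += fill
    return filled


t0 = datetime(2023, 1, 1)
timestamps = [t0, t0 + timedelta(minutes=15), t0 + timedelta(minutes=75)]
values = [1.0, 2.0, 6.0]
interval = timedelta(minutes=15)
interpolated = fill_gap_sketch(timestamps, values, interval, interval, True)
placeholders = fill_gap_sketch(timestamps, values, interval, interval, False)
print(interpolated)
print(placeholders)
```

Here the one-hour gap receives three new points at 15 minute spacing, either interpolated between 2.0 and 6.0 or set to the placeholder.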

filter(data_filter)

Applies the given filters to the observations. The method itself returns None; the filtered result is available through the observations property.

Parameters:

data_filter (Dict[str, Union[float, int]]) – Dictionary containing filter operations and their values.

Return type:

None
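
As an illustration of how a filter dictionary might be evaluated, here is a plain-Python sketch that does not use hydroserverpy. Several things are assumptions: that the dictionary keys are the FilterOperation names, that an observation is kept if it matches any one condition (the library may combine conditions differently), and the helper names `FilterOperationSketch` and `filter_sketch`, which are hypothetical.

```python
import operator
from enum import Enum


class FilterOperationSketch(Enum):
    """Local stand-in mirroring the FilterOperation members listed above."""

    E = "E"
    GT = "GT"
    GTE = "GTE"
    LT = "LT"
    LTE = "LTE"


# Assumed meaning of each member, expressed as a comparison function.
COMPARATORS = {
    FilterOperationSketch.E: operator.eq,
    FilterOperationSketch.GT: operator.gt,
    FilterOperationSketch.GTE: operator.ge,
    FilterOperationSketch.LT: operator.lt,
    FilterOperationSketch.LTE: operator.le,
}


def filter_sketch(values, data_filter):
    """Return indices of values matching at least one condition in data_filter."""
    conditions = [
        (COMPARATORS[FilterOperationSketch(name)], threshold)
        for name, threshold in data_filter.items()
    ]
    return [
        i
        for i, v in enumerate(values)
        if any(compare(v, threshold) for compare, threshold in conditions)
    ]


values = [0.5, 3.0, 7.2, 12.0, -9999]
matches = filter_sketch(values, {"GT": 10, "LT": 0})
print(matches)
```

The returned indices are the kind of input that index-based operations such as delete_points or interpolate expect.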

find_gaps(time_value, time_unit)

Identifies gaps in the observations based on the specified time value and unit.

Parameters:
  • time_value (int) – The time value for detecting gaps.

  • time_unit (str) – The unit of time (e.g., ‘s’, ‘m’, ‘h’).

Returns:

DataFrame containing the observations with gaps.

Return type:

pd.DataFrame
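
To illustrate what gap detection involves, here is a minimal plain-Python sketch, independent of hydroserverpy. It assumes the timestamps are sorted and treats any pair of consecutive observations spaced further apart than the expected interval as a gap. The helper `find_gaps_sketch` and the `UNIT_KWARGS` mapping are hypothetical, and the library's own implementation and return format may differ.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from fixed-length unit codes to timedelta keyword names.
UNIT_KWARGS = {"s": "seconds", "m": "minutes", "h": "hours", "D": "days", "W": "weeks"}


def find_gaps_sketch(timestamps, time_value, time_unit):
    """Return (index, index + 1) pairs whose spacing exceeds the expected interval."""
    expected = timedelta(**{UNIT_KWARGS[time_unit]: time_value})
    gaps = []
    for i in range(len(timestamps) - 1):
        if timestamps[i + 1] - timestamps[i] > expected:
            gaps.append((i, i + 1))
    return gaps


start = datetime(2023, 1, 1)
# Observations at 00:00, 00:15, 00:30, then a jump to 01:30.
timestamps = [start + timedelta(minutes=m) for m in (0, 15, 30, 90)]
gaps = find_gaps_sketch(timestamps, time_value=15, time_unit="m")
print(gaps)
```

Only the pair around the one-hour jump is reported, because the other spacings match the expected interval exactly.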

interpolate(index_list)

Interpolates the values of observations at the specified indices using linear interpolation.

Parameters:

index_list (List[int]) – List of indices where values will be interpolated.

Return type:

None
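
Linear interpolation at selected indices can be sketched in plain Python as shown below. The sketch is not the library's implementation: `interpolate_sketch` is a hypothetical helper, it interpolates by index position (which assumes evenly spaced observations), and it refuses to interpolate when no untouched neighbor exists on one side.

```python
def interpolate_sketch(values, index_list):
    """Linearly interpolate values at index_list from the nearest untouched neighbors."""
    targets = set(index_list)
    result = list(values)
    for i in sorted(targets):
        # Walk outward to the nearest indices that are not being interpolated.
        left = i - 1
        while left in targets:
            left -= 1
        right = i + 1
        while right in targets:
            right += 1
        if left < 0 or right >= len(values):
            raise ValueError("cannot interpolate at the edges of the series")
        fraction = (i - left) / (right - left)
        result[i] = values[left] + fraction * (values[right] - values[left])
    return result


values = [0.0, -9999, -9999, -9999, 8.0]
interpolated = interpolate_sketch(values, [1, 2, 3])
print(interpolated)
```

The three placeholder values are replaced by points on the straight line between 0.0 and 8.0.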

property observations: DataFrame

Returns the observations DataFrame, filtered if a filter has been applied.

Returns:

Observations DataFrame.

Return type:

pd.DataFrame

shift_points(index_list, time_value, time_unit)

Shifts the timestamps of the observations at the specified indices by a given time value and unit.

Parameters:
  • index_list (List[int]) – List of indices where timestamps will be shifted.

  • time_value (int) – The amount of time to shift the timestamps.

  • time_unit (str) – The unit of time (e.g., ‘s’ for seconds, ‘m’ for minutes).

Return type:

None
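
A timestamp shift of this kind can be sketched with the standard library as follows. The helper `shift_points_sketch` and the `UNIT_KWARGS` mapping are hypothetical and not part of hydroserverpy. The sketch covers only fixed-length units, since months and years do not correspond to a fixed `timedelta`, and it returns a copy rather than modifying data in place.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from fixed-length unit codes to timedelta keyword names.
UNIT_KWARGS = {"s": "seconds", "m": "minutes", "h": "hours", "D": "days", "W": "weeks"}


def shift_points_sketch(timestamps, index_list, time_value, time_unit):
    """Return a copy of timestamps with the selected entries shifted."""
    delta = timedelta(**{UNIT_KWARGS[time_unit]: time_value})
    shifted = list(timestamps)
    for i in index_list:
        shifted[i] = shifted[i] + delta
    return shifted


t0 = datetime(2023, 1, 1)
timestamps = [t0, t0 + timedelta(minutes=15), t0 + timedelta(minutes=30)]
shifted = shift_points_sketch(timestamps, [1, 2], time_value=1, time_unit="h")
print(shifted)
```

Shifting only part of a series can leave timestamps out of order or create new gaps, so it may be worth re-sorting or re-running gap detection afterwards.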

class hydroserverpy.quality.service.Operator(value)

Bases: Enum

Enumeration for mathematical operations.

ADD = 'ADD'
ASSIGN = 'ASSIGN'
DIV = 'DIV'
MULT = 'MULT'
SUB = 'SUB'

class hydroserverpy.quality.service.TimeUnit(value)

Bases: Enum

Enumeration for time units.

DAY = 'D'
HOUR = 'h'
MINUTE = 'm'
MONTH = 'M'
SECOND = 's'
WEEK = 'W'
YEAR = 'Y'