Quality Control Package
HydroServer Quality Control Guide
HydroServer Python Client - Quality Control
The hydroserverpy quality control package provides several methods for performing quality control operations on observation datasets. This guide shows how to retrieve data for quality control and how to run quality control operations on it. Quality-controlled data can then be uploaded back to HydroServer as a new datastream with an appropriate processing level.
Quality Control Guide with Examples
To perform quality control operations, you must connect to HydroServer.
from hydroserverpy import HydroServer
# Initialize HydroServer connection with credentials.
hs_api = HydroServer(
    host='https://playground.hydroserver.org',
    username='user@example.com',
    password='******'
)
Select a datastream you want to perform quality control on and fetch its observations. You can optionally include result qualifier information with the fetched observations.
from datetime import datetime
...
# Get a Datastream
datastream = hs_api.datastreams.get(uid='00000000-0000-0000-0000-000000000000')
# Get Observations of a Datastream between two timestamps
observations_df = datastream.get_observations(
    include_quality=True,
    start_time=datetime(year=2023, month=1, day=1),
    end_time=datetime(year=2023, month=12, day=31)
)
Once you have a DataFrame of observations, you will need to initialize a quality control session.
from hydroserverpy import HydroServerQualityControl
...
# Initialize quality control session.
hs_quality_control = HydroServerQualityControl(
    datastream_id=datastream.uid,
    observations=observations_df
)
You are now ready to begin performing quality control operations on the dataset. You can access the modified observations using the observations property. Your work will not be saved back to HydroServer until you upload the quality-controlled observations to a new datastream.
# Get quality controlled observations DataFrame.
quality_controlled_observations = hs_quality_control.observations
Example: Find Gaps
# Find gaps in observations given an expected 15 minute interval.
hs_quality_control.find_gaps(
    time_value=15,
    time_unit='m'
)
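To make the idea of a gap concrete, the following is a minimal standard-library sketch, not the library's implementation. It assumes a gap is any pair of consecutive timestamps separated by more than the expected interval; the helper name sketch_find_gaps is hypothetical, and it returns a plain list of timestamp pairs rather than the DataFrame that find_gaps returns.

```python
from datetime import datetime, timedelta


def sketch_find_gaps(timestamps, expected_interval):
    """Return (previous, current) pairs spaced further apart than expected_interval."""
    gaps = []
    # Compare each timestamp with the one that follows it.
    for previous, current in zip(timestamps, timestamps[1:]):
        if current - previous > expected_interval:
            gaps.append((previous, current))
    return gaps


timestamps = [
    datetime(2023, 1, 1, 0, 0),
    datetime(2023, 1, 1, 0, 15),
    datetime(2023, 1, 1, 1, 0),   # 45 minutes after the previous point
    datetime(2023, 1, 1, 1, 15),
]

# Expected spacing of 15 minutes, matching time_value=15 and time_unit='m' above.
gaps = sketch_find_gaps(timestamps, timedelta(minutes=15))
```

Only the 00:15 to 01:00 pair exceeds the 15 minute spacing, so it is the only pair reported.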
Submodules
hydroserverpy.quality.service module
- class hydroserverpy.quality.service.FilterOperation(value)
Bases: Enum
Enumeration for filter operations.
- E = 'E'
- GT = 'GT'
- GTE = 'GTE'
- LT = 'LT'
- LTE = 'LTE'
- class hydroserverpy.quality.service.HydroServerQualityControl(datastream_id, observations)
Bases: object
Quality control operations for HydroServer observations.
- Parameters:
datastream_id (Union[UUID, str]) – The ID of the datastream.
observations (pd.DataFrame) – DataFrame containing ‘timestamp’ and ‘value’ columns.
- add_points(points, index=None)
Adds new points to the observations, optionally at specified indices.
- Parameters:
points (List[List[Union[str, float]]]) – List of points to be added.
index (Optional[List[int]]) – Optional list of indices at which to insert the points.
- Return type:
None
- change_values(index_list, operator, value)
Changes the values of observations based on the specified operator and value.
- Parameters:
index_list (List[int]) – List of indices for which values will be changed.
operator (str) – The operation to perform ('MULT', 'DIV', 'ADD', 'SUB', 'ASSIGN').
value (Union[int, float]) – The value to use in the operation.
- Return type:
None
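As an illustration of the five operator names, here is a minimal sketch on a plain list of values. It is not the library's code: the helper sketch_change_values is hypothetical, and it assumes the existing value is the left operand (for example, 'SUB' computes current minus value).

```python
import operator

# Map each documented operator name to a binary function (current, value) -> new value.
OPERATIONS = {
    'MULT': operator.mul,
    'DIV': operator.truediv,
    'ADD': operator.add,
    'SUB': operator.sub,
    'ASSIGN': lambda current, new: new,
}


def sketch_change_values(values, index_list, operator_name, value):
    """Return a copy of values with the operation applied at the given indices."""
    operation = OPERATIONS[operator_name]
    changed = list(values)
    for index in index_list:
        changed[index] = operation(changed[index], value)
    return changed


values = [1.0, 2.0, 3.0, 4.0]
scaled = sketch_change_values(values, [1, 2], 'MULT', 10)
replaced = sketch_change_values(values, [0], 'ASSIGN', -9999)
```

Unlike change_values, which returns None and modifies the session's observations, this sketch returns a new list so the effect is easy to inspect.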
- datastream_id: Union[UUID, str]
- delete_points(index_list)
Deletes points from the observations at the specified indices.
- Parameters:
index_list (List[int]) – List of indices for which points will be deleted.
- Return type:
None
- drift_correction(start, end, gap_width)
Applies drift correction to the values of observations within the specified index range.
- Parameters:
start (int) – Start index of the range to apply drift correction.
end (int) – End index of the range to apply drift correction.
gap_width (float) – The width of the drift gap to correct.
- Returns:
DataFrame after applying drift correction.
- Return type:
pd.DataFrame
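One common form of drift correction is linear: no adjustment at the start of the range, the full gap_width at the end, and a proportional share in between. The sketch below shows that form only. The helper sketch_drift_correction is hypothetical, and it assumes evenly spaced points so that the share can be computed from the index position; the library may instead weight by elapsed time or differ in other details.

```python
def sketch_drift_correction(values, start, end, gap_width):
    """Return a copy of values with a linearly increasing offset applied on [start, end]."""
    if end <= start:
        raise ValueError("end must be greater than start")
    span = end - start
    corrected = list(values)
    for index in range(start, end + 1):
        # Fraction of the range covered so far: 0.0 at start, 1.0 at end.
        fraction = (index - start) / span
        corrected[index] = values[index] + gap_width * fraction
    return corrected


values = [10.0, 10.0, 10.0, 10.0, 10.0]
corrected = sketch_drift_correction(values, start=0, end=4, gap_width=2.0)
```

With a flat input series, the corrected series rises in equal steps from 10.0 to 12.0 across the range.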
- fill_gap(gap, fill, interpolate_values)
Fills identified gaps in the observations with placeholder values and optionally interpolates the values.
- Parameters:
gap (Tuple[int, str]) – Tuple containing the time value and unit for identifying gaps.
fill (Tuple[int, str]) – Tuple containing the time value and unit for filling gaps.
interpolate_values (bool) – Whether to interpolate values for the filled gaps.
- Returns:
DataFrame of points that filled the gaps.
- Return type:
pd.DataFrame
- filter(data_filter)
Executes the applied filters and returns the resulting DataFrame.
- Parameters:
data_filter (Dict[str, Union[float, int]]) – Dictionary containing filter operations and their values.
- Return type:
None
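The sketch below shows one way a data_filter dictionary keyed by the FilterOperation values could select points. It mirrors the enumeration locally so that it runs without hydroserverpy installed. The helper sketch_filter is hypothetical, and the sketch assumes that a point matches when it satisfies at least one condition; how the library combines several conditions is not stated in this reference.

```python
import operator
from enum import Enum


class FilterOperation(Enum):
    """Local mirror of the documented enumeration, for illustration only."""
    E = 'E'
    GT = 'GT'
    GTE = 'GTE'
    LT = 'LT'
    LTE = 'LTE'


# Comparison function for each operation name.
COMPARATORS = {
    FilterOperation.E.value: operator.eq,
    FilterOperation.GT.value: operator.gt,
    FilterOperation.GTE.value: operator.ge,
    FilterOperation.LT.value: operator.lt,
    FilterOperation.LTE.value: operator.le,
}


def sketch_filter(values, data_filter):
    """Return indices of values that satisfy at least one condition (assumed OR)."""
    return [
        index
        for index, value in enumerate(values)
        if any(COMPARATORS[name](value, threshold)
               for name, threshold in data_filter.items())
    ]


values = [-5.0, 3.0, 12.0, 250.0]
# Select points below 0 or above 100, for example to flag out-of-range readings.
out_of_range = sketch_filter(values, {'LT': 0, 'GT': 100})
```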
- find_gaps(time_value, time_unit)
Identifies gaps in the observations based on the specified time value and unit.
- Parameters:
time_value (int) – The time value for detecting gaps.
time_unit (str) – The unit of time (e.g., ‘s’, ‘m’, ‘h’).
- Returns:
DataFrame containing the observations with gaps.
- Return type:
pd.DataFrame
- interpolate(index_list)
Interpolates the values of observations at the specified indices using linear interpolation.
- Parameters:
index_list (list[int]) – List of indices where values will be interpolated.
- Return type:
None
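Linear interpolation replaces each selected value with a point on the straight line between its nearest unselected neighbours. The following is a minimal sketch under the assumption of evenly spaced observations, so the position along the line is computed from indices rather than timestamps. The helper sketch_interpolate is hypothetical and is not the library's implementation.

```python
def sketch_interpolate(values, index_list):
    """Return a copy of values with the selected indices linearly interpolated."""
    targets = set(index_list)
    result = list(values)
    for index in sorted(targets):
        # Walk outwards to the nearest neighbours that are not being replaced.
        left = index - 1
        while left in targets:
            left -= 1
        right = index + 1
        while right in targets:
            right += 1
        if left < 0 or right >= len(values):
            raise ValueError("selected points need a neighbour on both sides")
        fraction = (index - left) / (right - left)
        result[index] = values[left] + (values[right] - values[left]) * fraction
    return result


values = [2.0, 99.0, 99.0, 99.0, 10.0]
smoothed = sketch_interpolate(values, [1, 2, 3])
```

The three suspect readings are replaced by evenly stepped values between the trusted endpoints 2.0 and 10.0.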
- property observations: DataFrame
Returns the observations DataFrame, filtered if a filter has been applied.
- Returns:
Observations DataFrame.
- Return type:
pd.DataFrame
- shift_points(index_list, time_value, time_unit)
Shifts the timestamps of the observations at the specified indices by a given time value and unit.
- Parameters:
index_list (List[int]) – List of indices where timestamps will be shifted.
time_value (int) – The amount of time to shift the timestamps.
time_unit (str) – The unit of time (e.g., ‘s’ for seconds, ‘m’ for minutes).
- Return type:
None
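As a rough illustration of shifting timestamps, the sketch below converts a time value and unit into a timedelta and adds it to the selected points. The helper sketch_shift_points and the unit mapping are assumptions for this example: only 's', 'm', and 'h' are mapped, and the library may accept other units or handle ordering after the shift differently.

```python
from datetime import datetime, timedelta

# Assumed mapping from unit codes to timedelta keyword arguments.
UNIT_TO_KEYWORD = {'s': 'seconds', 'm': 'minutes', 'h': 'hours'}


def sketch_shift_points(timestamps, index_list, time_value, time_unit):
    """Return a copy of timestamps with the selected entries shifted."""
    delta = timedelta(**{UNIT_TO_KEYWORD[time_unit]: time_value})
    shifted = list(timestamps)
    for index in index_list:
        shifted[index] = shifted[index] + delta
    return shifted


timestamps = [
    datetime(2023, 1, 1, 0, 0),
    datetime(2023, 1, 1, 0, 15),
    datetime(2023, 1, 1, 0, 30),
]

# Move the last two points forward by five minutes.
shifted = sketch_shift_points(timestamps, [1, 2], time_value=5, time_unit='m')
```

A negative time_value moves the selected points backwards in time in this sketch.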