1 change: 1 addition & 0 deletions docs/concepts/fs/feature_group/feature_monitoring.md
Expand Up @@ -14,6 +14,7 @@ Statistics are computed on the whole or a subset of feature data (i.e., detectio
## Statistics Comparison

In addition to scheduled statistics, you can enable the comparison of statistics against a reference subset of feature data (i.e., reference window) and define the criteria for this comparison including the statistics metric to compare and a threshold to identify anomalous values.
The comparison can be done on a single scalar metric (e.g., the mean) or on the whole feature distribution using distance metrics such as the Population Stability Index (PSI) or Kullback-Leibler (KL) divergence.
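
To illustrate why a distribution comparison can catch drift that a scalar metric misses, here is a minimal sketch in plain Python. It is not the Hopsworks implementation: the bin edges, the smoothing constant `eps`, and the helper names `histogram` and `kl_divergence` are assumptions made for this example.

```python
import math
from statistics import mean


def histogram(values, edges):
    """Proportion of values in each half-open bin [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-6):
    """KL(P || Q) over matching bins; eps (an assumption) avoids log(0) on empty bins."""
    return sum((pi + eps) * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Hypothetical feature values: same mean, very different shape.
reference = [40, 45, 50, 50, 55, 60]
detection = [10, 15, 20, 80, 85, 90]
edges = [0, 25, 50, 75, 100]  # illustrative fixed bins

ref_hist = histogram(reference, edges)
det_hist = histogram(detection, edges)

same_mean = mean(reference) == mean(detection)  # the scalar metric sees no change
drift = kl_divergence(det_hist, ref_hist)       # the distribution metric does
```

Here the mean is identical in both windows, so a comparison on the mean would not flag anything, while the divergence between the binned distributions is large.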

!!! info "Feature Monitoring Guide"
More information can be found in the [Feature monitoring guide](../../../user_guides/fs/feature_monitoring/index.md).
9 changes: 8 additions & 1 deletion docs/concepts/fs/feature_view/feature_monitoring.md
Expand Up @@ -14,6 +14,13 @@ Statistics are computed on the whole or a subset of feature data (i.e., detectio
## Statistics Comparison

In addition to scheduled statistics, you can enable the comparison of statistics against a reference subset of feature data (i.e., reference window), typically a training dataset, and define the criteria for this comparison including the statistics metric to compare and a threshold to identify anomalous values.
The comparison can be done on a single scalar metric (e.g., the mean) or on the whole feature distribution using distance metrics such as the Population Stability Index (PSI) or Kullback-Leibler (KL) divergence.

## Model Monitoring

A Feature View backs the features served to a model in production.
By comparing the model's logged inference data against the dataset it was trained on, you can detect drift between training and serving and decide when to retrain.
This is known as model monitoring and reuses the same statistics and distribution comparison machinery as feature monitoring.

!!! info "Feature Monitoring Guide"
More information can be found in the [Feature monitoring guide](../../../user_guides/fs/feature_monitoring/index.md).
More information can be found in the [Feature monitoring guide](../../../user_guides/fs/feature_monitoring/index.md) and the [Model Monitoring guide](../../../user_guides/mlops/model_monitoring/index.md).
37 changes: 37 additions & 0 deletions docs/concepts/mlops/model_monitoring.md
@@ -0,0 +1,37 @@
Model monitoring lets you track how a deployed model behaves in production by comparing the data it serves against the data it was trained on.

When a model runs in production, the statistical properties of its inputs and predictions can drift away from those of the training data.
This degrades model quality silently, without any error being raised.
Model monitoring detects this drift early so you can decide whether to retrain the model.

## How it works

Model monitoring builds on two existing Hopsworks capabilities:

- **Feature logging**: a model deployment logs the features it serves and its predictions to the feature view's logging feature group through the Feature View logging APIs.
See the [Feature Logging guide](../../user_guides/fs/feature_view/feature_logging.md).
- **Feature monitoring**: Hopsworks computes statistics over windows of feature data and compares them against a reference, optionally raising alerts on significant shifts.
See the [Feature Monitoring concept](../fs/feature_view/feature_monitoring.md).

!!! info "Feature logging vs. the inference logger"
Hopsworks provides two separate inference logging mechanisms.
The [inference logger](../../user_guides/mlops/serving/inference-logger.md) stores the model inputs and predictions from inference requests and responses into Kafka, for later consumption and analysis.
[Feature logging](../../user_guides/fs/feature_view/feature_logging.md) logs inference data and features at a finer granularity, which enables feature monitoring and model monitoring.
Model monitoring relies on feature logging, not on the inference logger.

A model monitoring configuration is a feature monitoring configuration over the logging feature group, filtered to a single model and version.
The detection window covers the recently served inference data, and the reference defaults to the training dataset version that was used to train that model.
By comparing the two — on a scalar metric or on the whole feature distribution — Hopsworks detects training/serving skew and drift over time.
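
The idea can be sketched in plain Python, without the Hopsworks client. The records, the column names (`model_name`, `model_version`, `amount`) and the helper `detection_values` are hypothetical and only stand in for rows of a logging feature group and a training dataset.

```python
from statistics import mean

# Hypothetical rows of a logging feature group shared by several models.
inference_log = [
    {"model_name": "fraud", "model_version": 1, "amount": 20.0},
    {"model_name": "fraud", "model_version": 2, "amount": 130.0},
    {"model_name": "fraud", "model_version": 2, "amount": 150.0},
    {"model_name": "churn", "model_version": 1, "amount": 5.0},
]

# Hypothetical reference: the feature values of the training dataset used for fraud v2.
training_amounts = [90.0, 100.0, 110.0]


def detection_values(log, model_name, model_version, feature):
    """Keep only the rows served by one model and version, then select one feature."""
    return [
        row[feature]
        for row in log
        if row["model_name"] == model_name and row["model_version"] == model_version
    ]


served_amounts = detection_values(inference_log, "fraud", 2, "amount")
reference_mean = mean(training_amounts)  # statistic of the training data
detection_mean = mean(served_amounts)    # statistic of the recently served data
```

The two means (or the two distributions) are then compared against a threshold, exactly as in feature monitoring.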

## Where to configure it

Because monitoring is anchored on the feature view that backs the model, you can configure model monitoring from whichever entity is most convenient:

- a **model deployment**, when operating a model in production.
- a **model** in the model registry.
- a **feature view**, when working directly with the feature data.

All three resolve to the same underlying configuration.

!!! info "Model Monitoring Guide"
More information can be found in the [Model Monitoring guide](../../user_guides/mlops/model_monitoring/index.md).
1 change: 1 addition & 0 deletions docs/concepts/mlops/serving.md
Expand Up @@ -14,6 +14,7 @@ A KServe model deployment can include the following components:
**`Inference Logger`**

: Hopsworks logs inputs and outputs of transformers and predictors to a ^^Kafka topic^^ that is part of the same project as the model.
This is for storing inference requests and responses for later consumption and analysis, and is separate from the feature logging that powers [Model Monitoring](model_monitoring.md).
Not available for vLLM deployments.

**`Inference Batcher`**
119 changes: 93 additions & 26 deletions docs/user_guides/fs/feature_group/feature_monitoring.md
Expand Up @@ -7,7 +7,7 @@ Before continuing with this guide, see the [Feature monitoring guide](../feature

!!! warning "Limited UI support"
Currently, feature monitoring can only be configured using the [Hopsworks Python library](https://pypi.org/project/hopsworks).
However, you can enable/disable a feature monitoring configuration or trigger the statistics comparison manually from the UI, as shown in the [Advanced guide](../feature_monitoring/feature_monitoring_advanced.md).
However, you can enable/disable a feature monitoring configuration or trigger the statistics comparison manually from the UI.

## Code

Expand Down Expand Up @@ -48,6 +48,8 @@ Connect the client running your notebooks to Hopsworks.
fs = project.get_feature_store()
```

See the API reference for [`hopsworks.login`][hopsworks.login] and [`Project.get_feature_store`][hopsworks_common.project.Project.get_feature_store].

You will be prompted to paste your API key to connect the notebook to your project.
The `fs` Feature Store entity is now ready to be used to insert or read data from Hopsworks.

Expand Down Expand Up @@ -76,6 +78,8 @@ The following is a code example for getting or creating a Feature Group with nam
trans_fg.insert(transactions_df)
```

See the API reference for [`FeatureStore.get_feature_group`][hsfs.feature_store.FeatureStore.get_feature_group] and [`FeatureStore.get_or_create_feature_group`][hsfs.feature_store.FeatureStore.get_or_create_feature_group].

### Step 2: Initialize configuration

#### Scheduled statistics
Expand All @@ -86,56 +90,57 @@ You can setup statistics monitoring on a ==single feature or multiple features==

```python
# compute statistics for all the features
fg_monitoring_config = trans_fg.create_statistics_monitoring(
fg_monitoring_config = trans_fg.create_scheduled_statistics(
name="trans_fg_all_features_monitoring",
description="Compute statistics on all data of all features of the Feature Group on a daily basis",
)

# or for a single feature
fg_monitoring_config = trans_fg.create_statistics_monitoring(
# or for one or more specific features
fg_monitoring_config = trans_fg.create_scheduled_statistics(
name="trans_fg_amount_monitoring",
description="Compute statistics on all data of a single feature of the Feature Group on a daily basis",
feature_name="amount",
description="Compute statistics on all data of selected features of the Feature Group on a daily basis",
feature_names=["amount"],
)
```

See the API reference for [`FeatureGroup.create_scheduled_statistics`][hsfs.feature_group.FeatureGroup.create_scheduled_statistics].

#### Statistics comparison

When enabling the comparison of statistics in a feature monitoring configuration, you need to specify a ==single feature== of your Feature Group.
You can create multiple feature monitoring configurations for the same Feature Group, but each of them should point to a single feature in the Feature Group.
When enabling the comparison of statistics in a feature monitoring configuration, the feature to compare is selected later in the `compare_on` (or `compare_on_distribution`) method, not in `create_feature_monitoring`.
You can create multiple feature monitoring configurations for the same Feature Group.

=== "Python"

```python
fg_monitoring_config = trans_fg.create_feature_monitoring(
name="trans_fg_amount_monitoring",
feature_name="amount",
description="Compute descriptive statistics on the amount Feature of the Feature Group on a daily basis",
description="Compute and compare descriptive statistics on the Feature Group on a daily basis",
)
```

#### Custom schedule or percentage of window data
See the API reference for [`FeatureGroup.create_feature_monitoring`][hsfs.feature_group.FeatureGroup.create_feature_monitoring].

#### Custom schedule

By default, the computation of statistics is scheduled to run indefinitely, every day at 12:00 PM (noon).
You can modify the default schedule by adjusting the `cron_expression`, `start_date_time` and `end_date_time` parameters.
To compute statistics on only a subset of the feature data, use the `row_percentage` parameter of `with_detection_window` (see Step 3).

=== "Python"

```python
fg_monitoring_config = trans_fg.create_statistics_monitoring(
fg_monitoring_config = trans_fg.create_scheduled_statistics(
name="trans_fg_all_features_monitoring",
description="Compute statistics on all data of all features of the Feature Group on a weekly basis",
cron_expression="0 0 12 ? * MON *", # weekly
row_percentage=0.8, # use 80% of the data
)

# or
fg_monitoring_config = trans_fg.create_feature_monitoring(
name="trans_fg_amount_monitoring",
feature_name="amount",
description="Compute descriptive statistics on the amount Feature of the Feature Group on a weekly basis",
description="Compute and compare descriptive statistics on the Feature Group on a weekly basis",
cron_expression="0 0 12 ? * MON *", # weekly
row_percentage=0.8, # use 80% of the data
)
```
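
The cron expression used above has seven space-separated fields and a `?` placeholder, which suggests a Quartz-style layout. Assuming that field order (an assumption, not confirmed by this guide), the hypothetical helper below labels each field so the schedule is easier to read.

```python
# Assumed Quartz-style field order; the trailing "year" field is optional.
QUARTZ_FIELDS = [
    "second",
    "minute",
    "hour",
    "day_of_month",
    "month",
    "day_of_week",
    "year",
]


def describe_cron(expression):
    """Map each field of a 6- or 7-field cron expression to its assumed meaning."""
    parts = expression.split()
    if not 6 <= len(parts) <= len(QUARTZ_FIELDS):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    return dict(zip(QUARTZ_FIELDS, parts))


weekly = describe_cron("0 0 12 ? * MON *")
```

Under this reading, `weekly["hour"]` is `"12"` and `weekly["day_of_week"]` is `"MON"`, that is, every Monday at noon.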

Expand All @@ -155,9 +160,12 @@ Additionally, you can specify the percentage of feature data on which statistics
)
```

See the API reference for [`FeatureMonitoringConfig.with_detection_window`][hsfs.core.feature_monitoring_config.FeatureMonitoringConfig.with_detection_window].

### Step 4: (Optional) Define a reference window

When setting up feature monitoring for a Feature Group, reference windows can be either a regular window or a specific value (i.e., window of size 1).
When setting up feature monitoring for a Feature Group, you can compare the detection statistics against a reference window of feature data.
A reference window is defined with the `with_reference_window` method.

=== "Python"

Expand All @@ -168,33 +176,68 @@ When setting up feature monitoring for a Feature Group, reference windows can be
time_offset="2w", # starting from two weeks ago
row_percentage=0.8, # use 80% of the data
)

# or a specific value
fm_monitoring_config.with_reference_value(
value=100,
)
```

### Step 5: (Optional) Define the statistics comparison criteria
See the API reference for [`FeatureMonitoringConfig.with_reference_window`][hsfs.core.feature_monitoring_config.FeatureMonitoringConfig.with_reference_window].

!!! info "Comparing against a specific value"
Instead of a reference window, you can compare the detection statistics against a fixed reference value (i.e., a window of size 1).
In that case, skip this step and pass the `specific_value` parameter to `compare_on` in Step 5.

### Step 5.A: (Optional) Compare on a scalar metric

In order to compare detection and reference statistics, you need to provide the criteria for such comparison.
First, you select the metric to consider in the comparison using the `metric` parameter.
First, you select the feature and the metric to consider in the comparison using the `feature_name` and `metric` parameters.
Then, you can define a relative or absolute threshold using the `threshold` and `relative` parameters.

=== "Python"

```python
# compare against a reference window
fm_monitoring_config.compare_on(
feature_name="amount", # the feature to compare
metric="mean",
threshold=0.2, # a relative change over 20% is considered anomalous
relative=True, # relative or absolute change
strict=False, # strict or relaxed comparison
)

# or compare against a specific value instead of a reference window
fm_monitoring_config.compare_on(
feature_name="amount",
metric="mean",
specific_value=100,
threshold=0.2,
relative=True,
)
```

See the API reference for [`FeatureMonitoringConfig.compare_on`][hsfs.core.feature_monitoring_config.FeatureMonitoringConfig.compare_on].

!!! info "Difference values and thresholds"
For more information about the computation of difference values and the comparison against threshold bounds see the [Comparison criteria section](../feature_monitoring/statistics_comparison.md#comparison-criteria) in the Statistics comparison guide.
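
As a rough illustration of how `threshold`, `relative` and `strict` interact, consider the sketch below. It is not the Hopsworks implementation: it assumes the difference is the absolute change between the detection and reference values (divided by the reference when `relative=True`) and that `strict` switches between `>` and `>=`. The function name `is_anomalous` is hypothetical; refer to the Statistics comparison guide for the authoritative definition.

```python
def is_anomalous(detection, reference, threshold, relative=True, strict=False):
    """Return True when the change between two statistic values crosses the threshold."""
    difference = abs(detection - reference)
    if relative:
        if reference == 0:
            raise ValueError("relative difference is undefined for a zero reference")
        difference /= abs(reference)
    # Assumed semantics: strict uses ">", relaxed uses ">=".
    return difference > threshold if strict else difference >= threshold


# Mean of "amount": reference window vs. detection window (illustrative numbers).
flagged = is_anomalous(detection=125.0, reference=100.0, threshold=0.2)      # 25% change
not_flagged = is_anomalous(detection=110.0, reference=100.0, threshold=0.2)  # 10% change
```

With `relative=False`, the same threshold is read in the units of the metric itself, so `threshold=10` on a mean of 100 tolerates values strictly between 90 and 110.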

### Step 5.B: (Optional) Compare on the whole distribution

Alternatively, instead of a single scalar metric, you can detect drift in the shape of a feature's distribution using `compare_on_distribution`.
Select a distribution distance metric (e.g., `PSI`) and a threshold.
A reference window (Step 4) is required for distribution comparison.

=== "Python"

```python
fm_monitoring_config.compare_on_distribution(
feature_name="amount", # the feature to compare
metric="PSI",
threshold=0.2, # a distance above 0.2 is considered a significant shift
)
```

See the API reference for [`FeatureMonitoringConfig.compare_on_distribution`][hsfs.core.feature_monitoring_config.FeatureMonitoringConfig.compare_on_distribution].

!!! tip "More distribution options"
See the [Distribution comparison guide](../feature_monitoring/distribution_comparison.md) for the full list of metrics and binning strategies.
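
For intuition about what a PSI threshold of 0.2 means, the following sketch computes PSI by hand from binned proportions. It uses the common textbook formula and is not taken from the Hopsworks code base; the bin proportions, the `eps` floor and the function name `psi` are assumptions for this example.

```python
import math


def psi(detection, reference, eps=1e-6):
    """Population Stability Index between two lists of per-bin proportions."""
    total = 0.0
    for d, r in zip(detection, reference):
        d = max(d, eps)  # floor empty bins so the logarithm stays defined
        r = max(r, eps)
        total += (d - r) * math.log(d / r)
    return total


# Illustrative proportions of "amount" over four bins.
reference_bins = [0.25, 0.25, 0.25, 0.25]
stable_bins = [0.25, 0.25, 0.25, 0.25]
shifted_bins = [0.10, 0.20, 0.30, 0.40]

psi_stable = psi(stable_bins, reference_bins)
psi_shifted = psi(shifted_bins, reference_bins)
```

Identical distributions give a PSI of zero, while the shifted one lands slightly above 0.2 and would be flagged with the threshold used in the example above.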

### Step 6: Save configuration

Finally, you can save your feature monitoring configuration by calling the `save` method.
Expand All @@ -206,5 +249,29 @@ Once the configuration is saved, the schedule for the statistics computation and
fm_monitoring_config.save()
```

!!! info "Next steps"
See the [Advanced guide](../feature_monitoring/feature_monitoring_advanced.md) to learn how to delete, disable or trigger feature monitoring manually.
See the API reference for [`FeatureMonitoringConfig.save`][hsfs.core.feature_monitoring_config.FeatureMonitoringConfig.save].

### Retrieve configurations and history

Once saved, you can retrieve your feature monitoring configurations and the results of past executions directly from the Feature Group.

=== "Python"

```python
# fetch all configurations attached to the feature group
configs = trans_fg.get_feature_monitoring_configs()

# or a single configuration by name
config = trans_fg.get_feature_monitoring_configs(name="trans_fg_amount_monitoring")

# fetch the history of monitoring results (with computed statistics)
history = trans_fg.get_feature_monitoring_history(
config_name="trans_fg_amount_monitoring",
with_statistics=True,
)
```

See the API reference for [`FeatureGroup.get_feature_monitoring_configs`][hsfs.feature_group.FeatureGroup.get_feature_monitoring_configs] and [`FeatureGroup.get_feature_monitoring_history`][hsfs.feature_group.FeatureGroup.get_feature_monitoring_history].

!!! info "Explore the API"
The [`FeatureMonitoringConfig`][hsfs.core.feature_monitoring_config.FeatureMonitoringConfig] reference documents the full set of available methods, such as enabling or disabling a configuration, triggering it manually, or deleting it.