Sensor Framework Documentation
This document provides an overview of the sensors available in the data quality monitoring framework. Each sensor is described below, including its main logic and a breakdown of every feature it generates.
Basic Sensors
BasicSensorFeatureExtractor
A comprehensive sensor that operates on defined cohorts within a DataFrame. It calculates features at multiple levels: global schema properties (like column name conventions and type diversity), file metadata (like row counts and size), cohort-level date characteristics (like day of the week), and a granular, per-column analysis of structural and numeric properties within each cohort.
Main Function Logic
CLASS BasicSensorFeatureExtractor:
    INITIALIZE with cohort_columns, numeric_columns, date_column

    FUNCTION fit_transform(data, last_file_date=None):

        // --- Phase 1: Global Schema & Metadata Features (calculated once) ---
        all_features = {}
        all_column_names = GET_COLUMN_NAMES(data)
        all_features['row_count'] = data.row_count
        all_features['col_count'] = data.column_count
        all_features['filesize_est'] = data.memory_usage_in_bytes
        all_features['schema_numeric_suffix_cols'] = COUNT(name FOR name in all_column_names IF name ends with '_<number>')
        all_features['schema_avg_underscores_per_col'] = MEAN(name.count('_') FOR name in all_column_names)
        all_features['schema_uppercase_cols'] = COUNT(name FOR name in all_column_names IF name is all uppercase)
        all_features['schema_lowercase_cols'] = COUNT(name FOR name in all_column_names IF name is all lowercase)
        all_features['schema_avg_col_name_length'] = MEAN(LENGTH(name) FOR name in all_column_names)
        all_features['schema_col_name_length_std'] = STD_DEV(LENGTH(name) FOR name in all_column_names)
        all_features['schema_dtype_diversity'] = COUNT_UNIQUE(data.get_dtypes())
        dominant_dtype_freq = MOST_FREQUENT(data.get_dtypes()).count
        all_features['schema_dominant_dtype_pct'] = dominant_dtype_freq / data.column_count

        // --- Phase 2: Create Cohorts ---
        CREATE cohort_id by concatenating values from cohort_columns
        cohort_groups = GROUP data by cohort_id

        // --- Phase 3: Per-Cohort Analysis ---
        results = []
        FOR each cohort_group in cohort_groups:
            cohort_features = all_features.copy()

            // A) Cohort-level Date Features (requires a date_column)
            IF date_column is not None:
                primary_date = cohort_group[date_column].mode() // Use the most frequent date
                cohort_features['day_of_week'] = primary_date.day_of_week
                cohort_features['month'] = primary_date.month
                cohort_features['is_weekend'] = 1 IF primary_date.day_of_week IN (Saturday, Sunday) ELSE 0
                cohort_features['is_month_end'] = 1 IF primary_date is the last day of its month ELSE 0
                cohort_features['is_quarter_end'] = 1 IF primary_date is the last day of a quarter ELSE 0
                IF last_file_date is not None:
                    cohort_features['days_since_last_file'] = (primary_date - last_file_date).in_days

            // B) Per-Column, Per-Cohort Numeric Analysis
            FOR each col in numeric_columns:
                column_data = cohort_group[col]
                total_rows = LENGTH(column_data)
                total_nulls = column_data.count_nulls()
                cohort_features['{col}_basic_rows'] = total_rows
                cohort_features['{col}_basic_total_nulls'] = total_nulls
                cohort_features['{col}_basic_null_percentage'] = total_nulls / total_rows IF total_rows > 0 ELSE 0
                cohort_features['{col}_basic_avg_column_means'] = column_data.mean()
                cohort_features['{col}_basic_avg_column_stds'] = column_data.std_dev()
                cohort_features['{col}_basic_max_of_maxes'] = column_data.max()
                cohort_features['{col}_basic_total_zeros'] = column_data.count_where(value == 0)
                // NOTE: Some features are placeholders for schema consistency
                cohort_features['{col}_basic_std_column_means'] = 0.0
                cohort_features['{col}_basic_categorical_columns'] = 0
            ENDFOR

            results.APPEND(cohort_features)
        ENDFOR

        RETURN DataFrame(results)
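For concreteness, here is a minimal pandas sketch of the Phase 1 schema and metadata features; the helper name `schema_features` is hypothetical, and the real extractor may compute these values differently.

```python
import re
import pandas as pd

def schema_features(df: pd.DataFrame) -> dict:
    """Hypothetical helper: global schema/metadata features computed once per file."""
    names = list(df.columns)
    name_lengths = pd.Series([len(n) for n in names])
    dtype_counts = df.dtypes.astype(str).value_counts()
    return {
        "row_count": len(df),
        "col_count": df.shape[1],
        "filesize_est": int(df.memory_usage(deep=True).sum()),  # rough in-memory size in bytes
        "schema_numeric_suffix_cols": sum(bool(re.search(r"_\d+$", n)) for n in names),
        "schema_avg_underscores_per_col": float(pd.Series([n.count("_") for n in names]).mean()),
        "schema_uppercase_cols": sum(n.replace("_", "").isupper() for n in names),
        "schema_lowercase_cols": sum(n.replace("_", "").islower() for n in names),
        "schema_avg_col_name_length": float(name_lengths.mean()),
        "schema_col_name_length_std": float(name_lengths.std()),
        "schema_dtype_diversity": int(dtype_counts.shape[0]),
        "schema_dominant_dtype_pct": float(dtype_counts.iloc[0] / df.shape[1]),
    }
```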
Generated Features
| Feature Name | Description | Data Type |
|---|---|---|
| row_count | Total number of rows in the entire input file or DataFrame. | Integer |
| col_count | Total number of columns in the entire input file or DataFrame. | Integer |
| filesize_est | An estimate of the DataFrame's size in memory, in bytes. | Integer |
| schema_numeric_suffix_cols | The count of columns whose names end with a numeric suffix, such as '_1' or '_v2'. | Integer |
| schema_avg_underscores_per_col | The average number of underscores per column name across the entire schema. | Float |
| schema_uppercase_cols | The total count of column names that consist entirely of uppercase letters and underscores. | Integer |
| schema_lowercase_cols | The total count of column names that consist entirely of lowercase letters and underscores. | Integer |
| schema_dtype_diversity | The number of unique data types (e.g., int64, float64, object) present in the DataFrame. | Integer |
| schema_dominant_dtype_pct | The percentage of columns that have the most frequent (dominant) data type. | Float |
| schema_avg_col_name_length | The average character length of column names. | Float |
| schema_col_name_length_std | The standard deviation of column name character lengths. | Float |
| day_of_week | The day of the week for the cohort's primary date (e.g., 0 for Monday, 6 for Sunday). | Integer |
| month | The month of the year for the cohort's primary date (1-12). | Integer |
| is_weekend | A binary flag (1 or 0) indicating if the cohort's primary date falls on a weekend. | Integer |
| is_month_end | A binary flag (1 or 0) indicating if the cohort's primary date is the last day of its month. | Integer |
| is_quarter_end | A binary flag (1 or 0) indicating if the cohort's primary date is the last day of a quarter. | Integer |
| days_since_last_file | The number of days between the cohort's primary date and the date of the previously processed file. Requires state to be passed from a higher-level orchestrator. | Integer |
| {col}_basic_rows | Total number of rows in the cohort. | Integer |
| {col}_basic_total_nulls | The total count of null/missing values for this specific col within the cohort. | Integer |
| {col}_basic_null_percentage | The percentage of values for this col that are null within the cohort. | Float |
| {col}_basic_columns_with_nulls | A binary flag (1 or 0) indicating if this col contains at least one null value within the cohort. | Integer |
| {col}_basic_dtype_float64_count | A static feature indicating that the col being processed is numeric (specifically, treated as float64). Always returns 1. | Integer |
| {col}_basic_avg_column_means | The mean of the values for this specific col within the cohort. (Note: the feature name is for consistency; it represents the mean of a single column, not an average of means.) | Float |
| {col}_basic_std_column_means | A placeholder feature that always returns 0.0. Included for schema consistency. | Float |
| {col}_basic_avg_column_stds | The standard deviation of the values for this specific col within the cohort. (Note: the feature name is for consistency; it represents the standard deviation of a single column.) | Float |
| {col}_basic_max_of_maxes | The maximum value for this specific col within the cohort. (Note: the feature name is for consistency; it represents the max of a single column.) | Float |
| {col}_basic_total_zeros | The total count of zero values for this specific col within the cohort. | Integer |
| {col}_basic_cardinality | The cardinality (count of unique values) of categorical columns within the cohort. | Float |
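As an illustration of the per-cohort date flags and basic null/zero counts above, here is a small pandas sketch; the column names ('city', 'file_date', 'amount') and the tiny example frame are assumptions, not part of the framework.

```python
import pandas as pd

# Minimal sketch: per-cohort date and null-count features on a toy DataFrame.
df = pd.DataFrame({
    "city": ["NYC", "NYC", "LA"],
    "file_date": pd.to_datetime(["2025-03-31", "2025-03-31", "2025-03-30"]),
    "amount": [10.0, None, 0.0],
})
df["cohort_id"] = df["city"].astype(str) + "_" + df["file_date"].dt.strftime("%Y-%m-%d")

rows = []
for cohort_id, g in df.groupby("cohort_id"):
    primary_date = g["file_date"].mode().iloc[0]      # most frequent date in the cohort
    col = g["amount"]
    rows.append({
        "cohort_id": cohort_id,
        "day_of_week": primary_date.dayofweek,        # 0 = Monday
        "month": primary_date.month,
        "is_weekend": int(primary_date.dayofweek >= 5),
        "is_month_end": int(primary_date.is_month_end),
        "is_quarter_end": int(primary_date.is_quarter_end),
        "amount_basic_rows": len(col),
        "amount_basic_total_nulls": int(col.isna().sum()),
        "amount_basic_null_percentage": float(col.isna().mean()),
        "amount_basic_total_zeros": int((col == 0).sum()),
    })
features = pd.DataFrame(rows)
```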
Advanced Statistical Sensors
AdvancedStatisticalSensorExtractor
Goes beyond basic statistics to capture subtle distributional properties and outlier patterns for each specified column within defined cohorts. This sensor is essential for detecting sophisticated data quality anomalies that are invisible to simple mean and standard deviation checks, providing a granular, per-column view of the data's characteristics.
Main Function Logic
CLASS AdvancedStatisticalSensorExtractor:
    INITIALIZE with cohort_columns and col_columns

    FUNCTION fit_transform(data):

        // --- Phase 1: Create Cohorts ---
        CREATE cohort identifier by concatenating cohort_columns
        GROUP data by cohort

        // --- Phase 2: Per-Col, Per-Cohort Analysis ---
        // The following calculations are performed for EACH col within EACH cohort group.
        FOR each col:
            // Distribution Shape
            CALCULATE_SKEWNESS(of the col's data in the cohort)
            CALCULATE_KURTOSIS(of the col's data in the cohort)

            // Percentiles & Variability
            CALCULATE_5TH_AND_95TH_PERCENTILES()
            CALCULATE_PERCENTILE_RANGE(P95 - P5)
            CALCULATE_COEFFICIENT_OF_VARIATION(std / mean)

            // Outlier Counts
            CALCULATE_IQR_BOUNDS(Q1 - 1.5*IQR, Q3 + 1.5*IQR)
            COUNT_VALUES_OUTSIDE_BOUNDS() for IQR outliers
            CALCULATE_Z_SCORE(for each value)
            COUNT_VALUES_WHERE |Z-score| > 3 for Z-score outliers

            // Data Content
            CALCULATE_UNIQUENESS_RATIO(unique_count / total_count)
            CALCULATE_ZEROS_PERCENTAGE()
            CALCULATE_NEGATIVES_PERCENTAGE()

            // Optional Advanced Metrics
            IF enabled, CALCULATE_ENTROPY()
            IF enabled, CALCULATE_MONOTONIC_AND_CONSECUTIVE_PATTERNS()
        ENDFOR

        RETURN DataFrame with detailed statistical features for each col in each cohort
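A minimal pandas/NumPy sketch of the core per-column statistics for a single cohort follows; the helper `advanced_stats` is hypothetical and omits the optional entropy and pattern metrics (see the sketch after the feature table).

```python
import numpy as np
import pandas as pd

def advanced_stats(values: pd.Series) -> dict:
    """Sketch of the per-column statistics for one cohort (names mirror the table below)."""
    v = values.dropna()
    q1, q3 = v.quantile(0.25), v.quantile(0.75)
    iqr = q3 - q1
    z = (v - v.mean()) / v.std() if v.std() > 0 else pd.Series(0.0, index=v.index)
    return {
        "skewness": float(v.skew()),                 # sample skewness
        "kurtosis": float(v.kurt()),                 # excess kurtosis
        "percentile_5": float(v.quantile(0.05)),
        "percentile_95": float(v.quantile(0.95)),
        "percentile_range": float(v.quantile(0.95) - v.quantile(0.05)),
        "cv": float(v.std() / v.mean()) if v.mean() != 0 else np.nan,
        "zeros_pct": float((v == 0).mean()),
        "negatives_pct": float((v < 0).mean()),
        "unique_ratio": float(v.nunique() / len(v)) if len(v) else np.nan,
        "outliers_iqr_count": int(((v < q1 - 1.5 * iqr) | (v > q3 + 1.5 * iqr)).sum()),
        "outliers_zscore_count": int((z.abs() > 3).sum()),
    }
```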
Generated Features
| Feature Name | Description | Data Type |
|---|---|---|
| {col}_stat_skewness | Measures the asymmetry of the col's data distribution within the cohort. Positive values indicate a tail to the right; negative values indicate a tail to the left. | Float |
| {col}_stat_kurtosis | Measures the 'tailedness' of the col's distribution (excess kurtosis). High values indicate the presence of significant outliers compared to a normal distribution. | Float |
| {col}_stat_percentile_5 | The 5th percentile value for the col within the cohort, representing the lower bound of the typical data range. | Float |
| {col}_stat_percentile_95 | The 95th percentile value for the col within the cohort, representing the upper bound of the typical data range. | Float |
| {col}_stat_percentile_range | The range between the 95th and 5th percentiles for the col, measuring the spread of the bulk of the data. | Float |
| {col}_stat_cv | The Coefficient of Variation (std/mean) for the col. Measures relative variability, allowing comparison of spread across cols with different scales. | Float |
| {col}_stat_zeros_pct | Percentage of values for this col within the cohort that are exactly zero. | Float |
| {col}_stat_negatives_pct | Percentage of values for this col within the cohort that are negative. | Float |
| {col}_stat_unique_ratio | Ratio of unique values to total non-null values for the col. Near 1 suggests high cardinality; near 0 suggests repetitive data. | Float |
| {col}_stat_outliers_iqr_count | Total count of outliers for the col detected using the robust Interquartile Range (IQR) method. | Integer |
| {col}_stat_outliers_iqr_pct | Percentage of non-null values for the col that are outliers based on the IQR method. | Float |
| {col}_stat_outliers_zscore_count | Total count of outliers for the col where the absolute Z-score is greater than 3. | Integer |
| {col}_stat_outliers_zscore_pct | Percentage of non-null values for the col that are outliers based on the Z-score method. | Float |
| {col}_stat_entropy | Shannon entropy of the col's values, measuring unpredictability. Higher entropy means more diverse and less predictable data. | Float |
| {col}_stat_consecutive_same_value_max | The length of the longest run of consecutive identical values for the col within the cohort. | Integer |
| {col}_stat_monotonic_increase_pct | The percentage of transitions between consecutive data points that are increases (value > previous_value). | Float |
| {col}_stat_monotonic_decrease_pct | The percentage of transitions between consecutive data points that are decreases (value < previous_value). | Float |
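The optional entropy and run/monotonicity metrics in the table above can be sketched as follows; the helper name is hypothetical, and the framework's exact log base and binning for entropy are assumptions.

```python
import numpy as np
import pandas as pd

def entropy_and_patterns(values: pd.Series) -> dict:
    """Sketch of the optional entropy and run/monotonicity metrics for one cohort."""
    v = values.dropna()
    # Shannon entropy of the empirical value distribution (natural log assumed).
    probs = v.value_counts(normalize=True).to_numpy()
    entropy = float(-(probs * np.log(probs)).sum()) if len(probs) else np.nan
    # Longest run of consecutive identical values.
    runs = (v != v.shift()).cumsum()
    longest_run = int(v.groupby(runs).size().max()) if len(v) else 0
    # Share of increasing / decreasing transitions between consecutive points.
    diffs = v.diff().dropna()
    return {
        "entropy": entropy,
        "consecutive_same_value_max": longest_run,
        "monotonic_increase_pct": float((diffs > 0).mean()) if len(diffs) else np.nan,
        "monotonic_decrease_pct": float((diffs < 0).mean()) if len(diffs) else np.nan,
    }
```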
Cross-Column Relationship Sensors
CrossColumnFeatureExtractor
A modular feature extractor that analyzes cross-column relationships within cohorts (groups) of data. It computes correlation statistics, interaction effects between categorical and numerical columns, and PCA-based dimensionality metrics. Works with any tabular dataset by grouping data into cohorts and computing features for each cohort.
Main Function Logic
CLASS CrossColumnFeatureExtractor:
    INITIALIZE with:
        - cohort_columns: columns to group data by (e.g., ['city', 'date'])
        - correlation_pairs: list of (col1, col2) pairs to correlate
        - interaction_pairs: list of (categorical, numeric) column pairs
        - pca_columns: columns for PCA analysis

    FUNCTION fit_transform(data):

        // --- Phase 1: Create Cohorts ---
        CREATE cohort identifier by concatenating cohort_columns
        GROUP data by cohort

        // --- Phase 2: Correlation Analysis per Cohort ---
        FOR each (col1, col2) pair in correlation_pairs:
            CALCULATE Pearson correlation
            CALCULATE ratio statistics (col1/col2 mean and std)
            CALCULATE difference statistics (|col1-col2| mean and std)
            COUNT valid pairs (non-null values)
            OPTIONALLY CALCULATE Spearman rank correlation
            OPTIONALLY CALCULATE p-values for correlations

        // --- Phase 3: Interaction Features per Cohort ---
        FOR each (categorical_col, numeric_col) pair:
            GROUP BY categorical values within each cohort
            CALCULATE variance of numeric column within each category
            AGGREGATE variances (mean and std)

        // --- Phase 4: PCA Features per Cohort ---
        FOR each cohort:
            STANDARDIZE selected numeric columns
            COMPUTE eigenvalues of covariance matrix
            CALCULATE explained variance ratios
            COUNT components needed for 90% variance

        RETURN DataFrame with features per cohort
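For the Phase 2 correlation features, a minimal sketch using SciPy for one cohort group and one column pair might look like this; the helper `correlation_features` is hypothetical and the real extractor may handle edge cases differently.

```python
import numpy as np
import pandas as pd
from scipy import stats

def correlation_features(g: pd.DataFrame, col1: str, col2: str) -> dict:
    """Sketch of the correlation/ratio/difference features for one cohort group."""
    pair = g[[col1, col2]].dropna()
    a, b = pair[col1].to_numpy(), pair[col2].to_numpy()
    features = {f"corr_{col1}_{col2}_valid_pairs": len(pair)}
    if len(pair) >= 2:
        pearson_r, pearson_p = stats.pearsonr(a, b)
        spearman_r, _ = stats.spearmanr(a, b)
        features[f"corr_{col1}_{col2}_pearson"] = float(pearson_r)
        features[f"corr_{col1}_{col2}_pearson_pvalue"] = float(pearson_p)
        features[f"corr_{col1}_{col2}_spearman"] = float(spearman_r)
    ratio = a[b != 0] / b[b != 0]                 # exclude division by zero
    diff = np.abs(a - b)
    features[f"corr_{col1}_{col2}_ratio_mean"] = float(ratio.mean()) if len(ratio) else np.nan
    features[f"corr_{col1}_{col2}_ratio_std"] = float(ratio.std()) if len(ratio) else np.nan
    features[f"corr_{col1}_{col2}_diff_mean"] = float(diff.mean()) if len(diff) else np.nan
    features[f"corr_{col1}_{col2}_diff_std"] = float(diff.std()) if len(diff) else np.nan
    return features
```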
Generated Features
| Feature Name | Description | Data Type |
|---|---|---|
| cohort_size | Number of records in each cohort group. | Integer |
| corr_{col1}_{col2}_pearson | Pearson correlation coefficient between col1 and col2 within the cohort. | Float |
| corr_{col1}_{col2}_spearman | Spearman rank correlation coefficient between col1 and col2 within the cohort (optional). | Float |
| corr_{col1}_{col2}_valid_pairs | Count of non-null value pairs used in correlation calculation. | Integer |
| corr_{col1}_{col2}_ratio_mean | Mean of the ratio col1/col2 within the cohort (excludes division by zero). | Float |
| corr_{col1}_{col2}_ratio_std | Standard deviation of the ratio col1/col2 within the cohort. | Float |
| corr_{col1}_{col2}_diff_mean | Mean of absolute difference \|col1 - col2\| within the cohort. | Float |
| corr_{col1}_{col2}_diff_std | Standard deviation of absolute difference \|col1 - col2\| within the cohort. | Float |
| corr_{col1}_{col2}_pearson_pvalue | P-value for the Pearson correlation (optional, tests significance). | Float |
| interact_mean_variance | Average of variances: how much numeric columns vary within categorical groups, averaged across all interaction pairs. | Float |
| interact_std_variance | Average of standard deviations of variances across interaction pairs. | Float |
| pca_explained_variance_1 | Eigenvalue (variance) of the first principal component. | Float |
| pca_explained_variance_2 | Eigenvalue (variance) of the second principal component. | Float |
| pca_explained_variance_ratio_1 | Proportion of total variance explained by the first principal component. | Float |
| pca_explained_variance_ratio_2 | Proportion of total variance explained by the second principal component. | Float |
| pca_cumulative_variance_2 | Cumulative proportion of variance explained by the first two principal components. | Float |
| pca_n_components_90pct | Number of principal components needed to explain at least 90% of total variance. | Integer |
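A minimal NumPy sketch of the PCA features for one cohort (an eigendecomposition of the covariance matrix of standardized columns) is shown below; the helper `pca_features` is hypothetical and assumes at least two PCA columns with non-zero variance.

```python
import numpy as np
import pandas as pd

def pca_features(g: pd.DataFrame, pca_columns: list) -> dict:
    """Sketch of the Phase 4 PCA features for one cohort."""
    X = g[pca_columns].dropna().to_numpy(dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)               # standardize each column
    cov = np.cov(X, rowvar=False)
    eigenvalues = np.sort(np.linalg.eigvalsh(cov))[::-1]   # descending order
    ratios = eigenvalues / eigenvalues.sum()
    return {
        "pca_explained_variance_1": float(eigenvalues[0]),
        "pca_explained_variance_2": float(eigenvalues[1]),
        "pca_explained_variance_ratio_1": float(ratios[0]),
        "pca_explained_variance_ratio_2": float(ratios[1]),
        "pca_cumulative_variance_2": float(ratios[:2].sum()),
        "pca_n_components_90pct": int(np.searchsorted(np.cumsum(ratios), 0.90) + 1),
    }
```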
Time Series Sensors
timeseries_features_polars_optimized
An optimized time-series feature extraction module using Polars. It calculates a variety of rolling window features for sensor data, grouped by specified cohort columns.
Main Function Logic
// --- Cohort-Based Rolling Window Analysis ---
CLASS TimeSeriesFeatureExtractor(cohort_columns, sensor_columns, lookback_window, min_history):

    FUNCTION fit_transform(dataframe):

        // 1. Preparation
        CREATE a single '_cohort' identifier from cohort_columns.
        SORT the dataframe by cohort and timestamp.
        ADD a row number '_row_num' for each cohort.

        // 2. Feature Expression Generation
        INITIALIZE an empty list of feature expressions.
        FOR each sensor_column in sensor_columns:
            GENERATE expressions for all feature groups (rolling, statistical, trend, etc.).
            // Each expression uses a rolling window over the cohort.
            // e.g., rolling_mean(sensor_column, window_size=lookback_window).over(cohort)
            ADD expressions to the list.

        // 3. Computation
        APPLY all generated expressions to the dataframe in a single pass.

        // 4. Filtering
        APPLY min_history filter:
            SET feature values to NULL where _row_num < min_history.

        // 5. Finalization
        SELECT final columns.
        RETURN dataframe with new feature columns.
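A minimal Polars sketch of cohort-scoped rolling expressions follows; it assumes a recent Polars release where rolling expressions compose with `.over()`, and the column names, window size, and toy data are illustrative only.

```python
import polars as pl

# Toy frame with a hypothetical '_cohort' key and one sensor column 'temp'.
df = pl.DataFrame({
    "_cohort": ["a"] * 5 + ["b"] * 5,
    "timestamp": list(range(5)) * 2,
    "temp": [1.0, 2.0, 3.0, 10.0, 4.0, 5.0, 5.0, 6.0, 7.0, 50.0],
}).sort(["_cohort", "timestamp"])

lookback = 3
features = df.with_columns([
    pl.col("temp").rolling_mean(window_size=lookback).over("_cohort").alias("temp_ts_rolling_mean"),
    pl.col("temp").rolling_std(window_size=lookback).over("_cohort").alias("temp_ts_rolling_std"),
    pl.col("temp").rolling_min(window_size=lookback).over("_cohort").alias("temp_ts_rolling_min"),
    pl.col("temp").rolling_max(window_size=lookback).over("_cohort").alias("temp_ts_rolling_max"),
])
# Derived metrics are then built from the rolling columns, e.g. a window z-score:
features = features.with_columns(
    ((pl.col("temp") - pl.col("temp_ts_rolling_mean")) / pl.col("temp_ts_rolling_std"))
    .alias("temp_ts_zscore")
)
```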
Generated Features
| Feature Name | Description | Data Type |
|---|---|---|
| {col}_ts_rolling_mean | Rolling mean of col values over the lookback window. | Float |
| {col}_ts_rolling_std | Rolling standard deviation over the lookback window. | Float |
| {col}_ts_rolling_min | Minimum value in the lookback window. | Float |
| {col}_ts_rolling_max | Maximum value in the lookback window. | Float |
| {col}_ts_zscore | Z-score of current value relative to window statistics. | Float |
| {col}_ts_null_ratio | Proportion of null values in the lookback window. | Float |
| {col}_ts_outlier_score | Binary flag (0/1) indicating if current value is an outlier based on IQR. | Integer |
| {col}_ts_trend_slope | Approximated trend slope, calculated as the difference between the current value and the value at the start of the window, divided by the window size. | Float |
| {col}_ts_volatility | Coefficient of variation (std/mean) measuring relative variability. | Float |
| {col}_ts_stability_score | Inverse measure of volatility (1/(1+volatility)). | Float |
| {col}_ts_momentum | Difference between current value and window mean. | Float |
| {col}_ts_acceleration | Rate of change of momentum (difference from previous momentum value). | Float |
| {col}_ts_mean_diff | Absolute difference between current value and window mean. | Float |
| {col}_ts_std_diff | The difference between the rolling standard deviation at the current step and the previous step. | Float |
| {col}_ts_range_ratio | Position of current value within window range (0-1). | Float |
| {col}_ts_percentile_rank | Percentile rank of current value within window distribution. | Float |
| {col}_ts_iqr | Interquartile range of window values. | Float |
| {col}_ts_skewness | Skewness of window distribution (third standardized moment). | Float |
| {col}_ts_kurtosis | Excess kurtosis of window distribution (fourth standardized moment - 3). | Float |
| {col}_ts_value_sum | Sum of values in the lookback window. | Float |
| {col}_ts_value_mean_diff | Signed difference between current value and window mean. | Float |
| {col}_ts_value_volatility_ratio | Normalized distance from mean (\|current - mean\|/std). | Float |
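To make a few of the derived metrics concrete (trend slope, volatility, stability, momentum), here is a small pandas sketch; the cohort/column names and window size are assumptions, and it mirrors the table's definitions rather than the module's actual Polars expressions.

```python
import pandas as pd

w = 3  # assumed lookback window
df = pd.DataFrame({"cohort": ["a"] * 6, "temp": [1.0, 2.0, 4.0, 4.0, 9.0, 3.0]})
g = df.groupby("cohort")["temp"]

rolling_mean = g.transform(lambda s: s.rolling(w).mean())
rolling_std = g.transform(lambda s: s.rolling(w).std())

# Approximate slope: (current value - value at the start of the window) / window size.
df["temp_ts_trend_slope"] = (df["temp"] - g.shift(w - 1)) / w
df["temp_ts_volatility"] = rolling_std / rolling_mean            # coefficient of variation
df["temp_ts_stability_score"] = 1.0 / (1.0 + df["temp_ts_volatility"])
df["temp_ts_momentum"] = df["temp"] - rolling_mean
```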
Data Drift Detection Sensors
sensor_single_column_drift
Detects data drift by analyzing a single column of data over time, row by row. For each new data point, it compares the recent data window against a historical lookback window using a comprehensive suite of metrics. This includes geometric distances (Euclidean, Cosine), distribution comparison tests (Wasserstein, KS-test, Jensen-Shannon), statistical tests for variance (Levene), and scores that measure sudden changes, gradual trends, and volatility. This multi-faceted approach allows it to detect a wide range of drift types, from sudden spikes to gradual changes in distribution.
Main Function Logic
FUNCTION sensor_single_column_drift(data_column, lookback_window, min_history):

    // --- 1. Initialization ---
    results_dataframe = COPY(input_data_column)
    feature_names = GET_ALL_DRIFT_FEATURE_NAMES()
    FOR each feature_name in feature_names:
        results_dataframe[feature_name] = GET_DEFAULT_VALUE(feature_name)
    ENDFOR

    // --- 2. Iteration and Calculation ---
    FOR i from min_history to END of data_column:

        // --- 2a. Define Time Windows ---
        current_value = data_column[i]
        previous_value = data_column[i-1]
        recent_window_end = i + 1
        recent_window_start = MAX(0, i - lookback_window)
        recent_window = data_column from recent_window_start to recent_window_end
        historical_window_end = recent_window_start
        historical_window_start = MAX(0, historical_window_end - lookback_window)
        historical_window = data_column from historical_window_start to historical_window_end

        // --- 2b. Calculate All Drift Features for the current point 'i' ---
        current_features = INITIALIZE_EMPTY_DICTIONARY()
        current_features['euclidean_distance'] = CALCULATE_EUCLIDEAN_DISTANCE(recent_window, historical_window)
        current_features['cosine_distance'] = CALCULATE_COSINE_DISTANCE(recent_window, historical_window)
        current_features['mahalanobis_distance'] = CALCULATE_MAHALANOBIS_DISTANCE(current_value, historical_window)
        current_features['wasserstein_distance'] = CALCULATE_WASSERSTEIN_DISTANCE(recent_window, historical_window)
        current_features['jensen_shannon_divergence'] = CALCULATE_JENSEN_SHANNON_DIVERGENCE(recent_window, historical_window)
        current_features['ks_test_pvalue'] = PERFORM_KS_TEST(recent_window, historical_window)
        current_features['levene_test_pvalue'] = PERFORM_LEVENE_TEST(recent_window, historical_window)
        current_features['mannwhitney_test_pvalue'] = PERFORM_MANN_WHITNEY_U_TEST(recent_window, historical_window)
        current_features['sudden_change_score'] = CALCULATE_SUDDEN_CHANGE_SCORE(current_value, previous_value, recent_window)
        current_features['gradual_change_score'] = CALCULATE_GRADUAL_CHANGE_SCORE(recent_window)
        current_features['trend_deviation'] = CALCULATE_TREND_DEVIATION(current_value, historical_window)
        current_features['stability_index'] = CALCULATE_STABILITY_INDEX(recent_window)
        current_features['acceleration'] = CALCULATE_ACCELERATION(data_column, i)
        current_features['momentum'] = CALCULATE_MOMENTUM(recent_window)
        current_features['volatility_ratio'] = CALCULATE_VOLATILITY_RATIO(current_value, previous_value, historical_window)
        current_features['percentile_score'] = CALCULATE_PERCENTILE_SCORE(current_value, historical_window)
        current_features['regime_change_prob'] = CALCULATE_REGIME_CHANGE_PROB(current_value, historical_window)

        // --- 2c. Store Results ---
        FOR feature_name, value in current_features:
            results_dataframe[feature_name][i] = value
        ENDFOR
    ENDFOR

    // --- 3. Return ---
    RETURN results_dataframe
ENDFUNCTION
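Several of the window-comparison metrics map directly onto SciPy. The sketch below shows a plausible computation for a few of them; the helper name, the 10-bin histogram used for Jensen-Shannon, and the toy data are assumptions, not the framework's exact defaults.

```python
import numpy as np
from scipy import stats
from scipy.spatial.distance import jensenshannon

def drift_metrics(recent: np.ndarray, historical: np.ndarray) -> dict:
    """Sketch of a few window-vs-window drift metrics using SciPy."""
    out = {
        "wasserstein_distance": float(stats.wasserstein_distance(recent, historical)),
        "ks_test_pvalue": float(stats.ks_2samp(recent, historical).pvalue),
        "levene_test_pvalue": float(stats.levene(recent, historical).pvalue),
        "mannwhitney_test_pvalue": float(stats.mannwhitneyu(recent, historical).pvalue),
    }
    # Jensen-Shannon divergence on binned histograms over a shared value range.
    lo, hi = min(recent.min(), historical.min()), max(recent.max(), historical.max())
    p, _ = np.histogram(recent, bins=10, range=(lo, hi), density=True)
    q, _ = np.histogram(historical, bins=10, range=(lo, hi), density=True)
    out["jensen_shannon_divergence"] = float(jensenshannon(p, q, base=2) ** 2)  # squared distance = divergence
    return out

# Example: compare the last `lookback` points against the preceding window.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 20)])
lookback = 20
print(drift_metrics(series[-lookback:], series[-2 * lookback:-lookback]))
```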
Generated Features
| Feature Name | Description | Data Type |
|---|---|---|
| drift_euclidean_distance | The geometric distance between the vector of the most recent window of data points and the immediately preceding window. Sensitive to shifts in the mean and overall value levels. | Float |
| drift_cosine_distance | Measures the cosine of the angle between two data windows treated as vectors. It is sensitive to changes in the pattern or shape of the data, independent of its magnitude. | Float |
| drift_mahalanobis_distance | Measures the distance of the current data window from the center of the historical data distribution, accounting for covariance. Effective at detecting multivariate outliers or changes in correlation. | Float |
| drift_wasserstein_distance | Also known as Earth Mover's Distance. Measures the 'work' required to transform the historical data distribution into the distribution including the current point. Highly sensitive to changes in distribution shape. | Float |
| drift_jensen_shannon_divergence | Measures the similarity between two probability distributions (recent vs. historical). It is a symmetrized version of Kullback-Leibler divergence, providing a score from 0 (identical) to 1 (maximal divergence). | Float |
| drift_ks_test_pvalue | The p-value from a Kolmogorov-Smirnov test comparing the distribution of a recent window of data to an older, historical window. A low p-value (e.g., < 0.05) indicates a statistically significant change in distribution. | Float |
| drift_levene_test_pvalue | The p-value from Levene's test, which checks if two sub-samples have equal variances. A low p-value suggests a significant change in the data's volatility or spread. | Float |
| drift_mannwhitney_test_pvalue | The p-value from the Mann-Whitney U test, a non-parametric test for comparing the medians of two independent samples. A low p-value suggests that the distributions of the two windows are different. | Float |
| drift_sudden_change_score | Measures the magnitude of the most recent change (current value vs. previous value) relative to the recent standard deviation. A high score indicates a spike or drop that is unusual compared to recent volatility. | Float |
| drift_gradual_change_score | A score based on the slope and R-squared value of a linear regression over the recent data window. A high absolute value indicates a consistent and strong upward or downward trend. | Float |
| drift_trend_deviation | Measures how much the current value deviates from the value predicted by a linear trend fitted on the historical data. A high score indicates a break from the established trend. | Float |
| drift_stability_index | A score from 0 to 1 based on the inverse of the coefficient of variation. A value near 1 indicates very stable and predictable data, while a value near 0 indicates high relative volatility. | Float |
| drift_acceleration | Measures the rate of change of the momentum (velocity). A large positive value indicates that the data is rising at an increasing rate, while a large negative value indicates it is falling at an increasing rate. | Float |
| drift_momentum | A weighted average of recent changes in the data. Captures the current direction and magnitude of movement in the time series. | Float |
| drift_volatility_ratio | The ratio of the most recent absolute change to the historical standard deviation of changes. A value > 1 suggests the latest change is more volatile than usual. | Float |
| drift_percentile_score | A score from 0 to 1 indicating how extreme the current value is relative to the historical distribution. A value near 1 means the point is close to the historical minimum or maximum. | Float |
| drift_regime_change_prob | A probability-like score (0 to 1) that indicates how likely the current value is to be part of a new data regime, based on how far it falls outside the historical 5th and 95th percentiles. | Float |
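The percentile and regime-change scores are heuristics; one plausible formulation (an assumption, not necessarily the framework's exact definition) is sketched below.

```python
import numpy as np
from scipy import stats

def extremity_scores(current_value: float, historical: np.ndarray) -> dict:
    """One plausible formulation of the percentile/regime scores described above."""
    # Percentile rank of the current value within the historical window (0..1).
    rank = stats.percentileofscore(historical, current_value) / 100.0
    # Extremity: 0 near the historical median, 1 near the historical min or max.
    percentile_score = 2.0 * abs(rank - 0.5)
    # Regime-change score: how far outside the historical 5th-95th percentile band the
    # value falls, scaled by that band's width and clipped to [0, 1]. (Assumed formula.)
    p5, p95 = np.percentile(historical, [5, 95])
    band = max(p95 - p5, 1e-12)
    overshoot = max(p5 - current_value, current_value - p95, 0.0)
    regime_change_prob = float(np.clip(overshoot / band, 0.0, 1.0))
    return {"percentile_score": percentile_score, "regime_change_prob": regime_change_prob}
```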
Sensor Framework Documentation - Version 1.0
Last Updated: September 2025