Skip to main content

Data Quality Tests

Data quality tests validate the reasonableness and business logic correctness of our pricing service data.

These tests go beyond structural integrity to ensure the data makes business sense and falls within expected ranges.

Table of Contents:


1. Price Boundary Tests

SSP Maximum Price Bounds Check

Function: test_ssp_max_price_outlier_bounds()
SQL File: sql/qa/ssp_max_price_outlier_bounds.sql

Validates that max_price values in the SSP table fall within reasonable bounds for non-cash payers.

SELECT count(*) as n
FROM {{ table }}
WHERE max_price IS NOT NULL
AND (max_price < lower_bound OR max_price > upper_bound)

Business Rule: Maximum prices should be between lower and upper bounds defined for each service by Clear Rates Purpose: Identifies unrealistic price outliers that could indicate data quality issues.
Expected Result: 0 prices outside the valid range


SSP General Price Bounds

SQL File: sql/qa/ssp_max_price_bounds.sql

Ensures SSP prices fall within the $1 to $1M range for general validation.

SELECT count(*) as n
FROM {{ table }}
WHERE max_price IS NOT NULL
AND (max_price < 1 OR max_price > 1000000)

Purpose: Basic sanity check for price values.
Expected Result: 0 prices outside bounds

Regional Average Price Bounds Check

Function: test_payer_provider_type_ssp_cbsa_price_bounds_check()
SQL File: sql/qa/payer_provider_type_ssp_cbsa_price_bounds_check.sql

Validates that SSP max/min prices align with rate table bounds by payer, provider type, SSP ID, and CBSA.

WITH quality_stats AS (
SELECT
payer_id,
provider_type,
ssp_id,
cbsa,
min(lower_bound) as min_lower_bound,
max(upper_bound) as max_upper_bound,
max(max_price) as max_price,
min(max_price) as min_price
FROM {{ ssp_table }} ps
LEFT JOIN {{ rates_table }} r ON ps.max_price_id = r.id
WHERE cbsa IS NOT NULL AND cbsa != '99999' AND payer_id != '0'
GROUP BY 1, 2, 3, 4
)
SELECT count(*) as n
FROM quality_stats
WHERE max_price > max_upper_bound * 1.1
OR min_price < min_lower_bound * 0.9

Business Rule: SSP prices should not exceed rate bounds by more than 10%
Purpose: Ensures consistency between SSP and rates data
Expected Result: 0 cases where prices exceed bounds with 10% tolerance


2. Ratio Validation Tests

Price Ratio Reasonableness Check

Function: test_payer_provider_type_ssp_cbsa_ratio_check()
SQL File: sql/qa/payer_provider_type_ssp_cbsa_ratio_check.sql

Validates that the ratio between maximum and minimum prices within a payer-ssp-cbsa group doesn't exceed reasonable limits.

Business Rule: Price ratios should never exceed 150x (15,000% difference)
Purpose: Identifies cases where price variations are unreasonably large, indicating potential data quality issues
Expected Result: 0 cases where ratio exceeds 150x


3. Fee Calculation Accuracy Tests

Canonical Rate Facility Fee Alignment

Function: test_canonical_rate_facility_fee_alignment()
Location: qa/data_quality_tests.py:90

Validates that facility fees in the SSP table align with canonical rates from the CLD schema and fall within expected bounds.

df = hook.get_pandas_df(f"""
SELECT
ps.facility_fee,
cld.canonical_rate,
cld.canonical_rate_score,
cld.lower_bound,
cld.upper_bound,
cld.state_avg_medicare_rate
FROM {ps_schema}.pricing_service_ssp_{payer_month} ps
JOIN {cld_schema}.prod_combined_all cld
ON ps.roid = cld.roid
AND ps.payer_id = cld.payer_id
WHERE facility_fee IS NOT NULL
""")

Business Rule: Facility fees should closely match canonical rates and fall within lower/upper bounds
Purpose: Ensures facility fee calculations align with established rate benchmarks
Expected Result: All facility fees match canonical rates and stay within bounds


Facility Fee Calculation Accuracy

Function: test_facility_fee_calculation_accuracy()
Location: qa/data_quality_tests.py:126

Verifies that facility fees in the SSP table match the calculated facility fees derived from the rates table.

Test Logic:

  1. Samples SSP data for non-cash payers (excluding imaging centers)
  2. Retrieves corresponding rates data for facility fees (package base codes)
  3. Compares SSP facility_fee with rates table negotiated_rate

Business Rule: SSP facility fees should match rates table calculations
Purpose: Ensures consistency between derived SSP data and underlying rates
Expected Result: Facility fees match calculated values from rates table


Professional Fee Calculation Accuracy

Function: test_professional_fee_calculation_accuracy()
Location: qa/data_quality_tests.py:175

Validates that professional fees in the SSP table match calculated professional fees from the rates table.

Test Logic:

  1. Samples SSP data for non-cash payers with 5-character billing codes
  2. Retrieves professional fee rates data
  3. Calculates expected professional fees: negotiated_rate × average_units × anesthesia_unit_conversion
  4. Compares calculated totals with SSP professional_fee values

Business Rule: SSP professional fees should equal sum of calculated professional components
Purpose: Ensures accurate professional fee derivation from rates data
Expected Result: Professional fees match calculated values including unit conversions


4. SSP and Provider Type Plausibility Tests

Imaging Centers Only RA SSPs

Function: test_imaging_centers_only_ra_ssps()
Location: qa/data_quality_tests.py

Validates that Imaging Centers only offer radiology SSPs (those starting with 'RA').

SELECT COUNT(*) as violation_count
FROM pricing_service_ssp
WHERE provider_type = 'Imaging Center'
AND ssp_id NOT LIKE 'RA%'

Business Rule: Imaging Centers should only provide radiology services (RA SSPs)
Purpose: Ensures provider type and service offerings are clinically plausible
Expected Result: 0 violations (no Imaging Centers with non-RA SSPs)


ASCs No OB SSPs

Function: test_ascs_no_ob_ssps()
Location: qa/data_quality_tests.py

Validates that Ambulatory Surgery Centers (ASCs) do not offer obstetrics SSPs (those starting with 'OB').

SELECT COUNT(*) as violation_count
FROM pricing_service_ssp
WHERE provider_type = 'ASC'
AND ssp_id LIKE 'OB%'

Business Rule: ASCs should not provide obstetrics services (OB SSPs)
Purpose: Ensures provider type and service offerings are clinically plausible
Expected Result: 0 violations (no ASCs with OB SSPs)