Denoising

The denoising module provides multiple methods for reducing noise in FTIR spectra while preserving important spectral features.

Overview

Using Utility Functions Directly

For standalone use or custom pipelines:

from xpectrass.utils import denoise, denoise_method_names

# See available methods
print(denoise_method_names())
# ['gaussian', 'lowpass', 'median', 'moving_average', 'savgol', 'wavelet', 'whittaker']

# Apply denoising to a single spectrum
denoised = denoise(intensities, method='savgol', window_length=15, polyorder=3)

Available Methods

Method

Description

Best For

savgol

Savitzky-Golay filter

Default choice - preserves peak shape

wavelet

Discrete wavelet transform

Variable noise levels

moving_average

Simple box filter

Quick preprocessing

gaussian

Gaussian smoothing

Low-frequency noise

median

Non-linear median filter

Spike/impulse noise

whittaker

Penalized least squares

Balanced smoothing

lowpass

Butterworth filter

High-frequency noise

Method Details

Wavelet Denoising

Multi-resolution approach that decomposes the signal into wavelet coefficients and thresholds small (noise) coefficients.

denoised = denoise(
    intensities,
    method='wavelet',
    wavelet='db4',        # Wavelet family
    level=3,              # Decomposition level
    threshold_mode='soft' # 'soft' or 'hard'
)

Common wavelets: 'db4', 'db6', 'sym4', 'coif2'

Median Filter

Non-linear filter that replaces each value with the median of neighboring values. Excellent for removing spike noise.

denoised = denoise(
    intensities,
    method='median',
    kernel_size=5  # Must be odd
)

Gaussian Filter

Convolves the signal with a Gaussian kernel.

denoised = denoise(
    intensities,
    method='gaussian',
    sigma=2.0  # Standard deviation
)

Moving Average

Simple box filter that averages neighboring points.

denoised = denoise(
    intensities,
    method='moving_average',
    window=11
)

Whittaker Smoother

Penalized least squares that balances fidelity with smoothness.

denoised = denoise(
    intensities,
    method='whittaker',
    lam=1e4,  # Smoothness parameter
    d=2       # Derivative order
)

Low-Pass Filter

Butterworth filter that removes high-frequency components.

denoised = denoise(
    intensities,
    method='lowpass',
    cutoff=0.1,  # Normalized frequency (0-1)
    order=4      # Filter order
)

Evaluation

SNR Estimation

from xpectrass.utils import estimate_snr

# Compare raw vs denoised
snr_improvement = estimate_snr(y_raw, y_denoised)
print(f"SNR improvement: {snr_improvement:.1f} dB")

Batch Evaluation

from xpectrass.utils import evaluate_denoising_methods

# Compare methods on dataset
results = evaluate_denoising_methods(
    data=df,
    methods=["savgol", "wavelet", "gaussian"],
    label_column="type",
    exclude_columns=["study", "sample_id", "environmental", "resolution"],
    n_samples=30,
)

# Results include SNR, smoothness, and fidelity metrics
print(results.groupby('method').mean())

Visualization

from xpectrass.utils import plot_denoising_comparison

plot_denoising_comparison(
    intensities,
    wavenumbers,
    methods=['savgol', 'wavelet', 'gaussian'],
    sample_name='HDPE1'
)

Recommendations

Situation

Recommended Method

General FTIR

savgol (window=15, polyorder=3)

High noise

wavelet with higher level

Spike noise

median first, then savgol

Real-time processing

moving_average (fastest)

Best peak preservation

savgol with small window

Example

from xpectrass.utils import denoise
import numpy as np

# Apply two-stage denoising for spike + random noise
spectrum = load_spectrum('sample.csv')

# Stage 1: Remove spikes
no_spikes = denoise(spectrum, method='median', kernel_size=3)

# Stage 2: Smooth
smooth = denoise(no_spikes, method='savgol', window_length=11, polyorder=3)