Denoising
The denoising module provides multiple methods for reducing noise in FTIR spectra while preserving important spectral features.
Overview
Using FTIRdataprocessing Class (Recommended)
The easiest way to apply denoising is through the FTIRdataprocessing class with built-in evaluation:
from xpectrass import FTIRdataprocessing
# Initialize with your data
ftir = FTIRdataprocessing(df, label_column="type")
# Convert first (recommended before denoising evaluation)
df_abs = ftir.convert(plot=False)
# Step 1: Evaluate all denoising methods to find the best one
ftir.find_denoising_method(data=df_abs, n_samples=50, plot=True)
# Step 2: View evaluation results
print(ftir.denoising_results)
# Step 3: Plot evaluation comparison
ftir.plot_denoising_eval()
# Step 4: Apply the best method
ftir.denoise_spect(method='savgol', window_length=15, polyorder=3, plot=False)
# Step 5: Access denoised data
denoised_df = ftir.df_denoised
Using Utility Functions Directly
For standalone use or custom pipelines:
from xpectrass.utils import denoise, denoise_method_names
# See available methods
print(denoise_method_names())
# ['gaussian', 'lowpass', 'median', 'moving_average', 'savgol', 'wavelet', 'whittaker']
# Apply denoising to a single spectrum
denoised = denoise(intensities, method='savgol', window_length=15, polyorder=3)
Available Methods
Method |
Description |
Best For |
|---|---|---|
|
Savitzky-Golay filter |
Default choice - preserves peak shape |
|
Discrete wavelet transform |
Variable noise levels |
|
Simple box filter |
Quick preprocessing |
|
Gaussian smoothing |
Low-frequency noise |
|
Non-linear median filter |
Spike/impulse noise |
|
Penalized least squares |
Balanced smoothing |
|
Butterworth filter |
High-frequency noise |
Method Details
Savitzky-Golay (Recommended)
Fits successive sub-sets of adjacent data points with a low-degree polynomial. Excellent for preserving peak shapes and positions.
denoised = denoise(
intensities,
method='savgol',
window_length=15, # Must be odd
polyorder=3 # Polynomial order
)
Guidelines:
Larger
window_length= more smoothingHigher
polyorder= better peak preservationTypical:
window_length=11-21,polyorder=2-4
Wavelet Denoising
Multi-resolution approach that decomposes the signal into wavelet coefficients and thresholds small (noise) coefficients.
denoised = denoise(
intensities,
method='wavelet',
wavelet='db4', # Wavelet family
level=3, # Decomposition level
threshold_mode='soft' # 'soft' or 'hard'
)
Common wavelets: 'db4', 'db6', 'sym4', 'coif2'
Median Filter
Non-linear filter that replaces each value with the median of neighboring values. Excellent for removing spike noise.
denoised = denoise(
intensities,
method='median',
kernel_size=5 # Must be odd
)
Gaussian Filter
Convolves the signal with a Gaussian kernel.
denoised = denoise(
intensities,
method='gaussian',
sigma=2.0 # Standard deviation
)
Moving Average
Simple box filter that averages neighboring points.
denoised = denoise(
intensities,
method='moving_average',
window=11
)
Whittaker Smoother
Penalized least squares that balances fidelity with smoothness.
denoised = denoise(
intensities,
method='whittaker',
lam=1e4, # Smoothness parameter
d=2 # Derivative order
)
Low-Pass Filter
Butterworth filter that removes high-frequency components.
denoised = denoise(
intensities,
method='lowpass',
cutoff=0.1, # Normalized frequency (0-1)
order=4 # Filter order
)
Evaluation
SNR Estimation
from xpectrass.utils import estimate_snr
# Compare raw vs denoised
snr_improvement = estimate_snr(y_raw, y_denoised)
print(f"SNR improvement: {snr_improvement:.1f} dB")
Batch Evaluation
from xpectrass.utils import evaluate_denoising_methods
# Compare methods on dataset
results = evaluate_denoising_methods(
data=df,
methods=["savgol", "wavelet", "gaussian"],
label_column="type",
exclude_columns=["study", "sample_id", "environmental", "resolution"],
n_samples=30,
)
# Results include SNR, smoothness, and fidelity metrics
print(results.groupby('method').mean())
Visualization
from xpectrass.utils import plot_denoising_comparison
plot_denoising_comparison(
intensities,
wavenumbers,
methods=['savgol', 'wavelet', 'gaussian'],
sample_name='HDPE1'
)
Recommendations
Situation |
Recommended Method |
|---|---|
General FTIR |
|
High noise |
|
Spike noise |
|
Real-time processing |
|
Best peak preservation |
|
Example
from xpectrass.utils import denoise
import numpy as np
# Apply two-stage denoising for spike + random noise
spectrum = load_spectrum('sample.csv')
# Stage 1: Remove spikes
no_spikes = denoise(spectrum, method='median', kernel_size=3)
# Stage 2: Smooth
smooth = denoise(no_spikes, method='savgol', window_length=11, polyorder=3)