Exploration Methods
School of Earth & Atmospheric Sciences — Georgia Institute of Technology
2026-02-23

Felix J. Herrmann
School of Earth & Atmospheric Sciences — Georgia Institute of Technology
This slide deck builds on a lecture prepared by Eric Verschuur


Purpose:
Part I — Foundations
Part II — Minimum-phase deconvolution
Part III — Predictive deconvolution
Part IV — Sparsity-promoting deconvolution
Let \(w(t)\) denote the source wavelet and \(r(t)\) the reflectivity series. Their convolution is
\[ d(t) = (w \ast r)(t) = \int_{-\infty}^{\infty} w(\tau)\, r(t - \tau)\, d\tau \]
where \(d(t)\) is the recorded seismic trace.
Physical interpretation: the output \(d(t)\) is a weighted superposition of shifted copies of \(r(t)\), with weights given by \(w(\tau)\).
The cross-correlation of \(w(t)\) with \(d(t)\) is
\[ r_{wd}(\tau) = \int_{-\infty}^{\infty} w(t)\, d(t + \tau)\, dt \]
This is equivalently written as convolution with the time-reversed wavelet:
\[ r_{wd}(\tau) = (w(-\cdot) \ast d)(\tau) \]
In operator language, correlation is the adjoint (transpose) of convolution: if \(\mathbf{d} = \mathbf{W}\mathbf{r}\), then \(\mathbf{W}^T\mathbf{d}\) computes the cross-correlation between the wavelet and the seismic trace.
The auto-correlation is the special case: \(r_{ww}(\tau) = \int w(t)\, w(t + \tau)\, dt\).
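These identities are easy to verify numerically. A minimal NumPy sketch (the wavelet and reflectivity values are made up for illustration):

```python
import numpy as np

w = np.array([1.0, -0.5, 0.25])            # toy wavelet
r = np.zeros(8)
r[2], r[5] = 1.0, -0.7                     # sparse toy reflectivity

# Convolution: d[n] = sum_k w[k] r[n-k]
d = np.convolve(w, r)

# Cross-correlation equals convolution with the time-reversed wavelet
assert np.allclose(np.correlate(d, w, mode="full"),
                   np.convolve(d, w[::-1]))

# Auto-correlation at zero lag equals the wavelet energy
r_ww = np.correlate(w, w, mode="full")     # lags -(N-1) .. N-1
assert np.isclose(r_ww[len(w) - 1], np.sum(w**2))
```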
Let \(W(f)\), \(R(f)\), \(D(f)\) denote the Fourier transforms of \(w(t)\), \(r(t)\), \(d(t)\). Then:
Convolution in time becomes multiplication in frequency:
\[ D(f) = W(f)\, R(f) \]
Cross-correlation becomes
\[ R_{wd}(f) = W^*(f)\, D(f) \]
Auto-correlation becomes
\[ R_{ww}(f) = |W(f)|^2 \]
where \(W^*(f)\) denotes the complex conjugate of \(W(f)\).
For sampled signals \(w[n]\) and \(r[n]\) with sampling interval \(\Delta t\):
\[ d[n] = \sum_{k} w[k]\, r[n - k] \]
Convolution is a linear operation, which can be written in matrix form:
\[ \mathbf{d} = \mathbf{W}\, \mathbf{r} \]
where \(\mathbf{W}\) is a Toeplitz matrix built from the wavelet samples. Its adjoint \(\mathbf{W}^T\) computes correlation.
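The matrix view can be sketched with SciPy's `toeplitz` helper (toy numbers; here \(\mathbf{W}\) implements the "full" convolution, so it is tall):

```python
import numpy as np
from scipy.linalg import toeplitz

w = np.array([1.0, -0.5, 0.25])
r = np.random.default_rng(0).standard_normal(8)
n = len(r)

# Tall Toeplitz matrix: first column is w zero-padded, first row is w[0], 0, ...
W = toeplitz(np.concatenate([w, np.zeros(n - 1)]),
             np.concatenate([[w[0]], np.zeros(n - 1)]))

# W @ r is convolution ...
d = W @ r
assert np.allclose(d, np.convolve(w, r))

# ... and the adjoint W.T @ d is cross-correlation (fully overlapping lags)
assert np.allclose(W.T @ d, np.correlate(d, w, mode="valid"))
```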
Let \(\mathbf{F}\) denote the DFT matrix. Then \(\hat{\mathbf{d}} = \mathbf{F}\mathbf{d}\) and the convolution becomes an elementwise (Hadamard) product:
\[ \hat{\mathbf{d}} = \hat{\mathbf{w}} \odot \hat{\mathbf{r}} = \operatorname{diag}(\hat{\mathbf{w}})\, \hat{\mathbf{r}} \]
where \(\hat{\mathbf{w}} = \mathbf{F}\mathbf{w}\), \(\hat{\mathbf{r}} = \mathbf{F}\mathbf{r}\), and \(\odot\) is the Hadamard (elementwise) product.

This means convolution is diagonalized by the DFT — each frequency is independent.
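For *circular* convolution this diagonalization is exact, which a quick FFT check confirms (toy signals; for linear convolution the inputs must first be zero-padded):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 16
w, r = rng.standard_normal(N), rng.standard_normal(N)

# DFT, elementwise product, inverse DFT
d = np.fft.ifft(np.fft.fft(w) * np.fft.fft(r)).real

# Direct circular convolution: d[n] = sum_k w[k] r[(n-k) mod N]
d_direct = np.array([sum(w[k] * r[(n - k) % N] for k in range(N))
                     for n in range(N)])
assert np.allclose(d, d_direct)
```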
The 1-D seismic trace is modeled as
\[ d(t) = w(t) \ast r(t) + n(t) \]
where \(d(t)\) is the recorded trace, \(w(t)\) the source wavelet, \(r(t)\) the reflectivity series, and \(n(t)\) additive noise.
Geological section \(\to\) impedance contrasts \(\to\) reflectivity series \(r(t)\) convolved with source wavelet \(w(t)\) yields the seismic trace \(d(t)\). (Yilmaz 2001)
Each reflector produces a shifted, scaled copy of \(w(t)\); the seismogram \(d(t)\) is their superposition.
The Ricker wavelet is band-limited — it has zero energy at DC \((f=0)\) and at high frequencies. Peak frequency \(f_0 = 25\) Hz.
Convolution of earth’s impulse response (a) with wavelet (b) yields the seismogram (c). This is multiplicative in the amplitude spectra (top) and convolutional in the autocorrelations (middle). (Yilmaz 2001)
The vibroseis source emits a long frequency sweep. Cross-correlating the recorded data with the known sweep produces the Klauder wavelet — a compact, zero-phase pulse. (Sheriff and Geldart 1995)
Deconvolution removes unwanted filtering effects (source signature, ghosts, reverberations) from the seismic signal to improve temporal resolution.
The aim is to recover a broadband, spike-like reflectivity \(r(t)\) from the recorded trace \(d(t)\).
Goal: find \(r(t)\) from \(d(t) = w(t) \ast r(t) + n(t)\).
In the frequency domain the noise-free inverse is
\[ R(f) = \frac{D(f)}{W(f)} \]
Equivalently, the inverse filter \(F(f) = 1/W(f)\) satisfies
\[ F(f)\, W(f) = 1 \quad \longleftrightarrow \quad f(t) \ast w(t) = \delta(t) \]

Wherever \(W(f) \approx 0\) (spectral notches), the inverse \(1/W(f)\) blows up:
\[ \hat{R}(f) = R(f) + \frac{N(f)}{W(f)} \longrightarrow \infty \]
Any small noise \(N(f)\) at those frequencies gets amplified enormously.
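The blow-up is easy to reproduce. A synthetic sketch (Ricker wavelet; the spike positions and noise level are made up for illustration):

```python
import numpy as np

def ricker(f0, dt, n):
    """Zero-phase Ricker wavelet, peak frequency f0 (Hz), centered in the window."""
    t = (np.arange(n) - n // 2) * dt
    a = (np.pi * f0 * t) ** 2
    return (1 - 2 * a) * np.exp(-a)

dt, n = 0.004, 256
w = ricker(25.0, dt, n)
r = np.zeros(n)
r[[40, 90, 150]] = [1.0, -0.8, 0.6]        # sparse reflectivity

d = np.fft.ifft(np.fft.fft(w) * np.fft.fft(r)).real
d += 0.01 * np.random.default_rng(0).standard_normal(n)   # small additive noise

# Naive spectral division: divides by near-zero |W(f)| at DC and high frequencies
W = np.fft.fft(w)
r_naive = np.fft.ifft(np.fft.fft(d) / W).real

# The estimate is dominated by amplified noise
print(np.linalg.norm(r_naive) / np.linalg.norm(r))        # many orders of magnitude > 1
```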
Adapted from Eric Verschuur
The term deconvolution is used for many different signal enhancement techniques:
| Type | Goal | Assumption |
|---|---|---|
| Spiking | Sharpen wavelet \(\to\) spike \(\delta(t)\) | Noise is Gaussian |
| Blind decon. | Estimate source/receiver and earth response | \(w(t)\) is minimum-phase, \(r(t)\) is white |
| Predictive | Remove source/receiver ghosts and multiples | Periodic structure in the data |
Important
Main problem: the source/receiver responses and Earth response are both unknown. Given \(z = x \ast y\), what are \(x\) and \(y\)? This is the blind deconvolution problem.
Instead of \(1/W(f)\), use the Wiener filter (Wiener 1949):
\[ F(f) = \frac{W^*(f)}{|W(f)|^2 + \varepsilon} \]
where \(\varepsilon > 0\) is the stabilization (damping) parameter that prevents division by zero.
\[ F(f) = \frac{W^*(f)}{|W(f)|^2 + \varepsilon} \]
\[F(f) \approx \frac{1}{W(f)} \quad \text{where } |W(f)|^2 \gg \varepsilon \qquad \Rightarrow \text{inverse filtering}\]
\[F(f) \approx \frac{W^*(f)}{\varepsilon} \approx 0 \quad \text{where } |W(f)|^2 \ll \varepsilon \qquad \Rightarrow \text{suppress noisy frequencies}\]
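A stabilized version of the spectral division on a synthetic Ricker-wavelet trace (made-up parameters; scaling \(\varepsilon\) to the peak power of \(W\) is one illustrative convention, not a rule):

```python
import numpy as np

def ricker(f0, dt, n):
    t = (np.arange(n) - n // 2) * dt
    a = (np.pi * f0 * t) ** 2
    return (1 - 2 * a) * np.exp(-a)

dt, n = 0.004, 256
w = ricker(25.0, dt, n)
r = np.zeros(n)
r[[40, 90, 150]] = [1.0, -0.8, 0.6]
d = np.fft.ifft(np.fft.fft(w) * np.fft.fft(r)).real
d += 0.01 * np.random.default_rng(0).standard_normal(n)

W, D = np.fft.fft(w), np.fft.fft(d)
eps = 1e-2 * np.max(np.abs(W)) ** 2       # damping, scaled to the peak power
F = np.conj(W) / (np.abs(W) ** 2 + eps)   # Wiener deconvolution filter
r_hat = np.fft.ifft(F * D).real

# Bounded, band-limited estimate; the largest reflector is recovered in place
assert np.argmax(np.abs(r_hat)) == 40
```

Unlike the naive division, the result stays bounded everywhere: where \(|W|^2\) drops below \(\varepsilon\), the filter rolls off to zero instead of exploding.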
Adapted from Eric Verschuur
Adapted from Eric Verschuur
Left: Wiener filter amplitude for \(\varepsilon \in \{0.001, 0.01, 0.1, 1.0\}\). Right: product \(W(f) \cdot F(f)\) — approaches \(1\) for small \(\varepsilon\), increasingly damped for large \(\varepsilon\).
Top: noise-free. Bottom: noisy (SNR = 20 dB). Small \(\varepsilon\) gives better resolution but amplifies noise — the classic resolution vs stability trade-off.
Ricker wavelet \(w(t)\) with \(f_0 = 25\) Hz convolved with a sparse reflectivity \(r(t)\) containing six isolated reflectors.
Additive Gaussian noise at SNR = 20 dB. The noise is broadband — it contaminates all frequencies, including those where \(|W(f)| \approx 0\).
Spectral division without stabilization. At spectral nulls of \(W(f)\) the noise is amplified without bound:
\[ \hat{R}(f) = R(f) + \frac{N(f)}{W(f)} \;\longrightarrow\; \infty \]
The result is useless — completely dominated by amplified noise.
The stabilized (Wiener) deconvolution filter:
\[ F_\varepsilon(f) = \frac{W^*(f)}{|W(f)|^2 + \varepsilon} \]
Two regimes:
With no noise, small \(\varepsilon\) gives near-perfect recovery. As \(\varepsilon\) increases, the result becomes over-smoothed.
With noise, small \(\varepsilon\) amplifies noise dramatically. Larger \(\varepsilon\) stabilizes the result at the cost of resolution.
Top: wavelet amplitude spectrum \(|W(f)|\). Bottom: effective resolution filter \(W(f) \cdot F_\varepsilon(f)\) — ideally unity across the signal band.
All \(\varepsilon\) values overlaid with true \(r(t)\). Left: noise-free. Right: noisy. The optimal \(\varepsilon\) balances data misfit and noise amplification.
In the time domain, deconvolution becomes a least-squares problem. Given data vector \(\mathbf{d}\) and convolution matrix \(\mathbf{W}\):
\[ \min_{\mathbf{r}} \|\mathbf{d} - \mathbf{W}\mathbf{r}\|_2^2 \]
Setting the gradient of this objective to zero yields the normal equations
\[ \mathbf{W}^T \mathbf{W}\, \mathbf{r} = \mathbf{W}^T \mathbf{d} \]
Adding an \(\ell_2\)-penalty on \(\mathbf{r}\) yields the damped (Tikhonov regularized) problem:
\[ \min_{\mathbf{r}} \|\mathbf{d} - \mathbf{W}\mathbf{r}\|_2^2 + \varepsilon \|\mathbf{r}\|_2^2 \]
with closed-form solution
\[ \tilde{\mathbf{r}} = (\mathbf{W}^T\mathbf{W} + \varepsilon \mathbf{I})^{-1}\, \mathbf{W}^T \mathbf{d} \]
Tip
This is exactly the Wiener filter in the time domain. The Fourier-domain Wiener filter and Tikhonov regularization are equivalent for stationary convolution.
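A time-domain sketch of the damped normal equations (toy wavelet and reflectivity; a dense direct solve, which is fine at this size):

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)
w = np.array([1.0, -0.6, 0.2])             # toy minimum-phase wavelet
n = 30
r_true = np.zeros(n)
r_true[[5, 14, 22]] = [1.0, -0.7, 0.5]

# Convolution matrix and noisy data
W = toeplitz(np.concatenate([w, np.zeros(n - 1)]),
             np.concatenate([[w[0]], np.zeros(n - 1)]))
d = W @ r_true + 0.01 * rng.standard_normal(W.shape[0])

# Damped least-squares (Tikhonov) solution
eps = 1e-2
r_hat = np.linalg.solve(W.T @ W + eps * np.eye(n), W.T @ d)
assert np.linalg.norm(r_hat - r_true) < 0.3 * np.linalg.norm(r_true)
```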
We want a short, approximate deconvolution filter \(f(t)\) that replaces the original source wavelet \(w(t)\) with a desired wavelet \(v(t)\):
\[f(t)\ast w(t)=v(t)\]
We write the objective as
\[\min_{\mathbf{f}\in F} \|\mathbf{v} - \mathbf{f}\ast\mathbf{w}\|_2^2 \]
with \(F\) the set of filters of a prescribed short length, whose minimizer satisfies
\[f\ast r_{ww} = r_{vw}\]
where
the auto-correlation \(r_{ww}[k] = \sum_n w[n]\, w[n+k]\)
the cross-correlation \(r_{vw}[k] = \sum_n v[n]\, w[n+k]\)
\[ \begin{bmatrix} r_{ww}[0] & r_{ww}[1] & \cdots & r_{ww}[N{-}1] \\ r_{ww}[1] & r_{ww}[0] & \cdots & r_{ww}[N{-}2] \\ \vdots & & \ddots & \vdots \\ r_{ww}[N{-}1] & r_{ww}[N{-}2] & \cdots & r_{ww}[0] \end{bmatrix} \begin{bmatrix}f[0]\\ f[1] \\ \vdots \\ f[N{-}1]\end{bmatrix} = \begin{bmatrix} r_{vw}[0] \\ r_{vw}[1] \\ \vdots \\ r_{vw}[N{-}1] \end{bmatrix} \]
This symmetric Toeplitz system can be solved in \(O(N^2)\) via the Levinson recursion (Levinson 1947), rather than \(O(N^3)\) for general systems.
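SciPy exposes a Levinson-type Toeplitz solver, `scipy.linalg.solve_toeplitz`. A spiking-filter sketch (toy wavelet; the desired output is \(v(t)=\delta(t)\), and the small prewhitening factor is an illustrative choice):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

w = np.array([1.0, -0.6, 0.2])     # toy minimum-phase wavelet
N = 16                             # filter length

# Auto-correlation lags 0 .. N-1, with light prewhitening on the zero lag
r_ww = np.zeros(N)
r_ww[:len(w)] = np.correlate(w, w, mode="full")[len(w) - 1:]
r_ww[0] *= 1.001

# Desired output v = delta: the right-hand side reduces to (w[0], 0, ..., 0)
rhs = np.zeros(N)
rhs[0] = w[0]

f = solve_toeplitz(r_ww, rhs)      # O(N^2) Levinson-type solve

spike = np.convolve(f, w)          # approximately a unit spike at t = 0
assert np.abs(spike[0] - 1) < 0.05
assert np.max(np.abs(spike[1:])) < 0.05
```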
Yilmaz (2001)
Yilmaz (2001)
Yilmaz (2001)
Airgun wavelet (top) and result after spiking deconvolution (bottom) — the wavelet is compressed toward a spike \(\delta(t)\).
Shot gather before (left) and after (right) spiking deconvolution — reflectors become sharper and more interpretable.
Left: ideal zero-phase wavelet and corresponding shot gather. Right: realistic airgun wavelet with ghost — note the longer, ringy waveform that obscures reflector boundaries.
Adapted from Eric Verschuur
Mixed-phase wavelet (left) vs its minimum-phase equivalent (right). Both share the same amplitude spectrum (bottom) but have different phase — the minimum-phase wavelet concentrates energy at early times.
Minimum-phase wavelet
Zero-phase wavelet
In practice, the wavelet \(w(t)\) is unknown. We make two assumptions: the wavelet \(w(t)\) is minimum-phase, and the reflectivity \(r(t)\) is white.
Under these assumptions the wavelet's autocorrelation equals that of the data (up to scale), so no explicit wavelet estimation is needed!
In shallow water, the seismic signal bounces between the ocean surface and the water bottom, creating multiples that are delayed by \(\Delta\tau\) (two-way water-column travel time) and alternating in sign.
Predictive deconvolution aims at removing repetitive components of the seismic measurements (such as water-layer reverberations and ghosts) rather than the source signal itself.
In this situation, the seismic trace with reverberations can be written as a geometric series:
\[ d(t) = p(t) - r_1\, p(t - \Delta\tau) + r_1^2\, p(t - 2\Delta\tau) - \cdots \]
where \(p(t)\) is the primary signal, \(\Delta\tau\) is the two-way travel time in the water layer, and \(r_1\) is the water-bottom reflection coefficient.
Idea: if we can predict the signal \(\Delta\tau\) into the future, we can subtract the predictable (periodic) part. (Robinson 1967)
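The geometric series is easy to synthesize (made-up water depth, reflection coefficient, and primaries; the sign alternation comes from the free-surface reflection):

```python
import numpy as np

dt = 0.004
dtau = 0.1                      # two-way water-column travel time (s), made up
Delta = int(dtau / dt)          # reverberation period in samples
r1 = 0.6                        # water-bottom reflection coefficient, made up

p = np.zeros(400)
p[[30, 120, 250]] = [1.0, -0.5, 0.8]      # primary events

# d(t) = p(t) - r1 p(t - dtau) + r1^2 p(t - 2 dtau) - ...
d = p.copy()
for k in range(1, 400 // Delta):
    d[k * Delta:] += (-r1) ** k * p[:-k * Delta]
```

Each primary drags a decaying, sign-alternating tail with period `Delta` behind it, so the trace's autocorrelation shows peaks at lags \(0, \pm\Delta\tau, \pm 2\Delta\tau, \ldots\)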
adapted from Eric Verschuur
adapted from Eric Verschuur
adapted from Eric Verschuur
The prediction-error filter has the form
\[ f_{\text{pred}}(t) = \delta(t) - a(t - \Delta\tau) \]
where \(a(t)\) is a prediction filter that estimates \(d(t)\) from \(d(t - \Delta\tau)\).
Left: ideal filter. Right: practical filter designed from data.
The prediction filter \(\mathbf{a}\) is found by minimizing
\[ \min_{\mathbf{a}} \sum_n \left| d[n] - \sum_k a[k]\, d[n - \Delta - k] \right|^2 \]
where \(\Delta\) is the prediction distance (in samples). The normal equations are
\[ \mathbf{R}_{dd}\, \hat{\mathbf{a}} = \mathbf{r}_{dd}[\Delta] \]
where \(\mathbf{R}_{dd}\) is the Toeplitz matrix of data autocorrelations and \(\mathbf{r}_{dd}[\Delta]\) is the vector of autocorrelation lags starting at \(\Delta\).
Note
Spiking deconvolution is the special case \(\Delta = 1\) (predict one sample ahead). Multiple removal uses \(\Delta = \Delta\tau / \Delta t\) (predict one multiple period ahead).
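A sketch of the whole predictive pipeline on a synthetic reverberating trace (all parameters made up; the autocorrelation method used here ignores edge effects):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

# Synthetic trace with period-Delta, sign-alternating reverberations
rng = np.random.default_rng(0)
Delta, r1, n, L = 25, 0.6, 600, 10
p = np.zeros(n)
p[rng.choice(n - 100, 8)] = rng.standard_normal(8)   # random primaries
d = p.copy()
for k in range(1, 5):
    d[k * Delta:] += (-r1) ** k * p[:-k * Delta]

# Normal equations: Toeplitz autocorrelation matrix, right-hand side at lag Delta
acf = np.correlate(d, d, mode="full")[n - 1:]        # lags 0, 1, 2, ...
a = solve_toeplitz(acf[:L], acf[Delta:Delta + L])    # prediction filter

# Prediction-error output: subtract the part predictable Delta samples ahead
pred = np.convolve(a, d)[:n]
e = d.copy()
e[Delta:] -= pred[:-Delta]

assert np.linalg.norm(e) < np.linalg.norm(d)         # reverberations attenuated
```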
Adapted from Eric Verschuur
Original shot record with shallow-water multiples (left) vs after predictive deconvolution (right). The reverberations are strongly attenuated.
Autocorrelation of input (left) shows periodic structure from multiples. After predictive deconvolution (right) the periodicity is removed — the autocorrelation approaches a spike.
The Tikhonov/\(\ell_2\) solution
\[ \hat{\mathbf{r}} = (\mathbf{W}^T\mathbf{W} + \varepsilon \mathbf{I})^{-1}\, \mathbf{W}^T \mathbf{d} \]
penalizes large values of \(r[n]\) uniformly. This produces smooth estimates that spread energy across many samples — not ideal for recovering sparse reflectivity with isolated spikes.
Replace the \(\ell_2\) penalty with an \(\ell_1\)-norm penalty that promotes sparsity (Taylor, Banks, and McCoy 1979):
\[ \min_{\mathbf{r}} \|\mathbf{d} - \mathbf{W}\mathbf{r}\|_2^2 + \lambda \|\mathbf{r}\|_1 \]
where \(\|\mathbf{r}\|_1 = \sum_n |r[n]|\) and \(\lambda > 0\) controls the sparsity level.
The \(\ell_1\) penalty drives most coefficients \(r[n]\) to exactly zero, keeping only the significant reflectors.
\(\ell_2\) (Tikhonov) recovery is smooth and smeared. \(\ell_1\) (sparsity-promoting) recovery is closer to the true sparse reflectivity — it correctly identifies reflector locations.
The \(\ell_1\) problem is convex but not differentiable at zero. It can be solved with iterative shrinkage-thresholding (ISTA and its accelerated variant FISTA), ADMM, or linear programming in the basis-pursuit formulation.
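A minimal ISTA sketch for this objective (toy wavelet and reflectivity; \(\lambda\), the step size, and the iteration count are illustrative choices):

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)
w = np.array([1.0, -0.6, 0.2])                 # toy wavelet
n = 60
r_true = np.zeros(n)
r_true[[10, 25, 40]] = [1.0, -0.8, 0.6]
W = toeplitz(np.concatenate([w, np.zeros(n - 1)]),
             np.concatenate([[w[0]], np.zeros(n - 1)]))
d = W @ r_true + 0.01 * rng.standard_normal(W.shape[0])

lam = 0.05
step = 1.0 / np.linalg.norm(W, 2) ** 2         # 1/L for the smooth part
r = np.zeros(n)
for _ in range(500):
    z = r - step * (W.T @ (W @ r - d))         # gradient step on 0.5||d - Wr||^2
    r = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold

support = np.flatnonzero(np.abs(r) > 0.1)
print(support)                                  # recovered reflector locations
```

The soft-threshold step is what sets most entries of \(\mathbf{r}\) exactly to zero, producing the spiky estimates the slide describes.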
Tip
The connection to compressed sensing guarantees exact recovery under mild conditions on \(\mathbf{W}\) and the sparsity of \(\mathbf{r}\).
Foundations
Minimum-phase decon
Predictive decon
Sparsity-promoting decon
This lecture was prepared with the assistance of Claude (Anthropic) and validated by Felix J. Herrmann.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
