We'll need to know a wavelet in the time and space domain
whose amplitude spectrum is $\sqrt{k_r}$
(so its power spectrum is $k_r$).
Do not mistake this for the helix derivative (Claerbout, 1998),
whose power spectrum is $k_r^2$.
What we need to use here is the square root of the helix derivative.
Let the (unknown) wavelet with
amplitude spectrum $\sqrt{k_r}$
be known as G.
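The spectral requirement on G can be illustrated numerically. The following is only a minimal sketch, not a construction given in the text: it assumes a periodic FFT grid, takes $k_r$ to be the radial wavenumber in radians per sample, and builds a zero-phase wavelet, whereas a helix application would more likely want a causal, minimum-phase factor with the same amplitude spectrum. The function names `radial_wavenumber` and `sqrt_kr_wavelet` are hypothetical.

```python
import numpy as np

def radial_wavenumber(n1, n2):
    """Radial wavenumber k_r on an n1 x n2 FFT grid (radians per sample)."""
    k1 = 2.0 * np.pi * np.fft.fftfreq(n1)
    k2 = 2.0 * np.pi * np.fft.fftfreq(n2)
    return np.hypot(k1[:, None], k2[None, :])

def sqrt_kr_wavelet(n1, n2):
    """Zero-phase wavelet whose amplitude spectrum is sqrt(k_r).

    Assumption: zero phase is chosen only for simplicity; it is not
    the causal factor that a helix filter would use.
    """
    amp = np.sqrt(radial_wavenumber(n1, n2))
    # amp is real and symmetric under k -> -k, so the inverse FFT is
    # real up to rounding error; np.real discards that residue.
    return np.real(np.fft.ifft2(amp))

g = sqrt_kr_wavelet(32, 32)

# Power spectrum of g: should equal k_r, not the k_r**2 of the
# helix derivative.
power = np.abs(np.fft.fft2(g)) ** 2
```

Comparing `power` with `radial_wavenumber(32, 32)` confirms that this wavelet has power spectrum $k_r$, the square root of the helix derivative's $k_r^2$.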
Why is this more efficient? The important point is that the PEF should estimate the minimal practical number of freely adjustable parameters. If G is a function that is lengthy in time or space, then the PEF does not need to be.
How important is this extra statistical efficiency? I don't know.