Low Power Switched Capacitor Implementation of Discrete Haar Wavelet Transform

Alfonso Chacon-Rodriguez∗, Shuo Li†, Milutin Stanačević‡, Leonardo Rivas∗, Esteban Baradin∗, and Pedro Julian‡

∗Escuela de Ingeniería Electrónica, Instituto Tecnológico de Costa Rica
†Department of Electrical and Computer Engineering, Stony Brook University
‡Instituto de Investigaciones en Ingeniería Eléctrica, IIIIE (UNS-CONICET)

Avda. Alem 1253, (8000) Bahía Blanca, Argentina
Email: alchacon,lerivas@tec.ac.cr
Email: shuoli,milutin@ece.sunysb.edu
Email: pjulian@uns.edu.ar

Abstract—A low power switched capacitor filter implementation of a Haar discrete wavelet transform is presented. The circuit is to be integrated into the pre-processing unit of an adaptive threshold detection system for environmental protection applications. For a 200 Hz sine wave input signal with an amplitude of 200 mV, the simulation results demonstrate a systematic error in the computation of the wavelet coefficients within ±1.1 mV, with a power consumption of 40 µW. The switched-capacitor circuit is implemented in a 0.5 µm CMOS technology.

Index Terms—Gunshot detection, signal processing, low power VLSI, mixed signal ASIC, Discrete Wavelet Transform, switched capacitor filters.

I. INTRODUCTION

A system for the local detection of gunshots, intended to be integrated into a wireless sensor network for environmental protection, has been proposed in [1]. The system is based on an adaptive threshold signal detection structure, typical of signal detection theory (see [2]). Six different pre-processing algorithms have been proposed and evaluated as a way to enhance the detectors, both in terms of detection efficiency and power dissipation. Results show that, for a given set of 45 gunshot recordings and 15 samples of the typical sounds found in a tropical forest, the use of detail coefficients 3, 4 and 5 from an 8-level 16-bit Haar Discrete Wavelet Transform, with a sampling rate of 48 kHz, detects 40 of the 45 gunshots evaluated, with no false detections among the 15 non-gunshot samples (an example of the detector topology is shown in Figure 1; see [1] for more details). Nonetheless, as explained in [1], a completely digital implementation of such an algorithm might be prohibitive in terms of power dissipation. Thus, a discrete, switched capacitor (SC) filter topology was proposed as a low-power alternative for the pre-processing unit. This paper describes a proposed implementation of such an SC unit, leaving the rest of the system for further work.

Section II gives a brief explanation of the fundamentals behind the filters’ proposed structure. Section III discusses the SC electronic implementation, Section IV gives some simulation results, and Section V gives the conclusions.

Fig. 1. Basic structure of a detection algorithm using a Haar DWT as a pre-processing algorithm, as described in [1]. The energy of each level's detail coefficients is calculated before it is fed to the energy sum. Energy may be estimated by a squaring or an absolute value estimator. The adaptive threshold is typically a running average or RMS estimation of the pre-processed signal. C is the threshold gain, which fine-tunes the selectivity of the detection [1], [2].

II. HAAR WAVELET SIGNAL DECOMPOSITION

Figure 2 shows a typical recursive filter bank structure that can be used to implement a Discrete Wavelet Transform (DWT) on a particular signal [3]. Here, the w_n coefficients denote the signal details, while the v_n coefficients denote the signal approximation. This structure provides a good tool for performing time-frequency analysis of the signal. However, it does not translate well into a hardware implementation. This is due to the area cost, whether digital or analog, of constructing the basic wavelet filters that perform the signal decomposition. Additionally, in the case of an analog implementation, the recursive structure implies the accumulation of errors due to offset and other parasitics from the operational amplifiers used in each stage. Haar wavelet computation, although unable to provide filtering as smooth as that of higher-order wavelets, is simple and cheap in terms of needed resources, which makes it a suitable option when high accuracy is not sought.

In the case discussed in [1], the filter bank is structured following a dyadic scale using 3500 Hz as the Nyquist frequency. A Haar scale function is related to a moving average operator with a transfer function \( H(z) = (1 + z^{-1})/\sqrt{2} \), while its wavelet function gives rise to a moving difference operator with a transfer function \( G(z) = (1 - z^{-1})/\sqrt{2} \). The \( \sqrt{2} \) factor ensures the orthonormality of the wavelet transformation, and may be replaced by another factor considering that only the analysis of the signals is required, not their reconstruction (see [1] for more details). In the discrete time domain, the Haar scaling filter is a moving average filter described as

\[
v_o(n) = \frac{1}{2}[v_{in}(n) + v_{in}(n - 1)]
\]  

(1)

while a wavelet or detail Haar filter is described by the moving difference equation

\[
w_o(n) = \frac{1}{2}[v_{in}(n - 1) - v_{in}(n)].
\]  

(2)
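As a minimal sketch of equations (1) and (2), the two operators can be written in a few lines of Python. The function names are illustrative, and the sample preceding the first input is assumed to be zero; both are assumptions made here and are not taken from [1].

```python
def haar_average(v_in):
    """Moving average of equation (1): v_o(n) = (v_in(n) + v_in(n-1)) / 2."""
    out = []
    prev = 0.0  # assumed initial condition: v_in(-1) = 0
    for x in v_in:
        out.append(0.5 * (x + prev))
        prev = x
    return out


def haar_difference(v_in):
    """Moving difference of equation (2): w_o(n) = (v_in(n-1) - v_in(n)) / 2."""
    out = []
    prev = 0.0  # assumed initial condition: v_in(-1) = 0
    for x in v_in:
        out.append(0.5 * (prev - x))
        prev = x
    return out


samples = [1.0, 1.0, 3.0, 3.0]
print(haar_average(samples))     # [0.5, 1.0, 2.0, 3.0]
print(haar_difference(samples))  # [-0.5, 0.0, -1.0, 0.0]
```

The difference output is zero wherever two consecutive samples are equal, which is the behaviour expected of a detail filter.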

Both filters are quadrature mirror and power complementary [4]. Thus, the detail coefficient of an arbitrary level is obtained by recursively descending through the steps of the ladder given in Figure 2 and then applying a detail filter to get the wanted coefficient. The SC implementation can follow the basic decomposition structure and emulate the ladder by cascading the stages, including the sub-sampling between each stage. Such a straightforward implementation not only implies excessive area because of the accumulation of stages, but also creates parasitics and offset problems. We propose a different approach to obtain the higher detail coefficients. In the proposed implementation, the higher-order detail coefficients are computed in a single step from the average signal \( cA_2 \), as described in the next section.
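For reference, the straightforward cascade described above (not the single-step approach proposed in this paper) can be sketched in Python. The sketch assumes the 1/2 scaling of equations (1) and (2) and a sub-sampling phase that pairs samples (0,1), (2,3), and so on; the function name `haar_ladder` and the pairing choice are assumptions made for illustration.

```python
def haar_ladder(signal, levels):
    """Return (details, approximation) for a cascaded Haar filter bank.

    Each stage applies the average and difference of equations (1) and (2)
    to non-overlapping sample pairs, which is equivalent to filtering and
    then keeping every second output (sub-sampling by two).
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break  # not enough samples left for another stage
        pairs = [(approx[i], approx[i + 1])
                 for i in range(0, len(approx) - 1, 2)]
        # older sample minus newer sample, matching the sign of equation (2)
        details.append([0.5 * (a - b) for a, b in pairs])
        approx = [0.5 * (a + b) for a, b in pairs]
    return details, approx


details, approx = haar_ladder([4.0, 2.0, 6.0, 6.0, 1.0, 3.0, 5.0, 5.0], 3)
print(details)  # [[1.0, 0.0, -1.0, 0.0], [-1.5, -1.5], [0.5]]
print(approx)   # [4.0]
```

Every level in this sketch depends on the output of the previous one, which is the stage accumulation that the cascade implies in hardware.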

III. CIRCUIT IMPLEMENTATION

The DWT filter bank is implemented using sampled-data switched-capacitor circuits. A complete block diagram of the simplified structure is shown in Figure 3. The advantage of this realization is the application of correlated double sampling (CDS) to significantly reduce common-mode offsets and \( 1/f \) noise. A cascaded inverter, biased in the subthreshold regime, is used as the high-gain amplifier in these SC circuits, supporting a high density of integration and low power consumption (see [5], [6]).

A. Average Operator Circuit

The differential structure of the average operator is shown in Figure 4. In this circuit, a capacitor ratio of \( C_2 = 2C_1 \) guarantees proper operation. \( V_{os} \) represents the offset voltage of the high-gain cascaded amplifier. \( V_{mid} \) is a DC reference voltage \( (V_{DD}/2) \).

The timing of the clock signals is shown in Figure 5(a). \( \phi_1 \) and \( \phi_2 \) are complementary non-overlapping clock signals. \( \phi_{1e} \) is a replica of \( \phi_1 \) whose falling edge precedes the falling edge of \( \phi_1 \). Resolving the circuit operation according to the clock timing, we get

\[
cA_{2P}(n) = \frac{1}{2}[v_{in}(n - \frac{1}{2}) + v_{in}(n - \frac{3}{2})]
\]

\[
cA_{2N}(n) = -\frac{1}{2}[v_{in}(n - 1) + v_{in}(n - 2)]
\]

\[
cA_2(n) = \frac{1}{2}[v_{in}(n - \frac{1}{2}) + v_{in}(n - 1) + v_{in}(n - \frac{3}{2}) + v_{in}(n - 2)]
\]  

(3)

To eliminate the effects of interim output, a sample and hold circuit is added after each computation stage.
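Equation (3) can be checked numerically with a short Python sketch. It assumes the input is stored on a half-period grid, so that `v_half[k]` holds \( v_{in}(k/2) \); this storage convention and the function name `ca2` are assumptions made for illustration.

```python
def ca2(v_half, n):
    """Evaluate equation (3) at integer index n (requires n >= 2).

    v_half[k] is assumed to hold v_in(k / 2), i.e. the input sampled on a
    half-period grid, so that v_in(n - 1/2) is v_half[2 * n - 1].
    """
    if n < 2 or 2 * n - 1 >= len(v_half):
        raise IndexError("n outside the range covered by v_half")
    ca2_p = 0.5 * (v_half[2 * n - 1] + v_half[2 * n - 3])   # cA_2P(n)
    ca2_n = -0.5 * (v_half[2 * n - 2] + v_half[2 * n - 4])  # cA_2N(n)
    return ca2_p - ca2_n  # differential output cA_2(n)


constant = [1.0] * 8
ramp = [float(k) for k in range(8)]
print(ca2(constant, 2))  # 2.0
print(ca2(ramp, 3))      # 7.0
```

With the 1/2 factor as printed in (3), a constant input c yields 2c, since four samples are summed.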

B. Difference Operator Circuit

The schematic of the difference operator circuit is shown in Figure 6 using the \( cD_3 \) stage as an example. In this circuit, the capacitor ratio is $C_2 = C_1$. The size of the capacitors is 200 fF, chosen as a tradeoff between area and noise performance.

![Diagram](image)

**Fig. 6.** SC implementation of the computation of the wavelet detail coefficient $cD_3$.

The timing of the clock signals is shown in Figure 5(b). The circuit performs the moving difference function, providing

\[
\begin{aligned}
cD_3(n) &= V_{inP}(n-1) - V_{inN}(n-1) - [V_{inP}(n-3) - V_{inN}(n-3)] \\
        &= cA_2(n) - cA_2(n-2)
\end{aligned}
\]

(4)

The decomposition of the higher order detail coefficients $cD_4$ and $cD_5$ is:

\[
cD_4(n) = cA_2(n) + cA_2(n-2) - cA_2(n-4) - cA_2(n-8)
\]

(5)

\[
cD_5(n) = cA_2(n) + cA_2(n-2) + cA_2(n-4) + cA_2(n-8) - cA_2(n-10) - cA_2(n-12) - cA_2(n-14) - cA_2(n-16)
\]

(6)

and can be implemented in a single SC stage similar to the one depicted in Figure 6 for the implementation of $cD_3$.
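The single-step computation of equation (4) can be sketched in Python as a signed sum of delayed \( cA_2 \) samples. The helper `detail_from_ca2` and its argument names are hypothetical; the delays are passed in as arguments, so the delay sets of (5) and (6) can be supplied exactly as printed.

```python
def detail_from_ca2(ca2_seq, n, plus_delays, minus_delays):
    """Sum cA_2(n - d) over plus_delays, minus the sum over minus_delays."""
    deepest = max(list(plus_delays) + list(minus_delays))
    if n - deepest < 0 or n >= len(ca2_seq):
        raise IndexError("n outside the range covered by ca2_seq")
    positive = sum(ca2_seq[n - d] for d in plus_delays)
    negative = sum(ca2_seq[n - d] for d in minus_delays)
    return positive - negative


def cd3(ca2_seq, n):
    """Equation (4): cD_3(n) = cA_2(n) - cA_2(n-2)."""
    return detail_from_ca2(ca2_seq, n, plus_delays=(0,), minus_delays=(2,))


ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
flat = [3.0] * 6
print(cd3(ramp, 4))  # 2.0
print(cd3(flat, 4))  # 0.0
```

Because every coefficient is formed directly from stored \( cA_2 \) samples, no intermediate approximation stage has to be cascaded.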

IV. SIMULATION RESULTS

The proposed SC implementation of the Haar Wavelet Transform was simulated using an input clock of 14kHz. Clock signals are generated via a Verilog coded FSM, synthesized using Mentor Graphics ADK, and then conditioned into two-phase equivalents with no overlapping, in order to minimize unwanted capacitor discharge or charge injection effects from the MOS switches [7], [8]. A 200 Hz sine wave signal with a 200 mV amplitude is used as input signal. To verify the results, the output signals from the post-layout RC extracted SPICE netlist of the circuit (including pads) were compared to the outputs of a cascaded Haar wavelet filter implemented using the Wavelet Toolbox from MATLAB®.

![Graph](image)

**Fig. 7.** Single-ended outputs for the $cA_2$ approximation coefficient of a 200 Hz sine wave signal with 200 mV amplitude. $cA_2$ is the difference between $cA_{2p}$ and $cA_{2n}$. Data from post-layout SPICE simulation.

![Graph](image)

**Fig. 8.** Output of the SC circuit for computation of the $cA_2$ coefficient for a 200Hz sine wave signal with 200mV amplitude. Output is compared against the output of MATLAB®’s Wavelet Toolbox. Error is within $\pm 300 \mu V$. 

The post-layout simulated time waveform of the \( cA_2 \) output coefficient is shown in Figure 7. Outputs of both the MATLAB® toolbox and the simulator are compared in Figures 8, 9 and 10, which also show the absolute error between the circuit simulation and the expected output of the theoretical filter bank. Among the detail coefficients, the error is bounded within \( \pm 1.1 \) mV in the worst case (\( cD_5 \) coefficient), and is slightly over \( \pm 600 \) \( \mu \)V in the best case (\( cD_3 \) coefficient).
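The comparison described above amounts to a peak absolute error between two sampled waveforms. A minimal Python sketch follows; the 14 kHz sampling grid is an assumption taken from the clock frequency, and the "simulated" sequence is a made-up placeholder (the reference plus a 0.5 mV offset) standing in for simulator data, not a result from this work.

```python
import math


def sine_samples(amplitude_v, freq_hz, sample_rate_hz, count):
    """Samples of amplitude * sin(2*pi*f*t) taken at t = k / sample_rate."""
    return [amplitude_v * math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(count)]


def peak_abs_error(simulated, reference):
    """Largest absolute difference between two equally long sequences."""
    if len(simulated) != len(reference):
        raise ValueError("sequences must have the same length")
    return max(abs(s - r) for s, r in zip(simulated, reference))


# One period of a 200 Hz, 200 mV sine on an assumed 14 kHz grid (70 samples).
reference = sine_samples(0.2, 200.0, 14000.0, 70)
# Placeholder data only: the reference shifted by a hypothetical 0.5 mV offset.
simulated = [r + 0.0005 for r in reference]

error = peak_abs_error(simulated, reference)
print(error <= 1.1e-3)  # True for this placeholder data
```

In practice the two sequences would come from the SPICE netlist output and from the reference filter bank, aligned to the same sample instants before comparison.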

![Coefficient cD3](image)

**Fig. 9.** Output of the SC circuit for computation of the \( cD_3 \) coefficient for a 200 Hz sine wave signal with 200 mV amplitude. Output is compared against the output of MATLAB®’s Wavelet Toolbox. Error is slightly over \( \pm 600 \) \( \mu \)V.

![Coefficient cD5](image)

**Fig. 10.** Output of the SC circuit for computation of the \( cD_5 \) coefficient for a 200 Hz sine wave signal with 200 mV amplitude. Output is compared against the output of MATLAB®’s Wavelet Toolbox. Error is bounded within \( \pm 1.1 \) mV.

The layout of the proposed SC circuits, implemented using a 0.5 \( \mu \)m CMOS technology process, is shown in Figure 11. Typical provisions for the layout of SC circuits [9] were taken to avoid noise interference between the digital and analog sections. The final area of the IC sent for fabrication, not including pads, is 1.944 mm\(^2\). The total power dissipation is 40 \( \mu \)W, also not including pads.

![Final layout of implemented circuit](image)

**Fig. 11.** Final layout of implemented circuit (from Mentor Graphics® IC-Station). Pads not shown.

V. CONCLUSIONS

The proposed switched capacitor implementation of the Haar wavelet transformation provides a low-power low-complexity solution for the signal decomposition in the acoustic signal detection system. The implementation is amenable to integration in a sensor node for gunshot detection in environmental monitoring.

ACKNOWLEDGMENT

A. Chacón-Rodríguez’s work was sponsored by Instituto Tecnológico de Costa Rica, Ministerio de Ciencia y Tecnología and Consejo Nacional de Investigaciones Científicas y Tecnológicas de Costa Rica. S. Li and M. Stanačević are supported by NSF CAREER Award 0846265. The authors wish to thank MOSIS for their help in integrating the prototypes.

REFERENCES


