Instrumentation and Measurement
Written by Azzam Moustapha
Saturday, 08 March 2008
Wireless Sensor Network Modeling Using
Modified Recurrent Neural Networks:
Application to Fault Detection
Azzam I. Moustapha, Member, IEEE, and Rastko R. Selmic, Member, IEEE
Abstract—This paper presents a dynamic model of wireless
sensor networks (WSNs) and its application to sensor node fault
detection. Recurrent neural networks (NNs) are used to model
a sensor node, the node’s dynamics, and interconnections with
other sensor network nodes. An NN modeling approach is used
for sensor node identification and fault detection in WSNs. The
input to the NN is chosen to include previous output samples of
the modeling sensor node and the current and previous output
samples of neighboring sensors. The model is based on a new
structure of a backpropagation-type NN. The input to the NN
and the topology of the network are based on a general nonlinear
sensor model. A simulation example, including a comparison to
the Kalman filter method, has demonstrated the effectiveness of
the proposed scheme.
Index Terms—Fault detection, modeling, neural networks
(NNs), wireless sensor networks (WSNs).
I. INTRODUCTION
WIRELESS sensor networks (WSNs) consist of the following:
a set of sensor nodes that can communicate with
each other; sensors that measure a desired physical quantity;
and the system base station for data collection, processing,
and connection to the wide area network. Modern wireless
sensor nodes have microprocessors for local data processing,
networking, and control purposes [1]. WSNs have enabled
numerous advanced monitoring and control applications in
environmental, biomedical, and many other domains.
Sensors in such networks have their own dynamics (often
nonlinear), and modeling such a sensor network is often not
trivial. Because recurrent neural networks (RNNs) consist of
interconnected dynamic nodes, we explore their similarities
with WSNs and exploit those similarities in WSN modeling.
This paper presents the modeling of WSNs using a modified
dynamic RNN.
The real motivation for WSN modeling stems from the need
for intelligent fault detection in complex distributed sensory
systems. Because sensor networks often operate in potentially
hostile and harsh environments, most of the applications are
mission critical. The sensors are often used to compute control
actions [2]–[4], where sensor faults can cause catastrophic
events. For instance, the National Aeronautics and Space Administration
was forced to abort the launch of the space shuttle
Discovery due to a failure in one of the sensors in the sensor
network of the shuttle’s external tank (the failure was discovered
through human inspection).
Manuscript received May 22, 2007; revised November 6, 2007. This work
was supported in part by the National Science Foundation EPSCoR Pfund Grant
32-0967-58159.
The authors are with the Department of Electrical Engineering, College of
Engineering and Science, Louisiana Tech University, Ruston, LA 71272 USA.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIM.2007.913803
Components such as sensors and actuators have significantly
higher fault rates than the traditional integrated semiconduc-
tor circuits-based systems. Multisensor systems need feedback
information about the health status of their nodes in order to
recover and heal from eventual faults. Such a system would
have improved reliability over existing sensor networks. Be-
cause external and internal malfunctions or excessive noise can
occur, sensor readings are somewhat uncertain in the sense
that no existing sensor will deliver accurate readings at all
times. Development of a WSN that will have the capability
of fault detection, isolation, and accommodation is needed.
Efficiency in converting data to features while consistently
accommodating the uncertainty inherent in the measurements
is a key issue in diagnosing and dealing with sensor
faults [5], [6].
Fault tolerance has emerged as an essential and urgent requirement for
modern sensory systems [6]. The traditional way of achieving
fault tolerance in dynamic systems is through hardware redun-
dancy, such as through the use of multiple sensors. However,
the multiplication of sensor devices adds cost, complexity,
and power consumption to the sensor node and the whole
network. Most recent research has concentrated on analytical
redundancy [8], [9], in which the sensor measurements
are processed analytically, and the mathematical models are
compared with the physical measurements. Koushanfar et al.
proposed a heterogeneous fault tolerance technique [10] where
one type of resource can replace another, such as computing,
storage, communication, sensing, and actuating. For example,
when the available power is limited, one can rely more on
computing, which results in transmitting less data to the base
station or to the other sensor nodes. A comparison of the actual
sensor model with the nominal model is given in [11]. In addi-
tion, a comparison with the faulty models (system with faults)
allows one to determine what types of faults have occurred
[11], [12].
0018-9456/$25.00 © 2008 IEEE. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Instead of using additional hardware in the form of multiple
sensors, we propose to use computational resources for
intelligent fault detection. A dynamic model of a sensor node is
formed based on the information from neighboring nodes in the
network. RNNs have been applied to model a network due to
their topological similarities with WSNs. The communication
uncertainties are modeled using confidence factors that are
based on the received signal strength. More detailed commu-
nication models can be applied, but this is not the topic of this
paper.
There are many techniques for nonlinear dynamic system
identification using NNs. Bernieri et al. [11], [12] compared
the output signals of an NN model and the sensor to de-
tect faults. Once the fault has been detected, the parame-
ters of the NN identifier are compared to isolate a fault.
Narendra and Parthasarathy [13] demonstrated that the NNs
can effectively be used for the identification and control of
nonlinear dynamic systems. Ahmed [14] presented a rapid
NN for the identification of unknown nonlinear dynamic sys-
tems when the inputs and outputs are accessible for measure-
ments. Straub and Shroder [15] presented a new approach to
identifying nonlinear dynamic systems based on a general-
regression NN.
In addition to NNs, the identification of a nonlinear dy-
namic system was studied using some alternative techniques.
Narendra and Gallman [16] used an iterative algorithm to
obtain the dynamics of the system from a finite-length input
and noisy output data records. This algorithm was shown to converge
for a class of inputs, including colored Gaussian processes.
Haber [17] discussed a two-step identification method of least-
squares parameter estimation based on correlation functions for
nonlinear dynamic systems with linear parameters. Lo Shiavo
and Luciano [18] presented a new, powerful, and flexible fuzzy
algorithm for nonlinear dynamic system identification.
The rest of this paper is organized as follows: Section II
briefly covers some background on RNNs and their function ap-
proximation property. In Section III, a modified RNN (MRNN)
and its model of a dynamic sensor node are introduced, includ-
ing a result that shows how neighboring nodes can be used in
sensor node modeling. Section IV describes how such a tool can
be used in the sensors’ failure detection in a distributed sensor
network. The numerical simulations are given in Section V to
show the effectiveness of the proposed modeling scheme.
II. BACKGROUND
The artificial RNNs have the ability to capture and model the
dynamic properties of nonlinear systems. The RNN nodes have
their own dynamics with interconnecting weights between the
nodes—similar to WSNs, where each sensor node has its own
dynamics. Recurrent networks also include feedback loops that
the standard NNs do not have [13], [19], [20].
Fig. 1. Two-layer RNN.
This paper uses a nonlinear output error model [13] given by
$$y(k) = F_{NN}\big(y(k-1), y(k-2), \ldots, y(k-m),\; u(k), u(k-1), u(k-2), \ldots, u(k-n)\big) \tag{1}$$
where y(k) is the NN output, the y(k − i) are previous NN
outputs, and the u(k − i) are the inputs, including the previous
inputs. The nonlinear function $F_{NN}$ is computed by using a
feed-forward neural net given in matrix form by
$$F_{NN}(x) = W^{T} \sigma(V^{T} x) \tag{2}$$
where x is the NN input, V is the first-layer weight, W is
the second-layer weight, and σ(·) is the neural net activation
function, which is usually chosen as the standard sigmoid
function. The output activation function is chosen as a
linear function. The structure of the NN is given in Fig. 1.
The two-layer NN in Fig. 1 consists of two layers of tunable
weights and thresholds and has a hidden layer and an output
layer. The hidden layer has L neurons, and the input layer is a
combination of the delayed input u(k) and the output y(k).
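As an illustration of (2), the following minimal sketch evaluates a two-layer net with a sigmoid hidden layer and a linear output. It is written in plain Python rather than the Matlab environment used in the paper, the thresholds are omitted for brevity, and the function names and toy weight values are assumptions made only for this example.

```python
import math

def sigmoid(z):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def two_layer_nn(x, V, W):
    """Evaluate F_NN(x) = W^T sigma(V^T x) for a scalar output.

    x: list of n inputs
    V: n x L first-layer weights (V[i][j] links input i to hidden neuron j)
    W: list of L second-layer weights (linear output activation)
    """
    n_hidden = len(W)
    hidden = [
        sigmoid(sum(V[i][j] * x[i] for i in range(len(x))))
        for j in range(n_hidden)
    ]
    return sum(W[j] * hidden[j] for j in range(n_hidden))

# Toy values, chosen only for illustration (not from the paper).
x = [0.5, -1.0]
V = [[0.0, 1.0], [0.0, 0.5]]
W = [2.0, -1.0]
y = two_layer_nn(x, V, W)
```

With these toy weights, both hidden neurons receive a zero net input, so each outputs 0.5 and the linear output layer combines them as 2(0.5) − 1(0.5).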
Many well-known results indicate that any sufficiently
smooth function can be approximated arbitrarily closely on
a compact set using a two-layer NN with appropriate weights
[21], [22]. The layer weights V and W can be tuned. The NN
universal approximation property states that any continuous
function f can be arbitrarily well approximated using a linear
combination of sigmoidal functions, i.e.,
$$f(x) = W^{T} \sigma(V^{T} x) + \varepsilon(x) \tag{3}$$
where ε(x) is the NN approximation error. The reconstruction
error is bounded on a compact set S by $\|\varepsilon(x)\| < \varepsilon_N$. Moreover,
for any $\varepsilon_N$, one can find an NN such that $\|\varepsilon(x)\| < \varepsilon_N$
for all $x \in S$.
Given a function g(x) and a domain set $D \subset \mathbb{R}^n$, the function
is said to satisfy the Lipschitz condition on set D if
$$\|g(x) - g(y)\| \le L \|x - y\| \tag{4}$$
for any $x, y \in D$ [23]. The function is said to be globally
Lipschitz if the above condition is valid on $\mathbb{R}^n$. If the function
g is a mapping $\mathbb{R} \to \mathbb{R}$, then the condition is equivalent to
$$|g(x) - g(y)| \le L |x - y| \tag{5}$$
which states that a straight line connecting any two points of
g(x) cannot have a slope with absolute value greater than L.
Therefore, any function with an infinite slope at some point is
not Lipschitz at that point.
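The slope interpretation of (5) can be checked numerically. The sketch below, which is only an illustration and not part of the paper, computes the largest secant slope over a grid for two example functions chosen here as assumptions: tanh, which is Lipschitz with L = 1, and the square root of |x|, whose slope is unbounded at the origin.

```python
import math

def max_secant_slope(g, points):
    """Largest |g(a) - g(b)| / |a - b| over all distinct pairs in points."""
    best = 0.0
    for i, a in enumerate(points):
        for b in points[i + 1:]:
            best = max(best, abs(g(a) - g(b)) / abs(a - b))
    return best

# Uniform grid on [-3, 3] with spacing 0.01 (an arbitrary choice).
grid = [i / 100.0 for i in range(-300, 301)]

slope_tanh = max_secant_slope(math.tanh, grid)
slope_sqrt = max_secant_slope(lambda x: math.sqrt(abs(x)), grid)
```

On this grid, the secant slopes of tanh stay below 1, whereas those of the square-root function are already close to 10 and keep growing as the grid is refined around the origin, which is the behavior expected of a function that is not Lipschitz at that point.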
MOUSTAPHA AND SELMIC: WSN MODELING USING MRNNs: APPLICATION TO FAULT DETECTION 3
Fig. 2. Ad hoc RNN with topology of a wireless sensor network.
III. MODIFIED RECURRENT NEURAL NETS IN
SENSOR NETWORK MODELING
The dynamic RNNs consist of a set of dynamic nodes that
provide internal feedback to their own inputs (see Fig. 1). They
can be used to simulate and model dynamic systems, such as
a network of sensors. The WSNs consist of a large number
of sensors that in turn have their own dynamics. They interact
between themselves and the base station, which controls the
network. In multihop WSNs, the information hops from one
node to another and, finally, to the network gateway or the base
station.
To develop a dynamic model for such sensors, without loss
of generality, we assume that there is one sensor per sensor
node. More sensors per node will just increase the size of
the RNNs.
The sensor nodes can be viewed as small dynamic systems
with memory-like features. The output of one node forwards
the information to the next node (e.g., node 3 provides the input
to node 5, Fig. 2). Although the standard RNN is structured
in layers, we introduce an ad hoc RNN analogous to WSN
systems with confidence factors (0 ≤ Cij ≤ 1) between nodes
i and j. The confidence factor depends on the signal strength
and the data quality in communication links between the nodes.
For instance, in tuning node 2, the valuable inputs come
from node 1 and node 4, which have corresponding
confidence factors close to 1. If node 7 is not in the
coverage area of node 2, then the confidence factor is 0, and
node 7 will not directly influence node 2.
Note that the confidence factors do not provide a stochastic
modeling of the communication channel. The overall mod-
eling process can be divided into two phases: the learning
phase, where the NN adjusts its weights that correspond to
the healthy and N faulty models (where N is the num-
ber of fault types), and the production phase, where the current
output of the sensor node is being compared with the output
of the NN. The difference between these two signals is used
as a measure of the sensor’s health status. In case of a fault, the
NN weights (model) are compared with the faulty models to
isolate the fault. If there is no similar fault model, then the
fault bank model is updated with the new type of fault and the
corresponding model parameters, i.e., NN weights. This whole
process is repeated during the production phase.
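The fault isolation step described above can be sketched as a lookup in a bank of stored fault models. The paper does not state how the NN weights are compared, so the Euclidean distance, the tolerance value, the function name, and the fault labels below are all assumptions made for illustration.

```python
import math

def isolate_fault(weights, fault_bank, tol):
    """Return the label of the closest stored fault model, or None.

    weights: flat list of NN weights identified after a fault was detected
    fault_bank: dict mapping fault label -> flat weight list of that model
    tol: largest Euclidean distance still counted as "similar" (assumed)
    """
    best_label, best_dist = None, float("inf")
    for label, ref in fault_bank.items():
        dist = math.dist(weights, ref)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= tol else None

# Hypothetical fault bank with two known fault types.
fault_bank = {"drift": [0.9, 0.1, -0.3], "stuck": [0.0, 0.0, 1.0]}

observed = [0.85, 0.12, -0.28]
label = isolate_fault(observed, fault_bank, tol=0.2)

# No similar model: the bank is extended with the new fault type.
unknown = [5.0, 5.0, 5.0]
if isolate_fault(unknown, fault_bank, tol=0.2) is None:
    fault_bank["fault_%d" % (len(fault_bank) + 1)] = unknown
```

Any other similarity measure between weight sets could be substituted; the point is only that an unmatched fault enlarges the bank, as the text describes.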
Consider a nonlinear dynamical sensor model given by
$$y_i(k) = f_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), u_i(k)\big) \tag{6}$$
where ui (k) and yi (k) are the sensor input and output at sample
k, and fi ’s are the unknown nonlinear functions. For a sensor
to be operational and the user to be able to determine the real
sensor input, the function fi has to be invertible, i.e.,
$$u_i(k) = f_i^{-1}\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_i(k)\big). \tag{7}$$
Equation (7) indicates that to determine the physical input at the
sample k, knowledge of the present and past m sensor outputs
is required. A more general dynamic sensor model is given
by [18]
$$y_i(k) = f_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), \vec{u}_i^{\,n}(k)\big) \tag{8}$$
where $\vec{u}_i^{\,n}(k)$ is a vector of input data, $\vec{u}_i^{\,n}(k) = [u_i(k), u_i(k-1), \ldots, u_i(k-n)]$.
Similarly, for the sensor to be usable and
for the users to be able to determine the physical input values based
on the sensor outputs, the nonlinear function has to be invertible
with respect to the input signal arguments, i.e.,
$$\vec{u}_i^{\,n}(k) = f_i^{-1}\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_i(k)\big). \tag{9}$$
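To make the invertibility requirement of (6) and (7) concrete, consider a hypothetical first-order sensor (m = 1) with a saturating input nonlinearity. The model form, the parameter values a and b, and the function names are assumptions chosen for this sketch; they are not the sensor model used in the paper.

```python
import math

def sensor_output(y_prev, u, a=0.8, b=0.5):
    """Hypothetical sensor model: y(k) = a*y(k-1) + b*tanh(u(k))."""
    return a * y_prev + b * math.tanh(u)

def recover_input(y_prev, y_now, a=0.8, b=0.5):
    """Inverse of sensor_output with respect to u(k), in the sense of (7)."""
    return math.atanh((y_now - a * y_prev) / b)

y_prev, u_true = 0.1, 0.3
y_now = sensor_output(y_prev, u_true)
u_rec = recover_input(y_prev, y_now)
```

Because tanh is strictly monotonic, the present and previous outputs determine the physical input uniquely, which is the property that the later proof relies on.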
Such sensor models correspond to a general sensor model
given in [12], i.e., the Hammerstein–Wiener nonlinear feedback
dynamic sensor model (Fig. 3), which consists of a linear
dynamic block surrounded by three nonlinear static blocks [24].
Fig. 3. Linear dynamic block surrounded by three static nonlinear blocks representing a Hammerstein–Wiener dynamic sensor model.
Fig. 4. Sensor node i and its neighboring sensors $i_1, i_2, \ldots, i_{N_i}$.
It is assumed that all the sensors have models of the same
order. If that is not the case, the analysis can still be carried out
with slight modification.
Assumption 1: The sensor nodes have a nonlinear model of
the same order given by (6).
Assumption 2: The functions $f_i$ are globally Lipschitz
functions with respective Lipschitz constants $L_i$.
Note that Assumption 2 essentially states that the increment
in the value of the function is bounded for a bounded increment
in the argument of the function x.
Although the wireless sensor nodes are distributed in the
field, it is assumed that the neighboring nodes have a bounded
difference in the measured physical quantity. Mathematically,
the assumption is given as follows.
Assumption 3: The neighboring sensor nodes have measurement
events that differ by a bounded amount, i.e., for
neighboring sensor nodes a and b, we have
$$u_a(k) - u_b(k) = e_{ab}(k) \tag{10}$$
and $\|e_{ab}(k)\| < e$.
The next result shows how to model a WSN using an RNN
and how to use such a tool in the failure detection of sensor
nodes.
Theorem 1 (WSN Model Using RNNs): Given a model of
a sensor node i (6), Assumptions 1–3, and the node neighbors
$i_1, i_2, \ldots, i_{N_i}$ (see Fig. 4), the output of the
sensor node can be approximated using an RNN with inputs
consisting of the previous outputs from node i and the current
and previous outputs of its neighboring nodes, i.e.,
$$y_i(k) = \mathrm{RNN}_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m),\; y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big) + c \tag{11}$$
where j = 1, 2, . . . , Ni , and c is a small bounded constant.
Proof: From Assumption 3, it follows that
$$u_i(k) = u_{i_j}(k) + e_{i_j}(k) \tag{12}$$
where j = 1, 2, . . . , Ni. Equivalently, the input $u_i(k)$ is
given by
$$u_i(k) = \frac{1}{N_i} \sum_{j=1}^{N_i} \big(u_{i_j}(k) + e_{i_j}(k)\big). \tag{13}$$
Therefore, one has
$$y_i(k) = f_i\Big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m),\; \frac{1}{N_i} \sum_{j=1}^{N_i} \big(u_{i_j}(k) + e_{i_j}(k)\big)\Big). \tag{14}$$
Using (7), one has
$$y_i(k) = f_i\Big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m),\; \frac{1}{N_i} \sum_{j=1}^{N_i} \big[f_{i_j}^{-1}\big(y_{i_j}(k-1), y_{i_j}(k-2), \ldots, y_{i_j}(k-m), y_{i_j}(k)\big) + e_{i_j}(k)\big]\Big). \tag{15}$$
Knowing that the function fi is Lipschitz yields
$$y_i(k) = g_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m),\; y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big) + d \tag{16}$$
where j = 1, 2, . . . , Ni, and $\|d\| \le e \max(L_j)$.
Using the NN function approximation property, there is an
RNN that approximates the unknown function gi such that
$$g_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m),\; y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big) = \mathrm{RNN}_i(x) + \varepsilon_i(x) \tag{17}$$
where the vector x is given by
$$x = \big[y_i(k-1), y_i(k-2), \ldots, y_i(k-m),\; y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big]. \tag{18}$$
The bounded constant c is then given by
$$c = \varepsilon_{N_i} + e \max(L_j). \tag{19}$$
This completes the proof. ∎
This shows that the sensor node output can be approximated
by an RNN whose inputs are the m previous output samples of the
same node and the current and m previous output samples of the
neighboring sensors. The RNN approximates the sensor’s dynamics, which
can, in general, be nonlinear. The proposed method can
be applied to linear or nonlinear and dynamic or static sensor
models.
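The input vector x of (18) can be assembled from stored output histories as in the following sketch. The function name and the list-based data layout are assumptions made for this illustration; the ordering of the entries follows (18).

```python
def build_input(own_history, neighbor_histories, k, m):
    """Assemble the RNN input x for node i at sample k.

    own_history: list of y_i samples indexed by sample number
    neighbor_histories: list of lists, one y_{i_j} series per neighbor
    Returns [y_i(k-1), ..., y_i(k-m)] followed, for each neighbor, by
    [y_{i_j}(k), y_{i_j}(k-1), ..., y_{i_j}(k-m)].
    """
    if k < m:
        raise ValueError("need at least m earlier samples")
    x = [own_history[k - d] for d in range(1, m + 1)]
    for series in neighbor_histories:
        x.extend(series[k - d] for d in range(0, m + 1))
    return x

# Toy histories for one node and two neighbors (values are arbitrary).
own = [10, 11, 12, 13]
neighbors = [[20, 21, 22, 23], [30, 31, 32, 33]]
x = build_input(own, neighbors, k=3, m=2)
```

Note that the node's own current sample y_i(k) is deliberately excluded, since it is the quantity that the model is asked to reproduce.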
The previous results assume ideal communication links. In
the case where there are communication link uncertainties, the
exact value of $y_{i_j}(k)$ is not available. Instead, we use the output
values of the neighboring sensor nodes combined with the
confidence factors, i.e., $C_{ji}\, y_{i_j}(k)$. Then, the recurrent neural
net sensor node models are given by
$$y_i(k) = \mathrm{RNN}_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m),\; C_{ji}\, y_{i_j}(k), C_{ji}\, y_{i_j}(k-1), \ldots, C_{ji}\, y_{i_j}(k-m)\big) + c. \tag{20}$$
The confidence factors for sensor node i are proportional to
the signal strength between node i and its neighbors. A con-
fidence factor between neighboring nodes i and j represents a
“confidence” of sensor node i in the data generated by sensor
node j. The factor depends on certain parameters such as
the distance between the two nodes, the terrain between
nodes, topology of the sensor network, and the received sig-
nal strength. We use the received signal strength that can be
obtained from the receiving sensor nodes as a measure of
the confidence factor. The received signal strength indicator
(RSSI) has been used in practical applications as a part of the
IEEE 802.11 standard; the existing commercial sensor nodes
have this capability (Crossbow and MoteIV wireless sensor
nodes).
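One possible way to turn an RSSI reading into a confidence factor and apply it to the neighbor samples of (20) is sketched below. The paper states only that the factor is proportional to the signal strength; the linear mapping, the clamping to [0, 1], and the dBm limits used here are assumptions introduced for illustration.

```python
def confidence_from_rssi(rssi_dbm, rssi_min=-90.0, rssi_max=-40.0):
    """Map an RSSI value linearly onto [0, 1] (assumed mapping and limits)."""
    frac = (rssi_dbm - rssi_min) / (rssi_max - rssi_min)
    return min(1.0, max(0.0, frac))

def weight_neighbor_samples(samples, confidence):
    """Scale a neighbor's output samples by its confidence factor."""
    return [confidence * s for s in samples]

c_mid = confidence_from_rssi(-65.0)   # halfway between the assumed limits
c_out = confidence_from_rssi(-100.0)  # below the floor: node out of coverage
c_near = confidence_from_rssi(-30.0)  # above the ceiling: clamped to 1

weighted = weight_neighbor_samples([70.0, 72.0], c_mid)
```

A neighbor outside the coverage area thus contributes zero to the input vector, in line with the earlier remark that such a node does not directly influence the modeled node.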
IV. APPLICATION TO A SENSOR NODE FAULT DETECTION
The previous results provide a tool for approximating a
wireless sensor node output using RNNs. The method can be
applied to a wide range of nonlinear dynamic models. The motivation
for the above results stems from the need to detect faults
in a distributed wireless network of sensor nodes.
To detect possible sensor faults at the node level, we compare
the real output and the RNN approximation model. If such a
difference is larger than a threshold, then there is a fault at the
sensor.
For a sensor node i, its real output $y_i(k)$, and an RNN model
output $\mathrm{RNN}_i(k)$, if $\|\mathrm{RNN}_i(k) - y_i(k)\| \ge \eta_i$, then there is a
fault at the sensor node i.
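The threshold test above amounts to a per-sample comparison of the model output with the measured output. A minimal sketch follows; the function name, the sample values, and the threshold of 1.0 are illustrative assumptions.

```python
def detect_faults(model_outputs, sensor_outputs, eta):
    """Flag each sample where |RNN_i(k) - y_i(k)| >= eta."""
    return [
        abs(model - measured) >= eta
        for model, measured in zip(model_outputs, sensor_outputs)
    ]

# Toy temperature traces (arbitrary values).
model = [70.0, 70.5, 71.0]
sensor = [70.1, 70.4, 72.5]
flags = detect_faults(model, sensor, eta=1.0)
```

Only the last sample, where the discrepancy is 1.5, is flagged. The choice of the threshold trades sensitivity against false alarms caused by the modeling error c.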
Fig. 5(a) and (b) shows the structure of the modified recurrent
network with its inputs consisting of the delayed output signals
of the same NN and the previous and current modified output
signals from neighboring sensors. It is initially assumed that
all the confidence factors between node i and the neighboring
nodes are equal to 1. Fig. 5(a) shows the topology during the
learning phase and Fig. 5(b) during the production phase, where
a fault analyzer detects the difference between the sensor and
the MRNNs.
Fig. 5. (a) Block diagram of the system identification in the learning phase.
(b) Block diagram of the system identification in the production phase.
V. SIMULATION RESULTS
We have simulated a sensor network with 15 sensor nodes
and one sensor per node. Each sensor has two or three “visible”
neighbors. Of course, if sensor i is a neighbor of j, then the
opposite is also true.
The sensors used to generate real data are temperature
sensors. For a period of 72 h, the temperature measurements
were taken and forwarded to the base station node where the
data were collected. The differences in the data output are
small enough to justify the theoretical approach
described in Section III. The data are sampled at
1-h intervals.
Each sensor is modeled using an MRNN described in the
previous section. An MRNN node has inputs consisting of
the previous output samples of the same node and the current
and previous outputs of the neighboring sensor nodes. At first,
we assumed confidence factors equal to 1 followed by a more
realistic assumption that the confidence factors are less than 1.
In the simulation, the RNN has an input layer with eight
nodes, a hidden layer with ten nodes, and an output layer with
one node. The learning algorithm is the standard backpropa-
gation. The learning rates for the first layer and the hidden
layer are set to 0.01. The learning phase stopped after the
difference between the expected and the actual RNN output
reached a steady-state value. We used around ten iterations for
NN learning. We used Matlab 7.1 simulation software.
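The network dimensions and learning rate quoted above can be reproduced in a short training sketch. It is written in plain Python instead of Matlab, uses synthetic data in place of the recorded temperatures, omits thresholds, and initializes the weights uniformly in [−0.5, 0.5]; all of these choices, as well as the function names, are assumptions made for illustration and not the authors' implementation.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in=8, n_hidden=10, seed=0):
    """Random initial weights for an n_in x n_hidden x 1 network."""
    rng = random.Random(seed)
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    W = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return V, W

def forward(x, V, W):
    """Return the hidden activations and the linear output."""
    hidden = [
        sigmoid(sum(V[i][j] * x[i] for i in range(len(x))))
        for j in range(len(W))
    ]
    return hidden, sum(w * h for w, h in zip(W, hidden))

def train_epoch(samples, V, W, lr=0.01):
    """One pass of sample-by-sample backpropagation; returns the sum of squared errors."""
    sse = 0.0
    for x, target in samples:
        hidden, y = forward(x, V, W)
        err = y - target
        sse += err * err
        for j, h in enumerate(hidden):
            # Hidden-layer gradient uses W[j] before it is updated.
            grad_hidden = err * W[j] * h * (1.0 - h)
            W[j] -= lr * err * h
            for i, xi in enumerate(x):
                V[i][j] -= lr * grad_hidden * xi
    return sse

# Synthetic training set: 20 samples with 8 inputs each (not the paper's data).
rng = random.Random(1)
samples = []
for _ in range(20):
    x = [rng.random() for _ in range(8)]
    samples.append((x, 3.0 + 0.1 * sum(x)))

V, W = init_network()
history = [train_epoch(samples, V, W, lr=0.01) for _ in range(10)]
```

The list `history` holds the cost after each of the ten iterations, so its decrease can be inspected in the same way as the steady-state stopping criterion mentioned in the text.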
The neural net learning rate η, which has a value between
0 and 1, plays a key role in the learning process. It affects the
rate of convergence during the learning phase. For too-small
values of the learning rate, the learning process will be very
slow with a high probability of convergence. On the other hand,
when η approaches 1, the learning process is fast with a low
probability of convergence. Therefore, it is recommended to use
a moderate value of η. In addition, the number of training samples
also plays an important role in the modeling accuracy and the
generalization of the NN. By generalization, we refer to the ability of the
network to approximate the output for an input different from
the training set. The results in the simulation illustrate that.
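The effect of the learning rate can be seen on a toy problem that is much simpler than NN training: plain gradient descent on a one-dimensional quadratic cost. The cost function, its curvature, and the three values of η below are assumptions chosen only to illustrate the slow, moderate, and divergent regimes.

```python
def gradient_descent(eta, curvature=3.0, w0=1.0, steps=20):
    """Minimize 0.5 * curvature * w**2 by plain gradient descent."""
    w = w0
    for _ in range(steps):
        w -= eta * curvature * w  # gradient of the cost is curvature * w
    return w

w_small = gradient_descent(eta=0.01)    # converges, but slowly
w_moderate = gradient_descent(eta=0.3)  # converges quickly
w_large = gradient_descent(eta=0.9)     # step too large: iterates diverge
```

For this cost, each step multiplies w by (1 − 3η), so the iterates shrink only when that factor has a magnitude below 1. The actual NN cost surface is not quadratic, but the same tradeoff between speed and convergence applies.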
The initial NN weights also affect the learning time and the
convergence of the cost function (sum of errors between desired
and actual outputs). For instance, when the initial weights are
chosen near a local minimum, the cost function will converge
to that minimum, particularly for a small learning rate. When
the initial weights are chosen near a global minimum, the cost
function will converge to this global minimum. In both cases, the
choice of initial weights and the learning rate can affect the
number of iterations needed for a satisfactory NN convergence.
A data window with n + 1 samples corresponds to the cur-
rent and the n previous sensor outputs. The window size affects
the precision and accuracy of the step-ahead approximated
sensor value as well as the sensitivity of the fault detection
technique. We have chosen the window size of four samples.
Increasing the window size adds more nodes to the MRNN.
Therefore, there is a tradeoff between increased accuracy and
additional complexity of the NN and, ultimately, the duration
of the training process. In particular, for online measurement
applications, it is desirable to use a smaller window size. After
a satisfactory convergence was achieved, the validation was
provided by estimating the next (future) sensor sample.
Fig. 6. Actual output of sensor #1, its models using modified recurrent
neural network and Kalman filter, and their corresponding discrepancies with
confidence factors set to one.
Fig. 7. Evolution of the difference e(k) between the MRNN model and the
actual output of sensor #1 with confidence factors set to one.
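The data window and the one-step-ahead validation described in this section can be sketched as follows. The function name and the toy series are assumptions; the window size of four samples is the value reported in the text.

```python
def sliding_windows(series, window_size):
    """Pair each run of window_size consecutive samples with the next sample."""
    pairs = []
    for start in range(len(series) - window_size):
        window = series[start:start + window_size]
        pairs.append((window, series[start + window_size]))
    return pairs

# Toy hourly temperature series (arbitrary values).
series = [60, 61, 63, 62, 64, 66]
pairs = sliding_windows(series, window_size=4)
```

Each pair holds the window presented to the model and the future sample against which the estimate is validated. A larger window yields fewer pairs from the same record and more input nodes, which is the tradeoff noted above.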
We compared the RNN model with a Kalman filter. The
estimated value from the previous time step and the current
measurement coming from the real sensor are used as input
variables to the Kalman filter. For sensor #1, the results for
the RNN modeling and Kalman filtering techniques are shown
in Fig. 6, including a comparison of both results. The output
of the MRNN closely approximates the actual output of the
sensor with an error clearly smaller than the one produced
using the Kalman filter model. The MRNN model can certainly
better reproduce the dynamic behavior of the sensor. Likewise,
the discrepancies between the actual output of sensor #1 and
its NN and Kalman filter models with confidence
factors set to 1 are shown.
Fig. 8. Actual output of sensor #1, its models using modified recurrent neural
network and Kalman filtering technique, and their corresponding discrepancies
with confidence factors less than one.
Fig. 9. Evolution of the difference e(k) between the MRNN model and the
actual output of sensor #1 with confidence factors less than one.
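The paper does not give the state model or noise settings of the Kalman filter used for comparison. For orientation only, the following sketch shows a scalar Kalman filter that combines the previous estimate with the current measurement, assuming a random-walk state model and arbitrary noise variances q and r.

```python
def kalman_1d(measurements, q=0.01, r=0.25, x0=None, p0=1.0):
    """Scalar Kalman filter with a random-walk state model (an assumption).

    q: process noise variance, r: measurement noise variance.
    Returns the filtered estimate after each measurement.
    """
    x = measurements[0] if x0 is None else x0
    p = p0
    estimates = []
    for z in measurements:
        p = p + q                # predict: state unchanged, uncertainty grows
        gain = p / (p + r)       # Kalman gain
        x = x + gain * (z - x)   # update with the current measurement
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

steady = kalman_1d([70.0] * 5)
step = kalman_1d([0.0, 1.0], q=0.0, r=1.0, x0=0.0, p0=1.0)
```

With a constant input the estimate stays at that value, while after a step change the estimate moves only part of the way toward the new measurement, which illustrates the lag that such a filter can show on a drifting signal.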
Fig. 7 shows the evolution of the error between the NN model
and the actual sensor output during the learning phase. The error
decreases with the number of training iterations.
A more realistic assumption is to consider the confidence
factors between nodes to be less than 1. Taking C21 = 0.8, C31 =
0.6, and C41 = 0.95, the results for sensor node 1 are shown in
Figs. 8 and 9. One can notice a larger difference between the
sensor output and the MRNN model in this case, but the
result of our approach is still better than that of the Kalman filtering
technique.
To model a sensor fault, we have used a linear drift given by
d(t) = 0.2t u(t − 3), where u(t) is a unit step function, and the
time t is in hours. Fig. 10 shows the sampled output of faulty
sensor #1 when this sensor has a fault (linear drift) starting
at t0 = 3 h. Likewise, the estimated MRNN output when the
sensor was in a normal healthy mode is shown. Using the
MRNN modeling technique, the fault was successfully detected
when the fault in the sensor output reached 1 °F at t = 8 h.
Fig. 10. Output of faulty sensor #1 and its MRNN model.
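The timing of the detection can be checked with a short calculation. Read literally, the drift expression 0.2t u(t − 3) already equals 1 °F at t = 5 h, whereas a ramp that starts from zero at the fault onset, 0.2(t − 3) u(t − 3), reaches 1 °F at t = 8 h, which matches the reported detection time. The sketch below evaluates both readings. It assumes that the residual equals the drift (a model that tracks the healthy output exactly) and a detection threshold of 1 °F; neither assumption is stated explicitly in the paper.

```python
def unit_step(t):
    """Unit step u(t): 1 for t >= 0, else 0."""
    return 1.0 if t >= 0 else 0.0

def drift_shifted(t, slope=0.2, t0=3):
    """Ramp starting from zero at t0: slope * (t - t0) * u(t - t0)."""
    return slope * (t - t0) * unit_step(t - t0)

def drift_as_printed(t, slope=0.2, t0=3):
    """Literal reading of the printed formula: slope * t * u(t - t0)."""
    return slope * t * unit_step(t - t0)

def first_crossing(drift_fn, hours, eta=1.0):
    """First sampled hour at which |drift| reaches the threshold eta."""
    for t in hours:
        if abs(drift_fn(t)) >= eta:
            return t
    return None

hours = range(0, 24)  # hourly samples, as in the simulation
detected_shifted = first_crossing(drift_shifted, hours)
detected_printed = first_crossing(drift_as_printed, hours)
```

Under these assumptions, the shifted ramp crosses the threshold at the eighth hourly sample and the literal form at the fifth.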
VI. CONCLUSION
This paper has described a dynamic model of a WSN and its
application to sensor failure detection and identification. It was
shown how the NN model depends on the sensor model and
the network structure. The overall network model corresponds
to the topology of the WSN. The inputs to the NN are taken
from the node that is modeled and from neighboring nodes.
A communication confidence factor was taken into account in
the modeling. A simulation with a comparison to the Kalman
filtering technique was carried out on a network with 15 sensor
nodes. A fault in the form of a drift was introduced and was successfully
detected with the modified recurrent neural net model with no
early false alarm of the kind that could occur with the Kalman filtering
approach.
REFERENCES
[1] Crossbow. [Online]. Available: http://www.xbow.com
[2] X. Di, B. K. Gosh, X. Ning, and T. J. Tarn, “Sensor-based hybrid position/
force control of a robot manipulator in an uncalibrated environment,”
IEEE Trans. Control Syst. Technol., vol. 8, no. 4, pp. 635–645, Jul. 2000.
[3] S. E. Lysheyski, “Smart flight control surfaces with microelectrome-
chanical systems,” IEEE Trans. Aerosp. Electron. Syst., vol. 38, no. 2,
pp. 543–552, Apr. 2002.
[4] S. Katsura, Y. Matsumoto, and K. Ohnishi, “Analysis and experimental
validation of force bandwidth for force control,” in Proc. IEEE Int. Conf.
Control Syst. Technol., Dec. 2003, pp. 796–801.
[5] A. D. Pouliezos and G. S. Stavrakankis, Real Time Fault Monitoring of
Industrial Processes. Norwell, MA: Kluwer, 1994.
[6] A. Zhirabok and O. V. Preobragenskaya, “Instrument fault detection
in nonlinear dynamic systems,” in Proc. Syst., Man Cybern., 1993,
pp. 114–119.
[7] C. B. Weinstock, W. L. Heirnerdinger, H. Ihara, S. B. Johnson,
H. D. Kirrmann, J. J. Stiffler, and L. Yount, “The state of the practice in
fault tolerant systems,” in Proc. 22nd Int. Symp. Fault-Tolerant Comput.,
Jul. 1992, pp. 2–5.
[8] S. C. Lee, “Sensor value validation based on systematic exploration of the
sensor redundancy for fault diagnosis,” IEEE Trans. Syst., Man Cybern.,
vol. 24, no. 4, pp. 594–605, Apr. 1994.
[9] M. L. Leushen, J. R. Cavallaro, and I. D. Walker, “Robotic fault detection
using analytical redundancy,” in Proc. IEEE Conf. Robot. Autom., 2002,
pp. 456–463.
[10] F. Koushanfar, M. Potkonjak, and A. Sangiovanni-Vincentelli, “Fault tol-
erance techniques for wireless ad hoc sensor networks,” in Proc. IEEE
Conf. Sensors, 2002, pp. 1491–1496.
[11] A. Bernieri, M. D’Apuzzo, L. Sansone, and M. Savastano, “A neural net-
work approach for identification and fault diagnosis on dynamic systems,”
IEEE Trans. Instrum. Meas., vol. 43, no. 6, pp. 867–873, Dec. 1994.
[12] A. Bernieri, G. Betta, A. Pietrosanto, and C. Sansone, “A neural network
approach to instrument fault detection and isolation,” in Proc. IEEE Conf.
Instrum. Meas., 1994, pp. 139–144.
[13] K. S. Narendra and K. Parthasarathy, “Identification and control of dy-
namical systems using neural networks,” IEEE Trans. Neural Netw.,
vol. 1, no. 1, pp. 4–27, Mar. 1990.
[14] R. S. Ahmed, “Identification of nonlinear dynamic systems using a rapid
neural network,” in Proc. IEEE 27th Annu. Conf., Dec. 2001, vol. 3,
pp. 1734–1739.
[15] S. Straub and D. Shroder, “Identification of nonlinear dynamic systems
with recurrent neural networks and Kalman filter methods,” in Proc. IEEE
Int. Symp., May 12–15, 1996, pp. 341–344.
[16] K. Narendra and P. Gallman, “An iterative method for the identification
of nonlinear systems using a Hammerstein model,” IEEE Trans. Autom.
Control, vol. 11, no. 3, pp. 546–550, Jul. 1966.
[17] R. Haber, “Parametric identification of nonlinear dynamic systems based
on nonlinear crosscorrelation functions,” Proc. Inst. Electr. Eng.—Control
Theory Appl., vol. 135, no. 6, pp. 405–420, Nov. 1988.
[18] A. Lo Shiavo and A. M. Luciano, “Powerful and flexible fuzzy algorithm
for nonlinear dynamic system identification,” IEEE Trans. Fuzzy Syst.,
vol. 9, no. 6, pp. 828–835, Dec. 2001.
[19] S. Haykin, Neural Networks—A Comprehensive Foundation, 2nd ed.
Upper Saddle River, NJ: Prentice-Hall, 1998.
[20] F. L. Lewis, S. Jagannathan, and A. Yesilidrek, Neural Network Control
of Robot Manipulators and Nonlinear Systems. Philadelphia, PA: Taylor
& Francis, 1999.
[21] G. Cybenko, “Approximation by superpositions of a sigmoidal function,”
Math. Control Signals Syst., vol. 2, no. 4, pp. 303–314, 1989.
[22] K. Funahashi, “On the approximate realization of continuous mappings
by neural networks,” Neural Netw., vol. 2, no. 3, pp. 183–192, 1989.
[23] H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ:
Prentice-Hall, 2002.
[24] D. S. Bernstein, “Sensor performance specifications,” IEEE Control Syst.
Mag., vol. 21, no. 4, pp. 9–18, Aug. 2001.
Azzam I. Moustapha (M’98) received the B.S. degree in computer and com-
munications engineering from the American University, Beirut, Lebanon, the
M.S. degree in electrical engineering from the University of Houston, Houston,
TX, and the M.S. degree in mathematics from Louisiana Tech University,
Ruston. He is currently working toward the Ph.D. degree with the Department
of Electrical Engineering, College of Engineering and Science, Louisiana Tech
University.
Rastko R. Selmic (M’98) received the B.S. de-
gree in electrical engineering from the University of
Belgrade, Belgrade, Serbia, in 1994 and the M.S.
and Ph.D. degrees in electrical engineering from the
University of Texas, Arlington, in 1997 and 2000,
respectively.
From 2000 to 2002, he was a Lead DSP Engineer
with Signalogic, Dallas, TX, where he
designed embedded hardware and software systems
for different telecommunication and Internet ap-
plications. Since 2002, he has been an Assistant
Professor of electrical engineering with the Department of Electrical En-
gineering, College of Engineering and Science, Louisiana Tech University,
Ruston. His research interests are in wireless sensor networks, intelligent
sensors and actuators, control systems, failure detection in nonlinear systems,
and neural networks.
Dr. Selmic is an Associate Editor for IEEE TRANSACTIONS ON NEURAL
NETWORKS and the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND
CYBERNETICS.
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT 1
Wireless Sensor Network Modeling Using
Modified Recurrent Neural Networks:
Application to Fault Detection
Azzam I. Moustapha, Member, IEEE, and Rastko R. Selmic, Member, IEEE
Abstract—This paper presents a dynamic model of wireless
sensor networks (WSNs) and its application to sensor node fault
detection. Recurrent neural networks (NNs) are used to model
a sensor node, the node’s dynamics, and interconnections with
other sensor network nodes. An NN modeling approach is used
for sensor node identification and fault detection in WSNs. The
input to the NN is chosen to include previous output samples of
the modeling sensor node and the current and previous output
samples of neighboring sensors. The model is based on a new
structure of a backpropagation-type NN. The input to the NN
and the topology of the network are based on a general nonlinear
sensor model. A simulation example, including a comparison to
the Kalman filter method, has demonstrated the effectiveness of
the proposed scheme.
Index Terms—Fault detection, modeling, neural networks
(NNs), wireless sensor networks (WSNs).
I. INTRODUCTION
WIRELESS sensor networks (WSNs) consist of the following:
a set of sensor nodes that can communicate with
each other; sensors that measure a desired physical quantity;
and the system base station for data collection, processing,
and connection to the wide area network. Modern wireless
sensor nodes have microprocessors for local data processing,
networking, and control purposes [1]. WSNs have enabled
numerous advanced monitoring and control applications in
environmental, biomedical, and many other domains.
Sensors in such networks have their own dynamics (often
nonlinear), and modeling such a sensor network is often not
trivial. Because recurrent neural networks (RNNs) consist of
interconnected dynamic nodes, we explore their similarities
with WSNs and exploit those similarities in WSN modeling.
This paper presents the modeling of WSNs using a modified
dynamic RNN.
The real motivation for WSN modeling stems from the need
for intelligent fault detection in complex distributed sensory
systems. Because sensor networks often operate in potentially
hostile and harsh environments, most of the applications are
mission critical. The sensors are often used to compute control
actions [2]–[4], where sensor faults can cause catastrophic
events. For instance, the National Aeronautics and Space Administration
was forced to abort the launch of the space shuttle
Discovery due to a failure in one of the sensors in the sensor
network of the shuttle’s external tank (the failure was discovered
through human inspection).
Manuscript received May 22, 2007; revised November 6, 2007. This work
was supported in part by the National Science Foundation EPSCoR Pfund Grant
32-0967-58159.
The authors are with the Department of Electrical Engineering, College of
Engineering and Science, Louisiana Tech University, Ruston, LA 71272 USA.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIM.2007.913803
Components such as sensors and actuators have significantly
higher fault rates than the traditional integrated semiconduc-
tor circuits-based systems. Multisensor systems need feedback
information about the health status of their nodes in order to
recover and heal from eventual faults. Such a system would
have improved reliability over existing sensor networks. Be-
cause external and internal malfunctions or excessive noise can
occur, sensor readings are somewhat uncertain in the sense
that no existing sensor will deliver accurate readings at all
times. Development of a WSN that will have the capability
of fault detection, isolation, and accommodation is needed.
Efficiency in converting data to features while consistently
accommodating the uncertainty inherent in the measurements
forms a key issue for diagnosing and dealing with sensor
faults [5], [6].
Fault tolerance has emerged as essential and urgent for
modern sensory systems [6]. The traditional way of achieving
fault tolerance in dynamic systems is through hardware redundancy,
such as through the use of multiple sensors. However,
the multiplication of sensor devices adds cost, complexity,
and power consumption to the sensor node and the whole
network. Most recent research has concentrated on analytical
redundancy [8], [9], in which the sensor measurements
are processed analytically, and the mathematical models are
compared with the physical measurements. Koushanfar et al.
proposed a heterogeneous fault tolerance technique [10] where
one type of resource can replace another, such as computing,
storage, communication, sensing, and actuating. For example,
when the available power is limited, one can rely more on
computing, which results in transmitting less data to the base
station or to the other sensor nodes. A comparison of the actual
sensor model with the nominal model is given in [11]. In addi-
tion, a comparison with the faulty models (system with faults)
allows one to determine what types of faults have occurred
[11], [12].
Instead of using additional hardware in the form of mul-
tiple sensors, we propose to use computational resources for
intelligent fault detection. A dynamic model of a sensor node is
formed based on the information from neighboring nodes in the
network. RNNs have been applied to model a network due to
their topological similarities with WSNs. The communication
uncertainties are modeled using confidence factors that are
based on the received signal strength. More detailed commu-
nication models can be applied, but this is not the topic of this
paper.
There are many techniques for nonlinear dynamic system
identification using NNs. Bernieri et al. [11], [12] compared
the output signals of an NN model and the sensor to de-
tect faults. Once the fault has been detected, the parame-
ters of the NN identifier are compared to isolate a fault.
Narendra and Parthasarathy [13] demonstrated that the NNs
can effectively be used for the identification and control of
nonlinear dynamic systems. Ahmed [14] presented a rapid
NN for the identification of unknown nonlinear dynamic sys-
tems when the inputs and outputs are accessible for measure-
ments. Straub and Shroder [15] presented a new approach to
identifying nonlinear dynamic systems based on a general-
regression NN.
In addition to NNs, the identification of a nonlinear dy-
namic system was studied using some alternative techniques.
Narendra and Gallman [16] used an iterative algorithm to
obtain the dynamics of the system from a finite-length input
and noisy output data records. The method was shown to converge
for a class of inputs, including colored Gaussian processes.
Haber [17] discussed a two-step identification method of least-
squares parameter estimation based on correlation functions for
nonlinear dynamic systems with linear parameters. Lo Shiavo
and Luciano [18] presented a new, powerful, and flexible fuzzy
algorithm for nonlinear dynamic system identification.
The rest of this paper is organized as follows: Section II
briefly covers some background on RNNs and their function ap-
proximation property. In Section III, a modified RNN (MRNN)
and its model of a dynamic sensor node are introduced, includ-
ing a result that shows how neighboring nodes can be used in
sensor node modeling. Section IV describes how such a tool can
be used in the sensors’ failure detection in a distributed sensor
network. The numerical simulations are given in Section V to
show the effectiveness of the proposed modeling scheme.
II. BACKGROUND
The artificial RNNs have the ability to capture and model the
dynamic properties of nonlinear systems. The RNN nodes have
their own dynamics with interconnecting weights between the
nodes—similar to WSNs, where each sensor node has its own
dynamics. Recurrent networks also include feedback loops that
the standard NNs do not have [13], [19], [20].
This paper uses a nonlinear output error model [13] given by
$$y(k) = F_{NN}\big(y(k-1), y(k-2), \ldots, y(k-m), u(k), u(k-1), u(k-2), \ldots, u(k-n)\big) \tag{1}$$
where $y(k)$ is the NN output, $y(k-i)$’s are previous NN
outputs, and $u(k-i)$’s are the inputs, including the previous
inputs. The nonlinear function $F_{NN}$ is computed by using a
feed-forward neural net given in matrix form by
$$F_{NN}(x) = W^{T} \sigma(V^{T} x) \tag{2}$$
where $x$ is the NN input, $V$ is the first-layer weight, $W$ is
the second-layer weight, and $\sigma(\cdot)$ is the neural net activation
function, which is usually chosen as the standard sigmoid
function. The output activation function is chosen as a
linear function. The structure of the NN is given in Fig. 1.
Fig. 1. Two-layer RNN.
The two-layer NN in Fig. 1 consists of two layers of tunable
weights and thresholds and has a hidden layer and an output
layer. The hidden layer has L neurons, and the input layer is a
combination of the current and delayed inputs u(k − i) and the
delayed outputs y(k − i).
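As an illustration of (1) and (2), the following minimal sketch evaluates the two-layer net with a sigmoid hidden layer and a linear output. The function names `f_nn` and `output_error_step` and the list-based weight layout are assumptions made here for illustration; thresholds are omitted and could be absorbed by appending a constant 1 to the input.

```python
import math


def sigmoid(z):
    """Standard logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def f_nn(x, V, W):
    """Evaluate F_NN(x) = W^T sigma(V^T x) for a single linear output.

    x: list of n inputs.
    V: n x L first-layer weights, indexed V[i][h].
    W: list of L second-layer weights.
    """
    n_hidden = len(W)
    hidden = [
        sigmoid(sum(V[i][h] * x[i] for i in range(len(x))))
        for h in range(n_hidden)
    ]
    return sum(W[h] * hidden[h] for h in range(n_hidden))


def output_error_step(y_hist, u_hist, V, W):
    """One step of model (1).

    y_hist: [y(k-1), ..., y(k-m)]; u_hist: [u(k), u(k-1), ..., u(k-n)].
    """
    return f_nn(list(y_hist) + list(u_hist), V, W)
```

With all first-layer weights set to zero, every hidden neuron outputs 0.5, so the result reduces to half the sum of the second-layer weights, which gives a quick sanity check of the implementation.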
Many well-known results indicate that any sufficiently
smooth function can be approximated arbitrarily closely on
a compact set using a two-layer NN with appropriate weights
[21], [22]. The layer weights V and W can be tuned. The NN
universal approximation property states that any continuous
function f can be arbitrarily well approximated using a linear
combination of sigmoidal functions, i.e.,
$$f(x) = W^{T} \sigma(V^{T} x) + \varepsilon(x) \tag{3}$$
where $\varepsilon(x)$ is the NN approximation error. The reconstruction
error is bounded on a compact set $S$ by $\|\varepsilon(x)\| < \varepsilon_N$. Moreover,
for any $\varepsilon_N$, one can find an NN such that $\|\varepsilon(x)\| < \varepsilon_N$
for all $x \in S$.
Given a function $g(x)$ and a domain set $D \subset \mathbb{R}^{n}$, the function
is said to satisfy the Lipschitz condition on set $D$ if
$$\|g(x) - g(y)\| \le L \|x - y\| \tag{4}$$
for any $x, y \in D$ [23]. The function is said to be globally
Lipschitz if the above condition is valid on $\mathbb{R}^{n}$. If the function
$g$ is mapping $\mathbb{R} \to \mathbb{R}$, then the condition is equivalent to
$$|g(x) - g(y)| \le L |x - y| \tag{5}$$
which states that a straight line connecting any two points of
g(x) cannot have a slope with absolute value greater than L.
Therefore, any function with an infinite slope at some point is
not Lipschitz at that point.
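The scalar condition (5) can be probed numerically. The sketch below, which is only an illustration and not part of the original analysis, computes the largest secant slope over a sample grid; the helper name `max_slope` and the choice of test functions are assumptions. The result is a lower bound on any Lipschitz constant on the sampled set.

```python
import math


def max_slope(g, xs):
    """Largest |g(a) - g(b)| / |a - b| over all distinct pairs in xs.

    This is a lower bound on any Lipschitz constant L of g on the sampled set.
    """
    best = 0.0
    for i, a in enumerate(xs):
        for b in xs[i + 1:]:
            if a != b:
                best = max(best, abs(g(a) - g(b)) / abs(a - b))
    return best


grid = [i / 100.0 for i in range(0, 101)]  # samples of the interval [0, 1]

# sin has derivative cos with |cos| <= 1, so its secant slopes stay below 1.
slope_sin = max_slope(math.sin, grid)

# sqrt has an infinite slope at 0, so the estimate grows as the grid is refined.
slope_sqrt = max_slope(math.sqrt, grid)
```

The square-root case mirrors the remark above: its secant slopes near the origin are unbounded, so no finite L satisfies (5) on an interval containing 0.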
Fig. 2. Ad hoc RNN with topology of a wireless sensor network.
III. MODIFIED RECURRENT NEURAL NETS IN
SENSOR NETWORK MODELING
The dynamic RNNs consist of a set of dynamic nodes that
provide internal feedback to their own inputs (see Fig. 1). They
can be used to simulate and model dynamic systems, such as
a network of sensors. The WSNs consist of a large number
of sensors that in turn have their own dynamics. They interact
between themselves and the base station, which controls the
network. In multihop WSNs, the information hops from one
node to another and, finally, to the network gateway or the base
station.
To develop a dynamic model for such sensors, without loss
of generality, we assume that there is one sensor per sensor
node. More sensors per node will just increase the size of
the RNNs.
The sensor nodes can be viewed as small dynamic systems
with memory-like features. The output of one node forwards
the information to the next node (e.g., node 3 provides the input
to node 5, Fig. 2). Although the standard RNN is structured
in layers, we introduce an ad hoc RNN analogous to WSN
systems with confidence factors ($0 \le C_{ij} \le 1$) between nodes
i and j. The confidence factor depends on the signal strength
and the data quality in communication links between the nodes.
For instance, in tuning node 2, the valuable inputs come
from node 1 and node 4, whose corresponding
confidence factors are close to 1. If node 7 is not in the
coverage area of node 2, then the confidence factor is 0, and
node 7 will not directly influence node 2.
Note that the confidence factors do not provide a stochastic
modeling of the communication channel. The overall mod-
eling process can be divided into two phases: the learning
phase, where the NN adjusts its weights that correspond to
the healthy and N faulty models (where N is the num-
ber of fault types), and the production phase, where the current
output of the sensor node is being compared with the output
of the NN. The difference between these two signals is used
as a measure of the sensor’s health status. In case of a fault, the
NN weights (model) are compared with the faulty models to
isolate the fault. If there is no similar fault model, then the
fault bank model is updated with the new type of fault and the
corresponding model parameters, i.e., NN weights. This whole
process is repeated during the production phase.
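The fault isolation step described above can be sketched as a nearest-model lookup in a fault bank. The paper does not state how NN weights are compared, so the Euclidean distance, the tolerance `tol`, and the names `isolate_fault` and `fault_bank` are all assumptions introduced for this example.

```python
import math


def weight_distance(w_a, w_b):
    """Euclidean distance between two flattened weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w_a, w_b)))


def isolate_fault(current_weights, fault_bank, tol):
    """Return the label of the closest stored fault model.

    fault_bank: dict mapping a fault label to a flattened weight vector.
    tol: hypothetical similarity tolerance (not specified in the paper).
    If no stored model is within tol, the current weights are stored
    as a new fault type and the new label is returned.
    """
    best_label, best_dist = None, float("inf")
    for label, weights in fault_bank.items():
        dist = weight_distance(current_weights, weights)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_label is not None and best_dist <= tol:
        return best_label
    new_label = "fault_%d" % (len(fault_bank) + 1)
    fault_bank[new_label] = list(current_weights)
    return new_label
```

A call with weights close to a stored model returns that model's label, while weights far from every stored model extend the bank, which corresponds to updating the fault bank with a new type of fault.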
Consider a nonlinear dynamical sensor model given by
$$y_i(k) = f_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), u_i(k)\big) \tag{6}$$
where $u_i(k)$ and $y_i(k)$ are the sensor input and output at sample
$k$, and $f_i$’s are the unknown nonlinear functions. For a sensor
to be operational and the user to be able to determine the real
sensor input, the function $f_i$ has to be invertible, i.e.,
$$u_i(k) = f_i^{-1}\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_i(k)\big). \tag{7}$$
Equation (7) indicates that to determine the physical input at the
sample k, knowledge of the present and past m sensor outputs
is required. A more general dynamic sensor model is given
by [18]
$$y_i(k) = f_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), \vec{u}_i^{\,n}(k)\big) \tag{8}$$
where $\vec{u}_i^{\,n}(k) = [u_i(k), u_i(k-1), \ldots, u_i(k-n)]$ is a vector of input data.
Similarly, for the sensor to be usable and
the users to be able to determine the physical input values based
on the sensor outputs, the nonlinear function has to be invertible
with respect to the input signal arguments, i.e.,
$$\vec{u}_i^{\,n}(k) = f_i^{-1}\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_i(k)\big). \tag{9}$$
Such sensor models correspond to a general sensor model
given in [12], i.e., the Hammerstein–Wiener nonlinear feedback
dynamic sensor model (Fig. 3), which consists of a linear
dynamic block surrounded by three nonlinear static blocks [24].
Fig. 3. Linear dynamic block surrounded by three static nonlinear blocks representing a Hammerstein–Wiener dynamic sensor model.
Fig. 4. Sensor node i and its neighboring sensors i1 , i2 , . . . , iN i .
It is assumed that all the sensors have models of the same
order. If that is not the case, the analysis can still be carried out
with slight modification.
Assumption 1: The sensor nodes have a nonlinear model of
the same order given by (6).
Assumption 2: The functions $f_i$’s are globally Lipschitz
functions with $L_i$’s being their respective Lipschitz constants.
Note that Assumption 2 essentially states that the increment
in the value of the function is bounded for a bounded increment
in the argument of the function x.
Although the wireless sensor nodes are distributed in the
field, it is assumed that the neighboring nodes have a bounded
difference in the measured physical quantity. Mathematically,
the assumption is given as follows.
Assumption 3: The neighboring sensor nodes have measurement
events that differ by a bounded constant, i.e., for
neighboring sensor nodes a and b, we have
$$u_a(k) - u_b(k) = e_{ab}(k) \tag{10}$$
and $\|e_{ab}(k)\| < e$.
The next result shows how to model a WSN using an RNN
and how to use such a tool in the failure detection of sensor
nodes.
Theorem 1 (WSN Model Using RNNs): Having a model of
a sensor node i (6), Assumptions 1–3, and the node neighbors
that include nodes $i_1, i_2, \ldots, i_{N_i}$ (see Fig. 4), the output of the
sensor node can be approximated using RNN with inputs consisting
of the previous outputs from node i and its neighboring
nodes, i.e.,
$$y_i(k) = \mathrm{RNN}_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big) + c \tag{11}$$
where $j = 1, 2, \ldots, N_i$, and $c$ is a small bounded constant.
Proof: From Assumption 3, it follows that
$$u_i(k) = u_{i_j}(k) + e_{ij}(k) \tag{12}$$
where $j = 1, 2, \ldots, N_i$. Equivalently, the input $u_i(k)$ is
given by
$$u_i(k) = \frac{1}{N_i} \sum_{j=1}^{N_i} \big(u_{i_j}(k) + e_{ij}(k)\big). \tag{13}$$
Therefore, one has
$$y_i(k) = f_i\Big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), \frac{1}{N_i} \sum_{j=1}^{N_i} \big(u_{i_j}(k) + e_{ij}(k)\big)\Big). \tag{14}$$
Using (7), one has
$$y_i(k) = f_i\Big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), \frac{1}{N_i} \sum_{j=1}^{N_i} \big(f_{i_j}^{-1}\big(y_{i_j}(k-1), y_{i_j}(k-2), \ldots, y_{i_j}(k-m), y_{i_j}(k)\big) + e_{ij}(k)\big)\Big). \tag{15}$$
Knowing that the function $f_i$ is Lipschitz yields
$$y_i(k) = g_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big) + d \tag{16}$$
where $j = 1, 2, \ldots, N_i$, and $\|d\| \le e \max(L_j)$.
Using the NN function approximation property, there is an
RNN that approximates the unknown function $g_i$ such that
$$g_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big) = \mathrm{RNN}_i(x) + \varepsilon_i(x) \tag{17}$$
where the vector $x$ is given by
$$x = \big[y_i(k-1), y_i(k-2), \ldots, y_i(k-m), y_{i_j}(k), y_{i_j}(k-1), \ldots, y_{i_j}(k-m)\big]. \tag{18}$$
The bounded constant $c$ is then given by
$$c = \varepsilon_{Ni} + e \max(L_j). \tag{19}$$
This completes the proof. ∎
This shows that the sensor node output can be approximated
by an RNN whose inputs are the m previous output samples of the
same node and the current and m previous output samples of
neighboring sensors. The RNN approximates the sensor’s dynamics, which
can, in general, be nonlinear. The proposed method can
be applied to linear and nonlinear, dynamic and static sensor
models.
The previous results assume ideal communication links. In
the case where there are communication link uncertainties, the
exact value of $y_{i_j}(k)$ is not available. Instead, we use the output
values of the neighboring sensor nodes combined with the
confidence factors, i.e., $C_{ji} y_{i_j}(k)$. Then, the recurrent neural
net sensor node models are given by
$$y_i(k) = \mathrm{RNN}_i\big(y_i(k-1), y_i(k-2), \ldots, y_i(k-m), C_{ji} y_{i_j}(k), C_{ji} y_{i_j}(k-1), \ldots, C_{ji} y_{i_j}(k-m)\big) + c. \tag{20}$$
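The input vector of (18), weighted by the confidence factors as in (20), can be assembled as in the following sketch. The function name `mrnn_input` and the convention that histories are stored with the most recent sample last are assumptions made for this example.

```python
def mrnn_input(own_history, neighbor_histories, confidences, m):
    """Assemble the regressor of (20) for node i at sample k.

    own_history: outputs of node i, most recent last, ending at y_i(k-1).
    neighbor_histories: one list per neighbor, most recent last,
        ending at the current sample y_ij(k).
    confidences: one confidence factor C_ji in [0, 1] per neighbor.
    m: number of past samples used per signal.
    """
    if len(own_history) < m:
        raise ValueError("need at least m past samples of the node")
    # y_i(k-1), ..., y_i(k-m)
    x = [own_history[-d] for d in range(1, m + 1)]
    for hist, c in zip(neighbor_histories, confidences):
        if len(hist) < m + 1:
            raise ValueError("need m + 1 samples from each neighbor")
        # C_ji * y_ij(k), C_ji * y_ij(k-1), ..., C_ji * y_ij(k-m)
        x.extend(c * hist[-1 - d] for d in range(0, m + 1))
    return x
```

Setting every confidence factor to 1 recovers the ideal-link input vector of (18), and a factor of 0 removes a neighbor's influence on the model input.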
The confidence factors for sensor node i are proportional to
the signal strength between node i and its neighbors. A con-
fidence factor between neighboring nodes i and j represents a
“confidence” of sensor node i in the data generated by sensor
node j . The factor depends on certain parameters such as
proximity and distance between two nodes, terrain between
nodes, topology of the sensor network, and the received sig-
nal strength. We use the received signal strength that can be
obtained from the receiving sensor nodes as a measure of
the confidence factor. The received signal strength indicator
(RSSI) has been used in practical applications as a part of the
IEEE 802.11 standard; the existing commercial sensor nodes
have this capability (Crossbow and MoteIV wireless sensor
nodes).
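One possible way to turn an RSSI reading into a confidence factor is a clipped linear normalization, sketched below. The paper states only that the confidence factors are proportional to the signal strength, so the mapping, the function name `confidence_from_rssi`, and the limits of -100 dBm and -40 dBm are hypothetical choices for illustration.

```python
def confidence_from_rssi(rssi_dbm, rssi_min=-100.0, rssi_max=-40.0):
    """Map an RSSI reading in dBm to a confidence factor in [0, 1].

    rssi_min and rssi_max are hypothetical calibration limits: readings at or
    below rssi_min map to 0, readings at or above rssi_max map to 1, and the
    mapping is linear in between.
    """
    if rssi_max <= rssi_min:
        raise ValueError("rssi_max must exceed rssi_min")
    c = (rssi_dbm - rssi_min) / (rssi_max - rssi_min)
    return max(0.0, min(1.0, c))
```

In practice the limits would be calibrated for the radio hardware in use; a node outside the coverage area then receives a factor of 0, consistent with the description of the confidence factors in Section III.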
IV. APPLICATION TO A SENSOR NODE FAULT DETECTION
The previous results provide a tool for approximating a
wireless sensor node output using RNNs. The method can be
applied to a wide range of nonlinear dynamic models. The motivation
for the above results stems from the need to detect faults
in a distributed wireless network of sensor nodes.
To detect possible sensor faults at the node level, we compare
the real output and the RNN approximation model. If such a
difference is larger than a threshold, then there is a fault at the
sensor.
For a sensor node i, its real output $y_i(k)$, and an RNN model
output $\mathrm{RNN}_i(k)$, if $\|\mathrm{RNN}_i(k) - y_i(k)\| \ge \eta_i$, then there is a
fault at the sensor node i.
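The threshold test above amounts to a comparison of residuals, as the following minimal sketch shows. The function name `detect_faults` and the sample values used to exercise it are illustrative assumptions; the threshold is whatever value of $\eta_i$ is chosen for the node.

```python
def detect_faults(model_outputs, sensor_outputs, eta):
    """Return the sample indices k where |RNN_i(k) - y_i(k)| >= eta.

    model_outputs: RNN model outputs RNN_i(k), one per sample.
    sensor_outputs: measured sensor outputs y_i(k), one per sample.
    eta: detection threshold for this node.
    """
    return [
        k
        for k, (r, y) in enumerate(zip(model_outputs, sensor_outputs))
        if abs(r - y) >= eta
    ]
```

A smaller threshold makes the detector more sensitive but also more prone to false alarms caused by modeling error and measurement noise.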
Fig. 5(a) and (b) shows the structure of the modified recurrent
network with its inputs consisting of the delayed output signals
of the same NN and the previous and current modified output
signals from neighboring sensors. It is initially assumed that
all the confidence factors between node i and the neighboring
nodes are equal to 1. Fig. 5(a) shows the topology during the
learning phase and Fig. 5(b) during the production phase, where
a fault analyzer detects the difference between the sensor and
the MRNNs.
Fig. 5. (a) Block diagram of the system identification in the learning phase.
(b) Block diagram of the system identification in the production phase.
V. SIMULATION RESULTS
We have simulated a sensor network with 15 sensor nodes
and one sensor per node. Each sensor has two or three “visible”
neighbors. Of course, if sensor i is a neighbor of j , then the
opposite is also true.
The sensors used to generate real data are temperature
sensors. For a period of 72 h, the temperature measurements
were taken and forwarded to the base station node where the
data were collected. The differences in the data output are
small enough to guarantee and justify the theoretical approach
described in Section III. The data are retrieved with 1 h of
sampling time.
Each sensor is modeled using an MRNN described in
Section III. An MRNN node has inputs consisting of
the previous output samples of the same node and the current
and previous outputs of the neighboring sensor nodes. At first,
we assumed confidence factors equal to 1 followed by a more
realistic assumption that the confidence factors are less than 1.
In the simulation, the RNN has an input layer with eight
nodes, a hidden layer with ten nodes, and an output layer with
one node. The learning algorithm is the standard backpropa-
gation. The learning rates for the first layer and the hidden
layer are set to 0.01. The learning phase stopped after the
difference between the expected and the actual RNN output
reached a steady-state value. We used around ten iterations for
NN learning. We used Matlab 7.1 simulation software.
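A single backpropagation update for a network of the stated size (eight inputs, ten hidden neurons, one linear output, learning rate 0.01) can be sketched as follows. This is a plain-Python illustration rather than the Matlab code used for the simulation; the weight initialization range, the random seed, the omission of thresholds, and the function names are assumptions.

```python
import math
import random


def sigmoid(z):
    """Standard logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(n_in, n_hidden, seed=0):
    """Small random initial weights (range and seed are arbitrary choices)."""
    rng = random.Random(seed)
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    W = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return V, W


def forward(x, V, W):
    """Return the hidden activations and the linear output W^T sigma(V^T x)."""
    hidden = [
        sigmoid(sum(V[i][h] * x[i] for i in range(len(x))))
        for h in range(len(W))
    ]
    return hidden, sum(w * s for w, s in zip(W, hidden))


def train_step(x, target, V, W, lr=0.01):
    """One gradient step on 0.5 * (output - target)^2.

    Updates V and W in place and returns the error before the update.
    """
    hidden, y = forward(x, V, W)
    err = y - target
    for h in range(len(W)):
        # Gradient reaching hidden neuron h, computed with the old W[h].
        grad_hidden = err * W[h] * hidden[h] * (1.0 - hidden[h])
        W[h] -= lr * err * hidden[h]
        for i in range(len(x)):
            V[i][h] -= lr * grad_hidden * x[i]
    return err
```

Repeating `train_step` over the training samples until the error settles corresponds to the stopping rule described above, where learning ends once the output error reaches a steady-state value.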
The neural net learning rate η, which has a value between
0 and 1, plays a key role in the learning process. It affects the
rate of convergence during the learning phase. For too-small
values of the learning rate, the learning process will be very
slow with a high probability of convergence. On the other hand,
when η approaches 1, the learning process is fast with a low
probability of convergence. Therefore, it is recommended to use
a moderate value of η. In addition, the number of training samples
also plays an important role in the modeling accuracy and
generalization. By generalization, we refer to the ability of the
network to approximate the output for an input different from
the training set. The results in the simulation illustrate that.
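The effect of the learning rate can be illustrated on a toy problem. The sketch below runs gradient descent on the quadratic cost $J(w) = \tfrac{1}{2} a w^2$, which is not the cost function of the paper; the curvature $a = 3$, the step count, and the function name `gradient_descent` are assumptions chosen only to make the behavior visible.

```python
def gradient_descent(eta, curvature=3.0, w0=1.0, steps=20):
    """Minimize J(w) = 0.5 * curvature * w^2 by gradient descent.

    Each update is w <- w - eta * curvature * w, so the iterate is scaled by
    (1 - eta * curvature) per step. Returns |w| after the given number of
    steps; the minimizer is w = 0.
    """
    w = w0
    for _ in range(steps):
        w -= eta * curvature * w
    return abs(w)


slow = gradient_descent(0.01)      # small rate: converges, but slowly
fast = gradient_descent(0.3)       # moderate rate: converges quickly
diverged = gradient_descent(0.9)   # large rate: the iterates grow
```

For this toy cost the iteration converges only when the per-step factor has magnitude below one, which matches the qualitative recommendation to use a moderate value of η.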
The initial NN weights also affect the learning time and the
convergence of the cost function (sum of errors between desired
and actual outputs). For instance, when the initial weights are
chosen near a local minimum, the cost function will converge
to that minimum, particularly for a small learning rate. When
the initial weights are chosen near a global minimum, the cost
function will converge to this global minimum. In both cases, the
choice of initial weights and the learning rate can affect the
number of iterations needed for a satisfactory NN convergence.
A data window with n + 1 samples corresponds to the cur-
rent and the n previous sensor outputs. The window size affects
the precision and accuracy of the step-ahead approximated
sensor value as well as the sensitivity of the fault detection
technique. We have chosen a window size of four samples.
Increasing the window size adds more nodes to the MRNN.
Therefore, there is a tradeoff between increased accuracy and
additional complexity of the NN and, ultimately, the duration
of the training process. In particular, for online measurement
applications, it is desirable to use a smaller window size. After
a satisfactory convergence was achieved, the validation was
provided by estimating the next (future) sensor sample.
Fig. 6. Actual output of sensor #1, its models using modified recurrent
neural network and Kalman filter, and their corresponding discrepancies with
confidence factors set to one.
Fig. 7. Evolution of the difference e(k) between the MRNN model and the
actual output of sensor #1 with confidence factors set to one.
We compared the RNN model with a Kalman filter. The
estimated value from the previous time step and the current
measurement coming from the real sensor are used as input
variables to the Kalman filter. For sensor #1, the results for
the RNN modeling and Kalman filtering techniques are shown
in Fig. 6, including a comparison of both results. The output
of the MRNN closely approximates the actual output of the
sensor with an error clearly smaller than the one produced
using the Kalman filter model. The MRNN model can certainly
better reproduce the dynamic behavior of the sensor. Likewise,
the discrepancies between the actual output of sensor #1 and
its resulting NN and Kalman filtering models with confidence
factors set to 1 are shown.
Fig. 8. Actual output of sensor #1, its models using modified recurrent neural
network and Kalman filtering technique, and their corresponding discrepancies
with confidence factors less than one.
Fig. 9. Evolution of the difference e(k) between the MRNN model and the
actual output of sensor #1 with confidence factors less than one.
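For reference, a scalar Kalman filter that combines the previous estimate with the current measurement can be sketched as follows. The paper does not give the state model or noise settings used in the comparison, so the random-walk state model, the variances `q` and `r`, and the function name `kalman_1d` are assumptions.

```python
def kalman_1d(measurements, q=1e-3, r=0.1, x0=None, p0=1.0):
    """Scalar Kalman filter with a random-walk model x(k) = x(k-1) + w(k).

    q: assumed process noise variance; r: assumed measurement noise variance.
    x0: initial estimate (defaults to the first measurement).
    p0: initial estimate variance.
    Returns the list of filtered estimates, one per measurement.
    """
    x = measurements[0] if x0 is None else x0
    p = p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: uncertainty grows by process noise
        gain = p / (p + r)     # Kalman gain
        x = x + gain * (z - x) # correct the previous estimate with the innovation
        p = (1.0 - gain) * p   # updated estimate variance
        estimates.append(x)
    return estimates
```

With this model the filter smooths the measurement sequence, and a step change in the measurements is followed gradually rather than immediately, at a speed set by the ratio of `q` to `r`.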
Fig. 7 shows the evolution of the error between the NN model
and the actual sensor output during the learning phase. The error
decreases with the number of training iterations.
A more realistic assumption is to consider the confidence
factors between nodes less than 1. Taking C21 = 0.8, C31 =
0.6, and C41 = 0.95, the results for sensor node 1 are shown in
Figs. 8 and 9. One can notice a larger difference between the
sensor output and the MRNN model in this case, but still, the
result of our approach is better than that of the Kalman filtering
technique.
To model a sensor fault, we have used a linear drift given by
$d(t) = 0.2\,t\,u(t - 3)$, where $u(t)$ is a unit step function, and the
time t is in hours. Fig. 10 shows the sampled output of faulty
sensor #1 when this sensor has a fault (linear drift) starting
at $t_0 = 3$ h. Likewise, the estimated MRNN output when the
sensor was in a normal healthy mode is shown. Using the
MRNN modeling technique, the fault was successfully detected
when the fault in the sensor output reached 1 °F at t = 8 h.
Fig. 10. Output of faulty sensor #1 and its MRNN model.
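The drift fault can be generated and added to a healthy signal as in the sketch below, which implements the drift expression as written in the text with hourly samples. The function names and the constant healthy signal in the usage line are assumptions for illustration and are not the measured data of the simulation.

```python
def unit_step(t):
    """Unit step function: 1 for t >= 0, otherwise 0."""
    return 1.0 if t >= 0 else 0.0


def drift(t, slope=0.2, onset=3.0):
    """Linear drift d(t) = slope * t * u(t - onset), with t in hours."""
    return slope * t * unit_step(t - onset)


def inject_drift(healthy, slope=0.2, onset=3.0):
    """Add the drift to hourly samples, where the list index is the hour."""
    return [y + drift(float(t), slope, onset) for t, y in enumerate(healthy)]


# Hypothetical constant healthy output, used only to show the injected fault.
faulty = inject_drift([70.0] * 6)
```

The faulty sequence can then be compared with the model output sample by sample, and a fault is declared once the residual crosses the chosen threshold.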
VI. CONCLUSION
This paper has described a dynamic model of a WSN and its
application to sensor failure detection and identification. It was
shown how the NN model depends on the sensor model and
the network structure. The overall network model corresponds
to the topology of the WSN. The inputs to the NN are taken
from the node that is modeled and from neighboring nodes.
A communication confidence factor was taken into account in
the modeling. A simulation with comparison to the Kalman
filtering technique was carried out on a network with 15 sensor
nodes. A fault such as a drift was introduced and was successfully
detected with the modified recurrent neural net model with no
early false alarm that could occur with the Kalman filtering
approach.
REFERENCES
[1] Crossbow. [Online]. Available: http://www.xbow.com
[2] X. Di, B. K. Ghosh, X. Ning, and T. J. Tarn, “Sensor-based hybrid position/
force control of a robot manipulator in an uncalibrated environment,”
IEEE Trans. Control Syst. Technol., vol. 8, no. 4, pp. 635–645, Jul. 2000.
[3] S. E. Lyshevski, “Smart flight control surfaces with microelectrome-
chanical systems,” IEEE Trans. Aerosp. Electron. Syst., vol. 38, no. 2,
pp. 543–552, Apr. 2002.
[4] S. Katsura, Y. Matsumoto, and K. Ohnishi, “Analysis and experimental
validation of force bandwidth for force control,” in Proc. IEEE Int. Conf.
Control Syst. Technol., Dec. 2003, pp. 796–801.
[5] A. D. Pouliezos and G. S. Stavrakakis, Real Time Fault Monitoring of
Industrial Processes. Norwell, MA: Kluwer, 1994.
[6] A. Zhirabok and O. V. Preobragenskaya, “Instrument fault detection
in nonlinear dynamic systems,” in Proc. Syst., Man Cybern., 1993,
pp. 114–119.
[7] C. B. Weinstock, W. L. Heimerdinger, H. Ihara, S. B. Johnson,
H. D. Kirrmann, J. J. Stiffler, and L. Yount, “The state of the practice in
fault tolerant systems,” in Proc. 22nd Int. Symp. Fault-Tolerant Comput.,
Jul. 1992, pp. 2–5.
[8] S. C. Lee, “Sensor value validation based on systematic exploration of the
sensor redundancy for fault diagnosis,” IEEE Trans. Syst., Man Cybern.,
vol. 24, no. 4, pp. 594–605, Apr. 1994.
[9] M. L. Leuschen, J. R. Cavallaro, and I. D. Walker, “Robotic fault detection
using analytical redundancy,” in Proc. IEEE Conf. Robot. Autom., 2002,
pp. 456–463.
[10] F. Koushanfar, M. Potkonjak, and A. Sangiovanni-Vincentelli, “Fault tol-
erance techniques for wireless ad hoc sensor networks,” in Proc. IEEE
Conf. Sensors, 2002, pp. 1491–1496.
[11] A. Bernieri, M. D’Apuzzo, L. Sansone, and M. Savastano, “A neural net-
work approach for identification and fault diagnosis on dynamic systems,”
IEEE Trans. Instrum. Meas., vol. 43, no. 6, pp. 867–873, Dec. 1994.
[12] A. Bernieri, G. Betta, A. Pietrosanto, and C. Sansone, “A neural network
approach to instrument fault detection and isolation,” in Proc. IEEE Conf.
Instrum. Meas., 1994, pp. 139–144.
[13] K. S. Narendra and K. Parthasarathy, “Identification and control of dy-
namical systems using neural networks,” IEEE Trans. Neural Netw.,
vol. 1, no. 1, pp. 4–27, Mar. 1990.
[14] R. S. Ahmed, “Identification of nonlinear dynamic systems using a rapid
neural network,” in Proc. IEEE 27th Annu. Conf., Dec. 2001, vol. 3,
pp. 1734–1739.
[15] S. Straub and D. Shroder, “Identification of nonlinear dynamic systems
with recurrent neural networks and Kalman filter methods,” in Proc. IEEE
Int. Symp., May 12–15, 1996, pp. 341–344.
[16] K. Narendra and P. Gallman, “An iterative method for the identification
of nonlinear systems using a Hammerstein model,” IEEE Trans. Autom.
Control, vol. 11, no. 3, pp. 546–550, Jul. 1966.
[17] R. Haber, “Parametric identification of nonlinear dynamic systems based
on nonlinear crosscorrelation functions,” Proc. Inst. Electr. Eng.—Control
Theory Appl., vol. 135, no. 6, pp. 405–420, Nov. 1988.
[18] A. Lo Shiavo and A. M. Luciano, “Powerful and flexible fuzzy algorithm
for nonlinear dynamic system identification,” IEEE Trans. Fuzzy Syst.,
vol. 9, no. 6, pp. 828–835, Dec. 2001.
[19] S. Haykin, Neural Networks—A Comprehensive Foundation, 2nd ed.
Upper Saddle River, NJ: Prentice-Hall, 1998.
[20] F. L. Lewis, S. Jagannathan, and A. Yesildirek, Neural Network Control
of Robot Manipulators and Nonlinear Systems. Philadelphia, PA: Taylor
& Francis, 1999.
[21] G. Cybenko, “Approximation by superpositions of a sigmoidal function,”
Math. Control Signals Syst., vol. 2, no. 4, pp. 303–314, 1989.
[22] K. Funahashi, “On the approximate realization of continuous mappings
by neural networks,” Neural Netw., vol. 2, no. 3, pp. 183–192, 1989.
[23] H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ:
Prentice-Hall, 2002.
[24] D. S. Bernstein, “Sensor performance specifications,” IEEE Control Syst.
Mag., vol. 21, no. 4, pp. 9–18, Aug. 2001.
Azzam I. Moustapha (M’98) received the B.S. degree in computer and com-
munications engineering from the American University, Beirut, Lebanon, the
M.S. degree in electrical engineering from the University of Houston, Houston,
TX, and the M.S. degree in mathematics from Louisiana Tech University,
Ruston. He is currently working toward the Ph.D. degree with the Department
of Electrical Engineering, College of Engineering and Science, Louisiana Tech
University.
Rastko R. Selmic (M’98) received the B.S. de-
gree in electrical engineering from the University of
Belgrade, Belgrade, Serbia, in 1994 and the M.S.
and Ph.D. degrees in electrical engineering from the
University of Texas, Arlington, in 1997 and 2000,
respectively.
From 2000 to 2002, he was a Lead DSP Engineer
with Signalogic, Dallas, TX, where he
designed embedded hardware and software systems
for different telecommunication and Internet ap-
plications. Since 2002, he has been an Assistant
Professor of electrical engineering with the Department of Electrical En-
gineering, College of Engineering and Science, Louisiana Tech University,
Ruston. His research interests are in wireless sensor networks, intelligent
sensors and actuators, control systems, failure detection in nonlinear systems,
and neural networks.
Dr. Selmic is an Associate Editor for IEEE TRANSACTIONS ON NEURAL
NETWORKS and the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND
CYBERNETICS.
Last Updated (Sunday, 20 April 2008)