26.5 Wear Model/Decision-Making for Sensor-Based Tool Condition Monitoring


26.5.1 Trending, Threshold

A very simple decision-making technique is to

trend features and to establish threshold values.

When a certain feature or set of features crosses a

threshold value, an estimation of the tool condition

can be made. Unfortunately, these threshold

values can only be determined experimentally.

The difficulty with this method is to determine

the correct threshold value, especially under

diverse cutting conditions. Furthermore, the

method is extremely sensitive to disturbances.

The trend of the mean feed force with increasing

flank wear is shown in Figure 26.19, and two

thresholds are shown as examples. It is clear that

this technique is not very reliable due to the large

variance in the trend.
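The two-threshold scheme described above can be sketched as follows. This is a minimal illustration only: the threshold values, the trailing moving average used to damp disturbances, and the function name classify_by_threshold are assumptions made for the example, not values or methods taken from the text.

```python
import numpy as np

def classify_by_threshold(feature, warn, replace, window=5):
    """Smooth a feature trend and compare it with two thresholds.

    Returns (labels, smoothed); labels are 0 = ok, 1 = warning, 2 = replace.
    The thresholds must be found experimentally; the values used below
    are illustrative assumptions.
    """
    feature = np.asarray(feature, dtype=float)
    smoothed = np.empty_like(feature)
    # Trailing moving average: each value uses only current and past samples.
    for i in range(feature.size):
        start = max(0, i - window + 1)
        smoothed[i] = feature[start:i + 1].mean()
    labels = np.zeros(feature.size, dtype=int)
    labels[smoothed >= warn] = 1
    labels[smoothed >= replace] = 2
    return labels, smoothed

# Synthetic normalised feature trend (not measured data).
trend = [-0.8, -0.6, -0.5, -0.1, 0.2, 0.3, 0.6, 0.9]
labels, smoothed = classify_by_threshold(trend, warn=0.0, replace=0.5, window=2)
print(labels.tolist())
```

A longer smoothing window reduces sensitivity to disturbances but delays the warning, which is one reason the choice of thresholds remains difficult under diverse cutting conditions.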

FIGURE 26.18 Comparison of correlation coefficient and SOF for feature selection. (Source: Scheffer, C. and Heyns,

P.S., Mech. Syst. Signal Process, Elsevier, 2004. With permission.)

FIGURE 26.19 Example of trend and thresholds. (Axes: flank wear VB [mm] versus normalised feature value; a warning threshold and a replacement threshold are marked.)

Vibration-Based Tool Condition Monitoring Systems 26-15

© 2005 by Taylor & Francis Group, LLC

26.5.2 Neural Networks

The use of NNs as a secondary, more sophisticated signal processing and decision-making technique is

often found in TCM applications. The simultaneous utilization of many features and the robustness

towards distorted sensor signals are two of the most attractive properties of NNs. Neural networks also

assist in the fusion of sensor information for TCM. In other words, combining features from acceleration,

AE, and force signals in an NN can result in a method that can predict the tool condition with increased

accuracy (Silva et al., 1998). The successful implementation of NNs is dependent on the proper selection

of the network structure, as well as the use of the correct training and testing methods.

It is important to make a distinction between supervised and unsupervised NN paradigms.

Unsupervised NNs are trained with input data only, and are usually used for discrete classification of

different stages of tool wear. Supervised NNs are trained with input and output data, and these are used

for continuous estimations of tool wear. Furthermore, a distinction should be made between dynamic

and static NNs. In the case of dynamic NNs, temporal (time) information is included in the network with

the aim to model a time series. This can be done explicitly by using a time-based feature as an input to the

network, or implicitly by using recurrent networks or networks with tapped delay lines (TDLs). Dynamic

networks are preferred for TCM because tool wear is time-dependent (tool wear is a monotonically

increasing parameter that is partly a function of machining time).
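The two ways of supplying temporal information to a network can be illustrated with a short sketch. The helper names add_time_feature and tapped_delay_inputs, and the sample values, are assumptions made for this example. The first helper appends machining time as an explicit input; the second builds tapped-delay-line inputs by stacking each sample with its predecessors.

```python
import numpy as np

def add_time_feature(features, dt):
    """Explicit approach: append elapsed machining time as an extra input column."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]
    t = np.arange(features.shape[0], dtype=float)[:, None] * dt
    return np.hstack([features, t])

def tapped_delay_inputs(features, n_delays):
    """Implicit approach: stack each sample with its n_delays predecessors.

    Input shape (n_samples, n_features); output shape
    (n_samples - n_delays, n_features * (n_delays + 1)),
    ordered newest sample first.
    """
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]
    n_samples = features.shape[0]
    rows = [features[n_delays - d:n_samples - d] for d in range(n_delays + 1)]
    return np.hstack(rows)

x = np.array([1, 2, 3, 4, 5])          # synthetic feature sequence
tdl = tapped_delay_inputs(x, n_delays=2)
timed = add_time_feature(x, dt=0.5)    # dt is an assumed sampling interval
print(tdl)
print(timed)
```

Either matrix can then be fed to a static network, which is what is meant by including temporal information explicitly or through delay lines.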

26.5.2.1 Unsupervised Networks

There are two basic network paradigms for unsupervised

classifications, namely adaptive resonance

theory (ART) and the self-organizing map (SOM).

ART is based on competitive learning, addressing

the stability-plasticity dilemma of NNs. The main

advantage is its ability to adapt to changing

conditions. ART networks also have self-stability

and self-organization capabilities. The SOM is

actually a data-mining method used to cluster

multidimensional data automatically. A high-dimensional

feature matrix can be displayed on a

two-dimensional grid of neurons that are arranged

in clusters with similar feature values. Clusters for new and worn tools can be formed, and these are used for

automatic classification of the tool condition. A SOM is depicted schematically in Figure 26.20.
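The clustering behaviour of a SOM can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the grid size, the linear decay of learning rate and neighbourhood width, and the synthetic two-cluster data standing in for features of new and worn tools are all chosen for the example and are not taken from the cited studies.

```python
import numpy as np

def train_som(data, grid=(4, 4), n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small self-organizing map with the classical online update."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    rows, cols = grid
    # Grid coordinates of each neuron, shape (rows * cols, 2).
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise the weights from randomly chosen training samples.
    weights = data[rng.integers(0, len(data), rows * cols)].copy()
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.1     # shrinking neighbourhood
        x = data[rng.integers(0, len(data))]
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))    # Gaussian neighbourhood on the grid
        weights += lr * h[:, None] * (x - weights)
    return weights, coords

def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    x = np.asarray(x, dtype=float)
    return int(np.argmin(np.sum((weights - x) ** 2, axis=1)))

# Synthetic two-dimensional features: one cluster per tool state (assumed data).
rng = np.random.default_rng(1)
new_tool = rng.normal(0.0, 0.2, (30, 2))
worn_tool = rng.normal(3.0, 0.2, (30, 2))
weights, coords = train_som(np.vstack([new_tool, worn_tool]))

bmu_new = best_matching_unit(weights, [0.0, 0.0])
bmu_worn = best_matching_unit(weights, [3.0, 3.0])
print(bmu_new, bmu_worn)
```

Once the neurons that respond to each cluster have been identified, a new feature vector is classified by looking up the cluster of its best-matching neuron; no wear measurements are needed during training.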

There are many practical advantages to using unsupervised networks. One is the fact that the

machining operation is not interrupted for tool wear measurements during the training phase. There is

also the advantage of practical implementation if machining conditions change very often and

appropriate training samples for supervised learning cannot be collected. Furthermore, the numerous

different combinations of tool and workpiece materials and geometries can make supervised learning

impossible. Normally, unsupervised NNs are used to identify discrete wear classes and cannot be used for

a continuous estimation of tool wear.

Silva et al. (2000) investigated the adaptability of the SOM and ART for tool wear monitoring during

turning with changing machining conditions. It was found that, with appropriate training, the methods

have enough adaptive capabilities to be employed in industrial applications. Govekar and Grabec (1994)

used the SOM for drill wear classification, where the SOM served as a kind of empirical modeler. It was

found that the adaptability of the SOM and its ability to handle noisy data makes the technique viable for

on-line TCM. Scheffer and Heyns (2000b, 2001b) showed how a TCMS can be adaptable using SOMs.

Different network sizes were compared to define discrete classes of new and worn tools. Larger networks

yielded more continuous results. The TCMS using SOMs was applied to monitoring synthetic diamond

tools for an industrial turning operation. It was found that the SOM can be used for industrial applications,

especially if tool wear measurements are not available.

FIGURE 26.20 Schematic representation of the SOM.

26-16 Vibration and Shock Handbook


Different NN paradigms were compared on a wear monitoring application for aluminum turning by

Scheffer and Heyns (2002b). It was shown that the SOM is useful to identify discrete wear classes, as

shown in Figure 26.21. If an exact value of the tool wear is required, supervised networks will yield better

results but will require proper training samples.

26.5.2.2 Supervised Networks

Common supervised NNs used for TCM are the multilayer perceptron (MLP), multilayer feedforward

(FF) network, recurrent neural network (RNN), supervised neuro-fuzzy system (NFS-S), time delay

neural network (TDNN), single layer perceptron (SLP) and the radial basis function (RBF) network. The

use of an SLP for TCM is described by Dimla et al. (1996), using the perceptron learning rule for training.

The SLP is useful to identify discrete classes of the tool condition. FF networks are usually trained with

the backpropagation algorithm. However, backpropagation should not always be the preferred choice

because other methods are known that outperform this technique in terms of training time and

generalization. The size of the hidden layers in multilayer networks should be optimized for performance.

Many contradictory statements about the use of MLP networks can be found in the literature. One of the

main problems is the selection of the number of input features, size of the network, and the number of

training examples that should be used.

A multilayer feedforward (FF) network is shown schematically in Figure 26.22. Normally, a nonlinear

activation function should be used in the first layer, and linear neurons in the subsequent layers. In the

case of the FF networks, the backpropagation algorithm is often used for training. Backpropagation is an

optimization algorithm based on steepest descent (gradient descent).
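A minimal sketch of such a network, with a nonlinear (tanh) first layer, a linear output layer, and batch gradient descent on the mean squared error, is given below. The layer size, learning rate, number of epochs, and the synthetic training data are assumptions for illustration; they are not recommendations from the cited literature.

```python
import numpy as np

def train_ff(X, y, n_hidden=5, lr=0.05, n_epochs=2000, seed=0):
    """Train a two-layer FF network (tanh hidden layer, linear output)
    with backpropagation and plain batch gradient descent."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(n_epochs):
        # Forward pass.
        h = np.tanh(X @ W1 + b1)
        out = h @ W2 + b2
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: propagate the error gradient layer by layer.
        g_out = 2.0 * err / len(X)
        gW2 = h.T @ g_out
        gb2 = g_out.sum(axis=0)
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)   # derivative of tanh
        gW1 = X.T @ g_h
        gb1 = g_h.sum(axis=0)
        # Steepest-descent update.
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1, W2, b2), losses

def predict_ff(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(np.asarray(X, dtype=float) @ W1 + b1) @ W2 + b2

# Synthetic example: one normalised feature mapped to a wear-like target.
X = np.linspace(0.0, 1.0, 40)[:, None]
y = X[:, 0] ** 2
params, losses = train_ff(X, y)
print(losses[0], losses[-1])
```

The result depends on the random initialisation (the seed argument here), which illustrates the sensitivity to initialisation mentioned in connection with backpropagation.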

The use of FF networks with the backpropagation

training rule is reported by authors such as

Zhou et al. (1995), Das et al. (1996), and

Zawada-Tomkiewicz (2001). Cutting conditions

can also be included in such networks. Lou and Lin

(1997) describe the use of an FF network using a

Kalman filter to avoid the training problems

encountered with backpropagation for a TCM

application. The proposed method is less sensitive

to the network initializations that often cause

convergence problems with backpropagation.

Monitoring a dynamic system such as a cutting

process should be done with a dynamic modeling

technique such as dynamic NN paradigms, for

example, recurrent networks, TDNNs, or explicit

FIGURE 26.21 Unsupervised approach to wear monitoring with the SOM. (Source: Scheffer, C. and Heyns,

P.S., South African Inst. Tribol. 2002. With permission.)

FIGURE 26.22 Multilayer FF network.


inclusion of temporal information in static networks. Recurrent NNs have feedback connections

from their output to their input. There are various types of recurrent NNs that are useful for

specific applications. One example is the Elman network: generally, a two-layer network

with feedback from the first-layer output to the first-layer input. This type of network can be used to

learn and model temporal patterns. A recurrent network and an Elman network are shown schematically

in Figure 26.23.
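The feedback from the first-layer output to the first-layer input can be made concrete with a forward pass through an Elman-type network. The sketch below shows only the recurrence, not training; the weight values and the function name elman_forward are assumptions chosen so that the effect of the feedback is easy to see.

```python
import numpy as np

def elman_forward(inputs, W_in, W_rec, b_h, W_out, b_out):
    """Forward pass of an Elman-type network.

    The hidden (first-layer) output of the previous time step is fed back
    to the first layer together with the current input.
    """
    h = np.zeros(W_rec.shape[0])        # context starts at zero
    outputs = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h + b_h)
        outputs.append(W_out @ h + b_out)
    return np.array(outputs)

# One input, one hidden neuron, one output; illustrative weights only.
inputs = np.array([[1.0], [0.0], [0.0]])
W_in = np.array([[1.0]])
b_h = np.zeros(1)
W_out = np.array([[1.0]])
b_out = np.zeros(1)

with_feedback = elman_forward(inputs, W_in, np.array([[0.5]]), b_h, W_out, b_out)
without_feedback = elman_forward(inputs, W_in, np.zeros((1, 1)), b_h, W_out, b_out)
print(with_feedback.ravel())
print(without_feedback.ravel())
```

With the recurrent weight set to zero the output follows the current input only; with feedback, the response to the first sample decays gradually over the following steps, which is how the network retains temporal patterns.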

Liu and Altintas (1999) report on the use of an FF network using a combination of TDLs and recurrent

connections. Machining conditions are also included. It is stated that the system was integrated into an

industrial TCMS, but was never put to use due to lack of “… robust, practical cutting force sensors …”

(Liu and Altintas 1999). Scheffer and Heyns (2002b) report on the use of an Elman NN for TCM. It was

found that the Elman network has a very smooth response and yielded better results than static NN

paradigms. It should be mentioned that the Elman network requires more time for training, but because

this is done off-line, training time should not be a criterion for evaluating NNs.

Neuro-fuzzy systems (NFS-S) attempt to combine the learning ability of NNs with the interpretation

ability of fuzzy logic. A TCMS using an NFS-S can be generated almost automatically because the fuzzy

rules can be learned by the NN. A combination of supervised and unsupervised training is used for

NFS-S. An in-process NFS-S system to monitor tool breakage was designed and implemented

successfully by Chen and Black (1997), concentrating on end milling operations. Xiaoli et al. (1997)

as well as Chungchoo and Saini (2002) also point out some of the advantages of using an NFS-S for TCM.

RBF networks are often preferred because of the convergence properties of the training algorithm.

In essence, convergence can be guaranteed and is often achieved much faster than in MLPs. The

accuracy of RBFs depends on the choice of the centers for the basis functions, which should be selected

with care. Pai et al. (2001) reported on the use of a resource allocation network (RAN) for TCM. The

RAN is an RBF network utilizing sequential learning. The RAN is compared with the MLP for wear

estimation during face milling. It was found that the RAN has faster learning ability but the MLP is

more robust.
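One way to see why RBF training can converge reliably is that, once the centers and widths are fixed, the output weights follow from a linear least-squares problem. The sketch below assumes Gaussian basis functions, evenly spaced centers, and a common width; these choices, the function names, and the synthetic data are illustrative assumptions. It is not the sequential-learning RAN of Pai et al. (2001).

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian basis function activations plus a constant bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((len(X), 1))])

def fit_rbf(X, y, centers, width):
    """Solve for the linear output weights by least squares (no iteration)."""
    w, *_ = np.linalg.lstsq(rbf_design(X, centers, width), y, rcond=None)
    return w

def predict_rbf(X, centers, width, w):
    return rbf_design(X, centers, width) @ w

# Synthetic example: one normalised feature mapped to a wear-like target.
X = np.linspace(0.0, 1.0, 30)[:, None]
y = X[:, 0] ** 2
centers = np.linspace(0.0, 1.0, 5)[:, None]   # assumed, evenly spaced centers
w = fit_rbf(X, y, centers, width=0.3)
max_error = float(np.max(np.abs(predict_rbf(X, centers, 0.3, w) - y)))
print(max_error)
```

Moving or removing centers in this example changes the fit noticeably, which reflects the dependence of accuracy on the choice of centers.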

TDNNs have delay elements in the feedforward connections, called TDLs. One advantage of

TDNNs over RNNs is that stability problems are avoided. An investigation into the inclusion of

one and two phase delays for a TCM application was reported by Venkatesh et al. (1997). Different

network sizes were also investigated, and it was found that the NNs with temporal memory generally

perform better than those without memory. It is also stated that new algorithms should be

investigated for training. Sick and Sicheneder (1997) also describe the use of TDNNs for TCM in

turning. The TDNN is compared with the MLP and a significant improvement was found when using

FIGURE 26.23 Recurrent networks: feedback connection (left) and Elman network (right). (Source: Scheffer, C. and

Heyns, P.S., South African Inst. Tribol. 2002. With permission.)


TDNNs. In another instance, Sick et al. (1998) compare the SOM, NFS-S, and MLP networks for

wear estimation. The following critical questions are used to evaluate the different NN paradigms

(Sick et al., 1998):

* Are the generalization capabilities of the NN sufficient (tested on previously unseen data)?

* What rate of correct classification can be achieved for different wear stages?

* Are the results repeatable (e.g., with a new initialization)?

In the case study presented by Sick et al. (1998), the best results were found with MLPs. It is

stated, however, that the results can be improved when using TDNNs, and such results are reported in

Sick (1998).

A novel combined approach is suggested by Sick (1998) to handle the effect of machining parameters.

An empirical model is used to normalize the data with respect to machining parameters before the data

are entered into the NN. Thus, machining parameters are not included in the NN itself. This approach

solves the extrapolation limitations encountered when an NN is tested with data recorded with

machining parameters it was not trained with. Although many authors test their NN paradigms in such

a way, NNs cannot be expected to extrapolate. NNs should instead be tested with previously unseen data

recorded with the same machining parameters they were trained with (hence an interpolation effect). This is a

problem because training and testing patterns for each condition must be supplied. However, if data can

be normalized with respect to machining parameters, training is only required for the normalized

condition. This was in effect achieved by Sick (1998). A difficulty still lies with establishing an appropriate

model, and in many cases it will also require a large number of experimental tests. A possible solution lies

in the incorporation of numerical models, for example, finite element models.
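The idea of normalizing data with respect to machining parameters before they enter the NN can be sketched with a simple empirical model. The power-law form F = k f^a d^b used below is a hypothetical stand-in chosen for illustration; it is not the model of Sick (1998), and the coefficients and data are synthetic.

```python
import numpy as np

def fit_power_law(feed, depth, force):
    """Fit the assumed model F = k * feed**a * depth**b by linear
    least squares on the logarithms."""
    feed = np.asarray(feed, dtype=float)
    depth = np.asarray(depth, dtype=float)
    force = np.asarray(force, dtype=float)
    A = np.column_stack([np.ones_like(feed), np.log(feed), np.log(depth)])
    coef, *_ = np.linalg.lstsq(A, np.log(force), rcond=None)
    return float(np.exp(coef[0])), float(coef[1]), float(coef[2])

def normalise(force, feed, depth, k, a, b):
    """Divide the measured force by the model prediction for a sharp tool,
    so that the remaining variation is not due to feed or depth of cut."""
    return np.asarray(force, dtype=float) / (
        k * np.asarray(feed, dtype=float) ** a * np.asarray(depth, dtype=float) ** b
    )

# Synthetic sharp-tool data generated from assumed coefficients.
feed = np.array([0.1, 0.2, 0.3, 0.1, 0.2, 0.3])     # mm/rev
depth = np.array([0.5, 0.5, 0.5, 1.0, 1.0, 1.0])    # mm
force = 1500.0 * feed ** 0.75 * depth ** 0.9        # N

k, a, b = fit_power_law(feed, depth, force)
normalised = normalise(force, feed, depth, k, a, b)
print(k, a, b)
```

A network trained on the normalised feature then needs training patterns for the normalised condition only; the experimental effort shifts to establishing the empirical model, which is the difficulty noted above.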

Scheffer (2002) presented another approach to tool wear monitoring of turning operations, using a

combination of static and dynamic NNs. Static networks are trained off-line to model selected features

from cutting forces. A dynamic NN that uses explicit temporal information is then trained on-line

with the particle swarm optimization algorithm (PSOA). The training goal of the dynamic NN

is to minimize the errors between the outputs of the static NNs and the on-line measurements.

The method was tested on various turning operations and was also tested on an industrial shop floor.

It was found that the method is more accurate and reliable than other NN paradigms and can be used

with cost-effective hardware (Scheffer and Heyns 2002a, 2004). The method is depicted schematically

in Figure 26.24.
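The optimization step can be illustrated with a basic particle swarm routine. The sketch below is a simplification under stated assumptions: instead of training the weights of a dynamic NN, it searches directly for a single wear value that minimizes the error between a hypothetical static model and a synthetic measurement. The function static_model, the swarm parameters, and all numbers are assumptions for the example and do not reproduce the method of Scheffer (2002).

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best particle swarm optimization."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    pos = rng.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    best_pos = pos.copy()                                   # personal bests
    best_val = np.array([objective(p) for p in pos])
    g = best_pos[np.argmin(best_val)].copy()                # global best
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = inertia * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g = best_pos[np.argmin(best_val)].copy()
    return g, float(best_val.min())

def static_model(vb):
    """Hypothetical static model: two force features as functions of wear."""
    return np.array([2.0 * vb + 0.1, vb ** 2])

measured = static_model(0.3)            # synthetic on-line measurement

def misfit(p):
    """Squared error between model output and measurement."""
    return float(np.sum((static_model(p[0]) - measured) ** 2))

g, best = pso_minimize(misfit, bounds=[(0.0, 1.0)])
print(g[0], best)
```

Because the swarm needs only objective values and no gradients, the same routine could be pointed at other error measures, which is one reason such algorithms suit on-line use.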

FIGURE 26.24 Combined static and dynamic NN approach for turning. (Source: Scheffer, C. and Heyns, P. S.,

Mech. Syst. Signal Process, Elsevier, 2004. With permission.)


26.5.3 Fuzzy Logic

Many authors have investigated the use of fuzzy logic to classify tool wear. It has been shown that fuzzy

logic systems demonstrate great potential for use in intelligent manufacturing applications. While NN

models cannot directly encode structured knowledge, it is often stated that fuzzy systems can directly

encode structured knowledge in a numerical framework. Additionally, fuzzy systems are capable of

estimating functions of a system with only a partial description of the system’s behavior.

Du et al. (2002) propose a very interesting method called transition fuzzy probability, which was

applied to a boring operation. This formulation can deal with the uncertainty of process conditions.

The method performs well because TCM has two uncertainties: that of occurrence and that of

appearance. The transition fuzzy probability solves this issue through the use of temporal information,

similar to dynamic NNs. The method was shown to outperform a backpropagation NN, although

very few details are given. It would be interesting to compare this method with dynamic NNs such

as TDNNs.

Fu et al. (1997) combined force, vibration, and AE in a fuzzy classifier for TCM during milling. Time- and

frequency-domain features were used, and it was found that combining the sensory information

achieved the best result. This is done within the fuzzy classifier. Li and Elbestawi (1996) and Kuo and

Cohen (1998) combine fuzzy modeling steps with NNs at different levels for TCM. The latter combined

force, vibration and AE in a multisensor approach with satisfactory results.
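A fuzzy classifier that fuses features from several sensors can be sketched as follows. The triangular membership functions, their breakpoints, the three wear classes, and the use of the minimum operator to combine sensors are assumptions made for this illustration; they are not the rule bases of the cited authors.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Assumed fuzzy sets on a feature normalised to the range 0..1.
WEAR_SETS = {
    "new": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "worn": (0.5, 1.0, 1.5),
}

def fuzzy_classify(features):
    """Fuse several normalised sensor features into one wear class.

    For each class the rule is: every sensor indicates this class
    (fuzzy AND, implemented with the minimum). The class with the
    highest rule strength is returned together with all strengths.
    """
    strengths = {}
    for label, (a, b, c) in WEAR_SETS.items():
        strengths[label] = min(triangular(v, a, b, c) for v in features.values())
    return max(strengths, key=strengths.get), strengths

label, strengths = fuzzy_classify({"force": 0.8, "vibration": 0.9})
print(label, strengths)
```

The rule strengths give a graded indication of confidence in each class, in contrast to the hard decision produced by a fixed threshold.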

26.5.4 Other Methods

There are also a number of other decision-making and modeling methods that have been applied to

TCM, and these include:

* Knowledge-based expert systems (Du, 1999)

* Pattern recognition algorithms (Kumar et al., 1997)

* Dempster-Shafer theory of evidence (Beynon et al., 2000)

* Hidden Markov models (Ertunc and Loparo, 2001; Ertunc et al., 2001)

Of these four approaches, only hidden Markov models have the potential to outperform NNs

and fuzzy systems. However, not enough comparable research has been conducted in this area, and it is

certainly a worthwhile topic for future research.