Determining the Frequency of Condition Monitoring tasks
A thread from the plantmaint Maintenance Discussion Group
This discussion thread took place within the plantmaint mailing list - a discussion forum for maintenance-related issues. What was the conclusion? Read on and make up your own mind!
From: "Holmes, Matthew"
Sent: Tuesday, November 16, 1999 10:55 PM
Can anyone on this list point me to a good source of published standard
frequencies, hard copy handbook(s) or on-line website(s), for Maintenance
and Condition-Based Monitoring? For example, I am looking for frequencies
of the following, for various plant equipment (valves, pumps, motors, HXs,
etc.):
- lubrication
- lubrication analysis
- vibration monitoring
- noise analysis
- thermography
- etc.
Note: Manufacturer's recommendations acceptable.
Thank you in advance for your time and efforts!
Regards,
Matthew
From: "Peter Ball"
Matthew,
Have a look at <link no longer exists>
There is enough there to get you started, but frequencies are not addressed
in detail. These are a function of equipment Criticality and can be
calculated as a function of Mean Time To Failure (MTTF).
Many CM users work on the basis of monthly checks, as it is less complex for
trend analysis.
Hope this is of assistance, and Good Luck.
Peter Ball
From: "Steve Turner"
Hi
Might I suggest that the frequencies of condition monitoring tasks are a
primary function of the rate of decay of the failure rather than the MTBF.
(Ref. RCM II, Moubray (1996); Nowlan and Heap (1978) et al.) MTBF only comes
into the equation if the inspection confidence is known and is less than
100% (MIL-STD-2173).
There are some "RCM" algorithm software systems which use MTBF to calculate
CM intervals but I find that they give varying outcomes depending on the
estimates of MTBF and the inspection confidence (and indeed other inputs
such as the cost of failure and the cost of completing the inspection.)
One needs also to ponder the equations, as they presume MTBF is fixed and
therefore that all failures are random. I am aware that there are formulae that
account for this but to me they begin to border on the ridiculous as the
analyst must be prepared not only to estimate the average life but the
failure pattern.
By far, the most practical approach to determining the best rate of
inspection is to put the question to the fitters, or whoever knows the
equipment best: "How often should the condition monitoring task be done such
that the failure will not occur unexpectedly?" Work this answer backward
and forward until the point where the respondent is confident that his
inspection rate is providing adequate prediction but not overdoing it.
Because the answer only needs to be correct to an order of magnitude, absolute
precision is not necessary. Some may be familiar with the question "What
is the PF Interval?" which is the same line of thought.
A simple assessment of hours, days, weeks or months is about the best you
will get. Note that MTBF has nothing to do with this approach at this
stage of the evaluation. MTBF estimates can be used to decide if the cost
of prevention is more or less than the cost of failure as this must presume
a life cycle cost which is dependent on MTBF.
Hope this is food for thought.
Steve
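Steve's closing point, that MTBF belongs in the cost comparison rather than in the interval itself, can be illustrated with a rough sketch. All figures and function names below are hypothetical, and the sketch assumes that every failure is caught in time and repaired on a planned basis once CM is in place, which will not hold everywhere.

```python
HOURS_PER_YEAR = 8760

def annual_run_to_failure_cost(mtbf_hours, failure_cost):
    """Expected yearly cost if the item is simply run to failure."""
    failures_per_year = HOURS_PER_YEAR / mtbf_hours
    return failures_per_year * failure_cost

def annual_cm_cost(mtbf_hours, inspection_interval_hours,
                   inspection_cost, planned_repair_cost):
    """Expected yearly cost with CM, assuming every failure is predicted
    and turned into a (cheaper) planned repair."""
    inspections_per_year = HOURS_PER_YEAR / inspection_interval_hours
    repairs_per_year = HOURS_PER_YEAR / mtbf_hours
    return (inspections_per_year * inspection_cost
            + repairs_per_year * planned_repair_cost)

# Hypothetical figures: MTBF of two years, monthly inspections.
mtbf = 17520
rtf = annual_run_to_failure_cost(mtbf, failure_cost=40000)
cm = annual_cm_cost(mtbf, inspection_interval_hours=730,
                    inspection_cost=100, planned_repair_cost=8000)

print(f"run to failure: {rtf:.0f} per year, with CM: {cm:.0f} per year")
print("CM worth doing" if cm < rtf else "CM not worth doing")
```

Note that the inspection interval is an input here, set from the rate of decay as described in the post; MTBF only decides whether the task pays for itself.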
From: "Shannon Hood"
I agree with Steve and offer some additional guidelines:
- Be wary of software that claims it can tell you how often to do your
stuff. Maintenance is an art, not a science; that's why we get better
at it with experience. Ever wondered why some countries refer to
their tradespeople as ARTisans?
- Software can be helpful but I caution reliance on some algorithm
with limited variables and constraints in its optimisation.
- I have come across several sites that have used detailed RCM
analysis and whiz-bang software packages that tell them to perform a
particular task every 23.765984 days, only to have the shop floor
reality kick in and the task gets scheduled for the first RDO every
month. A point you could probably have got to much faster (and
cheaper) if you'd asked your staff!
- I advocate full-blown RCM analysis, but only on highly critical
machines, and encourage the use of software, but within its
limitations.
- Vibration checks every 4-8 weeks are a good starting point.
- Visit your more critical stuff more often than non-critical stuff.
- The rate of decay is usually greater the faster and heavier the thing
going round and round is, so visit this stuff more often than slow
spinning, light stuff.
- If you think the bathtub applies, increase frequency soon after
commissioning and when you think the component might be nearing the
end - note that if you think the bathtub applies, you may want to
re-think doing CM in the first instance, but your site maintenance
engineer or local maintenance consultant will help here. A couple of
readings soon after commissioning can help to get a better baseline
while the m/c settles in.
- Infrared every 6 months is usually OK.
- The two drivers of decay in this area are Amperage and thermal
oscillation. So if you've got big current drawers going on and off
all the time or high amperage boards living in outside sheds, give
these some more attention than others.
- Again, do the critical stuff more often - if it's not critical, is it
worth the bucks on CM anyway? Don't get carried away with a new toy
and start CMing everything.
- Tip for infrared: Don't just limit your thinking to hot wires.
If you've got the infrared fella in or hired the gear, ask what other
applications it may have. Some different ones include checking for
leaks in fridge units, checking alignment of large mechanical
couplings and warm spots in the wrong pipes exiting/entering heat
exchangers.
- Lubrication has no hard and fast rules but the manufacturer's
manuals usually have pretty good stuff.
- Be EXTREMELY careful not to use the wrong lubricant - you can do
more harm than good if the lubrication tasks are not clear and/or the
lubes are not well labelled and arranged. Oils ain't oils!
- Obvious caution about overlubricating (not so obvious to some
process/operation/production staff, who usually find out the hard
way!)
Don't forget Steve's suggestion about getting the trades staff to
advise you on all this.
Don't forget that all the new wanky technologies are not a patch on
the best condition monitoring device ever invented - the human. Tap
into the people who are using the machine every day and notice the
rattles, smells, squeaks, drips, wiggles that are out of the ordinary.
Every one of these will help you foresee and predict failure before
it occurs.
Finally, all this CM stuff is simply attempting to predict an imminent
failure. Be sure you are taking the appropriate measures to delay
the failure as long as possible through the obvious like appropriate
lubrication, but also through dusting down cooling fins on motors,
vacuuming the distribution board and cleaning the pool of oil under
the machine so an increase in 'drip rate' is noticeable. The TPMians
will call this 'Defect Elimination' but the less educated amongst us
call it common sense.
If you play the game with some of these guidelines in mind, hopefully
the MTBF scoreboard will show your improvement.
Hope this helps
Shannon
From: David Sleeman
Matthew,
Have you tried www.coxmoor.com?
From: "Stephen Young"
Peter
With respect...
CM frequencies and equipment criticality are not related.
The decision to conduct CM might depend on the criticality of the equipment
or process, but the frequency of CM is based on the PF interval, that is, the
time between when we can detect that a failure is occurring and when the
total failure occurs.
If you use MTBF to calculate the CM frequency for age-related failures,
then in all probability 50% of your items will have failed by the time you
reach the MTBF.
If you use MTBF to calculate CM frequency for random failures, then 63.2%
of your items will have failed before you reach the MTBF.
Stephen Young
Director
The Asset Partnership
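The two percentages in Stephen's post can be checked with a short calculation. This is a sketch under stated assumptions: random failures are modelled with an exponential life distribution, and age-related failures with a normal distribution centred on the mean life (any symmetric distribution gives the same 50%; the 100-hour spread is a hypothetical value).

```python
import math
from statistics import NormalDist

def fraction_failed_exponential(t, mtbf):
    """Fraction of items failed by time t when failures are random
    (constant hazard rate, exponential life distribution)."""
    return 1.0 - math.exp(-t / mtbf)

def fraction_failed_normal(t, mean_life, sd):
    """Fraction failed by time t when lives cluster symmetrically
    around a mean (an assumed stand-in for age-related failure)."""
    return NormalDist(mu=mean_life, sigma=sd).cdf(t)

mtbf = 1000.0  # hypothetical, in hours
random_fraction = fraction_failed_exponential(mtbf, mtbf)
age_fraction = fraction_failed_normal(mtbf, mean_life=mtbf, sd=100.0)

print(f"random failures, failed by MTBF:      {random_fraction:.1%}")
print(f"age-related failures, failed by MTBF: {age_fraction:.1%}")
```

The exponential result is 1 - 1/e, which is where the 63.2% figure comes from.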
From: "Alexander (Sandy) Dunn"
Let me add another general note of agreement with both Steve and Shannon.
To try and put it as simply as possible, the criticality of an item of
equipment, and its reliability (as measured by MTBF) have nothing to do with
the frequency with which Condition Monitoring should be done (but have
everything to do with whether Condition Monitoring should be done at all).
(The only exception to this, as Steve points out, is where you are not 100%
certain that the Condition Monitoring task you are performing will, in fact,
predict the failure. In this case, you need to be able to estimate the
probability that it won't detect the failure, and in practice, in most
industrial applications, this is almost impossible - so, for all practical
purposes, forget the exception).
The only question that you need to ask yourself in determining the
appropriate frequency for a Condition Monitoring task is "How quickly does
it fail, once an incipient failure is detected?" If it fails more quickly,
then inspect more frequently.
Clearly, the speed of failure will vary, from application to application.
Consider a bearing - a more highly loaded, higher speed bearing that is
running closer to its design limits, in an aggressive environment, where
lubricant quality is suspect, will be likely to fail more quickly. It also
depends on the mode of failure of the bearing - ball faults tend to result
in bearings failing very quickly indeed, but a bearing with spalling on the
inner race may happily grind away for weeks or months. So operating context
is highly important.
Having said that, there are some general "rules of thumb" for Condition
Monitoring frequencies. Shannon has, I think, quite adequately covered
these. I would also agree with the suggestion of considering using
thermography for more than just electrical inspections. We have quite
successfully used thermography to detect conditions such as silt build up in
process water tanks, partial blockages of pipework, incorrectly fitted seals
on pumps (leading to rubbing), broken bolts on large open geared mills,
failing bearings on conveyor idlers and much more.
As far as the "bathtub" is concerned - the only effective way to deal with
this using Condition Monitoring techniques is to perform a baseline check
immediately after the item is returned to service after overhaul or repair.
This is particularly effective when combating alignment or balancing issues
by using Vibration Analysis. In some instances, it can even be used to
monitor the quality of the repair being effected. We recently had a case
where, despite the alignment on an agitator gearbox supposedly being
performed correctly, the baseline reading showed a significant alignment
problem. After some detailed investigation, (and rechecking the alignment
several times), it was discovered that the new coupling on the agitator had
been machined incorrectly, and was not concentric! Incidentally, the repair
had been performed off-site by a contractor, and no tolerances had been
specified for concentricity.
I hope this helps...
Alexander (Sandy) Dunn
Plant Maintenance Resource Center
From: "Stephen Young"
Hey guys
Have a read of Appendix 4 of John Moubray's book Reliability-centred
Maintenance II. It explains how the period for condition monitoring should
be determined.... Yes, much of the best information comes from experienced
artisans but the information does need to be applied correctly.
Stephen Young
Director
The Asset Partnership
From: "Peter Ball"
Stephen,
I understand the direction you are coming from. Appendix 4: Condition
Monitoring Techniques in RCM II, Second Edition, by John Moubray, describes the
P-F Interval quite nicely. However, it is still only the 'bathtub curve'
turned upside down in an attempt to provide better understanding of P-F
Intervals.
For those who need to have the basic model details as a negative exponential
distribution for mean time to failure (MTTF), which is what I earlier
proposed, a very well presented text would be Maintenance, Replacement and
Reliability by Professor Andrew Jardine. The book is published by Sir Isaac
Pitman in the UK, USA & Canada.
ISBN: 0 273 31654 0, and the cost is quite reasonable; or at least mine was
in 1993.
This could well provide Ken Bates with the depth of knowledge needed to
convince his management, as cost of inspections is considered in the MTTF
model. It could perhaps be of use to Matthew Holmes, also.
Oh yes! Take comfort in the knowledge that ALL models are flawed in some
way; some more than others. Just don't pitch too heavily for one above all
others. Other techniques which may mitigate some of the associated risks
are: FMECA, Pareto, Weibull, and LCC.
Best regards,
Peter Ball
From: "Steve Turner"
Gulp!
How is the PF Curve the Bathtub curve upside down?
Steve Turner
From: "Shannon Hood"
The PF curve has about as much to do with the S bend as the bathtub
curve!
Shannon
From: "Peter Ball"
How about a U bend?
Peter
From: "Peter Ball"
Err Hum!
How about if you turn it 'bottom-side up' instead?
Both commence at infancy, and progress through life with increasing decay,
culminating in ultimate failure (death even).
Substitute inspection periods with CM tests (non-invasive), and hope you are
good enough to detect an impending failure. If you get an adverse report
then reduce your frequency. If all appears to be going well within limits of
acceptability, then extend the frequencies.
Naturally you will not be doing this if the item is not Critical, as
pragmatic management will not like paying for something that is not really
necessary in their view.
Peter
From: "Stephen Young"
Peter
Interesting thought about the inverted bath tub curve and when inverted the
later part of the bathtub curve could LOOK the same as the PF curve but the
bathtub curve is illustrating an increasing PROBABILITY of failure with age,
while the PF curve is defining HOW LONG a potential failure will take to
become a total failure. Two quite different animals and not related.
Stephen Young
Director
The Asset Partnership
From: "Peter Ball"
Stephen,
With due respect .... I beg to differ, perhaps!
Both curves can be look-alikes, upside down or the reverse. It is only
the words attached to them that may vary. To my way of thinking there is no
valid reason why a point P (potential) cannot be imposed on the bathtub
curve, and it often is in reality. It is 'Lead Time To Failure' we are
monitoring, and both curves (or animals) can accommodate this factor.
As this interesting discussion has developed from an initial query
concerning Condition Monitoring frequency setting, perhaps it may be of some
interest to close Moubray's RCM II book, and open Patton's Maintainability
and Maintenance Management book, to Page 197 (in the 1980 edition) where a
Typical Reliability (bathtub) Curve incorporates reference to "Monitor
Condition Closely" during the Optimum Operation period (usually associated
with Random Failures). Enjoy.
Peter Ball
From: "Shannon Hood"
I'm afraid I'm with Stephen on this one (as much as it pains me to
admit it!)
The PF curve covers the time between a change in some parameter away from
the 'norm' (indicating the point of commencement of decay) and eventual
failure. This is extremely useful (if not essential) for good condition
monitoring.
For example, the PF curve for a bearing in a particular application
may be 6 weeks. In the first week, the change may be so
infinitesimally small it cannot be detected through any means. In
week 2, a small change in vibration may be detectable if an
accelerometer were used. Into week 3 (3 weeks prior to failure), an
increase in metal content may be noticeable if an oil sample were
taken. In week 4 the bearing housing may be getting noticeably warmer,
by week 5 the operator may notice a funny smell, and by week 5 and 6.5
days there's a machine making big rattling sounds and about to go...
BANG!
One can see the relative importance of various CM techniques and why
understanding the PF curve is important in deciding which technique to
use and what the frequencies are to be. Using the above example, the
time from noticeable deviation in accelerometer reading to BANG is 5
weeks. Therefore a CM frequency of every 4 weeks would theoretically
capture any imminent failure.
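The bearing example can be turned into a small check of which techniques still give warning at a given inspection interval. This is only a sketch: the two lead times are the ones stated in the example (5 weeks for the accelerometer, 3 weeks for oil sampling), the function names are hypothetical, and the other indicators are left out because the example gives no lead time for them.

```python
# Weeks of warning each technique gives before failure, as stated in the
# bearing example above.
lead_time_weeks = {
    "vibration (accelerometer)": 5,
    "oil analysis": 3,
}

def usable_techniques(lead_times, inspection_interval_weeks):
    """Techniques whose warning period is longer than the inspection
    interval, so at least one inspection must fall inside it."""
    return [name for name, lead in lead_times.items()
            if lead > inspection_interval_weeks]

def worst_case_warning(lead_time, inspection_interval_weeks):
    """Shortest warning you can end up with if the change becomes
    detectable just after an inspection was done."""
    return lead_time - inspection_interval_weeks

four_weekly = usable_techniques(lead_time_weeks, 4)
warning = worst_case_warning(lead_time_weeks["vibration (accelerometer)"], 4)

print(four_weekly)                       # only the accelerometer route works
print(f"worst-case warning: {warning} week(s)")
```

At a four-weekly interval the worst case leaves only one week to plan the repair, which is one reason to inspect at a fraction of the warning period rather than just inside it.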
The bathtub curve is completely unrelated and attempts to look at the
actual attrition (or probability of attrition if the sample number is
used as a denominator). The bathtub says (say) within the first week
of commissioning, 10% of items will fail, in the next week 3% will
fail and in weeks 3-77 1% of items will fail, then 3% will fail in
week 78 and those that still live will probably die in week 79.
Imagine we're in week 20 and an item deviates in its performance away
from 'the norm'. The time it takes to go BANG (PF interval) may be
nanoseconds (in which case it will appear in the bathtub curve in week
20) or the time it takes to go BANG may be 8 weeks, in which case it
will be part of the group that appears on the bathtub curve in week
28.
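To make the distinction concrete, the following sketch computes a bathtub-style hazard rate from population failure counts, and separately shows how a single item's PF interval only shifts the week in which it is counted. The population size and weekly counts are hypothetical, loosely following the percentages above, and the function names are illustrative.

```python
def weekly_hazard(failures_per_week, population):
    """For each week, failures divided by the items still alive at the
    start of that week (a population statistic, not a single item)."""
    survivors = population
    hazards = []
    for failed in failures_per_week:
        hazards.append(failed / survivors if survivors else 0.0)
        survivors -= failed
    return hazards

def failure_week(potential_failure_week, pf_interval_weeks):
    """Week in which one item shows up in the population failure counts."""
    return potential_failure_week + pf_interval_weeks

# Hypothetical population of 1000 items: 10% fail in week 1, 3% in week 2,
# then 1% of the original population in each of the next three weeks.
hazards = weekly_hazard([100, 30, 10, 10, 10], population=1000)
print([round(h, 4) for h in hazards])

# An item that starts to deviate in week 20 with an 8-week PF interval
# is counted among the week-28 failures.
print(failure_week(20, 8))
```

The hazard list says nothing about how long any one item takes to get from first deviation to failure; that comes only from the PF interval.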
It must be realised that the PF curve describes an individual
component in a specific application. Put a new component in the same
application or the same component in a new application and the curve
changes! A common mistake with the bathtub curve is that people
believe it describes a specific component. IT DOES NOT. I.e. it is
not describing a bearing 'wearING in', then operating normally, then
deteriorating. INSTEAD it is describing the failure probability of a
population of identical components.
Except for their geometric appearance, the PF curve (when flipped
upside down or the right way up) has no REAL relationship to the
bathtub whatsoever.
Shannon
From: "TIPS from Joseph"
Best FREQUENCY/Interval is addressed by:
http://www.oliver-group.com
click: RELCODE
Joseph
From: "Peter Ball"
Seems that my innocent little statement concerning CM frequencies has
produced some extraordinarily useful information. Shannon's remark about wanky
technologies is indeed very relevant, and has been addressed to a certain
degree by the recent publishing of the SAE Standard JA1011 for RCM. 'Let the
buyer beware'. I note that Aladon UK are now stating that their software RCM
Toolkit fully conforms to this new standard.
I have considerable reservations as to the real ability of the average
maintenance tradesman / artisan to provide significant guidance on the issue
of CM frequencies. I would even go as far as to suggest that you could ask
100 different tradesmen the question, and receive close to 100 completely
differing responses.
Regards to all,
Peter Ball
From: "Michael Doolan"
Peter,
perhaps that would be true if you asked 100 different
tradesmen; and all the suggestions from those people would be just as
relevant to your particular problem/query. Just seen from 100 different
perspectives and experience bases.
Give your tradesmen a little credit ... they're the ones actually
doing the work on this equipment and see or hear the changes in the
performance of that plant on a day-to-day basis.
Unlike "most" Engineers, the tradesmen get a feel for the
performance of particular machinery they come in frequent contact with
and as such get a better understanding of its particular characteristics,
as each machine tends to have its own "sounds, hums, or temperatures".
Even though you may have 100 of the exact same pump, for instance,
many will behave slightly differently (marginal differences in flow or
pressure, perhaps even temperature); these things most engineers don't
understand due to the fact that you're so far removed from that
environment.
Your tradesmen will know there's a problem with particular plant
especially if the rebuild/overhauls tend to be more frequent than should
be necessary. Some may not know a particular product or modification to
perform to remedy the particular fault, but that's where an Engineer
that "Listens" comes into the equation - your knowledge of current
technology is the base from which they can draw reference.
Be open to suggestions from your trades base, they have to be open to
your orders!
One thing I have noticed over the last 20 years in engineering is that
Communication between Engineers and the shop floor Tradesmen is
Critical --- unfortunately it is most often ignored or overlooked by
the management team of that business. Sad but True.
Michael Doolan
Specialist Maintenance Tradesman
From: "Steve Turner"
Agreed completely, but I'm sure Peter's comments were not intended that way.
Steve
From: "Steve Turner"
I heard on the grape vine that the SAE Standard for RCM was not well
received at the Society for Maintenance and Reliability Professionals
conference recently and is heading back for another go. Is this true? Can
anyone confirm this?
By the way, I would not agree with the statement that 100 tradespeople
would provide a different answer to the question of rates of decay or wear
etc. Obviously we need to ask the right people - trades folk will know
about bearings because every time a bearing starts to get noisy, the
management asks them how long have they got to run. It's a bit like asking
a taxi driver how long will his diff last with that noise or perhaps his
wheel bearing... they seem to know precisely, because so many taxis
have these noises.
To determine rate of crack growth, then we may need to be a bit more
scientific. I'm glad that there are specialists that do this for
aircraft..cos I do a lot of flying.
In industrial plant, we don't need absolute precision - just orders of
magnitude and in practice we tend to err on the side of conservatism anyway.
Steve Turner
From: "Peter Ball"
Steve,
I have checked out your 'grape vine' comment regarding SAE JA1011, with the
Standard committee chairman Dana Netherton, who advises that there was one
(1) hostile vendor present at the SMRP Conference. He states, Quote "There
is certainly no intention to rewrite JA1011. (However much that vendor, or
other noncompliant vendors, may wish that it would get rewritten.)" Unquote.
Hope that this will clear the air on this issue. Thanks.
Peter Ball
From: "Ray Beebe"
In the early 1970s, when applying routine condition monitoring by vibration
analysis to a new fossil-fired power plant, we decided that for plant
auxiliaries (pumps, coal mills, fans) we would take data on the basis
of service hours for each individual item of plant. The service hours were
read weekly by operations staff, and reported on log sheets, so that
information was readily available.
As much of this type of plant was spared, on any given day, some items were
not in operation, but were on standby, or on maintenance.
We found when walking around the plant that it was more trouble than
it was worth to select only the nominated items. The extra time to test
all that were operating was minimal. We therefore decided that the
practical way was to measure all of a type on a calendar time basis. Some
would be sampled more than others, but the reduced complexity balanced
this. Therefore, monthly became the usual (and still is).
From: Trevor Hislop
Of course you could ask 100 graduate engineers the same question and get
either no answers, or 200 different answers!!
Trevor Hislop
From: "Stephen Young"
Peter
A minor correction if I might....
Aladon state that the RCM II process fully complies with the SAE JA 1011
standard. RCM Toolkit is the supporting data handling tool for RCM
facilitators. The decisions are all made by the RCM analysis team and
recorded in Toolkit. I would hate to
think readers of your note gained the impression RCM Toolkit was yet another
magic box solution.
Regards
Stephen Young
The Asset Partnership
From: Ken Bates
Hey Everyone,
This discussion has been real interesting. My department currently reads
data every 5 weeks but the recommendation has been made to go to every 3
months.
Can anyone out there help me to provide my management with the appropriate
reasons and supporting links and documents that show why this doesn't supply
the support that they need. The reason for the suggestion is to save labor costs
of reading the data. Any help would be extremely appreciated.
Thanks,
Ken Bates
From: "Shannon Hood"
What if every three months does supply the support they want?
I've no idea of your plant, so here's some high level suggestions:
Look back through your trends and see if you can approximately
quantify the PF interval. You may see you have quite a long PF
interval and may be able to extend the intervals on these machines.
However, if you notice that you have had a few close calls (or even
unexpected failures of machines that have been undergoing vibration
monitoring), then you may want to dig your heels in on these and
retain (or shorten) the frequencies.
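One rough way to do this from existing trend data is sketched below. The readings, alarm level and failure week are hypothetical, and the function name is illustrative; it simply measures the time from the first reading at or above the alarm level to the recorded failure.

```python
def estimate_pf_interval(readings, alarm_level, failure_week):
    """readings: list of (week, value) pairs sorted by week.
    Returns the weeks between the first reading at or above alarm_level
    and the failure, or None if no reading ever reached that level."""
    for week, value in readings:
        if value >= alarm_level:
            return failure_week - week
    return None

# Hypothetical vibration trend (mm/s) read every 5 weeks on a machine
# that failed in week 23.
trend = [(0, 2.0), (5, 2.1), (10, 2.2), (15, 3.6), (20, 5.8)]
pf_estimate = estimate_pf_interval(trend, alarm_level=3.5, failure_week=23)
print(pf_estimate)
```

Because the change actually began somewhere between the last clean reading and the first raised one, the estimate errs on the short side, which is the conservative direction. A result of None for a machine that failed anyway is one of the "close calls or unexpected failures" worth showing to management.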
This is a classic case of why condition monitoring IS dependent on
machine criticality. From a 'technical' point of view, criticality
should make no difference, but most of us have limited budgets and
are driven by bottom line requirements, so in real life criticality
does make a difference. Obviously, if those above are absolutely
insistent that you extend the interval, I'd offer a compromise,
suggesting that you will yield on some machines because they're not
as critical as this other list of machines that are very critical.
Criticality is dependent on a complicated interaction of direct cost
to repair, OHS/env issues, production impact to name but a few.
I often think we maintenance practitioners should be a bit more
experimental so another suggestion would be to undertake a bit of R&D.
If (say) you have two pumps in a duty/standby situation, why not
change the frequency on one and not the other. Providing you
regularly switch between the pumps (say monthly), you'll eventually
notice a detrimental change if the new frequency is wrong.
Important note with this suggestion is that if you come back with this
suggestion and outline that it's going to take at least 12 months to get
some decent data, you've probably bought some time while being seen to
be responding to their need in a positive way!
Shannon
From: "Stephen Young"
Ken
The frequency of CM should be based on the PF interval and nothing else.
The decision to CM or not is then an evaluation of the consequences of
failure and the cost of conducting the CM.
An arbitrary variation in CM frequency to reduce cost fails to
appreciate the process of failure.
It could be your arbitrary CM period of 5 weeks is too frequent for some
items and not frequent enough for others. The correct frequency can only be
determined by identifying the failure modes that might affect that item of
equipment and determining the PF interval for each failure mode and gearing
your CM for half the PF interval.
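As a sketch of that rule, the snippet below halves a PF interval for each failure mode of one item. The failure modes and their PF intervals are hypothetical, and treating the shortest result as the governing interval for a single inspection route is an added assumption rather than something stated in the post.

```python
# Hypothetical failure modes for one pump, with estimated PF intervals in weeks.
pf_intervals_weeks = {
    "bearing wear": 12,
    "misalignment": 8,
    "lubricant degradation": 26,
}

def cm_interval(pf_interval_weeks):
    """Half the PF interval, per the rule described above."""
    return pf_interval_weeks / 2

task_intervals = {mode: cm_interval(pf)
                  for mode, pf in pf_intervals_weeks.items()}

# Assumption: if one route has to cover every mode, the shortest governs.
shortest = min(task_intervals.values())

for mode, weeks in task_intervals.items():
    print(f"{mode}: inspect every {weeks:g} weeks")
print(f"single route covering all modes: every {shortest:g} weeks")
```

Compared against a blanket 5-week period, this hypothetical pump would be over-inspected for two modes and under-inspected for one, which is the situation the post warns about.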
Your managers need to appreciate that maintenance is a valid and very
effective risk management tool when it is based on a sound and defensible
logic. Using guess work is not a defensible strategy.
Kind regards
Stephen Young
From: "Ber van Loon"
The objective of condition monitoring is to predict upcoming maintenance
need by monitoring condition indicators.
To minimize downtime, the desired monitoring interval should be a fraction
of the time in which the fastest known failure mode develops from "not
measurable" to "significant defect amplitude level".
Sometimes we're lucky when history data is available which can be used to
analyze the occurrence and the development speed of different failure modes.
Tip for those who are desperately seeking a numerical solution: Take a
good look at the standard deviation of TBF.
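The post does not say how to use the standard deviation, so the following is only one possible reading: compare the spread of the times between failures with their mean. The data are hypothetical. For purely random (exponential) failures the standard deviation equals the mean, so a ratio near 1 points to random failure, while a much smaller ratio suggests failures that cluster around a typical life.

```python
from statistics import mean, stdev

def tbf_spread(tbf_hours):
    """Return (mean, sample standard deviation, coefficient of variation)
    for a list of observed times between failures."""
    m = mean(tbf_hours)
    s = stdev(tbf_hours)
    return m, s, s / m

# Hypothetical times between failures for one item, in running hours.
tbf = [900, 1000, 1100, 950, 1050]
m, s, cv = tbf_spread(tbf)
print(f"mean {m} h, standard deviation {s:.1f} h, ratio {cv:.2f}")
```

Either way, the ratio describes when failures tend to occur, not how fast each one develops, so it does not replace an estimate of the development time discussed above.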
When you've invented the ultimate monitoring interval: don't forget to ask a
specialist about his thoughts on this.
Condition monitoring should not be regarded as a process which can be
managed from behind a desktop. It's a "bottom up" process performed by
specialists.
Ber van Loon
Uptime! Condition Monitoring
From: Graham Oliver
Another subscriber has already, kindly, gotten the word out that our
software RELCODE might be the answer to the determination of maintenance
frequencies. Might we add that if one is into Condition-Based
Maintenance, that our product EXAKT would be worth a look as well.
For the record, we would have grave misgivings if maintenance people
were to look for and possibly find "standard" frequencies that they
could apply. The number of variables that affect the answer is
actually immense -- machinery type, speed at which it is run, operator
skills, operating environment, product being produced, and so on.
The only way, we think, to get reliable maintenance frequencies is to
input one's own data which would be totally pertinent to your machinery
working in your environment, and so on.
To look at RELCODE and EXAKT go to http://www.oliver-group.com.
--
Regards...Graham Oliver
Oliver Interactive, Inc.
From: "George English"
There are many factitious gurus out there - after close scrutiny
of their "philosophy", background and credentials -
G-d forbid you should ever adhere to their "gospel".
This applies especially to compressed air technology and applications.
Just a comment, George English
From: Trevor Hislop
Thank you George English for quoting some down-to-earth comments in
relation to this "discussion". In over 40 years in the Maintenance
Management business around the world I doubt if I have ever heard such a
wide range of input from good sensible "talk to the trades-people"
through to some absolute "cloud 9" waffle from "ivory tower" experts!
Trevor Hislop
Ebony Associates
From: "Peter Ball"
Shannon & Stephen,
Can each or either of you provide academically accepted references to
support your stated views that the P-F Curve provides satisfactory CM
frequency calculation that cannot be resolved with the universally accepted
bathtub (or hazard rate) curve?
Even a few photocopies to my Fax No. would be appreciated, as
most of the model criteria would be difficult to enter on E-mail, unless of
course you have the details in attachment form suitable for Word 97, or pdf
files, even. Thanks guys.
Peter
From: "Stephen Young"
Sure
I am in NZ at the moment but will arrange something when I get back to the
office late this week.
From: "Shannon Hood"
Peter
In terms of the "monitor condition closely" quote (as applied to the
random failure zone), I ask the next question of How? And whether
the answer is through vibration analysis or oil sampling or waiting
for a nasty grinding sound (all of which are quite legitimate
strategies), then the next obvious question is How often? If I have
a bathtub for two identical bearings, one of which goes from shake to
bang in hours and the other is in a particular application that may
cause it to go from shake to bang in weeks, how does the bathtub help
me make the How often decision? This is where PF can help.
I'm certainly one for sound references and I wouldn't trust my inane
ramblings! However, one of the problems with maintenance engineering
is the fact that it is a relatively new discipline and very little
formal academic research has been done until (relatively) recently.
Compare the distinct discipline to (say) the discipline of organic
chemistry or thermodynamics and you'll understand where I'm coming
from. Add to this the fact that while people have always been able to
place the screwdriver between the ear and bearing housing to check for
excessive vibration, the maintenance applications of some of the other
technologies are also only relatively new. Personally, it's these
uncharted waters that I find so exciting about maintenance
engineering.
There is always a tragic lag between leading (perhaps bleeding) edge
(formal and informal) research and academic acceptance, let alone
adoption into university curriculum to drive the production of texts.
It was only four years ago that I took a subject in maintenance
engineering at a university where we analysed the bathtub curve, did
the Weibull thing etc. etc. When I asked about the research done in
the aerospace industry that discovered 6 failure curves (not the one
bathtub), I was met with a blank stare from my lecturer! Perhaps the
application of the bathtub when the world was typically more
'machanical' is less relevant now that we must deal with the
reliability issues associated with failures of 'non-mechanical' stuff.
Unfortunately, I am on site at the moment and have all my texts at
home, but I will be certain to pass on references for the materials
you seek to back up this idea. However, I note your reference is
from 1980. Assuming it took a couple years from concept to
publishing, the content of that text is approximately 25 years old.
How much have ideas changed in 5 years (let alone 25)? I would
caution against academic texts ever being seen as gospel, particularly
those older than 5-10 years. When I read the phrase 'universal
acceptance' part of me hears 'tried and tested' but another part of me
hears 'tired and testing'. I believe it is forums such as this and
papers at the numerous conferences and in periodicals that are offered
that provide the latest information. I worry about the organisation
that is willing to wait for universal acceptance by the academic
community of a concept before they size it up to see if it fits their
competitive needs.
I already have one thesis in the academic world and by the end of this
year will have inflicted more guff in the form of a second (Masters)
thesis on the Optimisation of Maintenance strategy using financial
data. However, just because I've had some university supervisor give
it the nod, does not make it any more relevant (or useful) than some
of the ideas that linger in the brains of other maintenance
practitioners. I guess I'm sceptical about academically accepted
documents because I know how easy they are to fudge and how out of
date most of academia is with the rest of the known universe. There
are exceptions, but I believe it's the rule.
I note that you are based in Australia and I'd recommend attending
Mainstream next year if you want to hear a good mix of academic
research and wins and sins of various sites.
Will be in touch next week with the 'academic' guff for what it's
worth.
Shannon
From: "Andrew G Starr"
I read with interest the exchange on inspection frequencies. The reference
to Jardine is of course an old book but a seminal work. Andrew Jardine was here
on Friday, and he's still a firm advocate of modelling risk of failure based
on a combined run-time/wear-out model, fed with rich failure data. He markets
some software incorporating the model.
The problem perhaps for most users is the lack of good data. The wear-out
probability distribution function is usually modelled on the Gaussian or on
Weibull with a large beta. But like any model, it needs plenty of data to
fit! The difficult question, posed by all new users, is how to choose a sensible
measurement interval which will prevent most failures. Using the Gaussian or
normal (simpler to explain than Weibull) you need sufficient failure data
for a good estimate of the mean time between failures (MTBF) and its
standard deviation. You probably need about ten failures to get an
acceptable confidence in the estimate.
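To make the data requirement concrete, here is a minimal sketch of
estimating MTBF and its standard deviation from a small failure
history, assuming a normal (Gaussian) wear-out model. The function name
and the ten run times are hypothetical, made up purely for
illustration. The standard error shows how uncertain the MTBF estimate
still is with only ten failures.

```python
import math
import statistics


def estimate_mtbf(times_between_failures):
    """Estimate MTBF, its sample standard deviation and standard error.

    Assumes the run times to failure are roughly normally distributed.
    """
    n = len(times_between_failures)
    if n < 2:
        raise ValueError("need at least two failures to estimate spread")
    mtbf = statistics.mean(times_between_failures)
    stdev = statistics.stdev(times_between_failures)  # sample (n - 1) form
    std_error = stdev / math.sqrt(n)  # uncertainty of the MTBF estimate
    return {"n": n, "mtbf": mtbf, "stdev": stdev, "std_error": std_error}


# Hypothetical run times to failure, in operating hours (ten failures).
run_hours = [900, 1000, 1100, 950, 1050, 1000, 980, 1020, 1010, 990]
estimate = estimate_mtbf(run_hours)
print(estimate)
```

With fewer failures the standard error grows quickly, which is one way
to see why a handful of data points gives little confidence.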
The next problem is how to get a good estimate of the P-F interval. This is
hard enough in the laboratory, where we can control the degradation of a
component, but very difficult in the field because of the number of
variables. Most users and consultants use rules of thumb for initial intervals because
of these limitations, e.g. accepted PF for major techniques, followed up by
fine tuning when data is available. Clearly the logistics of lugging equipment
round the plant also dictate to some degree the fine tuning.
The best strategies are flexible - this is certainly true of Jardine's
method, and is also embedded in other software philosophies, e.g. Wolfson's MIMIC, to
increase measurement frequency when a monitoring threshold is exceeded.
On this basis, all users would expect to gather too much data at first,
before refining parameters and frequency, and then only increasing frequency when
necessary.
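The flexible strategy described above can be sketched as a simple
rule: measure at a routine interval, and tighten the interval once a
monitoring threshold is exceeded. This is only an illustration under
assumed values. The function name, the 30-day and 7-day intervals and
the vibration readings are hypothetical, and it is not a description of
how Jardine's software or MIMIC actually work.

```python
def next_measurement_interval(reading, alert_threshold,
                              routine_days=30, tightened_days=7):
    """Return days until the next measurement.

    The routine interval applies while the reading is at or below the
    threshold; the tightened interval applies once it is exceeded.
    """
    if reading > alert_threshold:
        return tightened_days
    return routine_days


# Hypothetical vibration readings (mm/s RMS) against an assumed threshold.
threshold = 4.5
for reading in [2.1, 2.3, 2.2, 4.8, 5.6]:
    days = next_measurement_interval(reading, threshold)
    print(f"reading {reading} mm/s -> next measurement in {days} days")
```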
Andrew Starr
Dr Andrew Starr
Manchester School of Engineering
The University of Manchester
From: "Bill Roos"
It is with keen interest that I followed this extremely interesting discussion and I would like to add my few pennies' worth. I have found that, in order to be truly pro-active, one needs to maintain such massive volumes of data that it almost becomes impossible to do it without the assistance of a very comprehensive computer based management system. Monitoring equipment and component condition alone is simply not enough; even on-line monitoring and analysis will not present the total picture required to optimize plant availability and eventual profitability.
Finding that fine balance between reliability and productivity, or understanding the impact of the cost of maintenance as opposed to the cost of plant non-availability, requires a far wider look than just the condition of the individual objects that make up the plant. In a fair sized refinery there could be as many as 100,000 pieces of equipment, each with an average of approximately 120 components and parts. Managing the configuration and relationships of more than a million objects, tracking their locations and relationship to the process, the similarities in applications as well as commonalities between failures appears to be an impossible task. Expanding this view to include events and conditions that preceded failures, vendor and individual influences in maintenance and operating activities, skills and training, production demands, as well as the real time financial impact, is something that can only be achieved with the use of a comprehensive ERP management system such as SAP.
A rule-based approach to sampling on-line data contained in a system that does the number crunching must result in a situation where you can actually have a work-force that is only alerted to pre-defined situations that require their attention, at the time when it is needed. My belief is that frequencies should not always be set to fixed intervals. In the ideal world a system should automatically adjust frequencies once a pre-defined rate of deterioration is exceeded and in some cases even call new activities to start.
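One way to picture a rate-based rule of this kind is the sketch below.
It compares the rate of change between two successive readings with a
pre-defined limit and halves the inspection interval when the limit is
exceeded. The function name, the halving rule, the one-day floor and
the example numbers are all assumptions for illustration; they do not
describe SAP or any other product.

```python
def adjust_interval(prev_reading, curr_reading, days_between,
                    current_interval_days, rate_limit_per_day,
                    min_interval_days=1):
    """Halve the interval if deterioration is faster than the limit."""
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    rate = (curr_reading - prev_reading) / days_between
    if rate > rate_limit_per_day:
        # Assumed policy: halve the interval, but never go below the floor.
        return max(min_interval_days, current_interval_days / 2)
    return current_interval_days


# Hypothetical readings taken 30 days apart, limit of 0.05 units per day.
print(adjust_interval(2.0, 2.3, 30, 30, 0.05))  # slow drift
print(adjust_interval(2.0, 5.0, 30, 30, 0.05))  # rapid deterioration
```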
It is not my intention to contradict the valuable comments made by Shannon, Peter and Steve, but merely to add a fresh perspective!
Thank you for a great forum!
Bill Roos
From: "Peter Ball"
The comments from Andrew Starr appear very relevant to this discussion. My
interpretation is that you can use the P-F curve and then apply your own
conclusions drawn from experience. It is interesting to note the similarity
of wording in Moubray's RCM II description of P-F, and Wolfson's. I wonder
just who invented this 'curve' theory. If it is so good for RCM why does
no-one unconnected with RCM appear to use it? My investigations through the
MCM community suggest that they have never heard of it; but they certainly
know of, and make use of the 'bathtub' curve!!!!!
To date we have received no convincing evidence to support P-F for
calculating monitoring intervals for machine condition (vibration & oil
analysis). It seems it can have an application in the monitoring of
machine, or system, performance.
Further argument will be appreciated. Thanks,
Peter Ball.
From: "R. Keith Mobley"
I have followed the exchange of comments on this topic with interest. It
seems that most of the responses assume that all machines must fail and
condition monitoring is simply a tool to predict when catastrophic failure
will occur. In the almost forty years that I have been using predictive
techniques, the goal has always been the same---to extend the useful
operating life, reliability and capacity of critical production systems. If
the assumption that failure is not preventable is correct, I have been
wasting my time. However, the results of our work prove otherwise.
One important reason that few use the P-F curve is that it must be adjusted
to actual application, installation, mode of operation and quality of
maintenance for each production system, machine-train or related component.
Like most of the other methods used in RCM, the relationship between theory
or ideal, and the real world is practically non-existent. The same is true
for bathtub curves. The flat or low-probability of failure zone duration is
variable. It is strictly dependent on the installation, application and
especially on the mode of operation. For example, when a system is applied,
installed, operated and maintained properly, the interval of low probability
of degradation, damage or failure on both the bathtub and P-F curves can be
extended almost indefinitely. Moreover, the ability, using today's
condition monitoring equipment, to detect the first minor deviation from
optimum operating condition permits plant personnel to make minor adjustments
or repairs that can further extend the useful life of production systems.
Before establishing monitoring intervals, you must first decide what results
you want from your program. If it is simply to predict imminent failure,
follow the advice that has so freely been given over the past few weeks. If
you want to optimize performance, minimize costs and extend the useful life
of your critical production systems, you must base the interval and methods
used on a design review of the installed system. This review must include
the designed operating envelope of the system (i.e. what was it designed to
do?), the actual installation and how it is really being operated. This
review will provide the answer to your question.
R. Keith Mobley
From: "Shannon Hood"
Keith
I read your comments with interest and I agree that often the link
between the theory and the practical is tenuous. I think that
through various methods we can actually adjust the curve/s, so assuming
they're constant is a big mistake. A good example of this was on a
cable manufacturing site which (as you can imagine) had a huge number
of high speed spinning winders and unwinders. When the cable drums
got full, they were pretty damn heavy, so in addition to high speed,
we had large bending moments hanging off some poor bearings. We were
experiencing large amounts of early life failure which was simply
accepted as a way of life, and all sorts of strategies were in place
to deal with these failures. Not content with this, we took the time
to try and understand why there were so many failures and discovered
some pretty poor fitting techniques along with a press that was 28
years old and, if used, actually pressed the bearings in out of
alignment! Needless to say, with a bit of training from the bearing
provider (incidentally provided free of charge) and replacement of the
press cylinder we were able to almost eliminate the early life
failures. The point to the story is that I don't believe we should
blindly accept the current failure pattern; through other techniques
we can actually have an impact on undesirable patterns.
However, I find myself disagreeing with your statement about
predictive maintenance being able to extend useful operating life.
Perhaps it's a terminology thing, but by my way of thinking
predictive maintenance DOES NOT extend useful life. I like to think
of it in this way:
- Use the TPM approaches of defect elimination to eliminate the cause of the failure. For example, if motors are overheating because the cooling fins are filled with dust, then eliminate the source of the dust or improve the ventilation or install extraction fans.
- If the source can't be eliminated (or doing so is cost prohibitive) then fall back to PREVENTative maintenance. Preventative maintenance is so named because it attempts to extend the useful life of the equipment (or prevent the failure from happening within a given timeframe). This may be through a process of cleaning off the dust on a regular basis, or regular lubrication or arbitrary replacement or whatever. Each of these approaches attempts to postpone the failure.
- Once all efforts have been made to eliminate the source AND postpone the failure, then if the failure is still so undesirable or 'unforecastable' (from historical records and due to its failure pattern) then we need some PREDICTive maintenance. Predictive maintenance DOES NOT extend component life. For example, regular vibration monitoring of a bearing WILL NOT increase the life of the bearing. It WILL, however, ensure we get maximum use out of the bearing by running it until it is about to go bang (as opposed to arbitrarily replacing a reasonably good bearing). By PREDICTing imminent failure and addressing it at a time of our convenience we are preventing a breakdown situation, but we are not preventing the failure. Nor are we extending the useful life.
- Of course, at the end of the day, if all our efforts to eliminate the cause of the failure, extend the life through preventative maintenance and predict the failure through CM are still unsatisfactory or too costly, there may well be the need for re-design or some form of real-time alarmed monitoring that does not rely on CM inspection intervals. All of this needs to be considered in the light of risk, criticality and cost.
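The ordering in the list above can be sketched as a simple decision
ladder. This is a deliberate simplification: the function name and the
three yes/no inputs are hypothetical, and in practice each answer would
come from weighing risk, criticality and cost rather than from a single
flag.

```python
def choose_strategy(cause_can_be_eliminated,
                    preventive_is_adequate,
                    cm_is_adequate):
    """Walk the ladder: eliminate, then prevent, then predict, else redesign."""
    if cause_can_be_eliminated:
        return "eliminate the cause (defect elimination)"
    if preventive_is_adequate:
        return "preventative maintenance (postpone the failure)"
    if cm_is_adequate:
        return "predictive maintenance (condition monitoring)"
    return "re-design or real-time alarmed monitoring"


# Example: dust source cannot be removed and cleaning alone is not
# enough, but vibration monitoring gives adequate warning.
print(choose_strategy(False, False, True))
```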
I'm interested in hearing more of your thoughts.
Shannon
From: "Keith Mobley"
Shannon,
You did a great job of defining the failure of most predictive maintenance
programs. My question to you is why do you need TPM and other methods to
eliminate the root-cause of problems or to extend the useful life of
equipment? That should be the role of the predictive maintenance program.
The predictive technologies, used correctly, provide all of the data needed
to accomplish these goals.
A survey that we conducted, in conjunction with Plant Services magazine,
indicates that less than 3% of those companies using predictive maintenance
generate enough benefits to offset the program's cost. In most cases, the
reason is that they have limited the program to simple predictions of failure
rather than using it as a plant optimization tool. Using predictive technologies
combined with other process-related data, we have shown plant personnel how
to eliminate problems related to capacity, product quality and reliability.
The result has consistently been a 100 to 1 or better return-on-investment.
Please do not sell these technologies short. They provide the means to
really make a difference in overall plant performance.
Keith Mobley
Copyright 1996-2009, The Plant Maintenance Resource Center . All Rights Reserved.
Revised: Thursday, 08-Oct-2015 11:51:55 AEDT