Current group: comp.simulation

Model Accuracy

From:Feargal Timon
Subject:Model Accuracy
Date:Thu, 14 Oct 2004 08:05:55 -0500
REAL DATA COMPARED WITH STATISTICAL ANALYSIS

Feargal Timon B.E., M.Sc.Eng., MSCS, CIM Ireland Ltd, Brooklawn,
Salthill, Galway, Rep of Ireland.


ABSTRACT

Simulation models by their nature are made up of a variety of
interdependent timed events. In some models, the times of these events
are approximated and simplified. In other cases, they are analysed and
statistical distributions are applied. This paper proposes a context
in which real data logs should be used in the development of a
simulation model.

KEYWORDS: Validation, Real Data, Event Logs

1. INTRODUCTION

Two of the most important aspects of building a simulation model are
customer confidence and model accuracy/validation. This paper proposes
that, to achieve the most accurate model possible, real data should
feed directly into the model, as opposed to analysing real data and
converting it into statistical distributions. This paper also
discusses when to use real data logs. Three case studies are presented
in which real data logs were used and a model accuracy of over 95%
was achieved.
The use of real data in modelling is particularly relevant to
situations where one or more elements have a high degree of
variability.

2. REAL DATA DRIVEN MODELS

A real data driven model is one where real data logs, recorded from
the system being modelled, control one or more elements in the model.
An example of a real data log is a down time log, which records the
time of every failure and its duration. The log may be recorded
manually, but preferably electronically. In a model that uses such a
log, the simulated machine goes down at the same times as the real
machine and for the same durations. Hence, the real data log
completely controls this element of the model. Table 1 demonstrates
what a real data log may look like and shows how such a log can be
converted into a format to be used by a simulation model.


Down Time Log                               "Model Event File"
Time of Failure      Time of Recovery       Time of Failure (Min)  Duration (Min)
29-Aug-04 8:15 AM    29-Aug-04 8:19 AM      495.00                 4.32
29-Aug-04 9:31 AM    29-Aug-04 9:45 AM      571.25                 14.40
29-Aug-04 11:46 AM   29-Aug-04 11:47 AM     706.15                 1.44
29-Aug-04 12:01 PM   29-Aug-04 12:29 PM     721.02                 28.80


Table 1: A Down Time Log and its Conversion to a Format Suitable for
Simulation
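
The conversion shown in Table 1 can be sketched in a few lines of
Python. This is only an illustrative sketch: the function name
to_event_file and the choice of midnight as the model clock origin are
assumptions (the origin is consistent with 8:15 AM mapping to 495
minutes). Because the timestamps printed in Table 1 are shown to the
minute only, values computed from them will not reproduce the
fractional minutes in the table exactly.

```python
from datetime import datetime

# Timestamp layout used in the printed log, e.g. "29-Aug-04 8:15 AM".
LOG_FORMAT = "%d-%b-%y %I:%M %p"


def to_event_file(log_rows, origin):
    """Convert (failure, recovery) timestamp strings into
    (start_minute, duration_minutes) pairs relative to origin."""
    events = []
    for failure_text, recovery_text in log_rows:
        failure = datetime.strptime(failure_text, LOG_FORMAT)
        recovery = datetime.strptime(recovery_text, LOG_FORMAT)
        start_min = (failure - origin).total_seconds() / 60.0
        duration_min = (recovery - failure).total_seconds() / 60.0
        events.append((start_min, duration_min))
    return events


# First two rows of Table 1, as printed (minute resolution only).
down_time_log = [
    ("29-Aug-04 8:15 AM", "29-Aug-04 8:19 AM"),
    ("29-Aug-04 9:31 AM", "29-Aug-04 9:45 AM"),
]

# Assumption: the model clock starts at midnight on the day of the log.
origin = datetime(2004, 8, 29)
model_event_file = to_event_file(down_time_log, origin)
```

An electronically recorded log with second-level timestamps would only
need a different LOG_FORMAT string.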



The decision to use real data driven elements in a model will be
based initially on data availability. Taking this as a given, real
data driven models should be built:
a. When one or more elements have a sufficiently high degree of
variability such that, if a statistical distribution were to be used,
the model would have to be run multiple times to ensure a valid
result.
b. Where two variable elements interact to produce highly different
results and a more stable model is needed to analyse a specific
problem.
c. To give the customer confidence so that he or she can see how the
system would react under more realistic conditions.
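
To make criterion (a) concrete, the sketch below estimates how many
replications a distribution-driven model might need before its mean
output is known to within a chosen margin. It uses the standard
normal-approximation formula n = (z * s / E)^2, which is not taken
from this paper; the function name and the numbers are hypothetical
and serve only as an illustration.

```python
import math


def required_replications(sample_std, half_width, z=1.96):
    """Approximate number of independent runs needed so that the
    confidence interval on the mean has the requested half-width.

    sample_std -- standard deviation of output across pilot runs
    half_width -- acceptable error on the mean (same units as output)
    z          -- normal quantile (1.96 for roughly 95% confidence)
    """
    return math.ceil((z * sample_std / half_width) ** 2)


# Hypothetical pilot result: output varies with a standard deviation of
# 150 units/hour and the analyst wants the mean to within 50 units/hour.
n_runs = required_replications(sample_std=150.0, half_width=50.0)
```

When the element causing the variability is instead driven by a real
data log, that element no longer contributes run-to-run variation.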

The author has found that real data logs simplify the validation
process and improve model accuracy.

3. MODEL VALIDATION

The validation of a model is the cornerstone of any simulation
project. The use of real data logs both simplifies the model and
improves its accuracy, thus making it easier to validate. The
modelling task is simplified for those elements using real data logs,
as there is no data to analyse or model logic to be developed. The use
of real data logs also improves the model's accuracy as these logs
represent reality; thus there are no assumptions or approximations
required.
The most beneficial aspect of real data logs is that, by keeping the
real data controlled elements representing reality, the other elements
can be more intensely scrutinised and validated. In other words, as
the accuracy of the controlled elements is ensured, the accuracy of
the other modelled elements becomes more apparent.
The validation process itself should also become simpler, as real
data logs can be used to compare simulated results with actual events.
For example, the actual start and end times of a batch can be compared
with the simulated times. This will be illustrated in the second case
study where a whole factory is modelled and every production order is
validated.
To demonstrate the effectiveness of using real data driven models,
three distinctly different case studies will now be discussed from the
following perspectives: a) why the elements were selected, b) how the
model was validated, and c) the resulting model accuracy.
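
The comparison of simulated results with actual events can be sketched
as follows. The paper does not state how its accuracy and correlation
figures were calculated, so the definitions here are assumptions:
accuracy is taken as one minus the relative error of the totals, and
correlation as the Pearson coefficient between the actual and
simulated series. The data values are hypothetical.

```python
import math


def accuracy(actual_total, simulated_total):
    """Assumed definition: 1 minus the relative error of the total."""
    return 1.0 - abs(simulated_total - actual_total) / actual_total


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly outputs: real system versus simulated model.
actual_per_hour = [900, 1500, 2000, 1200]
simulated_per_hour = [950, 1450, 1980, 1250]

overall_accuracy = accuracy(sum(actual_per_hour), sum(simulated_per_hour))
hourly_correlation = pearson(actual_per_hour, simulated_per_hour)
```

Reporting both numbers is useful because a model can match the total
closely while still tracking the hour-by-hour pattern poorly.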


4. CASE STUDY 1 - HIGH VOLUME INK CARTRIDGE MANUFACTURER

The first case study is of a high volume, closed loop conveyor system,
where parts are placed on a pallet conveyor at the beginning of the
loop and removed at the end of the loop. The number of units produced
in one hour can vary from 900 to 2000. This variation in output is
caused by multiple short downtimes (less than 15 minutes) on all
equipment. The objective of the project was to increase output on the
loop by 5% to 15%. For the initial model, downtime was analysed and
specific distributions were developed for each piece of equipment.
When the model was run repeatedly under the same conditions, the
average output would vary by over 10%. To get a stable result, the
model would have to be run ten times in order to evaluate each
scenario correctly. However, this proved to be too time-consuming and,
because of the inherent variability in the simulated results, it was
difficult to be sure of the benefits. Consequently, the model was
rebuilt in such a way that the downtime logs were fed directly into it
and the other elements were then simulated. Table 2 outlines those
elements that were simulated and those elements that were controlled
by real data. This table also presents the model accuracy in terms of
units produced and the correlation of units produced per hour.
Using real data logs, it is possible to have a model with an accuracy
level of 99.5% even though the output of the real system varies by
over 100% per hour. This demonstrates the benefits of real data driven
models compared to those models run using statistical distributions.

Elements                             Output Per Hour
Modeled      Real Data Controlled    Accuracy    Correlation
Conveyors    Downtime Log            99.5%       0.93
Equipment
Logic
Run Rates
Pallets

Table 2: Case Study 1 - Components Modelled and Components Driven by
Real Data, Plus Model Accuracy
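
The idea of feeding a downtime log directly into a model can be
illustrated with a deliberately small sketch. It is not the model used
in this case study: it replays the events of Table 1 against a single
machine with a constant run rate, and both the function name and the
run rate of 30 units per minute are hypothetical. Events are assumed
not to overlap.

```python
def units_per_hour(event_file, run_rate_per_min, hours):
    """Replay (start_minute, duration_minutes) downtime events and
    return the units produced in each simulated hour."""
    horizon = hours * 60
    up_minutes = [60.0] * hours
    for start, duration in event_file:
        end = min(start + duration, horizon)
        t = start
        # Split a downtime that crosses an hour boundary between hours.
        while t < end:
            hour = int(t // 60)
            slice_end = min(end, (hour + 1) * 60)
            up_minutes[hour] -= slice_end - t
            t = slice_end
    return [minutes * run_rate_per_min for minutes in up_minutes]


# "Model Event File" columns of Table 1: (time of failure, duration).
table_1_events = [
    (495.00, 4.32),
    (571.25, 14.40),
    (706.15, 1.44),
    (721.02, 28.80),
]

# Hypothetical run rate; 13 hours covers midnight to 1:00 PM.
hourly_units = units_per_hour(table_1_events, run_rate_per_min=30.0, hours=13)
```

Because the downtime is replayed rather than sampled, two runs of such
a model give the same hourly output, so scenarios can be compared
without averaging over many replications.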

Figure 1, which further demonstrates the accuracy of the simulated
model, was key in validating this project. The customer gained full
confidence in the analysis once they saw the model reacting to real
data.

The next case study looks at a large batch processing plant.

5. CASE STUDY 2 - PROCESS PLANT

This case study looked at a complete factory, from batch
manufacturing to packaging. The manufacturing areas would produce
batches and then wait for a packaging line to become free. When a
suitable line became available, the product would be packaged. The
packaging times would vary depending on the type of packaging and the
changeover from the previous product. The next manufacturing batch
could not be started until the previous batch was packaged. Also, one
manufacturing batch could be processed on more than one packaging
line.
Similarly, the production time of a batch in the manufacturing area
could vary significantly. This variation would define when a packaging
line would be required. After collecting the real data, the batch
manufacturing times and packaging times were fed directly into the
model and all other elements were modelled, as shown in Table 3. The
accuracy of the model was based on the start and finish times of the
manufactured batches. These parameters were selected because a batch
could not start until the packaging of the previous batch was
completed. Figure 2 compares the simulated start time to the actual
start time of all orders.



Elements                             Order Completion
Modeled       Real Data Controlled   Accuracy    Correlation
Tanks         Batch Times            95.3%       0.98
Pipes         Packaging Times        98.4%
Flow rates
Logic


Table 3: Case Study 2 - Components Modelled and Components Driven by
Real Data, plus model accuracy

To validate the model, the customer wanted to use real schedules and
real data. This approach proved to be the most effective, particularly
during the validation steps of the project, as it provided a high
degree of accuracy.
This case study shows that real data driven modelling is effective
for a factory wide model. To further test the proposed approach, a
non-manufacturing example, i.e. a call center, will also be studied.


6. CASE STUDY 3 - CALL CENTER

This business receives a variety of calls: customer, business and
corporate. These calls can be further divided into sales or service
and finally by language requirement. This combination results in over
one hundred different Skillsets. The objective of this project was to
discover how many agents were required and what type of calls they
should answer. The element of the call center that varied the most
and could not be controlled by management was the number of calls
arriving into the center. Three months of actual calls were fed
directly into the model (see Table 4).
The main performance measure of this call center was the number of
calls abandoned for each Skillset. The approach of using real data
driven elements in the model delivered an accuracy level of 97.5%,
based on the number of actual abandoned calls compared with the
number of simulated abandoned calls.


Elements                               Abandon
Modeled               Actual           Accuracy    Correlation
Agent                 Call Time        97.5%       0.99
Breaks                Arrival Log
Logic                 Emails Arrival
Email Response Time
Skill Sets
Hang-up threshold


Table 4: Case Study 3 - Components Modelled and Components Driven by
Real Data, Plus Model Accuracy
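
A toy version of replaying an arrival log and counting abandoned calls
per Skillset is sketched below. It is far simpler than the model in
this case study: every agent is assumed able to take every call, calls
are served in arrival order, and a caller hangs up when the wait would
exceed a single fixed threshold. The function name and all data values
are hypothetical.

```python
import heapq
from collections import Counter


def replay_calls(arrival_log, n_agents, hangup_threshold):
    """Replay (arrival_time, skillset, handle_time) records through a
    pool of identical agents and count abandoned calls per skillset."""
    free_at = [0.0] * n_agents  # min-heap of times each agent is free
    heapq.heapify(free_at)
    abandoned = Counter()
    for arrival, skillset, handle_time in sorted(arrival_log):
        wait = max(0.0, free_at[0] - arrival)
        if wait > hangup_threshold:
            # Caller gives up; no agent time is consumed.
            abandoned[skillset] += 1
            continue
        # The earliest-free agent takes the call once the wait is over.
        heapq.heapreplace(free_at, arrival + wait + handle_time)
    return abandoned


# Hypothetical arrival log, times in minutes.
arrival_log = [
    (0.0, "sales", 10.0),
    (1.0, "service", 5.0),
    (2.0, "sales", 5.0),
]
abandoned_by_skillset = replay_calls(arrival_log, n_agents=1, hangup_threshold=3.0)
```

Comparing such simulated counts with the abandoned calls actually
recorded for the same period gives one per-Skillset check on the
model.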


7. CONCLUSION

In conclusion, most models have one or more elements that have a high
degree of variability. The three case studies presented have
demonstrated that, by controlling these elements using real data logs,
the modelling task is simplified, the model accuracy is increased and
the validation requirements can be reduced. Equally, a customer has a
higher level of confidence in the model when they see it react to
real data.
Finally, the most beneficial aspect of real data logs is that, by
keeping the real data controlled elements representing reality, the
other elements can be more intensely scrutinised.
   
