Overview of the Weeks
-
The use of probability models and statistical methods for analyzing data has become
common practice in virtually all scientific disciplines. This course attempts to provide
a comprehensive introduction to those models and methods most likely to be encountered
and used by students in their careers in engineering and the natural sciences.
Although the examples and exercises have been designed with scientists and engineers
in mind, most of the methods covered are basic to statistical analyses in many
other disciplines, so that students of business and the social sciences will also profit
from the course.
-
1. Why Statistics
Statistics deals with collecting, processing, summarizing, analyzing, and interpreting data. Engineering and
industrial management, for their part, deal with such diverse issues as solving production problems, using
materials and labor effectively, developing new products, improving quality and reliability, and, of course,
basic research.
1.1. The field of statistics involves methods for:
1. Designing and carrying out research studies.
2. Describing collected data.
3. Making decisions, predictions, or inferences about phenomena represented by the data by designing valid
experiments and drawing reliable conclusions.
1.2. Branches of Statistics
1. Descriptive statistics: statistical methods that summarize and describe the prominent features of data.
2. Inferential statistics: statistical methods that generalize results from a sample to a population.
As it is generally impossible or impractical to find out something about the entire population, we examine a part
of it to make inferences.
1.3. Definitions
A population is the entire collection of objects or outcomes about which information is sought.
A sample is a subset of a population, containing the objects or outcomes that are actually observed.
A parameter is a numerical characteristic of a population, which is usually unknown.
A statistic is a numerical value computed from the sample. It varies from sample to sample and is used as an
estimate of the population parameter.
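The four definitions above can be illustrated with a minimal sketch. The population here (battery lifetimes of 10,000 laptops) is simulated and purely hypothetical; in a real study the population mean would be unknown and only the sample would be available.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: battery lifetimes (hours) of 10,000 laptops.
population = [random.gauss(8.0, 1.5) for _ in range(10_000)]

# Parameter: a numerical characteristic of the whole population.
# It is computable here only because the population was simulated.
population_mean = statistics.mean(population)

# Sample: the subset of the population that is actually observed.
sample = random.sample(population, 50)

# Statistic: computed from the sample and used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Changing the seed changes the sample and therefore the statistic, while the parameter stays the same for a given population.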
Data Collection:
Besides organizing and analyzing data, statistics deals with the development of techniques for collecting the
data. If data are not properly collected, an investigator may not be able to answer the questions under
consideration with a reasonable degree of confidence.
Chapter 1: Descriptive Statistics I
Observational Studies: The engineer simply observes the process without disturbing it and records the
quantities of interest. Such a study may reveal a relationship between input and output, but it cannot establish
the relationship between all factors, because the appropriate changes to those factors were not made.
Controlled (Designed) Experiments: Measurements of the response (output) variable of interest are recorded
while controlling some factors that might influence the results of the study.
Surveys: Questionnaires are designed to solicit information from people. Data may be collected by face-to-face
interview, telephone interview, postal mail, email, or fax.
A simple random sample (SRS) of size n is a sample chosen by a method in which each collection of n
population items is equally likely to comprise the sample.
An SRS is not guaranteed to reflect the population perfectly, and different SRSs always differ in some ways
from each other.
Two samples from the same population may vary from each other. This is known as sampling variation.
Items in an SRS may be treated as independent in most cases encountered in practice. The exception occurs
when the population is finite and the sample comprises a substantial fraction (more than 5%) of the population.
Sampling with replacement: Replace each item after it is sampled. The population remains the same on every
draw, so the sampled units are truly independent.
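As a rough sketch of these ideas, the snippet below draws a simple random sample and a sample with replacement using Python's standard `random` module. The population of 1000 labeled items and the sample size of 10 are assumptions chosen only for illustration.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

population = list(range(1, 1001))  # hypothetical item labels 1..1000
n = 10

# Simple random sample: every collection of n distinct items is equally likely.
srs = random.sample(population, n)

# Sampling with replacement: each draw is made from the full population,
# so the same item can appear more than once.
with_replacement = random.choices(population, k=n)

# Rule of thumb from the text: items in an SRS can be treated as independent
# unless the sample is more than 5% of a finite population.
sampling_fraction = n / len(population)
treat_as_independent = sampling_fraction <= 0.05

print(srs)
print(with_replacement)
print(f"sampling fraction = {sampling_fraction:.1%}, independent: {treat_as_independent}")
```

Here the sample is 1% of the population, so the 5% exception does not apply.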
Example: In the sample the researcher collected, 80% of users were satisfied with their internet connection.
In the population of customers, it is unlikely that exactly 80% are satisfied with their internet connection.
It is more realistic to think that somewhere around 80% of the customers are satisfied.
Another researcher repeats the study with a different SRS of 50 customers. She finds that 90% are satisfied
with their internet connection. Did she do something wrong, or did the first researcher do something wrong?
Neither: this is sampling variation at work. Two different samples from the same population will differ from
each other and from the population.
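Sampling variation can be seen in a small simulation. The sketch below assumes, purely for illustration, a population of 20,000 customers of whom 85% are satisfied; neither figure comes from the example above. Each repetition draws a fresh SRS of 50 customers and reports the proportion satisfied.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

# Assumption for illustration only: 85% of 20,000 customers are satisfied.
population = [1] * 17_000 + [0] * 3_000  # 1 = satisfied, 0 = not satisfied
true_proportion = sum(population) / len(population)

def sample_proportion(pop, n):
    """Proportion satisfied in one simple random sample of size n."""
    return sum(random.sample(pop, n)) / n

# Five researchers each draw their own SRS of 50 customers.
proportions = [sample_proportion(population, 50) for _ in range(5)]

print(f"population proportion: {true_proportion:.2f}")
print("sample proportions:   ", proportions)
```

The sample proportions scatter around the population proportion, and no researcher has done anything wrong.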
1.3.1. Sampling Methods
Stratified Sampling
Sometimes alternative sampling methods can be used to make the selection process easier, to obtain extra
information, or to increase the degree of confidence in conclusions. One such method, stratified sampling,
entails separating the population units into non-overlapping groups and taking a sample from each one.
For example, a manufacturer of TVs might want information about customer satisfaction for units produced
during the previous year. If three different models were manufactured and sold, a separate sample could be
selected from each of the three corresponding strata.
This would result in information on all three models and ensure that no one model was over- or
underrepresented in the entire sample.
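The TV example can be sketched as follows. The unit records, the model names A, B, and C, and the sample size of 20 per model are all hypothetical choices made for illustration.

```python
import random

random.seed(4)  # fixed seed so the sketch is repeatable

# Hypothetical sales records: (unit_id, model) for three TV models.
units = [(i, random.choice(["A", "B", "C"])) for i in range(3000)]

# Separate the population into non-overlapping strata, one per model.
strata = {}
for unit_id, model in units:
    strata.setdefault(model, []).append(unit_id)

# Draw a separate simple random sample from each stratum.
per_stratum = 20  # assumed sample size per model
stratified_sample = {
    model: random.sample(ids, per_stratum) for model, ids in strata.items()
}

for model, ids in sorted(stratified_sample.items()):
    print(model, len(ids))
```

Because each model contributes the same number of units, no model can be over- or underrepresented by chance.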
Convenience Sampling
Frequently a convenience sample is obtained by selecting individuals or objects without systematic
randomization; that is, the sample is not drawn by a well-defined random method.
Example: A computer engineer receives a shipment of 1000 monitors in a huge container. He wants to test the
brightness of the monitors by testing a sample of 10. The engineer takes 10 monitors from the top of the
container as the sample.
Things to consider with convenience samples: They may differ systematically in some way from the population.
Use them only when it is not feasible to draw a random sample.
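To see how a convenience sample can differ systematically from the population, consider the following sketch of the monitor example. It assumes, purely for illustration, that brightness drifts with position in the container so that monitors near the top are brighter; the numbers are invented.

```python
import random
import statistics

random.seed(5)  # fixed seed so the sketch is repeatable

# Assumption for illustration: brightness (in nits) drifts with position in
# the container, so monitors near the top are systematically brighter.
# Position 0 is the top of the container.
monitors = [250 + 0.05 * (1000 - position) + random.gauss(0, 5)
            for position in range(1000)]

convenience = monitors[:10]        # the 10 monitors on top
srs = random.sample(monitors, 10)  # a simple random sample of 10

print(f"population mean:  {statistics.mean(monitors):.1f}")
print(f"convenience mean: {statistics.mean(convenience):.1f}")
print(f"SRS mean:         {statistics.mean(srs):.1f}")
```

Under this assumed drift, the convenience sample overstates the average brightness, whereas the SRS has no built-in tendency to err in one direction.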
1.4. Types of Variables
A variable is any characteristic whose value may change from one object to another. The variables can be
classified as either quantitative or qualitative.
Quantitative (Numerical) variables: A numerical quantity is assigned to each item in the sample. Quantitative
variables can be classified as either discrete or continuous:
A discrete variable is a variable whose possible values can be listed, even though the list may continue
indefinitely. Examples are the number of visits to a particular Web site during a specified period, the number of
PCs owned by a family, or the number of students in an introductory statistics class.
A continuous variable is a variable whose possible values form some interval of numbers. Typically, a
continuous variable involves a measurement of something, such as the price of a laptop, the CPU time of a
certain task (in seconds), or the length of time a PC battery lasts.
1.5. Graphical Methods
Descriptive statistics can be divided into two general areas, graphical and numerical.
In this part, we consider representing a data set using graphical techniques.
Appropriate graphs are:
For qualitative data: Bar chart and Pie chart.
For quantitative data: Histogram, Boxplot.
Bar and Pie Charts
Bar chart: A vertical or horizontal rectangle represents the frequency for each category. The height can be the
frequency, relative frequency, or percent frequency. In some cases there will be a natural ordering of groups
(for example, freshmen, sophomores, juniors, seniors, graduate students), whereas in other cases the order will
be arbitrary (for example, Dell, HP, etc.).
What to Look For: Frequently and infrequently occurring categories.
In Minitab: Graph - Bar Chart.
Pie chart: A circle divided into slices, where the size of each slice represents its relative frequency or
percent frequency.
What to Look For: Categories that form large and small proportions of the data set.
In Minitab: Graph - Pie Chart.
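The quantities that both charts display can be computed directly. The sketch below uses Python rather than Minitab and a small, invented list of laptop brands; it tabulates the frequency and relative frequency of each category and prints a crude text bar chart.

```python
from collections import Counter

# Hypothetical survey responses: brand of laptop owned by each student.
brands = ["Dell", "HP", "Dell", "Lenovo", "HP",
          "Dell", "Apple", "HP", "Dell", "Lenovo"]

frequency = Counter(brands)
total = sum(frequency.values())
relative_frequency = {brand: count / total for brand, count in frequency.items()}

# Text rendering: the bar length is the frequency, and the percentage is the
# share of the circle that the category would occupy in a pie chart.
for brand, count in frequency.most_common():
    print(f"{brand:<7} {'#' * count:<5} {count:>2}  {relative_frequency[brand]:.0%}")
```

Sorting by frequency is one reasonable choice when the categories have no natural order, as with brand names.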
In this chapter, we provide a general review of statistical methods.