Weekly outline

  • Course of probability and statistics


    The use of probability models and statistical methods for analyzing data has become
    common practice in virtually all scientific disciplines. This course attempts to provide
    a comprehensive introduction to those models and methods most likely to be encountered
    and used by students in their careers in engineering and the natural sciences.
    Although the examples and exercises have been designed with scientists and engineers
    in mind, most of the methods covered are basic to statistical analyses in many
    other disciplines, so that students of business and the social sciences will also profit
    from following the course.

  • Contact information

    Course title : Probability and statistics

    Score : 04

    Coefficient : 07

    Duration : 15

    Target audience : First-year students of the science and technology common core (engineering track)

    Contact information :

    Course instructor : Assistant Professor Dr. Nassiba Fekhar

    Contact email : nassiba_chimi@yahoo.fr

    Availability : on lesson days at the university (from 8:00 to 17:00)


  • Course information

    ·         Statistical concepts are explained using examples from the students' area of interest.

    ·         Quizzes and homework will be assigned at the end of each chapter/subject.

    ·         In this course students use an e-book. Students should connect to McGraw-Hill in order to access the course material.

    ·         Case studies and/or projects will be assigned in order to teach students how to apply different statistical concepts to solve problems in their own field of study.

    ·         This is a laptop/iPad course. Students use a laptop or iPad to a) access the course material and b) complete homework assignments and exams.

    ·         All statistical techniques are illustrated using Minitab statistical software. Some concepts are illustrated using Java applets available on the web. Minitab will be needed to complete exams and assignments.

  • Author presentation

    First name : Nassiba

    Family name : Fekhar

    Department : Common core of the state engineer program.

    Faculty of Science and Technology, University of Djelfa







  • Objectives

    1. Define and apply the basic concepts of probability theory and statistics to real situations.  
    2. Define and compute the common probability distributions used in modeling data arising in engineering and IT.
    3. Apply descriptive statistical techniques to describe data using statistical software.
    4. Select and apply the appropriate statistical methods in analyzing data using statistical software. 
    5. Plan, analyze, and interpret the results of experiments.
    6. Communicate statistical information in oral and written form.




  • Course description

    This course introduces students to events and sample spaces, probability, conditional probability, random variables, cumulative distribution functions and probability density functions, moments of random variables, common distribution functions, an elementary introduction to statistics with emphasis on applications and model formulation, descriptive statistics, sampling and sampling distributions, inference, tests, one- and two-factor analysis of variance, randomized complete block design, correlation and regression, and chi-square tests.






  • Prerequisites


    To fully understand statistics and probability theory, it is helpful to have a solid foundation in certain mathematical concepts. Here are some common prerequisites that are typically useful for studying probability theory:

    1. Basic algebra: A good grasp of algebra is essential for understanding probability theory. This includes understanding and manipulating algebraic expressions, solving equations, and working with variables.

    2. Set theory: Probability theory is closely related to set theory. Understanding concepts such as sets, unions, intersections, complements, and the properties of sets is important for studying probability.

    3. Basic calculus: While not always strictly necessary for an introductory course, an understanding of calculus can be helpful, especially when dealing with continuous probability distributions.

    4. Statistics: A basic understanding of statistics, including concepts such as mean, median, mode, variance, and standard deviation, can be beneficial when studying probability theory.

    5. Basic probability: Familiarity with basic probability concepts such as sample spaces, events, and probability.


  • Test


    Test N°1 :

     

    An IT student, working on his thesis, plans a survey to determine the proportion of all computer users who regularly scan flash disks before using them. He decides to interview his classmates in the three classes in which he is currently enrolled.

    a). What is the population of interest?

    All computer users.

    b). Do the student’s classmates constitute a simple random sample from the population of interest?

    No.

    c). What name have we given to the kind of sample that the student collected?

    A convenience sample.

    d). Do you think that this sample proportion is likely to overestimate or underestimate the true proportion of all computer users who regularly scan flash disks before using them?

    Overestimate, because he surveyed only IT students, who are expected to use computers more than others.

     

    Test N°2 :

    A researcher wishes to estimate the average amount spent per person by visitors to a theme park. He takes a random sample of forty visitors and obtains an average of 28 per person.

    a). What is the population of interest? All visitors to the theme park.

    b). What is the parameter of interest? The average amount spent per person.

    c). Based on this sample, do we know the average amount spent per person by visitors to the park? Explain fully. No, we obtain only an estimate of the average.

    A researcher wishes to estimate the average weight of newborns in South America in the last five years. He takes a random sample of 235 newborns and obtains an average of 3.27 kilograms.

    a). What is the population of interest? Newborns in South America in the last five years.

    b). What is the parameter of interest? The average weight of these newborns.

    c). Based on the sample, do we know the average weight of newborns in South America in the last five years? No, we have only an estimate.

    A researcher wishes to estimate the proportion of all adults who own a cell phone. He takes a random sample of 1,572 adults; 1,298 of them own a cell phone, hence the sample proportion is 1298/1572 ≈ 0.83.

    a). What is the population of interest? All adults.

    b). What is the parameter of interest? The proportion of all adults who own a cell phone.

    c). What is the statistic involved? The sample proportion, 0.83.
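
    The statistic in the cell phone question is a sample proportion. The course itself works in Minitab; the short Python sketch below is only an illustration of the arithmetic, and the function name sample_proportion is a hypothetical helper, not part of the course material.

```python
def sample_proportion(successes: int, n: int) -> float:
    """Return the sample proportion successes / n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    return successes / n


# Figures from the cell phone question: 1298 owners among 1572 sampled adults.
p_hat = sample_proportion(1298, 1572)
print(round(p_hat, 2))  # 0.83
```

    The value 0.83 describes the sample only; the population proportion remains unknown and is merely estimated by it.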



  • Chapter 1

    Descriptive Statistics

    1. Why statistics
    Statistics deals with collecting, processing, summarizing, analyzing, and interpreting data. On the other hand,
    engineering and industrial management deal with such diverse issues as solving production problems, effective
    use of materials and labor, development of new products, quality improvement and reliability and, of course,
    basic research.
    1.1. The field of statistics involves methods for:
    1. Designing and carrying out research studies.
    2. Describing collected data.
    3. Making decisions, predictions, or inferences about phenomena represented by the data by designing valid
    experiments and drawing reliable conclusions.
    1.2. Branches of statistics
    1. Descriptive statistics: statistical methods that summarize and describe the prominent features of data.
    2. Inferential statistics: statistical methods that generalize results from a sample to a population.
    As it is generally impossible or impractical to find out something about the entire population, we examine a part
    of it to make inferences.
    1.3. Definitions
    A population is the entire collection of objects or outcomes about which information is sought.
    A sample is a subset of a population, containing the objects or outcomes that are actually observed.
    A parameter is a numerical characteristic of a population, which is usually unknown.
    A statistic is a numerical characteristic computed from the sample; it varies from sample to sample and is used as an estimate of the
    population parameter.
    Data collection:
    Besides organizing and analyzing data, statistics deals with the development of techniques for collecting the
    data. If data are not properly collected, an investigator may not be able to answer the questions under
    consideration with a reasonable degree of confidence.
    Observational studies: The engineer simply observes the process without disturbing it and records the quantities of
    interest. Such a study may reveal a relationship between input and output, but it cannot establish the relationships among all
    factors because no deliberate changes were made to them.
    Controlled (designed) experiments: Measurements are recorded while controlling some factors that might influence the results of the study.
    The response, or output variable of interest, is measured.
    Surveys: Questionnaires designed to solicit information from people. Data may be collected by face-to-face
    interview, telephone interview, postal mail, email, or fax.
    A simple random sample (SRS) of size n is a sample chosen by a method in which each collection of n
    population items is equally likely to comprise the sample.
    An SRS is not guaranteed to reflect the population perfectly, and simple random samples always differ in some ways from each
    other.
    Two samples from the same population may vary from each other. This is known as sampling variation.
    Items in an SRS may be treated as independent in most cases encountered in practice. The exception occurs when the
    population is finite and the sample comprises a substantial fraction (more than 5%) of the population.
    Sampling with replacement: Replace each item after it is sampled.
    The population remains the same on every draw, so the sampled units are truly independent.
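
    The difference between a simple random sample and sampling with replacement can be sketched in a few lines of Python (the course uses Minitab, so this is an illustration only). The population of 1000 numbered items and both function names are assumptions made for the example.

```python
import random


def srs_without_replacement(population, n, seed=None):
    """Draw a simple random sample: every set of n items is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def sample_with_replacement(population, n, seed=None):
    """Draw n items, putting each one back before the next draw."""
    rng = random.Random(seed)
    return rng.choices(population, k=n)


# Hypothetical population: items numbered 1 to 1000.
population = list(range(1, 1001))

srs = srs_without_replacement(population, 10, seed=1)
with_repl = sample_with_replacement(population, 10, seed=1)

print(srs)        # 10 distinct items
print(with_repl)  # 10 items; repeats are possible
```

    Here the sample is 1% of the population, below the 5% threshold mentioned above, so treating the items of the SRS as independent would be reasonable.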
    Example: In the sample the researcher collected, 80% of users were satisfied with their internet connection.
    In the population of customers, it is unlikely that exactly 80% are satisfied with their internet connection.
    It is more realistic to think that somewhere around 80% of the customers are satisfied with their internet connection.
    Another researcher repeats the study with a different SRS of 50 customers.
    She finds that 90% are satisfied with their internet connection.
    Did she do something wrong, or did the first researcher do something wrong?
    Neither: this is sampling variation at work. Two different samples from the same population will differ from each other and from
    the population.
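
    Sampling variation is easy to observe in a small simulation. The sketch below is an illustration in Python rather than Minitab, and it assumes a hypothetical population of 10,000 customers of whom exactly 80% are satisfied; the real population proportion in the example above is unknown.

```python
import random


def satisfied_proportion(sample):
    """Proportion of satisfied customers (coded 1) in a list of 0/1 values."""
    return sum(sample) / len(sample)


rng = random.Random(42)

# Assumed population: 8000 satisfied (1) and 2000 not satisfied (0).
population = [1] * 8000 + [0] * 2000

# Two researchers each draw their own SRS of 50 customers.
first_sample = rng.sample(population, 50)
second_sample = rng.sample(population, 50)

print("population:", satisfied_proportion(population))
print("researcher 1:", satisfied_proportion(first_sample))
print("researcher 2:", satisfied_proportion(second_sample))
```

    Re-running with different seeds typically gives sample proportions scattered around the population value, which is the behaviour described in the example.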
    1.3.1. Sampling methods
    Stratified sampling
    Sometimes alternative sampling methods can be used to make the selection process easier, to obtain extra
    information, or to increase the degree of confidence in conclusions. One such method, stratified sampling,
    entails separating the population units into non-overlapping groups and taking a sample from each one.
    For example, a manufacturer of TVs might want information about customer satisfaction for units produced
    during the previous year. If three different models were manufactured and sold, a separate sample could be
    selected from each of the three corresponding strata.
    This would result in information on all three models and ensure that no one model was over- or
    underrepresented in the entire sample.
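
    A minimal Python sketch of stratified sampling for the TV example follows. The model names, the numbers of units, and the choice of an equal sample size per stratum are all assumptions made for illustration; the source does not specify them.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of n_per_stratum units from each stratum.

    strata maps a stratum name to the list of units belonging to it.
    """
    rng = random.Random(seed)
    return {
        name: rng.sample(units, n_per_stratum)
        for name, units in strata.items()
    }


# Hypothetical production records for three TV models (non-overlapping groups).
strata = {
    "model_A": [f"A-{i}" for i in range(500)],
    "model_B": [f"B-{i}" for i in range(300)],
    "model_C": [f"C-{i}" for i in range(200)],
}

sample = stratified_sample(strata, 5, seed=0)
for model, units in sample.items():
    print(model, units)
```

    Because every stratum contributes units, all three models appear in the final sample, which a single SRS of 15 units would not guarantee.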
    Convenience sampling
    Frequently, a convenience sample is obtained by selecting individuals or objects without systematic
    randomization; that is, the sample is not drawn by a well-defined random method.
    Example: A computer engineer received a shipment of 1000 monitors in a huge container. He wants to test the
    brightness of the monitors by testing a sample of 10 of them. The engineer takes 10 monitors from the top of the
    container as the sample.
    Things to consider with convenience samples: They may differ systematically in some way from
    the population. Use them only when it is not feasible to draw a random sample.
    1.4. Types of variables
    A variable is any characteristic whose value may change from one object to another. Variables can be
    classified as either quantitative or qualitative.
    Quantitative (numerical) variables: A numerical quantity is assigned to each item in the sample. Quantitative
    variables can be classified as either discrete or continuous:
    A discrete variable is a variable whose possible values can be listed, even though the list may continue
    indefinitely. For example, the number of visits to a particular Web site during a specified period, the number of
    PCs owned by a family, or the number of students in an introductory statistics class.
    A continuous variable is a variable whose possible values form some interval of numbers. Typically, a continuous variable involves a
    measurement of something, such as the price of a laptop, the CPU time of a certain task (in seconds), or the
    length of time a PC battery lasts.
    1.5. Graphical methods
    Descriptive statistics can be divided into two general areas, graphical and numerical.
    In this part, we consider representing a data set using graphical techniques.
    Appropriate graphs are:
    For qualitative data: bar chart and pie chart.
    For quantitative data: histogram and boxplot.
    Bar and pie charts
    Bar chart: A vertical or horizontal rectangle represents the frequency for each category. The height can be the
    frequency, relative frequency, or percent frequency. In some cases there will be a natural ordering of groups,
    for example, freshmen, sophomores, juniors, seniors, graduate students, whereas in other cases the order will be
    arbitrary, for example, Dell, HP, etc.
    What to look for: Frequently and infrequently occurring categories. In Minitab: Graph - Bar Chart.
    Pie chart: A circle divided into slices, where the size of each slice represents its relative frequency or percent
    frequency. What to look for: Categories that form large and small proportions of the data set.
    In Minitab: Graph - Pie Chart.
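
    The quantities plotted in a bar or pie chart (frequency, relative frequency, percent frequency) can be computed directly. The course produces these charts in Minitab; the Python sketch below is only an illustration, and the brand data and the function name frequency_table are made up for the example.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to (frequency, relative frequency, percent frequency)."""
    counts = Counter(values)
    n = len(values)
    return {cat: (c, c / n, 100 * c / n) for cat, c in counts.items()}


# Hypothetical qualitative data: laptop brand owned by 10 students.
brands = ["Dell", "HP", "Dell", "Lenovo", "HP",
          "Dell", "Dell", "HP", "Lenovo", "Dell"]

table = frequency_table(brands)

# Print the table with a crude text bar chart, most frequent category first.
for cat, (freq, rel, pct) in sorted(table.items(), key=lambda kv: -kv[1][0]):
    print(f"{cat:<8}{freq:>3}{rel:>7.2f}{pct:>7.1f}%  " + "#" * freq)
```

    The relative frequencies sum to 1, which is why they can be read as slice sizes of a pie chart.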
    In this chapter, we provide a general review of statistical methods.

  • Chapter 2

    Probability

    Introduction to probability and basic definitions

    Definition: Sample spaces and events
    - Random experiment: An experiment that can result in different outcomes, even though it is repeated in the same manner every
    time, is called a random experiment.
    - Sample space: The set of all possible outcomes of a random experiment is called the sample space of the experiment. The
    sample space is denoted S.
    - A sample space is said to be discrete if it consists of a finite or countably infinite set of outcomes.
    - A sample space is said to be continuous if it contains an interval (either finite or infinite) of real numbers.

    Example: Basic examples
    1. Tossing a coin: S = {H, T}
    2. Tossing a die: S = {1, 2, 3, 4, 5, 6}
    3. Response of a patient to a treatment: S = {cure, no cure}
    4. Presence of a genetic trait in a newborn baby: S = {present, absent}

    Example:
    1. Consider an experiment in which a process manufactures pins whose lengths vary between
    5.20 mm and 5.25 mm. The sample space can be taken as the interval
    S = {x | 5.20 < x < 5.25}
    2. If the objective of the analysis is to consider only whether a particular pin is too short, too long, or within
    specifications, the sample space might be taken to be the set of three outcomes:
    S = {too short, too long, within specification}
    Tree diagrams
    1. Sample spaces can also be described graphically with tree diagrams.
    2. When a sample space can be constructed in several steps or stages, we can represent each of the n1 ways of
    completing the first step as a branch of a tree.
    3. Each of the n2 ways of completing the second step can be represented as n2 branches starting from the ends of
    the original branches, and so forth.
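
    The branches of a tree diagram correspond to a Cartesian product of the choices at each step. The Python sketch below illustrates this under an assumed two-step experiment in which each of two parts either conforms (y) or does not conform (n); the function name tree_paths is hypothetical.

```python
from itertools import product


def tree_paths(*steps):
    """List every root-to-leaf path of a tree whose k-th level offers steps[k]."""
    return ["".join(path) for path in product(*steps)]


first_step = ["y", "n"]   # n1 = 2 ways to complete the first step
second_step = ["y", "n"]  # n2 = 2 ways to complete the second step

sample_space = tree_paths(first_step, second_step)
print(sample_space)       # ['yy', 'yn', 'ny', 'nn']
print(len(sample_space))  # n1 * n2 = 4
```

    Each path through the tree is one outcome, so the size of the sample space is the product n1 × n2 of the numbers of branches at each step.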
    1.1. Events
    An event is a subset of the sample space of a random experiment.
    Example:
    1. Consider the sample space S = {yy, yn, ny, nn}, in which each outcome records whether each of two parts conforms (y) or does not conform (n).
    2. Suppose that the set of all outcomes for which at least one part conforms is denoted as E1. Then
    E1 = {yy, yn, ny}.
    3. The event in which both parts do not conform, denoted as E2, contains only the single outcome
    E2 = {nn}.
    4. Other examples of events are E3 = ∅, the null set, and E4 = S, the sample space.
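
    Since events are subsets of the sample space, they map naturally onto Python sets. The sketch below rebuilds E1 to E4 from the example; reading each outcome as a two-letter string of y and n is an assumption made for the illustration.

```python
# Sample space for two parts, each conforming (y) or not conforming (n).
S = {"yy", "yn", "ny", "nn"}

E1 = {outcome for outcome in S if "y" in outcome}    # at least one part conforms
E2 = {outcome for outcome in S if outcome == "nn"}   # both parts do not conform
E3 = set()                                           # the null set
E4 = set(S)                                          # the whole sample space

# Every event must be a subset of the sample space.
for name, event in [("E1", E1), ("E2", E2), ("E3", E3), ("E4", E4)]:
    print(name, sorted(event), event <= S)
```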
    1.2. Operations on events
    1. The union of two events is the event that consists of all outcomes that are contained in either of the two events.
    We denote the union as
    A ∪ B
    2. The intersection of two events is the event that
    consists of all outcomes that are contained in both of the two events. We denote the intersection as
    A ∩ B
    3. The complement of an event in a sample space is the set of outcomes in the sample space that are not in the
    event. We denote the complement of the event A as A^c.
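
    The three operations correspond to Python's built-in set operators. In the sketch below, the sample space and the events A and B are taken from the two-part conformance example as an assumption; the variable names are illustrative.

```python
# Sample space and two events from the two-part conformance example.
S = {"yy", "yn", "ny", "nn"}
A = {"yy", "yn", "ny"}   # at least one part conforms
B = {"nn"}               # both parts do not conform

union = A | B            # outcomes in A or in B (or both)
intersection = A & B     # outcomes in both A and B
complement_A = S - A     # outcomes of S that are not in A

print(sorted(union))         # ['nn', 'ny', 'yn', 'yy']
print(sorted(intersection))  # []
print(sorted(complement_A))  # ['nn']
```

    In this example the intersection is empty and the union is the whole sample space, so B is exactly the complement of A.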
    1.3. Interpretations of probability
    1. Probability refers to the chance that a particular event will occur.
    2. Let A be an event; then P(A) denotes the probability that A will occur.
    3. Probability is used to quantify likelihood or chance.
    4. It is used to represent risk or uncertainty in engineering applications.
    5. It can be interpreted as our degree of belief or as a relative frequency.
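
    The relative frequency interpretation can be illustrated by simulation: repeat a random experiment many times and record how often the event occurs. The Python sketch below assumes a fair six-sided die and the event "the result is even"; both the experiment and the function name relative_frequency are choices made for the illustration.

```python
import random


def relative_frequency(event, n_trials, seed=None):
    """Fraction of n_trials fair-die tosses whose outcome lies in event."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) in event)
    return hits / n_trials


even = {2, 4, 6}  # event A: the die shows an even number, so P(A) = 1/2

for n in (100, 10_000, 100_000):
    print(n, relative_frequency(even, n, seed=0))
```

    As the number of tosses grows, the relative frequency tends to settle near P(A) = 0.5, although any single run will deviate slightly.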
    In this chapter, we provide a general review of probability techniques.



  • Bibliography

    D. Montgomery and G. Runger, "Applied Statistics and Probability for Engineers", 6th edition, Wiley.
    Sanjeev Kulkarni and Gilbert Harman, "An Elementary Introduction to Statistical Learning Theory", John
    Wiley and Sons, Inc., 2011.

  • Evaluation test


    1.    Basic algebra is all that is required for introductory statistics; nothing more complicated than squaring is needed.

    Please choose an answer

    ✓  Correct

    ✓  Incorrect

    2.    Probability and statistics can be easy or extremely difficult depending on the level: introductory probability and statistics is relatively easy, but at higher levels it is difficult.

    Please choose an answer

    ✓  Correct

    ✓  Incorrect

    3.    Since the level of the probability and statistics course is not specified, middle school math is enough, because there are no derivatives, integrals, etc. at introductory levels. It is mostly visual, common-sense, and straightforward theory.

    Please choose an answer

    ✓  Correct

    ✓  Incorrect

     

    4.    To really understand probability and statistics, calculus and linear algebra are needed. In an introductory course where you will only be applying formulas, seventh-grade algebra and basic arithmetic are enough. The biggest help is not to think of the problems as number problems but to understand what the question is asking.

    Please choose an answer

    ✓  Correct

    ✓  Incorrect