Categories

## Sample Size, Control Distance, and Type I and Type II Errors in Control Charts

Type I and Type II errors in control charts (or in statistics generally) are difficult concepts to explain. Explaining how the sample size and the control distance affect the magnitude of Type I and Type II errors is more difficult still.

The embedded visualization shows three distributions:

• A population distribution (PDF: g(x)) with the average m = 12 and the s.d. S = 0.5. The process target is t = 12.
• A sampling distribution (PDF: f(x)) with the sample size n = 4 and the standard error = S/sqrt(n)
• A sampling distribution (PDF: h(x)) with a “derailed mean” c = 12.32

It uses the following control chart specification.

• A control chart with the control distance of k = 1.96, ucl = t + k*s, and lcl = t - k*s, where s is the standard error S/sqrt(n)

If you are familiar with the sampling distribution, you will see that the size of k determines the size of both the Type I error and the Type II error, and the size of n determines the size of the Type II error.
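For readers who want to check the numbers outside Desmos, here is a minimal Python sketch of the same setup. It assumes the control limits are built from the standard error S/sqrt(n) and that both sampling distributions are normal; the function name `control_chart_errors` is made up for this illustration and is not part of the visualization.

```python
from math import sqrt
from statistics import NormalDist


def control_chart_errors(target, sd, n, k, shifted_mean):
    """Return (type_i, type_ii) for limits at target +/- k * standard error.

    Assumption: sample means are normally distributed with
    standard error sd / sqrt(n).
    """
    se = sd / sqrt(n)
    ucl = target + k * se
    lcl = target - k * se

    in_control = NormalDist(target, se)    # f(x): process is on target
    shifted = NormalDist(shifted_mean, se)  # h(x): process mean has drifted

    # Type I: a sample mean falls outside the limits although the process is on target.
    type_i = in_control.cdf(lcl) + (1 - in_control.cdf(ucl))
    # Type II: a sample mean stays inside the limits although the mean has drifted.
    type_ii = shifted.cdf(ucl) - shifted.cdf(lcl)
    return type_i, type_ii


alpha, beta = control_chart_errors(target=12, sd=0.5, n=4, k=1.96, shifted_mean=12.32)
print(f"Type I error:  {alpha:.4f}")
print(f"Type II error: {beta:.4f}")
```

Calling the function again with a larger n (for example, n = 16) should leave the Type I error unchanged and shrink the Type II error, which is the pattern the visualization is meant to show.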

You can refer to the attached in-class exercise note to adapt it to your teaching/learning needs. This visualization helps you see how the Type I error and Type II error change as you:

• Change the sample size n
• Change the control distance k
• Change the mean of the process m

Go to the visualization: https://www.desmos.com/calculator/hccg2zkpon

## What could go wrong with a product design?

This is what happens when you do not respect the voice of your customers.

This illustration can also be used in a project management context to emphasize the importance of gathering client requirements and to show what scope mismanagement can do to the project deliverables.

## Research Article Collection for Quality Management Class

Shah, R., Ball, G. P., & Netessine, S. (2017). Plant operations and product recalls in the automotive industry: An empirical investigation. Management Science, 63(8), 2439-2459. (https://doi.org/10.1287/mnsc.2016.2456)

• Using samples from auto manufacturing plants, this article demonstrates a positive relationship between plant utilization and the likelihood of a recall.

Corbett, C. J., Montes-Sancho, M. J., & Kirsch, D. A. (2005). The financial impact of ISO 9000 certification in the United States: An empirical analysis. Management Science, 51(7), 1046-1059. (https://doi.org/10.1287/mnsc.1040.0358)

• Using an event-study approach, this article seeks to identify both the short-term and long-term implications of receiving an ISO 9000 certification.

Guajardo, J. A., Cohen, M. A., Kim, S. H., & Netessine, S. (2012). Impact of performance-based contracting on product reliability: An empirical analysis. Management Science, 58(5), 961-979. (https://doi.org/10.1287/mnsc.1110.1465)

• This article suggests that even the type of contract with the manufacturer (time-and-material contracts versus performance-based contracts) can affect product reliability.

## Variation Reduces Capacity Utilization

There is a well-known story about how to “properly” fill a jar with sand, rocks, and water. For the detailed story, please refer to the video below. The story is often used as an analogy to inspire people about the importance of setting priorities.

But from the operations management perspective, the story also tells us an important principle about capacity utilization. The jar still had room for sand and water after the rocks went in first because the irregular shapes (variation) of the rocks kept the space from being utilized efficiently!

Efforts to reduce variation can be seen everywhere! Workstations in a factory move in “cycle times,” and packages follow certain design specifications so that container (or truck) space can be maximally utilized.

## Quality Control at the Factory Level: OnePlus Factory Tour by Linus Tech Tips

In this video, a tech YouTuber, Linus, shows us around the OnePlus factory. If you do not have much factory experience or have never toured a manufacturing plant before, this is good material.

It explicitly shows how quality control is done at the factory level. The processes, purposes, people, and tasks of quality management are demonstrated clearly. (Although Linus never intended the video to be a quality management case study. 🙂 )

## Service Quality: Moving Company in Japan

Evaluating the quality of manufactured products can be straightforward because they are tangible. But the evaluation of services, which are intangible, can be more involved. According to Evans and Lindsay, existing research suggests the following five principal dimensions that influence customers’ perception of service quality.

As you can see, the level of effort the Japanese moving company is showing is very impressive. Can you comment on what the company has done, and how, in each of the five dimensions listed above?

## Good Management and Performance Excellence Are Hard to Copy

References:

Sadun, R., Bloom, N., & Van Reenen, J. (2017). Why do we undervalue competent management? Harvard Business Review, 95(5), 120-127.

Porter, M. E. (1996). What is strategy? Harvard Business Review, 74(6), 61-78.

## From Population Distribution to Sampling Distribution (Simulation)

The sampling distribution is an important concept in statistics. We rely on sampling distributions (e.g., of the sample mean and the sample proportion) to decide whether to accept or reject hypotheses about population properties.

Students usually have a tough time understanding the concept due to its highly theoretical nature. The attached Excel spreadsheet can help visualize the process of obtaining a sampling distribution of the sample mean. Specifically, it allows you to specify a sample size n and the number of sample groups k. Thus, you will have k sample averages that can be used to construct a distribution. This Excel applet makes the process of obtaining the sampling distribution more visible.
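The same procedure can be sketched in a few lines of Python for readers who prefer not to run macros. This is only an illustration of the idea, not a port of the spreadsheet: the function name `simulate_sample_means`, the random seeds, and the choice to draw each sample without replacement are all assumptions made for this example.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(population, n, k, seed=None):
    """Draw k samples of size n from the population and return the k sample means."""
    rng = random.Random(seed)
    # Assumption: each sample is drawn without replacement from the population.
    return [mean(rng.sample(population, n)) for _ in range(k)]


# A population of 10,000 observations from a normal distribution (mu = 500, sigma = 50).
pop_rng = random.Random(2020)
population = [pop_rng.gauss(500, 50) for _ in range(10_000)]

n, k = 25, 400
sample_means = simulate_sample_means(population, n=n, k=k, seed=1)

print(f"Mean of the {k} sample means: {mean(sample_means):.2f}")
print(f"S.D. of the {k} sample means: {stdev(sample_means):.2f}")
print(f"Theoretical standard error:   {50 / sqrt(n):.2f}")
```

The spread of the sample means should come out close to sigma/sqrt(n), and a histogram of `sample_means` plays the role of the chart in the spreadsheet.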

Note:

• The original file has a population of 10,000 observations that follow a normal distribution with mu = 500 and sigma = 50. If you wish to demonstrate the central limit theorem with a non-normal population, you can replace the population data with your own.
• When you specify a very big k, for example 400, Excel will freeze for a moment to process the request. Please monitor your CPU usage.
• My website does not allow me to upload a macro-enabled Excel file as is. That is why you are seeing a zip file.
• When using it in Windows 10, please enable Macro, Data Analysis ToolPak, and Data Analysis ToolPak – VBA. Otherwise, it will report a run-time error.
• For Mac, please see this link. Essentially, you need to enable the developer ribbon.

(Note: Written in Excel VBA)

I have another similar worked example in R here.

## Sampling Distribution and Testing Hypothesis

I developed this lecture note in spring 2020 using R Markdown for the first time. It supports the compilation of R, Markdown, and LaTeX code at the same time! I was really impressed.

http://www.compsaver.net/StatsNotes/02-13-2020%20Chapter%206.%20Statistical%20Techniques.html#/

Markdown Code:

Chapter 6. Statistical Techniques in Quality Management
========================================================
author: Z. Wen (OSCM 3340, Spring 2020)
date: 02/12/2020, Thursday
autosize: true
font-family: 'Fira Sans', sans-serif
width: 1440
height: 900

Learning Objectives
========================================================

- Review of Sampling Distribution
- Confidence Interval
- Testing Hypothesis
- Various Distributions
- Sample Size Determination

Estimation
========================================================
Conceptually, the following relationship holds in any type of estimation.
$$\theta = \hat{\theta} \pm M.E.$$

In quality management, we are often interested in the process mean. (e.g., Do our machines need alignment?)
$$\mu = \bar{x} \pm M.E.$$

We are also interested in the process standard deviation. (e.g., Do our machines need calibration?)
$$\sigma = s \pm M.E.$$

Margin of Error
========================================================

**Since a lot of times we have information about $\bar{x}$ and $s$, we need to develop our knowledge on $M.E.$**

There are three components in developing the margin of error:
- Your level of confidence  ($1-\alpha$)
- Sample size  ($n$)
- Best estimate of the population s.d.  ($\sigma$ or $s$, whichever available)

Here is the formal relationship of these three in forming the margin of error:
$$M.E. = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Confidence Interval
========================================================

**With the knowledge of M.E., we can construct an interval where the true parameter is located with $1-\alpha$ level of confidence.**

$$C.I. = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

There are many variations of this formula. For example,
$$C.I. = \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$$
$$C.I. = \bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}*(1-\bar{p})}{n}}$$

And many more... $C.I.$ for F distribution, $\chi^2$ distribution, etc. **As long as it is an estimation result, you will always see the reporting of $C.I.$**

Confidence Interval (Example)
========================================================

See the following calculated examples:

| Level of $\alpha$ | $n$ |    $z$   | $\sigma$ |   M.E.   | $\bar{x}$ | C.I. | Interval Length |
|:-----------------:|:---:|:--------:|:--------:|:--------:|:---------:|:-------------------:|:---------------:|
|        0.01       | 100 | 2.58 |     4    | 1.03 |     34    |    [32.97, 35.03]   |     2.06    |
|        0.05       | 100 | 1.96 |     4    | 0.78 |     34    |   [33.22, 34.78]  |     1.57    |
|        0.1        | 100 | 1.64 |     4    | 0.66 |     34    |   [33.34, 34.66]  |     1.32    |
|        0.01       |  64 | 2.58 |     4    | 1.29 |     34    |   [32.71, 35.29]  |     2.58    |
|        0.05       |  64 | 1.96 |     4    | 0.98 |     34    |    [33.02, 34.98]   |     1.96    |
|        0.1        |  64 | 1.64 |     4    | 0.82 |     34    |   [33.18, 34.82]  |     1.64    |

<small>Although the name could be confusing, Excel formula **=CONFIDENCE.NORM(alpha, standard_dev, size)** and **=CONFIDENCE.T(alpha, standard_dev, size)** will give you the **M.E.** value. </small>
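
The table rows can be reproduced with a short Python sketch. The helper names `margin_of_error` and `confidence_interval` are made up for this illustration; the sketch assumes a known $\sigma$ and therefore uses the normal ($z$) quantile.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(alpha, sigma, n):
    """M.E. = z_{alpha/2} * sigma / sqrt(n), assuming sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)


def confidence_interval(x_bar, alpha, sigma, n):
    """Return (lower, upper) bounds of the (1 - alpha) confidence interval."""
    me = margin_of_error(alpha, sigma, n)
    return x_bar - me, x_bar + me


# Same inputs as the table: sigma = 4, x_bar = 34.
for n in (100, 64):
    for alpha in (0.01, 0.05, 0.1):
        low, high = confidence_interval(34, alpha, 4, n)
        print(f"alpha={alpha:<4} n={n:<3} C.I.=[{low:.2f}, {high:.2f}] length={high - low:.2f}")
```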

One Very Important Application (1/4) - Confirming Doubts
========================================================
**How do you find out that someone who had your total trust has betrayed you?**
<br>

*I am almost certain that he/she won't do that...*

But what if he/she got caught doing it? Is he/she still trustworthy?

**You wholeheartedly believed the mean is 7. But what if your 95% confidence interval does not contain 7? Will you still believe the mean is 7?**

In this case, you either have to update your belief about the mean, or you must have encountered a rare chance event.

One Very Important Application (2/4)
========================================================

**Example:** <br>
A cylinder manufacturer claims that their process mean is 12.5 mm. Historically, their process standard deviation was .08 mm and there is no reason to think that the s.d. has changed. Upon drawing a random sample of 25, the sample average was 12.22 mm. Please test the claim at a 5% error tolerance level.

*Questions*
1. First, please use a visual aid to determine the answer. <br>
2. Please use a formal approach to determine the probability of obtaining such sample average, given the true mean is 12.5 mm. <br>
3. What is the conclusion?

One Very Important Application (3/4)
========================================================
If what the company claims is true, this will be the distribution of the population.
- Population mean $\mu = 12.5$
- Population standard deviation $\sigma = 0.08$

<center>
![plot of chunk unnamed-chunk-1](Confidence Interval PowerPoint-figure/unnamed-chunk-1-1.png)
</center>

***
<small>
If what the company claims is true, for $n=25$, we have...
- Hypothesized mean of $\mu_0 = 12.5$
- Standard Error of $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{.08}{5} = 0.016$
- 95% C.I. $\bar{x} \pm ME = [12.19, 12.25]$
</small>

<center>
![plot of chunk unnamed-chunk-2](Confidence Interval PowerPoint-figure/unnamed-chunk-2-1.png)
</center>

Formal Hypothesis Testing (4/4)
========================================================

**Hypothesis testing** re-defined:
- It is a process of determining how surprising a given sample result is.
- How likely is it that we would obtain a sample mean of this size, given that the claim is true?

Formal hypothesis: <br>
- $H_0: \mu = 12.5$ and $H_a: \mu \neq 12.5$

Determination of the Likelihood (Excel Formula):
- $p-value = norm.dist(12.22, 12.5, 0.016, TRUE) = 7.16E-69 \approx 0$

**Verdict: It is very unlikely that the true mean is 12.5**

Any idea about the true mean?
- All we know is that it is very unlikely to be 12.5. With 95% confidence, we can say it is within [12.19, 12.25].
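
The same tail probability can be checked in Python. This is a sketch under the slide's assumptions ($\sigma = 0.08$ known, $n = 25$, normal sampling distribution); the function name `z_test_lower_tail` is made up for this illustration. It uses `math.erfc` because a plain normal CDF rounds to zero this far into the tail.

```python
from math import erfc, sqrt


def z_test_lower_tail(x_bar, mu0, sigma, n):
    """Return (z, P(sample mean <= x_bar)) when the true mean is mu0."""
    se = sigma / sqrt(n)
    z = (x_bar - mu0) / se
    # P(Z <= z) written with erfc, which stays accurate for extreme z values.
    p_lower = 0.5 * erfc(-z / sqrt(2))
    return z, p_lower


z, p_lower = z_test_lower_tail(x_bar=12.22, mu0=12.5, sigma=0.08, n=25)
# For the two-sided alternative, both tails count as a surprise.
p_two_sided = erfc(abs(z) / sqrt(2))

print(f"z = {z:.2f}")
print(f"lower-tail p-value = {p_lower:.3g}")
print(f"two-sided p-value  = {p_two_sided:.3g}")
```

Either way the p-value is practically zero, so the verdict does not depend on whether one or two tails are counted.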

Another Example - Two Group Mean Testing (Exercise)
========================================================

**Case**: Please determine whether the following two hospitals have the same quality rating.

<center>
<img src="Confidence Interval PowerPoint-figure/unnamed-chunk-5-1.png" title="plot of chunk unnamed-chunk-5" alt="plot of chunk unnamed-chunk-5" width="1000" height="600" />
</center>

Formalizing Two Group Mean Test
========================================================
A visual inspection of the confidence intervals seems to suggest that the means are not the same. Now, we formalize the test.
- The mean difference $d = \bar{x}_A - \bar{x}_B$ is known to follow a **T distribution** when both $\sigma_A$ and $\sigma_B$ are unknown.
- T distribution gives a wider (more conservative) interval than Z distribution; the difference is noticeable as long as the degrees of freedom (*formula omitted*) are less than about 120.
- A **formal hypothesis**: $H_0: \mu_A - \mu_B = 0$ and $H_a: \mu_A - \mu_B \neq 0$
- Construct a hypothesized distribution with the mean of $0$ and the standard error of $\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}$
- Then draw samples from each group and calculate the mean difference $d$ to determine how likely (through the $p$-value) it is to obtain such a result when no difference is assumed.
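
The test statistic can be sketched in Python as follows. The slide omits the degrees-of-freedom formula; the sketch assumes the Welch-Satterthwaite approximation, which is the usual choice when the two variances are not assumed equal. The function name `welch_t` and the ratings are hypothetical and are not the hospital data from the exercise.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for the difference in means with unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-group contributions to the squared standard error: s^2 / n.
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(v_a + v_b)
    # Welch-Satterthwaite degrees of freedom (an assumption; omitted on the slide).
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical quality ratings for two hospitals.
hospital_a = [7.1, 6.8, 7.4, 7.0, 6.9, 7.3]
hospital_b = [6.2, 6.6, 6.0, 6.5, 6.3, 6.1]

t, df = welch_t(hospital_a, hospital_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The resulting `t` and `df` can then be passed to a T distribution function, such as the Excel formulae on the next slide, to obtain the $p$-value.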

[Tool from ArtofStat (Two Mean Test)](https://istats.shinyapps.io/2sample_mean/)

Excel Solution for Two Group T-Test
========================================================

**Excel Data Analysis ToolPak**

![Analysis ToolPak in Excel](https://www.excel-easy.com/examples/images/t-test/select-t-test-two-sample-assuming-unequal-variances.png)

Also as Excel Formulae

$=T.DIST(X, DF, TRUE)$
$=T.DIST.2T(X, DF)$
$=T.DIST.RT(X, DF)$

***

**Side of the Test**

![Which side are you testing?](https://ars.els-cdn.com/content/image/3-s2.0-B9780128008522000092-f09-06-9780128008522.jpg)

How would you define a surprise?
- Too big is a surprise (Right-Tail)
- Too small is a surprise (Left-Tail)
- Too big or too small both (Two Tail)

Overview of Distributions and Their Usages
========================================================

Different test statistics assume different statistical distributions. Here is what is relevant to our current context.

<center>

|                              | Testing for One Group | Comparing Two Groups |
|------------------------------|-----------------------|----------------------|
| Mean                         | Z distribution        | Z distribution       |
| Mean (When $\sigma$ unknown) | T distribution        | T distribution       |
| Variance                     | $\chi^2$ distribution | F distribution       |

</center>

In the quality context, we wish to know
- Whether the machine needs alignment? (Parts are even, but missing the target)
- Whether the machine needs re-calibration? (Generating uneven parts, but meets the target)

Overview of Distributions and Their Usages (cont.)
========================================================

Other types of distributions and their potential usage:

<center>

| Use Case                              | Distributions   | Variable Type |
|---------------------------------------|-----------------|---------------|
| Number of defective units per x units | Poisson         | Discrete      |
| Time factor                           | Exponential     | Continuous    |
| Number of success per x trials        | Binomial        | Discrete      |
| Sampling without replacement          | Hyper-geometric | Discrete      |

</center>

Many times, statistical distributions can be used to establish important baselines for the estimation.

**Now, back to the test of variance**

Test of Variance
========================================================

Detecting whether the variance (or $\sigma^2$) of the process has changed provides important information
- About the accuracy of the mean estimate

<center>
![Variance](https://www.qualitydigest.com/june08/Images/SimplifyingSPC/SimplifyingSPCFig7.gif)
</center>

***

<center>

<img src="Confidence Interval PowerPoint-figure/unnamed-chunk-6-1.png" title="plot of chunk unnamed-chunk-6" alt="plot of chunk unnamed-chunk-6" height="700" />

</center>

Test of Variance - Formalization (1/2)
========================================================

**One group variance testing statistics**

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$

It follows a $\chi^2$ distribution with $n-1$ degrees of freedom.

***

<center>

Shape of $\chi^2$ Distribution

![chi-sq](https://saylordotorg.github.io/text_introductory-statistics/section_15/5a0c7bbacb4242555e8a85c9767c03ee.jpg)

$\chi^2$ distribution is useful when determining whether an observed pattern follows the expected pattern.

</center>

Test of Variance - Formalization (2/2)
========================================================

**Two group variance testing statistics**

$$F = \frac{s_1^2}{s_2^2}$$

It follows an $F$ distribution with $n_1 - 1$ numerator degrees of freedom and $n_2 - 1$ denominator degrees of freedom.

$F$ distribution is a distribution of a ratio of two variances.
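
Both variance statistics, the one-group $\chi^2$ and the two-group $F$, can be computed directly from sample data. The Python sketch below only computes the statistics; the function names and the measurements are hypothetical, chosen for this illustration.

```python
from statistics import variance


def chi_square_statistic(sample, sigma0):
    """One-group statistic (n - 1) * s^2 / sigma0^2, with n - 1 degrees of freedom."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0 ** 2


def f_statistic(sample_1, sample_2):
    """Two-group statistic s1^2 / s2^2, with (n1 - 1, n2 - 1) degrees of freedom."""
    return variance(sample_1) / variance(sample_2)


# Hypothetical part measurements from two machines (mm).
machine_1 = [12.48, 12.53, 12.41, 12.60, 12.47, 12.55]
machine_2 = [12.50, 12.51, 12.49, 12.52, 12.48, 12.50]

print(f"chi-square (sigma0 = 0.08): {chi_square_statistic(machine_1, 0.08):.2f}")
print(f"F: {f_statistic(machine_1, machine_2):.2f}")
```

The $p$-values then come from the corresponding distribution functions, for example the Excel `F.DIST.RT` formula.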

***

<center>
Shape of $F$ Distribution

</center>

Test of Variance - Excel Solutions
========================================================

![F](https://www.teststeststests.com/microsoft-office/excel-2016/tutorials/13-excel-data-analysis-toolpak/1-t-test-F-test-z-test/6-F-Test-inExcel.gif)

***
Excel Formulae:

$=F.Dist(F, DF1, DF2, TRUE)$
$=F.Dist.RT(F, DF1, DF2)$
$=F.Test(Range1, Range2)$

One More Application of Confidence Interval
========================================================
Sometimes, for budgetary reasons, we need to calculate the size of the sample before we draw it. See if you can answer these two questions.

**Example:** <br>
A manager wants to ensure that, whenever he rejects a shipment, his chance of making a mistake is no more than 5%. Historically, the supplier's process had a very stable standard deviation of .5 mm. He believes 0.02 mm could serve as a meaningful margin of error. What would be his choice of sample size? That is,

$$0.02 = 1.96 * \frac{.5}{\sqrt{n}}$$
Q: What is this n?
<br><br>
**Solution:** To get the sample size, solve for $n$:
$$0.02 = 1.96 * \frac{.5}{\sqrt{n}}$$
$$\sqrt{n} = \frac{1.96 * .5}{.02} = 49$$
$$n = (\frac{1.96 * .5}{.02})^2 = 2401$$

Sample Size Solution
========================================================