Chi Squared is used to determine if an Attribute or Discrete “X” has an effect on an Attribute or Discrete “Y”. The example below is a hypothesis (or an educated guess) that there is a difference in Loan Default Rates between Bank Branches.
The Chi Squared test is intended to test how likely it is that an observed distribution of data is due to chance. Chi Squared is also called the “goodness of fit” statistic, because it measures how well the observed distribution of data fits with the distribution that is expected if the variables are independent.
The Null and Alternative Hypotheses are:
Ho = Variable A and Variable B are independent
- There is no effect on Defaulted Loans due to Bank Branches
Ha = Variable A and Variable B are not independent
- There is an effect on Defaulted Loans due to Bank Branches
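The test of these hypotheses can be sketched in a few lines of Python. The branch names and loan counts below are hypothetical, made up purely for illustration, and the function name `chi_squared_independence` is our own. The sketch computes the expected count for each cell under Ho (row total × column total ÷ grand total), sums (observed − expected)² ÷ expected across all cells, and compares the result to 5.991, the Chi Squared critical value for 2 degrees of freedom at a 5% significance level.

```python
def chi_squared_independence(table):
    """Return (chi-squared statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows; each row holds the observed counts for one group.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables are independent (Ho)
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical data: [defaulted, not defaulted] loans for three branches
branches = ["Branch A", "Branch B", "Branch C"]
loans = [
    [20, 180],
    [30, 170],
    [40, 160],
]

statistic, dof = chi_squared_independence(loans)

# Critical value for 2 degrees of freedom at alpha = 0.05
CRITICAL_VALUE_DF2 = 5.991
reject_null = statistic > CRITICAL_VALUE_DF2

print(f"Chi Squared = {statistic:.3f}, degrees of freedom = {dof}")
print("Reject Ho" if reject_null else "Fail to reject Ho")
```

With these made-up counts the statistic exceeds the critical value, so Ho would be rejected. A different table would need the critical value that matches its own degrees of freedom.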
My team and I at Six Sigma Development Solutions, Inc. are engaged in Lean and Six Sigma project work about 60% of the time. We encounter many different opportunities from a variety of disciplines. We are often called in to act as added capacity to an organization that has a continuous improvement system in place. When engaged with the teams at these organizations, I am often witness to a team “jumping into the deep weeds” of root cause analysis without understanding whether the process is working as it was originally engineered to work. If not, where is the delta?
I have an analogy that I like to use in class when teaching the concept of "capability". The analogy is called "Closing Open Windows" and is as follows: I drive up to my house one July day when it is 105 degrees Fahrenheit outside. I can hear my condenser (my A/C's external unit) humming. I have an A/C unit that is rated for a house much larger than mine because I am not fond of the heat. When I enter my house, it is 98 degrees inside. Something is wrong. I check my thermostat and it reads 74 degrees. I check to make sure cold air is being pushed out through my vents. The cold air is effectively being diffused throughout the house. What is the next thing that you would investigate? Most would answer that you would check for open doors and windows. That is the obvious answer, but I see "trained" practitioners first knocking holes in the walls to see if they are missing insulation. They are jumping into a deep root cause analysis without understanding the current state capability of the Inputs.
Below is our reference list. We will update this list as we discover new references.
Lean and Six Sigma Green Belt Methodologies
The Six Sigma Handbook, Third Edition by Thomas Pyzdek and Paul Keller
- The Six Sigma Handbook, Third Edition shows you, step by step, how to integrate this profitable approach into your company's culture.
As deployment leaders of Lean and Six Sigma, my colleagues and I have seen that about 65% of Six Sigma projects fail to complete. We have also found that Lean Kaizen (3 to 5 day) projects have a greater rate of completion than Six Sigma projects. The question we had was “why?”
The failure of Six Sigma projects to complete was due in most cases to scoping issues and having a weak infrastructure, but there was a deeper issue. Even in organizations with a strong foundation for both Lean and Six Sigma, there was still a delta between the completion rates for Six Sigma projects and Lean projects.
Most people who run an ANOVA in a statistical package like Minitab rely on the P-Value to tell them whether to reject the Null Hypothesis. Just because “P is Low” doesn’t always mean “the Ho must go”. Understanding the F-Statistic will give you a more reliable determination. Below is an example of how to calculate the F-Statistic for an ANOVA (Analysis of Variance).
A company produces bags of popped popcorn. The customer’s “Critical to Quality Requirement” is to have the fewest un-popped kernels in a bag. We are going to take data from three vendors to determine if there is a difference in the number of un-popped popcorn kernels between the vendors.
- 1 Factor or Variable (which is the Type of Kernel)
- 3 Settings which in this case are 3 different vendors
- N = 25 data sets (the amount of data that we have collected)
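As a rough illustration of how the F-Statistic is built for a setup like this one, here is a minimal Python sketch of a one-way ANOVA. The kernel counts are hypothetical and kept deliberately small (three bags per vendor, not the 25 observations described above), and the function name `one_way_anova_f` is our own. The F-Statistic is the variation between the vendor means (Mean Square Between) divided by the variation within each vendor (Mean Square Within).

```python
def one_way_anova_f(groups):
    """Return (F-statistic, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per factor setting.
    """
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Sum of squares between groups: how far each group mean sits from the grand mean
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Sum of squares within groups: how far each value sits from its own group mean
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical un-popped kernel counts per bag for three vendors
vendor_1 = [4, 5, 6]
vendor_2 = [6, 7, 8]
vendor_3 = [8, 9, 10]

f_stat, df_between, df_within = one_way_anova_f([vendor_1, vendor_2, vendor_3])
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F value means the differences between vendor means are large relative to the bag-to-bag variation within each vendor. To reach a decision, the F value is compared to the critical F value for the same pair of degrees of freedom at the chosen significance level.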