Intervals & Tests
When fitting data to any distribution, there are some basic assumptions:
1. The distribution of the data is not known.
2. The data are representative of the process during the period when they were collected (i.e., measurement error is negligible, and the sampling process produced data reflective of the process conditions).
3. The data can be represented by a single, continuous distribution. This implies that the measurements are resolved finely enough that there is some variation among the data. In practice we fit the cumulative distribution functions, so that the data and the hypothesized curve can be compared over the whole range: the data form a step function (the empirical cumulative distribution) and the hypothesized curve is a smooth, continuous function.
4. A single distribution can only be sensibly fit to the data when the process is stable, without any influences that may shift the process in time (special causes).
5. We cannot make a claim that the data are distributed according to our hypothesis. We can claim only that the data may be represented by the hypothesized distribution. More formally, we can test and then reject or fail to reject, at a given confidence level, the hypothesis that the data have the same distribution function as a proposed function. The K-S (Kolmogorov-Smirnov) statistic should be used as a relative indicator of curve fit.
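
As a rough illustration of points 3 and 5, the sketch below computes the K-S statistic as the largest vertical gap between the stepwise empirical cumulative distribution of a sample and a hypothesized smooth cumulative distribution function. The function name ks_statistic, the sample values, and the two candidate normal distributions are assumptions made up for this example; they do not come from any particular software package. The candidate parameters are taken as given rather than estimated from the sample, and the result is used only to compare candidates, not to state a significance level.

```python
from statistics import NormalDist


def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical CDF of `data` and `cdf`."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF steps from (i - 1)/n up to i/n at x,
        # so the gap is checked on both sides of the step.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical measurements, for illustration only.
sample = [9.8, 10.1, 10.4, 9.6, 10.0, 10.3, 9.9, 10.2, 9.7, 10.5]

# Two candidate distributions with assumed (not fitted) parameters.
d_centered = ks_statistic(sample, NormalDist(mu=10.0, sigma=0.3).cdf)
d_shifted = ks_statistic(sample, NormalDist(mu=11.0, sigma=0.3).cdf)

print(f"K-S statistic, normal(10.0, 0.3): {d_centered:.3f}")
print(f"K-S statistic, normal(11.0, 0.3): {d_shifted:.3f}")
```

A smaller statistic indicates a closer fit, so here the candidate centered at 10.0 would be preferred over the one centered at 11.0. This is the sense in which the statistic serves as a relative indicator: it ranks candidate distributions against the same data, but does not prove that the data follow the better-ranked one.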
Learn more about the Statistical Inference tools for understanding statistics in Six Sigma Demystified (2011, McGraw-Hill) by Paul Keller, in his online Intro. to Statistics short course (only $89) or his online Black Belt certification training course ($875).