Due to the innumerable variables at play in materials processing and manufacturing, no two material samples produced by a single process will have exactly the same performance. To address this, statistical studies can be performed to analyze the variability of a process, which in turn can inform the tuning of that process to increase product consistency and reduce the uncertainty contributed by processing steps.
Quantifying Experimental Uncertainties
Confidence intervals are a general method for quantifying the uncertainty of experimental measurements. Equation 1 shows the confidence interval associated with the normal (Gaussian) probability distribution. For many engineering applications, measurements are usually assumed to adhere to a normal distribution if data is sparse. However, if considerable data is available such that the actual probability distribution can be obtained, it is best to use a confidence interval for that specific distribution (Uniform, Normal, Log-Normal, Weibull, Birnbaum-Saunders, Poisson, etc.).

$$\bar{X} \pm t_{\alpha/2,\,\nu}\,\frac{S}{\sqrt{N}} \qquad (1)$$
Where $t_{\alpha/2,\,\nu}$ is an estimator from the Student's t distribution with $\nu = N - 1$ degrees of freedom and significance level $\alpha$, $\bar{X}$ is the sample mean, and $S$ is the sample standard deviation of the $N$ measurements.
For some complex distributions (Weibull, Birnbaum-Saunders) that contain constants that cannot be obtained from data analysis, Maximum Likelihood Estimation (MLE) routines must be employed to compute the distribution constants.
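As an illustrative sketch (not part of the original text), SciPy's distribution `fit` methods perform maximum likelihood estimation; here the shape and scale of a Weibull distribution are recovered from synthetic data generated with known parameters:

```python
from scipy import stats

# synthetic data drawn from a known Weibull distribution
# (shape c = 1.5, scale = 2.0); seed fixed for reproducibility
sample = stats.weibull_min.rvs(1.5, scale=2.0, size=500, random_state=12345)

# fit() performs MLE for the shape, location, and scale parameters;
# floc=0 fixes the location parameter at zero
shape, loc, scale = stats.weibull_min.fit(sample, floc=0)
print(shape, scale)  # estimates should be near 1.5 and 2.0
```

With 500 samples, the MLE estimates typically land within a few percent of the true parameters; smaller data sets yield correspondingly wider scatter.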
There are numerous options for analyzing experimental data to estimate statistical uncertainty bounds, and the choice is generally left to user preference. Typically, for small data sets, Excel is the preferred platform since it is quick and easy to use. However, when dealing with large amounts of data (e.g., numerous stress-strain curves), programmatic approaches should be considered. Common platforms include:
- Python with NumPy and SciPy plugins (Free, Open source)
- R Statistical Programming (Free, Open source)
- Statistical Analysis Software (SAS)
- Microsoft Excel
Uncertainty Quantification Example
Suppose you have a set of normally distributed data consisting of N = 10 measurements:
Determine the uncertainty of the measured variable using a normally distributed confidence interval, as given by Equation 1. This method is expounded upon by Coleman and Steele[1] and is taught in most entry-level statistics courses.
The t-factor $t_{\alpha/2,\,\nu}$ is generally pulled from tabulated values, an example of which is at the bottom of this page.
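Alternatively, the t-factor can be computed programmatically rather than read from a printed table; a minimal sketch using SciPy's inverse CDF (`ppf`):

```python
from scipy import stats

# two-sided t-factor at 95% confidence (alpha = 0.05)
# with 9 degrees of freedom
t_factor = stats.t.ppf(1 - 0.05 / 2, df=9)
print(round(t_factor, 3))  # 2.262
```

This matches the tabulated two-sided value used in the worked example below.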
The Statistics Toolbox of MATLAB also contains numerous MLE routines (e.g., the `mle` function) for estimating the constants of various statistical distributions.
Selecting Analysis Platform
The selection of the platform used to analyze uncertainty in experimental data is largely up to the needs and experience of the researcher. Often, Microsoft Excel suffices for small lots of data. However, for large amounts of data this becomes too tedious and time consuming, which necessitates programming-based methods.
Descriptive Statistics and Computing the Confidence Interval
To begin, we first analyze the dataset to get the five basic descriptive statistics: mean, median, standard deviation, 1st quartile, and 3rd quartile.
| Mean | Median | Standard Deviation | 1st Quartile | 3rd Quartile |
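As a sketch of how these descriptive statistics could be computed programmatically (the ten-point sample below is hypothetical, not the article's data set):

```python
import numpy as np

# hypothetical 10-point sample (stand-in for the article's data)
data = np.array([4.9, 5.3, 4.6, 5.1, 5.8, 4.4, 5.0, 5.5, 4.8, 5.2])

stats_summary = {
    "mean": data.mean(),
    "median": np.median(data),
    "std (sample)": data.std(ddof=1),      # ddof=1: sample standard deviation
    "1st quartile": np.percentile(data, 25),
    "3rd quartile": np.percentile(data, 75),
}
for name, value in stats_summary.items():
    print(f"{name}: {value:.4f}")
```

Note `ddof=1` in the standard deviation call: NumPy defaults to the population formula (`ddof=0`), while Equation 1 requires the sample standard deviation.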
Note that the mean and median are very close. This suggests that the data follow a symmetrical distribution about the mean value (e.g., a normal distribution). This may not always be the case: if the median is much less than the mean, a right-skewed distribution is most likely, and other distributions should be examined for use in creating a confidence interval.

We have already computed the sample standard deviation, so the last piece of information needed to construct the confidence interval is the t-factor. Since we have 10 samples, this leaves us with 9 degrees of freedom (ν = 9). The level of confidence is left up to the researcher, but 95% is commonly used. Reading from the included table, the two-sided t-factor at 95% confidence is 2.262. All that is left is to plug the known quantities into (1), which gives us:
5.0166 ± 0.3910
The uncertainty in the variable is therefore ±0.3910 at a confidence level of 95%. The interval itself represents the probability, or level of confidence, that the actual population mean (μ) of the data lies within that range.
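The whole calculation can be wrapped in a short function implementing Equation 1; the sample values below are hypothetical stand-ins, not the article's original data:

```python
import numpy as np
from scipy import stats

def mean_confidence_interval(data, confidence=0.95):
    """Two-sided t-based confidence interval for the population mean (Eq. 1).

    Returns (sample mean, half-width of the interval).
    """
    data = np.asarray(data, dtype=float)
    n = data.size
    mean = data.mean()
    s = data.std(ddof=1)  # sample standard deviation
    t_factor = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    half_width = t_factor * s / np.sqrt(n)
    return mean, half_width

# hypothetical 10-point sample (not the article's data)
sample = [4.9, 5.3, 4.6, 5.1, 5.8, 4.4, 5.0, 5.5, 4.8, 5.2]
m, u = mean_confidence_interval(sample)
print(f"{m:.4f} ± {u:.4f}")
```

The same routine works unchanged for any sample size; only the degrees of freedom and t-factor adjust automatically.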
t Distribution Table
- ↑ Coleman, H.W., and Steele, W.G. (2010). Experimentation and Uncertainty Analysis for Engineers.