
Friday, July 24, 2009

Parametric tests Versus Non parametric tests

Prepared on 24th July 2009

“The fewer or weaker the assumptions, the more general the conclusions.”

In statistics, unfortunately, the most powerful tests are those with the most extensive and stringent assumptions. With this assertion in mind, I will briefly present the prominent discussion of parametric vs. non-parametric tests. The following write-up reflects my personal (and admittedly nascent) bias toward non-parametric tests.

Parametric tests (such as the t-test and F-test) are tests that make certain assumptions about the population from which the samples are drawn. The strength of the results of such tests depends on the validity of those stringent conditions/assumptions. The prominent ones are:

  • Observations must be independent
  • Observations must be drawn from normally distributed populations
  • Those populations must have the same variance (homogeneity of variance)
  • Variables are measured on at least an interval scale
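
A minimal sketch of checking these assumptions in practice, assuming scipy/numpy and made-up data (the group names and parameters here are purely illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=10.0, scale=2.0, size=30)  # hypothetical sample A
group_b = rng.normal(loc=11.0, scale=2.0, size=30)  # hypothetical sample B

# Normality of each sample (Shapiro-Wilk test)
_, p_norm_a = stats.shapiro(group_a)
_, p_norm_b = stats.shapiro(group_b)

# Homogeneity of variance (Levene's test)
_, p_var = stats.levene(group_a, group_b)

print(f"normality p-values: {p_norm_a:.3f}, {p_norm_b:.3f}")
print(f"equal-variance p-value: {p_var:.3f}")

# Only when these assumptions look reasonable is the classic t-test well-founded.
t_stat, p_val = stats.ttest_ind(group_a, group_b)
print(f"t-test p-value: {p_val:.3f}")
```

If the normality or equal-variance checks fail, that is precisely the situation in which a non-parametric alternative becomes attractive.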

Non-parametric tests make no assumptions about the shape of the population, but they do share a few relatively weak assumptions with the parametric tests. These are:

  • Observations must be independent
  • Variables under study have underlying continuity.

However, the concept of power-efficiency ensures that, by increasing the sample size, non-parametric tests can achieve the same power as parametric tests. Moreover, the criticism that non-parametric tests discard information (some ignore the sign of differences, some convert measurements into ranks) may be answered by eliciting answers to the following questions:
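The power-efficiency idea can be illustrated with a rough simulation (assuming numpy/scipy; sample sizes and the mean shift are made up): on normal data, the Mann-Whitney U test recovers roughly the power of the t-test once its sample size is modestly increased.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def power(test, n, trials=2000, shift=1.0):
    """Fraction of trials in which the test detects a true mean shift."""
    hits = 0
    for _ in range(trials):
        a = rng.normal(0.0, 1.0, size=n)
        b = rng.normal(shift, 1.0, size=n)
        _, p = test(a, b)
        hits += p < 0.05
    return hits / trials

t_power = power(stats.ttest_ind, n=20)
# The asymptotic relative efficiency of Mann-Whitney vs. the t-test on
# normal data is 3/pi (about 0.955), so roughly 20/0.955 ~ 21 observations
# per group should give comparable power.
u_power = power(stats.mannwhitneyu, n=21)
print(f"t-test power (n=20):       {t_power:.2f}")
print(f"Mann-Whitney power (n=21): {u_power:.2f}")
```

The two estimated powers come out close, which is the sense in which a slightly larger sample "buys back" the power lost by ranking the data.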

  • How important is it that the conclusions drawn from the research apply to populations in general, rather than only to normally distributed populations?

  • Which of the two families of tests uses the available information appropriately?

The answer to the former question is contained in the question itself. The answer to the latter may be viewed from the perspective of the assumptions one is willing to make about the (usually unknown) populations one deals with. The comparison of parametric and non-parametric tests may be highlighted by a short summary of the advantages and disadvantages of non-parametric tests.

Advantages of non-parametric tests:

  • The probability statements obtained from non-parametric tests are exact, regardless of the shape of the underlying population distribution.

  • For very small samples (say, N = 8), there is no alternative to non-parametric tests unless the parameters of the underlying population are known exactly.

  • Non-parametric tests are suitable when the samples are drawn from populations with different variances.

  • There is no alternative to non-parametric tests when the samples involve nominal data.
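
As a small illustration of the nominal-data case (assuming scipy; the counts below are made up), the chi-square test of independence works on a contingency table of pure category counts, with no measurement scale at all:

```python
from scipy import stats

# Hypothetical 2x2 contingency table: preference (yes/no) by group.
table = [[30, 10],
         [20, 25]]

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
```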

Disadvantages of non-parametric tests:

  • When the samples satisfy the underlying assumptions of parametric tests, applying non-parametric tests is wasteful, given their lower power-efficiency.
  • The unstructured availability of literature on non-parametric tests may lead the researcher to employ parametric tests even when their validity is questionable.

The author has benefited heavily from extensive and provocative discussions with his fellow doctoral student Yoonus C. A at IFMR, Chennai, and from the writings of Sidney Siegel. The comments on the draft by Nandhini R. are highly appreciated. Finally, the author is solely responsible for any mistakes, and constructive comments are most welcome.

Thursday, October 16, 2008

Basic Statistics: Hypothesis Testing

A hypothesis is a tentative assumption about a population parameter. Such a tentative statement is called the null hypothesis, and the statement opposed to it is the alternative hypothesis. Usually, the research hypothesis is expressed as the alternative hypothesis.

Type I and Type II errors: As hypothesis tests are based on sample information, there is always a possibility of error.

Type I error: rejecting the null hypothesis when it is true
Type II error: accepting the null hypothesis when it is false

Level of significance: the probability of making a Type I error when the null hypothesis is true. Applications of hypothesis testing that control only for the Type I error are called "significance tests".

Most hypothesis tests control for the probability of making a Type I error; they do not control for the probability of making a Type II error. Hence, even if we decide to accept the null, we cannot determine how confident we can be in that decision. Because of the uncertainty associated with making a Type II error, it is suggested that the statement "do not reject the null" be used instead of "accept the null". In effect, by not directly accepting the null, one avoids the risk of making a Type II error.

Whenever the probability of making a Type II error is not controlled, the appropriate conclusions are only "do not reject the null" or "reject the null".
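
The meaning of the significance level can be seen in a quick simulation (assuming numpy/scipy; sample sizes and the number of trials are arbitrary choices): when the null hypothesis is true, a test run at alpha = 0.05 rejects in roughly 5% of repeated samples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, trials = 0.05, 5000

rejections = 0
for _ in range(trials):
    # The null hypothesis is true: both samples come from the same population.
    a = rng.normal(0.0, 1.0, size=25)
    b = rng.normal(0.0, 1.0, size=25)
    _, p = stats.ttest_ind(a, b)
    rejections += p < alpha

type_i_rate = rejections / trials
print(f"Estimated Type I error rate: {type_i_rate:.3f}")  # close to 0.05
```

This is exactly what "controlling for the Type I error" means; nothing in this setup says anything about how often a false null would be detected, which is why the Type II error remains uncontrolled.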


Constructive comments and additions are encouraged!