Assignment No. 2 Nihal RollNo-08

Name: Nihal R. Dalvi
Roll No: 08
Advanced Business Analytics – II
Assignment No. 2 (Pitfalls of Data Analysis)

The Problem with Statistics

There is a pervasive notion that we can prove anything with statistics. This is true only when statistics are used improperly. "Lies, damned lies, and statistics" is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments. It is also sometimes used colloquially to cast doubt on statistics offered to prove an opponent's point.

Sources of Bias

Bias is the tendency of a statistic to overestimate or underestimate a parameter.

Representative Sampling: Ideally, the sample is chosen by selecting members of the population at random, with each member having an equal probability of being selected. Any departure from this kind of random selection is therefore a source of bias.

Statistical Assumptions: Many tests assume normally distributed data. If the sample distribution is non-normal, we can apply a transformation. However, this has dangers as well: an ill-considered transformation can do more harm than good in terms of the interpretability of results.

Errors in Methodology

Statistical Power: The power of a test of statistical significance is the probability that it will reject a false null hypothesis. Power is the complement of β, the probability of making a Type II error: power = 1 - β. Statistical power is affected chiefly by the size of the effect and the size of the sample used to detect it. Bigger effects are easier to detect than smaller ones, and large samples offer greater test sensitivity than small samples.

Multiple Comparisons: In statistics, the multiple comparisons problem occurs when one considers a set of statistical inferences simultaneously, or infers a subset of parameters selected on the basis of the observed values. In certain fields it is known as the look-elsewhere effect.
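To make the multiple comparisons problem concrete, here is a minimal sketch of how the chance of at least one false positive grows with the number of tests. It assumes the tests are independent and that each uses the same significance level; the function name `familywise_error_rate` and the value alpha = 0.05 are illustrative choices, not part of the assignment.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across num_tests tests.

    Assumes every null hypothesis is true and the tests are independent,
    so the chance of no false positive at all is (1 - alpha) ** num_tests.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Illustrative significance level; not taken from the assignment.
    alpha = 0.05
    for num_tests in (1, 5, 20, 100):
        rate = familywise_error_rate(alpha, num_tests)
        print(f"{num_tests:>3} tests -> P(at least one false positive) = {rate:.3f}")
```

Under these assumptions, running 20 tests at the 0.05 level already gives roughly a 64% chance of at least one spurious "discovery".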
Multiple comparisons arise when a statistical analysis involves multiple statistical tests, each of which has the potential to produce a discovery. The more inferences are made, the more likely erroneous inferences are to occur.

Measurement Error: Measurement error is the difference between a measured quantity and its true value. It includes random error (naturally occurring errors that are to be expected with any experiment) and systematic error (caused, for example, by a mis-calibrated instrument that affects all measurements). Two characteristics of measurement that are particularly important in psychological measurement are reliability and validity. Reliability refers to the ability of a measurement instrument to measure the same thing each time it is used, and validity is the extent to which the indicator measures the thing it was designed to measure. Measurement errors can quickly grow in size when used in formulas. To account for this, we should use a formula for error propagation whenever we use uncertain measures in an experiment to calculate something else.

Problems with Interpretation

Confusion over Significance: Statistical significance is often confused with practical importance. A reasonable way to handle this is to cast results in terms of effect sizes, so that the size of the effect is presented in terms that make quantitative sense. A p-value merely indicates the probability of obtaining data at least as extreme as those observed if the null model were true; it has little to say about the size of a deviation from that model.

Precision and Accuracy: Accuracy refers to the closeness of a measured value to a standard or known value. Precision refers to the closeness of two or more measurements to each other. A measurement system can be accurate but not precise, precise but not accurate, neither, or both; it is considered valid if it is both accurate and precise.
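The distinction between accuracy and precision can be illustrated with a small sketch. The readings, the reference value of 100.0, and the helper names `accuracy_error` and `precision_spread` are all hypothetical, chosen only to show one data set that is precise but not accurate and one that is accurate but not precise.

```python
from statistics import mean, stdev


def accuracy_error(measurements, true_value):
    """Gap between the average reading and the known value (smaller = more accurate)."""
    return abs(mean(measurements) - true_value)


def precision_spread(measurements):
    """Sample standard deviation of the readings (smaller = more precise)."""
    return stdev(measurements)


# Hypothetical reference value and readings, invented for illustration.
TRUE_VALUE = 100.0
precise_but_biased = [104.9, 105.0, 105.1, 105.0]  # tightly clustered, off target
accurate_but_noisy = [92.0, 108.0, 97.0, 103.0]    # centred on target, widely spread

if __name__ == "__main__":
    for label, data in (("precise but biased", precise_but_biased),
                        ("accurate but noisy", accurate_but_noisy)):
        print(f"{label}: accuracy error = {accuracy_error(data, TRUE_VALUE):.2f}, "
              f"spread = {precision_spread(data):.2f}")
```

The first set has a small spread but sits about five units away from the reference value, while the second averages out to the reference value but scatters widely around it.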
Causality: Causality is the natural or worldly agency or efficacy that connects one process (the cause) with another process or state (the effect), where the first is partly responsible for the second, and the second is partly dependent on the first. Statistics and economics usually employ pre-existing data or experimental data to infer causality by regression methods. The bottom line on causal inference is that we must have random assignment.

Graphical Representations: The Lie Factor is the ratio of the proportional change shown by the graphic elements to the proportional change in the quantities they represent. The most informative graphics are those with a Lie Factor of 1. In addition, changes in the scale of the graphic should always correspond to changes in the data being represented. Another trouble spot with graphs is multidimensional variation. This occurs where two-dimensional figures are used to represent one-dimensional values, so that the area of the figure can exaggerate the change in the data.
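As a rough sketch of the Lie Factor calculation described above, the following compares the relative change drawn in a chart with the relative change in the underlying data. The function names and the chart figures (bar heights of 10 and 30 units for data values of 100 and 150) are hypothetical and serve only to illustrate the ratio.

```python
def relative_change(first: float, second: float) -> float:
    """Proportional change from first to second; first must be non-zero."""
    return (second - first) / first


def lie_factor(graphic_first: float, graphic_second: float,
               data_first: float, data_second: float) -> float:
    """Ratio of the change shown in the graphic to the change in the data.

    A value near 1 means the graphic is faithful to the data. The ratio is
    undefined when the data do not change at all (division by zero).
    """
    return (relative_change(graphic_first, graphic_second)
            / relative_change(data_first, data_second))


if __name__ == "__main__":
    # Hypothetical chart: the bars grow from 10 to 30 units (a 200% increase)
    # while the data grow from 100 to 150 (a 50% increase).
    print(lie_factor(10.0, 30.0, 100.0, 150.0))  # prints 4.0
```

In this invented example the graphic overstates the change in the data by a factor of four.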