Bayesian inference
From Wikipedia, the free encyclopedia
In statistics, Bayesian inference is a method of inference in which Bayes' rule is used to update the probability estimate for a hypothesis as additional evidence is learned. Bayesian updating is an important technique throughout statistics, and especially in mathematical statistics: exhibiting a Bayesian derivation for a statistical method automatically ensures that the method works as well as any competing method, for some cases. Bayesian updating is especially important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a range of fields including science, engineering, medicine, and law.

In the philosophy of decision theory, Bayesian inference is closely related to discussions of subjective probability, often called Bayesian probability. Bayesian probability provides a rational method for updating beliefs;[1][2] however, non-Bayesian updating rules are compatible with rationality, according to Ian Hacking and Bas van Fraassen.
Contents
1 Introduction to Bayes' rule
  1.1 Formal
  1.2 Informal
  1.3 Bayesian updating
2 Formal description of Bayesian inference
  2.1 Definitions
  2.2 Bayesian inference
  2.3 Bayesian prediction
3 Inference over exclusive and exhaustive possibilities
  3.1 General formulation
  3.2 Multiple observations
  3.3 Parametric formulation
4 Mathematical properties
  4.1 Interpretation of factor
  4.2 Cromwell's rule
  4.3 Asymptotic behaviour of posterior
  4.4 Conjugate priors
  4.5 Estimates of parameters and predictions
5 Examples
  5.1 Probability of a hypothesis
  5.2 Making a prediction
6 In frequentist statistics and decision theory
  6.1 Model selection
7 Applications
  7.1 Computer applications
  7.2 In the courtroom
  7.3 Other
8 Bayes and Bayesian inference
9 History
10 Notes
11 References
12 Further reading
  12.1 Elementary
  12.2 Intermediate or advanced
13 External links

Bayesian inference - Wikipedia, the free encyclopedia (file:///F:/STUDIES/baysian/Bayesian_inference.htm, saved 11-08-2012 15:14)
Introduction to Bayes' rule
Main article: Bayes' rule
See also: Bayesian probability
Formal
Bayesian inference derives the posterior probability as a consequence of two antecedents, a prior probability and a likelihood function derived from a probability model for the data to be observed. Bayesian inference computes the posterior probability according to Bayes' rule:

    P(H | E) = P(E | H) · P(H) / P(E)

where "|" means "given".

H stands for any hypothesis whose probability may be affected by data (called evidence below). Often there are competing hypotheses, from which one chooses the most probable.

The evidence E corresponds to data that were not used in computing the prior probability.

P(H), the prior probability, is the probability of H before E is observed. This indicates one's preconceived beliefs about how likely different hypotheses are, absent evidence regarding the instance under study.

P(H | E), the posterior probability, is the probability of H given E, i.e., after E is observed. This tells us what we want to know: the probability of a hypothesis given the observed evidence.

P(E | H), the probability of observing E given H, is also known as the likelihood. It indicates the compatibility of the evidence with the given hypothesis.

P(E) is sometimes termed the marginal likelihood or model evidence. This factor is the same for all possible hypotheses being considered. (This can be seen by the fact that the hypothesis H does not appear anywhere in the symbol, unlike for all the other factors.) This means that this factor does not enter into determining the relative probabilities of different hypotheses.

Note that what affects the value of P(H | E) for different values of H is only the factors P(E | H) and P(H), which both appear in the numerator, and hence the posterior probability is proportional to both. In words:

(more exactly) The posterior probability of a hypothesis is determined by a combination of the inherent likeliness of a hypothesis (the prior) and the compatibility of the observed evidence with the hypothesis (the likelihood).

(more concisely) Posterior is proportional to prior times likelihood.
Note that Bayes' rule can also be written as follows:

    P(H | E) = (P(E | H) / P(E)) · P(H)

where the factor P(E | H) / P(E) represents the impact of E on the probability of H.
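The rule can be sketched in a few lines of code. The numbers below (a hypothetical diagnostic test with 1% prevalence, 90% sensitivity, and a 5% false-positive rate) are illustrative assumptions, not figures from the article:

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
# Hypothetical numbers for a diagnostic-test example (assumed for illustration).
p_h = 0.01             # prior P(H): 1% of people have the condition
p_e_given_h = 0.90     # likelihood P(E|H): test sensitivity
p_e_given_not_h = 0.05 # false-positive rate P(E|not H)

# Marginal likelihood P(E): sum over the hypothesis and its negation
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Posterior P(H|E): probability of the condition given a positive test
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 4))
```

Note how a strong likelihood is tempered by a small prior: even after a positive test, the posterior stays well below one half.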
Informal
Rationally, Bayes' rule makes a great deal of sense. If the evidence doesn't match up with a hypothesis, one should reject the hypothesis. But if a hypothesis is extremely unlikely a priori, one should also reject it, even if the evidence does appear to match up.

For example, imagine that I have various hypotheses about the nature of a newborn baby of a friend, including:

H1: the baby is a brown-haired boy.
H2: the baby is a blond-haired girl.
H3: the baby is a dog.

Then consider two scenarios:

1. I'm presented with evidence in the form of a picture of a blond-haired baby girl. I find this evidence supports H2 and opposes H1 and H3.
2. I'm presented with evidence in the form of a picture of a baby dog. I don't find this evidence supports H3, since my prior belief in this hypothesis (that a human can give birth to a dog) is extremely small.

The critical point about Bayesian inference, then, is that it provides a principled way of combining new evidence with prior beliefs, through the application of Bayes' rule. (Contrast this with frequentist inference, which relies only on the evidence as a whole, with no reference to prior beliefs.) Furthermore, Bayes' rule can be applied iteratively: after observing some evidence, the resulting posterior probability can then be treated as a prior probability, and a new posterior probability computed from new evidence. This allows for Bayesian principles to be applied to various kinds of evidence, whether viewed all at once or over time. This procedure is termed Bayesian updating.
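The iterative procedure can be sketched numerically. In this illustration (the hypotheses and numbers are assumed, not from the article), the hypothesis is that a coin is biased toward heads with P(heads) = 0.8, against the alternative of a fair coin; after each flip the posterior becomes the prior for the next update:

```python
# Iterative Bayesian updating for a binary hypothesis:
# after each observation, posterior becomes the next prior.

def update(prior, likelihood_h, likelihood_not_h):
    """One application of Bayes' rule for hypothesis H vs. not-H."""
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence

# Hypothetical example: H = "coin lands heads with probability 0.8",
# alternative = fair coin. Start with an even prior.
p = 0.5
for flip in ["H", "H", "T", "H"]:
    if flip == "H":
        p = update(p, 0.8, 0.5)   # likelihood of heads under each hypothesis
    else:
        p = update(p, 0.2, 0.5)   # likelihood of tails under each hypothesis
    print(f"after {flip}: P(biased) = {p:.3f}")
```

Whether the four flips are processed one at a time as above, or all at once as a single batch of evidence, the final posterior is the same; that equivalence is what makes sequential updating valid.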
Bayesian updating
Bayesian updating is widely used and computationally convenient. However, it is not the only updating rule that might be considered rational. Ian Hacking noted that traditional Dutch book arguments did not specify Bayesian updating: they left open the possibility that non-Bayesian updating rules could avoid Dutch books. Hacking wrote:[3] "And neither the Dutch book argument, nor any other in the personalist arsenal of proofs of the probability axioms, entails the dynamic assumption. Not one entails Bayesianism. So the personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a personalist could abandon the Bayesian model of learning from experience. Salt could lose its savour."

Indeed, there are non-Bayesian updating rules that also avoid Dutch books (as discussed in the literature on probability kinematics following the publication of Richard C. Jeffrey's rule, which applies Bayes' rule to the case where the evidence itself is assigned a probability (http://plato.stanford.edu/entries/bayes-theorem/)). The additional hypotheses needed to uniquely require Bayesian updating have been deemed to be substantial, complicated, and unsatisfactory.[4]
Formal description of Bayesian inference
Definitions
x, a data point in general. This may in fact be a vector of values.
θ, the parameter of the data point's distribution, i.e. x ~ p(x | θ). This may in fact be a vector of parameters.
α, the hyperparameter of the parameter, i.e. θ ~ p(θ | α). This may in fact be a vector of hyperparameters.
X, a set of n observed data points, i.e. x_1, …, x_n.
x̃, a new data point whose distribution is to be predicted.
Bayesian inference
The prior distribution is the distribution of the parameter(s) before any data is observed, i.e. p(θ | α).
The sampling distribution is the distribution of the observed data conditional on its parameters, i.e. p(X | θ). This is also termed the likelihood, especially when viewed as a function of the parameter(s), sometimes written L(θ | X) = p(X | θ).
The marginal likelihood (sometimes also termed the evidence) is the distribution of the observed data marginalized over the parameter(s), i.e. p(X | α) = ∫ p(X | θ) p(θ | α) dθ.
The posterior distribution is the distribution of the parameter(s) after taking into account the observed data. This is determined by Bayes' rule, which forms the heart of Bayesian inference:

    p(θ | X, α) = p(X | θ) p(θ | α) / p(X | α) ∝ p(X | θ) p(θ | α)

Note that this is expressed in words as "posterior is proportional to prior times likelihood", or sometimes as "posterior = prior times likelihood, over evidence".
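The proportionality can be made concrete with a small grid approximation: discretize the parameter, multiply prior by likelihood pointwise, and normalize by the evidence. The data (7 heads in 10 coin flips) and the uniform prior below are assumptions for illustration:

```python
# Grid approximation of "posterior ∝ prior × likelihood" for a coin's
# heads-probability θ. Data and prior are illustrative assumptions.
theta = [i / 100 for i in range(101)]        # discretized parameter grid
prior = [1.0 / len(theta)] * len(theta)      # uniform prior p(θ)

heads, flips = 7, 10                          # assumed observed data X
# Binomial likelihood kernel p(X | θ) (constant binomial coefficient omitted,
# since it cancels in the normalization)
like = [t**heads * (1 - t)**(flips - heads) for t in theta]

unnorm = [p * l for p, l in zip(prior, like)]  # prior × likelihood
evidence = sum(unnorm)                         # marginal likelihood p(X)
posterior = [u / evidence for u in unnorm]     # normalized posterior p(θ | X)

# The posterior mode sits at θ = 0.70 (grid index 70), matching the data
print(max(range(len(theta)), key=lambda i: posterior[i]))  # → 70
```

Dividing by the evidence is what turns the unnormalized product into a proper distribution summing to one, which is exactly the role p(X | α) plays in the rule above.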
Bayesian prediction
The posterior predictive distribution is the distribution of a new data point, marginalized over the posterior:

    p(x̃ | X, α) = ∫ p(x̃ | θ) p(θ | X, α) dθ

The prior predictive distribution is the distribution of a new data point, marginalized over the prior:

    p(x̃ | α) = ∫ p(x̃ | θ) p(θ | α) dθ

Bayesian theory calls for the use of the posterior predictive distribution to do predictive inference, i.e. to predict the distribution of a new, unobserved data point. Only this way is the entire posterior distribution of the parameter(s) used.
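A grid version of the posterior predictive, continuing the illustrative coin example (uniform prior, 7 heads in 10 flips; assumed numbers): the probability that the next flip is heads weights each candidate θ by its posterior mass, rather than plugging in a single point estimate:

```python
# Posterior predictive by marginalizing over a grid posterior for θ.
# Data and prior are illustrative assumptions, as in the grid example above.
theta = [i / 100 for i in range(101)]
heads, flips = 7, 10
unnorm = [t**heads * (1 - t)**(flips - heads) for t in theta]  # uniform prior
z = sum(unnorm)
posterior = [u / z for u in unnorm]

# p(next flip = heads | X) = Σ_θ p(heads | θ) p(θ | X)
p_next_heads = sum(t * p for t, p in zip(theta, posterior))
print(round(p_next_heads, 3))
```

Because the whole posterior is used, the prediction reflects the remaining uncertainty about θ, not just its most probable value; here it lands near 8/12 rather than at the mode 0.7.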
