Pearson Correlation
Context
In BayesiaLab's approach to learning and analyzing Bayesian networks, statistical concepts play a secondary role compared to concepts from the field of Information Theory.
Nevertheless, statistical measures, such as correlation, can provide certain insights that are unavailable from non-statistical measures.
Definition
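The Pearson Correlation Coefficient $R(X, Y)$ between two nodes $X$ and $Y$ is defined as the covariance of the two corresponding variables divided by the product of their standard deviations:
$$R(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
Where the covariance is defined by:
$$\mathrm{cov}(X, Y) = \sum_{x \in X} \sum_{y \in Y} P(x, y) \, (v_x - E(X)) \, (v_y - E(Y))$$
And the standard deviation:
$$\sigma_X = \sqrt{\sum_{x \in X} P(x) \, (v_x - E(X))^2}$$
$v_x$ is the value that is associated with the state $x$.
$E(X)$ is the Expected Value of the node $X$.
$P(x)$ is the marginal probability of state $x$ returned by the Bayesian network.
$P(x, y)$ is the joint probability of states $x$ and $y$ returned by the Bayesian network.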
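As an illustration only, the definition can be sketched in a few lines of Python: the covariance and the standard deviations are computed from a joint probability table and the values associated with the states. The function name `pearson_from_joint`, its arguments, and the example table are hypothetical and are not part of BayesiaLab; in practice, the probabilities would come from the Bayesian network.

```python
import math


def pearson_from_joint(joint, x_values, y_values):
    """Pearson correlation of two discrete nodes from their joint probabilities.

    joint[i][j] is the joint probability of the i-th state of X and the
    j-th state of Y; x_values and y_values hold the values of those states.
    The result is undefined (division by zero) if either node has zero variance.
    """
    # Marginal probabilities, obtained by summing the joint table.
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]

    # Expected values of both nodes.
    e_x = sum(p * v for p, v in zip(p_x, x_values))
    e_y = sum(p * v for p, v in zip(p_y, y_values))

    # Covariance: probability-weighted product of deviations over all state pairs.
    cov = sum(
        joint[i][j] * (x_values[i] - e_x) * (y_values[j] - e_y)
        for i in range(len(x_values))
        for j in range(len(y_values))
    )

    # Standard deviations from the marginal probabilities.
    sd_x = math.sqrt(sum(p * (v - e_x) ** 2 for p, v in zip(p_x, x_values)))
    sd_y = math.sqrt(sum(p * (v - e_y) ** 2 for p, v in zip(p_y, y_values)))

    return cov / (sd_x * sd_y)


# Hypothetical joint distribution of two binary nodes with state values {0, 1}.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
r = pearson_from_joint(joint, [0, 1], [0, 1])
print(round(r, 3))  # 0.6
```

In this made-up example, both nodes have an expected value of 0.5 and a standard deviation of 0.5, the covariance is 0.15, and the correlation is therefore 0.15 / 0.25 = 0.6.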
Special Considerations
To calculate the Pearson Correlation, BayesiaLab must use the values of node states.
In BayesiaLab, there are Discrete Nodes and Continuous Nodes with discretized numerical states. As a result, the value of a node's state may not always be apparent:
For Discrete Nodes that have states with integer or real values, BayesiaLab uses these numerical values directly.
For Discrete Nodes that have states without values, e.g., {red, green, blue}, BayesiaLab uses the indices of the states as values, i.e., {red, green, blue} would have the values {0, 1, 2} for the purpose of calculating the Pearson Correlation. Note that the index of states starts at 0.
For Continuous Nodes, BayesiaLab uses the mean value of each interval.
Please see Mean, Value, and Standard Deviations for a detailed discussion.
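The three rules above can be mimicked with a small Python sketch. It is not BayesiaLab code: the helper names `discrete_state_values` and `interval_mean_values` are hypothetical, and the second helper assumes that the mean value of an interval is the arithmetic mean of the data points that fall into it, which this page does not spell out.

```python
def discrete_state_values(states):
    """Values for a Discrete Node: numbers are used as-is, labels get 0-based indices."""
    numeric = all(
        isinstance(s, (int, float)) and not isinstance(s, bool) for s in states
    )
    if numeric:
        return [float(s) for s in states]
    return [float(i) for i in range(len(states))]


def interval_mean_values(samples, edges):
    """Values for a discretized Continuous Node: mean of the samples per interval.

    Assumptions (illustrative only): interval k covers [edges[k], edges[k+1]),
    the last interval also includes its upper bound, and every interval
    contains at least one sample.
    """
    last = len(edges) - 2
    buckets = [[] for _ in range(len(edges) - 1)]
    for s in samples:
        for k in range(len(edges) - 1):
            upper_ok = s < edges[k + 1] or (k == last and s == edges[k + 1])
            if edges[k] <= s and upper_ok:
                buckets[k].append(s)
                break
    return [sum(b) / len(b) for b in buckets]


print(discrete_state_values(["red", "green", "blue"]))  # [0.0, 1.0, 2.0]
print(discrete_state_values([10, 20.5, 30]))            # [10.0, 20.5, 30.0]
print(interval_mean_values([1, 2, 6, 8], [0, 5, 10]))   # [1.5, 7.0]
```

Because label-only states are replaced by their indices, the order of the states affects the resulting correlation for such nodes.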