I feel distribution is about a noisegen. There's always some natural source of randomness --

- people making choices

- coin flip

- height of people

All of these could be simulated then characterized by some infinitely sophisticated computer "noisegen". For each noisegen we can sample it 1000 times and plot a histogram. If we sample infinite times, we get a pdf curve like

* uniform distribution

* binomial distribution

* normal distribution

The natural distributions may not follow any mathematically well-known distribution. If you analyze some astronomical occurrence, perhaps there's no math formula to describe it. In fact, even the familiar thick-tail may not have a closed-form pdf.

Nevertheless the probability distribution is arguably the most "needed" foundation of statistics. Note prob dist is about the Next noisegen output. (I don't prefer "future" -- When Galileo dropped his 2 cannonballs, no one know for sure which one would land first, even though it was in the past.) Every noisegen is presumed consistent though its internal parameters may change over time.

I feel probability study is about theoretical models of the distribution; statistics is about picking/adjusting these models to fit observed data. Here's a good contrast -- In device physics and electronic circuits, everyone uses fundamental circuit models. Real devices always show deviation from the models, but the deviations are small and well understood.

In probability theories, the noisegen is perfect, consistent, stable and "predictable". In statistics we don't know how many noisegens are at play, which well-known noisegen is the closest, or how the noisegen evolves over time.

I feel probability theories build theoretical noisegen models largely to help the statisticians.

## No comments:

Post a Comment