Blog: Intuitive Introduction to Statistics

A straightforward introduction to statistics and its use cases.

Welcome to my first blog post on statistics. I’ve been meaning to kick-start my blogging journey with statistics for some time. Although I took a formal statistics course as an undergraduate a few years ago, it is only now that I’m truly intrigued by the subject. This post is a gentle, straightforward introduction to the world of statistics, based on the book ‘Statistics in Plain English’ by Timothy C. Urdan, which I’m currently reading. I urge anyone looking for a quick refresher to grab a copy. So, without further ado, let’s jump right in.

Defining statistics

In layman’s terms, statistics is about getting your head around data. Formally, it is the science of collecting, organizing, analyzing, interpreting, and presenting data. Hardly any discipline is untouched by statistics: business, finance, the physical and social sciences, the humanities, and government, to name a few. It is a widely used tool for making sense of the profusion of data.

In statistical terminology, a statistic (singular) also refers to a single number computed from a set of data, typically a sample. For instance, the arithmetic mean of a set of numerical data is a statistic of that data. More on this in later posts.
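To make the idea of a statistic concrete, here is a minimal sketch using Python's standard `statistics` module. The exam scores are invented purely for illustration; they do not come from the book or from any real study.

```python
from statistics import mean

# Hypothetical exam scores, made up for this illustration.
scores = [72, 85, 90, 66, 78]

# The arithmetic mean condenses the whole list into a single number.
# That single number is a statistic of this data set.
average_score = mean(scores)

print(f"Mean score: {average_score}")
```

Any other single number computed from the same list, such as the largest score or the middle score, would also count as a statistic of that data.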

Why study statistics?

Statistics is all around us. We come across it on social media, perhaps as survey results such as the average income of an American household; on news channels projecting election outcomes from polls; or on a smartphone that reports the percentage of time you spend each day on your apps. It is hard to imagine life without statistics. People who do not understand statistics are forced either to accept the interpretations that statisticians offer or to reject them completely. Distrust of statistics among such people is widespread. They believe that statisticians can lie with statistics and make them say whatever they want. Can they? In fact, when a statistician calculates the statistics correctly, he or she cannot make them say anything other than what they say. Statistics never lie if done systematically and correctly. Interpretations may differ, though.

“Statistics don’t lie. It’s the people who make up the statistics that lie.” ~ George W. Buck

Being skeptical is one thing; a better option is to understand how statistics actually work and then use that understanding to interpret, for oneself, the statistics one sees and hears.

Uses of statistics

It would be wrong, for instance, to conclude that there has been zero infrastructural progress in my country based only on my personal experience of the city where I live. This is an example of anecdotal data, and it is idiosyncratic. A better approach is to collect data on infrastructure development, or on citizens’ sentiment about it, in hundreds of cities and towns across the country and let the statistics speak. Statistics allow researchers to collect information from a large number of people and then summarize their experiences, so that they can make general statements about a population. Statistics do not describe any one individual’s experience; they deal with the typical or average experience.

Two of the most common uses of statistics are as follows:

1. Reach conclusions about general differences between groups: Are women taller than men in Iceland? Do men and women differ in their enjoyment of a certain movie? Do cancer patients survive longer using one drug than another? Is one method of teaching children to read more effective than another? To answer these questions, we need to collect data from randomly selected samples and compare the data using statistics. The resulting conclusions are more trustworthy than those drawn from anecdotal evidence.

2. Check whether scores on two variables are related: Is smoking cigarettes related to the likelihood of developing lung cancer? If we know the area of a house, how accurately can we predict its price? Is it possible to predict the likelihood of violent crime in a neighborhood from the average household income? Researchers have examined these and thousands of other questions using statistics!
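The two use cases above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the reading scores, the house areas and prices, and the helper name `pearson_r` are all hypothetical. The first part compares the means of two groups; the second computes the Pearson correlation coefficient, one common measure of how strongly two variables move together.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of products of deviations from the means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations of each variable.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


# Use case 1: difference between groups.
# Hypothetical reading scores under two teaching methods.
method_a = [78, 82, 75, 90, 85]
method_b = [70, 74, 68, 80, 73]
mean_difference = mean(method_a) - mean(method_b)
print(f"Difference in mean scores: {mean_difference}")

# Use case 2: relationship between two variables.
# Hypothetical house areas (square feet) and prices (thousands).
area_sqft = [800, 1000, 1200, 1500, 2000]
price_k = [160, 195, 250, 290, 410]
r = pearson_r(area_sqft, price_k)
print(f"Correlation between area and price: {r:.3f}")
```

A value of r close to 1 means the two variables rise together, and a value close to 0 means there is little linear relationship. Note that a gap between two sample means is only a starting point: deciding whether it reflects a real difference in the wider population is the job of inferential statistics.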

After going through the above examples, you can probably think of many interesting questions of your own that statistics can help answer. Look around you now and you’ll find one immediately.

Types of statistics

Statistics can be broadly classified into two categories: descriptive and inferential. Descriptive statistics is all about finding a few indicative numbers that represent the data as a whole. Measures of central tendency such as the mean, median, and mode, as well as measures of variability such as the variance and standard deviation, are all examples of descriptive statistics. Inferential statistics is used when we use the available data to draw conclusions about a larger group. Understanding inferential statistics boils down to understanding the difference between a sample and a population, on which I’ll write a detailed post in the future. Intuitively, a sample is a subset of a population. For example, we might survey 10 randomly chosen students to gauge the stress level of a class of 50. We clearly haven’t surveyed the entire population, only a sample. If we apply the right mathematical operations to the sample, we may be able to draw inferences or conclusions about the population as a whole. That is the big picture of what inferential statistics is all about.
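Here is a rough sketch of both ideas using the class-of-50 example. The stress scores are randomly generated stand-ins on an assumed 1 to 10 scale, not real survey data, and all variable names are hypothetical. The first part describes the whole class with a handful of numbers; the second surveys only 10 students and uses their mean as an estimate of the class mean.

```python
import random
from statistics import mean, median, mode, pstdev, pvariance

random.seed(42)  # fixed seed so the sketch gives the same numbers every run

# Hypothetical population: stress scores (1 to 10) for a class of 50 students.
population = [random.randint(1, 10) for _ in range(50)]

# Descriptive statistics: a few numbers that summarize all 50 scores.
summary = {
    "mean": mean(population),
    "median": median(population),
    "mode": mode(population),          # most frequent score
    "variance": pvariance(population),  # population variance
    "std_dev": pstdev(population),      # population standard deviation
}
for name, value in summary.items():
    print(f"{name}: {value}")

# Inferential idea: survey only 10 randomly chosen students
# and use their mean to estimate the mean of the whole class.
sample = random.sample(population, 10)
sample_mean = mean(sample)
estimate_error = abs(sample_mean - summary["mean"])
print(f"Sample mean: {sample_mean}, off by {estimate_error:.2f}")
```

The sample mean will usually land near the class mean but rarely match it exactly. Quantifying how far off it is likely to be is a large part of what inferential statistics covers.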

Conclusion

Statistics is a crucial part of how we make discoveries in science, make decisions based on data, and make predictions. It is used in almost every field to make sense of the vast amount of data available. Learning statistics can help you make an impact in your chosen field! Thanks for reading.

Source: Artificial Intelligence on Medium