The main concepts covered are the fundamental statistical tools used in biostatistics to summarize, analyze, and interpret biological data. These include measures of central tendency (mean, median, mode), which describe the central or "typical" value of a dataset, and measures of dispersion (standard deviation, range, percentiles), which describe the spread or variability of the data.
In biology and other fields, data vary from one observation to the next. An average is a single value that represents the center of a whole dataset. Averages are crucial for summarizing findings from experiments, population studies, and health monitoring.
The mean is the most common type of average. It is calculated by summing all the values in a dataset and dividing by the total number of values.
| Data Type | Formula | Explanation |
|---|---|---|
| Ungrouped Data | $\bar{x} = \frac{\sum x}{n}$ | $x$ = Each individual value<br />$n$ = Total number of values |
| Grouped Data | $\bar{x} = \frac{\sum fx}{\sum f}$ | $f$ = Frequency of each class<br />$x$ = Midpoint of each class ([lower limit + upper limit] / 2)<br />$\sum f$ = Total number of values (sum of frequencies) |

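
The two mean calculations above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names, the `(lower, upper, frequency)` tuple layout, and the sample numbers are all assumptions made for the example.

```python
def mean_ungrouped(values):
    """Arithmetic mean of raw values: sum of x divided by n."""
    return sum(values) / len(values)


def mean_grouped(classes):
    """Mean of grouped data: sum of (f * midpoint) divided by sum of f.

    `classes` is a list of (lower_limit, upper_limit, frequency) tuples
    (a hypothetical layout chosen for this sketch).
    """
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f


# Illustrative, made-up data
leaf_lengths_cm = [4.0, 5.0, 5.0, 6.0, 10.0]
print(mean_ungrouped(leaf_lengths_cm))  # 6.0

age_classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
print(mean_grouped(age_classes))  # (5*2 + 15*5 + 25*3) / 10 = 16.0
```

The grouped mean is an estimate, because every observation in a class is treated as if it sat at the class midpoint.
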
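The median is the middle value in a dataset that has been arranged in ascending or descending order. It divides the dataset into two equal halves. With an odd number of values it is the single middle value; with an even number it is the mean of the two middle values.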
For grouped data, the median is found by first calculating the cumulative frequency and then identifying the class interval that contains the middle value.
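
A rough Python sketch of both median procedures is given below. The grouped version assumes the common textbook interpolation formula, median = L + ((N/2 − CF) / f) × h, where L is the lower boundary of the median class, CF the cumulative frequency before it, f its frequency, and h its width; the text above only describes locating the class, so treat the interpolation step as an assumption. Function names and data are hypothetical.

```python
def median_ungrouped(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(classes):
    """Estimate the median from (lower_boundary, upper_boundary, frequency) tuples.

    Classes must be listed in ascending order. Uses linear interpolation
    inside the median class (an assumed, commonly taught formula).
    """
    total = sum(f for _, _, f in classes)
    if total == 0:
        raise ValueError("classes must contain at least one observation")
    half = total / 2
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= half:
            return lower + (half - cumulative) / freq * (upper - lower)
        cumulative += freq


print(median_ungrouped([5, 1, 3]))     # 3
print(median_ungrouped([3, 1, 4, 2]))  # 2.5
print(median_grouped([(0, 10, 2), (10, 20, 4), (20, 30, 2)]))  # 15.0
```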
The mode is the value that appears most frequently in a dataset.
The modal group (or modal class) is the class interval with the highest frequency. The specific mode value is then calculated using a formula that considers the frequencies of the modal group and its adjacent groups.
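
As an illustration, the mode for both ungrouped and grouped data might be computed as in the sketch below. The grouped version assumes the widely used textbook formula l + ((f_m − f_1) / (2f_m − f_1 − f_2)) × h, with the symbols as defined in the legend that follows; a missing neighbouring group is treated as having frequency 0, which is a choice made for this example. Function names and data are hypothetical.

```python
from collections import Counter


def modes_ungrouped(values):
    """Return every value that shares the highest count (data can be multimodal)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def mode_grouped(classes):
    """Estimate the mode from (lower_boundary, upper_boundary, frequency) tuples."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))          # first modal group if there is a tie
    lower, upper, f_m = classes[i]
    f_1 = freqs[i - 1] if i > 0 else 0   # group before the modal group
    f_2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # group after it
    denominator = 2 * f_m - f_1 - f_2
    if denominator == 0:
        raise ValueError("formula undefined: neighbours match the modal frequency")
    return lower + (f_m - f_1) / denominator * (upper - lower)


print(modes_ungrouped([2, 3, 3, 5, 5, 7]))                    # [3, 5]
print(mode_grouped([(0, 10, 2), (10, 20, 6), (20, 30, 2)]))   # 15.0
```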
$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h$$

where $l$ = lower boundary of the modal group, $f_m$ = frequency of the modal group, $f_1$ = frequency of the group before the modal group, $f_2$ = frequency of the group after the modal group, and $h$ = class interval width.

Standard Deviation (SD) is a measure of dispersion or spread. It quantifies how much the individual data points in a set deviate from the mean.

| Data Type | Formula | Explanation |
|---|---|---|
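| Population | $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ | $\mu$ = Population mean<br />$N$ = Total population size |
| Sample | $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ | $\bar{x}$ = Sample mean<br />$n$ = Sample size |
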
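
The population and sample versions differ only in the divisor, which the following sketch tries to make concrete. It assumes the usual convention of dividing by N for a population and by n − 1 for a sample; the hand-written functions are hypothetical helpers, while `statistics.pstdev` and `statistics.stdev` are the standard-library equivalents.

```python
import math
import statistics


def population_sd(values):
    """Square root of the mean squared deviation from the population mean."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def sample_sd(values):
    """Same idea, but dividing by n - 1 (assumed sample convention)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample SD needs at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


# Illustrative, made-up measurements
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(data))  # 2.0
print(sample_sd(data))      # slightly larger than the population SD

# The standard library gives the same results
print(statistics.pstdev(data), statistics.stdev(data))
```

For the same data the sample SD is always a little larger than the population SD, and the gap shrinks as n grows.
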
The range is the simplest measure of variability. It is the difference between the highest and lowest values in a dataset.
A percentile is a value below which a certain percentage of the data falls. It is used to understand an individual value's rank within a larger dataset. For example, a measurement at the 90th percentile is higher than 90% of the values in the dataset.

Quartiles are specific percentiles that divide a dataset into four equal parts.
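
As a sketch, percentile rank and quartiles could be computed as shown below. `percentile_rank` is a hypothetical helper that applies the "percentage of values below" definition directly; `statistics.quantiles` is the standard-library function (Python 3.8+), and the `"inclusive"` interpolation method is only one of several conventions, so other tools may report slightly different quartiles. The heights are made-up data.

```python
import statistics


def percentile_rank(values, score):
    """Percentage of values strictly below `score`."""
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)


heights_cm = [150, 152, 155, 158, 160, 163, 165, 168, 170, 175]

print(percentile_rank(heights_cm, 165))  # 60.0 (six of ten values are lower)

# Three cut points that split the ordered data into four equal parts
q1, q2, q3 = statistics.quantiles(heights_cm, n=4, method="inclusive")
print(q1, q2, q3)

# The second quartile is the median (50th percentile)
print(q2 == statistics.median(heights_cm))  # True
```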
Q: What is the difference between measures of central tendency and measures of dispersion? A: Measures of central tendency (mean, median, mode) identify the center or typical value of a dataset. Measures of dispersion (range, standard deviation) describe how spread out the data is.
Q: Why is the median sometimes a better measure of central tendency than the mean? A: The median is not affected by extreme outliers (abnormally high or low values), whereas the mean can be significantly skewed by them.
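
A small numerical illustration of this point, using made-up values and the standard-library `statistics` module: replacing one value with an extreme outlier moves the mean a long way but leaves the median unchanged.

```python
import statistics

# Illustrative, made-up readings
clean = [10, 11, 12, 13, 14]
with_outlier = [10, 11, 12, 13, 100]  # last reading replaced by an outlier

print(statistics.mean(clean), statistics.median(clean))                # 12 12
print(statistics.mean(with_outlier), statistics.median(with_outlier))  # 29.2 12
```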
Q: What does a standard deviation of zero indicate? A: A standard deviation of zero means that all values in the dataset are identical; there is no variation or spread.
Q: What is the difference between ungrouped and grouped data? A: Ungrouped data consists of raw, individual values. Grouped data is organized into class intervals with a frequency count for each interval, which is useful for summarizing large datasets.
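
To make the distinction concrete, the sketch below turns raw (ungrouped) values into a grouped frequency table. The function name, the equal-width classes, and the convention that each class includes its lower limit but not its upper limit are assumptions made for this example.

```python
def group_into_classes(values, start, width, n_classes):
    """Count raw values into equal-width classes [lower, upper).

    Returns a list of (lower_limit, upper_limit, frequency) tuples.
    """
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"value {v} falls outside the class intervals")
        counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


# Illustrative, made-up raw data
raw = [12, 15, 21, 22, 27, 33, 35, 38, 39, 41]
for lower, upper, freq in group_into_classes(raw, start=10, width=10, n_classes=4):
    print(f"{lower}-{upper}: {freq}")
# 10-20: 2
# 20-30: 3
# 30-40: 4
# 40-50: 1
```
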
| Measure | Description | Sensitivity to Outliers |
|---|---|---|
| Mean | The arithmetic average (sum of values / count of values). | High |
| Median | The middle value in an ordered dataset. | Low |
| Mode | The most frequently occurring value. | Low |

| Measure | Description | Key Feature |
|---|---|---|
| Range | Difference between the highest and lowest value. | Simple but highly sensitive to outliers. |
| Standard Deviation | Typical distance of values from the mean (square root of the average squared deviation). | The most common measure of spread; a low SD means the data cluster close to the mean. |
| Percentile/Quartile | Indicates the rank of a value relative to the rest of the data. | Divides data into 100 (percentiles) or 4 (quartiles) parts. |