# R: Samples and Populations

This article shows how to either create a random population or import a dataset, and then take a random sample from it using `R`.

## What is a Sample?

A population has certain characteristics, called parameters. Examples include the average (`mean`) of the population and its `standard deviation`.
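As a minimal sketch of what such parameters look like in R, the block below builds a small hypothetical population (the values and the name `population_values` are made up for illustration) and computes its mean and standard deviation. R's built-in `sd()` divides by `N - 1`, so the population standard deviation is computed by hand here.

```r
# Hypothetical population of 10 values (illustrative only, not from the article)
population_values <- c(4, 8, 15, 16, 23, 42, 7, 11, 9, 5)

# Population mean
pop_mean <- mean(population_values)

# Population standard deviation: divides by N,
# unlike sd(), which divides by N - 1
pop_sd <- sqrt(mean((population_values - pop_mean)^2))

print(pop_mean)
print(pop_sd)
```

Because `sd()` uses the `N - 1` denominator, `sd(population_values)` is slightly larger than `pop_sd` for the same data.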

In some situations, you might not know what the population parameters are. To estimate one, you take a sample and study it.

## Taking a Sample

When you take a sample, you need to decide how many items to take. This is called the sample size, and we will refer to it as `n`. From the sample, we can calculate a statistic that is used to estimate a parameter.
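The idea of using a statistic to estimate a parameter can be sketched as follows. This example assumes a hypothetical population of 1,000 normally distributed values (the distribution, the seed, and the variable names are illustrative choices, not part of the original article) and uses the sample mean to estimate the population mean.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Hypothetical population of 1,000 values (assumed for illustration)
population_values <- rnorm(1000, mean = 50, sd = 10)

# Sample size
n <- 20

# Draw n items from the population without replacement
sample_values <- sample(population_values, n)

# The sample mean is a statistic that estimates the population mean
sample_mean <- mean(sample_values)
population_mean <- mean(population_values)

print(sample_mean)
print(population_mean)
```

The two printed values will usually be close but not identical. Larger values of `n` tend to bring the estimate closer to the parameter.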

# Generate a Population, Take a Sample

## Create a Random Population in R

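We use R's `matrix` object to hold a randomly generated population. The values come from `runif`, which by default draws numbers uniformly between 0 and 1.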

```
# Setting the seed of R's random number generator makes simulations
# and random objects reproducible. Uncomment the next line to use it.
# set.seed(5)
# Create a matrix object
nCols <- 5
nRows <- 3
population <- matrix(runif(nCols * nRows), ncol = nCols)
# Print the population
print(population)
```

Here's how to show only the first row of a population.

```
# Print the First Row of the Population
first_row <- population[1,]
print(first_row)
```

Here's how to show the first column of a population.

```
# Print the First Column of the Population
first_col <- population[,1]
print(first_col)
```
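The same bracket notation extends to single elements and to checking the shape of the matrix. The following is a small sketch under the same setup as above. The seed value and the names `element` and `first_row_matrix` are hypothetical, chosen only for illustration.

```r
set.seed(5)  # arbitrary seed for reproducibility

n_cols <- 5
n_rows <- 3
population <- matrix(runif(n_cols * n_rows), ncol = n_cols)

# Number of rows, then number of columns
print(dim(population))

# A single element: row 2, column 3
element <- population[2, 3]
print(element)

# By default, selecting one row returns a plain vector.
# drop = FALSE keeps the result as a 1-row matrix.
first_row_matrix <- population[1, , drop = FALSE]
print(first_row_matrix)
```

Keeping the matrix shape with `drop = FALSE` can be useful when later code expects to call `nrow()` or `ncol()` on the result.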

## Take a Sample from the Population

```
# Create a random population
nCols <- 5
nRows <- 3
population <- matrix(runif(nCols * nRows), ncol = nCols)
# Print the population
print(population)
# Draw a random sample of n values
# (sample() treats the matrix as a plain vector of values)
n <- 5
random_sample <- sample(population, n)
# Print the random sample on a single line
formatted <- paste(sprintf("%1.7f", random_sample), collapse = ", ")
cat(sprintf("Random sample of %d items: %s\n", n, formatted))
```
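Calling `sample()` on a matrix draws individual values, ignoring the row and column layout. Two variations are sketched below: sampling whole rows, and sampling with replacement. The seed and the names `row_ids`, `row_sample`, and `resample` are assumptions made for this illustration.

```r
set.seed(5)  # arbitrary seed for reproducibility

# Population of 3 rows and 5 columns
population <- matrix(runif(15), ncol = 5)

# Sample 2 of the 3 rows, keeping each row intact
row_ids <- sample(nrow(population), 2)
row_sample <- population[row_ids, , drop = FALSE]
print(row_sample)

# Sample 20 individual values with replacement.
# Without replace = TRUE this would fail, because only 15 values exist.
resample <- sample(population, 20, replace = TRUE)
print(resample)
```

Sampling without replacement (the default) never returns the same item twice, so the sample size cannot exceed the population size.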

# Import a Population, Take a Sample

In this example, we import CSV data from GitHub and take a random sample of its rows.

```
# Set the seed so that you can reproduce the same random results.
set.seed(10)
# Load the CSV data
df <- read.csv('https://raw.githubusercontent.com/thomaspernet/data_csv_r/master/data/women.csv', header = TRUE)
# Take a random sample of rows from the data
num_of_rows <- 10
my_sample <- df[sample(nrow(df), num_of_rows), ]
print(my_sample)
```
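Once rows have been sampled, a statistic can be computed from them and compared with the same figure for the full data. The sketch below uses R's built-in `women` data frame (columns `height` and `weight`) as an offline stand-in. Whether the CSV above has the same columns is an assumption, so adjust the column name to match your data.

```r
set.seed(10)

# Built-in dataset used as a stand-in for the imported CSV (assumption)
df <- datasets::women

# Take a random sample of 10 rows
num_of_rows <- 10
my_sample <- df[sample(nrow(df), num_of_rows), ]

# Statistic from the sample versus the value for the full data
sample_mean_height <- mean(my_sample$height)
population_mean_height <- mean(df$height)

print(sample_mean_height)
print(population_mean_height)
```

Here the full data frame plays the role of the population, so `population_mean_height` is the parameter and `sample_mean_height` is its estimate.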

# Resources

- Sampling Distributions on Khan Academy.
- Generating Random Samples from Other Distributions