Random Sampling
Random sampling refers to selecting a subset of elements from a population by chance, with the goal of estimating properties of the whole population from the sample. Some key algorithms:
- Simple random sampling - Sample with equal probability
- Stratified sampling - Partition into subgroups and sample from each
- Reservoir sampling - Sample fixed size subset from stream
Java - Simple random sample:
|
|
C++ - Stratified sampling:
|
|
Python - Reservoir sampling:
|
|
Random sampling provides unbiased estimation of population characteristics from a subset. Useful in statistics, surveys, machine learning.
Random sampling is the process of selecting a subset of items from a larger set, where each item has an equal chance of being chosen. It’s a key technique in statistics and machine learning for estimating population parameters and training models. The primary takeaway is that random sampling provides a quick, unbiased representation of the larger set.
Java Code for Random Sampling
In Java, you can use the java.util.Random
class for generating random numbers. The following code randomly samples k
elements from an array arr
of integers.
|
|
sample(int[] arr, int k)
randomizes the positions ofk
elements in the arrayarr
.
C++ Code for Random Sampling
In C++, you can use <random>
library. The function below performs random sampling on a C++ vector.
|
|
sample(std::vector<int>& arr, int k)
swapsk
random positions in the vectorarr
.
Python Code for Random Sampling
In Python, the random
library can be used. Below is a Python function that performs the same task.
|
|
sample(arr, k)
replacesk
elements in the listarr
with randomly selected elements from the list.
All these implementations use the Fisher-Yates shuffle algorithm to select a random sample. The sample
function modifies the original array or vector, placing the random sample in the first k
positions.