Copy-paste ready Python examples for every statistical test
Every example below is 100% copy-paste ready. Just plug in your data and run! Each code block includes all imports, calculations, and interpretations.
No PhD required. No complicated setup. Just practical Python code for real marketing analysis.
Install these packages once (skip if already installed):
pip install numpy scipy pandas statsmodels matplotlib
✅ All examples below use these packages
"Comparing two groups (like A/B testing)"
✅ Comparing averages of 2 groups
✅ A/B testing conversion rates, email open rates, etc.
✅ Before/after comparisons
import numpy as np
from scipy import stats

# Independent two-sample t-test: are the mean conversion rates of two
# campaigns different?
# Your data: conversion rates (%) for two campaigns
campaign_a = np.array([5.2, 6.1, 5.8, 6.3, 5.5, 6.0, 5.7])
campaign_b = np.array([4.8, 5.2, 5.0, 5.5, 4.9, 5.1, 5.3])

# Run the test (two-sided; assumes equal variances by default)
t_stat, p_value = stats.ttest_ind(campaign_a, campaign_b)

# Calculate Cohen's d (effect size) with the degrees-of-freedom-weighted
# pooled SD, so the formula stays correct even for unequal group sizes.
mean_diff = np.mean(campaign_a) - np.mean(campaign_b)
n_a, n_b = len(campaign_a), len(campaign_b)
pooled_var = ((n_a - 1) * np.var(campaign_a, ddof=1) +
              (n_b - 1) * np.var(campaign_b, ddof=1)) / (n_a + n_b - 2)
pooled_std = np.sqrt(pooled_var)
cohens_d = mean_diff / pooled_std

# Results
print(f"Campaign A mean: {np.mean(campaign_a):.2f}%")
print(f"Campaign B mean: {np.mean(campaign_b):.2f}%")
print(f"P-value: {p_value:.4f}")
print(f"Cohen's d: {cohens_d:.3f}")
print(f"Significant? {'YES ✅' if p_value < 0.05 else 'No ❌'}")
"Comparing 3+ groups at once"
✅ Testing 3+ variations simultaneously
✅ Comparing multiple email subject lines
✅ Analyzing performance across regions
import numpy as np
from scipy import stats

# One-way ANOVA: do the mean open rates of 3+ groups differ?
# Your data: open rates (%) for three subject line types
questions = np.array([12.5, 15.3, 13.8, 14.2, 16.1])
urgency = np.array([18.2, 19.5, 17.8, 20.1, 18.9])
personalized = np.array([14.5, 15.8, 13.9, 16.2, 14.7])
groups = [questions, urgency, personalized]

# Run ANOVA
f_stat, p_value = stats.f_oneway(*groups)

# Calculate eta-squared (effect size): between-group SS / total SS.
# Looping over the groups list keeps the formula correct if you add groups.
all_data = np.concatenate(groups)
grand_mean = np.mean(all_data)
ssb = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
sst = np.sum((all_data - grand_mean) ** 2)
eta_squared = ssb / sst

# Results
print(f"Questions: {np.mean(questions):.1f}%")
print(f"Urgency: {np.mean(urgency):.1f}%")
print(f"Personalized: {np.mean(personalized):.1f}%")
print(f"\nF-statistic: {f_stat:.3f}")
print(f"P-value: {p_value:.4f}")
print(f"Eta-squared: {eta_squared:.3f}")
print(f"Significant? {'YES ✅' if p_value < 0.05 else 'No ❌'}")
"Testing if categories are related"
✅ Testing if age affects product preference
✅ Checking if day of week affects purchases
✅ Analyzing survey responses by demographics
import numpy as np
from scipy import stats

# Chi-square test of independence: are two categorical variables related?
# Your data: Age group vs Product preference
# Rows: 18-29, 30-44, 45-60 | Columns: Product A, B, C
observed = np.array([
    [45, 30, 25],  # 18-29
    [35, 50, 30],  # 30-44
    [20, 40, 55],  # 45-60
])

# Run Chi-Square test (returns statistic, p-value, df, expected counts)
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

# Calculate Cramér's V (effect size, 0 = no association, 1 = perfect)
n = np.sum(observed)
min_dim = min(observed.shape[0] - 1, observed.shape[1] - 1)
cramers_v = np.sqrt(chi2 / (n * min_dim))

# Results
print("Observed frequencies:")
print(observed)
print(f"\nChi-square: {chi2:.3f}")
print(f"P-value: {p_value:.4f}")
print(f"Cramér's V: {cramers_v:.3f}")
print(f"Significant? {'YES ✅' if p_value < 0.05 else 'No ❌'}")

# Interpret effect size (conventional cutoffs for Cramér's V)
if cramers_v < 0.1:
    strength = "weak"
elif cramers_v < 0.3:
    strength = "moderate"
else:
    strength = "strong"
print(f"Association: {strength}")
"How two things move together"
✅ Is ad spend related to sales?
✅ Does email length affect open rate?
✅ Measuring any linear relationship
import numpy as np
from scipy import stats

# Pearson correlation: strength and direction of a linear relationship.
# Your data
ad_spend = np.array([10, 15, 12, 18, 20, 14, 16, 22, 19, 17])  # $1000s
sales = np.array([50, 75, 60, 90, 100, 70, 80, 110, 95, 85])  # $1000s

# Calculate correlation (r) and its two-sided p-value
r, p_value = stats.pearsonr(ad_spend, sales)
r_squared = r ** 2  # share of variance explained

# Results
print(f"Correlation (r): {r:.3f}")
print(f"R-squared: {r_squared:.3f}")
print(f"P-value: {p_value:.4f}")
print(f"Significant? {'YES ✅' if p_value < 0.05 else 'No ❌'}")

# Interpret strength (conventional |r| cutoffs)
if abs(r) < 0.3:
    strength = "weak"
elif abs(r) < 0.7:
    strength = "moderate"
else:
    strength = "strong"
direction = "positive" if r > 0 else "negative"
print(f"\n{strength.capitalize()} {direction} correlation")
"Predicting one variable from another"
✅ Predicting sales from ad spend
✅ Forecasting clicks from email length
✅ Getting a prediction equation
import numpy as np
from scipy import stats

# Simple linear regression: predict one variable from another.
# Your data
ad_spend = np.array([10, 15, 12, 18, 20, 14, 16, 22, 19, 17])
sales = np.array([50, 75, 60, 90, 100, 70, 80, 110, 95, 85])

# Run regression (returns slope, intercept, r, p-value, slope std error)
slope, intercept, r_value, p_value, std_err = stats.linregress(ad_spend, sales)

# Results
print(f"Equation: Sales = {intercept:.2f} + {slope:.2f} × Ad Spend")
print(f"R-squared: {r_value**2:.3f}")
print(f"P-value: {p_value:.4f}")
print(f"Significant? {'YES ✅' if p_value < 0.05 else 'No ❌'}")

# Make predictions by plugging a new x into the fitted equation
new_ad_spend = 25
predicted_sales = slope * new_ad_spend + intercept
print(f"\nPrediction: ${new_ad_spend}k ad → ${predicted_sales:.2f}k sales")
"T-test's outlier-proof cousin"
✅ Comparing 2 groups with skewed data
✅ Data has outliers
✅ Small sample sizes
import numpy as np
from scipy import stats

# Mann-Whitney U: rank-based comparison of two groups; robust to outliers.
# Your data: session duration (minutes) - note the outlier!
mobile = np.array([2.3, 3.1, 2.8, 3.5, 2.5, 3.0, 15.5])  # outlier!
desktop = np.array([4.5, 5.2, 4.8, 5.0, 4.3, 5.5, 4.7])

# Run Mann-Whitney test (two-sided: "do the distributions differ?")
u_stat, p_value = stats.mannwhitneyu(mobile, desktop, alternative='two-sided')

# Results (use medians, not means! The outlier would distort the mean.)
print(f"Mobile median: {np.median(mobile):.2f} min")
print(f"Desktop median: {np.median(desktop):.2f} min")
print(f"U-statistic: {u_stat:.1f}")
print(f"P-value: {p_value:.4f}")
print(f"Significant? {'YES ✅' if p_value < 0.05 else 'No ❌'}")
"Comparing percentages and rates"
✅ Comparing CTR, conversion rates
✅ Success rates between groups
✅ Large samples (n > 30)
import numpy as np
from scipy import stats

# Two-proportion z-test: compare conversion rates of two pages.
# Your data
n1 = 1000  # Landing page A visitors
conversions1 = 52  # Conversions from A
p1 = conversions1 / n1
n2 = 950  # Landing page B visitors
conversions2 = 68  # Conversions from B
p2 = conversions2 / n2

# Calculate pooled proportion (overall rate under the null hypothesis)
pooled_p = (conversions1 + conversions2) / (n1 + n2)

# Standard error and z-statistic; two-sided p-value from the normal CDF
se = np.sqrt(pooled_p * (1 - pooled_p) * (1/n1 + 1/n2))
z_stat = (p1 - p2) / se
p_value = 2 * (1 - stats.norm.cdf(abs(z_stat)))

# Results
print(f"Page A: {p1*100:.2f}% conversion")
print(f"Page B: {p2*100:.2f}% conversion")
print(f"Difference: {(p1-p2)*100:+.2f} pp")
print(f"Z-statistic: {z_stat:.3f}")
print(f"P-value: {p_value:.4f}")
print(f"Significant? {'YES ✅' if p_value < 0.05 else 'No ❌'}")
"Probability that B beats A"
✅ Want to know "how likely is B better?"
✅ A/B testing with continuous monitoring
✅ More intuitive than p-values
import numpy as np
from scipy import stats

# Bayesian A/B test: estimate P(B's true rate > A's true rate) directly.
# Your data
n_a = 1000
conversions_a = 52
n_b = 950
conversions_b = 68

# Bayesian update: uniform Beta(1,1) prior + observed successes/failures
alpha_a = 1 + conversions_a
beta_a = 1 + (n_a - conversions_a)
alpha_b = 1 + conversions_b
beta_b = 1 + (n_b - conversions_b)

# Posterior means (Beta mean = alpha / (alpha + beta))
mean_a = alpha_a / (alpha_a + beta_a)
mean_b = alpha_b / (alpha_b + beta_b)

# 95% credible intervals from the posterior quantiles
ci_a = (stats.beta.ppf(0.025, alpha_a, beta_a),
        stats.beta.ppf(0.975, alpha_a, beta_a))
ci_b = (stats.beta.ppf(0.025, alpha_b, beta_b),
        stats.beta.ppf(0.975, alpha_b, beta_b))

# Monte Carlo: P(B > A) — fraction of posterior draws where B beats A
np.random.seed(42)  # fixed seed so results are reproducible
samples_a = stats.beta.rvs(alpha_a, beta_a, size=100000)
samples_b = stats.beta.rvs(alpha_b, beta_b, size=100000)
prob_b_better = np.mean(samples_b > samples_a)

# Results
print(f"A: {mean_a*100:.2f}% (95% CI: {ci_a[0]*100:.2f}%-{ci_a[1]*100:.2f}%)")
print(f"B: {mean_b*100:.2f}% (95% CI: {ci_b[0]*100:.2f}%-{ci_b[1]*100:.2f}%)")
print(f"\nP(B > A) = {prob_b_better*100:.1f}%")
print(f"P(A > B) = {(1-prob_b_better)*100:.1f}%")

# Decision rule: act once one variant is >95% likely to be better
if prob_b_better > 0.95:
    print(f"\n✅ Strong evidence for B!")
elif prob_b_better < 0.05:
    print(f"\n✅ Strong evidence for A!")
else:
    print(f"\n⚠️ Inconclusive - collect more data")
Predict an outcome using multiple predictor variables at once
Predict monthly sales from ad spend, seasonality, and price
# === MULTIPLE REGRESSION ===
# Predict sales from ad_spend, season, and price
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from statsmodels.stats.outliers_influence import variance_inflation_factor
import statsmodels.api as sm

# Sample data: Monthly sales data
data = pd.DataFrame({
    'sales': [120, 145, 132, 158, 125, 169, 140, 175, 138, 162],
    'ad_spend': [10, 15, 11, 18, 12, 20, 14, 22, 13, 19],  # $1000s
    'season': [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],  # 0=off, 1=peak
    'price': [50, 48, 52, 47, 51, 46, 49, 45, 50, 47],  # in dollars
})

# Prepare predictors (X) and outcome (y)
X = data[['ad_spend', 'season', 'price']]
y = data['sales']

# === METHOD 1: sklearn (quick) ===
model = LinearRegression()
model.fit(X, y)
print("=== REGRESSION RESULTS ===")
print(f"R-squared: {model.score(X, y):.3f}")
print(f"\nCoefficients:")
for name, coef in zip(X.columns, model.coef_):
    print(f" {name}: {coef:.2f}")
print(f" intercept: {model.intercept_:.2f}")

# === METHOD 2: statsmodels (detailed stats) ===
# statsmodels does NOT add an intercept automatically, so add it explicitly.
X_with_intercept = sm.add_constant(X)  # Add intercept
model_sm = sm.OLS(y, X_with_intercept).fit()
print("\n=== DETAILED STATISTICS ===")
print(model_sm.summary())

# === Check for multicollinearity (VIF) ===
# High VIF means a predictor is nearly a linear combination of the others.
print("\n=== MULTICOLLINEARITY CHECK (VIF) ===")
vif_data = pd.DataFrame()
vif_data["Variable"] = X.columns
vif_data["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
print(vif_data)
print("\n⚠️ VIF > 10 indicates multicollinearity problem")

# === Make predictions ===
# New data must use the same column names as the training predictors.
new_data = pd.DataFrame({
    'ad_spend': [16],
    'season': [1],
    'price': [48],
})
prediction = model.predict(new_data)
print(f"\nPredicted sales: ${prediction[0]:.0f}k")
Find which specific groups differ after significant ANOVA
After ANOVA shows email types differ, find which ones specifically
# === POST-HOC TESTS (Tukey HSD & Bonferroni) ===
# After ANOVA, find which groups differ
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd, MultiComparison

# Sample data: Email open rates for 4 subject line types
questions = [12.5, 14.2, 13.8, 15.1, 14.5]  # %
urgency = [18.3, 19.5, 18.9, 19.8, 20.2]
personalized = [14.8, 15.5, 14.9, 15.8, 15.2]
emoji = [16.9, 17.5, 17.2, 18.1, 16.8]

# Combine into long format (one row per observation) for post-hoc tests
data = pd.DataFrame({
    'open_rate': questions + urgency + personalized + emoji,
    'subject_type': ['Questions']*5 + ['Urgency']*5 + ['Personalized']*5 + ['Emoji']*5
})

# === First, run ANOVA (prerequisite) ===
# Post-hoc tests are only meaningful after a significant omnibus ANOVA.
groups = [questions, urgency, personalized, emoji]
f_stat, p_value = stats.f_oneway(*groups)
print("=== ANOVA (Prerequisite) ===")
print(f"F-statistic: {f_stat:.3f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("✅ ANOVA significant - proceed with post-hoc tests\n")

    # === Tukey HSD (Honestly Significant Difference) ===
    # Controls the family-wise error rate across all pairwise comparisons.
    print("=== TUKEY HSD POST-HOC TEST ===")
    tukey = pairwise_tukeyhsd(
        endog=data['open_rate'],
        groups=data['subject_type'],
        alpha=0.05
    )
    print(tukey)

    # === Bonferroni correction ===
    # Simpler but more conservative: pairwise t-tests with adjusted alpha.
    print("\n=== BONFERRONI CORRECTION ===")
    mc = MultiComparison(data['open_rate'], data['subject_type'])
    bonferroni_result = mc.allpairtest(stats.ttest_ind, method='bonf')
    print(bonferroni_result[0])
else:
    print("❌ ANOVA not significant - don't run post-hoc tests")

# === Group summaries ===
print("\n=== GROUP SUMMARIES ===")
summary = data.groupby('subject_type')['open_rate'].agg(['mean', 'std', 'count'])
print(summary)
Predict binary outcomes (yes/no, 0/1, convert/don't convert)
Predict whether a customer will purchase based on age, income, and website visits
# === LOGISTIC REGRESSION ===
# Predict binary outcome (0/1, yes/no)
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
import statsmodels.api as sm

# Sample data: Customer purchases
data = pd.DataFrame({
    'purchased': [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1],
    'age': [35, 28, 42, 50, 22, 38, 25, 45, 55, 30, 40, 26, 48, 24, 52],
    'income': [65, 45, 75, 85, 35, 70, 40, 80, 95, 50, 72, 42, 88, 38, 90],  # $1000s
    'visits': [5, 2, 8, 10, 1, 6, 3, 9, 12, 4, 7, 2, 11, 1, 10],
})

# Prepare data
X = data[['age', 'income', 'visits']]
y = data['purchased']

# === METHOD 1: sklearn (quick predictions) ===
model = LogisticRegression()
model.fit(X, y)

# Predictions: hard labels and the probability of the positive class
y_pred = model.predict(X)
y_prob = model.predict_proba(X)[:, 1]  # Probability of class 1

print("=== LOGISTIC REGRESSION RESULTS ===")
print(f"Coefficients:")
# Odds ratio = exp(beta): multiplicative change in odds per unit increase
for name, coef in zip(X.columns, model.coef_[0]):
    odds_ratio = np.exp(coef)
    print(f" {name}: β={coef:.3f}, OR={odds_ratio:.2f}")

# === Confusion Matrix ===
print("\n=== CONFUSION MATRIX ===")
cm = confusion_matrix(y, y_pred)
print(cm)
print(f"\nTrue Negatives: {cm[0,0]}, False Positives: {cm[0,1]}")
print(f"False Negatives: {cm[1,0]}, True Positives: {cm[1,1]}")

# === Classification Metrics ===
print("\n=== CLASSIFICATION REPORT ===")
print(classification_report(y, y_pred, target_names=['No Purchase', 'Purchase']))

# Accuracy
accuracy = (y == y_pred).mean()
print(f"\nAccuracy: {accuracy:.2%}")

# AUC-ROC (raises ValueError if only one class is present in y)
try:
    auc = roc_auc_score(y, y_prob)
    print(f"AUC-ROC: {auc:.3f}")
except ValueError:
    print("AUC-ROC: Need more variation in data")

# === METHOD 2: statsmodels (detailed stats & p-values) ===
# statsmodels needs the intercept column added explicitly.
print("\n=== DETAILED STATISTICS ===")
X_with_intercept = sm.add_constant(X)
logit_model = sm.Logit(y, X_with_intercept).fit()
print(logit_model.summary())

# === Make prediction for new customer ===
new_customer = pd.DataFrame({
    'age': [40],
    'income': [75],
    'visits': [8],
})
prob = model.predict_proba(new_customer)[0, 1]
print(f"\n=== PREDICTION ===")
print(f"Probability of purchase: {prob:.1%}")
Find natural groups/segments in your data without predefined labels
Segment customers into groups based on age, income, and purchase behavior
# === K-MEANS CLUSTERING ===
# Find natural customer segments
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Sample data: Customer information
data = pd.DataFrame({
    'customer_id': range(1, 21),
    'age': [25, 45, 32, 52, 28, 48, 35, 55, 22, 50, 30, 46, 27, 53, 33, 49, 24, 51, 29, 47],
    'income': [35, 75, 42, 85, 38, 78, 45, 90, 32, 80, 40, 72, 36, 88, 44, 76, 34, 82, 39, 74],  # $1000s
    'purchase_freq': [2, 12, 3, 15, 2, 13, 4, 18, 1, 14, 3, 11, 2, 16, 4, 12, 2, 15, 3, 13],  # per year
})

# Columns used as clustering features
features = ['age', 'income', 'purchase_freq']
feature_matrix = data[features]

# === IMPORTANT: Standardize features (required for K-means!) ===
# Distance-based methods need all features on a comparable scale.
scaler = StandardScaler()
scaled_matrix = scaler.fit_transform(feature_matrix)


def _inertia_for(num_clusters):
    """Fit K-means with the given k and return its WCSS (inertia)."""
    trial = KMeans(n_clusters=num_clusters, random_state=42, n_init=10)
    trial.fit(scaled_matrix)
    return trial.inertia_


# === Find optimal number of clusters (Elbow Method) ===
candidate_ks = range(1, 11)
wcss = [_inertia_for(k) for k in candidate_ks]  # Within-Cluster Sum of Squares

print("=== ELBOW PLOT DATA ===")
for k, score in zip(candidate_ks, wcss):
    print(f"k={k}: WCSS={score:.2f}")

# Plot elbow curve (look for the "elbow")
# plt.plot(candidate_ks, wcss, 'bo-')
# plt.xlabel('Number of Clusters (k)')
# plt.ylabel('WCSS')
# plt.title('Elbow Method')
# plt.show()

# === Run K-means with chosen k (let's say k=3) ===
k = 3
kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
data['cluster'] = kmeans.fit_predict(scaled_matrix)
print(f"\n=== K-MEANS WITH k={k} ===")

# === Cluster centroids (in original scale) ===
# inverse_transform maps standardized centroids back to real-world units.
centroid_df = pd.DataFrame(
    scaler.inverse_transform(kmeans.cluster_centers_),
    columns=features,
)
centroid_df.index.name = 'Cluster'
print("\nCluster Centroids (average values):")
print(centroid_df)

# === Cluster sizes ===
print("\n=== CLUSTER SIZES ===")
print(data['cluster'].value_counts().sort_index())

# === Cluster summaries ===
print("\n=== CLUSTER SUMMARIES ===")
for cluster_id in range(k):
    members = data[data['cluster'] == cluster_id]
    print(f"\nCluster {cluster_id} ({len(members)} customers):")
    print(members[features].describe())


def _bucket(value, cuts, labels):
    """Return the label for the first threshold that value falls under."""
    for cut, label in zip(cuts, labels):
        if value < cut:
            return label
    return labels[-1]


# === Interpret clusters ===
print("\n=== CLUSTER INTERPRETATION ===")
for idx, row in centroid_df.iterrows():
    age = _bucket(row['age'], (35, 50), ("Young", "Middle-aged", "Mature"))
    income = _bucket(row['income'], (45, 70), ("low-income", "mid-income", "high-income"))
    freq = _bucket(row['purchase_freq'], (5, 12), ("infrequent", "regular", "frequent"))
    print(f"Cluster {idx}: {age}, {income}, {freq} buyers")

# === Assign new customer to cluster ===
new_customer = pd.DataFrame({
    'age': [40],
    'income': [60],
    'purchase_freq': [8],
})
# Apply the SAME scaler fitted on the training data before predicting.
cluster_assignment = kmeans.predict(scaler.transform(new_customer))[0]
print(f"\n=== NEW CUSTOMER ASSIGNMENT ===")
print(f"New customer assigned to Cluster {cluster_assignment}")
| Test | Use When | Sample Size | Python Function |
|---|---|---|---|
| T-Test | 2 groups, continuous | Small-Medium | stats.ttest_ind() |
| ANOVA | 3+ groups, continuous | Small-Medium | stats.f_oneway() |
| Chi-Square | Categories related? | Any | stats.chi2_contingency() |
| Correlation | Linear relationship | Medium-Large | stats.pearsonr() |
| Regression | Predict one from other | Medium-Large | stats.linregress() |
| Mann-Whitney | 2 groups, skewed data | Small-Medium | stats.mannwhitneyu() |
| Z-Test | Compare %s, rates | Large (>30) | Manual calc |
| Bayesian | A/B test probability | Any | stats.beta |
Made with ❤️ for marketing students and data analysts
Last Updated: January 2025 β’ MIT License