Testing Methodologies & TDD#
Introduction#
Software testing is fundamental to delivering reliable, maintainable code. Beyond catching bugs, testing provides documentation, enables safe refactoring, and gives developers confidence when making changes.
Why Testing Matters:
Early Bug Detection: Find issues before they reach production
Documentation: Tests describe expected behavior better than comments
Refactoring Safety: Change code confidently knowing tests catch regressions
Design Feedback: Hard-to-test code often indicates design problems
Collaboration: Tests help team members understand and modify code safely
The Testing Pyramid#
The testing pyramid is a strategy for balancing different types of tests:
βββββββββββββ
β E2E β Slow, Expensive
β Tests β (~5-10%)
ββ΄ββββββββββββ΄β
β Integration β Medium Speed
β Tests β (~20-30%)
ββ΄ββββββββββββββ΄β
β Unit Tests β Fast, Cheap
β β (~60-70%)
βββββββββββββββββ
Level |
What It Tests |
Speed |
Scope |
|---|---|---|---|
Unit |
Individual functions/classes in isolation |
Fast (ms) |
Narrow |
Integration |
Components working together |
Medium (seconds) |
Medium |
E2E |
Full user workflows |
Slow (minutes) |
Wide |
Follow the pyramid: many fast unit tests, fewer integration tests, minimal E2E tests. Inverted pyramids lead to slow, brittle test suites.
Unit Testing#
Unit tests verify that individual βunitsβ of code work correctly in isolation. A unit is typically a function, method, or class.
Characteristics of Good Unit Tests (F.I.R.S.T.)#
Principle |
Description |
|---|---|
Fast |
Run in milliseconds; you should run them constantly |
Isolated |
No dependencies on external systems (DB, network, filesystem) |
Repeatable |
Same result every time, regardless of environment |
Self-validating |
Pass or fail clearly, no manual inspection needed |
Timely |
Written close to production code (ideally before with TDD) |
Python Example with pytest#
# calculator.py
def add(a: int, b: int) -> int:
return a + b
def divide(a: int, b: int) -> float:
if b == 0:
raise ValueError("Cannot divide by zero")
return a / b
# test_calculator.py
import pytest
from calculator import add, divide
class TestAdd:
def test_add_positive_numbers(self):
assert add(2, 3) == 5
def test_add_negative_numbers(self):
assert add(-1, -1) == -2
def test_add_zero(self):
assert add(5, 0) == 5
class TestDivide:
def test_divide_normal(self):
assert divide(10, 2) == 5.0
def test_divide_by_zero_raises_error(self):
with pytest.raises(ValueError, match="Cannot divide by zero"):
divide(10, 0)
# Run tests
pytest test_calculator.py -v
# Run with coverage
pytest --cov=calculator --cov-report=term-missing
Test Isolation with Mocking#
When code depends on external systems, use test doubles to isolate the unit:
Type |
Purpose |
Example |
|---|---|---|
Mock |
Verify interactions (was method called?) |
Check if email was sent |
Stub |
Provide canned responses |
Return fake API response |
Fake |
Working implementation (simplified) |
In-memory database |
Spy |
Record calls while using real implementation |
Track function calls |
# service.py
class PaymentService:
def __init__(self, payment_gateway):
self.gateway = payment_gateway
def process_payment(self, amount: float) -> bool:
if amount <= 0:
raise ValueError("Amount must be positive")
return self.gateway.charge(amount)
# test_service.py
from unittest.mock import Mock
from service import PaymentService
def test_process_payment_calls_gateway():
# Arrange: Create a mock gateway
mock_gateway = Mock()
mock_gateway.charge.return_value = True
service = PaymentService(mock_gateway)
# Act: Process payment
result = service.process_payment(100.0)
# Assert: Verify behavior
assert result is True
mock_gateway.charge.assert_called_once_with(100.0)
def test_process_payment_rejects_negative_amount():
mock_gateway = Mock()
service = PaymentService(mock_gateway)
with pytest.raises(ValueError):
service.process_payment(-50.0)
# Gateway should NOT be called for invalid amounts
mock_gateway.charge.assert_not_called()
White Box Testing#
graph LR
A[Input] --> B[Software\ninternal structure visible]
B --> C[Output]
style B fill:#fff,stroke:#333
White box testing examines the internal structure of code. The tester has full visibility into the implementation and designs tests to cover specific code paths.
Key Techniques#
Code Coverage Metrics:
Metric |
Description |
Target |
|---|---|---|
Statement Coverage |
% of code statements executed |
80%+ |
Branch Coverage |
% of decision branches taken |
75%+ |
Path Coverage |
% of possible execution paths |
Lower priority |
Example: Testing All Branches
# discount.py
def calculate_discount(price: float, is_member: bool, quantity: int) -> float:
"""Calculate discount based on membership and quantity."""
discount = 0.0
if is_member:
discount += 0.10 # 10% member discount
if quantity >= 10:
discount += 0.05 # 5% bulk discount
elif quantity >= 5:
discount += 0.02 # 2% small bulk
return price * (1 - discount)
# test_discount.py - White box approach: cover all branches
import pytest
from discount import calculate_discount
class TestCalculateDiscount:
# Test member branches
def test_member_gets_10_percent_discount(self):
assert calculate_discount(100, is_member=True, quantity=1) == 90.0
def test_non_member_no_member_discount(self):
assert calculate_discount(100, is_member=False, quantity=1) == 100.0
# Test quantity branches
def test_quantity_10_plus_gets_5_percent(self):
assert calculate_discount(100, is_member=False, quantity=10) == 95.0
def test_quantity_5_to_9_gets_2_percent(self):
assert calculate_discount(100, is_member=False, quantity=5) == 98.0
def test_quantity_under_5_no_bulk_discount(self):
assert calculate_discount(100, is_member=False, quantity=4) == 100.0
# Combined branches
def test_member_with_large_quantity(self):
# 10% member + 5% bulk = 15% discount
assert calculate_discount(100, is_member=True, quantity=10) == 85.0
When to Use White Box Testing#
Security-critical code paths
Complex algorithms with many branches
Achieving high code coverage requirements
Understanding legacy code behavior
Black Box Testing#
graph LR
A[Input] --> B[Black Box\ninternal structure hidden]
B --> C[Output]
style B fill:#000,color:#fff,stroke:#000
Black box testing examines behavior from the outside without knowledge of internal implementation. Tests are based on requirements and specifications.
Key Techniques#
Equivalence Partitioning: Divide inputs into classes that should behave the same way.
# Testing a registration form age field (valid: 18-120)
# Equivalence classes:
# - Invalid: age < 18 (test with 10)
# - Valid: 18 <= age <= 120 (test with 30)
# - Invalid: age > 120 (test with 130)
Boundary Value Analysis: Test at the edges of valid ranges.
# For age field (valid: 18-120)
# Test: 17 (invalid), 18 (valid), 119 (valid), 120 (valid), 121 (invalid)
@pytest.mark.parametrize("age,expected_valid", [
(17, False), # Just below minimum
(18, True), # At minimum
(50, True), # Middle (equivalence class representative)
(120, True), # At maximum
(121, False), # Just above maximum
])
def test_age_validation(age, expected_valid):
result = validate_age(age)
assert result == expected_valid
When to Use Black Box Testing#
API testing (testing contract, not implementation)
User acceptance testing
When testers donβt have access to source code
Regression testing from user perspective
White Box vs Black Box Comparison#
Aspect |
White Box |
Black Box |
|---|---|---|
Knowledge Required |
Full code visibility |
Only specifications |
Focus |
Internal code paths |
External behavior |
Performed By |
Developers |
Testers, QA, Users |
Test Design |
Based on code structure |
Based on requirements |
Coverage |
Measures code coverage |
Measures feature coverage |
Finds |
Logic errors, security flaws |
Missing features, spec violations |
Best For |
Unit testing, security |
Integration, E2E testing |
Both approaches are complementary! White box ensures code paths work correctly. Black box ensures the software meets user requirements. Use both for comprehensive coverage.
Test-Driven Development (TDD)#
TDD is a development methodology where you write tests before writing production code. It follows a strict cycle known as Red-Green-Refactor.
The TDD Cycle#
ββββββββββββββββββββ
β RED β
β Write a failing β
β test β
ββββββββββ¬ββββββββββ
β
βΌ
ββββββββββββββββββββ
β GREEN β
β Write minimal β
β code to pass β
ββββββββββ¬ββββββββββ
β
βΌ
ββββββββββββββββββββ
β REFACTOR β
β Improve design β
β (tests still β
β pass) β
ββββββββββ¬ββββββββββ
β
ββββββββββββββ Repeat
RED: Write a test for the next small piece of functionality. Run itβit should fail (you havenβt written the code yet).
GREEN: Write the minimum amount of code to make the test pass. Donβt over-engineer.
REFACTOR: Clean up the code while keeping tests green. Remove duplication, improve names, simplify.
TDD Example: Building a Password Validator#
Step 1: RED - First failing test
# test_password_validator.py
from password_validator import validate_password
def test_password_must_be_at_least_8_characters():
assert validate_password("short") is False
assert validate_password("longenough") is True
$ pytest
# FAILS: ModuleNotFoundError - validate_password doesn't exist yet
Step 2: GREEN - Minimal implementation
# password_validator.py
def validate_password(password: str) -> bool:
return len(password) >= 8
$ pytest
# PASSES
Step 3: RED - Add next requirement
# test_password_validator.py
def test_password_must_contain_uppercase():
assert validate_password("alllowercase") is False
assert validate_password("HasUppercase") is True
$ pytest
# FAILS: "alllowercase" returns True but should be False
Step 4: GREEN - Extend implementation
# password_validator.py
def validate_password(password: str) -> bool:
if len(password) < 8:
return False
if not any(c.isupper() for c in password):
return False
return True
Step 5: REFACTOR - Improve the code
# password_validator.py (refactored)
def validate_password(password: str) -> bool:
"""
Validate password meets security requirements:
- At least 8 characters
- Contains at least one uppercase letter
"""
checks = [
len(password) >= 8,
any(c.isupper() for c in password),
]
return all(checks)
Continue the cycle for remaining requirements (numbers, special characters, etc.)
Benefits of TDD#
Benefit |
Explanation |
|---|---|
Better Design |
Writing tests first forces you to think about interfaces before implementation |
High Coverage |
Every feature has tests because tests come first |
Documentation |
Tests describe what code should do |
Confidence |
Refactor fearlessly with comprehensive tests |
Focus |
Work on one small thing at a time |
When TDD Works Best#
New features with clear requirements
Complex business logic
APIs and libraries (designing interfaces)
When you want high test coverage by default
When TDD May Not Fit#
Exploratory/prototype code (youβre still figuring out what to build)
UI code with rapidly changing designs
Integration with poorly documented external systems
Modern Testing Frameworks#
Language |
Unit Testing |
Mocking |
Coverage |
|---|---|---|---|
Python |
pytest, unittest |
unittest.mock, pytest-mock |
pytest-cov, coverage.py |
JavaScript |
Jest, Vitest, Mocha |
Jest mocks, Sinon |
Istanbul, c8 |
Java |
JUnit 5, TestNG |
Mockito, MockK |
JaCoCo |
Go |
testing (built-in) |
gomock, testify |
go test -cover |
Best Practices#
Test Organization#
project/
βββ src/
β βββ user_service.py
β βββ payment_service.py
βββ tests/
βββ unit/
β βββ test_user_service.py
β βββ test_payment_service.py
βββ integration/
β βββ test_api.py
βββ conftest.py # Shared fixtures
Naming Conventions#
# Pattern: test_<thing>_<expected_behavior>_<condition>
def test_user_registration_succeeds_with_valid_email():
...
def test_user_registration_fails_when_email_already_exists():
...
def test_payment_is_rejected_when_amount_is_negative():
...
The Arrange-Act-Assert Pattern#
def test_user_can_update_profile():
# Arrange: Set up test data and dependencies
user = User(name="Alice", email="alice@example.com")
new_email = "newalice@example.com"
# Act: Perform the action being tested
user.update_email(new_email)
# Assert: Verify the expected outcome
assert user.email == new_email
Summary#
Key Takeaways:
Testing Pyramid: Many fast unit tests, fewer integration tests, minimal E2E tests
Unit Testing: Test isolated units following F.I.R.S.T. principles. Use mocks to isolate dependencies.
White Box vs Black Box:
White box: Test internal code paths with code visibility
Black box: Test external behavior from specifications
Use both for comprehensive coverage
TDD Cycle: Red (failing test) β Green (minimal code) β Refactor (clean up)
Benefits of TDD: Better design, high coverage, documentation, confidence