Testing Methodologies & TDD#

Introduction#

Software testing is fundamental to delivering reliable, maintainable code. Beyond catching bugs, testing provides documentation, enables safe refactoring, and gives developers confidence when making changes.

Why Testing Matters:

  • Early Bug Detection: Find issues before they reach production

  • Documentation: Tests describe expected behavior and, unlike comments, fail when they drift out of date

  • Refactoring Safety: Change code confidently knowing tests catch regressions

  • Design Feedback: Hard-to-test code often indicates design problems

  • Collaboration: Tests help team members understand and modify code safely


The Testing Pyramid#

The testing pyramid is a strategy for balancing different types of tests:

              ┌───────────┐
              │    E2E    │  Slow, Expensive
              │   Tests   │  (~5-10%)
             ┌┴───────────┴┐
             │ Integration │  Medium Speed
             │    Tests    │  (~20-30%)
            ┌┴─────────────┴┐
            │   Unit Tests  │  Fast, Cheap
            │               │  (~60-70%)
            └───────────────┘

| Level | What It Tests | Speed | Scope |
| --- | --- | --- | --- |
| Unit | Individual functions/classes in isolation | Fast (ms) | Narrow |
| Integration | Components working together | Medium (seconds) | Medium |
| E2E | Full user workflows | Slow (minutes) | Wide |

Follow the pyramid: many fast unit tests, fewer integration tests, minimal E2E tests. Inverted pyramids lead to slow, brittle test suites.
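To make the unit/integration distinction concrete, here is a minimal sketch of an integration-level test. The `UserRepository` and `RegistrationService` classes are hypothetical and not part of the original material; the point is only that two components are exercised together against a real (in-memory) SQLite database instead of a test double.

```python
import sqlite3


class UserRepository:
    """Hypothetical data-access class backed by a real SQL connection."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, name TEXT)"
        )

    def add(self, email: str, name: str) -> None:
        self.conn.execute("INSERT INTO users VALUES (?, ?)", (email, name))

    def exists(self, email: str) -> bool:
        row = self.conn.execute(
            "SELECT 1 FROM users WHERE email = ?", (email,)
        ).fetchone()
        return row is not None


class RegistrationService:
    """Hypothetical service that depends on the repository."""

    def __init__(self, repo: UserRepository) -> None:
        self.repo = repo

    def register(self, email: str, name: str) -> bool:
        if self.repo.exists(email):
            return False  # duplicate registration is rejected
        self.repo.add(email, name)
        return True


def test_registration_rejects_duplicate_email():
    # Integration level: service and repository run together against a
    # real in-memory SQLite database rather than a mock.
    conn = sqlite3.connect(":memory:")
    service = RegistrationService(UserRepository(conn))

    assert service.register("alice@example.com", "Alice") is True
    assert service.register("alice@example.com", "Alice") is False
    conn.close()
```

A unit test of `RegistrationService` alone would replace the repository with a test double; the integration test is slower but also catches mistakes in the SQL itself.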


Unit Testing#

Unit tests verify that individual "units" of code work correctly in isolation. A unit is typically a function, method, or class.

Characteristics of Good Unit Tests (F.I.R.S.T.)#

| Principle | Description |
| --- | --- |
| Fast | Run in milliseconds; you should run them constantly |
| Isolated | No dependencies on external systems (DB, network, filesystem) |
| Repeatable | Same result every time, regardless of environment |
| Self-validating | Pass or fail clearly, no manual inspection needed |
| Timely | Written close to production code (ideally before with TDD) |
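As an illustration of the Isolated and Repeatable principles, the sketch below injects the clock instead of reading the system time inside the function. The `greeting` function and its `now` parameter are assumptions made up for this example.

```python
from datetime import datetime
from typing import Callable


def greeting(now: Callable[[], datetime] = datetime.now) -> str:
    """Return a greeting based on the hour supplied by `now` (hypothetical example)."""
    return "Good morning" if now().hour < 12 else "Good afternoon"


def test_greeting_before_noon():
    # A fixed clock gives the same result on every run and every machine.
    fixed_clock = lambda: datetime(2024, 1, 1, 9, 0)
    assert greeting(fixed_clock) == "Good morning"


def test_greeting_after_noon():
    fixed_clock = lambda: datetime(2024, 1, 1, 15, 0)
    assert greeting(fixed_clock) == "Good afternoon"
```

Had `greeting` called `datetime.now()` directly, the same test would pass in the morning and fail in the afternoon, which violates Repeatable.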

Python Example with pytest#

# calculator.py
def add(a: int, b: int) -> int:
    return a + b

def divide(a: int, b: int) -> float:
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

# test_calculator.py
import pytest
from calculator import add, divide

class TestAdd:
    def test_add_positive_numbers(self):
        assert add(2, 3) == 5

    def test_add_negative_numbers(self):
        assert add(-1, -1) == -2

    def test_add_zero(self):
        assert add(5, 0) == 5

class TestDivide:
    def test_divide_normal(self):
        assert divide(10, 2) == 5.0

    def test_divide_by_zero_raises_error(self):
        with pytest.raises(ValueError, match="Cannot divide by zero"):
            divide(10, 0)

# Run tests
pytest test_calculator.py -v

# Run with coverage (requires the pytest-cov plugin)
pytest --cov=calculator --cov-report=term-missing

Test Isolation with Mocking#

When code depends on external systems, use test doubles to isolate the unit:

| Type | Purpose | Example |
| --- | --- | --- |
| Mock | Verify interactions (was method called?) | Check if email was sent |
| Stub | Provide canned responses | Return fake API response |
| Fake | Working implementation (simplified) | In-memory database |
| Spy | Record calls while using real implementation | Track function calls |

# service.py
class PaymentService:
    def __init__(self, payment_gateway):
        self.gateway = payment_gateway

    def process_payment(self, amount: float) -> bool:
        if amount <= 0:
            raise ValueError("Amount must be positive")
        return self.gateway.charge(amount)

# test_service.py
import pytest
from unittest.mock import Mock

from service import PaymentService

def test_process_payment_calls_gateway():
    # Arrange: Create a mock gateway
    mock_gateway = Mock()
    mock_gateway.charge.return_value = True
    service = PaymentService(mock_gateway)

    # Act: Process payment
    result = service.process_payment(100.0)

    # Assert: Verify behavior
    assert result is True
    mock_gateway.charge.assert_called_once_with(100.0)

def test_process_payment_rejects_negative_amount():
    mock_gateway = Mock()
    service = PaymentService(mock_gateway)

    with pytest.raises(ValueError):
        service.process_payment(-50.0)

    # Gateway should NOT be called for invalid amounts
    mock_gateway.charge.assert_not_called()
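The example above uses a mock to verify an interaction. The other three kinds of test double from the table can be sketched in the same way. Everything named here (`convert`, `InMemoryUserStore`, the `get_rate` method) is hypothetical and exists only for illustration; only `Mock`, `return_value` and `wraps` come from `unittest.mock`.

```python
from typing import Optional
from unittest.mock import Mock


def convert(amount: float, rate_api) -> float:
    """Hypothetical code under test: converts using a rate from an external API."""
    return amount * rate_api.get_rate("USD", "EUR")


class InMemoryUserStore:
    """Fake: a simplified but working stand-in for a database."""

    def __init__(self) -> None:
        self._users = {}

    def save(self, email: str, name: str) -> None:
        self._users[email] = name

    def find(self, email: str) -> Optional[str]:
        return self._users.get(email)


def test_stub_returns_canned_response():
    # Stub: only supplies a canned answer; no interaction is verified.
    rate_api = Mock()
    rate_api.get_rate.return_value = 1.25
    assert convert(100.0, rate_api) == 125.0


def test_fake_behaves_like_a_real_store():
    # Fake: real logic, just without the external system.
    store = InMemoryUserStore()
    store.save("bob@example.com", "Bob")
    assert store.find("bob@example.com") == "Bob"
    assert store.find("missing@example.com") is None


def test_spy_records_calls_to_real_object():
    # Spy: Mock(wraps=...) forwards calls to the real object and records them.
    store = InMemoryUserStore()
    spy = Mock(wraps=store)
    spy.save("alice@example.com", "Alice")

    spy.save.assert_called_once_with("alice@example.com", "Alice")
    assert store.find("alice@example.com") == "Alice"  # the real save ran
```

In practice the lines blur: a `Mock` with only `return_value` set acts as a stub, and becomes a mock in the strict sense once the test asserts on how it was called.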

White Box Testing#

graph LR
    A[Input] --> B[Software\ninternal structure visible]
    B --> C[Output]
    style B fill:#fff,stroke:#333

White box testing examines the internal structure of code. The tester has full visibility into the implementation and designs tests to cover specific code paths.

Key Techniques#

Code Coverage Metrics:

| Metric | Description | Target |
| --- | --- | --- |
| Statement Coverage | % of code statements executed | 80%+ |
| Branch Coverage | % of decision branches taken | 75%+ |
| Path Coverage | % of possible execution paths | Lower priority |
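The difference between statement and branch coverage is easiest to see on an `if` without an `else`. The sketch below uses a hypothetical `clamp_to_limit` helper that is not part of the original material.

```python
def clamp_to_limit(value: int, limit: int) -> int:
    """Hypothetical helper: cap `value` at `limit`."""
    if value > limit:
        value = limit
    return value


def test_value_above_limit_is_capped():
    # Executes every statement, so statement coverage is already 100%,
    # but the "condition is false" branch is never taken.
    assert clamp_to_limit(15, 10) == 10


def test_value_within_limit_is_unchanged():
    # Takes the remaining branch, bringing branch coverage to 100%.
    assert clamp_to_limit(5, 10) == 5
```

With only the first test, a statement-coverage report would look complete while a bug in the untaken branch could go unnoticed. Branch measurement usually has to be switched on explicitly, for example with `coverage run --branch` or the `--cov-branch` option of pytest-cov.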

Example: Testing All Branches

# discount.py
def calculate_discount(price: float, is_member: bool, quantity: int) -> float:
    """Calculate discount based on membership and quantity."""
    discount = 0.0

    if is_member:
        discount += 0.10  # 10% member discount

    if quantity >= 10:
        discount += 0.05  # 5% bulk discount
    elif quantity >= 5:
        discount += 0.02  # 2% small bulk

    return price * (1 - discount)

# test_discount.py - White box approach: cover all branches
import pytest
from discount import calculate_discount

class TestCalculateDiscount:
    # Expected values use pytest.approx because the discount is float arithmetic

    # Test member branches
    def test_member_gets_10_percent_discount(self):
        assert calculate_discount(100, is_member=True, quantity=1) == pytest.approx(90.0)

    def test_non_member_no_member_discount(self):
        assert calculate_discount(100, is_member=False, quantity=1) == pytest.approx(100.0)

    # Test quantity branches
    def test_quantity_10_plus_gets_5_percent(self):
        assert calculate_discount(100, is_member=False, quantity=10) == pytest.approx(95.0)

    def test_quantity_5_to_9_gets_2_percent(self):
        assert calculate_discount(100, is_member=False, quantity=5) == pytest.approx(98.0)

    def test_quantity_under_5_no_bulk_discount(self):
        assert calculate_discount(100, is_member=False, quantity=4) == pytest.approx(100.0)

    # Combined branches
    def test_member_with_large_quantity(self):
        # 10% member + 5% bulk = 15% discount
        assert calculate_discount(100, is_member=True, quantity=10) == pytest.approx(85.0)
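The tests above take every branch, but `calculate_discount` has 2 x 3 = 6 execution paths (member or not, combined with three quantity tiers), and the member with a quantity of 5 to 9 is never exercised. As a rough sketch of what path coverage would require, the following enumerates all six combinations. The function is copied in so the block is self-contained, and `math.isclose` is used instead of `==` to stay robust against floating-point rounding.

```python
import math
from itertools import product


def calculate_discount(price: float, is_member: bool, quantity: int) -> float:
    """Copy of the function from discount.py so the sketch is self-contained."""
    discount = 0.0
    if is_member:
        discount += 0.10
    if quantity >= 10:
        discount += 0.05
    elif quantity >= 5:
        discount += 0.02
    return price * (1 - discount)


# Expected price for a base price of 100 on each of the 2 x 3 paths.
EXPECTED = {
    (True, 1): 90.0,
    (True, 5): 88.0,
    (True, 10): 85.0,
    (False, 1): 100.0,
    (False, 5): 98.0,
    (False, 10): 95.0,
}


def test_every_path():
    # One representative quantity per quantity branch: 1, 5 and 10.
    for is_member, quantity in product([True, False], [1, 5, 10]):
        result = calculate_discount(100, is_member, quantity)
        assert math.isclose(result, EXPECTED[(is_member, quantity)])
```

Six paths are still manageable here; the count multiplies with every additional independent condition, which is why path coverage is listed as lower priority.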

When to Use White Box Testing#

  • Security-critical code paths

  • Complex algorithms with many branches

  • Achieving high code coverage requirements

  • Understanding legacy code behavior


Black Box Testing#

graph LR
    A[Input] --> B[Black Box\ninternal structure hidden]
    B --> C[Output]
    style B fill:#000,color:#fff,stroke:#000

Black box testing examines behavior from the outside without knowledge of internal implementation. Tests are based on requirements and specifications.

Key Techniques#

Equivalence Partitioning: Divide inputs into classes that should behave the same way.

# Testing a registration form age field (valid: 18-120)
# Equivalence classes:
# - Invalid: age < 18 (test with 10)
# - Valid: 18 <= age <= 120 (test with 30)
# - Invalid: age > 120 (test with 130)
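The three classes listed above translate into one test value each. This is a minimal sketch; `validate_age` is a hypothetical implementation written for the example, since a black box tester would normally only know the specification (valid: 18-120).

```python
def validate_age(age: int) -> bool:
    """Hypothetical validator: accepts ages from 18 to 120 inclusive."""
    return 18 <= age <= 120


# One representative value per equivalence class.
PARTITIONS = [
    ("below valid range", 10, False),
    ("within valid range", 30, True),
    ("above valid range", 130, False),
]


def test_one_representative_per_partition():
    for label, age, expected in PARTITIONS:
        assert validate_age(age) is expected, label
```

Any other value from the same class (say 5 instead of 10) is assumed to behave identically, which is what keeps the number of tests small.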

Boundary Value Analysis: Test at the edges of valid ranges.

# For age field (valid: 18-120)
# Test: 17 (invalid), 18 (valid), 119 (valid), 120 (valid), 121 (invalid),
# plus 50 as a mid-range representative of the valid class
import pytest
from validation import validate_age  # hypothetical module; the source does not show where validate_age lives

@pytest.mark.parametrize("age,expected_valid", [
    (17, False),   # Just below minimum
    (18, True),    # At minimum
    (50, True),    # Middle (equivalence class representative)
    (119, True),   # Just below maximum
    (120, True),   # At maximum
    (121, False),  # Just above maximum
])
def test_age_validation(age, expected_valid):
    result = validate_age(age)
    assert result == expected_valid

When to Use Black Box Testing#

  • API testing (testing contract, not implementation)

  • User acceptance testing

  • When testers don't have access to source code

  • Regression testing from user perspective


White Box vs Black Box Comparison#

| Aspect | White Box | Black Box |
| --- | --- | --- |
| Knowledge Required | Full code visibility | Only specifications |
| Focus | Internal code paths | External behavior |
| Performed By | Developers | Testers, QA, Users |
| Test Design | Based on code structure | Based on requirements |
| Coverage | Measures code coverage | Measures feature coverage |
| Finds | Logic errors, security flaws | Missing features, spec violations |
| Best For | Unit testing, security | Integration, E2E testing |

Both approaches are complementary! White box ensures code paths work correctly. Black box ensures the software meets user requirements. Use both for comprehensive coverage.


Test-Driven Development (TDD)#

TDD is a development methodology where you write tests before writing production code. It follows a strict cycle known as Red-Green-Refactor.

The TDD Cycle#

      ┌──────────────────┐
      │      RED         │
      │  Write a failing │
      │      test        │
      └────────┬─────────┘
               │
               ▼
      ┌──────────────────┐
      │     GREEN        │
      │  Write minimal   │
      │  code to pass    │
      └────────┬─────────┘
               │
               ▼
      ┌──────────────────┐
      │    REFACTOR      │
      │  Improve design  │
      │  (tests still    │
      │     pass)        │
      └────────┬─────────┘
               │
               └────────────→ Repeat

  1. RED: Write a test for the next small piece of functionality. Run it; it should fail, because you haven't written the code yet.

  2. GREEN: Write the minimum amount of code to make the test pass. Don't over-engineer.

  3. REFACTOR: Clean up the code while keeping tests green. Remove duplication, improve names, simplify.

TDD Example: Building a Password Validator#

Step 1: RED - First failing test

# test_password_validator.py
from password_validator import validate_password

def test_password_must_be_at_least_8_characters():
    assert validate_password("short") is False
    # Mixed case, so this test stays valid once the uppercase rule is added
    assert validate_password("LongEnough") is True

$ pytest
# FAILS: ModuleNotFoundError (password_validator.py doesn't exist yet)

Step 2: GREEN - Minimal implementation

# password_validator.py
def validate_password(password: str) -> bool:
    return len(password) >= 8

$ pytest
# PASSES

Step 3: RED - Add next requirement

# test_password_validator.py
def test_password_must_contain_uppercase():
    assert validate_password("alllowercase") is False
    assert validate_password("HasUppercase") is True

$ pytest
# FAILS: "alllowercase" returns True but should be False

Step 4: GREEN - Extend implementation

# password_validator.py
def validate_password(password: str) -> bool:
    if len(password) < 8:
        return False
    if not any(c.isupper() for c in password):
        return False
    return True

Step 5: REFACTOR - Improve the code

# password_validator.py (refactored)
def validate_password(password: str) -> bool:
    """
    Validate password meets security requirements:
    - At least 8 characters
    - Contains at least one uppercase letter
    """
    checks = [
        len(password) >= 8,
        any(c.isupper() for c in password),
    ]
    return all(checks)

Continue the cycle for remaining requirements (numbers, special characters, etc.)
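As an illustration of one more turn of the cycle, the sketch below assumes the next requirement is "at least one digit". The text mentions numbers as a remaining requirement but does not show this step, so the exact rule and the test inputs are assumptions.

```python
def validate_password(password: str) -> bool:
    """Refactored validator extended with an assumed digit requirement."""
    checks = [
        len(password) >= 8,
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),  # new rule, added to pass the new test
    ]
    return all(checks)


def test_password_must_contain_digit():
    # RED first: with only the length and uppercase rules,
    # "NoDigitsHere" would have been accepted.
    assert validate_password("NoDigitsHere") is False
    # GREEN: passes once the digit check is in the list.
    assert validate_password("HasDigit1") is True
```

Because the refactored version keeps its rules in a list, the GREEN step is a one-line change. Adding a rule can also invalidate the "valid" inputs of earlier tests, so those inputs need to be updated to satisfy every rule added so far.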

Benefits of TDD#

| Benefit | Explanation |
| --- | --- |
| Better Design | Writing tests first forces you to think about interfaces before implementation |
| High Coverage | Every feature has tests because tests come first |
| Documentation | Tests describe what code should do |
| Confidence | Refactor fearlessly with comprehensive tests |
| Focus | Work on one small thing at a time |

When TDD Works Best#

  • New features with clear requirements

  • Complex business logic

  • APIs and libraries (designing interfaces)

  • When you want high test coverage by default

When TDD May Not Fit#

  • Exploratory/prototype code (you're still figuring out what to build)

  • UI code with rapidly changing designs

  • Integration with poorly documented external systems


Modern Testing Frameworks#

| Language | Unit Testing | Mocking | Coverage |
| --- | --- | --- | --- |
| Python | pytest, unittest | unittest.mock, pytest-mock | pytest-cov, coverage.py |
| JavaScript | Jest, Vitest, Mocha | Jest mocks, Sinon | Istanbul, c8 |
| Java | JUnit 5, TestNG | Mockito, MockK (Kotlin) | JaCoCo |
| Go | testing (built-in) | gomock, testify | go test -cover |


Best Practices#

Test Organization#

project/
├── src/
│   ├── user_service.py
│   └── payment_service.py
└── tests/
    ├── unit/
    │   ├── test_user_service.py
    │   └── test_payment_service.py
    ├── integration/
    │   └── test_api.py
    └── conftest.py  # Shared fixtures

Naming Conventions#

# Pattern: test_<thing>_<expected_behavior>_<condition>

def test_user_registration_succeeds_with_valid_email():
    ...

def test_user_registration_fails_when_email_already_exists():
    ...

def test_payment_is_rejected_when_amount_is_negative():
    ...

The Arrange-Act-Assert Pattern#

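from dataclasses import dataclass

@dataclass
class User:  # hypothetical stand-in; the source does not define User
    name: str
    email: str

    def update_email(self, new_email: str) -> None:
        self.email = new_email

def test_user_can_update_profile():
    # Arrange: Set up test data and dependencies
    user = User(name="Alice", email="alice@example.com")
    new_email = "newalice@example.com"

    # Act: Perform the action being tested
    user.update_email(new_email)

    # Assert: Verify the expected outcome
    assert user.email == new_email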

Summary#

Key Takeaways:

  1. Testing Pyramid: Many fast unit tests, fewer integration tests, minimal E2E tests

  2. Unit Testing: Test isolated units following F.I.R.S.T. principles. Use mocks to isolate dependencies.

  3. White Box vs Black Box:

    • White box: Test internal code paths with code visibility

    • Black box: Test external behavior from specifications

    • Use both for comprehensive coverage

  4. TDD Cycle: Red (failing test) → Green (minimal code) → Refactor (clean up)

  5. Benefits of TDD: Better design, high coverage, documentation, confidence

