
FT-08 | Tool | First Way: Flow

Non-Functional Requirements in the Pipeline

Performance, security, and reliability are not afterthoughts. This article shows how to build NFR validation into the deployment pipeline so that these qualities are tested on every change, not just at release time.

Sources: DevOps Handbook; Release It! (Nygard)


01

What are NFRs?

Non-functional requirements (NFRs) describe how a system behaves rather than what it does. Performance, security, reliability, scalability, maintainability — these are constraints that every feature must satisfy, not features themselves.

Performance

Response time, throughput, resource utilization under expected and peak load.

Security

Resistance to attack, data protection, authentication, authorization, encryption.

Reliability

Availability, fault tolerance, graceful degradation, recovery time.

Scalability

Behavior under increased load. Vertical vs horizontal scaling limits.

NFR violations discovered at the end of a release cycle are expensive to fix. A performance problem found in production can require an emergency fix or a rollback; the same problem caught by a pipeline performance gate requires only a code change. Shift NFR testing left.

02

Performance testing in the pipeline

Performance testing in the pipeline does not mean running a full load test on every commit — that would be too slow. It means running a targeted baseline comparison that detects regressions before they reach production.

| Test | When | Tools | What it does |
|---|---|---|---|
| Baseline comparison | Every build | k6, Gatling | Run a fixed scenario against staging. Compare p99 latency to the previous baseline. Fail the build if degraded by > 10%. |
| Load test | Pre-release | k6, Locust | Simulate expected peak traffic. Verify the system meets its SLA at real-world load. Run in staging, not prod. |
| Stress test | Quarterly | k6, JMeter | Increase load until the system breaks. Find the breaking point and verify the system recovers gracefully. |
| Soak test | Pre-release | k6, Gatling | Run at sustained load for hours. Detect memory leaks and resource exhaustion that only appear over time. |
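The baseline comparison described above can be sketched as a small gate function. This is a minimal illustration, not the output format of k6 or Gatling: the function names, the nearest-rank percentile method, and the 10% default are assumptions chosen to match the rule stated in this section.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def regression_ratio(baseline_p99_ms, current_p99_ms):
    """Fractional change relative to the baseline; positive means slower."""
    return (current_p99_ms - baseline_p99_ms) / baseline_p99_ms


def passes_gate(baseline_p99_ms, current_p99_ms, max_regression=0.10):
    """True if the current p99 is no more than max_regression slower
    than the stored baseline (hypothetical 10% default)."""
    return regression_ratio(baseline_p99_ms, current_p99_ms) <= max_regression
```

A CI step would compute `percentile(latencies, 99)` from the run's samples, call `passes_gate` with the stored baseline, and exit non-zero on failure. Whether the baseline is the previous build or a rolling median is a design choice; a rolling value is less sensitive to one noisy run.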

03

Security testing in the pipeline

Shifting security left means running automated security checks in the pipeline, not just during a periodic security audit. The pipeline becomes a security control, not just a quality gate.

Commit: secrets scan, SAST lint
Build: dependency CVE scan, license scan
Test: full SAST, unit tests
Deploy Staging: DAST scan, perf baseline
Deploy Prod: rate limits, monitoring

Security gates at every stage. Earlier checks are faster and cheaper. DAST requires a running application, so it runs in staging.

SAST (Static)

Static Application Security Testing. Analyzes source code for security vulnerabilities without running the application. Fast, runs early. Examples: Semgrep, SonarQube, CodeQL.

DAST (Dynamic)

Dynamic Application Security Testing. Attacks a running application to find vulnerabilities. Requires a deployed app, so it runs in staging. Examples: OWASP ZAP, Burp Suite.

Dependency scan

Checks third-party dependencies against known CVE databases. Should fail the build on critical vulnerabilities. Examples: npm audit, Snyk, Dependabot.

Secrets scan

Prevents credentials, API keys, and tokens from being committed to source control. Runs as a pre-commit hook and in CI. Examples: git-secrets, truffleHog, Gitleaks.
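To make the idea of a secrets scan concrete, here is a toy sketch of pattern-based detection. It is not how git-secrets, truffleHog, or Gitleaks are implemented; real tools ship far larger rule sets and often add entropy checks. The two patterns and the `scan_text` helper are illustrative assumptions.

```python
import re

# Illustrative rules only; production scanners use many more.
PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase
    # letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM private key header, e.g. "-----BEGIN RSA PRIVATE KEY-----".
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings
```

A pre-commit hook built on this would run `scan_text` over the staged diff and reject the commit if the list is non-empty. Running the same check again in CI catches commits made with hooks disabled.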

04

Reliability patterns

Michael Nygard's Release It! documents the stability patterns that make systems resilient to cascading failure. These are architectural choices, but verifying them belongs in the deployment pipeline: exercise them in staging, for example by injecting a slow or failing dependency, before they are needed in production.

Circuit breaker

When a downstream service is failing, stop calling it. Return a cached or degraded response. After a timeout, probe whether the service has recovered.

Why it matters

Prevents a failing dependency from taking down the entire system through resource exhaustion.
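The three behaviors described for the circuit breaker (stop calling, return a degraded response, probe after a timeout) can be sketched as a small state machine. This is a simplified, single-threaded illustration rather than Nygard's reference design or any library's API; the class name, the thresholds, and the injectable `clock` are assumptions made for the example.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock          # injectable so tests can control time
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"       # timeout elapsed: allow one probe
        return "open"

    def call(self, func, fallback):
        """Run func unless the circuit is open; use fallback on any failure."""
        if self.state == "open":
            return fallback()        # short-circuit: do not touch the dependency
        try:
            result = func()
        except Exception:
            self._record_failure()
            return fallback()
        # Success (including a successful probe) closes the circuit.
        self._failures = 0
        self._opened_at = None
        return result

    def _record_failure(self):
        self._failures += 1
        # A failed probe reopens immediately; otherwise open at the threshold.
        if self._opened_at is not None or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

In a staging test, pointing `func` at a deliberately broken stub and asserting that calls stop reaching it after the threshold is one way to check the pattern in the pipeline.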

Timeout

Every call to an external system must have a timeout. No call should wait indefinitely. Timeouts are the minimum reliability pattern — if you do nothing else, do this.

Why it matters

An unbounded wait holds a thread. Enough unbounded waits fill the thread pool. The service goes down.
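A bounded call can be sketched with the standard library's `asyncio.wait_for`, which cancels the awaited call when the deadline passes. The helper name `call_with_timeout` and the idea of returning a fallback value are assumptions for illustration; other designs raise an error instead.

```python
import asyncio


async def call_with_timeout(make_call, timeout_s, fallback):
    """Await make_call(), but give up after timeout_s seconds.

    make_call is a zero-argument coroutine function (hypothetical stand-in
    for a request to an external system). On timeout the pending call is
    cancelled and the fallback value is returned.
    """
    try:
        return await asyncio.wait_for(make_call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback
```

For blocking clients the equivalent is the library's own timeout setting, such as the `timeout` argument of `urllib.request.urlopen`. The key point is that the limit is set explicitly on every outbound call rather than left to a default.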

Retry with backoff

Transient failures are often self-resolving. Retry failed requests with exponential backoff and jitter, and cap the number of attempts. Do not retry on client errors (4xx): the same request will fail the same way.

Why it matters

Immediate retries amplify load on a struggling service. Exponential backoff gives it time to recover.
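A minimal sketch of retry with exponential backoff and jitter follows. The exception classes are hypothetical stand-ins for whatever a real HTTP client raises for 4xx and transient failures, and the "full jitter" strategy (a random delay between zero and the exponential ceiling) is one common choice, not the only one.

```python
import random
import time


class ClientError(Exception):
    """Hypothetical: a 4xx-style error that retrying will not fix."""


class TransientError(Exception):
    """Hypothetical: a timeout or 5xx-style error that may clear on its own."""


def retry(func, max_attempts=4, base=0.1, cap=5.0,
          sleep=time.sleep, rng=random.random):
    """Call func, retrying transient failures with capped exponential
    backoff and full jitter. sleep and rng are injectable for testing."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ClientError:
            raise                      # never retry client errors
        except TransientError:
            if attempt == max_attempts - 1:
                raise                  # out of attempts: surface the failure
            ceiling = min(cap, base * 2 ** attempt)
            sleep(rng() * ceiling)     # jitter spreads retries across clients
```

Jitter matters because clients that failed at the same moment would otherwise all retry at the same moment, recreating the spike that caused the failure.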

Bulkhead

Partition system resources by function. Use separate connection pools for different downstream services so a slow dependency can only exhaust its own pool.

Why it matters

Isolates failure so that one misbehaving dependency cannot exhaust shared resources.
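The isolation idea can be sketched with one bounded semaphore per downstream dependency. This stands in for separate connection pools and is an illustration only; the `Bulkhead` class, the dependency names in the usage note, and the choice to reject rather than queue when a partition is full are all assumptions.

```python
import threading
from contextlib import contextmanager


class BulkheadFullError(Exception):
    """Raised when a dependency's partition has no free slots."""


class Bulkhead:
    def __init__(self, limits):
        # limits maps a dependency name to its maximum concurrent calls.
        self._slots = {
            name: threading.BoundedSemaphore(n) for name, n in limits.items()
        }

    @contextmanager
    def slot(self, dependency):
        """Hold one slot of the dependency's partition for the duration
        of the with-block; fail fast instead of waiting when it is full."""
        sem = self._slots[dependency]
        if not sem.acquire(blocking=False):
            raise BulkheadFullError(dependency)
        try:
            yield
        finally:
            sem.release()
```

With `Bulkhead({"payments": 10, "search": 5})`, a slow search backend can fill at most its five slots; calls wrapped in `with bulkhead.slot("payments"):` continue to run.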

05

NFR checklist

Use this as a starting point for defining NFR gates in your deployment pipeline. Each NFR has a specific test mechanism, a pipeline stage where it runs, and a measurable pass condition:

| NFR | How to test | Pipeline stage | Pass gate |
|---|---|---|---|
| Response time | Load test with k6 / Locust | Staging | p99 < 500 ms |
| Throughput | Stress test to find max RPS | Staging | > 1,000 req/s |
| Dependencies CVE | npm audit / Snyk / Dependabot | Build | 0 critical CVEs |
| Secrets in code | git-secrets / truffleHog | Commit | 0 secrets found |
| OWASP Top 10 | SAST: Semgrep / SonarQube | Test | 0 high findings |
| DAST / runtime | OWASP ZAP against staging | Staging | 0 high findings |
| Uptime / recovery | Chaos test + MTTR measurement | Staging | MTTR < 30 min |
| Data protection | TLS check, encryption at rest audit | Build | TLS 1.2+, encrypted |
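One way to make the checklist executable is to encode each gate as data and evaluate it against the metrics a stage produces. The sketch below covers only the numeric rows; the metric keys (`p99_ms`, `max_rps`, and so on) and the `failed_gates` helper are hypothetical names, and the thresholds are the starting-point values from the checklist, not recommendations for every system.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Gate:
    nfr: str
    stage: str
    metric: str                                # hypothetical metric key
    compare: Callable[[float, float], bool]    # compare(measured, threshold)
    threshold: float


GATES = [
    Gate("Response time", "Staging", "p99_ms", operator.lt, 500),
    Gate("Throughput", "Staging", "max_rps", operator.gt, 1000),
    Gate("Dependencies CVE", "Build", "critical_cves", operator.eq, 0),
    Gate("Secrets in code", "Commit", "secrets_found", operator.eq, 0),
    Gate("Uptime / recovery", "Staging", "mttr_min", operator.lt, 30),
]


def failed_gates(stage, metrics):
    """Return the NFR names whose gate fails for the given stage.
    A missing metric counts as a failure so gates cannot be skipped silently."""
    failures = []
    for gate in GATES:
        if gate.stage != stage:
            continue
        value = metrics.get(gate.metric)
        if value is None or not gate.compare(value, gate.threshold):
            failures.append(gate.nfr)
    return failures
```

Treating a missing metric as a failure is a deliberate choice here: a test that did not run should block the stage just as a test that failed does.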

06

Further reading

Release It! — Michael Nygard

The definitive book on production-ready software. Stability patterns, anti-patterns, and the chapter that coined circuit breaker.

DevOps Handbook — Chapter 10

Enable Fast and Reliable Automated Testing. NFR testing in the context of the deployment pipeline.

OWASP Testing Guide

owasp.org. The comprehensive reference for web application security testing. The basis for most DAST tools and security test checklists.

Google SRE Book — Chapter 3

Embracing Risk. How to reason about reliability requirements, SLOs, and the cost of reliability vs the cost of failure.