Performance Testing Knowledge Base

Performance Testing
That Delivers Real Results

A system can pass every functional test case and still go down the moment 1,000 users arrive at once. Performance testing exists to prevent exactly that.


Core Concepts

How Many Types of
Performance Testing Are There?

Many teams call everything a "performance test" but only ever run a load test, then are surprised when the system goes down during a flash sale. This article walks through six types and the different question each one answers.

01 · Load Test

⚖️ Load Testing

Tests the system under the expected workload to verify that it responds within the defined standards.

Example: an e-commerce system serving 1,000 concurrent users with response time < 2s

Expected Load · Baseline · Verification

Duration: 30 min – 2 hr · Frequency: every sprint

02 · Stress Test

💥 Stress Testing

Pushes load beyond what the system was designed to handle, to find the breaking point and observe how the system behaves when it fails.

Example: keep adding users beyond 1,000 until the system returns errors or goes down

Beyond Limit · Breaking Point · Recovery

Duration: 1 – 4 hr · Frequency: monthly

03 · Spike Test

📈 Spike Testing

Simulates traffic that surges suddenly and then drops off just as fast, such as a flash sale, breaking news, or a viral event.

Example: users jump from 100 to 10,000 within 2 minutes

Sudden Surge · Auto-scaling · Flash Events

Duration: 15 – 60 min · Frequency: before major events

04 · Soak Test

⏳ Soak / Endurance Testing

Runs a sustained load for a long period (several hours to several days) to uncover memory leaks, resource exhaustion, and degradation that accumulates gradually.

Example: run 500 concurrent users for 8 hours

Memory Leak · Long Duration · Degradation

Duration: 8 – 72 hr · Frequency: quarterly

05 · Volume Test

🗄️ Volume Testing

Tests the system against large data volumes, such as a database holding hundreds of millions of rows, to check that performance stays at an acceptable level.

Example: test a report that has to process 500 million transactions

Big Data · Database · Query Performance

Duration: 1 – 8 hr · Frequency: before go-live

06 · Scalability Test

📐 Scalability Testing

Tests how well the system scales, both vertically (adding resources to an instance) and horizontally (adding more instances).

Example: test whether Kubernetes Horizontal Pod Autoscaling scales out fast enough to keep up with demand

H-Scale · V-Scale · Cloud Native

Duration: 2 – 6 hr · Frequency: on architecture changes

Comparing the 6 Types

Type	Load Level	Duration	What It Finds	Frequency
⚖️ Load Test	Expected	30 min – 2 hr	Normal performance	Every sprint
💥 Stress Test	Beyond Max	1 – 4 hr	Breaking Point	Monthly
📈 Spike Test	Sudden Burst	15 – 60 min	Recovery Time	Before major events
⏳ Soak Test	Normal	8 – 72 hr	Memory Leak	Quarterly
🗄️ Volume Test	High Data	1 – 8 hr	Data Bottleneck	Before go-live
📐 Scalability Test	Incremental	2 – 6 hr	Scale Efficiency	On architecture changes
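
One way to turn the soak test's "memory leak" check into a number is to fit a straight line through periodic memory samples and read off the slope. The sketch below is an illustration only: the function name, the sample values, and the choice of sampling every two hours are assumptions, not output from any tool named in this article.

```python
def memory_growth_mb_per_hour(samples):
    """Least-squares slope through (elapsed_hours, memory_mb) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    numerator = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must span more than one point in time")
    return numerator / denominator


# Hypothetical samples from an 8-hour soak run: (hours elapsed, resident memory in MB).
samples = [(0, 512.0), (2, 530.0), (4, 548.0), (6, 566.0), (8, 584.0)]
slope = memory_growth_mb_per_hour(samples)
print(f"memory growth: {slope:.1f} MB/hour")
```

A slope close to zero after warm-up suggests stable memory, while a steady positive slope across a long run is a reason to look for a leak.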

JavaScript · k6 Load Test

// ── Import k6 modules ───────────────────────────────────────────
import http from 'k6/http';
import { check, sleep } from 'k6';

// ── Options: define the load profile ────────────────────────────
export const options = {
  stages: [
    { duration: '2m',  target: 100  },   // Warm-up: ramp 0 → 100 VUs
    { duration: '5m',  target: 1000 },   // Ramp 100 → 1,000 VUs (peak)
    { duration: '2m',  target: 0    },   // Ramp-down
  ],
  thresholds: {
    'http_req_duration': ['p(95)<2000'],  // p95 < 2s
    'http_req_failed':   ['rate<0.01'],   // Error rate < 1%
  },
};

// ── Virtual User Script ──────────────────────────────────────────
export default function main() {
  const res = http.get('https://api.example.com/products');

  check(res, {
    'status is 200':       (r) => r.status === 200,
    'response time < 2s':  (r) => r.timings.duration < 2000,
    'body not empty':      (r) => r.body && r.body.length > 0,  // guard against a null body
  });

  sleep(1);  // Think time
}

Metrics & KPIs

Why Does Average Response Time
Lie to You?

If 99 requests respond in 100ms and 1 request takes 10 seconds, the average is 199ms. That looks great, yet 1% of users waited 10 seconds. This article covers the metrics that actually deserve attention.

⏱️

Response Time

The time the system takes to respond to a single request, measured from the moment the client sends the request until the full response is received. Report it as p50, p90, p95, and p99.

Excellent (p95): < 500ms

Acceptable (p95): < 2,000ms

Needs improvement (p95): > 3,000ms

🚀

Throughput (TPS/RPS)

The number of transactions or requests the system processes per second. Higher is better, but always read it together with error rate.

Peak Users: 1,000 users

Think Time: 2 seconds

Target TPS: ≈ 500 TPS (1,000 users ÷ 2s think time, response time ignored)

❌

Error Rate

The percentage of requests that receive an error response (HTTP 4xx/5xx) or time out, relative to all requests. It is the most important metric for deciding that the system has "failed".

Excellent: < 0.1%

Acceptable: < 1%

Failing: > 5%

😊

Apdex Score

Application Performance Index: a score from 0 to 1 that folds response time and user satisfaction into a single number, calculated as (Satisfied + Tolerating / 2) divided by Total Samples.

Excellent: 0.94 – 1.0

Good: 0.85 – 0.94

Poor: < 0.70

👥

Concurrent Users

The number of users active in the system at the same moment. Estimate it with Little's Law, N = λ × W, where λ is the arrival rate and W is the time each request spends in the system (the response time).

N (Concurrent) = λ × W

λ (Arrival Rate): req/sec

W (Response Time): seconds

💻

Resource Utilization

CPU, memory, network I/O, and disk I/O usage on the servers during the test. Memory that keeps climbing may indicate a memory leak.

CPU (Sustained): < 70%

Memory: < 80%

Network I/O: < 70% of bandwidth
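
The Throughput and Concurrent Users cards reduce to two small formulas. The sketch below assumes a closed workload in which each virtual user sends a request, waits for the response, and then pauses for the think time. The function names and the 0.5s and 0.2s response times are illustrative assumptions.

```python
def target_tps(users, think_time_s, response_time_s=0.0):
    """Requests per second generated by `users` who each loop: request, wait, think."""
    return users / (think_time_s + response_time_s)


def concurrent_requests(arrival_rate_per_s, response_time_s):
    """Little's Law, N = lambda * W: average number of requests in flight."""
    return arrival_rate_per_s * response_time_s


# The card's figure: 1,000 users with 2s think time, response time ignored.
tps_ignoring_response = target_tps(1000, 2.0)
# Same users, assuming each response takes 0.5s: the achievable rate is lower.
tps_with_response = target_tps(1000, 2.0, 0.5)
# 500 req/s arriving, each assumed to take 0.2s to serve.
in_flight = concurrent_requests(500, 0.2)

print(f"target TPS (think time only): {tps_ignoring_response:.0f}")
print(f"target TPS (think + response): {tps_with_response:.0f}")
print(f"requests in flight: {in_flight:.0f}")
```

Counting response time in the loop lowers the rate a fixed user population can generate, which is why the 500 TPS target is best read as an upper bound.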

Percentiles: Why Look at p95 Instead of the Average?

⚠️ With 99 requests at 100ms and 1 request at 10,000ms, the average is 199ms. It looks great, but in reality 1% of users waited 10 seconds!

p50

Median

50% of requests are faster than this value

p90

90th Percentile

90% of users get a response faster than this

p95

Most widely used standard ★

The usual basis for SLAs

p99

Tail latency (99th)

For critical systems that cannot tolerate slow responses
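
The 99-fast, 1-slow example can be checked directly. The sketch below uses a nearest-rank percentile helper written for this illustration (the helper name is an assumption; load-testing tools may interpolate differently). The mean of 199ms matches no request that anyone actually experienced. With exactly one slow request in a hundred, every percentile up to p99 still reads 100ms, and only the maximum exposes the 10-second outlier, which is a reason to report several percentiles and the maximum together.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# 99 requests at 100 ms and 1 request at 10,000 ms, as in the example above.
latencies_ms = [100] * 99 + [10_000]

mean_ms = statistics.mean(latencies_ms)
print(f"mean = {mean_ms} ms")
for p in (50, 90, 95, 99):
    print(f"p{p} = {percentile(latencies_ms, p)} ms")
print(f"max = {max(latencies_ms)} ms")
```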

Formula · Apdex Score

── Given T (threshold) = 2 seconds ─────────────────────────────

Satisfied  : Response Time ≤ T         (≤ 2s)
Tolerating : T < Response Time ≤ 4T    (2s < t ≤ 8s)
Frustrated : Response Time > 4T        (> 8s)

── Formula ──────────────────────────────────────────────────────

          Satisfied + (Tolerating / 2)
Apdex = ──────────────────────────────
                 Total Samples

── Example ──────────────────────────────────────────────────────
Satisfied  = 800   requests
Tolerating = 150   requests
Frustrated = 50    requests
Total      = 1000  requests

          800 + (150 / 2)       875
Apdex = ─────────────────── = ───── = 0.875  (Good ✓)
               1000             1000
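
The same calculation as a small function, as a sketch only. The function name and the per-request response times (1s, 5s, 9s) are assumptions chosen so that the counts match the worked example: 800 satisfied, 150 tolerating, 50 frustrated.

```python
def apdex(response_times_s, t=2.0):
    """Apdex = (satisfied + tolerating / 2) / total, with threshold T in seconds."""
    total = len(response_times_s)
    if total == 0:
        raise ValueError("no samples")
    satisfied = sum(1 for r in response_times_s if r <= t)
    tolerating = sum(1 for r in response_times_s if t < r <= 4 * t)
    return (satisfied + tolerating / 2) / total


# 800 requests at 1s (satisfied), 150 at 5s (tolerating), 50 at 9s (frustrated).
samples = [1.0] * 800 + [5.0] * 150 + [9.0] * 50
score = apdex(samples, t=2.0)
print(f"Apdex = {score:.3f}")
```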

System Type	p95 Response Time	Error Rate	Apdex	Availability
🛒 E-commerce (Checkout)	< 2,000ms	< 0.1%	> 0.90	99.95%
🏦 Banking / FinTech	< 1,000ms	< 0.01%	> 0.95	99.99%
📰 Content / Media	< 3,000ms	< 1%	> 0.85	99.9%
🏥 Healthcare	< 1,500ms	< 0.1%	> 0.92	99.99%
🎮 Gaming	< 100ms	< 0.5%	> 0.95	99.9%
📱 Mobile API	< 2,000ms	< 1%	> 0.85	99.9%

Tools Comparison

JMeter, k6, or Locust:
Which Should You Choose?

No tool is best in every situation. This section compares five tools and offers guidance on choosing one to fit your team's circumstances.

JMX

Apache JMeter

GUI-based · Enterprise-grade

Groovy / BeanShell | GUI ✓ | HTTP · JDBC · JMS · FTP

The long-established tool, with broad protocol coverage. Its GUI lets you build a test plan without writing code, which suits teams that do not yet have coding experience.

✓ Teams without coding experience

k6

Code-first · CI/CD-ready

JavaScript (ES6) | CLI only | HTTP · WebSocket · gRPC

Scripts are written in modern JavaScript. It is high performance and easy to integrate into a CI/CD pipeline, with an optional cloud runner for managed execution. Suits Dev/DevOps teams who are comfortable with code.

✓ Dev/DevOps, CI/CD Pipeline

Locust

Python-based · Distributed

Python | Web UI ◑ | HTTP · WebSocket · Custom

Written entirely in Python, which makes it highly flexible. It handles distributed testing across multiple machines well. Suits teams that use Python and need custom behavior.

✓ Python teams, distributed testing

Gatling

Scala/Java · Rich Reports

Scala / Java | CLI only | HTTP · WebSocket · JMS

Centers on a readable DSL for Java/Scala teams and produces detailed, polished HTML reports. It supports complex simulations and is a good fit for enterprise Java.

✓ Java/Scala teams, detailed reports

ART

Artillery

YAML/JS · Cloud-native

JavaScript / YAML | CLI only | HTTP · WebSocket · Socket.io

Test plans are defined in YAML, so it is easy to get started. It works well with serverless and cloud-native setups and has a fast-growing plugin ecosystem.

✓ Node.js teams, cloud-native

Tool	Script Language	GUI	Protocol	Best For
Apache JMeter	Groovy / BeanShell	✓	HTTP, JDBC, JMS, FTP	Teams without coding experience
k6	JavaScript (ES6)	✗	HTTP, WebSocket, gRPC	Dev/DevOps, CI/CD Pipeline
Locust	Python	◑	HTTP, WebSocket, Custom	Python teams, distributed testing
Gatling	Scala / Java	✗	HTTP, WebSocket, JMS	Java/Scala teams, detailed reports
Artillery	JavaScript / YAML	✗	HTTP, WebSocket, Socket.io	Node.js teams, cloud-native

Test Planning

Plan Your Performance Test
From the Start

Most teams open JMeter and press Start right away, and end up with numbers they cannot compare against anything. This article covers seven steps, from objective to report.

1. Define Objective and Scope

State why the performance test is being run, which APIs and flows it covers, and whether the target environment is staging or production.

Output: Test Objective Document

2. Define Acceptance Criteria

Set a clear SLA before testing starts, for example "p95 response time < 2 seconds at 1,000 concurrent users with error rate < 1%", so that pass or fail is unambiguous.

Output: SLA Document with Thresholds

3. Analyze the Workload Model

Study production analytics to understand real traffic patterns: peak hours, the most common user journeys, and the transaction mix to simulate.

Output: Workload Model + User Scenarios

4. Prepare the Test Environment

Configure the environment to match production as closely as possible, record its sizing ratio relative to production (for example 1:4), and confirm that the test data is sufficient and varied.

Output: Environment Config + Data Preparation

5. Write Test Scripts

Build scripts that simulate real user behavior, complete with think time, correlation, parameterization, and assertions.

Output: Test Scripts + Smoke Test Pass

6. Run Tests According to the Strategy

Start with a load test to establish a baseline, then run the stress test, then the soak test. Monitor metrics in real time during each run.

Pro tip: always go Load → Stress → Soak

7. Analyze Results and Report

Summarize the results against the acceptance criteria, identify the bottlenecks found, and recommend action items with priority and effort estimates.

Output: Performance Test Report
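
Acceptance criteria are easiest to enforce when they are written down as data and checked mechanically. The sketch below encodes the example SLA from the acceptance-criteria step (p95 < 2 seconds, error rate < 1%). The class name, function name, and the two sample runs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriteria:
    p95_ms_max: float = 2000.0     # p95 response time must be below 2 s
    error_rate_max: float = 0.01   # error rate must be below 1%


def evaluate(p95_ms, error_rate, criteria=AcceptanceCriteria()):
    """Return a list of failure messages; an empty list means the run passed."""
    failures = []
    if p95_ms >= criteria.p95_ms_max:
        failures.append(f"p95 {p95_ms:.0f} ms is not below {criteria.p95_ms_max:.0f} ms")
    if error_rate >= criteria.error_rate_max:
        failures.append(
            f"error rate {error_rate:.2%} is not below {criteria.error_rate_max:.2%}"
        )
    return failures


# Two hypothetical runs at 1,000 concurrent users.
good_run = evaluate(p95_ms=1850, error_rate=0.004)
bad_run = evaluate(p95_ms=2300, error_rate=0.02)
print("good run:", "PASS" if not good_run else good_run)
print("bad run:", "PASS" if not bad_run else bad_run)
```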

Best Practices and Things to Avoid

✓ Do

Always run a load test first to establish a baseline
Put realistic think time in scripts
Look at p95/p99, not just the average
Test on an environment close to production
Monitor server resources during the test
Warm up the system before collecting measurements

✗ Avoid

Running performance tests only on go-live day
Looking only at average response time
Omitting think time from scripts, which inflates load unrealistically
Testing on an environment whose ratio to production is inconsistent
Skipping acceptance criteria before testing
Ramping up so fast that the load is unrealistic

Backend Strategy · Banking Standard

Performance Testing
for Bank-Grade Back Ends

A back end is more than REST APIs. Kafka, Event Hub, WebSocket, AI services, and gRPC each have their own workload patterns and entirely different measurement criteria.

📨

Apache Kafka

Distributed event streaming platform, used for payment events, transaction logs, and real-time data pipelines in banking systems.

Throughput · Consumer Lag · Partition

☁️

Azure Event Hub

Managed event streaming on Azure with a Kafka-compatible API. Suits cloud-native banking architectures built on the Microsoft stack.

Throughput Unit · Processing Unit · Capture

🔌

WebSocket

Full-duplex persistent connection, used for real-time dashboards, market data feeds, notification services, and trading systems.

Message Latency · Connection Stability · Concurrency

🤖

AI / LLM Service

Inference APIs, embedding services, and document processing, used in chatbots, fraud detection, credit scoring, and document OCR.

TTFT · Token/sec · Queue Depth

⚡

gRPC

High-performance RPC framework, used for internal microservice communication, core banking APIs, and inter-service calls that need low latency.

Unary · Streaming · Deadline

🗄️

Database Layer

RDBMS (Oracle, MSSQL, PostgreSQL) and NoSQL (Redis, MongoDB, Cassandra): the back-end layer with the greatest impact on performance in banking systems.

Query Time · Connection Pool · Deadlock

📬

Message Queue (MQ)

RabbitMQ, IBM MQ, ActiveMQ: used for batch processing, async workflows, and notification pipelines in legacy core banking systems.

Queue Depth · Message Rate · DLQ

🔄

Batch / ETL Processing

End-of-day processing, settlement batches, and report generation: processes that must finish within a strictly defined maintenance window.

Throughput · Window Time · Checkpoint

Banking Standard KPI

Standard KPI Thresholds
for Banking

Based on standards from the BOT (Bank of Thailand), ISO 20022, PCI-DSS, and SWIFT CSCF, grouped by criticality tier.

⚠ Note: these thresholds are a reference baseline. Each bank may set stricter SLAs, depending on internal policy, the applicable BOT regulations, and the nature of the system.

System / Protocol	Tier	Metric	Pass ✓	Warning ⚠	Critical ✗	Rationale
PromptPay / ทันใจ (Real-time Payment API)	T1 (Critical)	p95 Response Time	< 1s	1–3s	> 3s	BOT rule: a payment must complete within 3s
	T1	Error Rate	< 0.01%	0.01–0.1%	> 0.1%	Transaction loss directly affects customer funds
	T1	Availability	99.999%	99.99%	< 99.99%	99.999% allows about 5.3 min of downtime per year for T1
Core Banking API (Balance, Account, Transfer)	T1 (Critical)	p95 Response Time	< 2s	2–5s	> 5s	User experience plus the channel's timeout policy
	T1	p99 Response Time	< 5s	5–10s	> 10s	Prevents timeouts at the client / channel layer
	T1	Concurrent TPS	> 500 TPS	200–500 TPS	< 200 TPS	Depends on the bank's peak hours (typically 08:00–10:00)
Internet / Mobile Banking (Web, iOS, Android)	T2 (High)	p95 Response Time	< 2s	2–4s	> 4s	Apdex target 0.85, an industry UX standard
	T2	Error Rate	< 0.5%	0.5–1%	> 1%	Customer complaint threshold
	T2	Page Load (LCP)	< 2.5s	2.5–4s	> 4s	Google Core Web Vitals "Good" threshold
Apache Kafka (Payment / Transaction Events)	T1 (Critical)	Producer Throughput	> 50,000 msg/s	20k–50k msg/s	< 20,000 msg/s	Payment peak volume at end of month (EOM) and on salary day
	T1	Consumer Lag	< 1,000 msgs	1k–10k msgs	> 10,000 msgs	High lag means delayed downstream processing
	T1	End-to-end Latency	< 500ms	500ms–2s	> 2s	Producer → consumer delivery time
	T1	Message Loss Rate	0% (acks=all)	n/a	> 0%	Financial data must not lose even a single message
Azure Event Hub (Cloud Event Streaming)	T2 (High)	Ingress Throughput	≈ 1 MB/s per TU (the ingress cap)	0.5–1 MB/s	< 0.5 MB/s	Bounded by the provisioned Throughput Units (TU)
	T2	Event Latency p95	< 1s	1–3s	> 3s	Consumer group processing delay
	T2	Throttling Rate	< 0.1%	0.1–1%	> 1%	429 throttled requests: more TUs are needed
WebSocket (Market Data / Notification)	T2 (High)	Message Delivery p95	< 100ms	100–500ms	> 500ms	A real-time feed must reach the client before the market moves
	T2	Concurrent Connections	> 10,000	5k–10k	< 5,000	Per-node capacity; scale out behind a load balancer
	T2	Connection Drop Rate	< 0.01%/hr	0.01–0.1%	> 0.1%	Unexpected drops cause data gaps
	T2	Reconnect Time	< 5s	5–15s	> 15s	Auto-reconnect plus the message replay gap
AI / LLM Service (Chatbot, Fraud Detection)	T2 (High)	TTFT (Time to First Token)	< 2s	2–5s	> 5s	The wait before the user sees the first output: the critical UX point
	T2	Total Response p95	< 10s	10–30s	> 30s	Complete generation time (non-streaming)
	T2	Throughput (Token/s)	> 50 tok/s	20–50 tok/s	< 20 tok/s	Streaming token rate that users perceive as "fast"
AI Fraud Detection (Real-time Scoring)	T1 (Critical)	Inference Latency p99	< 200ms	200–500ms	> 500ms	Must always score before the transaction is authorized
gRPC Internal Services (Microservice Communication)	T1 (Critical)	Unary RPC p95	< 50ms	50–200ms	> 200ms	Internal hop: latency chains into end-user latency
	T1	Streaming Throughput	> 10,000 RPS	5k–10k RPS	< 5,000 RPS	Server-side streaming for bulk data
	T1	Deadline Exceeded %	< 0.1%	0.1–0.5%	> 0.5%	A deadline is gRPC's timeout: cascade failure risk
Oracle / MSSQL Core DB (Transaction Database)	T1 (Critical)	Query p95 (OLTP)	< 100ms	100–500ms	> 500ms	Core transaction queries should not exceed 100ms
	T1	Connection Pool Util	< 70%	70–85%	> 85%	Headroom for spikes; a full pool means hung requests
	T1	Deadlock Rate	0	1–5/hr	> 5/hr	A deadlock means failed transactions plus a retry storm
Redis Session / Cache (In-memory Data Store)	T2 (High)	GET/SET Latency p99	< 1ms	1–5ms	> 5ms	Redis should always be faster than a network round trip
	T2	Cache Hit Rate	> 95%	80–95%	< 80%	A cache miss is a DB hit: latency rises 10–100x
EOD Batch / Settlement (End-of-Day Processing)	T1 (Critical)	Completion Time	< 2hr window	2–3hr	> 3hr (miss window)	Must finish before the BOT settlement deadline at 00:00
	T1	Record Throughput	> 1M rec/hr	500k–1M	< 500k rec/hr	Derived from daily transaction volume
IBM MQ / RabbitMQ (Message Queue)	T2 (High)	Queue Depth (steady)	< 1,000 msgs	1k–10k msgs	> 10,000 msgs	Queue build-up means backpressure and delayed processing
	T2	DLQ Rate	< 0.01%	0.01–0.1%	> 0.1%	Dead letters are messages that need manual intervention

Criticality Tiers:
T1 (Critical): Availability 99.999%, RTO < 15 min
T2 (High): Availability 99.99%, RTO < 1 hr
T3 (Standard): Availability 99.9%, RTO < 4 hr
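
Availability percentages are easier to reason about as a yearly downtime budget. The sketch below converts each tier's target into minutes per year. It assumes a 365.25-day year, and the function name is illustrative.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes, averaging in leap years


def downtime_minutes_per_year(availability_pct):
    """Maximum downtime per year that still meets the availability target."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


budgets = {}
for tier, pct in (("T1", 99.999), ("T2", 99.99), ("T3", 99.9)):
    budgets[tier] = downtime_minutes_per_year(pct)
    print(f"{tier}: {pct}% -> {budgets[tier]:.1f} min/year")
```

Each extra nine shrinks the budget tenfold: roughly 5.3 minutes for T1, 53 minutes for T2, and under 9 hours for T3.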

Deep Dive · Kafka

Kafka Performance
Test Strategy

📨

Apache Kafka — Banking Payment Events

Producer / Consumer / Broker Performance Testing

KPI THRESHOLDS

Producer Throughput

> 50k msg/s

per broker node

Consumer Lag (steady)

< 1,000 msgs

per consumer group

End-to-end Latency p95

< 500ms

Producer → Consumer

Message Loss (acks=all)

0%

Financial data: zero loss

Broker CPU (peak)

< 70%

30% headroom for rebalancing

Replication Lag

< 10ms

In-Sync Replica (ISR)

TEST SCENARIOS

Producer Throughput Test

Measure the producer's maximum throughput across different batch.size, linger.ms, and compression.type settings, comparing gzip, snappy, and lz4 on payment event payloads of 512B–2KB.

Consumer Lag Under Load

Produce messages at 2x the normal rate, measure the resulting consumer lag, then measure the recovery time once the producer returns to its normal rate. The consumer must catch up within 5 minutes.

Broker Failure & Rebalance

Kill a broker mid-test, then measure leader election time, ISR rebalance time, producer retry duration, and consumer rebalance time. Recovery must complete within 30s.

EOM (End of Month) Spike Simulation

Simulate salary-day traffic: raise the produce rate 10x over 2 minutes, hold the peak for 30 minutes, then ramp down. Measure maximum consumer lag, latency spikes, and DLQ rate.

Message Durability Verification

With acks=all and min.insync.replicas=2, produce 1M messages and kill one broker during the run. Confirm that the consumer receives all 1M messages with no loss.
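
For the consumer lag scenario, the 5-minute catch-up requirement can be sanity-checked before the test with simple arithmetic: a backlog drains at the difference between the consume rate and the produce rate. The sketch below is a back-of-the-envelope model that assumes constant rates, and every number in it is hypothetical.

```python
def catch_up_seconds(lag_msgs, consume_rate, produce_rate):
    """Seconds needed to drain a backlog; infinite if consumers cannot outpace producers."""
    drain_rate = consume_rate - produce_rate
    if drain_rate <= 0:
        return float("inf")
    return lag_msgs / drain_rate


# Hypothetical: 1.2M messages of lag, consumers at 30k msg/s, producers back at 25k msg/s.
seconds = catch_up_seconds(lag_msgs=1_200_000, consume_rate=30_000, produce_rate=25_000)
print(f"catch-up time: {seconds:.0f}s, within 5 min: {seconds <= 300}")
```

If the consume rate has no headroom over the normal produce rate, lag never recovers, whatever the size of the backlog.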

Python · kafka-python — Kafka Producer Load Test

import json
import statistics
import time

from kafka import KafkaProducer  # pip install kafka-python (snappy also needs python-snappy)

producer = KafkaProducer(
    bootstrap_servers=['kafka-broker-1:9092', 'kafka-broker-2:9092'],
    acks='all',                          # Financial: zero message loss
    retries=3, batch_size=16384, linger_ms=5, compression_type='snappy',
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)
latencies, errors = [], []

def on_send_success(t_send, metadata):
    # kafka-python passes the args bound in add_callback first, then the RecordMetadata.
    latencies.append((time.time() - t_send) * 1000)   # send-to-ack latency in ms

TARGET_RATE = 50_000   # msg/s; a single Python process may not reach this, so run several workers
DURATION = 300         # seconds
interval = 1 / TARGET_RATE
seq = 0                # sequence number for unique transaction IDs
start = time.time()
while time.time() - start < DURATION:
    t = time.time()
    future = producer.send('payment-events', value={'txn_id': f'TXN-{seq}', 'ts': t})
    future.add_callback(on_send_success, t)
    future.add_errback(lambda e: errors.append(str(e)))
    seq += 1
    time.sleep(max(0, interval - (time.time() - t)))
producer.flush()       # wait for outstanding acks before reading the results
elapsed = time.time() - start

acked = len(latencies)
p95 = statistics.quantiles(latencies, n=100)[94]
print(f"Acked: {acked:,} | Errors: {len(errors)} | p95: {p95:.1f}ms | {acked/elapsed:,.0f} msg/s")

Deep Dive · WebSocket

WebSocket Performance
Test Strategy

🔌

WebSocket — Real-time Feed & Notification

Connection Stability · Message Latency · Concurrent Clients

Concurrent Connections

> 10,000

per server node

Message Latency p95

< 100ms

Server → Client Delivery

Connection Drop Rate

< 0.01%/hr

Unexpected Disconnect

Reconnect Time p95

< 5s

Auto-reconnect SLA

Handshake Time p95

< 200ms

TLS + WS Upgrade

Message Loss Rate

0%

Market data: no gaps

Connection Ramp-up Test

Raise concurrent connections gradually from 0 → 10,000 → 50,000, measuring server memory, CPU, and handshake latency at each level, to find the connection saturation point.

Message Broadcast Latency

Broadcast a market data message to all connections at once and measure the time from publish until the last client receives it. p95 must stay below 100ms at 10,000 connections.

Reconnection Storm Test

Kill the server and restart it, then measure the thundering herd effect as 10,000 clients reconnect at once. Verify that exponential backoff works correctly.

24hr Soak with Connection Monitoring

Hold 5,000 connections for 24 hours, measuring the server's memory growth rate, WebSocket frame count, and heartbeat miss rate, to detect connection leaks.
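
The reconnection storm scenario depends on clients spreading out their retries. A common client-side approach is exponential backoff with full jitter, sketched below. The base delay, the 30-second cap, and the function name are assumptions for illustration and are not taken from any particular WebSocket client library.

```python
import random


def backoff_delay(attempt, base_s=1.0, cap_s=30.0, rng=random):
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


rng = random.Random(42)  # seeded only so the demo is repeatable
delays = [backoff_delay(attempt, rng=rng) for attempt in range(8)]
for attempt, delay in enumerate(delays):
    print(f"attempt {attempt}: wait {delay:.2f}s before reconnecting")
```

Because each client draws its own random delay, 10,000 clients that lose their connection at the same instant do not all retry at the same instant, which is the behavior the test should confirm on the server side.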

JavaScript · k6 — WebSocket Concurrent Connection Test

import { check } from 'k6';
import ws from 'k6/ws';
import { Trend, Counter } from 'k6/metrics';

const msgLatency = new Trend('ws_msg_latency', true);
const dropCount  = new Counter('ws_drop_count');

export const options = {
  stages: [
    { duration: '2m', target: 1000 }, { duration: '5m', target: 10000 },
    { duration: '10m', target: 10000 }, { duration: '2m', target: 0 },
  ],
  thresholds: { 'ws_msg_latency': ['p(95)<100'], 'ws_drop_count': ['count<10'] },
};

export default function () {
  const res = ws.connect('wss://market.bank.th/feed', {}, function (sock) {
    let closedByTest = false;   // separates our planned close from an unexpected drop
    sock.on('open', () => sock.send(JSON.stringify({ type: 'subscribe', channel: 'SET' })));
    sock.on('message', (data) => {
      const msg = JSON.parse(data);
      // Assumes server_ts is epoch milliseconds and that both clocks are NTP-synced.
      if (msg.server_ts) msgLatency.add(Date.now() - msg.server_ts);
    });
    sock.on('close', () => { if (!closedByTest) dropCount.add(1); });
    sock.setTimeout(() => { closedByTest = true; sock.close(); }, 600000);  // hold for 10 min
  });
  check(res, { 'WS connected': (r) => r && r.status === 101 });
}

Deep Dive · AI Service

AI / LLM Service
Performance Strategy

🤖

AI Service — Chatbot, Fraud Detection, Document Processing

TTFT · Token Throughput · Concurrency · Model Degradation

⚠ Banking Context: AI in banking falls into two groups with very different SLAs. Conversational AI (chatbots) can tolerate higher latency, while decision AI (fraud detection, credit scoring) must respond within the transaction window (typically < 500ms).

TTFT — Chatbot

< 2s

Time to First Token

Token Throughput

> 50 tok/s

Streaming Generation Rate

Fraud Inference p99

< 200ms

T1 Critical — Pre-authorize

Concurrent Requests

> 100 RPS

per model serving instance

Queue Wait Time p95

< 5s

Inference Queue Depth

Error Rate (timeout)

< 1%

Model Timeout / OOM

Baseline Latency Profiling

Send requests of varying token length (128, 256, 512, 1024, 2048 tokens) with a single concurrent user to build a latency-vs-token-length curve as the baseline.

Concurrent Inference Load Test

Raise concurrent requests from 1 → 10 → 50 → 100, measuring TTFT, total latency, and GPU/CPU utilization, to find the concurrency at which latency exceeds the SLA.

Model Degradation Under Sustained Load

Run at 80% of capacity for 4 hours and measure latency drift every 30 minutes, looking for GPU memory fragmentation, batch size fluctuation, and throughput degradation.

Fraud Detection Latency Gate Test

Test the fraud model at 1,000 transactions per second. Every transaction must be scored and returned within 200ms (p99); otherwise the transaction cannot be blocked in time.
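
TTFT and token throughput can both be derived from the arrival times of streamed tokens. The sketch below measures them for any iterable that yields tokens. It is not tied to a specific LLM client, and the scripted clock and sample tokens are stand-ins so the example runs without a model endpoint.

```python
import time


def measure_stream(chunks, clock=time.perf_counter):
    """Return (ttft_seconds, tokens_per_second) for an iterable that yields tokens."""
    start = clock()
    first = last = None
    count = 0
    for _ in chunks:
        last = clock()
        if first is None:
            first = last
        count += 1
    if first is None:
        raise ValueError("stream produced no tokens")
    ttft = first - start
    generation_time = last - first
    # Rate is measured after the first token, so queueing and prefill time are excluded.
    rate = (count - 1) / generation_time if generation_time > 0 else 0.0
    return ttft, rate


# Scripted clock: request sent at t=0.0, tokens arrive at t=1.5, 2.0, and 2.5 seconds.
ticks = iter([0.0, 1.5, 2.0, 2.5])
ttft, tokens_per_s = measure_stream(["Hel", "lo", "!"], clock=lambda: next(ticks))
print(f"TTFT: {ttft:.2f}s, rate: {tokens_per_s:.1f} tok/s")
```

In a real test, `chunks` would be the streaming response iterator of whichever client is in use, and the default `time.perf_counter` clock would apply.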

Deep Dive · gRPC

gRPC Performance
Test Strategy

⚡

gRPC — Internal Microservice Communication

Unary · Server Streaming · Deadline · Connection Pool

Unary RPC p95

< 50ms

Service-to-Service

Streaming Throughput

> 10k RPS

Server-side Stream

Deadline Exceeded %

< 0.1%

Cascade Failure Risk

Connection Reuse Rate

> 95%

HTTP/2 Multiplexing

Use ghz or the k6 gRPC module for load testing (not an HTTP tool)

Test unary, server-streaming, client-streaming, and bidirectional streaming separately

Measure Deadline Exceeded %: a high value means the service chain does not have enough latency budget

Test connection pool size: set MaxConcurrentStreams to suit the back end

gRPC connections without keep-alive get timed out by the LB/proxy, so verify transparent reconnect

Measure Protobuf serialization overhead against JSON on the same payload
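
The "latency budget" point can be made concrete by subtracting each hop's latency from the caller's deadline and watching where the budget runs out. The sketch below is plain arithmetic, not gRPC code: the service names and per-hop latencies are hypothetical, and the 200ms deadline mirrors the timeout used in the ghz example.

```python
def remaining_budget_ms(deadline_ms, hop_latencies_ms):
    """Walk a call chain and return (hop_name, budget_left_ms) after each hop."""
    remaining = deadline_ms
    trace = []
    for name, latency_ms in hop_latencies_ms:
        remaining -= latency_ms
        trace.append((name, remaining))
    return trace


# Hypothetical chain behind one client call with a 200 ms deadline.
hops = [("api-gateway", 20), ("account-svc", 45), ("ledger-svc", 60), ("core-db", 90)]
trace = remaining_budget_ms(200, hops)
for name, left in trace:
    status = "ok" if left >= 0 else "DEADLINE EXCEEDED"
    print(f"{name:12s} budget left: {left:4d} ms  {status}")
```

Here the first three hops fit but the fourth overruns by 15ms, which is the pattern a rising Deadline Exceeded % usually points to.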

Shell · ghz — gRPC Load Test

# ── Unary RPC: Core Account Service ─────────────────────────────
ghz --insecure \
  --proto ./proto/account.proto \
  --call  bank.account.AccountService/GetBalance \
  --data  '{"account_id": "ACC-001"}' \
  --rps 5000 --duration 300s --connections 100 --concurrency 1000 \
  --timeout 200ms --format pretty core-banking-svc:9090

# Pass criteria: p95 < 50ms | DeadlineExceeded < 0.1%

Deep Dive · Database & Event Hub

Database Layer &
Azure Event Hub Strategy

🗄️

Database Performance Testing

Oracle · MSSQL · PostgreSQL · Redis — Banking Core DB

OLTP Query p95

< 100ms

Core Transaction

Connection Pool Util

< 70%

Peak Hours

Deadlock / hr

0

Zero tolerance

Redis GET p99

< 1ms

Session / Token Cache

Cache Hit Rate

> 95%

Redis / Memcached

Long Query (Slow log)

0 queries > 1s

No OLTP query may take longer than 1s

OLTP Benchmark (HammerDB / pgbench)

Run a mixed read/write workload at a 70:30 ratio and measure TPS, p95 latency, and lock wait time. Target: > 5,000 TPS on a standard banking workload.

Connection Pool Exhaustion Test

Add connections until the pool reaches its maximum, then measure wait queue time, timeout rate, and error responses. The pool must absorb 2x the normal peak before the queue fills.

Index Regression Test

Run critical queries through EXPLAIN PLAN and compare execution plans after data volume grows 10x. Verify that indexes are still used and that no full table scan hits a core table.
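
A starting pool size for the exhaustion test can be estimated with Little's Law: busy connections ≈ TPS × time each transaction holds a connection. The sketch below combines that with the < 70% utilization target. The 10ms hold time and both function names are assumptions; 5,000 TPS is the benchmark target quoted in the OLTP scenario.

```python
import math


def busy_connections(tps, hold_time_ms):
    """Little's Law: average number of connections in use at a given TPS."""
    return tps * hold_time_ms / 1000


def min_pool_size(tps, hold_time_ms, max_utilization=0.70):
    """Smallest pool that keeps average utilization at or below the target."""
    return math.ceil(busy_connections(tps, hold_time_ms) / max_utilization)


# 5,000 TPS, each transaction assumed to hold a connection for 10 ms.
in_use = busy_connections(5000, 10)
pool = min_pool_size(5000, 10)
print(f"connections in use: {in_use:.0f}, pool size for < 70% utilization: {pool}")
```

This is an average-case estimate. The exhaustion test itself is still needed to see how the pool behaves when hold times stretch under load.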

☁️

Azure Event Hub — Cloud Event Streaming

Throughput Unit · Partition Scaling · Consumer Group

Ingress per TU

≈ 1 MB/s

per Throughput Unit (the ingress cap)

Event Latency p95

< 1s

Publish → Consumer

Throttle Rate

< 0.1%

429 Too Many Requests

Checkpoint Interval

< 30s

Consumer Checkpoint

Test TU auto-inflate: send events beyond the configured TUs and verify that TUs scale up in time

Measure partition imbalance: every consumer should receive an even share of events

Test Capture to storage: verify that no events go missing when Capture to ADLS is enabled

Azure Event Hub caps ingress at 1 MB/s per TU, so plan TUs ahead of time rather than during a spike

Test consumer group rebalance: drop one consumer, then measure rebalance time and event gap
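
Planning TUs ahead of a spike comes down to dividing the expected peak by the per-TU limit. The sketch below uses the 1 MB/s ingress cap stated above, plus a limit of 1,000 events per second per TU. Only the 1 MB/s figure appears in this article, so treat the event-count limit as an assumption to confirm against current Azure documentation. The function name and the peak figures are hypothetical.

```python
import math

INGRESS_MB_PER_TU = 1.0        # stated in this article
INGRESS_EVENTS_PER_TU = 1000   # assumption: verify against current Azure limits


def required_throughput_units(peak_mb_per_s, peak_events_per_s):
    """TUs needed so that neither the byte limit nor the event limit is exceeded."""
    by_bytes = peak_mb_per_s / INGRESS_MB_PER_TU
    by_events = peak_events_per_s / INGRESS_EVENTS_PER_TU
    return max(1, math.ceil(max(by_bytes, by_events)))


# Hypothetical salary-day peak: 3.2 MB/s made up of 5,500 small events per second.
tus = required_throughput_units(peak_mb_per_s=3.2, peak_events_per_s=5500)
print(f"throughput units required: {tus}")
```

With many small events, the event-count limit rather than the byte limit decides the TU count, as in this example.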

Tool Recommendation

Performance Test Tools
by Protocol

No single tool does everything. Choose one that matches the protocol and the banking context.

REST API / HTTP

k6 · JMeter · Locust

k6 recommended for CI/CD · JMeter for legacy enterprise

Kafka / Event Streaming

kafka-python · kcat · Gatling Kafka

Use kafka-producer-perf-test and kafka-consumer-perf-test for broker-level benchmarks

Azure Event Hub

Azure SDK + Locust · JMeter Azure Plugin

Monitor via Azure Monitor Metrics + Diagnostic Logs

WebSocket

k6 (ws module) · Artillery · Gatling

The k6 ws module handles concurrent WebSocket connections well

AI / LLM Service

Locust · k6 · LiteLLM Benchmark

Measure TTFT as a separate metric, using streaming responses

gRPC

ghz · k6 (gRPC module) · Gatling gRPC

ghz is the easiest option for a quick benchmark

Database (OLTP)

HammerDB · pgbench · sysbench

HammerDB supports Oracle, MSSQL, PostgreSQL

Redis / Cache

redis-benchmark · memtier_benchmark

memtier_benchmark supports both Redis and Memcached

Glossary

Performance Testing
Glossary

50+ terms used frequently in performance testing, listed alphabetically with real-world usage examples.
