TECHNICAL DEEP DIVES | ID: 3 | STATUS: PUBLISHED

Understanding TCP Handshakes in Monitoring

Author: net_ops
Date: 11/10/2025
Read Time: 8m 45s
Tags: NETWORKING

> Beyond HTTP 200

Most monitoring solutions stop at the HTTP status code. But what if the connection takes 10 seconds to establish?

- The 3-Way Handshake

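Understanding the SYN, SYN-ACK, ACK exchange is crucial for identifying network-level issues before they become critical. The client can start sending data as soon as it has received the SYN-ACK and replied with its ACK, so a healthy handshake costs roughly one network round trip.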

Client          Server
  |    SYN       |
  |------------->|
  |   SYN-ACK    |
  |<-------------|
  |    ACK       |
  |------------->|

- Why It Matters

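A slow TCP handshake can indicate:

  • Network congestion
  • Server overload
  • DNS resolution issues (strictly, these happen before the first SYN is sent, so they only look like a slow handshake when DNS and connect time are measured together)
  • Firewall misconfigurations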
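When a handshake is slow enough to hit a timeout, or fails outright, the error code reported by Node.js can help narrow down which of the causes above is more likely. The mapping below is a rough sketch, not a diagnosis: the category names and the `classifyConnectError` helper are made up for illustration, and the same code can have more than one cause.

```typescript
// Hypothetical categories for illustration only.
type LikelyCause = "dns" | "no-reply" | "refused" | "unreachable" | "unknown";

// Map a Node.js error code (err.code) to a coarse category.
function classifyConnectError(code: string | undefined): LikelyCause {
  switch (code) {
    case "ENOTFOUND": // hostname did not resolve
    case "EAI_AGAIN": // temporary DNS failure
      return "dns";
    case "ETIMEDOUT": // SYN got no answer in time: dropped packets, a filtering firewall, or an overloaded server
      return "no-reply";
    case "ECONNREFUSED": // peer answered with a reset: nothing listening, or an active reject rule
      return "refused";
    case "EHOSTUNREACH":
    case "ENETUNREACH": // no route to the host or network
      return "unreachable";
    default:
      return "unknown";
  }
}

console.log(classifyConnectError("ETIMEDOUT")); // prints: no-reply
```

In a probe, the code would typically come from the `error` event of the socket, for example `socket.on("error", (err) => classifyConnectError((err as NodeJS.ErrnoException).code))`.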

- Measuring Connection Time

We track multiple metrics:

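  1. DNS Lookup Time - How long it takes to resolve the hostname to an IP address
  2. TCP Connect Time - Time to complete the three-way handshake
  3. TLS Handshake Time - Duration of the TLS (formerly SSL) negotiation once the TCP connection is up
  4. Time to First Byte (TTFB) - How long the server takes to start responding after the request is sent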
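To make the relationship between these phases concrete, the sketch below adds them up and picks out the slowest one. It assumes the total is simply the sum of the four phases, which the article does not state explicitly; `withTotal`, `slowestPhase`, and the sample numbers are invented for illustration.

```typescript
interface ConnectionMetrics {
  dnsLookup: number;
  tcpConnect: number;
  tlsHandshake: number;
  ttfb: number;
  total: number;
}

type Phase = "dnsLookup" | "tcpConnect" | "tlsHandshake" | "ttfb";
const PHASES: Phase[] = ["dnsLookup", "tcpConnect", "tlsHandshake", "ttfb"];

// Assumption: total is the sum of the four sequential phases (all in ms).
function withTotal(p: Omit<ConnectionMetrics, "total">): ConnectionMetrics {
  return { ...p, total: p.dnsLookup + p.tcpConnect + p.tlsHandshake + p.ttfb };
}

// Return the phase that contributes the most time.
function slowestPhase(m: ConnectionMetrics): Phase {
  return PHASES.reduce((worst, p) => (m[p] > m[worst] ? p : worst));
}

// Made-up sample: the request succeeds, but almost all the time goes to the handshake.
const sample = withTotal({ dnsLookup: 12, tcpConnect: 8000, tlsHandshake: 40, ttfb: 90 });
console.log(sample.total, slowestPhase(sample)); // prints: 8142 tcpConnect
```

Keeping the phases separate is what lets a dashboard show that a slow response is a connection problem and not an application problem.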

- Implementation

Here's how we measure these metrics in our monitoring system:

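import { promises as dns } from "node:dns";
import * as net from "node:net";
import { performance } from "node:perf_hooks";

interface ConnectionMetrics {
  dnsLookup: number;
  tcpConnect: number;
  tlsHandshake: number;
  ttfb: number;
  total: number;
}

// connect() is not shown in the original; this minimal version using node:net is an assumption.
// It resolves once the three-way handshake has completed.
function connect(host: string, port: number): Promise<net.Socket> {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port });
    socket.once("connect", () => resolve(socket));
    socket.once("error", reject);
  });
}

async function measureConnection(url: string): Promise<ConnectionMetrics> {
  const metrics: Partial<ConnectionMetrics> = {};
  const { hostname, port, protocol } = new URL(url);

  // DNS lookup timing (resolve the hostname, not the full URL)
  const dnsStart = performance.now();
  const [address] = await dns.resolve(hostname);
  metrics.dnsLookup = performance.now() - dnsStart;

  // TCP connection timing (connect to the resolved IP so DNS is not counted twice)
  const tcpStart = performance.now();
  const socket = await connect(address, Number(port) || (protocol === "https:" ? 443 : 80));
  metrics.tcpConnect = performance.now() - tcpStart;

  // ... rest of the implementation

  return metrics as ConnectionMetrics;
}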
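The remaining phases are elided in the snippet above. As one possible approach, and not necessarily the one used in the system described here, the socket events that Node.js emits during an HTTPS request can be timestamped and converted into phase durations. The names `Marks`, `toMetrics`, and `measureHttps` are hypothetical, and the `response` event is used as an approximation of the first byte.

```typescript
import * as https from "node:https";
import { performance } from "node:perf_hooks";

interface ConnectionMetrics {
  dnsLookup: number;
  tcpConnect: number;
  tlsHandshake: number;
  ttfb: number;
  total: number;
}

// Absolute timestamps (ms) captured at each stage of the request.
interface Marks {
  start: number;
  lookup: number;
  connect: number;
  secureConnect: number;
  response: number;
}

// Pure helper: turn absolute timestamps into per-phase durations.
function toMetrics(m: Marks): ConnectionMetrics {
  return {
    dnsLookup: m.lookup - m.start,
    tcpConnect: m.connect - m.lookup,
    tlsHandshake: m.secureConnect - m.connect,
    ttfb: m.response - m.secureConnect,
    total: m.response - m.start,
  };
}

function measureHttps(url: string): Promise<ConnectionMetrics> {
  return new Promise((resolve, reject) => {
    const start = performance.now();
    // Default every mark to start; "lookup" never fires when the host is an IP literal.
    let lookup = start;
    let connect = start;
    let secureConnect = start;

    // agent: false forces a fresh connection, so a handshake actually takes place.
    const req = https.get(url, { agent: false }, (res) => {
      const response = performance.now(); // headers received: approximates first byte
      res.resume(); // discard the body
      resolve(toMetrics({ start, lookup, connect, secureConnect, response }));
    });

    req.on("socket", (socket) => {
      socket.once("lookup", () => { lookup = performance.now(); });
      socket.once("connect", () => { connect = performance.now(); });
      socket.once("secureConnect", () => { secureConnect = performance.now(); });
    });
    req.on("error", reject);
  });
}
```

A call such as `measureHttps("https://example.com").then(console.log)` would print one set of timings. Note that this variant lets Node.js do its own DNS lookup, so results can differ slightly from a probe that resolves the name separately.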

- Real-World Example

We once caught a critical issue where HTTP requests returned 200 OK, but the TCP handshake was taking 8+ seconds due to a misconfigured load balancer.
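A status-only check would have reported that incident as healthy. The sketch below shows the kind of rule that catches it: a probe with a good status code is still marked degraded when its TCP connect time exceeds a limit. The `ProbeResult` shape, the `evaluate` function, and the 1000 ms limit are assumptions for illustration, not values from the incident.

```typescript
type Verdict = "ok" | "degraded" | "down";

interface ProbeResult {
  status: number; // HTTP status code
  tcpConnectMs: number; // time spent in the TCP handshake
}

// Hypothetical limit; tune it to your own baseline.
const TCP_CONNECT_LIMIT_MS = 1000;

function evaluate(probe: ProbeResult, limitMs: number = TCP_CONNECT_LIMIT_MS): Verdict {
  // Simplification: anything outside 2xx/3xx counts as down.
  if (probe.status < 200 || probe.status >= 400) return "down";
  // The status is fine, but a slow handshake still counts against the service.
  return probe.tcpConnectMs > limitMs ? "degraded" : "ok";
}

console.log(evaluate({ status: 200, tcpConnectMs: 8200 })); // prints: degraded
```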

- Best Practices

  1. Monitor all connection phases separately
  2. Set appropriate thresholds for each metric
  3. Alert on trends, not just absolute values
  4. Consider geographic distribution in your measurements
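Alerting on trends can be sketched as a comparison between a recent window and an earlier baseline, kept separately per probe location. This is one simple approach among many; the window size, the 1.5x factor, the region names, and the sample values are all made up for illustration.

```typescript
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// True when the median of the last `recent` samples exceeds the
// median of all earlier samples by more than `factor`.
function isTrendingUp(samples: number[], recent: number = 5, factor: number = 1.5): boolean {
  if (samples.length < recent * 2) return false; // not enough history yet
  const baseline = median(samples.slice(0, -recent));
  const latest = median(samples.slice(-recent));
  return latest > baseline * factor;
}

// TCP connect times in ms, kept per region so one slow location is not averaged away.
const tcpConnectByRegion: Record<string, number[]> = {
  "eu-west": [20, 21, 19, 22, 20, 21, 20, 22, 21, 20],
  "us-east": [30, 31, 29, 30, 32, 48, 52, 55, 60, 58],
};

for (const [region, samples] of Object.entries(tcpConnectByRegion)) {
  if (isTrendingUp(samples)) console.log(`${region}: TCP connect time trending up`);
}
// prints: us-east: TCP connect time trending up
```

Both regions in this sample would pass a fixed threshold of a few hundred milliseconds, yet one of them has nearly doubled its connect time, which is the case a trend-based alert is meant to surface.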

> Conclusion

True uptime monitoring goes beyond checking status codes. Understanding and monitoring the underlying network behavior is essential for maintaining reliable services.

Filed Under: #NETWORKING #TCP/IP #GUIDE
