Health Check System

Monitor your SMTP server health with built-in HTTP endpoints and custom checks.

Healthy

All components working normally. System is fully operational.

Degraded

Some components affected but system is functional.

Unhealthy

Critical components failed. System needs attention.

Basic Setup

Enable health monitoring with a single line of code:

BasicHealthCheck.cs
using Zetian.Server;
using Zetian.HealthCheck.Extensions;

// Create SMTP server
var server = new SmtpServerBuilder()
    .Port(25)
    .Build();

// Enable health check on port 8080
var healthCheck = server.EnableHealthCheck(8080);

// Start server with health check
await server.StartWithHealthCheckAsync(8080);

// Access health endpoints:
// http://localhost:8080/health
// http://localhost:8080/health/livez
// http://localhost:8080/health/readyz

/health

Complete health status with all checks

/health/livez

Liveness probe for container orchestration

/health/readyz

Readiness probe for load balancing
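
To confirm that the three endpoints respond, a small client can request each one and print the HTTP status code. The following is a minimal sketch using only the .NET HttpClient; the base address assumes the health check listens on localhost:8080 as in BasicHealthCheck.cs, and FormatProbe is a hypothetical helper, not part of Zetian.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Assumption: the health check listens on localhost:8080, as in BasicHealthCheck.cs.
using var http = new HttpClient
{
    BaseAddress = new Uri("http://localhost:8080"),
    Timeout = TimeSpan.FromSeconds(5)
};

// Hypothetical helper: formats one probe result.
// A null status code means the endpoint could not be reached.
static string FormatProbe(string path, int? statusCode) =>
    statusCode is int code ? $"{path} -> HTTP {code}" : $"{path} -> unreachable";

foreach (var path in new[] { "/health", "/health/livez", "/health/readyz" })
{
    try
    {
        using var response = await http.GetAsync(path);
        Console.WriteLine(FormatProbe(path, (int)response.StatusCode));
    }
    catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
    {
        // Connection refused or timed out: the health service is not running or not reachable.
        Console.WriteLine(FormatProbe(path, null));
    }
}
```

If the health service is not running, each line reports the endpoint as unreachable instead of throwing.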

Custom Health Checks

Add custom checks for databases, external services, and system resources:

CustomHealthChecks.cs
using Zetian.Server;
using System.Diagnostics;
using StackExchange.Redis;
using Microsoft.Data.SqlClient;
using Zetian.HealthCheck.Models;
using Zetian.HealthCheck.Extensions;

// Create SMTP server
var server = new SmtpServerBuilder()
    .Port(25)
    .Build();

await server.StartAsync();

// Enable health check on port 8080
var healthCheck = server.EnableHealthCheck(8080);

// Add custom health checks
healthCheck.AddHealthCheck("database", async (ct) =>
{
    try
    {
        // Time how long it takes to open a database connection
        var sw = Stopwatch.StartNew();
        using var connection = new SqlConnection("Server=...;Database=...;User ID=...;Password=...;");
        await connection.OpenAsync(ct);
        sw.Stop();

        if (sw.ElapsedMilliseconds > 1000)
        {
            return HealthCheckResult.Degraded(
                $"Database slow: {sw.ElapsedMilliseconds}ms");
        }

        return HealthCheckResult.Healthy("Database is responsive");
    }
    catch (Exception ex)
    {
        return HealthCheckResult.Unhealthy(
            "Cannot connect to database", ex);
    }
});

// Add Redis health check
healthCheck.AddHealthCheck("redis", async (ct) =>
{
    try
    {
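        // Connecting on every check keeps the example short;
        // in production, reuse a shared ConnectionMultiplexer instead
        using var redis = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
        var db = redis.GetDatabase();
        var latency = await db.PingAsync();

        var info = new Dictionary<string, object>
        {
            ["ping_ms"] = latency.TotalMilliseconds,
            ["outstanding_operations"] = redis.GetCounters().TotalOutstanding,
            // Managed memory of this process, not memory used by Redis
            ["process_managed_memory_mb"] = GC.GetTotalMemory(false) / (1024 * 1024)
        };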

        return HealthCheckResult.Healthy("Redis connected", info);
    }
    catch (Exception ex)
    {
        // Redis is not critical, mark as degraded
        return HealthCheckResult.Degraded("Redis unavailable", ex);
    }
});

// Add disk space check
healthCheck.AddHealthCheck("disk_space", async (ct) =>
{
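    // "C" assumes Windows; on Linux or in containers use the mount point, e.g. "/"
    var drive = new DriveInfo("C");
    // Divide by a double so the fractional part survives for the F2 format below
    var freeSpaceGB = drive.AvailableFreeSpace / (1024.0 * 1024 * 1024);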

    if (freeSpaceGB < 1)
    {
        return HealthCheckResult.Unhealthy($"Low disk space: {freeSpaceGB:F2} GB");
    }
    else if (freeSpaceGB < 5)
    {
        return HealthCheckResult.Degraded($"Disk space warning: {freeSpaceGB:F2} GB");
    }

    return HealthCheckResult.Healthy($"Disk space OK: {freeSpaceGB:F2} GB");
});

// Start health check
await healthCheck.StartAsync();

// Access health endpoints:
// http://localhost:8080/health
// http://localhost:8080/health/livez
// http://localhost:8080/health/readyz

Binding Options

Configure how the health check HTTP service binds to network interfaces:

HealthCheckBinding.cs
using System.Net;
using Zetian.Server;
using Zetian.HealthCheck.Extensions;
using Zetian.HealthCheck.Options;

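// The snippets below are alternatives: use one per server.
// "server" is the SMTP server instance built as in Basic Setup.

// Bind to localhost (default)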
var healthCheck = server.EnableHealthCheck(8080);

// Bind to specific IP
var healthCheck = server.EnableHealthCheck(
    IPAddress.Parse("192.168.1.100"), 8080);

// Bind to specific hostname
var healthCheck = server.EnableHealthCheck(
    "myserver.local", 8080);

// Bind to all interfaces (Docker/Kubernetes)
var healthCheck = server.EnableHealthCheck("0.0.0.0", 8080);

// IPv6 support
var healthCheck = server.EnableHealthCheck(
    IPAddress.IPv6Loopback, 8080);

// Custom service options
var serviceOptions = new HealthCheckServiceOptions
{
    // Define HTTP prefixes to listen on
    Prefixes = new() { "http://+:8080/health/" }, // Listen on all interfaces
    DegradedStatusCode = 200 // HTTP status code for degraded state
};

// SMTP health check options
var smtpOptions = new SmtpHealthCheckOptions
{
    CheckMemoryUsage = true,
    DegradedThresholdPercent = 60,   // 60% utilization = degraded
    UnhealthyThresholdPercent = 85   // 85% utilization = unhealthy
};

var healthCheck = server.EnableHealthCheck(serviceOptions, smtpOptions);

Response Format

Health check endpoints return detailed JSON responses. The top-level "status" is "Healthy", "Degraded", or "Unhealthy". A sample response from the /health endpoint:

health-response.json
{
  "status": "Healthy",
  "timestamp": "2025-10-23T06:39:09.1509714+00:00",
  "checks": {
    "smtp_server": {
      "status": "Healthy",
      "description": "SMTP server is healthy",
      "data": {
        "status": "running",
        "uptime": "0d 0h 0m 11s",
        "endpoint": "0.0.0.0:25",
        "configuration": {
          "serverName": "Zetian SMTP Server",
          "maxConnections": 100,
          "maxMessageSize": 10485760,
          "requireAuthentication": false,
          "requireSecureConnection": false
        },
        "activeSessions": 0,
        "maxSessions": 100,
        "utilizationPercent": 0,
        "memory": {
          "workingSet": 49422336,
          "privateMemory": 16596992,
          "virtualMemory": 2237691518976,
          "gcTotalMemory": 2494752
        }
      }
    },
    "database": {
      "status": "Healthy",
      "description": "Database is responsive",
      "duration": "00:00:00.0234567"
    },
    "redis": {
      "status": "Degraded",
      "description": "Redis unavailable",
      "duration": "00:00:00.1000000",
      "exception": "Connection timeout"
    }
  }
}
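
A monitoring client usually needs only the checks that are not healthy. The sketch below parses a trimmed copy of the sample response with System.Text.Json and lists those checks. FindFailingChecks is a hypothetical helper, and the field names ("checks", "status", "description") are taken from the sample above, not from a published schema. The raw string literal requires C# 11 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Trimmed copy of the sample response; a real client would read this from GET /health.
const string json = """
{
  "status": "Healthy",
  "checks": {
    "smtp_server": { "status": "Healthy", "description": "SMTP server is healthy" },
    "database": { "status": "Healthy", "description": "Database is responsive" },
    "redis": { "status": "Degraded", "description": "Redis unavailable" }
  }
}
""";

// Hypothetical helper: returns one line per check whose status is not "Healthy".
static List<string> FindFailingChecks(string payload)
{
    var failing = new List<string>();
    using var doc = JsonDocument.Parse(payload);

    // Tolerate a response without a "checks" object.
    if (!doc.RootElement.TryGetProperty("checks", out var checks) ||
        checks.ValueKind != JsonValueKind.Object)
    {
        return failing;
    }

    foreach (var check in checks.EnumerateObject())
    {
        var status = check.Value.TryGetProperty("status", out var s) ? s.GetString() : null;
        if (!string.Equals(status, "Healthy", StringComparison.OrdinalIgnoreCase))
        {
            var description = check.Value.TryGetProperty("description", out var d) ? d.GetString() : "";
            failing.Add($"{check.Name}: {status} ({description})");
        }
    }
    return failing;
}

foreach (var line in FindFailingChecks(json))
{
    Console.WriteLine(line); // prints "redis: Degraded (Redis unavailable)"
}
```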

HTTP Status Codes

  • 200 OK - Healthy status
  • 206 Partial Content - Degraded status (configurable through DegradedStatusCode)
  • 503 Service Unavailable - Unhealthy status
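
A client that only sees the status code can map it back to a health state. This is a minimal sketch of that mapping for the three codes listed above; ToHealthState is a hypothetical helper. If the service is configured with DegradedStatusCode = 200, the status code alone cannot distinguish Healthy from Degraded, and the "status" field of the response body has to be read instead.

```csharp
using System;

// Hypothetical helper: maps the documented status codes to a health state name.
static string ToHealthState(int statusCode) => statusCode switch
{
    200 => "Healthy",
    206 => "Degraded",
    503 => "Unhealthy",
    _ => "Unknown" // any other code is not part of the documented contract
};

Console.WriteLine(ToHealthState(206)); // prints "Degraded"
```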

Kubernetes Integration

Use health checks with Kubernetes liveness and readiness probes:

kubernetes-deployment.yaml
# Kubernetes deployment with health checks
apiVersion: apps/v1
kind: Deployment
metadata:
  name: smtp-server
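spec:
  replicas: 1
  # apps/v1 Deployments require a selector that matches the pod labels;
  # the label name here is illustrative
  selector:
    matchLabels:
      app: smtp-server
  template:
    metadata:
      labels:
        app: smtp-server
    spec: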
      containers:
      - name: smtp-server
        image: myregistry/smtp-server:latest
        ports:
        - containerPort: 25  # SMTP
        - containerPort: 8080 # Health check
        
        # Liveness probe - restart if unhealthy
        livenessProbe:
          httpGet:
            path: /health/livez
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        
        # Readiness probe - remove from load balancer if not ready
        readinessProbe:
          httpGet:
            path: /health/readyz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 2

Best Practices

1. Separate Liveness and Readiness

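Use /health/livez to check whether the process is alive; a failing liveness probe restarts the container. Use /health/readyz to check whether the server is ready for traffic; a failing readiness probe removes the pod from the load balancer without restarting it.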

2. Set Appropriate Timeouts

Configure reasonable timeouts for health checks. A timeout that is too short causes false alarms during brief slowdowns; one that is too long delays detection of real failures.
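
One way to bound the time spent in a single check is to give it its own cancellation budget. The sketch below is generic .NET code and does not use the Zetian API: RunWithTimeoutAsync is a hypothetical helper, and the string results stand in for whatever a custom check returns (for example HealthCheckResult.Degraded on timeout).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs a check with its own time budget.
// Returns "timeout" when the budget is exceeded, otherwise the check's own result.
static async Task<string> RunWithTimeoutAsync(
    Func<CancellationToken, Task<string>> check,
    TimeSpan timeout,
    CancellationToken outer = default)
{
    // Link to the caller's token so an outer cancellation still propagates.
    using var cts = CancellationTokenSource.CreateLinkedTokenSource(outer);
    cts.CancelAfter(timeout);
    try
    {
        return await check(cts.Token);
    }
    catch (OperationCanceledException) when (!outer.IsCancellationRequested)
    {
        // Only the time budget expired; the caller did not cancel.
        return "timeout";
    }
}

// Usage: a check that needs 2 seconds, run against a 200 ms budget.
var slow = await RunWithTimeoutAsync(async ct =>
{
    await Task.Delay(TimeSpan.FromSeconds(2), ct);
    return "healthy";
}, TimeSpan.FromMilliseconds(200));

Console.WriteLine(slow); // prints "timeout"
```

The check must pass the token to its own I/O calls, as the database example does with OpenAsync(ct); a check that ignores the token cannot be interrupted this way.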

3. Use Degraded Status

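Return a degraded status for non-critical components. This signals a problem without triggering restarts or removing the instance from the load balancer: Kubernetes HTTP probes treat any status code from 200 to 399 as success, so a 200 or 206 degraded response keeps the pod in service.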

4. Security Considerations

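Set DetailedErrors = false in production so that responses do not expose sensitive information. Consider adding authentication to the health endpoints, or binding them to an internal interface only, if they are reachable from outside the cluster.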