Sentinel: Distributed System Monitor in Go
Problem: Monitoring a fleet of remote servers often involves a trade-off between real-time visibility and system overhead. Traditional polling methods can become "chatty", leading to increased latency and inconsistent data synchronization when scaling across multiple nodes.
Action: I developed Sentinel, a high-performance monitoring tool that uses gRPC client-side streaming to maintain persistent, low-overhead connections to its agents. The server handles the simultaneous data streams and the centralized dashboard concurrently in goroutines, so it remains non-blocking even under high load from multiple agents.
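As a rough illustration of that concurrency pattern, the sketch below runs one goroutine per incoming stream and writes into a mutex-protected store that a dashboard could read from. It is not Sentinel's actual code: plain Go channels stand in for the gRPC client-side streams so the example stays dependency-free, and the names Metric, Store, Consume, and Collect are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Metric is a hypothetical sample reported by one agent.
type Metric struct {
	AgentID string
	CPU     float64 // percent
	RAM     float64 // percent
}

// Store keeps the latest metric per agent. The mutex makes it safe for
// concurrent writers (stream handlers) and readers (the dashboard).
type Store struct {
	mu     sync.RWMutex
	latest map[string]Metric
}

func NewStore() *Store {
	return &Store{latest: make(map[string]Metric)}
}

// Update records m as the most recent sample for its agent.
func (s *Store) Update(m Metric) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.latest[m.AgentID] = m
}

// Snapshot returns a copy sorted by agent ID, so callers never touch the shared map.
func (s *Store) Snapshot() []Metric {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]Metric, 0, len(s.latest))
	for _, m := range s.latest {
		out = append(out, m)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].AgentID < out[j].AgentID })
	return out
}

// Consume drains one agent's stream into the store and returns when the stream closes.
// A channel stands in here for a gRPC client-side stream (an assumption of this sketch).
func Consume(stream <-chan Metric, s *Store) {
	for m := range stream {
		s.Update(m)
	}
}

// Collect runs one goroutine per stream and waits for all of them to finish.
func Collect(streams []<-chan Metric, s *Store) {
	var wg sync.WaitGroup
	for _, st := range streams {
		wg.Add(1)
		go func(st <-chan Metric) {
			defer wg.Done()
			Consume(st, s)
		}(st)
	}
	wg.Wait()
}

func main() {
	store := NewStore()
	streams := make([]<-chan Metric, 0, 5)
	// Simulate five agents, each sending three samples and then closing its stream.
	for i := 1; i <= 5; i++ {
		ch := make(chan Metric, 3)
		id := fmt.Sprintf("agent-%d", i)
		for j := 1; j <= 3; j++ {
			ch <- Metric{AgentID: id, CPU: float64(10 * j), RAM: float64(5 * i)}
		}
		close(ch)
		streams = append(streams, ch)
	}
	Collect(streams, store)
	for _, m := range store.Snapshot() {
		fmt.Printf("%s cpu=%.0f%% ram=%.0f%%\n", m.AgentID, m.CPU, m.RAM)
	}
}
```

Because every handler only touches shared state through Update and Snapshot, a slow stream delays only its own goroutine, and the same code can be run under the race detector to check for unsynchronized access.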
Result: Sentinel is a scalable, single-binary solution for real-time infrastructure observability, featuring:
- Efficient Streaming: Replaces traditional REST polling with long-lived gRPC streams, avoiding the per-request header overhead of repeated HTTP calls and delivering sub-second metric updates from remote agents.
- Concurrent Architecture: Leverages Go’s concurrency primitives to manage a distributed fleet of agents and a web-based dashboard simultaneously on a single server instance.
- Robust Configuration: Provides a multi-layered configuration hierarchy (CLI flags, environment variables, YAML file) built on the Cobra/Viper ecosystem, so the same binary deploys in Docker or on bare metal.
- Modular Design: Designed with decoupled packages for metric collection, storage, and rendering, allowing for high testability and easy extension to new metric types.
- Production-Ready Testing: Reaches 80%+ average test coverage of core logic, with thread safety checked using the Go race detector to catch data races in high-concurrency scenarios.
- Containerized Orchestration: Includes a pre-configured Docker Compose environment that brings up a 5-node distributed demo with a single command.
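The layered configuration mentioned above can be pictured as a first-match lookup across sources. The sketch below uses only the standard library and is not Viper's API. It assumes the listed order (CLI, then Env, then YAML) is also the precedence order, which matches Viper's documented behavior of flags overriding environment variables overriding config files. The keys port, interval, log_level, and tls and the helpers Lookup, FromMap, and Resolve are hypothetical.

```go
package main

import (
	"fmt"
)

// Lookup is a hypothetical source of settings; ok is false when the key is unset.
type Lookup func(key string) (value string, ok bool)

// FromMap adapts a map (parsed flags, environment, or YAML values) to a Lookup.
func FromMap(m map[string]string) Lookup {
	return func(key string) (string, bool) {
		v, ok := m[key]
		return v, ok
	}
}

// Resolve returns the value from the first layer that defines key,
// falling back to def when no layer does. Earlier layers win.
func Resolve(key, def string, layers ...Lookup) string {
	for _, l := range layers {
		if v, ok := l(key); ok {
			return v
		}
	}
	return def
}

func main() {
	// Illustrative values only; a real setup would populate these from
	// parsed flags, os.Environ, and a YAML file.
	cli := FromMap(map[string]string{"port": "9090"})
	env := FromMap(map[string]string{"port": "8081", "interval": "2s"})
	file := FromMap(map[string]string{"port": "8080", "interval": "5s", "log_level": "info"})

	for _, key := range []string{"port", "interval", "log_level", "tls"} {
		fmt.Printf("%s=%s\n", key, Resolve(key, "unset", cli, env, file))
	}
}
```

With these sample values, port comes from the CLI layer, interval from the environment, log_level from the file, and tls falls back to the default, which is the behavior that lets one image run unchanged across environments.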
Project Gallery
Dashboard View: Five distributed agents reporting CPU/RAM usage to the central server.
Terminal Output: Sentinel server console output showing reporting agents and their resource usage.
Reliability: 80%+ average test coverage of core logic, with thread safety verified by the Go race detector.