A global enterprise was generating false SLA reports due to time-sync drift across 400+ IIS servers. Applicare's unified observability eliminated reporting errors and gave IT leadership a single source of truth.
The enterprise IT team was confident it was meeting its 99.9% server availability SLA — until a quarterly audit exposed a pattern of server events appearing out of chronological order in reports. Services that went down at 14:32 were being logged at 14:28. Servers that had recovered were showing as still offline.
The root cause was NTP drift across 400+ Windows IIS servers. With each server keeping slightly different time, the central monitoring platform was stitching together an incoherent timeline. SLA compliance reports were useless — and the team didn't know whether their infrastructure was actually performing to target.
The team had been reporting 99.9% availability to leadership for 18 months. After correcting for drift, actual measured availability was 99.4% — a meaningful gap that had been completely invisible.
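A 0.5-point gap sounds small until it is converted into a downtime budget. A quick back-of-the-envelope calculation (annualised, assuming a 365-day year) shows the scale of what was hidden:

```python
HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours(availability_pct: float) -> float:
    """Annual downtime implied by a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

reported = downtime_hours(99.9)  # what leadership believed
actual = downtime_hours(99.4)    # what the servers actually delivered

print(f"Reported budget: {reported:.1f} h/yr")   # ~8.8 h/yr
print(f"Actual downtime: {actual:.1f} h/yr")     # ~52.6 h/yr
print(f"Hidden gap:      {actual - reported:.1f} h/yr")
```

In other words, roughly 44 hours a year of downtime was invisible to leadership, purely because the measurement clock could not be trusted.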
Applicare deployed its entity graph across the entire IIS estate, modelling each server, application pool, and web application as a distinct entity with relationships to the underlying Windows infrastructure. The critical addition: Applicare normalises all telemetry to a single authoritative time reference at ingestion — so drift at the source never contaminates the reporting layer.
Every event, metric, and log line is timestamped at ingest rather than trusting the originating server's clock. The entity graph correlates events across servers using ingest-normalised timestamps, producing an accurate causal timeline regardless of NTP drift.
Once the timeline was trustworthy, the full power of unified observability became available. Application pool crashes, IIS worker process failures, and dependency outages could be correlated across servers in the correct temporal order. Patterns that had been invisible — one IIS server's crash triggering dependent application failures 30 seconds later — became clearly visible.
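A cascade like that can be surfaced with a simple window join over the normalised timeline. The sketch below (hypothetical event shapes, not Applicare's internal model) pairs each crash with dependent failures that follow within 30 seconds:

```python
from datetime import datetime, timedelta

def find_cascades(events, window_s=30):
    """Pair each crash with dependency failures that occur within
    window_s seconds after it on the ingest-normalised timeline."""
    events = sorted(events, key=lambda e: e["ingest_ts"])
    window = timedelta(seconds=window_s)
    cascades = []
    for i, cause in enumerate(events):
        if cause["type"] != "crash":
            continue
        for effect in events[i + 1:]:
            if effect["ingest_ts"] - cause["ingest_ts"] > window:
                break  # timeline is sorted, so nothing later can match
            if effect["type"] == "dependency_failure":
                cascades.append((cause["server"], effect["server"]))
    return cascades

t0 = datetime(2024, 1, 1, 14, 32, 0)
events = [
    {"server": "iis-07", "type": "crash", "ingest_ts": t0},
    {"server": "app-12", "type": "dependency_failure",
     "ingest_ts": t0 + timedelta(seconds=28)},
    {"server": "app-19", "type": "dependency_failure",
     "ingest_ts": t0 + timedelta(seconds=95)},  # outside the window
]
print(find_cascades(events))  # [('iis-07', 'app-12')]
```

With drifted source clocks, the 28-second effect could easily appear to precede its cause, which is exactly why this correlation was impossible before normalisation.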
"We had no idea our SLA reports were wrong. Applicare gave us accurate timestamps on day one, and then showed us everything else we were missing about how our IIS estate actually behaves."
Within the first week, the team identified three recurring availability events that had been masked by the drift. Two were fixed immediately with configuration changes, which IntelliTune then automated going forward. The third revealed a capacity issue that would have caused a major outage within 60 days.