Atlas Device Synchronisation Issues

Incident Report for Person Centred Software

Postmortem

Atlas Archive Process Incident on 10th November 2025

Summary  

 On 10th November, a service disruption occurred when a scheduled maintenance task was run during business hours. The maintenance process required the system to safely restore data integrity, which took longer than anticipated. Service was restored in stages, with initial recovery at 2:30 PM and full service restoration by 3:20 PM. 

 Root Cause  

 The incident occurred due to a maintenance scheduling issue with our infrastructure partner. 

Contributing factors: 

  • Scheduling Error: A background maintenance task that is normally run outside core business hours was initiated during business hours as part of troubleshooting activities. 
  • Extended Recovery Time: The nature of the maintenance process required additional time for the system to safely complete data operations 
  • Process Documentation: Our partner's procedures did not have sufficient guidance on appropriate scheduling for this type of maintenance 

Timeline of Events  

Resolution 

Service was restored through the following steps: 

  • System Recovery: The maintenance process completed successfully, and systems recovered automatically 
  • Phased Restoration: Services were restored in stages to ensure stability 
  • Extended Monitoring: Additional monitoring was conducted to confirm full service stability 
  • Verification: All systems were verified as fully operational before closing the incident 

 

Customer Communication 

During the Incident: 

  • Incident began at 1:10 PM on 10th November 
  • First customer reports received at 1:18 PM 
  • Service was fully restored by 3:20 PM 
  • Incident closed at 4:45 PM after stability monitoring 
  • Total service disruption: approximately 2 hours 10 minutes 
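
As a quick check of the figures above, the disruption window can be recomputed from the stated times. This is only an illustrative sketch: the variable names are ours, and the times are taken directly from this report.

```python
from datetime import datetime

# Times on 10th November 2025, as stated in this report (24-hour clock).
incident_start = datetime(2025, 11, 10, 13, 10)  # 1:10 PM, incident began
first_report = datetime(2025, 11, 10, 13, 18)    # 1:18 PM, first customer reports
full_restore = datetime(2025, 11, 10, 15, 20)    # 3:20 PM, service fully restored

# Subtracting two datetimes yields a timedelta.
disruption = full_restore - incident_start
time_to_first_report = first_report - incident_start

print(disruption)            # 2:10:00
print(time_to_first_report)  # 0:08:00
```

The result matches the reported total of approximately 2 hours 10 minutes.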

Preventative Measures and Next Steps 

Completed Actions: 

  • Partner Engagement: Conducted a comprehensive review with our infrastructure partner during scheduled service review 
  • Updated Procedures: Our partner has updated their operational procedures to ensure background tasks are scheduled at appropriate times.  
  • Enhanced Communication: Implemented improved communication protocols requiring advance confirmation of scheduling for all maintenance activities 
  • Verification Process: Our partner now confirms scheduling details before initiating any maintenance processes 
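
The scheduling check described in these actions could be sketched roughly as follows. This is a hypothetical illustration, not our partner's actual tooling: the function names are invented, and the business-hours window (08:00 to 18:00, Monday to Friday) is an assumption, since this report does not define one.

```python
from datetime import datetime, time

# Assumption: business hours are 08:00-18:00, Monday to Friday.
# The report does not state the actual window.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def in_business_hours(moment: datetime) -> bool:
    """Return True if the moment falls inside the assumed business hours."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and BUSINESS_START <= moment.time() < BUSINESS_END


def may_start_maintenance(moment: datetime, confirmed: bool) -> bool:
    """Allow a task only if its schedule was confirmed in advance
    and it falls outside the assumed business hours."""
    return confirmed and not in_business_hours(moment)


# A task at 1:10 PM on Monday 10th November 2025 would be refused,
# even with confirmation; the same task at 11:00 PM would be allowed.
print(may_start_maintenance(datetime(2025, 11, 10, 13, 10), confirmed=True))
print(may_start_maintenance(datetime(2025, 11, 10, 23, 0), confirmed=True))
```

Under these assumptions, the first call prints False and the second prints True.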

Ongoing Actions: 

  • Continued Oversight: Regular reviews of partner procedures and scheduling practices 
  • Documentation Review: Ensuring all maintenance procedures have clear scheduling requirements 
  • Process Monitoring: Tracking adherence to updated procedures 

 

We have worked closely with our infrastructure partner to ensure this type of incident does not happen again. Updated procedures and enhanced communication protocols are now in place to prevent maintenance activities from impacting service during business hours. We continue to monitor our partner's adherence to these improved processes. 

We apologise for the inconvenience this incident caused and appreciate your patience as we worked to restore service.

Posted Nov 25, 2025 - 17:08 GMT

Resolved

This incident has been resolved.
Posted Nov 10, 2025 - 16:22 GMT

Update

We are continuing to monitor for any further issues.
Posted Nov 10, 2025 - 15:38 GMT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Nov 10, 2025 - 15:38 GMT

Identified

The issue has been identified and a fix is being implemented.
Posted Nov 10, 2025 - 14:46 GMT

Update

The ongoing issue is having a detrimental effect on the performance of the CAPA Pharmacy system. PCS will provide an update as soon as the issue has been identified and a resolution is available.
Posted Nov 10, 2025 - 14:18 GMT

Update

Sync issues are ongoing and are currently affecting both Sync Pod servers intermittently. We will provide an update as soon as the cause is identified and a fix is available.
Posted Nov 10, 2025 - 14:14 GMT

Investigating

We are aware that some users are experiencing Atlas Device Synchronisation issues on Sync Pod A. If you are experiencing these problems, please attempt to change to Sync Pod B.
Posted Nov 10, 2025 - 13:35 GMT
This incident affected: eMar (CAPA, Atlas Sync).