Jul 01

Microsoft Office365 Exchange Online Service Performance Degradation and SMTP Problems

A few years ago, we migrated our email service to Microsoft’s Office365 cloud service. Overall, it’s been very reliable and eliminated the challenges we had hosting Exchange ourselves. It let us get to our email from Outlook installed on Windows, any internet browser, and smartphones. Office365 also offered other Office products online (Access Web Apps, Excel, Word, etc.), SharePoint, and OneDrive for Business.

Unfortunately, on the morning of June 30th, we discovered:

  • Delays sending and receiving emails
  • Some emails were bouncing back from recipients who couldn’t validate our Office365 Exchange Server’s SMTP (protection.outlook.com) with our domain name. That meant the Exchange SMTP server was no longer considered a trusted sender of emails from the @fmsinc.com domain.
  • Our use of the Office365 SMTP server to send emails with our Total Access Emailer product was also failing to authenticate against the server
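For readers who want to check whether SMTP authentication is working for their own account, here is a minimal Python sketch. It is not how Total Access Emailer connects; it only illustrates the kind of login that was failing. The function name check_smtp_login and the smtp_factory parameter are our own, and the host smtp.office365.com with port 587 and STARTTLS is an assumption based on the usual Office365 client submission settings, so substitute whatever your account actually uses.

```python
import smtplib


def check_smtp_login(host, port, user, password,
                     smtp_factory=smtplib.SMTP, timeout=15):
    """Try an authenticated SMTP login and return (ok, detail).

    smtp_factory is a hypothetical hook so the check can be exercised
    without a network connection; by default it is smtplib.SMTP.
    """
    try:
        with smtp_factory(host, port, timeout=timeout) as server:
            server.ehlo()
            server.starttls()  # upgrade to TLS before sending credentials
            server.ehlo()
            server.login(user, password)
        return True, "login accepted"
    except smtplib.SMTPAuthenticationError as exc:
        # The server was reachable but rejected the credentials.
        return False, f"authentication rejected: {exc.smtp_code}"
    except (smtplib.SMTPException, OSError) as exc:
        # Anything else: timeouts, refused connections, protocol errors.
        return False, f"connection or protocol error: {exc}"


# Example call (assumed host and port; replace the placeholder credentials):
# ok, detail = check_smtp_login("smtp.office365.com", 587,
#                               "user@example.com", "password")
# print(ok, detail)
```

Separating "authentication rejected" from other errors helps tell a credential problem apart from a service that is unreachable or degraded.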

The problems began the evening before. Needless to say, we aren’t happy about this experience, which impacted us and our clients using Office365. Reports indicate that it affected Office365 customers across North America.

When we contacted Microsoft, they confirmed problems with the health of their Office365 Exchange Server. Throughout the day, the problems lessened but persisted. We hope they are resolved soon and that we’ll understand what went wrong once we overcome the immediate crisis.

These are the reports we’ve received from Microsoft. We’ll keep you updated as we learn more:


Exchange Online Service Degraded

This is what the Office365 Admin portal shows for Service Health:

[Screenshots: Office365 Issues and Office365 Health]


EX71628 – E-Mail and calendar access – Restoring Service

Jun 29, 2016 12:11 PM

Current Status

Our investigation determined that an existing transport feature which is designed to expedite the delivery of email messages became degraded, which caused impact to email delivery for a subset of users. We’re bypassing the affected feature to restore service.

User Impact

Users may be unable to send email messages through the Exchange Online service. Email messages may appear to be stuck in the Drafts or Outbox folders.

Scope of Impact

A few customers have reported this issue, and our analysis indicates that for most customers, it’s unlikely that many users would report impact related to this event.

  • Start Time: Thursday, June 23, 2016, at 3:00 PM UTC

Preliminary Root Cause

An existing transport feature that is designed to expedite the delivery of email messages became degraded, which caused impact to email delivery for a subset of users.

EX71628 – E-Mail and calendar access – Extended recovery

Jun 30, 2016 2:18 PM

Current Status

We’ve developed an additional fix to address the underlying cause of the issue. We’re preparing to deploy the fix to the affected environment to ensure that the issue does not reoccur.

User Impact

Users may be unable to send email messages through the Exchange Online service. Email messages may appear to be stuck in the Drafts or Outbox folders.

Scope of Impact

A few customers have reported this issue, and our analysis indicates that for most customers, it’s unlikely that many users would report impact related to this event.

  • Start Time: Thursday, June 23, 2016, at 3:00 PM UTC

Preliminary Root Cause

An existing transport feature that is designed to expedite the delivery of email messages became degraded, which caused impact to email delivery for a subset of users.

Next Update by: Saturday, July 2, 2016, at 7:00 PM UTC


EX71674 – E-Mail timely delivery – Service restored

Jun 30, 2016 7:35 PM

Final Status

We’ve confirmed that the remaining message queues have now drained after implementing a configuration change to optimize message filtering.

User Impact

Users were experiencing delays when sending and receiving email messages. Affected users may have received Non-Delivery Reports (NDR) when sending email messages.

Scope of Impact

Customer reports indicated that many users likely experienced impact related to this event. Our analysis indicates that this issue may potentially have affected any of your users attempting to send or receive mail.

  • Start Time: Thursday, June 30, 2016, at 2:30 PM UTC
  • End Time: Thursday, June 30, 2016, at 11:30 PM UTC

Preliminary Root Cause

The infrastructure responsible for processing Exchange Online Protection (EOP) message filtering became degraded.

Next Steps

  • We’re analyzing performance data and trends on the affected systems to help prevent this problem from happening again.
  • We’re reviewing our code for optimizations and automated recovery options.
  • We’ll publish a post-incident report within five business days.

EX71674 – E-Mail timely delivery – Service restored

Jul 1, 2016 12:08 AM

Final Status

We’ve rolled out the fix and confirmed that service is restored. Any meeting requests created during the outage will need to have the conference room calendar removed and readded to book the room.

User Impact

Users that attempted to create a meeting request with a conference room calendar were unable to successfully book a conference room. This led to conference rooms being booked by multiple resources.

Scope of Impact

A few customers reported this issue, and our analysis indicated that this may have affected any users attempting to use this feature.

  • Start Time: Monday, June 27, 2016, at 6:00 PM UTC
  • End Time: Friday, July 1, 2016, at 2:54 AM UTC

Preliminary Root Cause

A recent update affected the ability for calendar invite requests to successfully book conference rooms.

Next Steps

  • We’re reviewing our deployment and provisioning procedures to help prevent this kind of problem in the future.
  • We’ll publish a post-incident report within five business days.