Throttling

Throttling happens when a service (usually an API) becomes overworked and intentionally limits the number of requests it processes. Microsoft uses throttling to safeguard its Microsoft 365 cloud services from excessive usage, whether due to high customer demand or malicious attacks such as denial-of-service attempts.

When throttling occurs, the API returns an error response (usually a 429) rather than fulfilling the request. The error response instructs the app or user to wait and includes metadata (typically a Retry-After header) suggesting a wait time. The wait typically starts small, lasting just a few seconds, which makes it barely noticeable. However, if the app or user continues making excessive requests, the API enforces more severe throttling. More requests are rejected and wait times grow longer, which leads to a gradual decline in performance.
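As a rough illustration of how a client might honor that response, the sketch below retries a throttled request after the suggested wait. It is not CDM's actual implementation. The names send, retry_delay and call_with_retry are hypothetical, and send is assumed to be any function that performs one request and returns a (status, headers, body) tuple. When no usable Retry-After value is present, the sketch falls back to a simple exponential backoff, which is an assumption on our part rather than documented Microsoft behavior.

```python
import time

# Status codes treated as "throttled" in this sketch.
THROTTLE_STATUSES = {429, 503}


def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Return the number of seconds to wait before the next attempt."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # Not a plain number of seconds; use the fallback below.
    # Assumed fallback: exponential backoff capped at `cap` seconds.
    return min(cap, base * 2 ** attempt)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it is not throttled or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in THROTTLE_STATUSES:
            return status, headers, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    # Still throttled after the final attempt: hand back the last response.
    return status, headers, body
```

Waiting for the suggested time, rather than retrying immediately, matters because continued excessive requests are what trigger the more severe throttling described above.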

Throttling doesn’t just affect app integrations like CDM; it can also affect users accessing Microsoft 365 services through a web browser. For instance, a user navigating SharePoint while throttling is in effect may be redirected to a 503 error page instead of the page they requested.

Throttling can occur at several different levels. It can affect a single user with extremely high activity, an app integration (such as CDM), a specific API endpoint, an entire tenancy, or, in rare cases, multiple tenancies at the Microsoft server level.

There is also no universal threshold for throttling because different Microsoft API endpoints have different limits. Certain functions are designed for very high levels of sustained performance, while others are designed for less frequent checks.
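Because limits differ from one endpoint to the next, one possible client-side approach is to track the suggested wait separately for each endpoint, so that a throttled endpoint does not hold up calls to the others. The sketch below is only an illustration under that assumption. EndpointGate and the endpoint labels "sites" and "users" are hypothetical names, not part of any Microsoft API or of CDM.

```python
import time


class EndpointGate:
    """Remember, per endpoint, the earliest time a new request should be sent."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._blocked_until = {}

    def record_throttle(self, endpoint, retry_after):
        """Note that `endpoint` asked us to wait `retry_after` seconds."""
        until = self._clock() + retry_after
        previous = self._blocked_until.get(endpoint, 0.0)
        # Keep the later deadline if the endpoint was already blocked.
        self._blocked_until[endpoint] = max(previous, until)

    def wait_time(self, endpoint):
        """Seconds left before `endpoint` may be called again (0.0 if none)."""
        remaining = self._blocked_until.get(endpoint, 0.0) - self._clock()
        return max(0.0, remaining)


gate = EndpointGate()
gate.record_throttle("sites", 10)   # hypothetical endpoint label
print(gate.wait_time("users"))      # unaffected endpoints need no wait
```

A fuller version would need similar bookkeeping for the other levels at which throttling can apply, such as the user or the tenancy, which this sketch leaves out.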

Throttling might seem like an obscure topic to cover in such detail in our knowledge base, but managing API throttling effectively is one of the biggest challenges for Microsoft 365 integration partners. Handling throttling in a distributed system like CDM is significantly more complex than in a Software as a Service (SaaS) application with a single point of interaction, such as an HR system.