Throttling

In an API context, throttling refers to restricting the number of API requests a user or app can make within a certain period. It helps prevent server overload, ensures all users get fair access to the API and enhances security against potential cyber attacks, such as denial-of-service (DoS) attacks.
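As an illustration of restricting requests per user within a period, here is a minimal sketch of a fixed-window limiter. The class name, method name, and parameters are hypothetical and do not describe how Microsoft 365 or any particular API implements throttling.

```python
import time


class FixedWindowLimiter:
    """Allow each client at most `limit` requests per window (hypothetical sketch)."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._state = {}  # client_id -> (window index, requests counted in it)

    def status_for(self, client_id: str) -> int:
        """Return the HTTP status the API would answer with for this request."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._state.get(client_id, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so the count resets
        if count >= self.limit:
            self._state[client_id] = (window, count)
            return 429  # Too Many Requests
        self._state[client_id] = (window, count + 1)
        return 200
```

Each client is counted separately, so one busy user exhausts only their own allowance and does not use up the allowance of others.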

When an API throttles a request, it does not fulfill it. Instead, the API returns either a 429 (Too Many Requests) or a 503 (Service Unavailable) error response to indicate that the service can’t handle the request, typically because it is becoming overloaded.

This state is usually temporary, and the user or app sending the request simply needs to wait. The error response metadata usually specifies how long to wait. However, throttling becomes more severe if the user or app ignores the error response and continues to inundate the API with requests.
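A well-behaved client can be sketched as follows. This is an illustrative example only: the function name, the response tuple, and the default values are assumptions made for the sketch, not part of any real API.

```python
import time
from typing import Callable, Optional, Tuple

# A response is modelled as (status code, wait in seconds or None, body).
Response = Tuple[int, Optional[float], str]

THROTTLE_STATUSES = {429, 503}


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    default_wait: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until the response is not throttled or attempts run out."""
    response = send()
    for attempt in range(1, max_attempts):
        status, wait, _ = response
        if status not in THROTTLE_STATUSES:
            return response
        # Prefer the wait the server asked for; otherwise double the wait each time.
        sleep(wait if wait is not None else default_wait * 2 ** (attempt - 1))
        response = send()
    return response  # still throttled after max_attempts sends
```

The key point is that the client pauses before every retry instead of resending immediately, which is the behavior that avoids escalating the throttling.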

For example, when an API applies light throttling, it may return a 503 or 429 error response to 1 in 20 requests. If the throttle warnings are not addressed, the API increases its throttling rate, resulting in 1 in 5, 1 in 2, or even 4 in 5 rejected requests. When this happens, the performance of the API begins to degrade.
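To put these ratios in perspective, the short calculation below estimates how many requests a client ends up sending per fulfilled request if it simply resends every rejected one. It assumes, purely for illustration, that each request is rejected independently at the stated rate. The function name is hypothetical.

```python
from fractions import Fraction


def attempts_per_success(rejected: int, out_of: int) -> Fraction:
    """Average requests sent per fulfilled request when rejections are resent."""
    reject_rate = Fraction(rejected, out_of)
    # With independent rejections, the expected number of tries is 1 / (1 - p).
    return 1 / (1 - reject_rate)


for rejected, out_of in [(1, 20), (1, 5), (1, 2), (4, 5)]:
    average = float(attempts_per_success(rejected, out_of))
    print(f"{rejected} in {out_of} rejected: {average:.2f} requests per success")
```

Under this assumption, a client that keeps resending at the 4 in 5 level generates five requests for every one that succeeds, which adds to the load that caused the throttling in the first place.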

Throttling can also impact users who access online services via web browsers. In such a scenario, when users browse to a new web page, a throttle response redirects them to a 503 error page instead.

Cloud Drive Mapper (CDM) vs. Microsoft 365 throttling

Throttling is a challenging problem, partly because little data is available about it. This makes it difficult to know when throttling occurs, and whether CDM is causing it or is merely being affected by throttling caused by something else, such as a different product.

Throttling can occur at several different levels. It can impact an individual user who is generating a lot of requests or an entire Microsoft 365 tenancy. It can even happen to an entire Microsoft 365 server array, causing multiple customers to be affected simultaneously, even though most may not be at fault. At IAM Cloud, we experienced a rise in throttling several times during the COVID-19 pandemic-related lockdowns. Microsoft cloud services were stretched to the limit as millions of additional users collaborated online remotely.

Over the years, we have made many changes to CDM Legacy (aka v2.x) to minimize the likelihood of it causing throttling. However, the issue cannot be eliminated entirely because it can also be user-driven. For example, if someone scripts a migration or backup process to run over a CDM-mapped drive, this might cause a level of throttling beyond our control. Even some desktop applications can cause throttling. We have noticed that certain desktop applications employ an extremely inefficient auto-save behavior that sends save requests to the drive every few seconds. This is not an issue when working with a local C: drive. However, it is a suboptimal pattern when working with the cloud.

The foremost problem we’ve noticed is that CDM Legacy (aka v2.x) is especially vulnerable to throttling because it uses WebDAV. WebDAV doesn’t have a retry or stand-off process when encountering a throttling API, which tends to break CDM Legacy (aka v2.x) temporarily in several ways. The effects of this breakage on users vary depending on what they’re doing at the time. For instance, if CDM is hit by throttling while a user is browsing a folder, it may show 0 items within the folder, even if it contains files in the cloud. Another example could entail a user saving a file and the save operation failing, resulting in a drive error.

CDM (V3) uses an entirely different software architecture to map drives and connect to Microsoft 365. While Microsoft 365 can still throttle CDM for many reasons outside of CDM’s control, the latest version of CDM has built-in resilience that handles throttling far more effectively than CDM Legacy (aka v2.x).

The main differences between the two concerning throttling include the following:

CDM Legacy (aka v2.x): WebDAV is end-to-end, meaning that throttling can disconnect or unmap the drive and halt drive actions.
CDM (V3): The architecture includes a fully independent file system, allowing it to function without interruption even when it momentarily can’t get a response from the cloud.

CDM Legacy (aka v2.x): There is no retry logic, meaning that an action fails if its request fails.
CDM (V3): The architecture includes comprehensive retry logic. When a request fails, CDM waits and tries again until the request succeeds, ensuring that the action never fails.

CDM Legacy (aka v2.x): WebDAV has no stand-off capabilities. When Microsoft 365 sends out throttle warnings to signal upcoming throttling, CDM Legacy can’t process them and keeps sending requests as before, which likely makes the throttling worse.
CDM (V3): The architecture includes full stand-off capabilities. When Microsoft 365 sends out throttle warnings, CDM acts accordingly and slightly reduces the traffic it sends to Microsoft 365 until the throttling has subsided.

These are all fundamental ways in which the CDM (V3) architecture is substantially better than CDM Legacy (aka v2.x) at reducing the likelihood of throttling and dealing with it when it happens.
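The stand-off idea can be sketched as a pacer that widens the gap between requests when throttle responses arrive and narrows it again as requests succeed. This is not CDM’s actual algorithm. The class name, the doubling factor, and the default values are assumptions chosen only to illustrate the principle.

```python
class StandOffPacer:
    """Tracks the delay to leave between requests (illustrative sketch only)."""

    def __init__(self, min_delay: float = 0.5, max_delay: float = 64.0,
                 backoff_factor: float = 2.0, recovery_step: float = 0.5):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.backoff_factor = backoff_factor
        self.recovery_step = recovery_step
        self.delay = min_delay

    def record(self, status: int) -> float:
        """Update the delay from the latest response status and return it."""
        if status in (429, 503):
            # Throttle signal: send less traffic by widening the gap, up to a cap.
            self.delay = min(self.max_delay, self.delay * self.backoff_factor)
        else:
            # Success: recover gradually rather than jumping back to full speed.
            self.delay = max(self.min_delay, self.delay - self.recovery_step)
        return self.delay
```

A caller would sleep for the returned delay before its next request. Backing off quickly and recovering slowly is a common design choice because it gives the service time to leave the throttled state.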

Additionally, CDM (V3) has a feature on its roadmap to mitigate the issues caused by overly aggressive local desktop apps that auto-save every few seconds. This feature will allow CDM to detect the behavior and buffer the saves. Although the local desktop app might be auto-saving every few seconds, CDM (V3) will package these updates and upload them to the cloud at a longer interval, for example every 10 minutes. The upload frequency will be configurable based on your needs.
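Since this feature is still on the roadmap, the following is only a conceptual sketch of how such buffering could work, not a description of the planned implementation. The class, its methods, and the `upload` callback are hypothetical.

```python
import time
from typing import Callable, Dict


class SaveBuffer:
    """Keep only the latest content per file and upload at most once per interval."""

    def __init__(self, upload: Callable[[str, bytes], None],
                 interval_seconds: float = 600.0, clock=time.monotonic):
        self.upload = upload
        self.interval = interval_seconds
        self.clock = clock
        self._pending: Dict[str, bytes] = {}
        self._last_flush = clock()

    def save(self, path: str, content: bytes) -> None:
        # A newer save replaces any older pending save for the same file.
        self._pending[path] = content
        # Simplification: the interval is only checked when a save arrives.
        # A real implementation would also need a timer and a flush on close.
        if self.clock() - self._last_flush >= self.interval:
            self.flush()

    def flush(self) -> None:
        """Upload the latest version of every pending file, then start a new interval."""
        for path, content in self._pending.items():
            self.upload(path, content)
        self._pending.clear()
        self._last_flush = self.clock()
```

With a 10-minute interval, an app that auto-saves every few seconds would produce one upload per file per interval instead of hundreds of requests.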

IAM Cloud includes a User-Agent string that allows Microsoft to know who we are:

ISV|IAMCloud|{AzureAppId}/{CDMVersion}
Example: ISV|IAMCloud|237550e6-d0cd-4148-ab23-9ebd2b2da11d/3.16.0
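For illustration, a string in this format could be assembled as shown below. The function name is hypothetical, and the values are the ones from the example above.

```python
def build_user_agent(azure_app_id: str, cdm_version: str) -> str:
    """Format the User-Agent value as ISV|IAMCloud|{AzureAppId}/{CDMVersion}."""
    return f"ISV|IAMCloud|{azure_app_id}/{cdm_version}"


# The header as it would be attached to an outgoing request.
headers = {
    "User-Agent": build_user_agent("237550e6-d0cd-4148-ab23-9ebd2b2da11d", "3.16.0"),
}
```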

CDM treats a 429 or a 503 response as a throttled response. These responses usually include a Retry-After header, which specifies how long Microsoft would like CDM to wait before sending the request again.
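The sketch below shows one way a client could classify a throttled response and read the Retry-After header. It is not CDM’s code, and the function names are hypothetical. It relies on the HTTP rule that Retry-After may carry either a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def is_throttled(status: int) -> bool:
    """Treat 429 and 503 as throttled responses."""
    return status in (429, 503)


def retry_after_seconds(value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds from a Retry-After value, or None if unusable."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "120"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means there is no need to wait any longer.
    return max(0.0, (when - now).total_seconds())
```

Returning None when the header is missing or unreadable lets the caller fall back to its own wait time instead of retrying immediately.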

For more information about throttling in Microsoft 365, please refer to Learn.microsoft.com