
It has emerged that the recent worldwide outage of YouTube, Gmail, and other Google services occurred because the computing resources needed to run the ‘User ID service’ were not properly allocated. Google explains that, while it was introducing a new storage resource allocation system, the usage of the User ID service was mistakenly set to 0. It is an almost absurd mistake: an input value of 0 paralyzed Google services around the world.
On the 23rd (local time), Google published a report on the recent global service disruption on the Google Cloud Status Dashboard site.
Earlier, for about an hour starting at 4 a.m. Pacific time on the 14th, users were unable to log in to most Google services, including YouTube, Gmail, Google Calendar, and Google Home. Services that require login, such as Gmail and Google Calendar, were unusable. Users who had set YouTube to log in automatically also had trouble accessing the service.
Immediately after the incident, Google gave a rough account of what had happened: “A problem in the automated storage allocation management system reduced the capacity of the central ID management system, and as a result many Google services that require login became inaccessible.”
According to the newly released incident report, the failure occurred during the migration of the User ID service's storage allocation system. The User ID service holds a unique identifier for every account and handles authentication credentials.
Google has been introducing a new automated storage allocation system for the User ID service. The new system was introduced last October, but parts of the existing system remained in place. The problem began when this legacy system incorrectly reported the usage of the User ID service as 0.
From October onward, usage was incorrectly reported as 0. The problem did not surface immediately because Google applies a grace period before quota restrictions take effect.
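The delayed effect described here can be illustrated with a minimal sketch. Everything in it is a hypothetical simplification: the `QuotaPolicy` fields, the shrink rule, and the numbers are assumptions chosen for illustration, not details from Google's report.

```python
from dataclasses import dataclass


@dataclass
class QuotaPolicy:
    """Hypothetical policy: hold quota during a grace period, then shrink it."""
    grace_days: int       # days during which the quota is left untouched
    shrink_factor: float  # fraction of the remaining gap removed per day afterwards


def quota_on_day(initial_quota: float, reported_usage: float,
                 day: int, policy: QuotaPolicy) -> float:
    """Return the quota in force on a given day (assumed shrink rule)."""
    if day <= policy.grace_days:
        return initial_quota  # grace period: nothing changes yet
    days_enforced = day - policy.grace_days
    gap = initial_quota - reported_usage
    # After the grace period the quota moves toward the reported usage.
    return reported_usage + gap * (1 - policy.shrink_factor) ** days_enforced


def can_write(actual_usage: float, quota: float) -> bool:
    """Writes succeed only while real usage fits inside the quota."""
    return actual_usage <= quota


policy = QuotaPolicy(grace_days=30, shrink_factor=0.5)
for day in (1, 30, 31):
    quota = quota_on_day(100.0, 0.0, day, policy)  # usage wrongly reported as 0
    print(day, quota, can_write(60.0, quota))      # real usage is 60
```

In this toy model the wrong report is harmless for the whole grace period and only starts blocking writes once enforcement begins, which is why such an error can stay hidden for weeks.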
Once the grace period ended, the storage quota for the User ID service began to shrink. The Paxos leader, which renews the validity period of account data within the User ID service, could no longer write. As a result, all account data stored in the database expired, and every user account authentication request was rejected.
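The chain from blocked writes to rejected logins can be sketched as follows. This is an illustrative model only: the `IdStore` class, its `ttl` parameter, and the `writes_allowed` flag are hypothetical names, and the real service's internals are not described in the article.

```python
class IdStore:
    """Toy account store whose records are valid only for a limited time."""

    def __init__(self, ttl: float):
        self.ttl = ttl                  # assumed validity period, in seconds
        self.writes_allowed = True      # becomes False once the quota is exhausted
        self.valid_until = {}           # account id -> expiry timestamp

    def refresh(self, account_id: str, now: float) -> bool:
        """Renew a record's validity; fails when writes are blocked."""
        if not self.writes_allowed:
            return False
        self.valid_until[account_id] = now + self.ttl
        return True

    def authenticate(self, account_id: str, now: float) -> bool:
        """Accept a request only if the record exists and has not expired."""
        expiry = self.valid_until.get(account_id)
        return expiry is not None and now < expiry


store = IdStore(ttl=10.0)
store.refresh("alice", now=0.0)
store.writes_allowed = False                # quota cut: renewals now fail
print(store.refresh("alice", now=8.0))      # False
print(store.authenticate("alice", now=9.0))   # True, old record still valid
print(store.authenticate("alice", now=11.0))  # False, record has expired
```

The sketch shows why the failure was total rather than partial: nothing is deleted, but once renewals stop, every record eventually passes its expiry time.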
Google acknowledged the gap in its system, saying, “We have safety checks to prevent unintended quota changes, but they did not cover the scenario of a single service reporting zero usage.”
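A safety check that covers the zero-usage case might look like the sketch below. The function name, the 20% reduction limit, and the messages are assumptions made for illustration; the article does not describe how Google's actual checks work.

```python
def validate_quota_change(current_quota: float, reported_usage: float,
                          proposed_quota: float,
                          max_reduction: float = 0.2) -> list:
    """Return the reasons to block a quota change; an empty list means it may proceed."""
    problems = []
    # A service that holds quota but reports no usage at all is more likely
    # a reporting error than a genuinely idle service.
    if reported_usage <= 0 and current_quota > 0:
        problems.append("reported usage is zero for a service that holds quota")
    # Limit how far a single change may cut the quota (assumed threshold).
    if proposed_quota < current_quota * (1 - max_reduction):
        problems.append("reduction exceeds the per-change limit")
    return problems


print(validate_quota_change(100.0, 0.0, 50.0))   # both checks fail
print(validate_quota_change(100.0, 80.0, 90.0))  # []
```

Returning a list of reasons instead of a single boolean lets an operator see every failed check at once.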
Google said the incident also affected various services on Google Cloud Platform, including Cloud Console, BigQuery, Cloud Storage, Cloud Networking, Kubernetes Engine, Google Workspace, and Cloud Support.
Google says it will fix the root cause and put measures in place to prevent this type of error. As part of this, the company plans to review its quota management automation so that changes are not rolled out worldwide too quickly, and to improve its monitoring and alerting systems so that misconfigurations can be detected sooner.
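The idea of not applying a change to the whole world at once can be sketched as a staged rollout. The region names, the `apply_change` and `is_healthy` callbacks, and the stop-on-first-failure rule are all hypothetical; the article does not say how Google will implement this.

```python
from typing import Callable, List, Sequence


def staged_rollout(regions: Sequence[str],
                   apply_change: Callable[[str], None],
                   is_healthy: Callable[[str], bool]) -> List[str]:
    """Apply a change one region at a time and stop at the first unhealthy region."""
    applied = []
    for region in regions:
        apply_change(region)
        applied.append(region)
        if not is_healthy(region):
            break  # halt before the change reaches the remaining regions
    return applied


# Hypothetical example: the change causes trouble in the second region.
touched = staged_rollout(
    ["region-a", "region-b", "region-c"],
    apply_change=lambda region: None,
    is_healthy=lambda region: region != "region-b",
)
print(touched)  # ['region-a', 'region-b']
```

A faulty change still reaches the region where the problem is detected, but the regions after it are left untouched, which limits the scope of an outage.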