What happened to Vivaldi Social?
https://thomasp.vivaldi.net/?p=918
On Saturday 8 July 2023, user accounts started disappearing from the Vivaldi Social Mastodon instance. What was going on, how did this happen, and what were the consequences?
This is a very long blog post, but to be fair, this was also to be a very long weekend.
Something’s not right
It was around 17:25 Oslo time (CEST) on the Saturday that I first noticed something was wrong. I’d just got home from a bike ride, and when I happened to check my Vivaldi Social tab, it suddenly asked me to log in again. “Unusual”, I thought, but I wasn’t immediately alarmed by it. But then when I did log in, I saw that my home timeline was now completely empty. I quickly reached out to my colleagues.
doing anything with mastodon? my home timeline is suddenly empty
Me, to my fellow sysadmins – Saturday 8 July 17:26 CEST
My fellow sysadmin Hlini very quickly got back to me. No work was ongoing, and his account was showing the same symptoms as mine. He offered to start heading to a computer so he could help me, an offer which I gratefully accepted.
By 17:32, another colleague outside of the sysadmin team had also noticed the same issue. I started to look into the database to see what was going on.
Something bad has happened
Looking at the database, I could see that the affected accounts had apparently been deleted, and then recreated as completely new accounts when the users logged back in.
Immediately, I started looking to see what database backups were available. As expected, we had a nightly backup from 23:00 UTC on Friday night. I started copying the file to somewhere I could make use of it.
While I was waiting for the backup file to copy, I started checking the database for other users that might be affected. Jón von Tetzchner’s account and another one that I checked had also been deleted, but had not yet been recreated, likely because those users had not tried to log back into their accounts yet.
By this time, Hlini had arrived at a computer and started looking into things with me.
I started checking the web server logs for account deletion requests, but nothing matching the account deletions showed up; and then I realized something else was odd about these deletions.
Normally when an account is deleted in Mastodon, the username is permanently reserved as unusable. If you were to try to create a new account with the same name as a deleted account, it would not allow it (since, due to the nature of the Fediverse, having a new account with the same address as an old one would not be a good thing).
But in the case of these deletions, we were getting reassigned the exact same usernames, so these could not be normal deletions.
By 18:39, Hlini had figured out the pattern: all accounts with an ID lower than 142 (i.e. the oldest accounts) were missing from the database.
We hadn’t seen any discussion from other Mastodon server admins about anything like this, and we wondered if this could be something unique to our setup – after all, Vivaldi Social uses vivaldi.net accounts for logins (thanks to Mastodon’s OAuth support) instead of the normal signup and login system of Mastodon. We started considering asking the Mastodon developers for help, and we also started discussing strategies for restoring the lost data from the backup.
But then…
Something bad is happening right now
At 19:10, I checked the database again, and I saw that all accounts with an ID lower than 217 were now missing from the database, and that number was increasing. This meant that accounts were still being actively deleted from the database.
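The check described above, comparing the lowest surviving ID at two points in time, can be sketched roughly as follows. This is only an illustration and not the queries we actually ran: it uses an in-memory SQLite database, and the `users` table, its columns, and both function names are assumptions made for the example (Mastodon itself runs on PostgreSQL with a different schema).

```python
import sqlite3


def lowest_surviving_id(conn):
    """Return the smallest id left in the hypothetical `users` table (None if empty)."""
    row = conn.execute("SELECT MIN(id) FROM users").fetchone()
    return row[0]


def deletions_in_progress(previous_min, current_min):
    """True if the lowest surviving id moved upward between two checks.

    An empty table (None) is not handled specially here; it returns False.
    """
    if previous_min is None or current_min is None:
        return False
    return current_min > previous_min


# Build a stand-in database with ids 1..300 (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (id, username) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 301)],
)

# Simulate the state at the first check: everything below 142 is gone.
conn.execute("DELETE FROM users WHERE id < 142")
first_check = lowest_surviving_id(conn)

# Simulate the state at the second check: everything below 217 is gone.
conn.execute("DELETE FROM users WHERE id < 217")
second_check = lowest_surviving_id(conn)

still_deleting = deletions_in_progress(first_check, second_check)
print(first_check, second_check, still_deleting)
```

A rising minimum between two checks is what told us the deletions were ongoing rather than a one-off event.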
By this point we both agreed that we needed more help, so at 19:18 we contacted the Mastodon developers. We immediately got a reply from Renaud, and he pinged Claire and Eugen to enlist their help.
Stemming the flood
At 19:20, Hlini restarted all of the Docker instances in our Mastodon setup. The deletions seemed to stop the moment he did this. The lowest ID in the database was now 236.
Fortunately it turned out that it would stay that way.
The investigation begins
198 accounts in total had been deleted during the course of this incident, and over the next few hours, together with the Mastodon devs, we started looking into what could be going on. On Eugen’s suggestion, we looked into the possibility of it being the UserCleanupScheduler deleting accounts that were “unconfirmed”, but this was eventually ruled out, as the deleted users could never have matched the query that it operated on.
Since we had upgraded to Mastodon 4.1.3 just 48 hours before the incident occurred, the Mastodon devs looked into all the code changes between v4.1.2 and v4.1.3 to see if anything there could be related. They even (and I cannot credit them enough for this) went the extra mile and looked through our published changes to see if any of the changes we had made could possibly lead to this. The conclusion though was that none of the changes could have triggered anything like this.
At the suggestion of Renaud and Eugen, we checked the filesystem to see if the deletions were being done directly in the database, or if they were being triggered by Mastodon itself. We could see that the avatar and header images for the deleted accounts had themselves also been deleted. This meant that the deletions had to be coming from the Mastodon application itself.
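The reasoning here can be written down as a small decision function. This is a hedged sketch of the inference only: the function name, the labels it returns, and the file names are all hypothetical, and it assumes you already know (for example from a backup) which media files an account used to have.

```python
import tempfile
from pathlib import Path


def classify_deletion(row_in_database, media_paths):
    """Classify one account from a database check and a filesystem check.

    media_paths: files the account was known to have before the incident.
    The labels returned here are illustrative only.
    """
    if row_in_database:
        return "present"
    if not media_paths:
        return "inconclusive"  # no known files to check against
    if any(Path(p).exists() for p in media_paths):
        return "database-only"  # row gone, files left behind
    return "application"  # row and files both gone


with tempfile.TemporaryDirectory() as root:
    kept = Path(root) / "avatar_150.png"  # hypothetical file name
    kept.write_bytes(b"")
    gone = Path(root) / "avatar_100.png"  # never created, so it does not exist

    result_files_gone = classify_deletion(False, [gone])
    result_files_kept = classify_deletion(False, [kept])
    result_row_present = classify_deletion(True, [kept])
    result_unknown = classify_deletion(False, [])

print(result_files_gone, result_files_kept, result_row_present, result_unknown)
```

In our case every deleted account fell into the last branch: the rows were gone and so were the avatar and header files, which is what you would expect if the application had performed the deletion rather than someone running a DELETE directly against the database.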
An attack?
We also started looking for signs of system intrusion, since it was certainly a possibility that this was some kind of deliberate attack. I spent some time checking the various logs that we had available to us, but I didn’t find anything (though in these cases, the absence of evidence can never rule out the possibility).
Because Mastodon v4.1.3 included a security fix, the devs also looked into the possibility of a related exploit, so we combed through the logs and examined the filesystem for evidence of such an attack. Again though, nothing was found.
We debated whether we should take Vivaldi Social offline altogether while we continued the investigation. The Mastodon devs gave arguments in both directions:
We ultimately decided to keep it running. In truth what swung the decision that way was probably not the balance between the above arguments, but just a simple fact of us being sysadmins…