So I recently upgraded the main MongoDB replica set from 2.2.x to 2.4.x for performance and mostly for security after the recent major MongoDB vulnerability.
Setting up the replica set in the first place wasn’t too troublesome, but maintaining has been a pain in the butt on and off, MongoDB does have some peculiarities. Once for PRIMARY member of the set crashed or hung or something, and the other 2 got confused, and rather than electing a new PRIMARY – just stopped working entirely.
Also the Ubuntu package doesn’t come with Logrotate capability, and the MongoDB log is very noisy, so I ran out of HDD space on a machine during one of my earlier installs.
Plus the default ‘security’ setup of MongoDB is a bit worring, it comes with auth turned off, no user/password login and it listens on all ports rather than binding to localhost by default. Of course all of these things are fairly easily fixed, but it makes me wonder how many poorly secured MongoDB instances there are out there on the public Internet.
Anyway back to this problem, at first I couldn’t upgrade to 2.4 because of a duplicate user error, after installed the new package it failed to start (rather than warning me during install) – so I had to roll back to the older version and lock the package there for a while. After getting rid of the duplicate users, I could upgrade.
But 50% of the time the app was erroring out with something like this:
Fri Apr 17 09:48:01.790 [conn5] assertion 16550 not authorized for query on databasename.collection ns:databasename.collection query:{ $query: { email: "abc@hotmail.com" }, $orderby: { _id: 1 } }
But half the time it worked, I tried all sorts of things, figuring out maybe the upgrade corrupted something. I totally deleted the data from both the SECONDARY MongoDB servers, and let them resync, updated the OS, rebooted, all kinds of things.
Even switched around the PRIMARY node, all of which..did nothing – I still kept getting the errors.
Finally after digging through the Ruby logs, which verified the app was getting an authentication error and digging through the database, I figured out what might be causing the problem. The Ruby logs were showing something like this:
Moped::Errors::QueryFailure: The operation: #Moped::Protocol::Query
@length=110
@request_id=93
@response_to=0
@op_code=2004
@flags=[:slave_ok]
@full_collection_name="databasename.collection"
@skip=0
@limit=1
@selector=redacted
@fields=nil
failed with error 16550: "not authorized for query on databasename.collection"
After some more digging, I found a user with the same username, in 2 different databases. So we have database databasename and database admin, both of which had the users databaseuser and admin.
You can search for users using these commands (you’ll have to auth first, if you set it up):
> use mydatabase
> db.system.users.find()
Do this on each database you have, to check for duplicate usernames, to list all databases just do:
> use admin
> show dbs
If you find any duplicate users, you can delete them with:
> db.system.users.remove({"user" : "USERNAME"});
So I deleted the admin user from databasename and deleted the databaseuser from the admin database, what I suspect was happening was 50 percent of the time it was authing with the right user and working fine, but 50% of the time it was authing with the wrong user (databaseuser in the admin database) and not having access to what it needed, and thus giving the assertion error.
After deleting the two users, and restarting the app on all the cluster servers – the error went away and I wasn’t getting intermittent app failures any more! So yah if you’re having this problem, just double check all the databases you have and make sure you don’t have any users in different databases with the same username.
When it’s all good you’ll see something like this instead of the assertion errors:
Fri Apr 17 10:37:16.197 [conn442] authenticate db: databasename { authenticate: 1, user: "databaseuser", nonce: "7c1c7s234r23fds", key: "9sdf907897sdf78979s8d" }
Fri Apr 17 10:37:32.725 [conn216] query database.collection query: { $query: { email: "abc@hotmail.com" }, $orderby: { _id: 1 } } ntoreturn:1 ntoskip:0 nscanned:1 scanAndOrder:1 keyUpdates:0 locks(micros) r:142323 nreturned:1 reslen:126 141ms
I could find absolutely nothing on Google about this and it seems like MongoDB 2.4 deals with users/auth issues quite differently from 2.2 – so if you are upgrading beware.
I know some of these sentences.