Error messages should be simple, clear, and easy to understand. But perspectives differ: a developer writing the source code has a different idea of "easy to understand" than a user who doesn't know the source or the internals. MongoDB reports "DBClientCursor::init call() failed" on connect errors. Do you know what this message means?
Daily work: connect to a MongoDB server to do something. I still prefer the command-line client for most tasks. It's fast and simple to start:
$ mongo mongodb-02.vpn/project_db
MongoDB shell version: 2.4.1
connecting to: mongodb-02.vpn/project_db
Wed Feb 19 10:35:47.320 DBClientCursor::init call() failed
Wed Feb 19 10:35:47.322 JavaScript execution failed: Error: DBClientBase::findN: transport error: mongodb-02.vpn:27017 ns: admin.$cmd query: { whatsmyuri: 1 } at src/mongo/shell/mongo.js:L114
exception: connect failed
This is one of those common "wtf???" moments in software development. The last line is intelligible: I understand "connect failed". But I don't know why.
I tried a few more times; some attempts connected to the MongoDB server, others didn't. db.stats() and db.currentOp() are good for seeing what's going on inside the database, but they didn't tell me anything useful in this case.
db.serverStatus() returns a huge amount of information, and a few lines were telling me something really interesting:
> db.serverStatus()
{
"host" : "mongodb-02.vpn",
"version" : "2.4.6",
"process" : "mongod",
"pid" : 1544,
"uptime" : 1709586,
"uptimeMillis" : NumberLong(1709586434),
"uptimeEstimate" : 1255812,
"localTime" : ISODate("2014-03-04T09:36:38.511Z"),
[...]
"connections" : {
"current" : 815,
"available" : 4,
"totalCreated" : NumberLong(128676230)
},
This server has 815 current connections, which is more than it should have, but still OK. This project launches tasks for various actions, and each of them is allowed one connection to MongoDB.
That left me with two problems: 815 connections were more than I expected, and the MongoDB server should have been able to handle about 20000 concurrent connections, not the 819 (815 in use plus 4 available) it reported.
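The gap is easy to see from the "connections" block of the serverStatus output. Here is a minimal Python sketch of that arithmetic. The dictionary copies the values shown above, and the helper name connection_headroom is made up for illustration:

```python
def connection_headroom(status):
    """Return (total slots, fraction still free) from a serverStatus-like dict."""
    conns = status["connections"]
    total = conns["current"] + conns["available"]
    return total, conns["available"] / total


# Values copied from the db.serverStatus() output above.
status = {"connections": {"current": 815, "available": 4, "totalCreated": 128676230}}

total, free = connection_headroom(status)
print(total)                          # 819
print(f"{free:.1%} of slots free")
```

With only 4 of 819 slots free, any short burst of new connections is enough to make a connect attempt fail.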
The first problem was a scripting bug: a loop processing many items didn't reuse the task's connection; instead it connected to MongoDB, processed one item, and disconnected. That shouldn't be a problem, but the closed connections stayed half-open on the server side for a minute or so until the kernel finally closed them.
The second problem was much more confusing: the server didn't have any configured connection limit. It turned out that shells are limited to 1024 open file handles on this server. The same limit applies on my Ubuntu workstation:
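The difference between the two loop styles can be sketched in Python. This is not the project's actual script: CountingClient is a hypothetical stand-in for a real MongoDB client, used only to count how many connections each variant opens.

```python
class CountingClient:
    """Hypothetical stand-in for a MongoDB client; counts opened connections."""

    opened = 0

    def __init__(self):
        CountingClient.opened += 1
        self.closed = False

    def process(self, item):
        if self.closed:
            raise RuntimeError("connection already closed")
        return item * 2  # placeholder for the real per-item work

    def close(self):
        self.closed = True


def process_reconnecting(items):
    """The buggy pattern: connect, process one item, disconnect."""
    results = []
    for item in items:
        client = CountingClient()
        try:
            results.append(client.process(item))
        finally:
            client.close()
    return results


def process_reusing(items, client):
    """The fixed pattern: every item goes through the task's one connection."""
    return [client.process(item) for item in items]
```

For 1000 items the first variant opens 1000 connections and the second opens one. If closed connections linger on the server for about a minute, the first variant can pile up hundreds of half-open slots.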
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 94697
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 95
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 94697
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
MongoDB had never been restarted on this box since its initial setup. Someone installed it and started the server manually from an SSH shell, so the MongoDB process inherited the shell's limit of 1024 file handles. MongoDB seems to reserve some file handle slots for database files, leaving 819 for client connections.
Another MongoDB server, which had been rebooted since installation, offered roughly 20000 connection slots. Processes started during system boot don't suffer from this open-file limit.
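On Linux, the limit a running process actually inherited can be read from /proc/<pid>/limits. The Python sketch below parses that file's "Max open files" line. It also assumes that this MongoDB version caps client connections at 80% of the soft limit, with an upper bound of 20000. That fraction is my assumption, not something the server output states, but it matches the numbers above: 80% of 1024 is 819. The sample text is illustrative, not copied from the server.

```python
MAX_CONNECTIONS_CAP = 20000   # upper bound mentioned in this post
ASSUMED_FRACTION = 0.8        # assumption: share of file handles usable for connections

# Illustrative excerpt in the format of /proc/<pid>/limits on Linux.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
"""


def soft_open_files_limit(limits_text):
    """Return the soft 'Max open files' limit, or None if it is unlimited."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft = line[len("Max open files"):].split()[0]
            return None if soft == "unlimited" else int(soft)
    raise ValueError("no 'Max open files' line found")


def estimated_connection_slots(soft_limit):
    """Estimate connection slots under the assumed 80% rule."""
    if soft_limit is None:
        return MAX_CONNECTIONS_CAP
    return min(int(soft_limit * ASSUMED_FRACTION), MAX_CONNECTIONS_CAP)


print(estimated_connection_slots(soft_open_files_limit(SAMPLE_LIMITS)))  # 819
```

On the affected server, the input would come from open("/proc/1544/limits").read(), using the pid reported by db.serverStatus().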
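If a server has to be started by hand, the limit can be set explicitly for the child process instead of inherited from the shell. This is a minimal Python sketch of that idea for Unix systems, not the way this server was actually fixed. The helper name run_with_open_files_limit is made up for illustration.

```python
import resource
import subprocess


def run_with_open_files_limit(cmd, soft_limit):
    """Run cmd with its soft open-files limit set to soft_limit.

    The soft limit is clamped to the current hard limit, because an
    unprivileged process may not raise its limit beyond that.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, hard)

    def set_limit():
        # Runs in the child process just before cmd is executed.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))

    return subprocess.run(
        cmd, preexec_fn=set_limit, capture_output=True, text=True, check=True
    )
```

For example, run_with_open_files_limit(["sh", "-c", "ulimit -n"], 256) prints the limit the child sees. The real command would be the mongod invocation, with a limit high enough for the connections you expect. Note that subprocess.run waits for the command to finish, so a long-running server would need subprocess.Popen with the same preexec_fn.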