Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
Never used it. I'd suggest that you read the mysql dev page I linked to and then look at your system and work out how many connections to mysql you have active at any one time. Using those numbers and the mysql dev page info about which variables are thread specific you can then work out how much memory your usual workload will make mysqld use and see if that fits in your 1GB RAM. If it does not then you need to either reduce the number of concurrent requests, adjust the mysql parameters to use less or increase the amount of RAM available.
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
From memory (which is quite dicky these days):
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
are per connection allocations. So that's 1000 (your max_connections) times the above for clients. Additionally the default size of innodb_pool for the version of MySQL you're using. Mind it may be the case that those global variables can be "overwritten" with a client connection by using set session <var> = <val>.
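As a rough worked example of that arithmetic, the sketch below adds up the five buffers listed above and multiplies by a connection count. It assumes every connection allocates every buffer in full, which is a pessimistic upper bound rather than a measurement, since MySQL generally allocates these buffers only when a query needs them. The function name `mysql_worst_case_mb` is made up for illustration.

```shell
#!/bin/sh
# Per-connection buffer sizes from the list above, in KB.
sort_buffer_kb=512
net_buffer_kb=8
read_buffer_kb=256
read_rnd_buffer_kb=512
myisam_sort_buffer_kb=8192   # 8M

# mysql_worst_case_mb CONNECTIONS
# Prints a pessimistic upper bound (whole MB) on the thread buffers,
# assuming every connection allocates every buffer in full.
mysql_worst_case_mb() {
    per_conn_kb=$((sort_buffer_kb + net_buffer_kb + read_buffer_kb))
    per_conn_kb=$((per_conn_kb + read_rnd_buffer_kb + myisam_sort_buffer_kb))
    echo $((per_conn_kb * $1 / 1024))
}

mysql_worst_case_mb 1000   # max_connections as configured; prints 9257
```

With 1000 connections that is roughly 9 GB, far more than 1 GB of RAM, but the figure only matters if the number of simultaneous connections actually gets that high. Pass a different count to see the bound for a realistic workload.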
Quote: When the VM reaches its "unavailable" failure state, its CPU consumption (1 vCPU) is seen as high on the host (but the overall CPU consumption on the host, even at those moments, is no more than 30%).
That does suggest some very furious swapping activity. Generally the OOM killer will kill off the largest consumer first (according to the docs) and, if that's not enough to satisfy current requests, kill off the next, and so on. The documentation goes on to say that memory usage changes all the time, so what is the largest now may not be the largest at the next CPU cycle.
Swapping for databases is a VERY bad thing. You need to look at your data and work out what are the appropriate values for your database and httpd.
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
When you have 2GB swap and 1GB RAM then the OOM killer won't even be invoked until you hit 3GB allocated at which point everything it wants to reference or kill is probably already swapped out and it would need to swap it in again which means it would need free RAM... oh, wait!
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
Yup, that's right. OOM killer kills the thing "most swapped out" (over time) and then gets really impatient with that process when it's busy trying to swap in the swapped out stuff - hence the name "killer"!
Don't you just love time!
A really bad thing (TM)
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
Thanks a lot for chiming in, aks.
aks wrote:
From memory (which is quite dicky these days):
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
are per connection allocations. So that's 1000 (your max_connections) times the above for clients. Additionally the default size of innodb_pool for the version of MySQL you're using. Mind it may be the case that those global variables can be "overwritten" with a client connection by using set session <var> = <val>.
I think the "myisam_sort_buffer_size" MySQL variable does not pose a problem, according to this site: "myisam_sort_buffer_size (...) won't be relevant unless you are rebuilding indexes using ALTER TABLE or REPAIR TABLE etc." I do not think the remote users connecting to this web server can trigger those operations, because the PHP scripts running on this server do not issue those statements.
However, I am making the following changes to the server:
- Change the MySQL variable "max_connections" to 700 instead of 1000.
- Up the RAM assigned to this server's VM from 1GB to 2 GB.
Just to complete the picture of this server, this is the output of these two commands (while the server is operating fine, as it mostly does):
Code:
$ free -m > free-m.txt
             total       used       free     shared    buffers     cached
Mem:          1010        967         42          0         15        172
-/+ buffers/cache:        780        230
Swap:         2047         34       2013
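For anyone reading along, the "-/+ buffers/cache" line is derived from the "Mem:" line by treating buffers and page cache as reclaimable. A quick check with the figures above (a sketch only; the variable names are made up for illustration):

```shell
#!/bin/sh
# Figures in MB, copied from the "Mem:" line of the free -m output above.
total=1010
used=967
buffers=15
cached=172

# Memory held by applications, excluding buffers and page cache.
app_used=$((used - buffers - cached))
# Whatever is left of the total is free or reclaimable from buffers/cache.
app_free=$((total - app_used))

echo "used by applications:      ${app_used} MB"
echo "available to applications: ${app_free} MB"
```

This reproduces the 780 and 230 shown by free: applications hold about 780 MB of the 1010 MB, even while the server is healthy.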
Code:
$ top -b -n1 -m > top.txt
top - 12:48:03 up 15 days, 21:56, 5 users, load average: 0.14, 0.14, 0.14
Tasks: 87 total, 2 running, 85 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.1%us, 0.5%sy, 0.0%ni, 91.8%id, 2.6%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1034656k total, 992156k used, 42500k free, 16136k buffers
Swap: 2097144k total, 35712k used, 2061432k free, 177040k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16453 apache 15 0 135m 51m 8264 S 0.0 5.1 0:11.37 httpd
15970 apache 18 0 134m 51m 8316 S 0.0 5.1 0:29.33 httpd
16332 apache 15 0 134m 51m 8412 S 0.0 5.1 0:21.96 httpd
16531 apache 15 0 134m 51m 8536 S 0.0 5.1 0:17.87 httpd
16580 apache 23 0 134m 51m 8340 S 0.0 5.1 0:42.97 httpd
15810 apache 18 0 134m 51m 8352 S 0.0 5.1 0:51.13 httpd
15811 apache 15 0 133m 50m 8628 S 0.0 5.0 0:33.14 httpd
14956 apache 15 0 134m 50m 8376 S 0.0 5.0 1:51.34 httpd
16530 apache 15 0 133m 50m 8040 S 0.0 5.0 0:11.60 httpd
16330 apache 15 0 133m 50m 8340 S 0.0 5.0 0:23.76 httpd
15594 apache 18 0 133m 50m 8672 S 0.0 5.0 1:27.31 httpd
15014 apache 16 0 133m 49m 8468 S 0.0 4.9 0:56.66 httpd
15968 apache 19 0 129m 48m 8244 S 0.0 4.8 0:19.24 httpd
14957 apache 18 0 130m 47m 8408 S 0.0 4.7 0:56.70 httpd
15812 apache 18 0 130m 47m 8400 S 0.0 4.7 0:47.61 httpd
14955 apache 15 0 130m 46m 8400 S 0.0 4.6 0:47.96 httpd
16313 apache 15 0 129m 46m 8388 S 0.0 4.6 0:23.29 httpd
17194 apache 15 0 129m 46m 8028 R 0.0 4.6 0:06.16 httpd
16532 apache 16 0 121m 40m 8268 S 0.0 4.0 0:09.65 httpd
27518 root 18 0 95884 28m 24m S 0.0 2.8 0:13.34 httpd
2285 mysql 15 0 153m 21m 3876 S 0.0 2.1 118:54.90 mysqld
16367 apache 15 0 101m 18m 7872 S 0.0 1.8 1:23.38 httpd
2162 ntp 15 0 4532 4528 3516 S 0.0 0.4 0:00.39 ntpd
17660 root 17 0 10092 2808 2272 S 0.0 0.3 0:00.03 sshd
17340 postfix 18 0 8660 2008 1572 S 0.0 0.2 0:00.01 pickup
2531 usuario2 15 0 5984 1828 672 S 0.0 0.2 0:00.62 screen
17665 usuario2 15 0 10092 1608 1044 S 0.0 0.2 0:00.04 sshd
17666 usuario2 16 0 4784 1488 1208 S 0.0 0.1 0:00.02 bash
2412 postfix 18 0 8796 1432 1316 S 0.0 0.1 0:00.84 qmgr
2399 root 15 0 8592 1304 1224 S 0.0 0.1 0:01.78 master
2557 usuario2 15 0 4784 1276 1056 S 0.0 0.1 0:00.12 bash
17521 usuario1 15 0 10228 1224 920 S 0.0 0.1 0:00.00 pure-ftpd
2532 usuario2 16 0 4784 1136 1020 S 0.0 0.1 0:00.03 bash
2638 usuario2 16 0 4784 1120 1004 S 0.0 0.1 0:00.10 bash
2609 root 16 0 4784 1100 996 S 0.0 0.1 0:00.28 bash
17694 usuario2 15 0 4928 952 764 S 0.0 0.1 0:00.00 screen
17722 usuario2 15 0 2308 928 728 R 0.0 0.1 0:00.00 top
27516 root 25 0 5080 884 884 S 0.0 0.1 0:00.01 nss_pcache
17522 root 20 0 10228 840 584 S 0.0 0.1 0:00.00 pure-ftpd
2608 root 16 0 5032 780 780 S 0.0 0.1 0:00.00 su
2582 usuario2 15 0 4784 640 640 S 0.0 0.1 0:00.01 bash
2146 root 15 0 7264 620 528 S 0.0 0.1 0:00.03 sshd
2203 root 25 0 4648 604 604 S 0.0 0.1 0:00.03 mysqld_safe
2444 root 18 0 5392 592 532 S 0.0 0.1 0:00.36 crond
2083 root 16 0 1832 524 476 S 0.0 0.1 0:01.47 syslogd
1 root 15 0 2176 488 460 S 0.0 0.0 0:01.17 init
2480 root 20 0 1764 400 400 S 0.0 0.0 0:00.00 mingetty
2474 root 16 0 1764 380 380 S 0.0 0.0 0:00.00 mingetty
2475 root 17 0 1764 380 380 S 0.0 0.0 0:00.00 mingetty
2476 root 18 0 1764 380 380 S 0.0 0.0 0:00.00 mingetty
2479 root 19 0 1764 380 380 S 0.0 0.0 0:00.00 mingetty
2485 root 19 0 1764 380 380 S 0.0 0.0 0:00.00 mingetty
2457 root 18 0 2376 348 304 S 0.0 0.0 0:00.09 atd
583 root 18 -4 2404 344 344 S 0.0 0.0 0:00.50 udevd
2431 root 18 0 10176 316 248 S 0.0 0.0 0:00.39 pure-ftpd
2086 root 15 0 1780 308 308 S 0.0 0.0 0:00.00 klogd
2 root RT -5 0 0 0 S 0.0 0.0 0:00.00 migration/0
3 root 39 19 0 0 0 S 0.0 0.0 0:00.02 ksoftirqd/0
4 root 10 -5 0 0 0 S 0.0 0.0 0:00.03 events/0
5 root 14 -5 0 0 0 S 0.0 0.0 0:00.00 khelper
6 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 kthread
9 root 10 -5 0 0 0 S 0.0 0.0 0:00.71 kblockd/0
10 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 kacpid
169 root 16 -5 0 0 0 S 0.0 0.0 0:00.00 cqueue/0
172 root 16 -5 0 0 0 S 0.0 0.0 0:00.00 khubd
174 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kseriod
241 root 18 0 0 0 0 S 0.0 0.0 0:00.01 khungtaskd
242 root 15 0 0 0 0 S 0.0 0.0 0:10.34 pdflush
243 root 15 0 0 0 0 S 0.0 0.0 0:09.89 pdflush
244 root 10 -5 0 0 0 S 0.0 0.0 1:59.22 kswapd0
245 root 16 -5 0 0 0 S 0.0 0.0 0:00.00 aio/0
463 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 kpsmoused
494 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 mpt_poll_0
495 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 mpt/0
496 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_0
499 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 ata/0
500 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 ata_aux
505 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 kstriped
514 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 ksnapd
525 root 10 -5 0 0 0 S 0.0 0.0 0:19.54 kjournald
550 root 11 -5 0 0 0 S 0.0 0.0 0:00.00 kauditd
1703 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 kmpathd/0
1704 root 18 -5 0 0 0 S 0.0 0.0 0:00.00 kmpath_handlerd
1762 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kjournald
1764 root 10 -5 0 0 0 S 0.0 0.0 0:08.67 kjournald
1769 root 10 -5 0 0 0 S 0.0 0.0 0:10.34 kjournald
1774 root 10 -5 0 0 0 S 0.0 0.0 0:24.36 kjournald
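One way to compare MySQL against Apache+PHP from a listing like the one above is to add up the RES column per command name. The helper below is a minimal sketch, not a polished tool: `sum_res_by_command` is a made-up name, it assumes the column layout shown above (RES in field 6, COMMAND in field 12), and summing RES counts shared pages once per process, so the totals overstate real usage.

```shell
#!/bin/sh
# sum_res_by_command: read `top -b -n1` output on stdin and print the
# total resident memory (RES) per command name in MB, largest first.
# RES values without a suffix are KB; "m" and "g" suffixes mean MB and GB.
sum_res_by_command() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 12 {      # process rows start with a PID
            res = $6
            if (res ~ /m$/)      kb = res * 1024          # e.g. "51m"
            else if (res ~ /g$/) kb = res * 1024 * 1024   # e.g. "1.2g"
            else                 kb = res                 # plain KB
            total[$12] += kb
        }
        END {
            for (cmd in total) printf "%d %s\n", total[cmd] / 1024, cmd
        }
    ' | sort -rn
}

# Usage (not run here): sum_res_by_command < top.txt
```

In the snapshot above, the httpd workers at roughly 50 MB each dwarf mysqld's 21 MB, but that snapshot was taken while the server was healthy, so it says nothing yet about the failure state.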
aks wrote:
Quote: When the VM reaches its "unavailable" failure state, its CPU consumption (1 vCPU) is seen as high on the host (but the overall CPU consumption on the host, even at those moments, is no more than 30%).
That does suggest some very furious swapping activity. Generally the OOM killer will kill off the largest consumer first (according to the docs) and, if that's not enough to satisfy current requests, kill off the next, and so on. The documentation goes on to say that memory usage changes all the time, so what is the largest now may not be the largest at the next CPU cycle.
Swapping for databases is a VERY bad thing. You need to look at your data and work out what are the appropriate values for your database and httpd.
There is furious thrashing indeed, and high CPU usage, when the server reaches its "non available" state. But the question is, which process is the culprit, MySQL or Apache+PHP?
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
Adjusting max_connections won't help if you never reach that number to start with. Run show status like "max%"; inside the mysql client to see how many you actually use. Having a higher number won't affect how much memory you use unless the number of connections increases.
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
TrevorH wrote:
Adjusting max_connections won't help if you never reach that number to start with. Run show status like "max%"; inside the mysql client to see how many you actually use. Having a higher number won't affect how much memory you use unless the number of connections increases.
The server is lightly used:
Code:
mysql> show status like "max%";
+----------------------+-------+
| Variable_name | Value |
+----------------------+-------+
| Max_used_connections | 25 |
+----------------------+-------+
1 row in set (0.01 sec)
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
In an RDBMS like MySQL, a max% value is a limit, but it is not the full story. A single connection (process) can use up all available hardware resources on its own, so max_connections does not tell you everything.
So onto the question:
What has run out of ("allowed") memory and what is continuing in the same vein (attempting to allocate some more)?
Well, swap can give you a really big clue. If something (a process) swaps more and more over time (and in this case, we're seeing the problem manifest over a set time period), then that is the likeliest candidate for the problem. To see this, we only have to sample the available data (it's not "quick" like memory; it's swap, which is much, much slower and hence more "stable").
The obvious choices are sar and dstat/top.
So when we have homed in on where the problem is (where a process swaps more and more until there is no more), then we can tackle the next "layer" of problems.
Make sense?
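As a minimal sketch of that kind of sampling, the helpers below read the cumulative swap-in and swap-out page counters from /proc/vmstat and print the difference between two samples. The function names are made up for illustration, and the numbers are system-wide, not per process; `vmstat 5` (si/so columns) and `sar -W` report the same activity as rates.

```shell
#!/bin/sh
# swap_counters [FILE]
# Print the cumulative pages swapped in and out since boot, read from
# /proc/vmstat (or from FILE, which makes the function easy to test).
swap_counters() {
    awk '$1 == "pswpin"  { in_pages = $2 }
         $1 == "pswpout" { out_pages = $2 }
         END { print in_pages + 0, out_pages + 0 }' "${1:-/proc/vmstat}"
}

# swap_delta "IN1 OUT1" "IN2 OUT2"
# Print how many pages were swapped in and out between two samples.
swap_delta() {
    set -- $1 $2
    echo "in=$(($3 - $1)) out=$(($4 - $2))"
}

# Usage (not run here):
#   before=$(swap_counters); sleep 60; after=$(swap_counters)
#   swap_delta "$before" "$after"
```

A steadily growing "out" figure during the run-up to a failure would confirm that swap is filling; finding which process is responsible still needs per-process data.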
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
aks wrote:
What has run out of ("allowed") memory and what is continuing in the same vein (attempting to allocate some more)?
Well, swap can give you a really big clue. If something (a process) swaps more and more over time (and in this case, we're seeing the problem manifest over a set time period), then that is the likeliest candidate for the problem. To see this, we only have to sample the available data (it's not "quick" like memory; it's swap, which is much, much slower and hence more "stable").
The obvious choices are sar and dstat/top.
So when we have homed in on where the problem is (where a process swaps more and more until there is no more), then we can tackle the next "layer" of problems.
Make sense?
It makes total sense. I was previously discussing using sar for that with TrevorH in this thread, but he said that tool only samples system-wide aggregates and does not have the granularity to gather single-process data. Therefore I ruled out using sar.
The next candidate was pidstat, but that command is not included in version 7.0.2 of the "sysstat" package which ships with CentOS 5.
The next candidate is the atop package, which is available for CentOS 5 in the "epel" repository. But I have not yet had the time to research that tool.
Any recommendations?
Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?
While I search for a better data gathering tool with per-process granularity, I've banged these two lines into root's crontab:
Code:
*/5 * * * * top -b -n1 -m | head -30 > /tmp/top-sample_`date +\%Y-\%m-\%d_\%H-\%M-\%S`_.txt
56 20 * * * find /tmp/top-sample_* -type f -mtime +2 -print0 | xargs -r -0 rm
So I should always have data about the most memory-consuming processes from the last two days. Yes, that data will have to be parsed by hand with more/less if it's ever needed, but that is still better than nothing. The number of files to peruse should not, I think, get higher than about 576, and even then probably only the last ones (before the server becomes unavailable) would be of interest.
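As a rough check on the file count, the arithmetic can be sketched as below. It relies on how GNU find documents `-mtime +2`: fractional days are ignored, so a file only matches once it is at least three full days old. The helper name `samples_per_day` is made up for illustration.

```shell
#!/bin/sh
# samples_per_day INTERVAL_MINUTES
# Number of cron runs per day for a job that fires every INTERVAL_MINUTES.
samples_per_day() {
    echo $((24 * 60 / $1))
}

per_day=$(samples_per_day 5)   # one top sample every 5 minutes
echo "samples per day:           $per_day"
echo "two days of samples:       $((per_day * 2))"
# -mtime +2 only matches files at least 3 full days old, and the cleanup
# runs once a day, so just before it runs almost 4 days can be on disk.
echo "worst case before cleanup: $((per_day * (3 + 1)))"
```

So 576 is the count for exactly two days of samples, while the directory may hold up to roughly twice that just before the daily cleanup. Using `-mtime +1` instead would bring it closer to the intended two days.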