Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by TrevorH » 2015/08/16 12:45:04

Never used it. I'd suggest that you read the MySQL dev page I linked to, then look at your system and work out how many connections to MySQL you have active at any one time. Using those numbers and the MySQL dev page info about which variables are thread specific, you can work out how much memory your usual workload will make mysqld use and see if that fits in your 1GB RAM. If it does not, you need to either reduce the number of concurrent requests, adjust the MySQL parameters to use less memory, or increase the amount of RAM available.
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

aks
Posts: 3073
Joined: 2014/09/20 11:22:14

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by aks » 2015/08/16 15:28:35

From memory (which is quite dicky these days):
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
are per-connection allocations. So that's 1000 (your max_connections) times the above for clients, plus the default InnoDB buffer pool size (innodb_buffer_pool_size) for the version of MySQL you're using. Mind you, a client connection may be able to override those global values for itself with SET SESSION <var> = <val>.
When the VM reaches its "unavailable" failure state, its CPU consumption (1 vCPU) is seen as high on the host (but the overall CPU consumption on the host, even at those moments, is no more than 30%).
That does suggest some very furious swapping activity. Generally the OOM killer will kill off the largest consumer first (according to the docs) and, if that's not enough to satisfy current requests, kill off the next, and so on. The documentation does go on to say that memory usage changes all the time, so what is the largest now may not be the largest at the next CPU cycle.
Swapping for databases is a VERY bad thing. You need to look at your data and work out the appropriate values for your database and httpd.
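
To put rough numbers on the per-connection arithmetic in this post, here is a minimal shell sketch. It assumes the buffer sizes quoted from memory above are the ones actually configured, and it treats every buffer as allocated for every connection at once, which is a worst case: as far as I know, MySQL allocates most of these buffers only when a query needs them. The function name estimate_thread_mem_mb is made up for illustration.

```shell
#!/bin/sh
# Per-thread buffer sizes in KB, copied from the values quoted in the post.
sort_buffer_kb=512
net_buffer_kb=8
read_buffer_kb=256
read_rnd_buffer_kb=512
myisam_sort_buffer_kb=8192   # 8M

# Print the worst-case total in MB (rounded down) for a number of connections.
# $1 = number of connections
# $2 = "with_myisam" to include myisam_sort_buffer_size (optional)
estimate_thread_mem_mb() {
    conns=$1
    per_conn_kb=$((sort_buffer_kb + net_buffer_kb + read_buffer_kb + read_rnd_buffer_kb))
    if [ "${2:-}" = "with_myisam" ]; then
        per_conn_kb=$((per_conn_kb + myisam_sort_buffer_kb))
    fi
    echo $((conns * per_conn_kb / 1024))
}

estimate_thread_mem_mb 1000               # prints 1257
estimate_thread_mem_mb 1000 with_myisam   # prints 9257
```

Even with the 8M myisam_sort_buffer_size left out, 1000 connections come to roughly 1.2 GB in this worst case, and the InnoDB buffer pool is extra. What matters in practice is the number of connections actually in use, not the configured maximum.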

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by TrevorH » 2015/08/16 17:18:45

When you have 2GB swap and 1GB RAM, the OOM killer won't even be invoked until you hit 3GB allocated. At that point everything it wants to reference or kill is probably already swapped out, so it would need to swap it in again, which means it would need free RAM... oh, wait!

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by aks » 2015/08/16 18:37:18

Yup, that's right. OOM killer kills the thing "most swapped out" (over time) and then gets really impatient with that process when it's busy trying to swap in the swapped out stuff - hence the name "killer"!
Don't you just love time!
A really bad thing (TM) :twisted:

InitOrNot
Posts: 122
Joined: 2015/06/10 18:26:51

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by InitOrNot » 2015/08/17 11:28:23

aks wrote:From memory (which is quite dicky these days):
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
are per-connection allocations. So that's 1000 (your max_connections) times the above for clients, plus the default InnoDB buffer pool size (innodb_buffer_pool_size) for the version of MySQL you're using. Mind you, a client connection may be able to override those global values for itself with SET SESSION <var> = <val>.
Thanks a lot for chiming in, aks.

I think the "myisam_sort_buffer_size" MySQL variable does not pose a problem. According to this site: "myisam_sort_buffer_size (...) won't be relevant unless you are rebuilding indexes using ALTER TABLE or REPAIR TABLE etc." I don't think the remote users connecting to this web server can trigger those operations, because the PHP scripts running on this server do not issue those statements.

However, I am making the following changes to the server:
  1. Change the MySQL variable "max_connections" to 700 instead of 1000.
  2. Up the RAM assigned to this server's VM from 1GB to 2GB.
I am doing this in the hope that, if the problem lies in MySQL, it will not happen again. If it does happen again, the problem most probably lies in the Apache+PHP ballpark.

Just to complete the picture of this server, here is the output of these two commands (taken while the server is operating fine, as it mostly does):

Code:

$ free -m > free-m.txt

             total       used       free     shared    buffers     cached
Mem:          1010        967         42          0         15        172
-/+ buffers/cache:        780        230
Swap:         2047         34       2013

Code:

$ top -b -n1 -m > top.txt

top - 12:48:03 up 15 days, 21:56,  5 users,  load average: 0.14, 0.14, 0.14
Tasks:  87 total,   2 running,  85 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.1%us,  0.5%sy,  0.0%ni, 91.8%id,  2.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1034656k total,   992156k used,    42500k free,    16136k buffers
Swap:  2097144k total,    35712k used,  2061432k free,   177040k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                            
16453 apache    15   0  135m  51m 8264 S  0.0  5.1   0:11.37 httpd                                                                              
15970 apache    18   0  134m  51m 8316 S  0.0  5.1   0:29.33 httpd                                                                              
16332 apache    15   0  134m  51m 8412 S  0.0  5.1   0:21.96 httpd                                                                              
16531 apache    15   0  134m  51m 8536 S  0.0  5.1   0:17.87 httpd                                                                              
16580 apache    23   0  134m  51m 8340 S  0.0  5.1   0:42.97 httpd                                                                              
15810 apache    18   0  134m  51m 8352 S  0.0  5.1   0:51.13 httpd                                                                              
15811 apache    15   0  133m  50m 8628 S  0.0  5.0   0:33.14 httpd                                                                              
14956 apache    15   0  134m  50m 8376 S  0.0  5.0   1:51.34 httpd                                                                              
16530 apache    15   0  133m  50m 8040 S  0.0  5.0   0:11.60 httpd                                                                              
16330 apache    15   0  133m  50m 8340 S  0.0  5.0   0:23.76 httpd                                                                              
15594 apache    18   0  133m  50m 8672 S  0.0  5.0   1:27.31 httpd                                                                              
15014 apache    16   0  133m  49m 8468 S  0.0  4.9   0:56.66 httpd                                                                              
15968 apache    19   0  129m  48m 8244 S  0.0  4.8   0:19.24 httpd                                                                              
14957 apache    18   0  130m  47m 8408 S  0.0  4.7   0:56.70 httpd                                                                              
15812 apache    18   0  130m  47m 8400 S  0.0  4.7   0:47.61 httpd                                                                              
14955 apache    15   0  130m  46m 8400 S  0.0  4.6   0:47.96 httpd                                                                              
16313 apache    15   0  129m  46m 8388 S  0.0  4.6   0:23.29 httpd                                                                              
17194 apache    15   0  129m  46m 8028 R  0.0  4.6   0:06.16 httpd                                                                              
16532 apache    16   0  121m  40m 8268 S  0.0  4.0   0:09.65 httpd                                                                              
27518 root      18   0 95884  28m  24m S  0.0  2.8   0:13.34 httpd                                                                              
 2285 mysql     15   0  153m  21m 3876 S  0.0  2.1 118:54.90 mysqld                                                                             
16367 apache    15   0  101m  18m 7872 S  0.0  1.8   1:23.38 httpd                                                                              
 2162 ntp       15   0  4532 4528 3516 S  0.0  0.4   0:00.39 ntpd                                                                               
17660 root      17   0 10092 2808 2272 S  0.0  0.3   0:00.03 sshd                                                                               
17340 postfix   18   0  8660 2008 1572 S  0.0  0.2   0:00.01 pickup                                                                             
 2531 usuario2  15   0  5984 1828  672 S  0.0  0.2   0:00.62 screen                                                                             
17665 usuario2  15   0 10092 1608 1044 S  0.0  0.2   0:00.04 sshd                                                                               
17666 usuario2  16   0  4784 1488 1208 S  0.0  0.1   0:00.02 bash                                                                               
 2412 postfix   18   0  8796 1432 1316 S  0.0  0.1   0:00.84 qmgr                                                                               
 2399 root      15   0  8592 1304 1224 S  0.0  0.1   0:01.78 master                                                                             
 2557 usuario2  15   0  4784 1276 1056 S  0.0  0.1   0:00.12 bash                                                                               
17521 usuario1  15   0 10228 1224  920 S  0.0  0.1   0:00.00 pure-ftpd                                                                          
 2532 usuario2  16   0  4784 1136 1020 S  0.0  0.1   0:00.03 bash                                                                               
 2638 usuario2  16   0  4784 1120 1004 S  0.0  0.1   0:00.10 bash                                                                               
 2609 root      16   0  4784 1100  996 S  0.0  0.1   0:00.28 bash                                                                               
17694 usuario2  15   0  4928  952  764 S  0.0  0.1   0:00.00 screen                                                                             
17722 usuario2  15   0  2308  928  728 R  0.0  0.1   0:00.00 top                                                                                
27516 root      25   0  5080  884  884 S  0.0  0.1   0:00.01 nss_pcache                                                                         
17522 root      20   0 10228  840  584 S  0.0  0.1   0:00.00 pure-ftpd                                                                          
 2608 root      16   0  5032  780  780 S  0.0  0.1   0:00.00 su                                                                                 
 2582 usuario2  15   0  4784  640  640 S  0.0  0.1   0:00.01 bash                                                                               
 2146 root      15   0  7264  620  528 S  0.0  0.1   0:00.03 sshd                                                                               
 2203 root      25   0  4648  604  604 S  0.0  0.1   0:00.03 mysqld_safe                                                                        
 2444 root      18   0  5392  592  532 S  0.0  0.1   0:00.36 crond                                                                              
 2083 root      16   0  1832  524  476 S  0.0  0.1   0:01.47 syslogd                                                                            
    1 root      15   0  2176  488  460 S  0.0  0.0   0:01.17 init                                                                               
 2480 root      20   0  1764  400  400 S  0.0  0.0   0:00.00 mingetty                                                                           
 2474 root      16   0  1764  380  380 S  0.0  0.0   0:00.00 mingetty                                                                           
 2475 root      17   0  1764  380  380 S  0.0  0.0   0:00.00 mingetty                                                                           
 2476 root      18   0  1764  380  380 S  0.0  0.0   0:00.00 mingetty                                                                           
 2479 root      19   0  1764  380  380 S  0.0  0.0   0:00.00 mingetty                                                                           
 2485 root      19   0  1764  380  380 S  0.0  0.0   0:00.00 mingetty                                                                           
 2457 root      18   0  2376  348  304 S  0.0  0.0   0:00.09 atd                                                                                
  583 root      18  -4  2404  344  344 S  0.0  0.0   0:00.50 udevd                                                                              
 2431 root      18   0 10176  316  248 S  0.0  0.0   0:00.39 pure-ftpd                                                                          
 2086 root      15   0  1780  308  308 S  0.0  0.0   0:00.00 klogd                                                                              
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0                                                                        
    3 root      39  19     0    0    0 S  0.0  0.0   0:00.02 ksoftirqd/0                                                                        
    4 root      10  -5     0    0    0 S  0.0  0.0   0:00.03 events/0                                                                           
    5 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 khelper                                                                            
    6 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kthread                                                                            
    9 root      10  -5     0    0    0 S  0.0  0.0   0:00.71 kblockd/0                                                                          
   10 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid                                                                             
  169 root      16  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/0                                                                           
  172 root      16  -5     0    0    0 S  0.0  0.0   0:00.00 khubd                                                                              
  174 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kseriod                                                                            
  241 root      18   0     0    0    0 S  0.0  0.0   0:00.01 khungtaskd                                                                         
  242 root      15   0     0    0    0 S  0.0  0.0   0:10.34 pdflush                                                                            
  243 root      15   0     0    0    0 S  0.0  0.0   0:09.89 pdflush                                                                            
  244 root      10  -5     0    0    0 S  0.0  0.0   1:59.22 kswapd0                                                                            
  245 root      16  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0                                                                              
  463 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kpsmoused                                                                          
  494 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 mpt_poll_0                                                                         
  495 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 mpt/0                                                                              
  496 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_0                                                                          
  499 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 ata/0                                                                              
  500 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 ata_aux                                                                            
  505 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kstriped                                                                           
  514 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 ksnapd                                                                             
  525 root      10  -5     0    0    0 S  0.0  0.0   0:19.54 kjournald                                                                          
  550 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kauditd                                                                            
 1703 root      18  -5     0    0    0 S  0.0  0.0   0:00.00 kmpathd/0                                                                          
 1704 root      18  -5     0    0    0 S  0.0  0.0   0:00.00 kmpath_handlerd                                                                    
 1762 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kjournald                                                                          
 1764 root      10  -5     0    0    0 S  0.0  0.0   0:08.67 kjournald                                                                          
 1769 root      10  -5     0    0    0 S  0.0  0.0   0:10.34 kjournald                                                                          
 1774 root      10  -5     0    0    0 S  0.0  0.0   0:24.36 kjournald
As you can see, when the server is operating fine there is no pressure on the free RAM: once buffers and cache are discounted, about 23% of the RAM is free (230 MB of 1010 MB), which I think is fine.
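
For reference, the share of RAM that is really free (once buffers and cache are discounted) can be computed from the free output rather than estimated by eye. This is a small sketch, assuming the old procps layout shown in this post with its "-/+ buffers/cache:" line; newer versions of free print a different layout and would need a different pattern. The function name free_pct is hypothetical.

```shell
#!/bin/sh
# Print the percentage of RAM that is free once buffers and cache are
# discounted. Reads `free -m` output (old procps layout) on stdin.
free_pct() {
    awk '/^Mem:/                  { total = $2 }
         /^-\/\+ buffers\/cache:/ { avail = $4 }
         END { if (total > 0) printf "%.1f\n", avail * 100 / total }'
}

# The sample posted in this thread; on a live system use: free -m | free_pct
free_pct <<'EOF'
             total       used       free     shared    buffers     cached
Mem:          1010        967         42          0         15        172
-/+ buffers/cache:        780        230
Swap:         2047         34       2013
EOF
# prints 22.8
```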
aks wrote:
When the VM reaches its "unavailable" failure state, its CPU consumption (1 vCPU) is seen as high on the host (but the overall CPU consumption on the host, even at those moments, is no more than 30%).
That does suggest some very furious swapping activity. Generally the OOM killer will kill off the largest consumer first (according to the docs) and, if that's not enough to satisfy current requests, kill off the next, and so on. The documentation does go on to say that memory usage changes all the time, so what is the largest now may not be the largest at the next CPU cycle.
Swapping for databases is a VERY bad thing. You need to look at your data and work out the appropriate values for your database and httpd.
There is furious thrashing indeed, and high CPU usage, when the server reaches its "unavailable" state. But the question is: which process is the culprit, MySQL or Apache+PHP?

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by TrevorH » 2015/08/17 13:30:36

Adjusting max_connections won't help if you never reach that number to start with. Run show status like "max%"; inside the mysql client to see how many you actually use. Having a higher number won't affect how much memory you use unless the number of connections increases.

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by InitOrNot » 2015/08/17 18:50:06

TrevorH wrote:Adjusting max_connections won't help if you never reach that number to start with. Run show status like "max%"; inside the mysql client to see how many you actually use. Having a higher number won't affect how much memory you use unless the number of connections increases.
The server is lightly used:

Code:

mysql> show status like "max%";
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 25    |
+----------------------+-------+
1 row in set (0.01 sec)

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by aks » 2015/08/17 21:01:11

In an RDBMS like MySQL, max_connections is a limit, but it is not the full story: a single connection can use up all the available hardware resources.
So onto the question:
What has run out of ("allowed") memory and what is continuing in the same vein (attempting to allocate some more)?
Well, swap can give you a really big clue. If a process swaps more and more over time (and in this case we're seeing the problem manifest over a set time period), then it is the likeliest candidate for the problem. To see this, we only have to sample the available data: swap is much, much slower than memory, and hence its figures are more "stable".
The obvious choices are sar and dstat/top.
So when we have homed in on where the problem is (where a process swaps more and more until there is no more), then we can tackle the next "layer" of problems.
Make sense?

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by InitOrNot » 2015/08/18 07:15:52

aks wrote:What has run out of ("allowed") memory and what is continuing in the same vein (attempting to allocate some more)?
Well, swap can give you a really big clue. If a process swaps more and more over time (and in this case we're seeing the problem manifest over a set time period), then it is the likeliest candidate for the problem. To see this, we only have to sample the available data: swap is much, much slower than memory, and hence its figures are more "stable".
The obvious choices are sar and dstat/top.
So when we have homed in on where the problem is (where a process swaps more and more until there is no more), then we can tackle the next "layer" of problems.
Make sense?
It makes total sense. Earlier in this thread I discussed using sar for that with TrevorH, but he said that tool only samples system-wide aggregates and does not have the granularity to gather per-process data. Therefore I ruled out sar.

The next candidate was pidstat, but that command is not included in version 7.0.2 of the "sysstat" package which ships with CentOS 5.

The next candidate is the atop package, which is available for CentOS 5 in the "epel" repository. But I have not yet had the time to research that tool.

Any recommendations?
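
One stopgap that needs nothing beyond the ps already on the box is to sample resident memory with ps and total it per command name, so that all the httpd children are counted together against mysqld. This is a minimal sketch, not a replacement for pidstat or atop; the function name rss_by_command is made up, and RSS counts shared pages once per process, so the httpd total will overstate real usage.

```shell
#!/bin/sh
# Sum resident set size (KB) per command name, largest total first.
# Reads "RSS COMMAND" lines on stdin. Only the first word of the
# command name is used as the key.
rss_by_command() {
    awk '{ kb[$2] += $1 }
         END { for (c in kb) printf "%d %s\n", kb[c], c }' | sort -rn
}

# Typical use on a live system (not run here):
#   ps -e -o rss= -o comm= | rss_by_command | head -5
```

Run from cron with a timestamp and appended to a log, this gives one line per program per sample, which is easier to compare over time than full top listings.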

Re: Out-Of-Memory in LAMP server - CentOS 5.9 x86 - Why now?

Post by InitOrNot » 2015/08/18 08:07:22

While I search for a better data gathering tool with per-process granularity, I've banged these two lines into root's crontab:

Code:

*/5 * * * * top -b -n1 -m | head -30 > /tmp/top-sample_`date +\%Y-\%m-\%d_\%H-\%M-\%S`_.txt
56 20 * * * find /tmp/top-sample_* -type f -mtime +2 -print0 | xargs -r -0 rm
So I should always have data about the most memory-consuming processes from the last few days. Note that -mtime +2 only matches files that are at least three full days old, and the cleanup runs once a day, so with 288 samples per day there will be somewhere between 864 and 1152 files at any time. Yes, that data will have to be read by hand with more/less if it is ever needed, but that is still better than nothing, and probably only the last few files (from just before the server becomes unavailable) would be of interest.
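
Since the file names sort chronologically, the swap figure from each sample can be pulled out in one pass instead of paging through the files by hand. This is a sketch under the assumption that every sample contains the "Swap:" summary line in the format shown in the top output earlier in this thread (it sits within the first 30 lines, so head -30 keeps it). The name swap_per_sample is hypothetical.

```shell
#!/bin/sh
# Print "<used swap in KB> <file>" for each top sample passed as an
# argument, in the order given, so growth in swap usage can be read off.
swap_per_sample() {
    for f in "$@"; do
        # Line format: "Swap:  2097144k total,    35712k used, ..."
        # $4 is the used figure; adding 0 drops the trailing "k".
        awk -v name="$f" '/^Swap:/ { print $4 + 0, name; exit }' "$f"
    done
}

# Typical use with the files written by the crontab entry (not run here):
#   swap_per_sample /tmp/top-sample_*_.txt
```

A steady climb in the first column across consecutive files would point at the samples worth opening to see which processes were growing at the time.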
