unaffectedoddball's picture

I can enter the container, then it hesitates for about two seconds and comes back with 'exited from CT [CTID]':

# vzctl enter 110
entered into CT 110
exited from CT 110
#

I've rebuilt it a couple of times and it keeps kicking me out. I can log in via ssh and from the noVNC console just fine. I've looked in /var/log on the node and an item of interest is:

kernel: Out of memory in UB 110: OOM killed process 181201 (bash) score 0 vm:4214012kB, rss:2030972kB, swap:0kB
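
For anyone scanning a log for more of these, here is a rough sketch that pulls the container ID, process and memory figures out of lines in the format above and converts kB to MB. The helper name oom_summary is made up for illustration, and it assumes the process name contains no spaces.

```shell
# Summarise OpenVZ OOM-kill lines read from stdin.
# oom_summary is a hypothetical helper name, not part of vzctl.
oom_summary() {
    # Pull out: CTID, PID, process name, vm (kB), rss (kB).
    sed -nE 's/.*Out of memory in UB ([0-9]+): OOM killed process ([0-9]+) \(([^)]*)\).* vm:([0-9]+)kB, rss:([0-9]+)kB.*/\1 \2 \3 \4 \5/p' |
    while read -r ctid pid comm vm_kb rss_kb; do
        # Integer division, so the MB figures are rounded down.
        echo "CT $ctid: killed $comm (pid $pid), vm $((vm_kb / 1024)) MB, rss $((rss_kb / 1024)) MB"
    done
}
```

It can be fed from the kernel ring buffer with something like `dmesg | oom_summary`, or from whichever file under /var/log holds the kernel messages on the node.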

 

We're working with the debian-7-turnkey-postgresql_13.0-1_amd64 image.

Thanks!

Jeremy Davis's picture

I have no idea and have not encountered that before. TBH I think that this is possibly a(nother) question for the Proxmox forums!?

Bottom line is that it seems that something is running out of RAM (your guest probably by the look of it). Assuming that this is the same template you were having issues with before I'd test the integrity of it (even if it seems unlikely I think that would be a good thing to rule out).
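
One way to confirm which limit the guest is hitting is to look at the failcnt column of /proc/user_beancounters (the "UB" in the kernel message). A minimal sketch, assuming the usual column layout (resource, held, maxheld, barrier, limit, failcnt); ubc_failures is a hypothetical helper name:

```shell
# List beancounter resources whose failcnt is non-zero.
# Reads /proc/user_beancounters-style text on stdin.
# ubc_failures is a hypothetical helper name.
ubc_failures() {
    awk 'NF >= 6 && $NF ~ /^[0-9]+$/ && $NF > 0 {
        # The first row of each container has a leading "CTID:" field,
        # so count columns from the right: the resource name sits
        # five fields before failcnt.
        print $(NF - 5), "failcnt=" $NF
    }'
}
```

Run as root on the node, for example `vzctl exec 110 cat /proc/user_beancounters | ubc_failures`. Any resource it prints has been refused at least once since the container started.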

Out of interest, when I get a chance I might retry boosting the specs on my PgSQL CT and see if I can replicate your other findings.

Another thing that I'd consider testing is the RAM on your host. I had a Proxmox server running on desktop hardware for about 3 years that generally ran smoothly but sometimes exhibited odd behaviour, though it only seemed to affect new Linux containers (the older containers seemed fine and all KVM machines were fine whether Windows or Linux). I never could pinpoint the issue, but when I upgraded the PVE hardware and re-purposed the old hardware as a desktop, Windows kept BSODing during install. After some testing it turned out that there was a bad stick of RAM. I RMAed it and was told that it was probably bad from the start (but may have got worse over time). Apparently it was a bad batch but they hadn't seen any sticks for RMA from that batch for a long time...

unaffectedoddball's picture

Thanks again Jeremy for the quick and helpful response. You are absolutely correct that it was running out of RAM. I suppose I am still perplexed that it would allow me to ssh or log in via noVNC, but can't seem to handle vzctl enter. I'd actually expect that request to be the least resource-intensive!

Noticing how poor the performance had been with too many resources allocated, I went too far the other way and turned it down to 2GB, which produced the behavior I reported. After searching TKL, I found this, which says 'we usually test appliances with 256MB of memory', so I cranked it down further and it still had the same symptoms.

Bumping it up to 4GB allows me to vzctl enter the appliance, and running top I see it's already consuming 3GB!

# free
             total       used       free     shared    buffers     cached
Mem:       4194304    3186636    1007668

So problem solved. Mind your RAM!
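
A quick way to keep an eye on this from inside the container is to turn the Mem: line of free into a percentage. This is only a sketch, assuming the total and used figures are the second and third fields as in the output above; mem_used_pct is a made-up name.

```shell
# Print used memory as a whole-number percentage of total,
# reading `free` output on stdin.
# mem_used_pct is a hypothetical helper name.
mem_used_pct() {
    awk '$1 == "Mem:" && $2 > 0 { print int($3 * 100 / $2) }'
}
```

Usage: `free | mem_used_pct`. The result is rounded down.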

unaffectedoddball's picture

Huh. So here's what I've found:

Access method       Memory used
No other shell *    32 MB
ssh connection      40 MB
vzctl enter         3,200 MB

So my assumption that vzctl enter is a low-resource function is very incorrect! 

* Monitoring was performed via a noVNC console

Jeremy Davis's picture

I just tested on that same template I launched the other day (it's still running), and mine uses nowhere near as much RAM as yours...

Mine has 512MB RAM allocated (that hasn't changed) and at idle it is using just over 200MB of that (and no swap). Looking at top (vztop -E <CTID>) I can see almost all of that is taken up by 2 PostgreSQL processes (99.4MB & 98.7MB). Going by the SIZE column, which is in KB, sshd is using about 50MB (which increases a little when I log in via SSH, but not much), and when I log in via vzctl, that uses only ~26MB. Anyway, here it is...

 20:43:08  up 10 days, 20:44,  2 users,  load average: 0.00, 0.00, 0.00
18 processes: 18 sleeping, 0 running, 0 zombie, 0 stopped
CPU0 states:   2.2% user   1.2% system    0.0% nice   0.0% iowait  96.1% idle
CPU1 states:   2.1% user   2.0% system    0.0% nice   0.0% iowait  95.2% idle
Mem:  8171916k av, 8103704k used,   68212k free,       0k shrd,  377540k buff
      2329040k active,            5369484k inactive
Swap: 7340028k av,   76752k used, 7263276k free                 5729284k cached

  PID USER     PRI  NI  SIZE  RSS SHARE  VEID STAT %CPU %MEM   TIME CPU COMMAND
702472 root      20   0 10608  756   640   118 S     0.0  0.0   0:04   1 init
702473 root      20   0     0    0     0   118 SW    0.0  0.0   0:00   0 kthreadd/118
702474 root      20   0     0    0     0   118 SW    0.0  0.0   0:00   0 khelper/118
709659 sshd      20   0 98.7M 1760   532   118 S     0.0  0.0   0:53   0 postgres
709660 sshd      20   0 99.4M 2948  1112   118 S     0.0  0.0   0:13   1 postgres
710421 root      20   0  4072  660   508   118 S     0.0  0.0   0:00   1 acpid
710472 www-data  20   0 40424 1716   624   118 S     0.0  0.0   0:12   0 lighttpd
710473 www-data  20   0 98624 7524  4416   118 S     0.0  0.0   0:00   1 php-cgi
711309 root      20   0 49888 1248   640   118 S     0.0  0.0   0:00   0 sshd
711834 ais       20   0 29696 1556  1120   118 S     0.0  0.0   0:00   1 shellinaboxd
711835 root      20   0  4052  516   420   118 S     0.0  0.0   0:00   1 startpar
711883 root      20   0 18836  988   756   118 S     0.0  0.0   0:00   0 cron
712021 root      20   0 73752  16M  1656   118 S     0.0  0.2   0:13   1 miniserv.pl
712034 root      20   0 14532  876   716   118 S     0.0  0.0   0:00   1 getty
992470 root      20   0 25560  664   448   118 S     0.0  0.0   0:00   0 vzctl
992471 root      20   0 19756 4004  1588   118 S     0.0  0.0   0:00   0 bash
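
Since SIZE is virtual size, the RSS column is probably the fairer one to compare between containers. A rough sketch for totalling it from a saved or pasted vztop listing, assuming the column layout shown above (RSS is the 6th field, in KB unless it carries an M suffix); rss_total_kb is a hypothetical helper name. Shared pages are counted once per process, so treat the result as an upper bound.

```shell
# Sum the RSS column (6th field) of a vztop process listing read
# from stdin and print the total in kB.
# rss_total_kb is a hypothetical helper name.
rss_total_kb() {
    awk 'NF >= 12 && $1 ~ /^[0-9]+$/ && $6 ~ /^[0-9.]+M?$/ {
        # Only process rows start with a numeric PID and have 12+ fields,
        # which skips the uptime, CPU, Mem and Swap header lines.
        rss = $6
        mult = 1
        if (rss ~ /M$/) { mult = 1024; sub(/M$/, "", rss) }
        total += rss * mult
    }
    END { printf "%d\n", total }'
}
```

Paste the listing into a file and run `rss_total_kb < listing.txt` (the file name is just an example).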

unaffectedoddball's picture

The results seem different today. (I used vztop -C -E CTID to keep the CPUs from scrolling off the screen):

 11:07:06  up 102 days, 19:37,  1 user,  load average: 0.05, 0.05, 0.07
15 processes: 15 sleeping, 0 running, 0 zombie, 0 stopped
CPU states:  25.6% user   9.6% system   0.0% nice   0.0% iowait 3158.4% idle
Mem:  65921960k av, 62416528k used, 3505432k free,       0k shrd,  157428k buff
      9200924k active,            51424652k inactive
Swap: 36438008k av,   32600k used, 36405408k free                 50094488k cached

  PID USER     PRI  NI  SIZE  RSS SHARE  VEID STAT %CPU %MEM   TIME CPU COMMAND
147266 root      20   0 10600  756   704   110 S     0.0  0.0   0:04  22 init
147267 root      20   0     0    0     0   110 SW    0.0  0.0   0:00   0 kthreadd/110
147268 root      20   0     0    0     0   110 SW    0.0  0.0   0:00   0 khelper/110
149802 ais       20   0 29608 1132  1124   110 S     0.0  0.0   0:00  17 shellinaboxd
149803 root      20   0  4048  436   432   110 S     0.0  0.0   0:00  17 startpar
149829 root      20   0  4068  488   484   110 S     0.0  0.0   0:00  17 acpid
149875 root      20   0 18832  824   740   110 S     0.0  0.0   0:05  14 cron
149928 www-data  20   0 56684  760   628   110 S     0.0  0.0   0:08   4 lighttpd
149940 www-data  20   0 96492  336   332   110 S     0.0  0.0   0:00  17 php-cgi
149951 sshd      20   0 99012 8660  8536   110 S     0.0  0.0   0:03   0 postgres
149981 root      20   0 49804  608   600   110 S     0.0  0.0   0:00  17 sshd
150004 sshd      20   0 98996 1136  1000   110 S     0.0  0.0   0:32  25 postgres
150005 sshd      20   0 98996  604   536   110 S     0.0  0.0   0:32  25 postgres
150100 root      20   0 14528  852   696   110 S     0.0  0.0   0:00  17 getty

While executing vzctl enter [CTID] in another shell:

 18:48:42  up 103 days,  3:18,  2 users,  load average: 0.07, 0.08, 0.18
19 processes: 19 sleeping, 0 running, 0 zombie, 0 stopped
CPU states:   9.6% user   9.6% system   0.0% nice   0.0% iowait 3174.4% idle
Mem:  65921960k av, 65577508k used,  344452k free,       0k shrd,  157488k buff
      12397824k active,            51328544k inactive
Swap: 36438008k av,   32512k used, 36405496k free                 50335372k cached

  PID USER     PRI  NI  SIZE  RSS SHARE  VEID STAT %CPU %MEM   TIME CPU COMMAND
147266 root      20   0 10600  756   704   110 S     0.0  0.0   0:04   0 init
147267 root      20   0     0    0     0   110 SW    0.0  0.0   0:00   0 kthreadd/110
147268 root      20   0     0    0     0   110 SW    0.0  0.0   0:00   0 khelper/110
149802 ais       20   0 29608 1132  1124   110 S     0.0  0.0   0:00  17 shellinaboxd
149803 root      20   0  4048  436   432   110 S     0.0  0.0   0:00  17 startpar
149829 root      20   0  4068  488   484   110 S     0.0  0.0   0:00  17 acpid
149875 root      20   0 18832  824   740   110 S     0.0  0.0   0:06  16 cron
149928 www-data  20   0 56684  760   628   110 S     0.0  0.0   0:08  27 lighttpd
149940 www-data  20   0 96492  336   332   110 S     0.0  0.0   0:00  17 php-cgi
149951 sshd      20   0 99012 8672  8536   110 S     0.0  0.0   0:04  26 postgres
149981 root      20   0 49804  608   600   110 S     0.0  0.0   0:00  17 sshd
150100 root      20   0 14528  852   696   110 S     0.0  0.0   0:00  17 getty
154380 root      20   0 25560  668   452   110 S     0.0  0.0   0:00   0 vzctl
154381 root      20   0 19768 3984  1544   110 S     0.0  0.0   0:00   0 bash
176798 sshd      20   0 98.6M 6148  4212   110 S     0.0  0.0   0:00   2 postgres
177334 sshd      20   0 98.1M 5812  3964   110 S     0.0  0.0   0:00   2 postgres
177338 sshd      20   0 98.8M 6112  4044   110 S     0.0  0.0   0:00   2 postgres
177339 sshd      20   0 98.7M 5956  4056   110 S     0.0  0.0   0:00   2 postgres

Do you know how to change the sort column in vztop? Hitting ‘?’ or ‘h’ indicates that ‘o’ and ‘O’ change the sort, but it just dumps me to a list of columns and ends the program.
