host:
pveversion
pve-manager/7.4-3/9002ab8a (running kernel: 5.15.104-1-pve)

turnkey template:
nextcloud 17.2.1

The container is unprivileged and has nesting enabled, both at creation time and at runtime.

When logging in for the first time using the console feature of the Proxmox GUI, the first init sequence starts and I am asked to enter a password for Adminer.

As soon as I hit Enter after typing the password, the screen freezes. Only hitting Backspace will show the following:

	| Traceback (most recent call last):   File                │
	│ "/usr/lib/python3/dist-packages/libinithooks/dialog_wrap │  
	│ per.py", line 82, in wrapper     retcode = method("      │  
	│ " + text, *args, **kws)   File                           │  
	│ "/usr/lib/python3/dist-packages/dialog.py", line 3130,   │  
	│ in passwordbox     return                                │  
	│ self._widget_with_string_output(   File                  │  
	│ "/usr/lib/python3/dist-packages/dialog.py", line 1719,   │  
	│ in _widget_with_string_output     code, output =         │  
	│ self._perform(args, **kwargs)   File                     │  
	│ "/usr/lib/python3/dist-packages/dialog.py", line 1518,   │  
	│ in _perform     exit_code, output =                      │  
	│ self._handle_program_exit(child_pid,   File              │  
	│ "/usr/lib/python3/dist-packages/dialog.py", line 1484,   │  
	│ in _handle_program_exit                                  │  
	│ self._wait_for_program_termination(child_pid,   File     │  
	│ "/usr/lib/python3/dist-packages/dialog.py", line 1430,   │  
	│ in _wait_for_program_termination     raise DialogError(  │  
	│ dialog.DialogError: dialog-like terminated due to an     │  
	│ error: the dialog-like program exited with status 3      │  
	│ (which was passed to it as the DIALOG_ERROR environment  │  
	│ variable). Sometimes, the reason is simply that dialog   │  
	│ was given a height or width parameter that is too big    │  
	│ for the terminal in use. Its output, with leading and    │  
	│ trailing whitespace stripped, was:                       |

I have been troubled by something similar in the past, and the conclusion then was to enable nesting. But the fact that I have it enabled this time around and am still running into this wall suggests there is more wrong with my setup than is obvious at first glance.

I have enough resources allocated to the container so that should not be a bottleneck.

Any suggestions on how to proceed in determining the root cause would be much welcomed ;)

Jeremy Davis's picture

Firstly, apologies for slow response. I've been away and am still catching up...

The error message (Python stacktrace) relates to the "dialog" program crashing, most likely because it (automatically) resized to an invalid value. So I'm fairly sure that it is unrelated to the underlying problem.
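
For anyone wanting to rule out the terminal size mentioned in the traceback, here is a minimal sketch of a size check. The helper name is hypothetical and the 24x80 minimum is only an assumption (a conventional terminal size), not a documented requirement of dialog or the inithooks.

```shell
# Hypothetical helper: succeed only if a terminal of ROWS x COLS meets an
# assumed 24x80 minimum (a conventional size, not a documented dialog limit).
term_big_enough() {
    rows=$1
    cols=$2
    [ "$rows" -ge 24 ] && [ "$cols" -ge 80 ]
}

# Usage inside the container console; `stty size` prints "ROWS COLS":
#   set -- $(stty size)
#   term_big_enough "$1" "$2" || echo "terminal reports only ${1}x${2}"
```

A console that reports a zero or very small size would fail this check, which would be consistent with dialog refusing to draw.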

I just tried to recreate your issue locally (applied all available PVE updates and downloaded a fresh copy of the TurnKey Nextcloud 17.2.1 template first). I just used the defaults and it "just works" for me?! FWIW (after updates and before my test) I have the same Proxmox version as you:

root@pve ~# pveversion
pve-manager/7.4-3/9002ab8a (running kernel: 5.15.107-1-pve)

I do have a newer kernel version than you. Whilst I wouldn't expect the kernel version to cause this, perhaps it's worth updating just in case it's some weird edge case bug?

Also, out of interest, here is the config for my container:

root@pve ~# cat /etc/pve/local/lxc/127.conf
arch: amd64
cores: 2
features: nesting=1
hostname: JED-TEST-nextcloud
memory: 1024
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.1.1,hwaddr=36:B3:5A:A9:06:E7,ip=192.168.1.127/24,type=veth
ostype: debian
rootfs: local-lvm:vm-127-disk-0,size=8G
swap: 512
unprivileged: 1

You can find the same thing by catting /etc/pve/local/lxc/VM_ID.conf - where VM_ID is the VM ID number (i.e. mine is 127).

As you can see, mine is both nested and unprivileged.
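
To double check those two settings without reading the whole file, something like the following sketch could be run on the PVE host. The helper name is hypothetical, and it only looks at plain "features:" and "unprivileged:" lines like the ones in the config above (it makes no attempt to tell snapshot sections apart).

```shell
# Hypothetical helper: report whether a Proxmox LXC config file enables
# nesting and marks the container as unprivileged.
check_lxc_conf() {
    conf=$1
    if grep -Eq '^features:.*nesting=1' "$conf"; then
        echo "nesting: enabled"
    else
        echo "nesting: disabled"
    fi
    if grep -Eq '^unprivileged:[[:space:]]*1[[:space:]]*$' "$conf"; then
        echo "unprivileged: yes"
    else
        echo "unprivileged: no"
    fi
}

# Usage on the PVE host (replace VM_ID with your container's ID):
#   check_lxc_conf /etc/pve/local/lxc/VM_ID.conf
```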

Looking back at your past posts, this seems very similar to the previous issues you have reported. The specific issue you've reported here (hanging when setting the Adminer user password) is consistent with MySQL (actually MariaDB) not running. TBH, unless you made a mistake (and it's not actually nested when you thought it was), I'm stumped on what could be causing this issue for you.

To try to troubleshoot, you could try getting into the container from the PVE host via the CLI. I.e.:

pct enter VM_ID

Again, where VM_ID is the VM ID of your container. Then check to see if my guess is right (that MariaDB isn't running) via systemctl. FWIW here's all the output of those commands on my system:

root@pve ~# pct enter 127
root@JED-TEST-nextcloud ~# systemctl status mariadb
* mariadb.service - MariaDB 10.5.18 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2023-05-03 00:57:38 UTC; 30min ago
       Docs: man:mariadbd(8)
             https://mariadb.com/kb/en/library/systemd/
   Main PID: 393 (mariadbd)
     Status: "Taking your SQL requests now..."
      Tasks: 8 (limit: 38418)
     Memory: 121.9M
        CPU: 1.574s
     CGroup: /system.slice/mariadb.service
             `-393 /usr/sbin/mariadbd

May 03 00:57:38 JED-TEST-nextcloud mariadbd[393]: 2023-05-03  0:57:38 0 [Note] Server socket created on IP: '127.0.0.1'.
May 03 00:57:38 JED-TEST-nextcloud mariadbd[393]: 2023-05-03  0:57:38 0 [Note] Reading of all Master_info entries succeeded
May 03 00:57:38 JED-TEST-nextcloud mariadbd[393]: 2023-05-03  0:57:38 0 [Note] Added new Master_info '' to hash table
May 03 00:57:38 JED-TEST-nextcloud mariadbd[393]: 2023-05-03  0:57:38 0 [Note] /usr/sbin/mariadbd: ready for connections.
May 03 00:57:38 JED-TEST-nextcloud mariadbd[393]: Version: '10.5.18-MariaDB-0+deb11u1'  socket: '/run/mysqld/mysqld.sock'  port: 3306  Debian 11
May 03 00:57:38 JED-TEST-nextcloud systemd[1]: Started MariaDB 10.5.18 database server.
May 03 00:57:38 JED-TEST-nextcloud mariadbd[393]: 2023-05-03  0:57:38 0 [Note] InnoDB: Buffer pool(s) load completed at 230503  0:57:38
May 03 00:57:38 JED-TEST-nextcloud /etc/mysql/debian-start[420]: Upgrading MySQL tables if necessary.
May 03 00:57:38 JED-TEST-nextcloud /etc/mysql/debian-start[455]: Checking for insecure root accounts.
May 03 00:57:38 JED-TEST-nextcloud /etc/mysql/debian-start[465]: Triggering myisam-recover for all MyISAM tables and aria-recover for all Aria tables

If the "Active:" line of your output is anything other than "active (running)", then please share the output of:

journalctl -u mariadb

Plus also the output of the container config as I noted above.
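
The two checks above can be combined into one step. This is only a sketch with a hypothetical function name; it relies on `systemctl is-active` and `journalctl -u`, and the 50-line limit is an arbitrary choice.

```shell
# Hypothetical helper: print the MariaDB unit state and, if it is anything
# other than "active", the most recent journal lines for the unit.
report_mariadb() {
    # is-active exits non-zero when the unit is not active, hence "|| true"
    state=$(systemctl is-active mariadb) || true
    echo "mariadb is: $state"
    if [ "$state" != "active" ]; then
        journalctl -u mariadb --no-pager -n 50
    fi
}

# Usage inside the container (after `pct enter VM_ID`):
#   report_mariadb
```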

I finally found some time to upgrade the host and I am ending up in the same situation I am afraid.

I have some more time now to try a vanilla Debian container and see how that goes.

#pveversion
pve-manager/7.4-3/9002ab8a (running kernel: 5.15.107-1-pve)

#cat /etc/pve/local/lxc/301.conf
arch: amd64
cores: 4
features: nesting=1
hostname: nextcloud2
memory: 2048
net0: name=eth0,bridge=vmbr0,hwaddr=C6:B5:CB:95:5B:2B,ip=dhcp,ip6=dhcp,tag=5,type=veth
ostype: debian
rootfs: local-zfs:subvol-301-disk-0,size=8G
swap: 512
unprivileged: 1

#on first boot / init of container
The symptom and error message are similar to my first post.

#systemctl status mariadb


mariadb.service - MariaDB 10.5.18 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2023-05-05 07:44:04 UTC; 2min 5s ago
       Docs: man:mariadbd(8)
             https://mariadb.com/kb/en/library/systemd/
    Process: 248 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
    Process: 266 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 289 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, sta>
    Process: 663 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 667 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
   Main PID: 432 (mariadbd)
     Status: "Taking your SQL requests now..."
      Tasks: 9 (limit: 38274)
     Memory: 87.6M
        CPU: 283ms
     CGroup: /system.slice/mariadb.service
             └─432 /usr/sbin/mariadbd

May 05 07:44:04 nextcloud2 mariadbd[432]: 2023-05-05  7:44:04 0 [Note] /usr/sbin/mariadbd: ready for connections.
May 05 07:44:04 nextcloud2 mariadbd[432]: Version: '10.5.18-MariaDB-0+deb11u1'  socket: '/run/mysqld/mysqld.sock'  port: 3306  Debian 11
May 05 07:44:04 nextcloud2 systemd[1]: Started MariaDB 10.5.18 database server.
May 05 07:44:04 nextcloud2 /etc/mysql/debian-start[675]: Looking for 'mariadb' as: /usr/bin/mariadb
May 05 07:44:04 nextcloud2 /etc/mysql/debian-start[675]: Looking for 'mariadb-check' as: /usr/bin/mariadb-check
May 05 07:44:04 nextcloud2 /etc/mysql/debian-start[675]: This installation of MariaDB is already upgraded to 10.5.18-MariaDB.
May 05 07:44:04 nextcloud2 /etc/mysql/debian-start[675]: There is no need to run mysql_upgrade again for 10.5.18-MariaDB.
May 05 07:44:04 nextcloud2 /etc/mysql/debian-start[675]: You can use --force if you still want to run mysql_upgrade
May 05 07:44:04 nextcloud2 /etc/mysql/debian-start[705]: Checking for insecure root accounts.
May 05 07:44:04 nextcloud2 /etc/mysql/debian-start[715]: Triggering myisam-recover for all MyISAM tables and aria-recover for all Aria tables

Jeremy Davis's picture

Deep apologies for my super slow turnaround on this. I've had a (non-TurnKey related) crisis to deal with and I've really only been able to keep up with paid support and a bit of dev work behind the scenes to unblock some other team members. But I'm back at the desk now and hopefully should be back on top of things within the next day or 2.

Anyway, the weirdest part is that the most recent output you've shared looks like it's running fine!? (Albeit it's only been running about 3 minutes).

I wonder if it's a ZFS issue? Perhaps I'm getting confused, but I recall someone else having issues with our templates on PVE (which I also couldn't reproduce) was also using ZFS? Perhaps there is something weird going on there?

Also, if you want to test TKL against vanilla Debian, be sure to install MariaDB etc too.

johenkel's picture

I don't think it is a zfs or db issue.  
I just tried to install v 18.0.1 and ran into the same issue.

Canceled the script, did all updates and ran it again. Same error.

https://imgur.com/a/t0YEI9P

Jeremy Davis's picture

Thanks for reporting.

Assuming that it occurred for you when it asks for a password, but not on any of the previous questions, then it clearly seems to be a reproduction of the OP's issue. So whilst I can't reproduce it, it seems highly likely that, at least under some circumstances, there is a specific issue here. As such, I've opened a bug on our issue tracker.

I wish I had thought of it earlier, but I do (likely) have a workaround. It may even help me get an understanding of the issue too? Instead of doing it interactively, please try setting the password non-interactively. You should be able to do that like this:

/usr/lib/inithooks/bin/nextcloud.py --domain='YOUR_DOMAIN' --pass='YOUR_PASSWORD'

Where YOUR_DOMAIN is the domain you wish to use and YOUR_PASSWORD is the password you wish to use.

Hopefully that should work around the issue. If you still get an error message (any error message, even if it makes it clear to you what the error is and allows you to resolve it), please share it.
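
If you would rather not type the password directly on the command line, here is a rough sketch of a wrapper that reads it from stdin first. The wrapper name is hypothetical, and it assumes the hook takes the domain and password as `--domain=` and `--pass=` long options; please treat that as an assumption and check the hook's help output.

```shell
# Hypothetical wrapper: read the password from stdin (without echo when stdin
# is a terminal) and hand it to the inithook non-interactively.
# ASSUMPTION: the hook accepts --domain= and --pass= long options.
run_nextcloud_hook() {
    hook=$1
    domain=$2
    if [ -t 0 ]; then stty -echo; fi
    IFS= read -r password
    if [ -t 0 ]; then stty echo; echo; fi
    "$hook" --domain="$domain" --pass="$password"
}

# Usage inside the container (type the password, then press Enter):
#   run_nextcloud_hook /usr/lib/inithooks/bin/nextcloud.py YOUR_DOMAIN
```

The password is still visible in the process list while the hook runs, so this only keeps it off the screen and out of the shell history.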

johenkel's picture

Thank you for replying. The error message popped up after entering the MySQL password and the admin password. This time I canceled the script after that and used your above code. After your code, the window for an admin password popped back up again. After entering it, the previous error message came back up. No worries on my behalf. I will go and install nextcloud using a different route. Cheers.
Jeremy Davis's picture

Thanks for posting back with the additional info.

"After your code, the window for an admin password popped back up again."

Hmm, that window shouldn't show when used with the switches like I gave you. Something seriously off here...

"No worries on my behalf. I will go and install nextcloud using a different route. Cheers."

I completely understand. Thanks anyway for trying...

Regardless, I'll have a closer look ASAP.

BobbyJ1's picture

I am currently testing ISO 16.1, 17 and now 18.0.

I am not sure why, but running them on the same HyperV server, 16.1 is much snappier in the interface. ISO 18.0 (Nextcloud 27.xx) is around 3 times slower to change screens: about 3.5 seconds vs 1 second or less for 16.1. 17 is a little slower than 16.

Nextcloud 23.0.12 (from the TurnKey 16.1 ISO) is nice, and I am wondering how secure it is if not using all the features. It is mostly just for zip and PDF file uploads with under 10 users.

18.0 is so slow; is it the newer version of PHP or the newer version of Debian? Are there any tweaks to make the interface faster, or is that a Nextcloud 27.xx and 28.xx issue that has nothing to do with TurnKey's chosen default settings for the ISO?

Sorry to post it here, but for some reason I am unable to create a new thread in the message base; I can only add to an existing topic.

Jeremy Davis's picture

I too have noticed that all of the servers have incrementally become more laggy and resource intensive over the years. Although I wouldn't have expected that much difference.

During the v18.0 smoke testing I didn't notice it to be significantly worse. Having said that, I'm not a regular user of Nextcloud and it's been a while since I ran a v17.x server (and longer since I ran v16.x), so you would be in a better place to judge that, especially if you have them all running side by side.

It may well be that the underlying OS is causing part of the delays and newer Nextcloud adding more overhead? There is also newer Apache and MySQL (actually MariaDB these days). AFAIK, newer versions of PHP should give overall performance improvements. In my experience though, the performance improvements are at the cost of higher resource requirements.

Having said that, whilst there should be overall PHP performance improvements, that only applies when comparing the same software & version running on different PHP versions. It's quite possible that newer Nextcloud versions are heavier/more resource intensive, which chews up the PHP performance gains. I still wouldn't expect that alone to cause such a radical difference in performance though. Beyond the likely higher resource requirements of newer PHP, there is also some improved security stuff configured in Apache, so perhaps that adds to the overhead too?

Perhaps it's worth first trying to add more resources? I.e. another vCPU (if possible) and/or some extra RAM. Another thing that might be worth a try is installing the "cloud" kernel (package name: 'linux-image-cloud-amd64'). AFAIK that should work on HyperV (it strips out most of the hardware support and just leaves the basics that most hypervisors should require). Although I wouldn't recommend removing the current kernel until you've rebooted into the new one to confirm it at least boots.
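
As a rough sketch of that kernel swap, the commented commands below install the package and the helper confirms which kernel is running after the reboot. The helper name is hypothetical, and it assumes Debian's usual naming where cloud kernel releases contain "-cloud-" (e.g. "6.1.0-13-cloud-amd64").

```shell
# Hypothetical helper: succeed if a kernel release string looks like a Debian
# "cloud" kernel; defaults to the running kernel reported by `uname -r`.
is_cloud_kernel() {
    release=${1:-$(uname -r)}
    case $release in
        *-cloud-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the VM, as root:
#   apt update && apt install linux-image-cloud-amd64
#   reboot
#   is_cloud_kernel && echo "running the cloud kernel: $(uname -r)"
```

Only consider removing the old kernel package once this check passes on the rebooted system.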

There may also be options to tune other factors. I haven't tried tuning Apache much, but I know that MySQL (MariaDB) can be tuned when you are targeting specific hardware provision and use cases. We ship with (mostly) Debian defaults which we consider pretty sane for most use cases, but you may be able to squeeze a bit more out?

Nextcloud themselves may also provide some tuning options. And there may also be some PHP config which might help (e.g. cache size and other settings, memory usage, etc).

Finally, maybe there is some HyperV tuning that could improve things? TBH I've never used it, so wouldn't know if that's a possibility or even where to start...

Regarding continuing use of the old version, TBH I can't really say how secure that may or may not be. Ultimately, it really depends on how you access it. If it's only ever accessed within a LAN and/or via VPN (i.e. not publicly available) then that should not be a problem (a malicious actor would need to gain access to your LAN before being able to do anything with Nextcloud).

If your server is publicly available, then the risks are going to be higher. If it's the same 10 users and they all have static IPs, then an easy way to minimize risks would be to use a firewall to limit external connections to those IPs. Even if they don't have static IPs, most ISPs will own/control a specific IP range, so you could at least cut down access to specific IP ranges. Or like I hinted above, use a VPN to access your Nextcloud within a LAN.
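
To illustrate the firewall idea, here is a sketch that only prints iptables rules for review rather than applying them. The helper name is hypothetical, the addresses are documentation-only examples, and limiting just port 443 is an assumption (any other exposed ports would need the same treatment).

```shell
# Hypothetical helper: print (not run) iptables rules that accept HTTPS only
# from the given source addresses or CIDR ranges, then drop everything else.
print_https_allowlist() {
    for src in "$@"; do
        echo "iptables -A INPUT -p tcp --dport 443 -s $src -j ACCEPT"
    done
    echo "iptables -A INPUT -p tcp --dport 443 -j DROP"
}

# Usage (203.0.113.10 and 198.51.100.0/24 are placeholder addresses):
#   print_https_allowlist 203.0.113.10 198.51.100.0/24
```

Printing the rules first makes it easier to spot a mistake before locking yourself out.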

Sorry for the essay...!

BobbyJ1's picture

Thanks Jeremy, I don't mind the long response; I appreciate it.

I think it could be version 28.xx of Nextcloud. I think I was running 25.xx (fast). Many people were complaining about the user interface lag; some had to go to a different product.

So it might not be Debian or TurnKey. It could be that the code has a few bottlenecks. I was hoping someone had found tweaks to get the performance back. I could add more RAM, but I was trying to keep all parameters the same: 2GB RAM (it's Debian, that should be plenty) and 2 CPUs. I run Pi-hole on a Raspberry Pi with a very snappy interface. Sort of an apples and oranges comparison, sorry :)

Something is up. I am on the border of not wanting to use it, and I spent some time configuring it and customizing the graphics etc.

Probably something I can't fix without some direction from the programming team. I wish they would pause visual features and spend a cycle just optimizing the code rather than inserting new features.

Someone broke something, but I am not sure where. For now I'll stick with the less secure but faster 16.1. 17.1 is decent too. I can give 18.0 more RAM and see. What are the requirements on 18? Is 2GB not enough?

Jeremy Davis's picture

I would expect 2GB to be plenty for most TurnKey appliances. There are a few that explicitly require more, but I wouldn't expect Nextcloud to be one of them. It might be worth testing, but with only 10 users, even adding more would likely only provide minimal improvements. Having said that, IMO more is always better and it might be worth double checking.

The Nextcloud docs note that the explicit recommended minimum requirement is only 512MB RAM. Although I also note that the memory recommendations haven't been updated for ~6 years. And I'm almost certain that the RAM usage of Nextcloud would have increased in that timeframe!

I note that they also have a page on server tuning. The general advice on reducing server load might be worth a try (at least check RAM and CPU usage). If it's running out of RAM and swap is enabled in the server (not by default in TurnKey IIRC), then that may be a source of slowdowns. Although another thing worth being aware of is that Windows has a similar system (i.e. the pagefile). IIRC from Win10 onwards, it tries to fill RAM before it starts paging. Even if it is paging, ensuring that the pagefile is on an SSD will give an improvement. Having said that, if that was an issue, then I would expect the issue to be consistent across VMs, not specific to one.

The default logging should be fine, but probably worth double checking. Perhaps that can even be reduced to a lower level than default? (Don't forget to change it back to higher levels if/when you have issues to debug).

Web server compression via HTTPS is disabled by default, but that is because AFAIK that has security implications. Regardless, it may be worth enabling just to see if there is any noticeable difference? Enabling HTTP/2 should definitely provide a performance boost, but supporting that requires reconfiguring Apache to use php-fpm (by default we use Apache's mod_php). That will require a fair bit of mucking around.

Under specific circumstances, swapping the web server to Nginx (which requires php-fpm) might give a performance improvement, but benchmarking we've done suggests that under most circumstances that won't be noticeable, except under particularly high load (e.g. lots of users logging in simultaneously). Even then, with sufficient RAM and CPU resources, that won't make much difference.
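
For the RAM/swap check inside the Linux VM, a minimal sketch follows. The helper name is hypothetical; it reads the standard SwapTotal and SwapFree fields from /proc/meminfo and prints the difference in kB.

```shell
# Hypothetical helper: print swap currently in use (kB) from a meminfo-style
# file, defaulting to /proc/meminfo.
swap_used_kb() {
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END {print total - free}' "${1:-/proc/meminfo}"
}

# Usage inside the VM while clicking around the slow interface:
#   swap_used_kb
#   free -m
```

A value that keeps growing under light use would suggest the VM is short on RAM.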

I was going to suggest that using some sort of memory caching might help. But then I realized that we've included Redis by default since TurnKey v15.2. If you have sufficient memory (i.e. aren't already getting close to running out and/or can add more), it may be worth configuring APCu as well. Another caching mechanism that can help is PHP OPcache - although I'm almost certain that that is configured by default.

Good luck with it all and if you find something/anything that significantly improves performance of newer Nextcloud versions, then I'd love to hear.

johenkel's picture

Oh sorry, I forgot to add the screenshot ( https://imgur.com/a/fdeCnGa )
Jeremy Davis's picture

Thanks
