Domhnall Currie's picture

Hello Jeremy!  Long time no talk to.  Hope you're doing well.  Myself, I've been up to my neck in alligators for as long as I can remember now.  :0 

I just tried to install v18 of Redmine because I can't get my reports to work like I want them to on my old version. (That's been an ongoing issue for years, not a TKL issue.)  So I thought I'd install a fresh VM, screw around with some test data and modifying the .rb files or whatnot to see if I could get my reports like I wanted, and then I'd migrate my data to the new system.  I'm having a hard time getting v18 to work however, as it looks like MariaDB is refusing to start.  I install like I've always done and everything appears to work, but RM won't come up.  The only thing I could find based on the error msgs was the possibility of ib_logfile0 not being there after Maria not being shut down properly and then upgrading to a new version, but with a fresh install not sure that's applicable. 

Anyway, just thought I'd check to see if anyone else had seen this problem before I continue to beat my head against the wall.  I've installed this as a VM on Proxmox 8.1.4 with 8GB RAM and a 100GB storage space.  All the settings are pretty generic like I always do and I just installed a clean RM v17.1 VM to make sure I hadn't changed something that the previous version didn't like.  My old version 17.1 is running fine and the new VM came up with the fresh install and is running fine, so I'm not sure what I'm doing wrong with 18.0.  Anybody have any idea, give me a holler! :) 

 

Apr 30 15:31:27 rm-new.fortyhourday.com systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [ERROR] Aborting
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [ERROR] Unknown/unsupported storage engine: InnoDB
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Warning] 'innodb-large-prefix' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Warning] 'innodb-file-format' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] Plugin 'FEEDBACK' is disabled.
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] InnoDB: Starting shutdown...
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [ERROR] InnoDB: File ./ib_logfile0 was not found
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] InnoDB: Completed initialization of buffer pool
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] InnoDB: Using liburing
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] InnoDB: Using SSE4.2 crc32 instructions
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] InnoDB: Number of transaction pools: 1
Apr 30 15:31:27 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:27 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
Apr 30 15:31:26 rm-new.fortyhourday.com mariadbd[8049]: 2024-04-30 15:31:26 0 [Note] Starting MariaDB 10.11.6-MariaDB-0+deb12u1 source revision  as process 8049
Jeremy Davis's picture

All in all I'm doing ok, although the v18.x release is taking longer than I'd like as I have a huge backlog of stuff I really want to get onto! Regardless, it sounds like I'm doing better than you! :/


Whilst the message "File ./ib_logfile0 was not found" seems relevant, IMO it shouldn't actually be an issue. AFAIK the ib_logfileX files are where MariaDB stores historical DB changes so that if something bad happens, your DB can be restored - including DB changes occurring immediately prior to the crash - which may not have yet been written to the DB. I'm not sure, but the fact that it's looking for them suggests to me that it may have previously crashed or not been shut down cleanly prior to this issue?

It's likely cold comfort for you, but FYI I just launched a fresh v18.0 Redmine; it doesn't have any ib_logfileX files and MariaDB is starting fine:

Apr 30 23:06:39 JED-TEST-REDMINE-180 systemd[1]: Starting mariadb.service - MariaDB 10.11.6 database server...
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] Starting MariaDB 10.11.6-MariaDB-0+deb12u1 source revision  as process 397
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Number of transaction pools: 1
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Using liburing
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Completed initialization of buffer pool
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: End of log at LSN=1841905
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: 128 rollback segments are active.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: log sequence number 1841905; transaction id 3242
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] Plugin 'FEEDBACK' is disabled.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Warning] 'innodb-file-format' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Warning] 'innodb-large-prefix' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Warning] You need to use --log-bin to make --expire-logs-days or --binlog-expire-logs-seconds work.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] Server socket created on IP: '127.0.0.1'.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] /usr/sbin/mariadbd: ready for connections.
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: Version: '10.11.6-MariaDB-0+deb12u1'  socket: '/run/mysqld/mysqld.sock'  port: 3306  Debian 12
Apr 30 23:06:41 JED-TEST-REDMINE-180 mariadbd[397]: 2024-04-30 23:06:41 0 [Note] InnoDB: Buffer pool(s) load completed at 240430 23:06:41
Apr 30 23:06:42 JED-TEST-REDMINE-180 systemd[1]: Started mariadb.service - MariaDB 10.11.6 database server.

I was able to recreate the issue you hit, by starting MariaDB manually and then killing it (i.e. not cleanly), reinforcing my suspicion that your DB did not close cleanly prior to this issue.
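
When comparing a working system with a failing one, it may help to see which InnoDB files are actually present in the data directory. A minimal sketch, assuming the Debian default data directory /var/lib/mysql; the function name list_innodb_files is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the names of the InnoDB system files (ib*)
# found in a MariaDB data directory. Defaults to /var/lib/mysql (an
# assumption based on the Debian default); pass another path as $1.
list_innodb_files() {
    datadir="${1:-/var/lib/mysql}"
    for f in "$datadir"/ib*; do
        # Skip the unexpanded glob when nothing matches
        [ -e "$f" ] || continue
        basename "$f"
    done
}
```

Running `list_innodb_files` as root on both machines and comparing the output should show whether ib_logfile0 really is the only difference.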

I have a (perhaps dirty?) fix/workaround. It may not include recent changes, but I'm not sure what else there is to do. It assumes that the only DB you want to keep is 'redmine_production' (which should be the only one required - unless you've created some custom DBs). If you want others, please rerun the "mariadb-dump" line with "redmine_production" replaced with the name of the desired DB.

# manually start mariadb ignoring errors
mariadbd-safe --innodb-force-recovery=6 &
# dump the redmine production DB
mariadb-dump redmine_production > redmine_production.sql
# kill the emergency mariadb process
kill $(pgrep mariadb)
# double check it's not running
pgrep mariadb
# if that gives NO result then you're good to continue. If it outputs a number, rerun the pgrep command until it gives no output
# move the broken DB out of the way
mv /var/lib/mysql /var/lib/mysql.bak
# reinstall mariadb
apt install --reinstall mariadb-server
# it should now start
systemctl start mariadb
# double check it's running
systemctl status mariadb
# recreate redmine DB and load from dump
mariadb-admin create redmine_production
mariadb redmine_production < redmine_production.sql

Hopefully Redmine should be working again!
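
The "rerun pgrep until nothing is left" step can be scripted rather than repeated by hand. A rough sketch; wait_for_exit is a made-up helper name and the default of 30 tries (roughly 30 seconds) is arbitrary:

```shell
#!/bin/sh
# Hypothetical helper: poll pgrep once a second until no process whose
# name matches $1 is left, giving up after $2 tries (default 30).
# Returns 0 once nothing matches, 1 if the process is still around.
wait_for_exit() {
    name="$1"
    tries="${2:-30}"
    while pgrep "$name" >/dev/null 2>&1; do
        tries=$((tries - 1))
        [ "$tries" -le 0 ] && return 1
        sleep 1
    done
    return 0
}
```

For example, `kill $(pgrep mariadb); wait_for_exit mariadb && mv /var/lib/mysql /var/lib/mysql.bak` would only move the data directory once the emergency process has gone.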


If not, then TBH, I'm not sure. Perhaps if you answer these questions I might think of something?:

  • Is this a "clean" v18.0, or have you restored your old data (and/or database) from your old instance?
  • If it's from data from an older TKL instance, what TKL version did it come from?
  • If from an older TKL instance was it a TKLBAM restore or a manual one?
  • If TKLBAM, was it a complete restore or a partial one?
  • Regardless, if a manual restore or partial TKLBAM, what files/directories did you restore?
  • Regardless, has this server crashed and/or was MariaDB not shut down cleanly?
Domhnall Currie's picture

This is a brand new install.  I got the ISO link from the TKL website and the 512 hash tag, hit "Download from URL" in Proxmox and imported it into my Proxmox ISO repository.  The hash checked good and I installed it from there as a virtual machine.  I'll try your "dirty fix" to see if that fixes my problem and post back in case anyone else has a similar issue going forward, otherwise I'll try another fresh install and pay close attention to any install settings or Proxmox settings that could possibly have any effects on things.  If I still can't get it I'll try it on a bare metal box.  Maybe I'm just not holding my mouth right!  That's probably a US expression, I don't know where it came from. :)  As always, thanks for your help!  Good luck finishing off v18!

Jeremy Davis's picture

Sorry I should have read your OP properly... You did already say it was v18.0! Next I was going to suggest that perhaps it got some minor corruption, but you've also ruled that out.

Different to you, I installed the v18.0 LXC template on Proxmox, although it should not make any fundamental difference, as we create the LXC templates from the ISO. So other than a few specific tweaks required for LXC, they should be essentially the same. FYI I downloaded the LXC template via the "Proxmox Templates" in the web UI. If there were issues, I would expect them to be more likely under LXC, not the other way around.

As you're running the latest Proxmox, that is another difference - I'm still running the older v7.x. But again, I would not expect that to cause any fundamental difference in this case. While newer versions of packages on the host may cause some different behavior in guests, if there were issues, I would expect more "show stopper" type ones. As I noted above, if issues did occur, they would be more likely to occur under LXC, rather than in a "proper" VM. As you're likely aware, LXC is tied much more closely to the host, where KVM provides full virtualisation of hardware. If a KVM related issue did occur, this seems like a very strange one.

Regardless, I'll aim to try again with the ISO later today and report back. I have intended to update Proxmox too, but I won't commit to getting that done in the next few days.

If you do get a chance, please try a fresh install to a new VM and take note of anything that strikes you as unexpected or weird, plus any error messages. Once it's installed, and first thing after you've rebooted into the system, please post the full journal output. It'll likely be quite long, so perhaps better to add it to your OP as an attachment (let me know if you do that, as it won't show up by default - dumb website bug that I've never worked out).

To export the full journal to a file:

journalctl --no-tail > full-journal.txt
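
Once exported, the MariaDB errors can be pulled out of that file with grep. A small sketch, assuming the log line format shown earlier in this thread (mariadbd[PID]: ... [ERROR] ...); mariadb_errors is a made-up name:

```shell
#!/bin/sh
# Hypothetical helper: print only the [ERROR] lines logged by mariadbd
# from a journal export such as full-journal.txt.
mariadb_errors() {
    grep -E 'mariadbd\[[0-9]+\]: .*\[ERROR\]' "$1"
}
```

Usage would be `mariadb_errors full-journal.txt`. The surrounding [Note] and [Warning] lines often matter too, so the full file is still the thing to post.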
Domhnall Currie's picture

I don't know Boss.  :(  I ran 

mariadbd-safe --innodb-force-recovery=6 &

and got

I fiddled around for a while looking at all the user access rights and what-not and can't reconcile between users root, mysql, admin & adminer.  Started all over with a fresh install.  When I install RM and reboot the system, Maria comes up running.  After I set all the RM parameters in the TKL system and access through Webmin, Maria is running.  After I do the system updates and reboot, Maria fails on boot.  This is print screen of Maria after fresh install:

 

I changed that user to root as per the TKL website notes for RM.  Maria came up and running:

 

This is running when I'm at this stage of the install:

After these updates are installed and the system reboots, when it comes back up Maria will no longer run.  The only thing I changed during the installation this time was changing Maria user adminer to root and that made Maria run, but after the updates are installed, Maria no longer runs.  I haven't changed the RM database or any other configuration options.  This is not a migration, it's a brand new install.  I guess I'm doing something that is stopping the installation from completing properly.  I'll fiddle around with it some more later and try to see what I'm doing wrong.

Jeremy Davis's picture

Thanks for the additional info. I have a suspicion what might be going on here.

Are you manually/explicitly setting a password for the MariaDB 'root' user?

TL;DR - if so, that's the issue! Please don't do that. Use the 'adminer' user (password should be set on firstboot) with Adminer/Webmin. Details/context below.


can't reconcile between users root, mysql, admin & adminer

I'm not sure, but I suspect that this may be the root cause of your issue (excuse the pun). These users have different contexts and are used in different scenarios, for different purposes. If I'm correct and understand the issue correctly, it's not installing the security updates that is the issue, but the reboot.

Let me explain a bit more:

  • 'root' is the name of both a Linux user and a mysql/mariadb user. Whilst they are named the same, they are separate users, used in specific contexts, i.e. logging into your Linux system or logging into mariadb respectively. Just to make it clear as mud, they are somewhat linked, as I'll explain further down.
  • 'mysql' is a Linux (system) user only. It's the limited user account that the mariadb service runs under. All the files that the mariadb service needs to write to need to be owned by the 'mysql' Linux user.
  • 'admin' is the default Redmine user. It is only for logging into the Redmine UI via your web browser.
  • 'adminer' is a "root-like" mariadb user - for logging into mariadb only. E.g. in Adminer or the Webmin mysql module.
  • Just to confuse things some more, there is also a 'vcs' Linux user. That one is used to interact directly with the underlying version control system (git or svn).

Now circling back to the 'root' Linux and MariaDB users; once upon a time (in a release long, long ago), the 'root' MariaDB user had a password set. So the 'root' MariaDB user could be used for both CLI interaction and within Webmin and Adminer.

However, some time ago, a better security system for accessing MariaDB became the default for the 'root' MariaDB user. This "new" authentication system is called "socket authentication" (incidentally PostgreSQL has used it forever, but let's not go there). Socket auth requires both a Linux user and a MariaDB user of the same name and somewhat links them for the purposes of MariaDB access. Any user can use socket auth to get access, but on TurnKey only the 'root' user is configured by default (inherited from Debian).

I won't go into the nuts and bolts of how "socket auth" works, but effectively it allows a logged in Linux user to access MariaDB without requiring a password. No other Linux users can access MariaDB by that user at all - unless running as that user (e.g. using sudo or su). Plus it also only works for local MariaDB access - e.g. via the CLI.

However, that security improvement could also be considered a downside of socket auth. Tools such as Webmin and Adminer (and any other third party access inc remote access except local CLI) can only use password auth. Enter the "root like" 'adminer' MariaDB user. That's a (MariaDB only) user that we create specifically to allow access via username/password (Webmin & Adminer).

One of the security disadvantages of username/password auth is that to start/stop the MariaDB service, a 'root-like' Linux/MariaDB system user pair is required. By requirement that means that a plain text password needs to be stored in a config file (in /etc). With socket auth, the MariaDB service can be stopped/started by 'root', with the service itself running as the limited 'mysql' Linux user.

My suspicion is that there is something you are doing along the line that is breaking our assumptions. Likely setting a password for the 'root' MariaDB user? But perhaps something else? You could revert to the old way of things ('root' using user/pass auth), but it would also require the additional steps to (re)create a 'root-like' Linux/MariaDB user pair - with the password saved somewhere.

If that sounds right, please let me know. Although if I have time, I intend to double check whether I can recreate the issue when installing as ISO.

Domhnall Currie's picture

Look man, I've been harassing you for help for a long time....  you should know by now I ain't steppin' outside the box.  LOL  If it doesn't ask me or tell me, I ain't messin' with it. :D  Ran through another install real quick and it doesn't mention Maria or Adminer or anything like that.  It doesn't tell me to leave anything blank.  It just steps me through like usual.  After it reboots and comes back up under the Webmin users it just shows root.  Under the system users among all the normal system users I just have root, mysql and vcs.  I don't have admin or adminer or anything else like that.  New install currently showing a socket error and adminer as the maria user looking at it through the Webmin Maria server configuration settings.  Let me know if you still need the log output and I'll get that to you either this evening or tomorrow morning.

Domhnall Currie's picture

Sitting in front of my fan cooling off in the woods and I thoroughly re-read your last post.  If I'm supposed to have an "adminer" user that's got to be it.  I don't have that user, but when I click the config gear in Webmin it shows adminer as the database user. Can I just create that user with no pw? Can't hurt, I don't guess, so I'll give that a shot when I get home and if it doesn't work I'll roll out another fresh install and shoot you that log.

 

Jeremy Davis's picture

You nailed it! It's the 'adminer' account - or more to the point the lack of one! Gross oversight on my behalf! That's what happens when I almost never use Webmin - except when people have issues.

Sorry that I didn't see it sooner! Not paying enough attention when going through the steps...

FWIW it doesn't actually matter what it's called, and it doesn't make much sense to call it 'adminer' - because the Redmine appliance doesn't have the Adminer DB webUI (just the Webmin MySQL module).

So, as it seems you were thinking, creating a new "root like" user is the way to go - although to use it with the Webmin MySQL module, you will need to set a password. As per above, I'd probably call it 'admin' or similar. Assuming a username of 'admin', on the CLI, run the below command. Replace DB_PASS with the desired password:

mysql --execute "GRANT ALL PRIVILEGES ON *.* to admin@localhost identified by 'DB_PASS'; FLUSH PRIVILEGES;"

Then use that user account in Webmin - NOT the root account. I.e. DO NOT set a MariaDB root user password. Then you should be golden. You should be able to access MariaDB via the Webmin module. If/when you use the CLI you can use either 'admin' (with a password) or 'root' without. Please let me know if that's not the case.
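
If the username or password needs to vary, the statement above can be wrapped in a small function. This is only a sketch: build_grant_sql is a made-up name, and the character checks are a cautious assumption to avoid breaking the SQL quoting, not something MariaDB itself requires:

```shell
#!/bin/sh
# Hypothetical helper: print the GRANT statement for a "root like"
# MariaDB user limited to localhost. It only builds the SQL text; pipe
# the output to the mariadb client (as Linux root) to apply it.
build_grant_sql() {
    user="$1"
    pass="$2"
    # Refuse anything that could break out of the quoted SQL strings
    case "$user" in
        ''|*[!A-Za-z0-9_]*)
            printf '%s\n' "invalid user name" >&2
            return 1 ;;
    esac
    case "$pass" in
        *"'"*|*'\'*)
            printf '%s\n' "password must not contain a single quote or backslash" >&2
            return 1 ;;
    esac
    printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'localhost' IDENTIFIED BY '%s'; FLUSH PRIVILEGES;\n" "$user" "$pass"
}
```

Something like `build_grant_sql admin 'DB_PASS' | mariadb` should then be equivalent to the one-liner, provided MariaDB is running.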


It's been like that for ages (at least v16.0, perhaps v15.0)! TBH it blows my mind that no one has ever reported this before now. Or perhaps they have and they weren't persistent enough and I just didn't join the dots? I wonder how many people have launched our Redmine appliance (or one of the other ones with MariaDB, the Webmin module but not Adminer), done the same thing and just thrown their arms in the air and gone elsewhere!? So whilst it's a pain, thank you so much for reporting and persisting! Now I know what is going on, I can look at addressing it!

Which brings me to the best way forward... And I'd love your feedback/input.

As I say, it doesn't make sense to call the "root like" MariaDB user 'adminer' on servers that don't have Adminer. We could just use 'admin' on apps (like Redmine) that don't have Adminer and leave it as 'adminer' on ones that do. But IMO it might make sense to rename the 'adminer' DB user to 'admin' too so it's consistent everywhere.

I guess another option is to install Adminer on Redmine (and others too), but it's a PHP app. As all PHP based apps include Adminer anyway and Redmine is a Ruby app, it's a whole lot of extra dependencies for limited value.

Beyond that, we probably need to consider how we handle the Webmin MySQL module. IIRC on all the LAMP based appliances, we're adding 'adminer' as the default user, and perhaps even a warning about usage of root. Obviously we need to address that in Redmine (and others). But I suspect that's not enough? Perhaps we need to block setting of the 'root' MariaDB user password via the Webmin module? Perhaps even a link to a doc page that explains why?

I'd love your thoughts!

Error500's picture

I am having the exact same issue and error on a fresh VM install. Same version as Domhnall. Let me know if I can help by supplying info so we can get this to work.
Jeremy Davis's picture

Thanks for taking the time to add your voice to this issue.

I've just written a response above. Hopefully that helps you work around the issue.

I'd also appreciate your thoughts on the way forward.

Also do you want a "proper" user account so you don't need to guest post and wait for me to approve your posts? Either set one up yourself and let me know so I can approve it, or I can create one using the email address in your previous post.

Domhnall Currie's picture

Depending on your answer, that was going to be my next question.....  is the user an actual system user, or just a user in the Maria user database that is used internally?  :)  Because I tried adding an "adminer" user and couldn't get it to work, but I didn't give it a password and I didn't use the MYSQL command to grant it privileges, I was just hoping adding the user through Webmin and letting it do the "create in other modules" was going to do the trick.  I was wondering myself how many people have used this and not reported anything back.  I'm guessing they didn't go elsewhere, they're just not as ignorant as I am and they quickly figured out how to fix it.  :) 

As far as naming the user, maybe name it Maria?  Personally for me when the users perform different functions but named the same thing.....  well, you know what I mean, that leads to confusion for a slow guy like me.  If it's a hassle to have an extra user that requires additional work on the site admin, I'd say throw those tasks onto the same admin user or mysql user or whatever.  But to me, when I run into errors it's helpful to have a distinct user when reading through logs that I'm not well versed in to begin with.  That was one of the problems I was having, the socket user was root and everything looked ok.  A lot of stuff the user was mysql and everything looked ok.  All the dir rights were set to mysql and all that looked ok.  If all the other V18 appliances are using the adminer user, I'd just stay with that if it was me with maybe a blog note saying why and fix it with V19 if necessary.  Did I just make your eyes cross mentioning V19?  LOL  BTW, you might want to check the other appliances, just in case.....  That was my next step, to download something like Mantis or Gitea and see how Maria was set up with those to match my Redmine to.  I really don't like that they use root with no password for socket access or anything at all.  I know I'm not familiar enough with it to see that it's more secure, etc but it doesn't seem like a long stretch to user error or a hacker's short trip to user mis-configuration for a serious security issue.  I remember several serious discussions "back in the day" when I ran the WAN for the local school system and tech specialists out in the schools couldn't get something to work so they'd just give full rights to the app directory to get it to work, half the time giving full rights to the entire server.  Speaking of making your eyes cross!

I definitely wouldn't add Adminer, the module, requiring PHP unless there's a specific benefit to it.  To me that's one of the biggest benefits to TKL, having you guys wade through all that stuff for me.  I can set up a server, install Ruby, Redmine & whatever and get it all going.  I'm not even a stranger to the command line.  But seems like every time PHP or Python or Ruby or Maria or whatever releases their "latest and greatest" all hell breaks loose.  LOL  I'm not knocking Adminer, but when I've used it in the past it seems like most of that stuff can be done through Webmin and the stuff you couldn't, most of those guys are going to go to the command line and use MYSQL commands directly.  I just don't use the command line enough to remember all my commands so I have to do a quick search on how to do that, so Webmin is where it's at for me. 

As far as a fix to RM v18, wouldn't the simplest thing for you to do be to just add the adminer user, execute that MYSQL command above and repackage the ISO?  Easy peasy..... maybe throw in a note in the readme as to why you're still using that user and then make everything uniform later down the road?  But if I understood the whole socket thing, doesn't that cause other issues having the password?  If it was me, I'd pick the flat out easiest way to fix it for you guys regardless of any confusion because I know you're still working on finishing up V18 and it's nothing that can't be fixed with a readme or posted on the web in the Learn More > section.  If us slow guys can't figure it out with that heads up, I guess you'll just have to help us one at a time.  :D 

Jeremy, you are DA MAN!  Whatever they pay you isn't enough.  The amount of time you spend helping folks in the forum seems like a full time job on top of all the coding you do to roll these things out.  Your instructions are clear and you're always patient with folks.  TKL is the model other vendors should use on how to deal with their end users.  I had a math teacher back in high school who would always ask "What do you not understand?" when you said you didn't understand something.  If I don't understand, I'm not sure I can answer the question as to what.  :)  I can't count how many times I've asked what I thought was a legitimate question in a support forum elsewhere only to be berated about reading the manual, that question has been asked and answered, did I do a search for it, etc.  My first computer was a Commodore 64, but my first PC was an 8088 processor with 256K (yes K, not MB or GB) RAM running DOS 2.0.   As I get older, I realize I get confused more easily and I don't grasp concepts as quickly as I used to but as "they" say, "this is not my first rodeo."  If I'm asking a question it's because I've tried to figure things out and for whatever reason I couldn't.  All of you at TKL go above and beyond to help folks with their problems without chewing on them as to why they're asking it and it is greatly appreciated.  No one or no organization is perfect and I'm sure y'all have a white board somewhere or whatever it is "the kids are using now" to figure out how you can make your product/support/profit or whatever better, but TKL is the mark other companies should be aiming for....  Just sayin'.  :)

Jeremy Davis's picture

I meant to say a big thanks for this. I won't reply properly yet, but I do hope to circle back around soon. It's just that I'm really under the pump with a few things, have a huge backlog and also have a few non-TKL things I really need to take care of today.

I just wanted to quickly note that I really appreciate your input here. As an experienced Linux user and dev, with a strong preference for CLI and limited "dogfooding" of TKL servers I have huge blind spots! FWIW we do run a few TKL servers; this site runs on our Drupal appliance and we have a (private) Mattermost server for planning and private team coms. I also have a local personal Gitea server.

Anyway, I hope to circle back to this and reply properly ASAP.

Domhnall Currie's picture

No problem at all.  :)  I know y'all have a mountain of stuff to do getting everything upgraded to V18.  I wish I had the time to learn some programming so I could actually help out.  TKL is a great organization with a great product! 

Error 500's picture

mysql --execute "GRANT ALL PRIVILEGES ON *.* to admin@localhost identified by 'testpassword'; FLUSH PRIVILEGES;"
Your command did not work:
root@redmine .../lib/mysql# mysql --execute "GRANT ALL PRIVILEGES ON *.* to admin@localhost identified by 'testpassword'; FLUSH PRIVILEGES;"
ERROR 2002 (HY000): Can't connect to local server through socket '/run/mysqld/mysqld.sock' (2)
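
The "(2)" at the end of that error is the "no such file or directory" errno, i.e. the socket file is not there, which normally just means the server is not running. A minimal sketch of a pre-check before running any mysql/mariadb command; socket_ready is a made-up name and the default path is taken from the error message above:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the MariaDB Unix socket exists.
# Defaults to /run/mysqld/mysqld.sock; pass another path as $1.
socket_ready() {
    sock="${1:-/run/mysqld/mysqld.sock}"
    [ -S "$sock" ]
}
```

For example: `socket_ready || systemctl status mariadb` shows the service state whenever the socket is missing.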


Did a new vm deploy, no updates installed/ no manual changes done.
Besides the username issue, or maybe related to it, there are some other errors (see bold text):


Command used:  journalctl -xeu mariadb.service

May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] Starting MariaDB 10.11.6-MariaDB-0+deb12u1 source revision  as process 2358
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] InnoDB: Number of transaction pools: 1
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] InnoDB: Using liburing
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] InnoDB: Completed initialization of buffer pool
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [ERROR] InnoDB: File ./ib_logfile0 was not found
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] InnoDB: Starting shutdown...
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Note] Plugin 'FEEDBACK' is disabled.
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Warning] 'innodb-file-format' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [Warning] 'innodb-large-prefix' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [ERROR] Unknown/unsupported storage engine: InnoDB
May 06 13:40:36 redmine mariadbd[2358]: 2024-05-06 13:40:36 0 [ERROR] Aborting
May 06 13:40:36 redmine systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE

Following command:

root@redmine .../lib/mysql# systemctl status mariadb.service


x mariadb.service - MariaDB 10.11.6 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2024-05-06 13:50:22 UTC; 6s ago
       Docs: man:mariadbd(8)
             https://mariadb.com/kb/en/library/systemd/
    Process: 2716 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
    Process: 2718 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 2720 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, stat>
    Process: 2804 ExecStart=/usr/sbin/mariadbd $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
   Main PID: 2804 (code=exited, status=1/FAILURE)
     Status: "MariaDB server is down"
        CPU: 313ms

May 06 13:50:22 redmine mariadbd[2804]: 2024-05-06 13:50:22 0 [Note] InnoDB: Starting shutdown...
May 06 13:50:22 redmine mariadbd[2804]: 2024-05-06 13:50:22 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
May 06 13:50:22 redmine mariadbd[2804]: 2024-05-06 13:50:22 0 [Note] Plugin 'FEEDBACK' is disabled.
May 06 13:50:22 redmine mariadbd[2804]: 2024-05-06 13:50:22 0 [Warning] 'innodb-file-format' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
May 06 13:50:22 redmine mariadbd[2804]: 2024-05-06 13:50:22 0 [Warning] 'innodb-large-prefix' was removed. It does nothing now and exists only for compatibility with old my.cnf files.
May 06 13:50:22 redmine mariadbd[2804]: 2024-05-06 13:50:22 0 [ERROR] Unknown/unsupported storage engine: InnoDB
May 06 13:50:22 redmine mariadbd[2804]: 2024-05-06 13:50:22 0 [ERROR] Aborting
May 06 13:50:22 redmine systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
May 06 13:50:22 redmine systemd[1]: mariadb.service: Failed with result 'exit-code'.
May 06 13:50:22 redmine systemd[1]: Failed to start mariadb.service - MariaDB 10.11.6 database server.



Default access rights on files:
aria_log_control   ib_buffer_pool     multi-master.info  mysql_upgrade_info  redmine_development  redmine_production        redmine_test             sys
root@redmine .../lib/mysql# ls -l
total 78292
-rw-rw---- 1 mysql mysql   417792 May  6 13:40 aria_log.00000001
-rw-rw---- 1 mysql mysql       52 May  6 13:40 aria_log_control
-rw-r--r-- 1 root  root         0 Mar 13 20:52 debian-10.11.flag
-rw-rw---- 1 mysql mysql     2616 May  6 13:38 ib_buffer_pool
-rw-rw---- 1 mysql mysql 79691776 May  6 13:38 ibdata1
-rw-rw---- 1 mysql mysql        0 Mar 13 21:37 multi-master.info
drwx------ 2 mysql mysql     4096 Mar 13 20:53 mysql
-rw-r--r-- 1 root  root        15 Mar 13 20:53 mysql_upgrade_info
drwx------ 2 mysql mysql     4096 Mar 13 20:53 performance_schema
drwx------ 2 mysql mysql     4096 Mar 13 21:41 redmine_development
drwx------ 2 mysql mysql     4096 Mar 13 21:37 redmine_development@003b
drwx------ 2 mysql mysql     4096 Mar 13 21:42 redmine_production
drwx------ 2 mysql mysql     4096 Mar 13 21:37 redmine_production@003b
drwx------ 2 mysql mysql     4096 Mar 13 21:41 redmine_test
drwx------ 2 mysql mysql     4096 Mar 13 21:37 redmine_test@003b
drwx------ 2 mysql mysql    12288 Mar 13 20:53 sys

 

From this point on, I made some changes.

I did a small test, by deleting the mysql data:

cd /var/lib/mysql
ls
rm -r *
mysql_install_db --user=mysql --basedir=/usr --datadir=/var/lib/mysql
systemctl start mysqld
systemctl start mysql.service
systemctl start mariadb
mysql

Then MariaDB starts OK, and it also works in Webmin if the user/password are adjusted in the MariaDB config, but of course I have lost the Redmine databases.
Either way, there seems to be something wrong with the data (maybe some file corruption) or with the access rights in /var/lib/mysql/.

 

I have made a snapshot of the default installation, so I can change things at will and go back to the original state, if you want me to test anything.

Jeremy Davis's picture

It looks like when you first tried to add the user, MariaDB wasn't running, hence the failure you note first.
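For anyone else hitting ERROR 2002, a quick check of whether the server is actually up can save some head scratching before retrying the GRANT. The sketch below assumes the systemd unit is named mariadb (as in the journal output above); the helper name check_db is just for illustration.

```shell
# Report whether a systemd unit is active (unit name defaults to mariadb).
# check_db is a hypothetical helper name, not part of any appliance tooling.
check_db() {
    unit="${1:-mariadb}"
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is running"
    else
        echo "$unit is NOT running; see: journalctl -u $unit.service"
        return 1
    fi
}
```

Running check_db before any mysql command makes it obvious whether a connection failure is an authentication problem or simply a stopped server.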

Did a new vm deploy, no updates installed/ no manual changes done. Besides the username or maybe related, there are some others errors see bold text:
Command used:  journalctl -xeu mariadb.service
[...]
InnoDB: Starting shutdown...
[ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
[...]
[ERROR] Unknown/unsupported storage engine: InnoDB
[ERROR] Aborting

Hmm, that is interesting - and unexpected! That's the same info reported originally by Daniele - which I couldn't reproduce. And if that's before making any changes, that suggests perhaps something else/more than what I thought was going on.

I'll try again and explicitly check the journal. And I'll follow your lead and even if I can't exactly reproduce it, I'll try your other steps too.

Domhnall Currie's picture

Well, as I was playing around with the RMv18 container I created and I looked at the MariaDB screen of Webmin, it was showing the screen with "access denied for user adminer@localhost" and gave a user and password login option.  I said to heck with it and threw root in there with no password and it came right up and ran, so.....  :)  I had tried that previously on the ISO version and it didn't work, but as they say, it is what it is.  LOL  I'll go back and try that again on the ISO VM juuuuust in case.  :)  Sometimes I look at things so long I overlook something obvious especially when I'm already chasing my tail in circles.....

Jeremy Davis's picture

Thanks for the extra info.

That is super interesting! I hadn't expected using root without a password to work!?

I'll do some more testing and see what I can work out...

I'll reconsider what steps I take to make the user experience better once I understand it all better.

Error 500's picture

This was my 4th time installing the ISO, taking every step exactly the same way.
But this time when I went to Redmine, the website worked. So I went to check MariaDB.

Domhnall Currie - Tue, 2024/05/07 - 14:58:

Well, as I was playing around with the RMv18 container I created and I looked at the MariaDB screen of Webmin, it was showing the screen with "access denied for user adminer@localhost" and gave a user and password login option.

And then I saw this exact message; on all the other attempts, Webmin did not show it for MariaDB.
I did the same, used root and the password, and can see inside the db.

I'm going to use the system locally only, so it won't get internet access and I don't really mind using the root user for MariaDB. However, if you connect it to the internet, I would run it under a different user.
---
While writing this message, i rebooted the server, and error 500 reappeared.

Jeremy Davis's picture

Thanks for persisting with this and sorry it's still causing pain...

Re your latest experience, assuming I understand correctly, in Domhnall's last post he noted that he used 'root' without a password. To quote him (my bold):

[...] threw root in there with no password and it came right up and ran [...]

Although he did also note:

[...] I had tried that previously on the ISO version and it didn't work [...]

So I really don't understand what the hell is going on here and it's really frustrating...

I can only imagine what a PITA it is for you! My guess is that you're just trying to get some work done... So much for "turnkey"... :(

Regardless, thanks again for persevering and helping out.

Moving on to your post specifically (that I'm replying to):

[...] used root and the password, and can see inside the db.

[...]

i rebooted the server, and error 500 reappeared.

TBH, I think that's the bit causing the issue: I.e. setting a root password, without also recreating/adding an additional system user - with the password stored somewhere and the service adjusted to use that user/password. As I think I may have mentioned elsewhere above, AFAIK the service uses 'root' (mysql user) for starting/stopping, then drops to the 'mysql' user for all other functionality. I do recall working that out back when the 'root' user socket authentication first became the default. But given the experiences noted in this thread, I need to double check my understanding.

In the meantime, do you still have a snapshot of a working system? If so, could you try NOT setting a 'root' MariaDB user password as per what Domhnall noted above and see how that goes? If MariaDB and Redmine continue to work after that, please double check by rebooting and/or restarting the MariaDB service.

If not that's cool.

I should also post how the 'root' user can be configured back to using socket authentication. If we can confirm that not setting a 'root' MariaDB password "fixes" it, then that would be additional confirmation that my thoughts here are correct.
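For reference, MariaDB can switch an account back to socket authentication with an ALTER USER ... IDENTIFIED VIA unix_socket statement. The sketch below only prints that statement; the helper name socket_auth_sql is made up for illustration, and it assumes the account in question is 'root'@'localhost'. It has not been tested against the affected appliance.

```shell
# Print the SQL that switches a MariaDB account back to unix_socket auth.
# Assumption: the account is '<user>'@'localhost', with user defaulting to root.
socket_auth_sql() {
    user="${1:-root}"
    printf "ALTER USER '%s'@'localhost' IDENTIFIED VIA unix_socket;\n" "$user"
}

# Possible usage, as the Linux root user with MariaDB running
# (-p prompts for the password that was set earlier):
#   socket_auth_sql root | mysql -u root -p
```

After that change, running mysql as the Linux root user should connect without a password, while password logins for the MariaDB root user stop working.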

Rereading this thread, I'm still a bit concerned as some of your and Domhnall's other posts (in this thread) do suggest that perhaps there is something else/additional going on here too? Perhaps a race condition somewhere? *sigh* They're the bane of my existence as they are hard to recreate, thus really hard to fix... As a general rule, if I can recreate the issue myself, I can fix it - or at least devise a workaround.

Regardless, now you guys have provided more info, I definitely need to try to recreate this again. That will hopefully assist me to fully understand this issue better and both provide a work around for "fixing" it, plus look at how I can ensure that users such as you guys avoid hitting it in future.

Jeremy Davis's picture

I've opened a bug on our issue tracker around what I see as the root cause of the main issue discussed in this thread.

I still need to 100% confirm that my understanding is correct, but I think the issue stands regardless. I.e. a dedicated root-like MariaDB user for use in Webmin, consistent with the name for use in Adminer as appropriate.

If either of you (or others) have a GitHub account, please feel free to add further info/questions/suggestions/etc there, although further discussion around this and other related Redmine issues is probably still best kept here.

Error 500's picture

The installation in VMware only takes about 5 minutes. I did a new test: installed it, skipped updates, and Redmine works (so MariaDB works in the background for Webmin). In Webmin, MariaDB shows the same error: "access denied for user adminer@localhost", with a user and password login option. (Note that this time I did not change the password.)

I then logged in via SSH/terminal and did a reboot. (In my case I will be running it on a machine only during programming hours, and it will be shut down afterwards.)

After the reboot, Redmine doesn't work, and in Webmin the MariaDB service can't be accessed anymore:

Error!  MariaDB is not running on your system - database list could not be retrieved.

After the reboot I see these errors:
Command used:  journalctl -xeu mariadb.service
[...]
InnoDB: Starting shutdown...
[ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
[...]
[ERROR] Unknown/unsupported storage engine: InnoDB
[ERROR] Aborting
For the time being, I wanted a fast deployment instead of installing Debian plus all the software manually. It's getting almost to the point where doing it manually might be more fruitful. However, then I would still have to learn how to access MariaDB in case something like this happens there. So I figure we had better try to fix this and learn from it.

Only after I have my files/MariaDB backup, export and import scripts/manuals ready will I work fully with the system. I don't want my notes, history etc. to be lost in files I can't access.

Can you follow my steps, do the reboot after install, and then see for yourself?

Jeremy Davis's picture

I don't have VMware (which shouldn't matter), but other than that, I'll follow step by step and report back.

Thanks for your patience...

Domhnall Currie's picture

In the interest of speed, if you don't need the latest and greatest, you might install V17.1.  It works and it's using RM 5.whatever.  It would be pretty simple to shift your data over to V18 later.  Personally, I'd rather go that route than installing everything by hand from scratch.  :)

Jeremy Davis's picture

After installing from ISO I was able to reproduce the issue. I'm not sure why I wasn't able to reproduce it in an LXC container and I still don't understand why/how it happens. But as you've reported, for some reason the MariaDB database becomes corrupted on reboot.

Thank you all for your persistence and patience. I will be re-releasing a new fixed Redmine appliance ASAP.

In the meantime, I have developed a script that will "fix" it (destroy all corrupted data and recreate it from scratch). Please be aware that it is destructive and will wipe all databases - irretrievably destroying all data and reconfiguring MariaDB as if it were a clean install. It does not affect the filesystem (beyond the MariaDB data files).

It will ask you to update the Redmine 'admin' password and email, as well as a password for the (new) 'mariadb_admin' "root like" MariaDB user, for use in Webmin. It reloads the default Redmine data and databases, but does not include the "helloworld" example data repos/projects that normally come by default.

I have tested it, but not extensively. I have logged into Redmine as well as Webmin. I've played around in both and rebooted my test server multiple times and it appears to be all good.

Please do share any feedback you have.

To download and run the script:

wget https://raw.githubusercontent.com/JedMeister/redmine-18.0-fix/master/redmine-fix
chmod +x redmine-fix
./redmine-fix

PS Bug opened on the issue tracker.

Domhnall Currie's picture

I'll give your script a try this weekend.  In my infinite ignorance, I think something's going on with the InnoDB engine.  There's a lot of info around on dbase corruption if InnoDB is not shut down properly, etc.  I tried deleting the ibdata files and letting Maria recreate them when it came back up and I couldn't get it working, but I would always get different results depending on what I altered.  It's almost like whatever is shipped is different than whatever the installation creates and Maria doesn't like it.  Crazy stuff man.  :)

Jeremy Davis's picture

Yeah it does seem related to if/when it's not closed "cleanly". Although I'm not really sure why it seems to happen reliably on first reboot after install - and so far, only seems to affect the Redmine appliance. It's bizarre!

As per you, I also tried a raft of "fixes" and some seemed to work, but not consistently or reliably. Nuking all the MariaDB data, reinitializing the MariaDB setup and reloading the Redmine data was the only thing that seemed to work reliably.

TBH while my script does seem to work and it didn't corrupt for me again on multiple reboots, I would really like to understand why it happens in the first place. Then I can be assured that it won't happen under "normal" circumstances.

Obviously I can't guarantee it'll never happen again. Data corruption does happen sometimes - hence why backups are always a must. Although it should generally only happen under specific circumstances, such as hardware corruption/failure - e.g. faulty RAM or disk corruption. And it certainly shouldn't happen to all or even most Redmine users of our v18.0 build on firstboot!

Domhnall Currie's picture

Oh, just one other thought....  I'm guessing a lot of the TKL appliances use a MariaDB backend....  If there's something causing db corruption, how the heck are all those appliances working and Redmine is not?  Are they all using the same version of Maria?

Jeremy Davis's picture

Yeah 80% of our apps include MariaDB. And it's explicitly the same version and same package direct from Debian. So it's super weird!

No one else has reported it on any of our other appliances and I couldn't find a bug report that matches on the Debian or upstream MariaDB bug trackers. There are a few issues that may be vaguely related but it doesn't exactly match our experience - or the log entries, so I suspect that they are unrelated.

So for some weird reason it seems limited to Redmine. But it doesn't make sense to me that it only affects the Redmine ISO and only happens on first reboot. The LXC build is built from the ISO, so I would expect that to also be affected. Super weird indeed!

Error 500's picture

Thanx for supplying the fix!


Typo: Line: 60   IFS= read -r -s -p "Password for MariaDB 'maraidb_admin' user: " pass1   (should be mariadb_admin).

Installation went ok.   


Redmine was working after the fix, and was still working after a reboot.
However, now I can't connect to the Webmin panel anymore; it is not responding (no login screen).

 
I tried btw making a user account on this website a few days back, but the account has not yet been activated. Can you check that?

 

 

Jeremy Davis's picture

Thanks for noting the typo. It's non-critical as it's just text (not code). But it's a quick and easy fix.

I see that you have another post that relates to the Webmin login issue, so I'll reply with thoughts there.

Also, re your website user account, I've just enabled it.

FYI literally hundreds of user accounts are created a day, most are clearly spammers or SEO farmers (just looking at the usernames). So I only enable accounts that have demonstrated that they're human - which clearly you've done. I did check for an account linked to your email initially, but as you didn't note that you'd created one (until now) I hadn't checked again... Anyway, it's done now.

Error 500's picture

Oh, just one other thought....  I'm guessing a lot of the TKL appliances use a MariaDB backend....  If there's something causing db corruption, how the heck are all those appliances working and Redmine is not?  Are they all using the same version of Maria?

I have managed MySQL databases on Debian for websites for almost 15 years.
Corruption normally doesn't happen that fast. Even hard shutdowns of MySQL usually don't give those errors.

However, I have seen cases like this before when an upgrade did not go well, and the InnoDB files were from the older version while a newer version of MySQL was installed. (MariaDB is a fork of MySQL.)

So, that said, you don't have to worry that this is going to happen once everything is fixed.
However, I do advise making a script to dump the database daily to a readable format, so you can import it if something goes wrong. Don't back up the files directly from the SQL data directory; only use commands that dump to a file. Otherwise you might end up getting these kinds of errors when restoring the backup.

Jeremy Davis's picture

Thanks for your reassurance. That makes me feel much more comfortable about this.

Although MariaDB is installed as a fresh install and as I noted above, this particular issue seems to only affect the Redmine appliance?! The other strange thing is that my "fix" script essentially does exactly what the initial build code should be doing?!

Anyway, I 100% completely agree with your suggestion of taking a daily backup!

We do encourage use of TKLBAM (our built in backup tool) linked to the TurnKey Hub (our SaaS product) for that. That's our main source of revenue and allows me to dedicate my time to TurnKey and pay a couple of other devs part time.

Although I am a strong supporter of "free software" in both senses. So while we hope people find our appliances useful enough to contribute in some way (which you've done here), there is zero obligation to use TKLBAM/Hub, and we don't cripple TurnKey appliances in any way. Your suggestion of a DB dump (i.e. using mysqldump) is completely valid.
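A minimal sketch of the daily dump suggested above, for anyone who wants to script it by hand. It assumes it runs as the Linux root user (so socket authentication applies and no password is needed); the function names, the "all-DATE.sql.gz" file naming and the backup directory are illustrative choices, not something the appliance ships.

```shell
# Build a dated file name for the dump, e.g. DIR/all-2024-05-07.sql.gz
# (the date defaults to today; both helper names are hypothetical).
dump_path() {
    dir="$1"
    day="${2:-$(date +%F)}"
    printf '%s/all-%s.sql.gz\n' "$dir" "$day"
}

# Dump every database to a compressed, plain-SQL file in the given directory.
# The dump goes to a temporary file first, so a failed mysqldump run
# never leaves behind something that looks like a valid backup.
dump_all() {
    dir="$1"
    out="$(dump_path "$dir")"
    tmp="${out%.gz}.partial"
    mkdir -p "$dir" || return 1
    if mysqldump --all-databases --single-transaction --routines --events > "$tmp"; then
        gzip -c "$tmp" > "$out" && rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Calling something like dump_all /var/backups/mariadb from a daily cron job gives one readable SQL dump per day that can be re-imported with the mysql client.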

Domhnall Currie's picture

I understand.  I was just thinking that since all the other appliances are running Maria, looking at those might provide some info as to why this one is not working, like a simple typo or if something had to be coded different for this version of Redmine that might be making Maria burp. Sometimes those things can happen when folks are rolling out new versions of things, they've been putting in too many hours and looking at the same lines of code over and over and over.....  ok, that was just a tease for Jeremy.  :D  He's the best.  I'm not worried.  :)

Error 500's picture

It took a while for the webmin to respond.

However i was not able to login with my admin password? (tried it later with root, and webmin did not respond anymore).

You said the script also did reset the admin pass.

I wonder if the special characters in my password were not being escaped properly, so that the Webmin admin password was set to something else.

The password contains #@!, so if those are not escaped the right way in the script, it might break the script.

Jeremy Davis's picture

I suspect that losing access to Webmin login is because fail2ban is locking you out. I forget exactly what our default config is, but IIRC by default it will lock you out for 10 minutes after 3 log in failures within a minute.
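If waiting out the ban isn't an option, fail2ban-client can lift it from the console or SSH. The sketch below is a thin wrapper around the standard "set JAIL unbanip IP" command; the wrapper name unban_ip is made up, and the jail name has to be looked up first (fail2ban-client status lists the configured jails), since the exact jail used for Webmin is an assumption that needs checking on the system.

```shell
# Unban an address from one fail2ban jail (both arguments are required).
# unban_ip is a hypothetical wrapper; fail2ban-client does the real work.
unban_ip() {
    jail="$1"
    ip="$2"
    if [ -z "$jail" ] || [ -z "$ip" ]; then
        echo "usage: unban_ip JAIL IP" >&2
        return 2
    fi
    fail2ban-client set "$jail" unbanip "$ip"
}
```

Running fail2ban-client status JAIL beforehand shows which addresses are currently banned in that jail.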

I suspect that the initial Webmin login failures you're hitting are poor communication on my part!? Your suggestion of incorrect password escaping is possible, but I'm pretty sure the primary reason is that the Webmin login user remains 'root' (via PAM authentication). The 'mariadb_admin' user is a MariaDB user - intended to be used explicitly for the purposes of MariaDB access within Webmin.

Related to that, after finishing the "fix" script (which took me way longer than it should have...!) it did occur to me that perhaps I should have just been setting a random string myself - rather than asking the user? Once the Webmin config is prefilled (which I'm already doing in my script anyway) the user doesn't really need to know it.

As I'm sure I've noted above somewhere, default root CLI MariaDB access uses socket authentication - so the root Linux user is bound to the root MariaDB user and a password isn't required. So IMO using root when wanting CLI access is the easiest and most logical (although I often suffer from the curse of knowledge). I did have a quick fiddle with the Webmin MySQL/MariaDB module using root and socket auth. But in my brief adventures, I couldn't get it working. Hence the additional account. I suspect with a little more fiddling it's possible but I've got a massive backlog, so not for now...

So if I go generating a random password, I'm thinking perhaps a long base64 string? E.g. using openssl:

openssl rand -base64 40

Which produces random strings like this:

  • vym35hnN4E1UPVo9rtk/8kUW9Q9HPDXyNWv19uN7Qd6/QNKh7aDHDA==
  • 36ybbkk/wm3QUAFvMmZukwH28+cImHZn+DnMKz1VHa55q0KI64S06A==
  • z93uy15He5NrKptRyOUrZp1Gzl9/wHBL+cmV8rWJVCrbRhv3Z1MoFw==
  • Mcc8N23N7TEwbM/6Q9DiXElvBFyfyJFeNTpZT1MhQvmoWqksF6b/Rw==
  • etc...

What do you think?
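A rough sketch of how a generated password like that could be applied non-interactively. It also speaks to the escaping worry raised earlier: base64 output only contains letters, digits, '+', '/' and '=', so it is safe inside a single-quoted SQL string. The function names are hypothetical, and CREATE OR REPLACE USER is MariaDB-specific syntax; none of this is taken from the actual fix script.

```shell
# Generate a random base64 password (40 random bytes -> 56 characters).
gen_pass() {
    openssl rand -base64 40
}

# Print the SQL for a "root like" account with the given name and password.
# Assumption: the account is '<name>'@'localhost'.
admin_user_sql() {
    printf "CREATE OR REPLACE USER '%s'@'localhost' IDENTIFIED BY '%s';\n" "$1" "$2"
    printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'localhost' WITH GRANT OPTION;\n" "$1"
}

# Possible usage, as the Linux root user (socket auth, so no password needed):
#   pass="$(gen_pass)"
#   admin_user_sql mariadb_admin "$pass" | mysql
```

Since the user never has to type the password, its length and character set only matter to the config file it ends up in.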

Error 500's picture

Got Webmin to work. I was a bit confused because I was earlier able to log in with admin on Webmin, and did not think of using root to log in to Webmin. So all is working now!

Usually if I install a Debian machine, I disallow root from logging in remotely to the machine.
Then I have to log in with a normal user account, then I sudo to root myself. We did this by default on all Linux machines.

However this machine will only be used locally so I'll leave these settings alone.

One other question, does turnkey have packages of software from the other appliances to install inside this one? Like for example installing media wiki next to redmine, so it runs on the same server.

For now I have set up Redmine to autostart with VMware Workstation when booting up my PC, and for shutdown it is configured, via VMware Tools, to run the shutdown -h command while my host PC is shutting down. Works like a charm. Databases stay healthy.

In Webmin, under MariaDB, I found the backup databases button, and I see it uses crontab -e style scheduling to configure the backups. Nice addition; usually I did this manually from the console!

Jeremy Davis Thanks for your help!

Jeremy Davis's picture

Got Webmin to work. I was a bit confused because I was earlier able to log in with admin on Webmin, and did not think of using root to log in to Webmin. So all is working now!

Thanks for the info, although TBH, I'm not sure why you were able to log into Webmin as "admin"?! It should always be 'root' - with servers on AWS Marketplace being the exception - as that's a strict requirement of AWS.

Usually if I install a Debian machine, I disallow root from logging in remotely to the machine. Then I have to log in with a normal user account, then I sudo to root myself. We did this by default on all Linux machines.

As you seem aware, under the hood TKL is Debian. So you can create a new sudo user if you want and disable root. Although I'm not sure how Webmin will cope? You may need some further config there, but feel free to report any issues (or even success!) and I'll do what I can to assist. Or if you work it out, please share. It's probably a good thing to document for others that would prefer that.

Actually (to support AWS requirements), we do have a tool that creates a sudo user called 'admin' which does work with Webmin - although one of the things it does (which you may not like - but may be required for Webmin?) is disable the sudo password - i.e. no password is required to run sudo. Running the tool with no args will show the help:

turnkey-sudoadmin

Re context of root/sudo user: That is an ideological decision - one which quite a few technical server projects adopt.

Regardless, a sudo user on a Linux desktop is a must! Only Linux Desktop users with a desire for pain and trouble run as root!

On a single user server IMO it's more of a personal choice. Almost every command you will want to run will require root and it's security by obscurity at best and IMO just an extra layer of PITA (essentially "Simon says"). Unless there is at least some "real security" then security by obscurity can give a false sense of security. If you have the "real sec" covered, then there are things such as a sudo user that can raise the bar a bit.

There are security screws you can tighten to harden your server, some "real" and others additional security by obscurity. But with fail2ban and a good root password, I personally don't think it's necessary.

One other question, does turnkey have packages of software from the other appliances to install inside this one? Like for example installing media wiki next to redmine, so it runs on the same server.

You're not the first to ask that question! :) But strictly speaking no. Having said that, many of our users run TurnKey under LXC - e.g. as a Proxmox container. So whilst it's still a separate server, it uses very little additional resources. We have done Docker builds in the past - albeit via a hack that loads the full rootfs into a Docker container. So it is essentially the same scenario as LXC, but in Docker. IIRC the reason we stopped was issues with Docker, but AFAIK, that's been resolved upstream. I'd like to get back to doing Docker builds - even if via the hack I noted. But I've got a huge backlog and it's not a huge priority - we've only had a few requests to do them again since we stopped.

So if you don't want/have LXC, then OTTOMH the only way would be to manually install the secondary software yourself. Not very TurnKey... It should be possible to run the scripts we use to build all our appliances but it's not something we technically support and might have issues.

[...] Databases stay healthy.

Yay! :) Thanks for confirming.

Glad it's working out for you!

Domhnall Currie's picture

I got the same results as Error500.  RM was working, but after 3 attempts to login to Webmin, it locked me out and I just got the unable to connect screen.  I started over and tried again and got the same results with it not letting the mariadb_admin user into Webmin.  On the 3rd attempt, I tried logging in as root with the system password and it let me in.  Looking at the Webmin users through their interface, it is only showing root there.  Should it be showing mariadb_admin there as well?  On running the redmine-fix script, both times I noticed a "fail" warning as the script was running:

 

Doesn't look like the mariadb_admin user is getting created.  Redmine is up and working, though.

Jeremy Davis's picture

Sorry, it sounds like miscommunication by me. I'm pretty sure I've confused you with my choices and words within the script.

To clarify, you will still need to log into Webmin with 'root' - the same password as 'root' login via SSH. Webmin initial login authenticates via PAM - i.e. as a Linux user. The 'mariadb_admin' user is a MariaDB user (only), as opposed to a Linux user.

The purpose of the 'mariadb_admin' user is purely to give you full control of MariaDB in Webmin - via the MariaDB Webmin module. As it can be preconfigured (as my script does) you shouldn't need to enter the 'mariadb_admin' password anywhere. It should "just work".

Because you should never actually need to enter it manually, I think it'd be better to just generate a complex random password for the 'mariadb_admin' user, rather than asking the user to set it. My guess is that was the bit that created the confusion!?

Hopefully that clarifies things for you? Can you please confirm that it works as I've described?

Thanks again for all your feedback. I get that you may feel a bit "dumb" or whatever sometimes. But your feedback is incredibly useful! It helps me understand the sort of issues and confusion that "inexperienced" Linux users encounter. After all, people like yourself are our target market! As a seasoned Linux user, I often suffer from the "curse of knowledge". So it's easy for me to miss issues/problems that may be unintuitive to many.

FWIW, getting a proper understanding of what causes the initial Redmine issue, and fixing it, is a really high priority. I want to rebuild and publish a bugfixed update ASAP. I'm quite concerned that someone might launch it, put a ton of data in and not realize the issue until they reboot! That would be really bad!

Jeremy Davis's picture

I've done a new build of Redmine and while I haven't tested it much, on face value it does appear to be ok. I can reboot it and MariaDB comes back every time. I have applied tweaks to Webmin too, although I admit I haven't tested that at all.

After spending a bit of time dissecting the previous build, I still don't understand what was going on. But this updated build does appear to have at least resolved the DB issue. It'd be super awesome if one or both of you guys could give it a spin and give me some feedback?!

Please note that I'm actually still uploading it now and the connection from Australia to US west coast is a bit laggy. Rsync says that it'll take about another hour, so I suggest checking the timestamp of this post and don't try downloading before ~2+ hours after this post. Also please note that the download might be a little slow as I've just pushed it to our webserver, rather than the mirror.

Here's the ISO:

turnkey-redmine-18.1rc1-bookworm-amd64.iso

And the SHA512 hash:

3faf47becd46eba62551a41e83bbb39adcba6c5d77f9b39b842ab5d5166e76f656972ddd8acc8b8a2d647d8f8202fab3ab465806dabe09ef923d08a597a91d51
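
For anyone who wants to check a download against a hash like the one above, something along these lines should do it. This is only a sketch: verify_sha512 is a made-up helper name (not a TurnKey tool), and it assumes sha512sum from GNU coreutils is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of TurnKey): succeed only if FILE's SHA512
# digest equals the EXPECTED hex string.
verify_sha512() {
    file=$1
    expected=$2
    # sha512sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha512sum "$file" | cut -d' ' -f1)
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example call (substitute the ISO name and the full hash from the post):
# verify_sha512 turnkey-redmine-18.1rc1-bookworm-amd64.iso "<sha512 from the post>" && echo OK
```

The function returns non-zero both when the digest differs and when the file cannot be read, so a failed download and a corrupt one are treated the same way.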

[Update]: the ISO noted above is no longer available as a fixed ISO has been released: download v18.1 Redmine ISO.

Regardless, the latest Redmine image can always be downloaded from the Redmine appliance page.

Jeremy Davis's picture

Now that I have tested a bit more, I am happy with the updated Redmine appliance.

Webmin works as I expected - the 'mariadb_admin' user password is randomly (non-interactively) generated on firstboot and added to the Webmin config. So other than logging into Webmin (via 'root') no user interaction is required. The MariaDB Webmin module "just works"!

But more importantly MariaDB survives restarts and/or reboots, no problem. And so Redmine continues to work as it should! Yay :)

Regardless, please feel free to provide any feedback you have.

I've dissected the broken build, comparing before and after the issue becomes apparent. I've also compared it with the new working build. Regardless, I'm still not sure why/how the DB issue occurred. All I needed to do to "fix" it was rebuild it. Nor do I understand why the issue didn't affect any other apps that include MariaDB. I can only assume that there was some random corruption at build time...!?!

TBH, I don't like it when I don't understand why an issue has occurred and/or what it was that fixed it. Regardless, I'm not going to fight it.

Although in the end there was little to do to fix the issue, I also tidied up a few other bits and pieces that I noticed while pulling the appliance apart.

For further detail, please see the code I changed via the pull requests on GitHub:

Once the changes have been reviewed by a colleague, I'll merge them and build v18.1 of our Redmine appliance. It will be published early next week.

Thanks again to both you guys for all you've done here.

Account is activated. tmp account Error500 is now user: Tester.
Not much time in the coming days, but I will test the RC1 candidate at the end of this week/weekend.

Jeremy Davis's picture

That sounds awesome. I'm pretty confident but more testing and feedback is always good! :)

Domhnall Currie's picture

Everything is good here.  Everything seems to be working ok with the new ISO.  More later.

Jeremy Davis's picture

Thanks mate! :)

Domhnall Currie's picture

Regarding the reason for the database not starting, if you've verified everything with the InnoDB engine and the redo log file and it should be working ok, then that shouldn't be the problem.  However, with my very limited knowledge of the inner workings of MySQL and Maria and what makes them run, it looks to me like Maria won't initialize without the ib_logfile in place.  In the MariaDB system variables, there are umpteen variables regarding InnoDB that I don't understand and maybe some settings in there will allow Maria to run without that logfile or that will allow it to recreate it in place during database creation.  But after all is said and done during the initialization of the original RMv18 ISO, the ib_logfile was not in place.  In the original ISO /usr/lib/inithooks/firstboot.d/20regen-rails-secrets, you've got:

# remove innodb logfiles (workarounds really weird bug)
rm -f /var/lib/mysql/ib_logfile*

In the 18.1rc1, that is not in there and after initialization, there is an ib_logfile0 in place.  That may not have anything at all to do with it, but unless that "really weird bug" allowed doing without the ib_logfile, in the absence of anything else..... I'd go with it.  :)
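
A quick way to check for the redo log after first boot might look like this. It is only a sketch: has_redo_log is a hypothetical helper name, and the default data directory is simply the path from the snippet above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if at least one ib_logfile* exists in DIR.
has_redo_log() {
    dir=${1:-/var/lib/mysql}
    for f in "$dir"/ib_logfile*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example: has_redo_log /var/lib/mysql || echo "redo log missing"
```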

 

 

Domhnall Currie's picture

Ah!  I see the note of the removal of the removal in pull 303.  How that wouldn't have affected any other appliances if that caused the problem is beyond me, so maybe that's not what actually fixed it....  go figure.  :)

Domhnall Currie's picture

The only other thing I could think of was if there was a bug related to time stamp being off.  IIRC, the TKL setup used to ask you your timezone during installation, but RMv18 did not.  If the database was created in one timezone and then something about the database was initialized during the install in another time zone before the user had a chance to change the system setting.....  just grasping at straws.....  :)  Only reason I realized it was off was when I was looking at the system logs and I was like, WTH?  :)

Domhnall Currie's picture

Sorry, just reread and realized I didn't explain why I thought that.....  if the time stamp being different during appliance initialization by the end user versus the time stamp of initial setup of Maria caused database corruption or InnoDB to puke itself.....  Over the years I've seen some issues with time stamps not being realistic causing something to balk.  Sorry, not very eloquent, but you know what I mean.  :) 

@Domhnall Currie Jeremy Davis 

This post relates to v18 (not 18.1rc1).

You might want to change this config as it is a bit 'misconfigured'.  When you work "fast" in Redmine, you get an error in your browser because you hit the anti-DoS protection.  Happened many times here.

Log reads as: May 16 17:26:11 redmine mod_evasive[1098]: Couldn't open logfile /var/log/apache2/dos-xxx.xxx.xxx.xxx: Permission denied

Have solved this by editing this file:

/etc/apache2/mods-available/evasive.conf

<IfModule mod_evasive20.c>
    #DOSHashTableSize    3097
    #DOSPageCount        2
    #DOSSiteCount        50
    #DOSPageInterval     1
    #DOSSiteInterval     1
    #DOSBlockingPeriod   10
    #DOSEmailNotify      you@yourdomain.com
    #DOSSystemCommand    "su - someuser -c '/sbin/... %s ...'"
    DOSLogDir           "/var/log/apache2"
</IfModule>

The # means those settings are commented out, so mod_evasive falls back to its built-in defaults.
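
To see at a glance which directives are actually active in a file like the one above, a small filter can help. This is a sketch only: active_evasive_directives is a hypothetical helper name, and the default path is the one quoted in this post.

```shell
#!/bin/sh
# Hypothetical helper: print only the active (uncommented) DOS* directives.
active_evasive_directives() {
    conf=${1:-/etc/apache2/mods-available/evasive.conf}
    # Lines whose first non-blank characters are "DOS" are active;
    # lines starting with "#" are comments and are skipped.
    grep -E '^[[:space:]]*DOS' "$conf"
}

# Example: active_evasive_directives
```

Run against the config shown above, only the DOSLogDir line would be printed, since every other directive is commented out.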

For my local LAN usage I added: DOSWhitelist IPADDRESS, and that solved the issue for me.

Restart Apache to apply changes.

However, I think this should be configured correctly in the ISO so new users won't encounter this, i.e. slightly looser restrictions than the defaults.

Also, there is an issue with kernel values that happens at every startup. It would be nice if that could be solved too:
May 16 11:51:51 redmine systemd[1]: Failed to start systemd-sysctl.service - Apply Kernel Variables.
May 16 11:51:51 redmine systemd-sysctl[390]: Couldn't write '"1 4 1 7"' to 'kernel/printk': Invalid argument
May 16 11:51:50 redmine systemd-sysctl[363]: Couldn't write '"1 4 1 7"' to 'kernel/printk': Invalid argument
May 16 11:51:50 redmine systemd[1]: Failed to start systemd-sysctl.service - Apply Kernel Variables.

However, I haven't found a solution for that yet.

This weekend I will test the RC1. Then I will also write a migration document on how to migrate the data to a new server.
 

 

Jeremy Davis's picture

Thanks for your further testing and feedback, especially the bug report(s). You rock! :)


Hmm, I wonder WTH is going on with the log failure? By default that dir should be owned by www-data - the user Apache runs under - so that's all very weird... I'll have to look into that; I've opened a bug. As I'm super keen to get the fix for the critical Redmine issue out, unfortunately this one won't be fixed for v18.1.

Otherwise we're shipping the Debian mod_evasive defaults. Generally Debian provides sane defaults, but it seems these want some loosening.

FYI we use it (with the TurnKey default) on this website and I haven't noticed any issues. But that doesn't mean that others don't - and obviously you did in your use case. IIRC there was at least one other user that was getting 403s and didn't know why - hence why we made the log location change. Others have just disabled it.

I'm ok with loosening the defaults but TBH, I'm not sure exactly what we should change them to? Adding client IPs (other than localhost?) is not an option for us. Although as a minimum we probably should add the (disabled) whitelist setting with a comment. Beyond that, these would be the ones to adjust:

DOSPageCount - threshold for the number of requests for the same page by the same client within DOSPageInterval

DOSSiteCount - threshold for the total number of requests for any object by the same client and the same listener process within DOSSiteInterval

DOSPageInterval / DOSSiteInterval - time in seconds that above are applied

The DOSSiteCount is per listener and we have no control over the number of listener processes as it depends on server resources (to a max of 1000 IIRC). So servers with more resources will have more listeners and will likely take more of a beating to hit the limit. And the opposite for lower resource servers.

DOSPageCount is simply per interval, but limited to a specific URL.
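
To get a feel for how easily a per-page limit can be tripped, here is a rough simulation. It is a simplification and not how mod_evasive is actually implemented: it just buckets requests by whole second, client and URL, and reports any bucket with more hits than a given limit. The flag_page_bursts name and the "EPOCH_SECONDS CLIENT_IP URI" input format are assumptions made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: read lines of "EPOCH_SECONDS CLIENT_IP URI" on stdin
# and print every (second, client, URI) bucket with more than LIMIT hits.
flag_page_bursts() {
    awk -v limit="$1" '
        # Count hits per second/client/URI combination.
        { hits[$1 " " $2 " " $3]++ }
        END {
            for (key in hits)
                if (hits[key] > limit)
                    print key, hits[key]
        }
    '
}

# Example: three hits on the same page within one second, limit of 2.
# printf '%s\n' '100 10.0.0.5 /issues' '100 10.0.0.5 /issues' '100 10.0.0.5 /issues' | flag_page_bursts 2
```

With a limit of 2, a quick double click plus one automatic reload of the same page within a second is already enough to be flagged in this model; with a limit of 5 the same input produces no output.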

My guess is that you were hitting the DOSSiteCount but unfortunately because it wasn't logging we can't be sure.

I'll have a play when I have time.


The systemd-sysctl errors are new to me?! I wonder if it's hardware dependent?

FWIW my local TKL v18.x servers only have 2 journal entries per boot:

systemd[1]: systemd-sysctl.service: Deactivated successfully.
systemd[1]: Stopped systemd-sysctl.service - Apply Kernel Variables

Although on second glance it's a bit weird that the "Deactivated successfully" comes before the "Apply Kernel Variables"? Although they both have the same timestamp, so perhaps that's just a quirk?

Regardless, to get rid of the error message in your logs, disable it. I.e.: comment out kernel.printk in /etc/sysctl.conf. FWIW it's commented out by default, but I don't recall exactly what it does. Something to do with quieting kernel output to the terminal I think?
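
If you'd rather not edit the file by hand, commenting the setting out could be scripted roughly like this. It is a sketch under assumptions: comment_out_sysctl is a hypothetical helper name, it relies on GNU sed's -i option, and it assumes the setting really lives in the file you point it at (run it as root for anything under /etc). As an aside, the logged value includes literal double quotes, which is possibly why the write is rejected, but that is a guess.

```shell
#!/bin/sh
# Hypothetical helper: comment out every "KEY = value" line in FILE,
# keeping a backup copy as FILE.bak.
comment_out_sysctl() {
    key=$1
    file=$2
    # Escape dots so "kernel.printk" is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    # Prefix matching lines with "#"; already-commented lines are untouched.
    sed -i.bak "s/^[[:space:]]*${pattern}[[:space:]]*=/#&/" "$file"
}

# Example (as root): comment_out_sysctl kernel.printk /etc/sysctl.conf
```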

Also, another minor issue I had, from the logs:

redmine ntpd[756]: statistics directory /var/log/ntpsec/ does not exist or is unwriteable, error Permission denied

/var/log/ntpsec/  is missing.

Recreated it and set the owner to ntpsec.
Rebooted machine, error gone.
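
For reference, the fix amounts to something like the following. A sketch only: ensure_owned_dir is a hypothetical helper name, and on a real system it needs to be run as root.

```shell
#!/bin/sh
# Hypothetical helper: create DIR if it is missing and hand it to OWNER.
ensure_owned_dir() {
    dir=$1
    owner=$2
    mkdir -p "$dir" && chown "$owner" "$dir"
}

# Example (as root): ensure_owned_dir /var/log/ntpsec ntpsec
```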

Jeremy Davis's picture

Weird that I don't have that error either? But I don't have the dir either! So logged as a bug too.

Domhnall Currie's picture

Was trying to upgrade an ipFire router a couple months ago.  Somehow ntpsec got screwed up and it couldn't sync to any of the network time servers.  DNS wouldn't work because the time stamp was off too far causing DNSSEC to balk.  I tried everything I could figure, manually setting time, inputting IPs instead of domain names, etc and ended up having to just reinstall the router from scratch.  Luckily I didn't have a ton of rules or any fancy configuration in there so I didn't have to go too far to get it back up.  I had backups, but pakfire wouldn't work either..... basically the whole thing crapped all over itself, from what I could see, just because ntp got screwed up somehow.  I was saying bad words.  :) 

Installation of 18.1 (rc1) went flawlessly! Everything works straight away.
However, the same 'little issues' I posted yesterday and today about v18 are also present in 18.1 (rc1).

Also tested restore from backup to the new install 18.1, was able to restore www and sql within 5min. (not much data yet)
The new install/deployment+restore backup was done in 20min.

Jeremy Davis's picture

Great to hear. Thanks again for testing and confirming. I'll publish v18.1 ASAP - exactly the same as v18.1rc1 and should be available early next week - hopefully Monday.

I think in my case it could be a too-fast double click on a link, hitting the same page multiple times as a result. I have changed DOSPageCount from 2 to 5. From some testing this looks good. I will report back in a few days.

<IfModule mod_evasive20.c>
    DOSHashTableSize    3097
    DOSPageCount        5
    DOSSiteCount        50
    DOSPageInterval     1
    DOSSiteInterval     1
    DOSBlockingPeriod   10
    #DOSWhitelist xxx.xxx.xxx.xxx
    #DOSEmailNotify      you@yourdomain.com
    #DOSSystemCommand    "su - someuser -c '/sbin/... %s ...'"
    DOSLogDir           "/var/log/apache2"
</IfModule>

Jeremy Davis's picture

Your suggested config looks good to me. I look forward to how it goes.

Reflecting on this some more, I think it's much better to provide something that doesn't interfere with users and "just works". If users want to, they can tighten the screws themselves.

Having said that, I still think that it would be good to provide docs on tweaking the config.

FWIW I've always hoped to write up a general TurnKey "hardening" doc, but I never seem to get there...

Pix1mil's picture

Hi, I downloaded the Rails TKL app ISO yesterday and did a fresh install on VMware Workstation, and am experiencing MariaDB issues: journalctl, run from the command line, complains that ./ib_logfile0 is missing and the MariaDB service shows as not started. Seems related to this. I will check the fixup script, but thought I'd mention it so it's on the radar... Big fan of TKL, thanks for an amazing service.

Jeremy Davis's picture

Thanks for dropping in and bringing this to my attention. TBH it should have occurred to me that the same issue would almost certainly affect Rails too because Redmine is a Rails app.

Please let me know how the script goes, but I'll aim to rebuild the Rails app ASAP to apply the same fix as I did to Redmine.

Also, thanks for your kind words re TKL. Glad that it's of use to you! :)
