Jiger's picture

Hi all,

We have a Turnkey Linux File Server version 12.0. 

 

The server is used to back up an application running on a Linux-based server. It took the backups correctly for around 2 months, and then I noticed that something had gone wrong: the backup file size dropped to 0 bytes instead of the usual 900+ MB.

I am able to log in to the TKL console using the root login, and also via WinSCP using the service accounts that I created additionally. In addition, when I open Webmin at https://<Servername>:12321, the web console opens normally. However, when I open the AjaXplorer web console, I see junk characters. I have attached the relevant screenshots.

Can somebody please look into this, tell me what I should check and, if possible, suggest a solution?

Thanks

 

 

Jeremy Davis's picture

But who knows really...

Without knowing what method you have been using for backups I don't even know where to start troubleshooting this one...

If I assume that the backup runs on the other Linux server, then I would check the backup logs there to see what sort of errors it might be throwing. Perhaps even manually initiate the backup job and see what happens...

One other thing to check is that there is room on your Fileserver appliance. A full file system may be a reason for your backups to fail!?
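
For the disk space check, something along these lines might help. It is only a sketch: `full_filesystems` is a made-up helper name and the 90% default threshold is an arbitrary assumption. It reads `df -P` style output on stdin and prints every mount point at or above the threshold.

```shell
# Print "<mount point> <use%>" for every file system at or above a
# usage threshold (default 90). Expects `df -P` style output on stdin.
# Helper name and threshold are illustrative assumptions.
full_filesystems() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {                      # skip the df header line
            pct = $5
            sub(/%/, "", pct)         # "95%" -> "95"
            if (pct + 0 >= t) print $6, $5
        }'
}
```

On the Fileserver appliance this could be run as `df -P | full_filesystems 90`. It is worth looking at the root file system as well as the mount point that holds the backups.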

As for the issues with AjaXplorer, have you tried another web browser? IE is known to have regular updates which often break things... (Or perhaps that's just my bias...)

Jiger's picture

Hi Jeremy,

The TKL file server has a NAS volume mounted where the backup of the application is stored. Regarding space on the NAS server: that's not the problem, as I checked that before raising the issue here.

Also, when I checked on the Application server itself (which pushes the backup to the NAS volume via the TKL file server), I noticed that the backup file is definitely being generated. However, it is not getting pushed to the NAS volume via the TKL file server, as it was earlier.

Regarding AjaXplorer: yes, I did check using other browsers, but the issue persists there too.

Are there commands to check whether the TKL server is running the required services and other required components, and to repair them if anything is corrupted?

JN

Jeremy Davis's picture

If so what is it saying the problem is?

Have you tried to connect to the NAS from this other server to see if there is an issue there?

Where is the server running? Is it a VM or on bare metal? Do you have a UPS connected to the server (the physical machine if bare metal, or the host physical machine if a VM)?

Again, I'm only guessing and throwing ideas about, but what about the possibility that the machine has suffered a power loss? (Hence my question about the UPS.) VM software in general often doesn't handle unintended power loss very well... If that's a possibility, perhaps the file system has been corrupted and things aren't running as they should...?

As for commands, I would check that the NAS is mounted (as it should be). The mount command will tell you what is mounted and where. You can also check the logs for issues (they should be in /var/log/). I would imagine that the Samba log would be of particular interest (/var/log/samba/samba.log), although you may want to dig through older logs until you find some from when it was working to compare against. Another log that may be worth checking (particularly with regard to AjaXplorer) is the LigHTTPd error log (/var/log/lighttpd/error.log). I don't have a v12.x Fileserver appliance handy ATM, but the daemon log and syslog may also be worth a look (/var/log/daemon.log and /var/log/syslog respectively).
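
As a rough sketch of the mount check, the helper below reads `mount` output on stdin and prints the file system type found at a given mount point. The name `mount_type` is hypothetical. It returns a non-zero status when nothing is mounted there, which is the case to watch for.

```shell
# Print the file system type mounted at the given mount point, based on
# `mount` output supplied on stdin. Exits non-zero if nothing is mounted
# there. Helper name is an illustrative assumption.
mount_type() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp && $4 == "type" { print $5; found = 1 }
        END { exit !found }'
}
```

For example, `mount | mount_type /path/to/mountpoint || echo "not mounted"`, where the path is wherever the NAS volume is supposed to be attached. The logs mentioned above can then be skimmed with something like `tail -n 50 /var/log/syslog` or `grep -i nfs /var/log/daemon.log`.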

Jiger's picture

The TKL File server is hosted on an ESX server. Power loss might have caused this issue; however, I did install the acpi-support & acpi-support-base packages to avoid any file system corruption on the TKL VM because of power loss.

How can I check whether there are any file system problems causing the issue?

Is this file system corruption issue taken care of in version 12.1, and are the acpi-support & acpi-support-base packages included in 12.1, or is there some other mechanism to handle this on VM-based environments?

Yes the Application server is able to connect to the NFS / NAS volume via the TKL file server. The architecture is as follows:

Application Server --> connects to TKL file server using scp --> posts the Application server's backup on the NAS volume using TKL's mount point

The TKL file server is used because the Application server doesn't directly understand the NFS protocol, so the TKL file server acts as an intermediate device.

Regarding the Logs, I am trying to deep dive into all the logs mentioned in your last post and will update you after this.

JN

Jiger's picture

Hi Jeremy,
 
I noted that the mounted volume was getting detached, so I remounted it. After that, when I ran the mount command, I noticed the line "overflow on /tmp type tmpfs (rw,size=1048576,mode=1777)", which I believe may be the problem.

Can anybody please tell me what that is, and whether it is what's creating the issue? If yes, how do I resolve it?

 

root@dlpau020file01 ~# mount

/dev/mapper/turnkey-root on / type ext4 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/sda1 on /boot type ext2 (rw)
overflow on /tmp type tmpfs (rw,size=1048576,mode=1777)
1.2.3.4:/vol/svrau182frm06_proj_xlp/proj_xlp on /srv/storage type nfs (rw,vers=3,addr=1.2.3.4)
 
 
Note: I had to change the mount drive's IP :)

JN
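
A note on the "overflow" line: as far as I know, Debian-based systems mount a tiny (1 MB) tmpfs called overflow on /tmp at boot when the underlying file system has almost no free space left, so it may be a symptom of a full root file system rather than a cause in itself. One possibility (unconfirmed here) is that backups were written into the mount point directory on the root disk while the NAS volume was detached. The sketch below only detects the overflow mount. The helper name `has_overflow_tmp` is made up, and the interpretation in the comments is an assumption.

```shell
# Succeed if `mount` output on stdin shows the emergency "overflow"
# tmpfs on /tmp. Assumption: this is the low-disk-space fallback that
# Debian-based systems set up at boot.
has_overflow_tmp() {
    grep -q '^overflow on /tmp type tmpfs'
}
```

For example, `mount | has_overflow_tmp && df -h /` would show how full the root file system is whenever the overflow mount is present. If it does turn out to be full, freeing space and then rebooting (or unmounting /tmp) should bring the normal /tmp back.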

Jiger's picture

Some more details:
 
I noticed that a file system loop is being reported for the backup files - not sure why...
 
root@xlpau020file01 ~# find / -name mythfrontend.log
find: File system loop detected; `/srv/storage/.snapshot/hourly.0' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/hourly.3' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/hourly.1' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/hourly.2' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/nightly.0' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/hourly.5' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/hourly.4' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/nightly.1' is part of the same file system loop as `/srv/storage'.
find: File system loop detected; `/srv/storage/.snapshot/mirau020hwt04(1573798353)_svrau182frm06_proj_xlp.259' is part of the same file system loop as `/srv/storage'.

JN
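
On the loop messages: the hourly.N and nightly.N names under .snapshot look like NAS-side snapshot directories (an assumption based on the names alone). find tends to report such directories as loops, and the message by itself does not necessarily mean anything is broken. A minimal sketch for searching without descending into them, using a hypothetical helper `find_skip_snapshots`:

```shell
# Search a directory tree for a file name, skipping every directory
# named .snapshot. Helper name is an illustrative assumption.
find_skip_snapshots() {
    find "$1" -name .snapshot -prune -o -name "$2" -print
}
```

For example, `find_skip_snapshots / mythfrontend.log`. Adding `-xdev` to a plain find is another option if the search should stay on the local disk and not cross onto the NFS mount at all.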
