Out of Space (Special database mode to allow deletions, and not consume freed space) - Connect IT Community | Kaseya
<main> <article class="userContent"> <h2 data-id="summary"><strong>SUMMARY</strong></h2> <p>How to start troubleshooting out-of-space issues on an appliance.</p> <h2 data-id="issue"><strong>ISSUE</strong></h2> <p>An appliance can run out of space for various reasons, including log levels being set too high or too much data being ingested at the same time. This article will help you narrow down the cause of the problem and fix it. <br><br>For more information on "Out of Space" conditions, see <a rel="nofollow" href="/home/leaving?allowTrusted=1&target=https%3A%2F%2Funitrends-support.zendesk.com%2Fhc%2Fen-us%2Farticles%2F360013172657">Error: No more space on device</a>.</p> <h2 data-id="resolution"><strong>RESOLUTION</strong></h2> <ol><li>SSH to the appropriate appliance.</li> <li>Use the command below to display the storage in use, the appliance asset, and the Unitrends version. <pre class="code codeBlock" spellcheck="false" tabindex="0"># With Support macro: [isi
# Without the macro:
clear;echo;dpu asset;echo "Version: `rpm -qa|grep unitrends-rr|grep -v windowmgr|awk -F'-' '{print $3 "-" $4}'|awk -F'.' '{print $1"."$2"."$3}'`";echo "Running: `cat /etc/redhat-release|sed -e 's/Recovery/Cent/g'|awk '{print $1, $3}' | grep -iv uni`";echo "Up for: `uptime|awk '{print $3, $4}'|sed 's/,$//'`";echo "Load average: `cat /proc/loadavg|awk '{print $1, $2, $3}'`";if [[ $(ps aux|grep connector_rc|grep -v grep >/dev/null 2>&1;echo $?) -eq 0 ]];then echo "iTivity Tunnel: Asset = `dpu asset|awk '{print $NF}'`";elif [[ $(ps aux|grep 222|grep support >/dev/null 2>&1;echo $?) 
-eq 0 ]];then echo "Legacy Tunnel Number: `ps -leaf|grep 222|grep support|awk '{print $24}'|awk -F':' '{print $1}'`";else echo "No Tunnel";fi;echo;if [[ -e `ps -leaf|grep tasker|grep -v grep|awk '{print $NF}'` ]];then echo "Tasker is running";else echo -e "\e[1;31mTasker is not running.\e[0m";fi;if [[ -e `ps -leaf|grep devmon|grep -v grep|awk '{print $NF}'` ]];then echo "Devmonitor is running";else echo -e "\e[1;31mDevmonitor is not running.\e[0m";fi;echo;echo "Processor count: $(cat /proc/cpuinfo|grep processor|wc -l)";echo;echo "Memory Usage: ";free -m|grep Mem|awk '{print "Total: "$2 "MB"}';free -m|grep Mem|awk '{print "Used: "$3 "MB"}';free -m|grep Mem|awk '{print "Free: "$4 "MB"}';echo;echo;df -h;echo;echo;if [[ $(ifconfig|grep -iq tun;echo $?) -eq 0 ]];then if [[ $(/usr/bp/bin/bputil -g -c `hostname` "Replication" "Enabled" -1 master_ini) == [Yy]es ]];then echo "This Unit has tun0 and Replication enabled";echo;elif [[ $(/usr/bp/bin/bputil -g -c `hostname` "Securesync" "AutoSyncEnabled" -1 master_ini) == [Yy]es ]];then echo "This Unit has tun0 and Vaulting enabled.";echo;elif [[ $(psql -U postgres bpdb -c "select name, role from bp.systems"|grep `hostname`|grep -v .dpu|awk '{print $NF}') == DPU ]];then echo "We have tun0, but no Vaulting or Replication.";echo "This may be a Managed System.";echo;elif [[ $(psql -U postgres bpdb -c "select name, role from bp.systems"|grep `hostname`|grep -v .dpu|awk '{print $NF}') == Vault ]];then echo "We have tun0 and this is a Target.";echo;elif [[ $(psql -U postgres bpdb -c "select name, role from bp.systems"|grep `hostname`|grep -v .dpu|awk '{print $NF}') == Manager ]];then echo "This has tun0 and is a Manager.";else echo "Something is broken.";fi;psql -U postgres bpdb -c "select * from bp.managers";echo;else if [[ $(/usr/bp/bin/bputil -g -c `hostname` "Replication" "Enabled" -1 master_ini) == [Yy]es ]];then echo "This unit has Replication enabled";echo "but does not have tun0";echo;psql -U postgres bpdb -c "select * from 
bp.managers";echo;elif [[ $(/usr/bp/bin/bputil -g -c `hostname` "Securesync" "AutoSyncEnabled" -1 master_ini) == [Yy]es ]];then echo "This unit has Vaulting Enabled";echo "but does not have tun0";echo;psql -U postgres bpdb -c "select * from bp.managers";echo;elif [[ $(psql -U postgres bpdb -c "select name, role from bp.systems"|grep `hostname`|grep -v .dpu|awk '{print $NF}') == DPU ]];then echo "No tun0 but no vaulting/replication either.";echo;echo;elif [[ $(psql -U postgres bpdb -c "select name, role from bp.systems"|grep `hostname`|grep -v .dpu|awk '{print $NF}') == Vault ]];then echo "We don't have tun0, but this is a target.";echo;elif [[ $(psql -U postgres bpdb -c "select name, role from bp.systems"|grep `hostname`|grep -v .dpu|awk '{print $NF}') == Manager ]];then echo "This unit does not have tun0, but is a Manager.";echo;else echo "Something is broken.";echo;fi;fi</pre> </li> <li>From the output, identify where the high usage is occurring. <ol><li>If high usage (100%) is shown in /, contact support.</li> <li>If high usage is shown in the database and the system is running CentOS 5, contact support to perform a CentOS 5 to CentOS 6 upgrade.</li> <li>If high usage is shown in the database and the system is running CentOS 6, a database dump and reload may be required. Contact support for assistance.</li> <li>If high usage is shown in /backups, continue with the steps below.</li> </ol></li> <li>Check the <a rel="nofollow" href="/home/leaving?allowTrusted=1&target=https%3A%2F%2Funitrends-support.zendesk.com%2Fhc%2Fen-us%2Farticles%2F360013169897">Capacity Report</a> in the Legacy Interface to verify that the data footprint does not exceed the capacity allowed on the appliance. </li> <li> Also verify whether any Instant Recovery space is in use under <b>Settings > Storage and Retention > Storage Allocation</b>. This storage may need to be reclaimed as backup space if the appliance does not have enough available. 
</li> <li>If problems persist, check the free space on the appliance. You will probably see no Free Space available in the output of: <pre class="code codeBlock" spellcheck="false" tabindex="0"> /usr/bp/bin/smgr_display</pre> </li> <li>If no Free Space is available, stop all services: <pre class="code codeBlock" spellcheck="false" tabindex="0"> /etc/init.d/bp_rcscript stop</pre> </li> <li>Restart the database: <pre class="code codeBlock" spellcheck="false" tabindex="0"> /usr/bp/bin/start_db.sh</pre> </li> <li>Start devmonitor: <pre class="code codeBlock" spellcheck="false" tabindex="0"> /usr/bp/bin/devmonitor</pre> </li> <li>Start cryptoDaemon. This affects the landing zone shown in smgr_display. <pre class="code codeBlock" spellcheck="false" tabindex="0"> /usr/bp/bin/cryptoDaemon</pre> </li> <li>Start fileDedup: <pre class="code codeBlock" spellcheck="false" tabindex="0"> /usr/bp/bin/fileDedup</pre> </li> <li>Start retentionMgr: <pre class="code codeBlock" spellcheck="false" tabindex="0"> /usr/bp/bin/retentionMgr</pre> </li> <li>Go to the UI and clear out all failed backups. </li> <li>Check the Backup Browser for any running synthetic jobs. If synthetic jobs are running, kill them with the following command: <pre class="code codeBlock" spellcheck="false" tabindex="0"> psql bpdb -U postgres -c "update bp.backups set status='25168' where backup_no='X'";</pre> <b>Note</b>: In this command, the status is always '25168' and X is the backup ID number. </li> <li>With these processes started, the UI is active, but no backups or automatic synthesis will run. Because the appliance was taken offline, any jobs that were running have released the reservations they held in smgr_display. This is the best state for the appliance: fileDedup is running and making backups smaller, and no other services are running and adding to the storage.
In this mode you can select other backups to remove (preferably failed ones) from the Backup Browser (Settings > Storage and Retention > Backup Browser). Removing a backup also removes the other backups in its chain. After a few moments, <b>df -h</b> and <b>/usr/bp/bin/smgr_display</b> should show free space growing.</li> <li>If you are still seeing problems, use the commands below to check whether <b>/usr/bp/bin/retentionMgr</b> (or spacereclaim in versions prior to 9.0) is running. If it has been running longer than the time it has taken to clear backups, it is probably a stale session. If it's running and matches the landing zone, it hasn't picked up the latest changes. Kill all <b>retentionMgr</b> or <b>spacereclaimer</b> jobs, and a new one will start automatically. <pre class="code codeBlock" spellcheck="false" tabindex="0"> ps aux | grep spacereclaim
 ps aux | grep retentionMgr</pre> </li> <li>Once space has been reclaimed, stop all processes and then bring everything back online: <pre class="code codeBlock" spellcheck="false" tabindex="0"> /etc/init.d/bp_rcscript stop
 /etc/init.d/bp_rcscript start</pre> </li> <li>If the out-of-space condition was caused by a synthetic backup claiming the available space, start a new one-time full backup of the client whose backup you failed in step 14.</li> <li>If you do all of this and are still seeing problems, you may have database bloat on the appliance. In that case, contact support for assistance. </li> </ol> </article> </main>
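<p>The check in step 3 (spotting which filesystem is full in the <b>df</b> output) can be sketched as a small shell function. This is only an illustration: the function name <code>flag_full_filesystems</code> and the default threshold of 90% are assumptions made here, not part of the Unitrends tooling, and it assumes POSIX-format <code>df -P</code> output with no spaces in mount point names.</p>

```shell
#!/bin/sh
# Hypothetical helper (not a Unitrends command): read `df -P` output on stdin
# and print every mount point whose usage is at or above a threshold.
flag_full_filesystems() {
    threshold="${1:-90}"   # percent; the default of 90 is an arbitrary choice
    awk -v limit="$threshold" '
        NR == 1 { next }            # skip the df header line
        {
            use = $5                # df -P column 5 is Capacity, e.g. "97%"
            sub(/%$/, "", use)      # strip the trailing percent sign
            if (use + 0 >= limit + 0) print $6 " " use "%"
        }
    '
}

# Usage on an appliance (prints nothing if no filesystem is that full):
#   df -P | flag_full_filesystems 95
```

<p>A line for / or for the database volume would point to the "contact support" branches of step 3, while a line for /backups would point to the remaining steps of this article.</p>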