KBAN00000217
Help! My Filesystem is Full (10.xx)
DocId: KBAN00000217
Updated: 20000824

DOCUMENT

+---------------------------------------------------+
|        10.X Filesystem Full Information           |
+---------------------------------------------------+

filesystem is full
filesystem is full
filesystem is full
filesystem is full

Oh, oh!  It seems there are only two types of HP-UX system administrators:

Those that have seen this message,

and

those that are going to!

Solution

The effect of this message depends primarily on which filesystem is reporting
the dreaded error, with the root filesystem (/) having the worst effect.
When root fills up, things start failing: processes are killed or dump core,
programs that depend on files in the root directories begin to have
problems, and the list goes on.

The first thing to do is to gracefully shut down the system, unless you know
where the problem is hiding.  There are several common reasons for a
filesystem to become full without warning, while other reasons require some
stealth to locate.

System Panics
-------------

One of the biggest files to suddenly appear in the /var filesystem can be
found in the directory /var/adm/crash.  The location can be changed by
editing the file /etc/rc.config.d/savecore and setting SAVECORE_DIR= to
something other than /var/adm/crash (the default).  Another setting to check
is SAVECORE=, which determines whether a dump is saved: 1 to save and 0 to
disable.  If you disable saving crash dumps, you get into a good news/bad
news situation:

The good news is that a panic will not suddenly fill your filesystem
with a core dump of 16 to 256 megs (or more) of data.  The bad news
is that there is little chance of determining the reason for the system
panic without this file.

Is /var/adm/crash the only location for a panic core dump?  No.
The directory is specified in the system startup file
/etc/rc.config.d/savecore, as stated above.  You can change this to a
filesystem that has more space than /var/adm/crash.  If you do, you will need
as much space as you have main memory to save a full crash dump.
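
As a rough sketch, the two settings described in this section might look
like this in /etc/rc.config.d/savecore.  The directory /extra/crash is a
made-up example and not a recommendation; substitute a filesystem on your
own system that has enough room.

```shell
# /etc/rc.config.d/savecore (fragment)
# SAVECORE=1 saves a crash dump at boot; SAVECORE=0 disables saving.
SAVECORE=1
# Directory that receives the dump.  /extra/crash is a hypothetical
# example; the default is /var/adm/crash.
SAVECORE_DIR=/extra/crash
```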

When savecore runs, depending on the revision of 10.x, you will get either
two files or a subdirectory and related files.  In the first case, the
files vmcore.# and vmunix.# are created, along with a small file called
bounds.  The # sign is a number starting from 0 and incrementing with every
panic that occurs; the bounds file keeps track of the next number to use.

On 10.10 or later, you will get a directory /var/adm/crash/core.#.
Again, the # is a number that increments with each crash dump.  In the
core.# directory there will be a series of compressed files that
make up the crash dump, for example /var/adm/crash/core.0/core.0.1.gz,
/var/adm/crash/core.0/core.0.2.gz and so on.  There will also be
a vmunix.gz file and an INDEX file.  The core files can be in multiple
parts, and the INDEX file has information about the size of the core file
chunks.

When the system reboots after a panic, the /sbin/init.d/savecore
script checks whether savecore exists and whether the specified directory
exists.  If both are true, savecore checks the dump area
(typically the primary swap area) for a valid HP-UX memory dump.
Finding a properly stored memory dump, the savecore
program announces the date/time that the panic occurred and creates the file
core.0 (if this is the first core dump in the directory).  The process
continues until all of physical memory (RAM) has been written to the disk.
If there is not enough filesystem space to save the full dump, savecore
will not attempt to save it.  Without a properly written dump in the
primary swap area, savecore does nothing and displays nothing.

Then, savecore writes a copy of the current /stand/vmunix file as
vmunix.0 or vmunix.gz to match the dump file/files.  If the
filesystem is already full, this file is created as a zero-length file.
To be useful, the core dump must also have a copy of /stand/vmunix
(the kernel file) at the time of the dump.

So what is the best technique to prevent the /var filesystem from filling up
due to a system panic?  Simply pick another filesystem to store the dump, one
that typically has a lot of space, or a filesystem that always has at least
as much space as the size of RAM plus about 10-20 megs (for vmunix.#).  How
do you find the size of RAM?  Type the command dmesg and look at the
amount of real memory that is available.
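
The arithmetic above can be sketched as a pair of small shell functions.
This is only an illustration: the function names are made up, both sizes
are assumed to be in Kbytes, and reading the actual numbers from dmesg and
bdf is left to you.

```shell
# Sketch: decide whether a filesystem has room for a full crash dump.
# Both sizes are in Kbytes; the caller supplies them.

# Kbytes needed: all of RAM plus a margin for the kernel copy.
# The 20 MB margin is the upper end of the 10-20 megs suggested above.
dump_space_needed_kb() {
    ram_kb=$1
    echo $(( ram_kb + 20 * 1024 ))
}

# Succeeds (exit status 0) when avail_kb is enough for ram_kb of memory.
has_room_for_dump() {
    avail_kb=$1
    ram_kb=$2
    [ "$avail_kb" -ge "$(dump_space_needed_kb "$ram_kb")" ]
}
```

For a system with 256 megs of RAM (262144 Kbytes) and 300000 Kbytes
available, "has_room_for_dump 300000 262144" succeeds.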

If you need the space back after a panic, simply store the contents of the
core dump directory on tape…the simplest command to use is:

cd /my_core_dump_directory
tar cvf /dev/my_favorite_tape_devicefile *

Then, you can remove all the files from the core dump directory,
at which point, you can contact HP to have the core dump analyzed
for the possible reason(s) for the panic.

/stand
------

The HP-UX kernel at 10.x can be much larger than at 9.x. For this reason
/stand and /stand/build can fill quickly. You should have /stand/vmunix,
/stand/vmunix.prev, /stand/system, and /stand/system.prev in the /stand
directory. If you have other kernels or system files you should look at
them to determine if they should be removed. In /stand/build there could
be other kernels.  Look for a file called /stand/build/vmunix_test;
vmunix_test is a kernel that was created but never moved into place.

Filesystem minfree
------------------

Every administrator has probably looked at the bdf command after mounting a
brand new disk and asked:  Where did some of that empty space go?  The answer
is that approximately 6% to 8% of a disk’s space is occupied by inode tables
and superblocks, which contain the pointers to the various blocks of files
that are on the disk.  In addition, the default newfs command will reserve
10% minfree, or 10% of what’s left before files are stored on the disk, to
enhance the filesystem performance.

This buffer allows system administrators to fix problems with the space on a
given disk (once the filesystem is marked full) and still have some room (the
10% minfree area) to work.  Although the minfree area can be reduced to zero,
this is not recommended for the root disk since a file system full message
might not allow even the system administrator to log onto the ailing system.

Other disks might be allowed to use 0% minfree, as long as the space is
monitored, or the space usage is essentially fixed.  Note also that the HFS
method of disk space management in HP-UX relies heavily on 10% minfree to
keep the performance in allocating and deallocating filespace at a high
level.

Another filesystem tuning is to increase the bytes-per-inode value when
initializing the filesystem with newfs.  By changing the number of bytes of
disk space managed by an inode, the overhead can be reduced by as much as
50%, at the expense of the total number of files that may be stored with
fewer inodes.  This parameter is tricky since it may prevent easy
interchange of data with other Unix systems that cannot handle a wide range
of bytes-per-inode values.

In general, changing this parameter from 2048 bytes to 64K bytes will only
return about 3% or so of the disk space, with a corresponding reduction in
the total number of files that can be stored, but this may be ideal for a
small collection of large files.  Be sure to choose a value that will be
compatible with your operating system revision.  Large bytes-per-inode values
are often not portable to other systems or revisions.

Files that don’t belong in /dev
-------------------------------

Another very common problem is a file system full message just after doing a
backup.  How can this be?  HP-UX is quite friendly in that it will allow
spelling errors, but it will often do something not entirely expected.  For
instance, if the user were to misspell the name of the tape drive as in:

tar cvf /dev/rmt/om /

instead of

tar cvf /dev/rmt/0m /

then, rather than displaying an error message such as:

tape not found

or,

devicefile does not exist,

tar will simply create an ordinary file with the name that was given
in the tar command (or cpio or fbackup, etc) and all the data to be
backed up begins filling the /dev/rmt/om file until the entire system has
backed itself up onto the root disk.  This process eventually fails with a
file system full message.

To find these inadvertent spelling errors in /dev, use the following
command:

find  /dev  -type  f

This list will contain files that should never appear in /dev, that is,
ordinary files.  Occasionally, a core file or other unexpected files may also
be discovered in the /dev directory.
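
Since the size of a stray file tells you how much space you will get back,
a small variation of the find command can show a long listing for each hit.
This is a minimal sketch; the function name list_stray_files is made up for
illustration.

```shell
# List ordinary files under a device directory, with their sizes,
# so that large mistyped "tape" files stand out.
list_stray_files() {
    dir=${1:-/dev}
    find "$dir" -type f -exec ls -l {} \;
}
```

Run it as "list_stray_files /dev" and check the size column before removing
anything.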

Managing /tmp and /var/tmp
--------------------------

/tmp is one of those directories where everyone has access but few seem to
treat it with respect.  /tmp is defined as a temporary storage area and is
not to be considered permanent.  Processes like email or the vi editor use
the /tmp directory for files, but normal operations will clean up afterwards
and not leave files in the /tmp directory.  Some HP programs will leave
their logfile in /tmp, but this is considered correct practice in
that the logfile should be reviewed for errors, and then removed or
archived.

There is a caveat that customers who are running older versions of Omniback
should be aware of.  Older versions of Omniback require that a file called
CRS.pid exist in the /tmp directory; otherwise Omniback will not work.
Newer versions of Omniback, e.g. Omniback II, put the file in
/var/opt/omni/tmp/CRS.pid, so this is no longer a concern in the /tmp
filesystem.

One way to enforce cleanup of the /tmp directory is to use a command such
as:

find  /tmp -type f -atime  +14  -exec  rm {} \;

which will remove any files in /tmp (or files in directories below /tmp)
that have not been accessed in more than 14 days.  The other temporary
storage area, /var/tmp, is less often abused by users since they overlook
its existence.  Again, some processes will create temporary files in
/var/tmp and should (if they terminate correctly) remove their files;
editors like vi use /var/preserve.  This command
will clear /var/tmp of files not accessed in more than 7 days:

find  /var/tmp  -type f  -atime  +7  -exec  rm {} \;

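System administrators need to decide if /tmp should allow regularly accessed
files to stay in /tmp.  A user might bypass the above tests simply by reading
the files now and then, which refreshes the access time.  In this case, change
the -atime option to -mtime, which means that the file must have been modified
recently to survive.  (The touch command resets both times by default, so
neither test stops a determined user.)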

Once in a while, you may need to check for old directories which are not
removed by the above command.  The contents will be cleared but after a
while, /tmp may get cluttered with empty directories.

Here’s a possibility for files in directories.  This combination purges
files that are older than 7 days, followed by a removal of the directory
if it hasn’t been updated for 7 days.   However, a simple rmdir is used
so the command will fail if the directory isn’t empty.  Thus, until all
files have been removed, the directory will stay.

find /tmp -type f -atime +7 -print -exec rm -f {} \;
find /tmp -type d -atime +7 -print -exec rmdir {} \;
find /var/tmp -type f -atime +7 -print -exec rm -f {} \;
find /var/tmp -type d -atime +7 -print -exec rmdir {} \;
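
Before turning any of these commands loose from cron, it may be worth
running a listing-only version to see what would be removed.  Here is a
minimal sketch; the function name list_stale_files is made up, and nothing
is deleted.

```shell
# Print the files under a directory that have not been accessed in more
# than the given number of days.  Nothing is removed.
list_stale_files() {
    dir=$1
    days=$2
    find "$dir" -type f -atime +"$days" -print
}
```

For example, "list_stale_files /tmp 14" shows the candidates for the 14 day
cleanup.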

Another common practice is to clean up /var/tmp after every full backup.

Check /home/ftp
---------------

Although /home is one of those directories that can grow unexpectedly (from
individual users creating lots of files), there may be an ftp directory,
also known as anonymous ftp.  This directory allows users from the network
to send/receive files without having specific logins to the system and this
can lead to the appearance of large files unexpectedly.  To check on it,
use:

du -k /home/ftp

Big numbers (more than 10000) might mean that someone on the net is storing
large files…this can be prevented by changing the permissions on the
/usr/share/lib/pub directory from 777 to 755.  The rest of the (standard) ftp
directories are set to 755 already.  Anonymous ftp can be set up using SAM,
although finding the option is tricky…for 8.0x systems:

select:

Networks/Communications ->
LAN Hardware and Software (Cards and Services) ->
ARPA Services Configuration ->
Create Public Account for File Transfers …

For 9.0x systems:

select:

Networking/Communications->
Services:  Enable/Disable
Anonymous FTP    Disabled   Public account file transfer capability

For 10.x systems:

select:

Networking/Communications->
Network Services
Anonymous FTP    Disabled   Public account file transfer capability

By pressing Return, Anonymous FTP will be highlighted and you can then select
the Action menu by:

pressing f2 (label=Alt) and then the letter a

or

pressing f4 (label=Menubar) and moving the menu to the right using
the arrow keys.  Select Enable or Disable as appropriate.

Where to remove filesets
------------------------

Starting with HP-UX version 8.0, the ability to remove unneeded
filesets or applications has been provided through the program rmfn.  At
10.x, the utility is called freedisk(1M); see the man page for more details.
You can run this from SAM by going to the Routine Tasks section, which, by
the way, is where you can find other helpful filesystem full utilities.
Another tool is the cleanup command, provided by a patch for older 10.xx
systems.

Why can’t I just remove the program?  Well, in the good old days of
computers, a simple program was just one item or at most, one directory and
therefore easy to remove.  But that was then and this is now; today, programs
are stored in various pieces all over the filesystem.  Things like rc files
for local configs, X/Window resources in an app-defaults file, man pages for
documentation, commands needed only for the administrator and other commands
for general use are all part of a single application program.

To track all these items, the swinstall and swremove programs make use of
indexes kept in the /var/adm/sw directory.  Additionally, dependencies between
files in different filesets are tracked, which prevents misloading portions of
file groupings that would not be fully functional as a whole.  While many third
party suppliers of software use the swinstall program, many do not, so you will
have to refer to your supplier’s documentation for space management.

Filesets that might be removed?
===============================

man pages
---------

The documentation pages (man pages) can occupy from 6 to 20 megabytes.  Once
the man pages are removed, the man command will no longer find any online
help files but this can save a lot of disk space, especially in a system
where little or no program development takes place.

Another alternative is to remove just the directories in /usr/share/man
that start with the letters ‘cat’.  These directories, when they exist,
provide a location for formatted help pages.  When running the man
program, you may see the message:

formatting…please wait

which comes from the man program as it turns the help pages into a readable
format.

If the /usr/share/man/cat* directories exist, the finished pages are saved,
thereby avoiding the delay when the same page is requested in the future.  A
fully formatted set of man pages may be about 20 megabytes larger than the
unformatted pages.  If users are not annoyed with this delay, removing the
/usr/share/man/cat* directories can save an average of 10 megabytes.

Here’s a tip:  most users need only the section 1 or section 1m commands for
day to day operations.  As the system administrator, you may find that this
space (approximately 3-4 megs) is well worth the time saved, as long as it
doesn’t grow any larger.  There is a command called catman that can format
complete sections (1 and 1m are the basic HP-UX commands for all users and
system administrators, respectively) and by removing all cat* directories
except:

/usr/share/man/cat1
/usr/share/man/cat1m

then just the pages for these commands can be formatted at one time (I would
suggest doing this overnight) by using the command: catman 11m.  Now, all the
man page requests for sections 1 and 1m pop up immediately, yet the disk
space will not grow as man pages from other sections are referenced (they
will still be formatted on each occurrence).

Another technique is to remove the pages that have not been accessed in the
last n days where n might be 15 or 30, whatever value fits your site.  A
weekly cron job can be started that searches for formatted pages in the
/usr/share/man/cat* directories, finding the files that are older than the
specified time.  Check the find command for time stamp options and use the
-exec option to do an rm of the file(s).
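
The weekly cleanup described above might look something like the following
sketch.  The function name purge_cat_pages and the 30 day default are
assumptions for illustration; pick the value of n that fits your site.

```shell
# Remove formatted man pages that have not been read in more than
# $2 days from the cat* directories under the man root $1.
purge_cat_pages() {
    manroot=${1:-/usr/share/man}
    days=${2:-30}
    for catdir in "$manroot"/cat*; do
        # Skip the unexpanded pattern when no cat* directory exists.
        [ -d "$catdir" ] || continue
        find "$catdir" -type f -atime +"$days" -exec rm -f {} \;
    done
}
```

Only the cat* directories are touched; the unformatted pages in the man*
directories stay in place, so a removed page is simply reformatted the next
time someone asks for it.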

Still another technique is to make the man pages a remote (NFS) directory on
another system.  By making /usr/share/man reside on a single system, dozens of
megabytes of duplicate pages can be eliminated.

NLS files
---------

Native language support is another area that can be trimmed from systems that
do not require language support other than English.  Some files can be quite
large for Far Eastern languages where a complex character set (e.g., Kana or
Korean) might be needed.

HP Diagnostics
--------------

These are tricky.  Removing them can save a lot of space, especially on the
700 and 800 systems.  On the other hand, they do provide service people with
detailed information on problems that may occur on the system, along with
detailed logfiles on specific errors.  You may wish to discuss the pros and
cons of removing the diagnostics with your local support people.

As with all HP filesets, they can be reinstalled simply by running the
swinstall program and selecting the required fileset(s). Refer to the
documentation for the Support Plus CDROM (located on the CDROM).  There
is a DIAGNOSTICS directory with .pdf files for documentation.

lost+found directory
--------------------

During an abnormal powerfail or panic of the system, the filesystem will not
be shutdown cleanly and this may require manual intervention with fsck, the
filesystem fixit program.  If fsck is unable to repair files or directories,
rather than delete them, fsck will ask if you wish to fix the problem.  If
you answer yes, then the inode (a pointer to files or directories) may be
moved to the lost+found directory and given a name that is actually the inode
number.

The space represented by these entries in lost+found might have been
temporary files that were deleted but their free space was not recorded on
the filesystem, or just ordinary files and/or directories that have lost
their names or their connection with the rest of the directories.  In this
case, the system administrator must look at the contents of each file or
directory to determine what, if any, data is to be saved.  Otherwise, these
files will occupy space but serve no purpose, and might lead to creeping file
system growth.

Unmounted disks
---------------

HP-UX connects separate disks into a single filesystem called root
(represented by the / symbol) by making directories do double duty.  A
directory can be changed into a mount point (a logical connection to another
disk’s filesystem) by using the mount command.  The file /etc/fstab can
also accomplish this indirectly in that the mount command reads /etc/fstab for
guidance on where to mount disks.

The curious thing is that unmounting a disk returns the mount point directory
to a local status and files stored in that directory will return.  These
files became dormant after the mount command changed the use of the directory
into a mount point; that is, the files on the root disk exist, but cannot be
seen because the mounted disk ‘overlays’ the mount point directory.

If a mounted disk is unmounted, the files in the root directory again become
visible, and this can lead to a common error:

1. Someone notices the files are missing and starts to load the files
back from tape.

2. The files are reloaded (or a portion of the files are loaded) and
someone notices that the root filesystem is full or close to full.

3. Someone types the bdf command and discovers that the second disk is
not mounted…and mounts it.  Now the files are back to what they
were, but the root filesystem is still almost full.

What happened is that the directory was not in use as a mount point, but there
are no red flags showing this condition.  That’s why the bdf command is so
important:  the mounted filesystems are shown to the right of the listing.

Here’s a tip:  Bring the system down into single user mode by typing
shutdown 0 and once the shell prompt shows up (you won’t have to log in),
find all your mount points by examining the /etc/fstab file.  The second
field on each line is the mountpoint directory.

Now, check each directory to see that it is empty.  If not, you will need to
clean up the directory as needed, since no other disks are mounted in
single user mode except the root filesystem.  Now issue the following command
for every mountpoint:

touch /mount_point/IamNOTmounted

This will create a zero-length file in the mountpoint directory to serve as a
reminder that this is a mountpoint and not a general use directory.  When
disks are mounted, this file disappears from view; when a disk is unmounted,
the file comes back as a reminder not to run for the backup tapes.
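
With more than a few mount points, the touch step can be driven from the
fstab file itself.  The following is a minimal sketch, not a tested HP
utility: the function name mark_mount_points and its optional second
argument (a directory prefix, handy for a trial run on a copy) are made up
for illustration, and it assumes the usual layout where the second field is
the directory and the third is the filesystem type.

```shell
# Create the IamNOTmounted marker in every mount point listed in an
# fstab-style file.  Run this in single user mode, with only / mounted.
# $1 = fstab file (default /etc/fstab)
# $2 = optional prefix prepended to each directory (for a trial run)
mark_mount_points() {
    fstab=${1:-/etc/fstab}
    prefix=${2:-}
    # Field 2 is the mount point directory.  Skip comments, the root
    # filesystem itself, swap entries, and anything that is not an
    # absolute path.
    awk '$1 !~ /^#/ && $2 ~ /^\// && $2 != "/" && $3 != "swap" { print $2 }' "$fstab" |
    while read -r mp; do
        if [ -d "$prefix$mp" ]; then
            touch "$prefix$mp/IamNOTmounted"
        fi
    done
}
```

Directories that do not exist are skipped rather than created, so a typo in
fstab does not leave a new directory behind.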

Locations of other big files
============================

It’s important *NOT* to look for big files as a way to clean up disk space.
Two reasons:

-- Often, a big file (especially in HP-UX directories) may be important, and
its use may not be obvious to a new sysadmin.

-- Big files may not be the problem.  There may be thousands of little
files that were accidentally created by a runaway script or program.

So, look for big directories.  Big directories are easily located with the
du command.  Here’s a quick way to locate the biggest directories in /:

du -kx / | sort -rn > /var/tmp/du.list

Note that the -k option is not documented in 10.x (but works) and will
provide the listing in 1K units rather than blocks (512 bytes).  Here’s
an example:

40396   /
15249   /etc
14474   /sbin
10494   /root
8666    /etc/lp
8557    /etc/lp/interface
6458    /etc/lp/interface/model.orig
3823    /root/tmp
3228    /sbin/fs
1682    /etc/lvmconf

This is fairly typical for directories on a 10.20 system.  Notice that
/etc is the largest, followed by /sbin.  However, if you have non-HP-UX
directories in the root filesystem, they must be moved as they do not
belong in this static (and very important) filesystem.  They should be
moved to another logical volume and, if necessary, a symbolic link can be
used to keep the /filesystem_name in place.

For example:

140630   /
100234  /cadcam
15249   /etc
14474   /sbin
10494   /root
8666    /etc/lp
8557    /etc/lp/interface
6458    /etc/lp/interface/model.orig
3823    /root/tmp
3228    /sbin/fs
1682    /etc/lvmconf

In the above case, the directory /cadcam is very large and certainly
does not belong in the / directory.  There is an extra disk with various
directories on it, so create a new directory for cadcam by moving the
directory and files to the new disk, removing the old directory and
files, then creating a symlink:

mkdir /extra/cadcam
cd /cadcam
find . | cpio -pdlmv /extra/cadcam
cd $HOME
rm -rf /cadcam
ln -s /extra/cadcam /cadcam
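
Before the rm -rf step, it is worth confirming that the copy is complete.
One simple check, sketched here with made-up function names, is to compare
the number of files and directories on both sides.  A matching count does
not prove the contents are identical, but a mismatch is a clear sign to
stop.

```shell
# Count every file and directory below a directory.
count_entries() {
    ( cd "$1" && find . -print | wc -l | tr -d ' ' )
}

# Succeeds when source and copy hold the same number of entries.
same_entry_count() {
    [ "$(count_entries "$1")" -eq "$(count_entries "$2")" ]
}
```

For example, "same_entry_count /cadcam /extra/cadcam && echo counts match"
should print the message before the old directory is removed.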

An alternative is to simply create a completely new cadcam volume on
a new disk.  Mount it under a temporary mountpoint and then copy the
files.  Once the copy is verified (ls -R to count files and directories)
then you remove the old files and mount the new volume at the /cadcam
directory:

mount /dev/<new_volume>  /mnt
cd /cadcam
find . | cpio -pdlmv /mnt
cd $HOME
rm -rf /cadcam/*
umount /mnt
mount /dev/<new_volume>  /cadcam

And the files are back.  If you skip the rm -rf step, then the old disk
space will not be recovered and will be invisible as long as /cadcam is
used as a mountpoint.  Normally, files and directories become invisible
when a directory is used as a mountpoint.  However, even when mounted,
any files below the mountpoint on the original disk can be seen with a
command called ncheck.  Here is an example:

# mkdir /xyz
# touch /xyz/testfile
# mount /dev/dsk/cdrom /xyz
# ls /xyz
ABC.TXT;1 DEFGHI;1 ….

# bdf /
Filesystem          kbytes    used   avail %used Mounted on
/dev/vg00/lvol3      99669   43283   46419   48% /

# ncheck /dev/vg00/lvol3 | grep /xyz
/xyz/testfile

So even though there is a CDROM mounted on top of the mountpoint for
/xyz, there is an invisible file (/xyz/testfile) underneath the
mountpoint.  ncheck *requires* the device file rather than a mountpoint
and will list every file on that volume.

Thus, if you had just copied some files to a new volume, then mounted
the new volume and noticed no reduction in disk usage by copying the
files, ncheck would show that the files still exist under the mounted
volume.  When those old files are removed, the disk space will be
reclaimed.  It is a good idea to run an ncheck test for the various
mountpoints just to see if hidden files are still lurking under a
mountpoint.

Data collection files
---------------------

HP’s PerfView Analyzer collects information about system performance and
tasks.  These files will be found in /var/opt/perf/datafiles.  There are 5
in total and all start with log.  The log files are limited in growth by
values in /var/opt/perf/parm, usually 10 meg for all except the process log,
which is set to 20 meg.  The status files are in /var/opt/perf.  Use the du
command to quickly check the size of these directories.  Data collection
files can grow rapidly in size if collection limits are not set.

core files:
-----------

Let’s start with the files that seem to show up everywhere:  core, a.out and
*.o files.  A core file is produced in the current working directory whenever
a program is terminated abnormally, typically through some sort of error
condition not anticipated by the program, or to a lesser degree, by receiving
certain signals.

While these core files might be useful to a programmer who is designing or
supporting a particular program, the files are generally wasted space and can
be removed.  core files can be a few thousand bytes or up to many megabytes
in size.  Here is a command that might be added to a cron entry to remove
core files on a regular basis:

find  /  -name  "core"  -exec rm {} \;

Is there a way to prevent core files from being created?  There are two
ways to do this.
The core file creation process is part of the kernel and it simply takes
everything in the program’s memory and writes it as a file named ‘core’ in
the current directory.  One way to prevent this from happening is to do the
following:

cd  <to_someplace_where_core_files_shouldn't_be>
touch core
chmod 0 core
chown root core

Now, core files can’t be created because the file has no permissions for
any user (except the superuser) and the file can’t be changed by the user
since it’s owned by root.  It is a zero-length file so it occupies no space.
And to keep cron from sending messages about not being able to remove a
directory called core, and to leave the zero-length placeholder files alone,
change the above find command to:

find  /  -name  "core" -type f -size +0  -exec rm {} \;

However, the easier technique is to use the POSIX shell builtin called
ulimit.  Normally, ulimit is a programming interface (section 2 of the
man pages) but with ksh and sh-posix, this is much easier to change at
the command line.  ksh does not have the full complement of options (it
only supports the maximum file size that can be created), while POSIX
shell has almost a dozen options.  To limit core files to zero, use:

ulimit -c 0

From now on, any process that is started from this shell prompt will
produce a zero-length core file.  Try this (with the POSIX shell)
to prove it:

ulimit -c 0
sleep 500 &
ps
<find the PID>
kill -3 <PID_for_sleep>

This will produce a zero-length core file.

a.out and *.o files
-------------------

Other files that are commonly left over from programming efforts are a.out
and files ending with .o (which are compiled but unlinked files) and these
are often left in various places by busy programmers.  A polite way to notify
users about these files is to send an email message to everyone with a list
of the files:

find /home -name "a.out" -print > aout.list

find /home -name "*.o" -print > o.list

Then mail these lists to the less than tidy programmers to clean up
their disk space.  If this effort is unsuccessful, the previous
corefile-remover command can be modified from the name “core” to the
names “a.out” although you may wish to add an aging option to the find
command such that only files more than 30 days old are removed.  Note
that you should not remove any *.o files in application (such as Oracle)
or HP-UX directories such as langtools.
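
Here is what the aging option might look like, as a minimal sketch.  The
function name remove_old_aout is made up, and the 30 day default simply
follows the suggestion above.  Only files named a.out are touched; *.o
files are left alone for the reasons just given.

```shell
# Remove a.out files under a directory that were last modified more
# than the given number of days ago.
remove_old_aout() {
    dir=${1:-/home}
    days=${2:-30}
    find "$dir" -name a.out -type f -mtime +"$days" -exec rm -f {} \;
}
```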

HP-UX Spooler
-------------

The directory to check is: /var/spool/lp/request.  There
will be a directory for every printer and large files in those directories
may indicate a spooler problem if they are more than a few minutes old.

Printers may become disabled, causing the request directories to fill.
Also, administrators may change and forget to remove test printers which
don’t really exist.  Verify that the report from lpstat -v and the
directories in /var/spool/lp/request are the same.  The lpstat -v listing
shows the printers known to the spooler so if there are other directories
(or files) in /var/spool/lp/request, then they don’t belong there.
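
The comparison can be sketched in the shell as follows.  Both function
names are made up, and the sed pattern assumes that lpstat -v prints one
line per printer of the form "device for NAME: ..."; check the output on
your own system before relying on it.

```shell
# Pull the printer names out of lpstat -v output read from standard
# input.  Assumes lines of the form "device for NAME: /dev/...".
printer_names() {
    sed -n 's/^device for \([^:]*\):.*/\1/p'
}

# Print the entries in the request directory that do not match any
# printer name read from standard input (one name per line).
orphan_request_dirs() {
    reqdir=${1:-/var/spool/lp/request}
    known=$(cat)
    for d in "$reqdir"/*; do
        [ -e "$d" ] || continue
        name=${d##*/}
        # -x matches whole lines only, -F treats the name literally.
        echo "$known" | grep -qxF -e "$name" || echo "$d"
    done
}
```

A typical run would be "lpstat -v | printer_names | orphan_request_dirs".
Anything it prints deserves a closer look before removal.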

Also check the log files in /var/adm/lp/log…they are optional but when
started, they will grow without bounds.  The files are log, oldlog and
lpd.log. lpsched -v is used to start lp logging, although HP’s JetDirect
software always logs information from the interface scripts.

UUCP
----

Another area to check is uucp’s directories.  Like the lp spooler, this
directory can be very dynamic, holding traffic for other nodes or simply the
repository for various files sent or received using uucp.  The directory is:

/var/uucp

and places to look are: Admin., typically the audit file will grow depending
on traffic; Log. where all the logfiles are kept and then the directories
that are the names of remote machines allowed access to this computer.

MAIL
----

Finally, /var/mail is a place where unexpected bursts of growth can occur.
Very large files can be easily (too easily) emailed to users on the system
and this directory can grow quite rapidly.  Sending a big file to everyone
in a distribution list will cause multiple copies of the same file to be
placed in everyone’s mailfile, thus growing /var/mail rapidly. Note that
quotas on /var/mail don’t work too well as the delivery agent will treat
a quota problem as a soft error. This can cause the email delivery to
retry 16 to 32 times and if the mail is fairly large, it will be quite
costly in network traffic. Use nagware (automated email messages) to prompt
a user about overly large email files.
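
A first step towards such nagware is simply finding the oversized
mailfiles.  This is a minimal sketch; the function name big_mailboxes and
the 5 meg default limit are assumptions for illustration.

```shell
# List mail files larger than a limit given in Kbytes.
# find -size counts in 512-byte blocks, so the limit is doubled.
big_mailboxes() {
    maildir=${1:-/var/mail}
    limit_kb=${2:-5120}
    find "$maildir" -type f -size +"$(( limit_kb * 2 ))" -print
}
```

Since each mailfile is named after its user, the output of
"big_mailboxes /var/mail 5120" also tells you who should get the reminder.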

Other interesting files
-----------------------

There is an interesting directory called /usr/newconfig and it contains some
very useful files, namely, the unedited (known to work) custom files such as
gettydefs, inittab, passwd and so on.  If one of these critical files in
the /etc directory becomes corrupt (e.g., inittab), a known-to-work version can
be copied from /usr/newconfig which will then allow the system to get back
online.

/users or /home directory
-------------------------

So what about users that are abusing their filespace with test files or other
unnecessary data?  First, where are these files?  The simplest answer is to
check the /home directories, and a good way to do this is with the du
command.  Here is a sample:

du  /home
12         /home/rst
480        /home/wpw
2          /home/jes
3308       /home/djs/nova-files
10442      /home/djs
2          /home/mda
6          /home/jws
2          /home/gfm
2          /home/gedu
12         /home/jam
12         /home/blh
11016      /home

The numbers on the left are in blocks, or 512 byte units of measure.  These
values are not as meaningful as Kbytes or Mbytes so you can use -k (which is
not documented but works in 10.x).  The du command shows the diskspace usage in
directories, not individual files, and this is the first step towards tracking
down disk space problems.

Now, you’ll notice that some directories are not really very interesting, such
as /home/mda (2 blocks or 1 Kbyte), so how can we limit the list to
interesting numbers, such as directories larger than 5 megabytes?

Well, the du command uses left justified numbers, so the standard sort
command won’t produce the desired result.  Use du piped to sort -r -n for a
listing of the largest directories first.  The grep command can be more
useful in this case; here’s an example:

du /home | sort -r -n | grep '^....[0-9]'

The sort command says:  sort in reverse order, numbers which are
left-justified.  The grep pattern says:  starting at column 1 (^), skip 4
columns (dots are don’t-care positions), then match only when the 5th column
contains a numeric character ([0-9]).  Only sizes of five or more digits
pass, which means 10000 blocks (about 5 Mbytes) or more.  With du -k the
same pattern selects 10000 Kbytes (about 10 Mbytes) or more.  So, the above
command applied to our example above produces:

du /home | sort -r -n | grep '^....[0-9]'
11016      /home
10442      /home/djs

which is certainly easier to read.  Now it is obvious that /home/djs is the
largest user in /home (10442 blocks, which is 5,221 Kbytes or approximately
5 Mbytes).  Another option is to create some simple scripts to show
directories in megs:

#!/usr/bin/sh
# Usage:
#     dus <starting-directory>
#
# Show usage in directories measured in megabytes (directories smaller
#      than about 50 Kbytes display as 0.0)
#
# Measurement is displayed in megabytes (1024*1024).
#   The Kbyte from du -k value is divided by 1024.
#
# With no argument, start at the current directory.
du -kx "${1:-.}" | awk '{printf "%10.1f  %s\n", $1/1024, $2}'

(the -k is not documented for 10.20; it reports size in Kbytes)
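
Counting columns with grep only allows thresholds that are powers of ten.
As an alternative sketch, awk can compare the size numerically against any
limit. The function name over_kb is made up for this example; it reads
du -k style lines on standard input, so it can be tried on saved output as
well.

```shell
# over_kb MIN_KB
# Read "Kbytes path" lines (as printed by du -k) on standard input, keep
# the entries of at least MIN_KB Kbytes, and print them largest first as
# "Mbytes  path".
over_kb() {
    awk -v min="$1" '{
        size = $1 + 0                        # force a numeric comparison
        if (size < min + 0) next
        sub(/^[ \t]*[0-9]+[ \t]+/, "")       # leave only the path in $0
        printf "%10.1f  %s\n", size / 1024, $0
    }' | sort -nr
}
```

For example, du -kx /home | over_kb 5120 would show only the directories of
5 Mbytes or more. Stripping the size from the line, instead of printing $2,
keeps directory names that contain spaces intact.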

Once an unexpectedly large directory has been found, you can list the files
in size order with:

ll | sort -nrk 5 | more

This shows the files sorted by number of bytes, largest first.  Here’s a
script called lls (long listing sorted) which sorts the files by size:

#!/usr/bin/sh
#
# Usage: lls [optional directory or file spec]
#
#  Long listing sorted
/usr/bin/ll -aHF "$@" | sort -nr -k 5 | more

As an example:

lls /tmp
-rw-r--r--   1 root     other    2109440 May 17 15:46 blh.tar
-rw-r--r--   1 root     other     316916 Jun  2 00:36 foo2
-rw-rw-rw-   1 root     other     260619 Mar  9 05:03 catalog.hp
-rw-r--r--   1 root     other     242044 Sep 24  1994 shoe1.tif
-rw-r--r--   1 root     other     190009 Jan 21  1995 cop_man.ps
-rw-r--r--   1 root     sys       124891 Jan 29  1994 update.log2
-rw-r--r--   1 root     sys        79228 Jan 29  1994 update.log1
-rw-r--r--   1 root     other      48998 May 24 18:11 newsrc.orig2.
-rw-r--r--   1 root     other      48998 May 23 23:33 newsrc.orig.
-rw-r--r--   1 root     other      46514 Aug 28 16:51 oldnewsrc.
-rw-r--r--   1 root     other      46514 Aug 21 15:10 newsrc.
-rw-r--r--   1 root     other      39525 Jan 21  1995 cop-user.sam
-rw-r--r--   1 root     other      22088 Sep 11  1994 tif.
-rwxr-xr-x   1 root     other      20480 Nov 22  1994 set_disp*
-rw-r-----   1 root     other      20131 Mar 19 13:49 gtest
-rw-------   1 root     other      17246 Aug 21 17:08 gpm.
-rw-rw-rw-   1 root     other      15185 Jul 24 12:02 stm.log
-rw-------   1 root     other      12949 Aug 23 15:39 netscape-history.

What about looking for big files rather than directories?  The find command
has an option that will search by file size, in blocks or characters.  For
instance, to locate all files that are greater than 1 Mbyte, the following
commands will work:

find / -size +2000 | pg

find / -size +1000000c | pg

where the first form specifies 2000 blocks (2000 x 512 bytes = 1 Mbyte apx)
and the second form will find files that are greater than 1,000,000 bytes.
In the man page for find, the use of + to mean more than (and - to mean
less than) is documented at the beginning of the section.  You may wish to
change the output from a pipe into pg (or the more command) to a redirect
into a file, as in:

find / -size +2000 > /tmp/bigfiles
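
When only one filesystem is full, searching from / also walks every other
mounted filesystem.  The sketch below restricts the search with the -xdev
option of find and sorts the hits by size.  The function name big_files is
an assumption for this example, and the sort on field 5 assumes that owner
and group names contain no spaces.

```shell
# big_files [DIR] [BLOCKS]
# List regular files under DIR (default /) that are larger than BLOCKS
# 512-byte blocks (default 2000, roughly 1 Mbyte), largest first, without
# crossing into other mounted filesystems.
big_files() {
    find "${1:-/}" -xdev -type f -size +"${2:-2000}" -exec ls -l {} \; |
        sort -nr -k 5
}
```

For example, big_files /var 20000 > /tmp/bigfiles would record the files of
roughly 10 Mbytes or more that live in the /var filesystem itself (write the
result to a filesystem that still has space).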

Some files need to stay, for example, /stand/vmunix and
/stand/vmunix.prev are usually larger than 1 Mbyte but don’t remove
them!  The system will have a very difficult time rebooting when these
files are removed (as some new system managers or well-intentioned users
may have already discovered).  These are the most important files needed
for the system to boot.

Logfiles – information and lots of space!
—————————————–

Many logs are kept in HP-UX systems and the majority grow without bounds,
which can generate the infamous “file system full” message.  The root
filesystem is by far the most critical in that many HP-UX processes depend on
having some space available, including space for logfiles.  Many of these
logfiles are optional and are not created in a default system, but there are
several that do exist and should be monitored.

Some very common ones that can grow quickly are:

/var/adm/syslog/syslog.log      (network and system logs)
/var/adm/syslog/mail.log        (email logs)
/var/adm/diag/LOG*              (diagnostic logs)
/var/adm/wtmp                   (login/logout, etc)
/var/adm/btmp                   (failed login attempts)
/var/adm/lp/log                 (lp log)

In general, most system logs are kept in /var with most being in /var/adm,
but as with all HP-UX commands, there are exceptions, such as /etc/rc.log
and /etc/shutdownlog.

One of the big logfile makers is the optional system monitor program
PerfView Analyzer, which relies on MeasureWare to log computer activity.
Depending on the settings used to quantify ‘interesting’ processes,
the logs may grow very rapidly, and the status files can grow without bound.
The /var/opt/perf/parm file sets the size limits of the data files:

size global = 10.0, application = 10.0, process = 20.0, device=10.0

The size for the above settings is in megabytes.
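
Adding the four limits above gives 10 + 10 + 20 + 10 = 50 megabytes as the
space to budget for these data files.  The sketch below does the same sum
for whatever size line a parm file contains.  It assumes the line has the
name = value layout shown above, and it says nothing about the status files,
which are not limited by this setting.  The function name is made up for
this example.

```shell
# parm_size_total [PARM_FILE]
# Add up the limits on the "size" line of a parm file and print the total
# in megabytes.  Assumption: the line looks like
#   size global = 10.0, application = 10.0, process = 20.0, device=10.0
parm_size_total() {
    awk 'tolower($1) == "size" {
        sub(/^[ \t]*[A-Za-z]+/, "")          # drop the leading keyword
        n = split($0, parts, ",")            # one "name = value" per part
        for (i = 1; i <= n; i++) {
            split(parts[i], kv, "=")
            total += kv[2]                   # value is converted to a number
        }
    }
    END { printf "%.1f\n", total }' "${1:-/var/opt/perf/parm}"
}
```

With the sample line above, parm_size_total prints 50.0.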

The following list shows HP-UX related log files.  There might be others,
but most should be in /var.  SAM has a very good log file trimming utility,
but it may not work if /var is full, so you may need to go through the
following list manually and trim some files.

/var/adm/sw/swcopy.log
/var/adm/sw/swagentd.log
/var/adm/sw/swagent.log
/var/adm/sw/swinstall.log
/var/adm/sw/patch/PATCH.log
/var/adm/sw/swmodify.log
/var/adm/sw/swremove.log
/var/adm/sw/swpackage.log
/var/adm/sw/swconfig.log
/var/adm/sw/swverify.log
/var/adm/sw/swreg.log
/var/adm/cron/log
/var/adm/cron/OLDlog
/var/adm/syslog
/var/adm/syslog/mail.log
/var/adm/syslog/syslog.log
/var/adm/syslog/OLDsyslog.log
/var/adm/syslog/mail.logSAMTRM
/var/adm/lp/log
/var/adm/lp/oldlog
/var/adm/ptydaemonlog
/var/adm/OLDsulog
/var/adm/rpc.statd.log
/var/adm/rpc.lockd.log
/var/adm/automount.log
/var/adm/vtdaemonlog
/var/adm/shutdownlog
/var/adm/rc.log
/var/adm/sulog
/var/tmp/swagent.log
/var/tmp/gatherftp.log
/var/tmp/sam_remove.log
/var/spool/lp/lpd.log
/var/spool/lp/lpana.log
/var/spool/lp/log
/var/spool/sw/swagent.log
/var/ppl/log
/var/opt/sharedprint/errorlog.test
/var/opt/dce/config/dce_config.log
/var/opt/dce/rpc/rpcd.log
/var/opt/dde/dde_error_log
/var/opt/hppak/hppak_error_log
/var/opt/perf/datafiles/logindx
/var/opt/perf/datafiles/logglob
/var/opt/perf/datafiles/logappl
/var/opt/perf/datafiles/logproc
/var/opt/perf/datafiles/logdev
/var/sam/log/samlog
/var/sam/log/samlog.old
/var/sam/log/br_log
/var/adm/wtmp
/var/adm/btmp
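
Checking that many files by hand is tedious.  As a sketch, assuming the list
above has been saved one path per line in a file (the name /tmp/loglist used
below is only an example), the following function reports which of the logs
actually exist and how big they are, largest first.  It relies on the size
being field 5 of ls -l output, as in the listings earlier.

```shell
# log_sizes
# Read log file paths on standard input, one per line.  For each path that
# is an existing regular file, print its size in Kbytes (rounded up) and
# its name, largest first.  Missing files and directories are skipped.
log_sizes() {
    while read -r f; do
        [ -f "$f" ] && ls -l "$f"
    done |
        sort -nr -k 5 |
        awk '{ printf "%8d  %s\n", int(($5 + 1023) / 1024), $NF }'
}
```

Usage would be log_sizes < /tmp/loglist; the first few lines of output are
the best candidates for trimming.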

wtmp needs to be zeroed out using the following:

cat /dev/null > /var/adm/wtmp

or

> /var/adm/wtmp

Unlike recreating the file with rm, touch, chmod, chown and chgrp, the
cat /dev/null technique retains all the characteristics of the old file.
Note that zeroing /var/adm/wtmp on a running system may cause errors to be
reported from the who command.  These errors are caused by who not finding
the users currently logged in.  The best way to trim /var/adm/wtmp is to do
it in single user mode.  Do not zero /etc/utmp; this is done automatically
at bootup.  If you do zero the /etc/utmp file, you get error messages that
you must login through the lowest level shell.  The only remedy (since utmp
is gone) is to reboot.
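
For plain text logs such as syslog.log (not wtmp, btmp or utmp, which are
binary files), the same overwrite-in-place idea can keep the most recent
entries instead of discarding everything.  This is only a sketch: the
function name trim_log, the default of 1000 lines and the temporary file
name are assumptions for illustration.

```shell
# trim_log FILE [LINES] [TMPDIR]
# Keep only the last LINES lines (default 1000) of a text log.  The log is
# overwritten in place, never removed, so its owner, group and permissions
# stay as they were.
# TMPDIR (default /tmp) should be on a filesystem that still has room.
trim_log() {
    log=$1
    keep=${2:-1000}
    tmp=${3:-/tmp}/trim_log.$$
    # save the tail first; give up if it cannot be written
    tail -n "$keep" "$log" > "$tmp" || { rm -f "$tmp"; return 1; }
    cat "$tmp" > "$log"
    rm -f "$tmp"
}
```

For example, trim_log /var/adm/syslog/syslog.log 500 would leave the last
500 lines of the syslog in place.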

Also, SAM has been enhanced to perform many of the big file searches and
offers other disk space management tools in HP-UX release 9.0 and above.