Nexcess Blog

Sorting The Output Of du

June 19, 2013

du is a utility that reports the disk usage of files or directories. Usually when people run it, they want the output in a human-readable format like 314M, 2.7K, and 161G, which are a lot easier to read than 314159265, 2718, and 161803398874 respectively.

When people are looking for the directories using the most storage, they usually only care about directories using a gigabyte or more, so they pipe the output of du to grep 'G' to see only those. This works OK, but you might miss a directory that is using 987M of storage, and the grep will also match any directory that has ‘G’ somewhere in its name, which can be annoying.
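To make both shortcomings concrete, here is a small sketch that runs the filter over made-up du-style output instead of a real filesystem. The directory names and sizes are hypothetical, and the anchored pattern is just one possible tightening, not something from the original trick.

```shell
# Hypothetical du -sh style output (size, tab, name).
sample=$(printf '1.1G\tbigdir\n987M\talmost_a_gig\n12K\tGallery\n')

# Naive filter: keeps any line containing "G", including the name "Gallery".
naive=$(printf '%s\n' "$sample" | grep 'G')

# Anchored filter: keeps only lines whose size field ends in G.
anchored=$(printf '%s\n' "$sample" | grep -E '^[0-9.]+G[[:space:]]')

printf 'naive:\n%s\n\nanchored:\n%s\n' "$naive" "$anchored"
```

Anchoring the pattern gets rid of the false match on the name, but the 987M directory is still filtered out, so it only fixes half of the problem.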

After having done the grep 'G' trick enough times and seen its shortcomings, I figured there had to be a better way to do this. I found two: one that is incredibly easy and useful, and a slightly more complex one.

CentOS 6

Pipe the output to sort -h. That’s it! sort can now sort by sizes that are in a human readable format. This was added to sort in GNU coreutils 7.5. See the example below:

$ du -sh * | sort -h
12K	includes
16K	pkginfo
36K	shell
68K	media
188K	errors
964K	var
1.5M	downloader
8.9M	js
8.9M	skin
40M	lib
60M	app
1.1G	bigdir
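If you would rather see the biggest directories first, sort's -r flag reverses the order. The sketch below uses made-up sizes fed in with printf so it runs anywhere a GNU sort with -h is available; the directory names are only illustrative.

```shell
# Hypothetical du -sh output, deliberately out of order.
sizes=$(printf '1.1G\tbigdir\n964K\tvar\n40M\tlib\n12K\tincludes\n')

# -h compares human-readable sizes; -r puts the largest first.
largest_first=$(printf '%s\n' "$sizes" | sort -rh)
printf '%s\n' "$largest_first"

# Against a real directory the same idea would look like:
#   du -sh -- * | sort -rh | head -n 5
```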

CentOS 5 and earlier

The versions of sort that come with CentOS 5 and earlier don’t include the -h flag, so you have to improvise. I’ve seen various methods to deal with this, but the one I like is to force du to use megabytes for all of its output, then pipe it to sort -g. To make du do this, pass the -B flag with a unit suffix: ‘M’ to use megabytes, ‘G’ to use gigabytes, etc. See the output below for an example. du rounds each size up to the next whole megabyte, but the result is much easier to read and sort.

$ du -sBM * | sort -g
1M	errors
1M	includes
1M	media
1M	pkginfo
1M	shell
1M	var
2M	downloader
9M	js
9M	skin
40M	lib
60M	app
1071M	bigdir
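The reason for forcing a single unit is that a numeric sort only looks at the leading number and ignores the suffix. A rough sketch with made-up sizes (the names and values are hypothetical, chosen to mirror the listings above):

```shell
# Hypothetical sizes in mixed units, as du -sh would print them.
mixed=$(printf '964K\tvar\n1.1G\tbigdir\n40M\tlib\n')

# sort -g reads only the leading number, so 1.1G sorts before 40M and 964K.
wrong=$(printf '%s\n' "$mixed" | LC_ALL=C sort -g)

# The same directories in one fixed unit, as du -sBM would print them.
fixed=$(printf '1M\tvar\n1071M\tbigdir\n40M\tlib\n')

# With a common unit, the numeric order is also the size order.
right=$(printf '%s\n' "$fixed" | LC_ALL=C sort -g)

printf 'mixed units:\n%s\n\nfixed unit:\n%s\n' "$wrong" "$right"
```

With mixed units bigdir ends up first even though it is the largest; with everything in megabytes it correctly ends up last.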