2004-08-06

This is for unredeemed hackers who must see the raw bits to be happy.
-- Solaris truss man page

They say that automatic garbage collection is one of the advantages of Java. That doesn't quite get you off the hook completely. True, it saves you from having to delete each new object you create, as one would in C++. However, besides memory there are other resources that are scarce (diskspace, network, database) but those aren't cleaned up automatically as opposed to the objects that represent them. For instance, below is a routine which reads a textfile into a string:

    private String readFile(File file) throws IOException {
        BufferedReader in = null;
        try {
            String line = null;
            StringBuffer contents = new StringBuffer();
            in = new BufferedReader(new FileReader(file));
    
            while ((line = in.readLine()) != null) {
                contents = contents.append(line + "\n");
            }
            return contents.toString();
        } catch (Exception e) {
            return null;
        }
    }

Have you spotted the error? I was in a hurry when I coded this method and I certainly didn't. The error is that the reader needs to be closed:

        finally {
            if (in != null) {
                in.close();
            }
        }

This method isn't heavily used but when there is a maximum of 1024 open files and the application runs a few hours, you'll get messages in your logs stating that you opened Too many files. (By the way, the maximum can be found with ulimit -n if you're using the Bash shell). What is nasty, is that that message doesn't really tell you exactly where you open too many files. Rather than pouring over all that code, I used truss on the Solaris box (on Linux, it's called strace). This command can tell you which system calls an application makes.

System calls are basically the commands that you can give to the kernel. Your favourite programming language may have nice libraries or even have some drag-and-drop widgets, but it all comes down tot asking the kernel whether you can open that file. The kernel will then do the work of checking permissions, checking whether the file is actually there, returning an error if necessary, et cetera. An example of a system call is open, or fork, or something else.

As said before, truss and strace can tell us which system calls an application makes. I was interested in the open and close calls. After I had done a ps to find the process ID (pid) of the running app, I typed

    truss -t open,close -p 7601 -f > trussoutput.txt 2>&1

When the application had run for some time, I interrupted truss, opened it in vi and threw away all pairs of open and close calls. What remained, were the open calls that had no counterpart. Since the arguments to the system calls are also shown, the filename shows up. And that definitely rang a bell; I knew immediately which piece of code opened that file.