til

Today I Learned: collection of notes, tips and tricks and stuff I learn from day to day working with computers and technology as an open source contributor and product manager

File::Find

File::Find is awesome for processing files and directories recursively.

Normally you would just provide a processor function and one or more directories to traverse. This is outlined in the synopsis.

use File::Find;
find(\&wanted, @directories_to_search);
sub wanted { ... }
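To make the synopsis concrete, here is a minimal sketch of a wanted function that collects every regular file below a directory. The scratch tree and the names a.txt, sub/b.txt and @files are my own assumptions for illustration; they are not from the File::Find documentation.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical scratch tree so the example is self-contained
my $root = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($root, 'sub') or die "mkdir: $!";
for my $rel (['a.txt'], ['sub', 'b.txt']) {
    my $path = File::Spec->catfile($root, @$rel);
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

my @files;

sub wanted {
    # By default find has chdir'ed into the directory being examined,
    # so $_ is the bare file name and $File::Find::name is the full path
    push @files, $File::Find::name if -f $_;
}

find(\&wanted, $root);

print "$_\n" for sort @files;
```

The function takes no arguments and its return value is ignored, so everything it needs arrives through $_ and the $File::Find:: package variables.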

It can also preprocess the contents of each directory before examining them, and postprocess each directory just before leaving it.

This requires that you call find with a hash reference of options as the first argument, instead of a plain code reference:

find({ wanted => \&_process, preprocess => \&_preprocess }, @directories_to_search);

I had a hard time getting my head around the preprocessing, but this answer from StackOverflow filled in the blanks.

The preprocess function is expected to return a (possibly modified) list of items to examine further. In your example, you can add @_; at the end of preprocess to return the arguments unmodified. You can do something like grep { $_ !~ /pattern/ } @_ to filter out unwanted items, and so on.

Here follows a basic prototype I did for my yak side-project, based heavily on File::Find.

#!/usr/bin/env perl

use strict;
use warnings;
use v5.10; # say
use File::Find; # find

find({ wanted => \&_process, preprocess => \&_preprocess }, $ARGV[0]);

exit 0;

sub _process {
    say 'we are in process';
}

sub _preprocess {
    say 'we are in preprocess';

    # This is quite important
    # See quote in TIL text
    return @_;
}
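The prototype returns @_ untouched. As a rough sketch of what the quote describes, the variant below filters and sorts the list in _preprocess and also adds a postprocess callback. The scratch tree, the *.bak filter and the @seen and @left arrays are assumptions I made up for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical scratch tree: keep.txt, skip.bak and sub/inner.txt
my $root = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($root, 'sub') or die "mkdir: $!";
for my $rel (['keep.txt'], ['skip.bak'], ['sub', 'inner.txt']) {
    my $path = File::Spec->catfile($root, @$rel);
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

my (@seen, @left);

find({
    wanted      => \&_process,
    preprocess  => \&_preprocess,
    postprocess => \&_postprocess,
}, $root);

print "seen: $_\n" for @seen;
print "left: $_\n" for @left;

sub _process {
    # Skip the '.' entry that represents the starting directory itself
    push @seen, $_ unless $_ eq '.';
}

sub _preprocess {
    # @_ holds the entries of the directory about to be examined;
    # only what is returned here is examined further
    my @keep = grep { $_ !~ /\.bak\z/ } @_;
    return sort @keep;
}

sub _postprocess {
    # Called without arguments just before find leaves a directory;
    # $File::Find::dir holds the name of that directory
    push @left, $File::Find::dir;
}
```

Since skip.bak is removed in _preprocess, _process never sees it. Forgetting the return in _preprocess has the opposite effect: whatever the last statement evaluates to becomes the list find continues with.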

Another interesting aspect is the ability to prune, a term borrowed from the practice of cutting off branches of trees (see Wikipedia). Setting $File::Find::prune to 1 inside the wanted function tells find not to descend into the directory currently being examined.

When you are processing large directory structures this can be quite useful, for example if you do not want to process .git directories.

sub _process {
    # The dot is escaped so that only a directory named exactly ".git" matches
    /^\.git\z/s && ($File::Find::prune = 1);

    # ...
}
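To see the effect of pruning, here is a small sketch that runs against a throwaway tree. The layout (.git/config, src/main.pl, .gitignore) and the @visited array are assumptions of mine, chosen to show that only the directory named exactly .git is cut off.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical scratch tree: .git/config, src/main.pl and .gitignore
my $root = tempdir(CLEANUP => 1);
for my $dir ('.git', 'src') {
    mkdir File::Spec->catdir($root, $dir) or die "mkdir $dir: $!";
}
for my $rel (['.git', 'config'], ['src', 'main.pl'], ['.gitignore']) {
    my $path = File::Spec->catfile($root, @$rel);
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

my @visited;

find(\&_process, $root);

print "$_\n" for sort @visited;

sub _process {
    # Cut off the branch: do not descend into a directory named ".git"
    if ($_ eq '.git' && -d $_) {
        $File::Find::prune = 1;
        return;
    }

    # Skip the '.' entry that represents the starting directory itself
    push @visited, $_ unless $_ eq '.';
}
```

The .gitignore file is still visited, and nothing below .git is. Note that prune only works in the default top-down traversal; it has no effect with finddepth or the bydepth option, because there the contents of a directory are visited before the directory itself.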

Resources and References

  1. StackOverflow: “Find::File preprocess”
  2. Perldoc.org: File::Find