Ketil Froyn's blog

Thu, 13 Oct 2011

Recursive selective copy in Linux (like XCOPY /S)

I saw a question online today on how to do a selective recursive copy of files while maintaining the directory structure in the target folder. An example from the Windows world was:

XCOPY /S *.htm foo
which would presumably recursively copy all *.htm files to the foo folder.

I'm not sure about all the detailed workings of running XCOPY /S as shown above, but I can do the same thing on the command line in Linux, and I'm pretty confident that this command is at least as robust and at least as efficient. My suggestion, covering multiple extensions at the same time, would be (skip the end-of-line backslashes and merge to one line if you like):

find /path/to/source/folder -type f -regextype posix-egrep \
     -iregex '.*\.(csv|doc|docx|odb|odm|odp|ods|odt|ots|pdf|ppt|pptx|rtf|txt|xls|xlsx)' -print0 | \
      xargs -0 cp --parents -v -t /path/to/target/folder
to copy a bunch of different types of document files.

The -t parameter to cp specifies the target folder. The --parents parameter specifies that the target should maintain the directory structure from the source. "find -print0 | xargs -0" is a combination that uses null termination for file names rather than the standard newline. Some file names can contain weird quotes, newlines, binary characters, and more, but null termination should handle all that, including malicious file names like "file; rm -rf /; echo" or something like that. In addition, -iregex matches both upper and lower case, which people from the Windows world aren't used to caring about.
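
To see what --parents and the null-terminated pipeline do in practice, here is a small sketch on a throwaway tree. All folder and file names are made up for the demo, and it assumes GNU find, xargs and cp. It runs find from inside the source folder, because cp --parents recreates whatever path it is handed: with an absolute source path, that whole path reappears under the target. The -r flag to xargs is an addition of this sketch, so that cp is not run at all when nothing matches.

```shell
#!/bin/sh
# Hypothetical demo tree; every name below is made up for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub dir" "$demo/dst"
: > "$demo/src/index.HTM"
: > "$demo/src/sub dir/it's a page.htm"
: > "$demo/src/sub dir/notes.txt"

# Run find from inside the source folder so the paths handed to cp are
# relative (./sub dir/...); cp --parents recreates those paths under the target.
(
  cd "$demo/src" &&
  find . -type f -regextype posix-egrep -iregex '.*\.(htm|html)' -print0 |
    xargs -0 -r cp --parents -t "$demo/dst"
)

# List what arrived in the target folder.
(cd "$demo/dst" && find . -type f | sort)
```

Afterwards the target should hold index.HTM and "sub dir/it's a page.htm", with the space and the quote in the name intact, but not notes.txt. Remove the scratch folder with rm -rf "$demo" when done.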

I bet it's possible to write this regexp in a way that uses the standard regextype instead of posix-egrep, but I'm more familiar with the latter so I wrote it like that. Drop the -v flag to cp to avoid all the output.
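
For the record, a sketch of what that could look like, assuming GNU find, whose default regextype is Emacs-style: there the grouping parentheses and the alternation bar need backslashes. The scratch folder and file names are made up, and the extension list is shortened to keep it readable.

```shell
#!/bin/sh
# Hypothetical scratch folder with three made-up files.
demo2=$(mktemp -d)
: > "$demo2/a.PDF"
: > "$demo2/b.txt"
: > "$demo2/c.jpg"

# Default (Emacs-style) regex syntax: \( \) for grouping, \| for alternation.
matches=$(find "$demo2" -type f -iregex '.*\.\(pdf\|txt\)' | sort)

# The same pattern in posix-egrep syntax, for comparison.
egrep_matches=$(find "$demo2" -type f -regextype posix-egrep \
  -iregex '.*\.(pdf|txt)' | sort)

printf '%s\n' "$matches"
```

Both variants should list a.PDF and b.txt and skip c.jpg, so which one to use is mostly a matter of taste.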

posted at: 15:43 | path: /2011/10