Linux: Copy only certain filetypes with RSYNC from foldertree

As I struggled again with the fiddly include/exclude-syntax of RSYNC, I think it’s worth a note here.

I want to copy only JPG- and PNG-files with their corresponding folder-structure from a foldertree – leaving all the rest where it is.

To explicitly include or exclude folders and files in an rsync command, I can define patterns that are checked one after another against each object in the given tree until one matches.
The catch is that the patterns are not exclusive. If I define --include="*.JPG" --include="*.PNG" WITHOUT a terminating --exclude="*", my includes have virtually no effect. It would just mean: “Explicitly include all JPG and PNG files and implicitly include all the rest too!”
But here’s another catch: with only those include/exclude patterns rsync would not descend into my folder tree. It would stay on the first level, because --exclude="*" is checked against each subfolder too and skips it. To let rsync run through all folders I have to put --include="*/" before the --exclude="*". That means: “Explicitly include all folders!”, so that --exclude="*" is never reached for a folder.

Remember: The include/exclude patterns are checked in the given order, from first to last, for a match. As soon as a matching pattern is found, pattern processing for the current object stops and rsync moves on to the next object. If no pattern matches at all, the object IS NOT SKIPPED but copied over to DEST.
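
To make the “first match wins, no match means include” rule a bit more tangible, here is a toy model in plain shell. It is only a sketch of the logic described above, not how rsync is implemented: the function name decide and the "+PATTERN" / "-PATTERN" notation for the rules are my own, each name is treated as a single path component, and folders are written with a trailing slash.

```shell
#!/bin/sh
# Toy model of the rule evaluation (hypothetical helper, not part of rsync).
# Usage: decide NAME RULE...
#   RULE is "+PATTERN" (include) or "-PATTERN" (exclude).
#   NAME is one path component; folders carry a trailing slash.
decide() {
    name=$1
    shift
    for rule in "$@"; do
        pattern=${rule#?}            # the rule without its leading + or -
        case $name in
            $pattern)                # first matching rule decides, then stop
                case $rule in
                    +*) echo include ;;
                    -*) echo exclude ;;
                esac
                return 0
                ;;
        esac
    done
    echo include                     # no rule matched: included by default
}

decide "notes.txt" "+*.JPG" "+*.PNG"               # include (nothing matched)
decide "notes.txt" "+*.JPG" "+*.PNG" "-*"          # exclude
decide "sub/"      "+*.JPG" "+*.PNG" "-*"          # exclude (no descent)
decide "sub/"      "+*/" "+*.JPG" "+*.PNG" "-*"    # include
```

The last two calls show why the folder include has to come before the final exclude.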

So here is my case:

  • I want to copy only the JPG and PNG files, together with their parent folders, from SOURCE to DEST
  • I don’t want to copy the content of the folders thumb, archive, deleted and temp

The resulting rsync-command is this:

rsync -r --exclude="thumb/" --exclude="archive/" --exclude="deleted/" --exclude="temp/" --include="*/" --include="*.JPG" --include="*.PNG" --exclude="*" /usr/people/lampp_htdocs/vrwiki/images/ /usr/people/lampp_htdocs/test/vrclone/images/

Let’s analyse this command in detail:

Patterns ending with a slash (/) match folders only. So I first exclude all folders I don’t want:

--exclude="thumb/" --exclude="archive/" --exclude="deleted/" --exclude="temp/"

Then I include all remaining folders so that rsync goes through my whole tree (the -r option needs to be set as well):

--include="*/"

Then I explicitly include my desired image-files:

--include="*.JPG" --include="*.PNG"

And finally I exclude everything that hasn’t matched so far:

--exclude="*"
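
Before running such a filter chain on real data, it may help to try it on a throwaway tree. The sketch below assumes rsync and mktemp are installed. All folder and file names in it are invented for the test, and it uses only one of the four folder excludes to keep it short.

```shell
#!/bin/sh
# Build a small scratch tree (all names here are made up for this test).
work=$(mktemp -d)
mkdir -p "$work/src/a" "$work/src/b/c" "$work/src/thumb" "$work/dst"
touch "$work/src/a/x.JPG" "$work/src/a/notes.txt" \
      "$work/src/b/c/y.PNG" "$work/src/thumb/t.JPG" "$work/src/z.jpg"

if command -v rsync >/dev/null 2>&1; then
    # Same order as above: unwanted folders, all folders, images, the rest.
    rsync -r \
        --exclude="thumb/" \
        --include="*/" \
        --include="*.JPG" --include="*.PNG" \
        --exclude="*" \
        "$work/src/" "$work/dst/"
    # Show which files arrived in DEST.
    (cd "$work/dst" && find . -type f | sort)
fi
```

Only ./a/x.JPG and ./b/c/y.PNG should be listed: notes.txt falls through to the final exclude, thumb/ is skipped as a folder, and z.jpg does not match the upper-case pattern. On the real tree, adding -n (--dry-run) together with -v lists what would be transferred without copying anything.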

The pattern matching is case-sensitive. To make it case-insensitive and e.g. sync all JPG, jpg, PNG and png files, one has to use character classes in the “include” patterns (these are shell-style wildcards, not regular expressions):

--include="*.[Jj][Pp][Gg]" --include="*.[Pp][Nn][Gg]"
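
The bracket syntax is the same one the shell uses for its own wildcards, so a quick way to check which names such a pattern accepts is a plain case statement. This is only an illustration of the character classes, not of rsync itself; the function name is_image and the sample file names are made up.

```shell
#!/bin/sh
# is_image NAME: succeeds if NAME ends in .jpg or .png in any letter case.
# (Hypothetical helper, only here to illustrate the bracket patterns.)
is_image() {
    case $1 in
        *.[Jj][Pp][Gg] | *.[Pp][Nn][Gg]) return 0 ;;
        *) return 1 ;;
    esac
}

for f in a.JPG b.jpg c.Png d.jpeg e.txt; do
    if is_image "$f"; then
        echo "$f: match"
    else
        echo "$f: no match"
    fi
done
```

Note that d.jpeg does not match: each bracket pair stands for exactly one character, so a further include pattern would be needed for that extension.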

The man-page-contents for RSYNC can be found here:
http://ss64.com/bash/rsync.html


CSV-Importproblem: How to convert text to numbers in Excel

Just had the following problem: I exported a table from my database to a flat file (CSV) and imported it into MS Excel. Excel treated some numeric columns as “text”, with the consequence that I couldn’t do calculations on those values.
Excel placed a little green comment triangle in the upper-left corner of each of those cells, with a corresponding note. To convert the values back to numbers, one could click each and every comment and select the appropriate command there, a very frustrating job. So I thought “What nonsense! This should be easy!”, selected the wrongly formatted columns and set the format to “number”. But to my surprise this had no effect on the values.

At MS I found helpful information on how to achieve what I wanted. They suggest various ways, but for me this one was the best and easiest:

  1. select the column to be converted
  2. select menu DATA -> TEXT TO COLUMNS, leave the defaults as they are and click “Finish”

This way the format of all cells in that column is set to “general”, which makes Excel treat the numeric values as what they are.