bash-scripting: preserve whitespaces in variables

When you want to store a string like

"abc        def    gh ijk"

in a variable in a Linux shell, you will normally be faced with this:

bash> VAR1="abc        def    gh ijk"
bash> echo $VAR1
abc def gh ijk
bash>

Runs of whitespace are collapsed into single spaces. This is a problem if you need the exact string, e.g. for a string comparison.
The cause of this behaviour is the internal shell variable $IFS (Internal Field Separator), which defaults to space, tab and newline.
Thus the unquoted variable $VAR1, when passed to “echo”, is not seen as one single string but is split at each run of whitespace into separate words, which “echo” then joins with single spaces. A loop makes the individual words visible:

bash> VAR1="abc        def    gh ijk"
bash> for a in $VAR1
> do
> echo $a
> done
abc
def
gh
ijk
bash>
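
The splitting can also be made visible without a loop. The following is only a small sketch, assuming bash with the default IFS; the variable name word_count is chosen for illustration. It loads the unquoted expansion into the positional parameters and counts them:

```shell
#!/usr/bin/env bash
VAR1="abc        def    gh ijk"

# Unquoted on purpose: the shell splits the expansion at IFS characters
# and each resulting word becomes one positional parameter.
set -- $VAR1
word_count=$#

# Print the count, then every word in brackets to show where the splits happened.
printf '%s words: ' "$word_count"
printf '[%s]' "$@"
printf '\n'
# prints: 4 words: [abc][def][gh][ijk]
```

The brackets show that none of the original spaces survive the splitting; they are consumed as separators.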

One way to preserve all contiguous whitespace is to set IFS to a character that does not occur in the string:

bash> IFS='%'
bash> echo $VAR1
abc        def    gh ijk
bash>
bash> for a in $VAR1; do echo $a; done
abc        def    gh ijk
bash>

Afterwards you can switch back to default with “unset IFS”:

bash> unset IFS
bash> echo $VAR1
abc def gh ijk
bash>
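
If you would rather not touch IFS for the whole session, the change can be confined to a subshell. This is a sketch under the assumption that bash is used and that the string contains no '%' character (it would be split there); words_after is just an illustrative name:

```shell
#!/usr/bin/env bash
VAR1="abc        def    gh ijk"

# The parentheses start a subshell; the IFS change ends with the closing parenthesis.
( IFS='%'; echo $VAR1 )

# Back in the parent shell the default IFS still applies,
# so the unquoted expansion is split into four words again.
set -- $VAR1
words_after=$#
echo "words after subshell: $words_after"
```

With this variant there is nothing to restore afterwards, because the parent shell's IFS was never modified.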

Edit 2014-05-21:
Please read Martin’s advice and my reply on it below for another approach.
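
A commonly used alternative that leaves IFS untouched is to put the variable in double quotes, which suppresses word splitting for that expansion. Whether this matches the advice referred to above is an assumption; the sketch below only shows the difference between the quoted and the unquoted form, with illustrative variable names:

```shell
#!/usr/bin/env bash
VAR1="abc        def    gh ijk"

collapsed=$(echo $VAR1)      # unquoted: runs of spaces collapse
preserved=$(echo "$VAR1")    # quoted: the string is passed as one word

# A string comparison only succeeds against the quoted form.
if [ "$preserved" = "$VAR1" ]; then
  echo "quoted expansion matches the original"
fi
if [ "$collapsed" != "$VAR1" ]; then
  echo "unquoted expansion differs from the original"
fi
```

Both messages are printed, since only the quoted expansion keeps the original spacing.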


Template-hassle with MediaWiki

One of our departments makes extensive use of templates in wiki articles. They have one article, 56 KB in size, that hosts some larger tables, and every row of those tables includes at least one template. A total of 24 different templates are in use in the article.

At the top of that page, which no longer loads flawlessly, we now see this:

"Kategorie:Seiten, in denen die maximale Größe eingebundener Vorlagen überschritten ist"
 
or in English: "Category:Pages where template include size is exceeded"

The HTML source of the page shows this warning in several places:

<!-- WARNING: template omitted, post-expand include size too large -->

Near the end of the code we find this block of information:

<!-- 
NewPP limit report
Preprocessor node count: 25885/1000000
Post-expand include size: 2097152/2097152 bytes
Template argument size: 263610/2097152 bytes
Expensive parser function count: 3/500
ExtLoops count: 148/200
-->

As you can see, the “post-expand include size” hit the upper limit of 2048 KB (2097152 bytes). Remember: the core text of the article was only 56 KB in size. But as stated here, (almost) every template that is expanded adds its size to the total size of the article page during preprocessing.
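
To check such a report without doing the arithmetic by hand, the relevant line can be picked apart in the shell. This is only a sketch: the report text is pasted into a variable from the comment shown above, and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Report lines copied from the HTML comment shown above.
report='Preprocessor node count: 25885/1000000
Post-expand include size: 2097152/2097152 bytes
Template argument size: 263610/2097152 bytes'

# Pick the post-expand line and strip it down to "used/limit".
line=$(printf '%s\n' "$report" | grep '^Post-expand include size:')
pair=${line#*: }       # remove the label  -> "2097152/2097152 bytes"
pair=${pair% bytes}    # remove the unit   -> "2097152/2097152"
used=${pair%/*}
limit=${pair#*/}

# Convert the limit to KB and compute how much of it is used.
limit_kb=$(( limit / 1024 ))
percent=$(( used * 100 / limit ))
echo "post-expand include size: ${used} of ${limit} bytes (${limit_kb} KB limit, ${percent}% used)"
# prints: post-expand include size: 2097152 of 2097152 bytes (2048 KB limit, 100% used)
```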

A proper solution would be to split the article into smaller subpages and link them together on a main page. But the department was under time pressure and urged me to raise the limit.

Unfortunately I found no parameter that can be set in LocalSettings.php to increase the “post-expand include size” directly. But while going through the MediaWiki parameters page and checking every parameter with “Max” in its name, I came across “$wgMaxArticleSize”, whose default value coincidentally equaled the 2048 KB of the “Post-expand include size” limit. And since all the expanded template and article code together actually makes up the “ArticleSize”, I gave it a try and set “$wgMaxArticleSize = 4096” in LocalSettings.php.

The page now loads without any problem and no limit is hit:

<!-- 
NewPP limit report
Preprocessor node count: 25938/1000000
Post-expand include size: 2672362/4194304 bytes
Template argument size: 277317/4194304 bytes
Expensive parser function count: 3/500
ExtLoops count: 152/200
-->

According to “Google” there really doesn’t seem to be a dedicated parameter for the “Post-expand include size”, but as my new limits again equal $wgMaxArticleSize (4096 KB = 4194304 bytes), it seems that this is the way to do it. At least it works…
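
The relation between the setting and the reported limit can be checked with a few lines of shell. This is a sketch under the assumption that $wgMaxArticleSize is given in kilobytes, as the numbers above suggest; the settings line is a hypothetical example of how it might appear in LocalSettings.php, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical line as it might appear in LocalSettings.php.
setting='$wgMaxArticleSize = 4096;'

# Extract the numeric value (assumed to be kilobytes).
max_article_kb=$(printf '%s\n' "$setting" \
  | sed -n 's/^\$wgMaxArticleSize *= *\([0-9][0-9]*\);.*$/\1/p')

# Limit shown in the second limit report above, in bytes.
reported_limit=4194304

expected_limit=$(( max_article_kb * 1024 ))
if [ "$expected_limit" -eq "$reported_limit" ]; then
  echo "match: ${max_article_kb} KB = ${expected_limit} bytes"
else
  echo "mismatch: ${expected_limit} vs ${reported_limit}"
fi
# prints: match: 4096 KB = 4194304 bytes
```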