Tag: bash

The key to faster shell scripts: know your shell’s features and use them!

I have a cleanup program that I’ve written as a Bash shell script. Over the years, it has morphed from a simple tool that deleted a few fixed directories if they existed (mostly temporary file directories found on Windows) into a very flexible cleanup tool that can take a set of rules and rewrite and modify them to apply to multiple versions of Windows, with safeguards that check the rules and auto-rewritten rules to prevent the equivalent of an “rm -rf /*” from happening. It’s incredibly useful for me: when I back up a customer’s PC data, I run the cleaner script first to delete many gigabytes of unnecessary junk, which speeds up the backup and restore process significantly.

Unfortunately, having the internal rewrite and safety check rules has the side effect of massively slowing the process. I’ve been tolerating the slowness for a long time, but as the rule set increased in size over the past few years the script has taken longer and longer to complete, so I finally decided to find out what was really going on and fix this speed problem.

Profiling shell scripts isn’t quite as easy as profiling C programs; with C, you can just use a tool like Valgrind to find out where all the effort is going, but shell scripts depend on the speed of the shell, the kernel, and the plethora of programs executed by the script, so it’s harder to follow what goes on and find the time sinks. However, I observed that a lot of time was spent in the steps between deleting items; since each rewrite and safety check is done on-the-fly as deletion rules are presented for processing, those were likely candidates. The first thing I wanted to know was how many times the script called an external program to do work; you can easily kill a shell script’s performance with unnecessary external program executions. To gather this info, I used the strace tool:

strace -f -o strace.txt tt_cleaner

This produced a file called “strace.txt” which contains every single system call issued by both the cleaner script and any forked programs. I then looked for the execve() system call and gathered the counts of the programs executed, excluding “execve resumed” events which aren’t actual execve() calls:

grep execve strace.txt | sed 's/.*execve/execve/' | cut -d\" -f2 | grep -v resumed | sort | uniq -c | sort -g

The resulting output consisted of numbers below 100 until the last two lines, and that’s when I realized where the bottleneck might be:

4157 /bin/sed
11227 /usr/bin/grep

That’s a LOT of calls to sed, but the number of calls to grep was almost three times bigger, so that’s where I started searching for ways to improve. As I’ve said, the rewrite code takes each rule for deletion and rewrites it for other possible interpretations: “Username\Application Data” on Windows XP was moved to “Username\AppData\Roaming” on Vista and up, “All Users\Application Data” was moved to “C:\ProgramData” in those same versions, and there is a potential mirror of every single rule under “Username\AppData\Local\VirtualStore”. The rewrite code expands the deletion rules to cover every single one of these possible cases. The outer loop of the rewrite engine grabs each rewrite rule in order, while the inner loop does the actual rewriting against the current rule AND all prior rewrites to ensure no possibilities are missed (VirtualStore is largely to blame for this double-loop architecture). This means that anything done within the inner loop is executed a huge number of times, and the very first command in the inner loop looked like this:

if echo "${RWNAMES[$RWNCNT]}" | grep -qi "${REWRITE0[$RWCNT]}"

This checks to see if the rewrite rule applies to the cleaner rule before doing the rewriting work. It calls grep once for every single iteration of the inner loop. I replaced this line with the following:

if [[ "${RWNAMES[$RWNCNT]}" =~ .*${REWRITE0[$RWCNT]}.* ]]

I also had to add “shopt -s nocasematch” at the top of the shell script to make the comparison case-insensitive. The result was a 6x speed increase: testing on an existing data backup that had already been cleaned (no “work” to do) showed a consistent time reduction from 131 seconds to 22 seconds! The grep count dropped massively, too:

97 /usr/bin/grep

Bash can do wildcard and regular expression matching of strings (the =~ comparison operator is a regex match), so any shell script that uses the “echo-grep” combination in a loop stands to benefit greatly from these Bash features. Unfortunately, these are not POSIX shell features and using them will lead to non-portable scripts, but if you will never run the script on other shells and the performance boost is significant, why not use them?
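As a self-contained illustration (the variable names here are mine, not the cleaner script’s), the two styles of case-insensitive substring test look like this:

```shell
#!/bin/bash
# Illustrative sketch: the same case-insensitive match done two ways.
shopt -s nocasematch    # makes [[ ]] pattern/regex matches case-insensitive

name="Users/Bob/AppData/Roaming/Cache"
pattern="appdata"

# Old way: forks a grep process on every single loop iteration
if echo "$name" | grep -qi "$pattern"; then
    echo "grep: match"
fi

# New way: handled entirely inside Bash, no fork at all
if [[ $name =~ $pattern ]]; then
    echo "builtin: match"
fi
```

Both tests succeed, but the second form never leaves the shell process, which is what makes it so much faster inside a tight loop.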

The bigger lesson here is that you should take some time to learn about the features offered by your shell if you’re writing advanced shell scripts.

Update: After writing this article, I set forth to eliminate the thousands of calls to sed. I was able to change an “echo-sed” combination to a couple of Bash substring substitutions. Try it out:
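The original snippet did not survive here, so the following is a minimal sketch of the kind of substitution involved (the variable names and paths are mine): Bash’s ${variable//search/replace} expansion does the same job as an echo-sed pipeline without forking any process.

```shell
#!/bin/bash
# Hedged sketch: ${var//search/replace} in place of "echo | sed".
NAME="All Users/Application Data/Foo"
SEARCH="Application Data"
REPLACE="AppData"

# Old way (one sed fork per call): echo "$NAME" | sed "s|$SEARCH|$REPLACE|g"
# New way (no fork, pure Bash):
NEW="${NAME//$SEARCH/$REPLACE}"
echo "$NEW"    # -> All Users/AppData/Foo
```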


It accepts $VARIABLES where the strings go, so it’s quite powerful. Best of all, the total runtime dropped to 10.8 seconds for a total speed boost of over 11x!

[SOLVED] X terminal emulator doesn’t read .bashrc or .profile or /etc/profile, shows ‘sh’ prompt

This is an ancient problem, but I seem to run into it frequently and even briefly forget it. When you open an X terminal emulator such as Xterm or rxvt, you might be frustrated by the fact that you can’t get a profile or startup script to run in the shell prior to displaying a prompt. Your shell prompt also defaults to something fairly useless like “sh-4.3$” instead of something more informative and fancier.

This occurs because your user account’s default shell is either not explicitly set or is set as /bin/sh rather than /bin/bash. To fix this you’ll need to run the following command as root (i.e. with su or sudo):

usermod -s /bin/bash your_user_name

You’ll need to log out and back in before the change will fully take effect; alternatively, if you’re starting X from a command prompt, you can drop out of X, export SHELL=/bin/bash, and start X again. This can also be done in your .profile or .bashrc as a workaround if you don’t have root access for some reason and don’t use a graphical login manager.
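As a quick sanity check (not from the original post), you can confirm which login shell is currently recorded for your account before and after making the change:

```shell
# Print the login-shell field (the 7th colon-separated field) recorded
# for the current user in the passwd database.
getent passwd "$(id -un)" | cut -d: -f7
```

If this prints /bin/sh (or nothing at all), the usermod command above is the fix.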

Windows Registry FUSE Filesystem

Here’s some code which will allow you to mount Windows registry hive files as filesystems: https://github.com/jbruchon/winregfs

The README file says:



One of the most difficult things to deal with in years of writing Linux
utilities to work with and repair Windows PCs is the Windows registry.
While many excellent tools exist to work with NTFS filesystems and to change
and remove passwords from user accounts, the ability to work with the
registry has always been severely lacking. Included in the excellent chntpw
package is a primitive registry editor "reged" which has largely been quite
helpful and I have been grateful for its existence, but it suffers from a
very limited interface and a complete lack of scriptability that presents a
major hurdle for anyone wanting to do more with the registry than wipe out a
password or change the "Start" flag of a system service.

Because of the serious limitations of "reged," the only practical way to do
anything registry-oriented with a shell script was to export an ENTIRE HIVE
to a .reg file, crudely parse the file for what you want, create a .reg file
from the script to import the changes, and import them. Needless to say, the
process is slow, complicated, and frustrating. I even wrote a tool called
"read_inf_section" to help my scripts parse INF/INI/REG files faster because
of this need (but also for an unrelated need to read .inf files from driver
packages.) This complexity became too excessive, so I came up with a much
better way to tweak the registry from shell scripts and programs.

Thus, the Windows Registry FUSE Filesystem "winregfs" was born. chntpw
( http://pogostick.net/~pnh/ntpasswd/ ) has an excellent library for
working with Windows NT registry hive files, distributed under the LGPL.
winregfs is essentially a glue layer between ntreg.c and FUSE, translating
Windows registry keys and values into ordinary directories and files.

winregfs features case-insensitivity and forward-slash escaping. A few keys
and value names in the Windows registry such as MIME types contain forward
slash characters; winregfs substitutes "_SLASH_" where a forward slash appears
in names.

To use winregfs, make a directory to mount on and point it to the registry
hive of interest:

$ mkdir reg
$ mount.winregfs /mnt/sdc2/Windows/System32/config/software reg

Now, you can see everything in that hive under "reg":

$ ls reg
7-Zip/                  Google/              Policies/
AVAST Software/         InstalledOptions/    Program Groups/
Adobe/                  Intel/               RegisteredApplications/
Analog Devices/         LibreOffice/         S3/
C07ft5Y/                Macromedia/          Schlumberger/
Classes/                Microsoft/           Secure/
Clients/                Mozilla/             Sigmatel/
Diskeeper Corporation/  MozillaPlugins/      The Document Foundation/
GNU/                    NVIDIA Corporation/  Windows 3.1 Migration Status/
Gabest/                 ODBC/                mozilla.org/
Gemplus/                Piriform/

Let's say you want to see some things that automatically run during startup.

$ ls -l reg/Microsoft/Windows/CurrentVersion/Run
total 0
-r--r--r-- 1 root root 118 Dec 31  1969 Adobe ARM.sz
-r--r--r-- 1 root root 124 Dec 31  1969 DiskeeperSystray.sz
-r--r--r-- 1 root root  60 Dec 31  1969 HotKeysCmds.sz
-r--r--r-- 1 root root  66 Dec 31  1969 IgfxTray.sz
-r--r--r-- 1 root root  70 Dec 31  1969 KernelFaultCheck.esz
-r--r--r-- 1 root root  66 Dec 31  1969 Persistence.sz
-r--r--r-- 1 root root 100 Dec 31  1969 SoundMAXPnP.sz
-r--r--r-- 1 root root 118 Dec 31  1969 avast.sz

You want to see what these values contain.

$ for X in reg/Microsoft/Windows/CurrentVersion/Run/*
> do echo -en "$X\n   "; cat "$X"; echo; done
reg/Microsoft/Windows/CurrentVersion/Run/Adobe ARM.sz
   "C:\Program Files\Common Files\Adobe\ARM\1.0\AdobeARM.exe"
reg/Microsoft/Windows/CurrentVersion/Run/DiskeeperSystray.sz
   "C:\Program Files\Diskeeper Corporation\Diskeeper\DkIcon.exe"
reg/Microsoft/Windows/CurrentVersion/Run/KernelFaultCheck.esz
   %systemroot%\system32\dumprep 0 -k
reg/Microsoft/Windows/CurrentVersion/Run/SoundMAXPnP.sz
   C:\Program Files\Analog Devices\Core\smax4pnp.exe
reg/Microsoft/Windows/CurrentVersion/Run/avast.sz
   "C:\Program Files\AVAST Software\Avast\avastUI.exe" /nogui

Has anything hijacked the Windows "shell" value that runs explorer.exe?

$ cat reg/Microsoft/Windows\ NT/CurrentVersion/Winlogon/Shell.sz

How about the userinit.exe value?

$ cat reg/Microsoft/Windows\ NT/CurrentVersion/Winlogon/Userinit.sz

Perhaps check if some system policies are set (note that REG_DWORD will
probably change in a future release to text files instead of raw data):

$ hexdump -C \
> reg/Policies/Microsoft/Windows/System/Allow-LogonScript-NetbiosDisabled.dw
00000000  01 00 00 00                                       |....|

You can probably figure out what to do with it from here. ;-)
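The scriptability the README describes is the whole point. As a hypothetical sketch (assuming a SOFTWARE hive is already mounted on “reg” as shown above, and using a helper function I made up for illustration), the Winlogon hijack question becomes a few lines of shell:

```shell
#!/bin/bash
# Hypothetical sketch: with a hive mounted via winregfs, registry values
# read like ordinary files, so sanity checks are plain shell comparisons.
check_shell() {
    # $1 is the contents of .../Winlogon/Shell.sz; the stock value is
    # "explorer.exe", so anything else is suspicious.
    if [ "$1" = "explorer.exe" ]; then
        echo "Shell value looks normal"
    else
        echo "Possible shell hijack: $1"
    fi
}

# In practice, against a mounted hive:
# check_shell "$(cat "reg/Microsoft/Windows NT/CurrentVersion/Winlogon/Shell.sz")"
check_shell "explorer.exe"
```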

Sort compressed tar archives to make them smaller… 20% smaller!

How would you like for your file archives to be 20% smaller, with just the tools every Linux distribution already provides and a little ingenuity?  Read on and see how I did it!

There was a folder “NESRen” that I wanted to pack up for archival, and I knew it contained many files that share the same data, minus a few changes here and there.  When packing them up and compressing them into a tarball archive, I knew that I would achieve better compression if these largely similar files were put side-by-side in the archive so that the repeated blocks of data “compress themselves away” and take up almost no space.  Unfortunately, the GNU “tar” command in nearly every Linux distribution packs up files and folders in the order that the underlying filesystem chooses, which is almost always unordered and not optimal for compression.

How do we make tar put files in an order which will compress better?  The answer is to use the tar -T option, which lets you feed tar a list of files to pack up.  The list is processed line-by-line, and each file is packed up in the order provided.  You can, for example, create a list of files with the “find” command, then hand-edit that list to be optimal, and pass the list to tar (you must use the --no-recursion option when creating the archive from this list since “find” already produces a recursive list):

find folder_of_files/ > list.txt
vi list.txt
tar -c --no-recursion -T list.txt | xz > archive.tar.xz

In my case, however, the folder structure and naming conventions allowed for creative use of the “sort” command to arrange the files. Since there is one folder “NESRen” followed by a series of categorizations, followed by the file names themselves (i.e. “NESRen/World/Pinball (JU).nes”) I can do something like this to make all files with the same name sort beside each other, regardless of the name of the category directory (as “sort” with no options would do):

find NESRen | sort -t / --key=3 | \
  tar -cv -T - --no-recursion | xz -e > NESRen.tar.xz

The “-t /” tells sort to use a slash as a field delimiter, and --key=3 tells it to sort by the third field (NESRen is field 1, the folder is 2, the file is 3).  What kind of difference did that make for the size of my .tar.xz archive?  Take a look (-nosort archive created with “tar -c NESRen | xz -e > NESRen-nosort.tar.xz”):

Size of each file in bytes:

212958664 NESRen-nosort.tar.xz
170021312 NESRen.tar.xz

Size of the original folder and each file in megabytes:

705M    NESRen
204M    NESRen-nosort.tar.xz
163M    NESRen.tar.xz

By sorting the files, I saw a 20.1% drop in archive size using the exact same compression method, with a total compression ratio of 23.1% versus the unsorted 28.9%.  That’s a huge difference!  If this were 70.5GB instead of 705MB and the data exhibited identical performance, the final archive would be 4.1GB smaller, nearly the entire capacity of a single-layer DVD-R in space savings, just by sorting the file names before compression.
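To see what the field-based sort actually does, here’s a toy run with made-up file names (not the real archive contents).  Sorting on the third slash-delimited field makes identically named files from different category folders land next to each other:

```shell
#!/bin/bash
# Toy demo of sorting paths by their third /-delimited field so that files
# with the same name end up adjacent regardless of their category folder.
printf '%s\n' \
    "NESRen/World/Pinball (JU).nes" \
    "NESRen/USA/Zelda.nes" \
    "NESRen/Japan/Pinball (JU).nes" \
  | sort -t / --key=3
# Output:
# NESRen/Japan/Pinball (JU).nes
# NESRen/World/Pinball (JU).nes
# NESRen/USA/Zelda.nes
```

The two “Pinball (JU).nes” copies sort side-by-side, which is exactly the adjacency that lets the compressor eliminate their shared data.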

Applying a similar sort-then-compress process when packing the “ext” version of the Tritech Service System trimmed 700KB from the total size of the archive containing “ext”.  Of course, this doesn’t help as much because the archive was already 32.7MB in size (700KB is only a 2.1% reduction), but it still means shorter load and boot times due to less overall data to handle.

Next time you’re packing a lot of stuff up, see if you can use these tricks to improve your compression ratio.