Rob van der Woude's Scripting Pages

Wildcards Mess

Wildcards were implemented in MS-DOS to allow specifying a group of files, e.g. DIR *.txt, which lists all files with the extension txt.

DOS and Windows support 2 wildcard characters:

* matches any number of characters, including none
? matches exactly one character

In the FAT file systems of MS-DOS a wildcard could never match a dot; in FAT32 and NTFS it can.

DIR *.hta will list the following files (assuming they exist, of course):

Hardware.hta
Romans.hta
UpdateCheck.beta.hta
WMIGen.hta

DIR *.hta* will list the following files:

Hardware.hta
Romans.hta
UpdateCheck.beta.hta
WMIGen.hta
.htaccess

The last one, .htaccess, may come as a surprise, but it is consistent with the rule that the * wildcard matches any number of characters, including none.
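
To make the rule concrete, here is a minimal Python sketch that applies the two wildcard rules literally to a single name string. It is only a model, not how DIR is implemented; the function names wildcard_to_regex and matches are made up for this illustration.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a DOS-style wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any number of characters, including none
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character matches itself
    # File names are compared case-insensitively, as DIR does
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def matches(pattern, name):
    """Return True if the whole name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(name) is not None

for name in ["Hardware.hta", "UpdateCheck.beta.hta", ".htaccess"]:
    print(name, matches("*.hta", name), matches("*.hta*", name))
# Hardware.hta True True
# UpdateCheck.beta.hta True True
# .htaccess False True
```

In .htaccess the leading * matches nothing at all and the trailing * matches "ccess", which is why the file shows up for *.hta* but not for *.hta.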

The ? wildcard behaves rather inconsistently when used in extensions:
DIR *.ht? will list the following files:

Hardware.hta
Hardware.test.htm
Romans.hta
UpdateCheck.beta.hta
WMIGen.hta

So far so good.
DIR *.hta? will list the following files:

Hardware.hta
Romans.hta
UpdateCheck.beta.hta
WMIGen.hta

Theoretically, it should list only files with a four-character extension whose first three characters are hta, e.g. hta2 or htax; in this directory that would be an empty list.
However, DIR *.h?a will properly list all files with the hta extension, as expected.

Wildcards after the third character of the extension seem to be ignored entirely.
DIR *.htm will list the following files:

hardware.test.htm
hardware.test.html
wmigen.htm
wmigen.html

The files with extension html are a "bonus"...

To complicate matters, try DIR ??????.htm and you will get wmigen.htm only.

OK, one more clue: DIR ????????.??? (8 question marks, dot, 3 question marks) will list all files, including those with names of more than 8 characters, just like DIR *.* would.
Isn't this command supposed to return just the files that comply with the "8.3 format"?

All these quirks are caused by the FAT32 and NTFS file systems also maintaining "hidden" short file names, e.g. C:\PROGRA~1 for C:\Program Files.
When testing file names against wildcards, matches for both the long and short file name are returned.
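
The effect of testing both names can be modelled in Python. This is a rough sketch under stated assumptions: the short names in FILES are invented for illustration (the real ones depend on the volume and can be shown with DIR /X), the helpers matches and dir_like are hypothetical, and * and ? are applied literally as "any number of characters" and "exactly one character".

```python
import re

def matches(pattern, name):
    """Literal wildcard match: * = any number of characters, ? = exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, name, re.IGNORECASE | re.DOTALL) is not None

# Long name -> short name. The short names are ASSUMED for this example.
FILES = {
    "hardware.test.htm": "HARDWA~1.HTM",
    "hardware.test.html": "HARDWA~2.HTM",
    "wmigen.htm": "WMIGEN.HTM",
    "wmigen.html": "WMIGEN~1.HTM",
}

def dir_like(pattern, files):
    """List the long names whose long OR short name matches the pattern."""
    return [
        long_name
        for long_name, short_name in files.items()
        if matches(pattern, long_name) or matches(pattern, short_name)
    ]

print(dir_like("*.htm", FILES))
# ['hardware.test.htm', 'hardware.test.html', 'wmigen.htm', 'wmigen.html']
print(dir_like("??????.htm", FILES))
# ['wmigen.htm']
```

In this model the html files turn up for *.htm only because their assumed short names end in .HTM, and ??????.htm finds just wmigen.htm because every other name, long or short, has more than six characters before the extension.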

This also explains another quirk that had me baffled for some time, and which inspired me to write this page: the seeming randomness of the list returned by DIR *~*.*.
DIR *~*.* will not return just the files with a tilde in their (long) file name, but also those with a generated short file name (which always includes a tilde) — without showing that generated short name!

So DIR *A*.* may return all files with a character A in their name, but DIR *~*.* will return a completely unpredictable list of file names.
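
The same idea explains the tilde lists. The following Python sketch is a simplified model, not the real DIR logic: the file names and their short names are made up for the example, and the helpers matches and dir_like are hypothetical.

```python
import re

def matches(pattern, name):
    """Literal wildcard match: * = any number of characters, ? = exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, name, re.IGNORECASE | re.DOTALL) is not None

# Long name -> short name. Both columns are ASSUMED for this example.
FILES = {
    "Annual report.txt": "ANNUAL~1.TXT",  # generated short name contains a tilde
    "old~copy.txt": "OLD~COPY.TXT",       # tilde is part of the long name itself
    "readme.txt": "README.TXT",           # already fits 8.3, no tilde anywhere
}

def dir_like(pattern, files):
    """List the long names whose long OR short name matches the pattern."""
    return [
        long_name
        for long_name, short_name in files.items()
        if matches(pattern, long_name) or matches(pattern, short_name)
    ]

print(dir_like("*~*.*", FILES))
# ['Annual report.txt', 'old~copy.txt']
```

"Annual report.txt" is listed although its long name contains no tilde at all; only its (invisible) short name does.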

Believe me when I tell you that DEL *~*.* is a recipe for disaster; I speak from hard-won experience.
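
One possible way to preview which files really carry a tilde in their long name is to filter the directory listing yourself instead of relying on wildcards. The Python sketch below is only a suggestion, not a drop-in replacement for DEL; the function name names_with_literal_tilde is made up. It relies on os.listdir returning long file names only, so generated short names never enter the comparison.

```python
import os

def names_with_literal_tilde(directory):
    """Return the names in a directory whose long name contains a tilde."""
    # os.listdir reports long names only, so short names cannot cause matches
    return sorted(name for name in os.listdir(directory) if "~" in name)

# Preview only: nothing is deleted here
for name in names_with_literal_tilde("."):
    print(name)
```

Checking the printed list before deleting anything keeps the unpredictable part of *~*.* out of the picture.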


page last uploaded: 2017-07-13, 11:26