How to use the FOR command in a Batch File to parse text files as part of a command line batch program or script in Windows command line programming.
Batch file programming is a very useful way to automate small, repetitive tasks. For those who are new to creating batch files, the Windows Command Line Programming tutorial is a good place to start reading about the basics of batch file programming.
Using the FOR command to parse a text file is quite an advanced topic, but can be very useful. For example, it can be used to pre-process (tokenize) an incoming file to remove comments, or extract commands. Data conversion and automation tasks can also be implemented, for example, using it to build action files from incoming data in an automated fashion.
In this way, several existing programs can be linked together with some basic programming logic, using the output of one as input to another to achieve the desired result; it is a much quicker solution than writing applications in a "real" programming language.
To use the FOR command to process data by iterating over a set of files, text strings, or the output of a command, the /F option must be specified:
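As a reference sketch, the general form can be written as follows, using this article's terms "keywords" and "set" (the built-in help, for /?, calls them "options" and "file-set"). The variable letter and the placeholder names are illustrative only.

```batch
REM General form of FOR /F (placeholders only, so every line is a comment).
REM In a batch file the loop variable is written %%i; at the prompt it is %i.
REM
REM   FOR /F ["keywords"] %%i IN (set)       DO command [parameters]
REM
REM The set takes one of three forms:
REM   FOR /F ["keywords"] %%i IN (file.txt)  DO command   -- lines of a file
REM   FOR /F ["keywords"] %%i IN ("string")  DO command   -- a literal string
REM   FOR /F ["keywords"] %%i IN ('command') DO command   -- output of a command
```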
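The set can be one or more named files, a literal string in double quotes, or a command in single quotes. Unlike the plain FOR loop described in the Batch File Programming FOR Command article, FOR /F does not expand wildcards such as *.txt or *.doc; each file must be named explicitly.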
The keywords are optional; the remaining parts are required. The command may be followed by one or more options and parameters, and everything up to the end of the line is taken as part of the command and passed as-is. Variable substitution is allowed, including the variables declared in the FOR command itself, and these can be used within the command. For example:
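A minimal sketch of such a loop, consistent with the description that follows. It assumes a locale in which dir prints 12-hour times (date, time, AM/PM, size, name), so that the size is the fourth token; with 24-hour times the size would be the third token instead.

```batch
@echo off
REM Sketch: echo the size and name of each file in the current directory.
REM Assumes dir prints: date, time, AM/PM, size, name (12-hour locale).
REM tokens=4* puts the 4th token in %%i and the rest of the line in %%j.
FOR /F "tokens=4*" %%i IN ('dir /A-d') DO ECHO %%i %%j
```

The header and summary lines that dir prints are tokenized in the same way, so they may produce some extra output alongside the file entries.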
In the above example, the variables %%i and %%j are echoed to the screen. They contain the fourth column and the rest of the line, respectively, from each line output by the command 'dir /A-d', which lists files but not directories. The result is a list of pairs, each giving a file's size and its name.
The tokens keyword specifies which delimited fields on each line returned by the command are assigned to variables. Because file names can contain spaces (a default delimiter) and appear at the end of the line, the asterisk is used to capture everything up to the end of the line in a single variable. The %%j variable is not declared explicitly; it is allocated implicitly as the next letter after %%i.
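The implicit allocation of variables can be seen in isolation with a literal string as the set. This is a small sketch; the sample text is arbitrary.

```batch
@echo off
REM Only %%i is declared; %%j and %%k are allocated implicitly, in
REM alphabetical order, for the second token and for the asterisk.
FOR /F "tokens=1,2*" %%i IN ("alpha beta gamma delta") DO (
    ECHO first: %%i
    ECHO second: %%j
    ECHO rest: %%k
)
REM Prints:
REM   first: alpha
REM   second: beta
REM   rest: gamma delta
```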
The other available keywords are:
They can be combined in the keywords list, separated by spaces. eol sets the end-of-line comment character: lines that begin with it are skipped (for example, eol=# would skip lines starting with #; the default is ;). skip specifies the number of lines to skip at the start of the input (for example, a row of CSV column headings). delims replaces the default delimiter set of space and tab (for example, delims=, for parsing comma-separated files).
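A short sketch combining these keywords on a small comma-separated file. The file name sample.csv and its contents are hypothetical, and the script creates and removes the file itself so that it can be run as-is.

```batch
@echo off
REM Sketch only: sample.csv is a hypothetical file created for this demo.
> sample.csv echo name,size
>> sample.csv echo # a comment line
>> sample.csv echo report,120
>> sample.csv echo notes,45

REM skip=1 drops the heading row; eol=# skips lines that start with #;
REM delims=, splits on commas; tokens=1,2 fills %%a and %%b.
FOR /F "skip=1 eol=# tokens=1,2 delims=," %%a IN (sample.csv) DO ECHO %%a is %%b
REM Prints:
REM   report is 120
REM   notes is 45

del sample.csv
```

Placing delims last in the keyword list is a common convention, because it keeps the delimiter characters from being confused with the spaces that separate the keywords.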
tokens is described above, but note that a range can also be specified (with %%i as the loop variable, tokens=1-5 would assign %%i, %%j, %%k, %%l and %%m). Finally, usebackq allows double quotes around file names in the set, for cases where a file name contains spaces. If in doubt, the usebackq option provides the safest usage, but it changes the way the set should be specified:
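The quoting rules under usebackq, as documented in for /?, can be sketched as follows. The file name "my data.txt" is a hypothetical example, created and removed by the script so that it can be run as-is.

```batch
@echo off
REM Sketch only: "my data.txt" is a hypothetical file created for this demo.
> "my data.txt" echo first line

REM Double quotes now mean a file name, which may therefore contain spaces.
FOR /F "usebackq tokens=*" %%i IN ("my data.txt") DO ECHO file: %%i

REM Single quotes now mean a literal string.
FOR /F "usebackq tokens=1,2" %%i IN ('hello world') DO ECHO string: %%i %%j

REM Back quotes now mean a command whose output is parsed.
FOR /F "usebackq tokens=*" %%i IN (`ver`) DO ECHO command: %%i

del "my data.txt"
```

Without usebackq, the double-quoted "my data.txt" would be treated as a literal string rather than as a file to read.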