Fetching random text file lines with a .bat file
[page last modified 3/21/2016]

This Windows .bat file fetches some number of random lines from a text file and writes them to stdout.
Yes, this is a task more suitable to a C tool... but I was interested to see if a .bat file could accomplish it.
No external commands (e.g. find) are used -- only internal commands (e.g. set, for, echo).

To capture a single random line from foo.txt:

   C>randline foo.txt >output.txt

To capture 10 random lines (which appear in the order they exist in the source):

   C>randline foo.txt 10 >output.txt

Blank source lines are ignored -- a blank line is never returned.   Special characters (eg <>&|!) are allowed.

The problem was interesting simply to see if .bat could do the job; and it can, within limits.
First, it can be slow when dealing with a big source file, since the source is processed via a for/do construct.
You will have to time it for your application.   For what I use it for, it's fine.

What is prohibitively slow is returning a large percentage of the lines of a big source file.
The random line # selection starts colliding with lines already chosen and progressively slows down.
A smarter algorithm (beyond .bat capability) is needed to fix this.   Fortunately it's not something I do.
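
To put a rough number on that slowdown, here is a back-of-the-envelope estimate. It assumes every index draw is uniform and independent, which idealizes %random%. The symbols are my own notation, not something from randline.bat: n is the count of non-blank source lines, k the number of lines requested, D the total number of index draws, and H_m the m-th harmonic number (H_0 = 0). Once i lines are already chosen, a draw hits a new line with probability (n-i)/n, so the next new line takes n/(n-i) draws on average:

```latex
\mathbb{E}[D] \;=\; \sum_{i=0}^{k-1} \frac{n}{n-i}
            \;=\; n\,\bigl(H_n - H_{n-k}\bigr)
            \;\approx\; n \ln\frac{n}{n-k} \qquad (k < n)
```

For n = 10000 this works out to roughly 6900 draws for k = 5000, about 23000 draws for k = 9000, and close to 98000 draws for k = 10000 (where the exact form n*H_n applies). Each extra draw is another trip around the rlp loop.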

The cmd.exe %random% special environment variable is used to generate source line indexes.
It's weak (only 15 bits -- 0..32767) so three invocations per index are generated to create
a 31 bit random value, which we then modulo to create a number in the required range.
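
As a minimal stand-alone sketch of that step (the variable n and its value 1000 are made up for the demo; the arithmetic mirrors the expression used in randline.bat), two full 15-bit values and one extra bit are packed into a single number no larger than 2147483647, which is then reduced to an index in 1..n:

```batch
@echo off
setlocal enabledelayedexpansion
rem Hypothetical demo value: pretend the source file has 1000 non-blank lines.
set n=1000
rem 15 bits shifted up by 16, plus 15 bits shifted up by 1, plus 1 low bit.
rem Largest possible result: 32767*65536 + 32767*2 + 1 = 2147483647.
set /a rb31=!random!*65536+!random!*2+!random!%%2
rem Modulo maps the 31-bit value into 0..n-1; adding 1 gives a line index 1..n.
set /a j=rb31%%n+1
echo rb31=!rb31!  index=!j!
```

The modulo step is very slightly biased when n does not divide 2^31 evenly, but with 31 bits to work from the effect is negligible for any realistic line count.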

Another note about DOS %random% -- it is seeded using the system time to only a one second
resolution when a cmd.exe instance is created.  So if you do something like:

   for /l %%e in (1,1,9999) do for /f %%f in ('echo %%random%%') do echo %%f

You'll see blocks of identical "random" numbers scroll by that change once a second.
This can be a problem if you launch via task scheduler at the same time each day, or use randline.bat
in a way that causes a new cmd.exe instance to be created (a quite common thing in practice).

To address the random seed issue I originally added a couple lines to randline.bat:

  set r=& for /f "delims=" %%f in ('time ^<nul') do if not defined r set r=%%f
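  set /a s=1%r:~-2%-99& for /l %%f in (1,1,!s!) do set r=!random!

This "spins" %random% 1-100 times per the hundredths-of-a-second count to try to mitigate the problem.
The last two digits of the time are prefixed with 1 so that values like 08 are not parsed as octal; subtracting 99 then gives 1..100.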
Then I decided that's beyond the scope of the tool and to let the caller handle random seed issues.

(Added 3/21/16: see A method to generate truly random numbers in a .BAT file)


Change Log

Initial version released 3/14/2016 (.bat file is also displayed below)

Click here to download  randline.bat


@echo off& setlocal enabledelayedexpansion& set i=%~2& if "%~2"=="" set i=1
set c=& set /a 2>nul c=%i%
if defined c if "%c%"=="%i%" if not exist "%~1\" if exist "%~1" goto args_ok
if not "%~1%~2"=="" echo %~n0: *** file doesn't exist or bad count& exit/b
echo.
echo Outputs unique random file lines to stdout. paulhoule.com 3/14/2016
echo.& echo Usage: %~n0 file.txt {cnt} [cnt defaults to 1]
echo.& echo Blank lines in source file are ignored (never returned).
echo Result lines occur in ascending order from source file.
exit /b
:args_ok
set r=0& for /f "tokens=1* delims==" %%v in ('set r') do set %%v=
set r=0& for /f "delims=" %%f in (%1) do set /a r+=1
(if %c% GTR %r% set c=%r%)& if !c!==0 exit /b
set i=0
:rlp
set /a rb31=!random!*65536+!random!*2+!random!%%2& set /a j=!rb31!%%r+1
(if defined r!j! goto rlp)& set r%j%=.& set /a i+=1& if not !i!==%c% goto rlp
set i=0& for /f "delims=" %%f in (%1) do (set /a i+=1
if defined r!i! setlocal disabledelayedexpansion& echo %%f& endlocal)
