Recipe 8.8 Reading a Particular Line in a File

8.8.1 Problem

You want to extract a single line from a file.

8.8.2 Solution

The simplest solution is to read the lines until you get to the one you want:

    # looking for line number $DESIRED_LINE_NUMBER
    $. = 0;
    do { $LINE = <HANDLE> } until $. == $DESIRED_LINE_NUMBER || eof;

If you are going to be doing this a lot and the file fits into memory, read the file into an array:

    @lines = <HANDLE>;
    $LINE = $lines[$DESIRED_LINE_NUMBER - 1];    # array indices start at 0

The standard (as of v5.8) Tie::File module ties an array to a file, one line per array element:

    use Tie::File;
    use Fcntl;

    tie(@lines, "Tie::File", $FILE, mode => O_RDWR)
        or die "Cannot tie file $FILE: $!\n";
    $line = $lines[$sought - 1];

If you have the DB_File module, its DB_RECNO access method ties an array to a file, one line per array element:

    use DB_File;
    use Fcntl;

    $tie = tie(@lines, "DB_File", $FILE, O_RDWR, 0666, $DB_RECNO)
        or die "Cannot open file $FILE: $!\n";
    # extract it
    $line = $lines[$sought - 1];

8.8.3 Discussion

Each strategy has different features, useful in different circumstances. The linear access approach is easy to write and best for short files. The Tie::File module gives good performance, regardless of the size of the file or which line you're reading (and is pure Perl, so it doesn't require any external libraries). The DB_File mechanism has some initial overhead, but later accesses are faster than with linear access, so use it for long files that are accessed more than once and are accessed out of order.

It is important to know whether you're counting lines from 0 or 1. The $. variable is 1 after the first line is read, so count from 1 when using linear access. The index mechanism works in terms of offsets, so count from 0. An array of lines, whether read into memory or tied with Tie::File or DB_File, is indexed from 0, so line N of the file is element N - 1 of the array.

Here are three different implementations of the same program, print_line. The program takes two arguments: a filename and a line number to extract.
The version in Example 8-1 simply reads lines until it finds the one it's looking for.

Example 8-1. print_line-v1

    #!/usr/bin/perl -w
    # print_line-v1 - linear style

    @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";

    ($filename, $line_number) = @ARGV;
    open(INFILE, "<", $filename)
        or die "Can't open $filename for reading: $!\n";
    while (<INFILE>) {
        $line = $_;
        last if $. == $line_number;
    }
    # $. holds the number of the last line read
    if ($. != $line_number) {
        die "Didn't find line $line_number in $filename\n";
    }
    print $line;

The Tie::File version is shown in Example 8-2.

Example 8-2. print_line-v2

    #!/usr/bin/perl -w
    # print_line-v2 - Tie::File style

    use Tie::File;
    use Fcntl;

    @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";

    ($filename, $line_number) = @ARGV;
    tie @lines, "Tie::File", $filename, mode => O_RDWR
        or die "Can't open $filename for reading: $!\n";
    # the file is too short if it has fewer lines than the one requested
    if (@lines < $line_number) {
        die "Didn't find line $line_number in $filename\n";
    }
    print "$lines[$line_number-1]\n";

The DB_File version in Example 8-3 follows the same logic as Tie::File.

Example 8-3. print_line-v3

    #!/usr/bin/perl -w
    # print_line-v3 - DB_File style

    use DB_File;
    use Fcntl;

    @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";

    ($filename, $line_number) = @ARGV;
    $tie = tie(@lines, "DB_File", $filename, O_RDWR, 0666, $DB_RECNO)
        or die "Cannot open file $filename: $!\n";

    # the last line's number equals the record count, so <= rather than <
    unless ($line_number <= $tie->length) {
        die "Didn't find line $line_number in $filename\n"
    }

    # DB_RECNO records come back without the newline delimiter
    print "$lines[$line_number-1]\n";    # easy, eh?

If you will be retrieving lines by number often and the file doesn't fit into memory, build a byte-address index to let you seek directly to the start of the line using the techniques in Recipe 8.27.

8.8.4 See Also

The documentation for the standard Tie::File and DB_File modules (also in Chapter 32 of Programming Perl); the tie function in perlfunc(1) and in Chapter 29 of Programming Perl; the entry on $. in perlvar(1) and in Chapter 28 of Programming Perl; Recipe 8.27.
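
The byte-address index mentioned at the end of the Discussion can be sketched roughly as follows. This is not the code from Recipe 8.27, only a minimal illustration of the idea: record where each line starts with tell, then seek to that offset when a line is wanted. The helper names build_offsets and line_at are hypothetical, and the sketch keeps the index in an in-memory array for brevity, which assumes the index itself (one number per line) fits in memory even when the file does not.

```perl
#!/usr/bin/perl
# offset_index - sketch of a byte-offset line index (hypothetical helpers)
use strict;
use warnings;

# build_offsets(FH): return a list whose element N is the byte offset
# at which line N+1 of the file starts.
sub build_offsets {
    my ($fh) = @_;
    seek($fh, 0, 0) or die "Can't rewind: $!\n";
    my @offsets = (0);                      # line 1 starts at byte 0
    while (defined(my $line = <$fh>)) {
        push @offsets, tell($fh);           # start of the following line
    }
    pop @offsets;                           # last entry is EOF, not a line start
    return @offsets;
}

# line_at(FH, \@offsets, N): return line N, counting from 1,
# or undef if the file has no such line.
sub line_at {
    my ($fh, $offsets, $n) = @_;
    return if $n < 1 || $n > @$offsets;
    seek($fh, $offsets->[$n - 1], 0) or die "Can't seek: $!\n";
    return scalar <$fh>;
}

# Demonstration on an in-memory file; a real program would open a file on disk.
my $text = "alpha\nbeta\ngamma\n";
open(my $fh, "<", \$text) or die "Can't open in-memory file: $!\n";
my @offsets = build_offsets($fh);
print line_at($fh, \@offsets, 2);           # the second line
```

Building the index costs one linear pass, after which any line can be fetched with a single seek and read. Note the same 0-versus-1 bookkeeping as elsewhere in this recipe: callers count lines from 1, while the offsets array is indexed from 0.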