
Perl Read First Column of a File

I used to work on an application where we wanted to let non-developers, and even managers, provide input to our system in batches. We did not want them to fill in a web form, as those usually have rather limited editing capabilities.

They were already familiar with Microsoft Excel, so letting them use that and send us the Excel files looked like a good way to get them involved.

Then we faced the question: given a file created by Microsoft Excel, how can we read its content while running on a Linux machine running Red Hat, or on a Solaris box?

CPAN has quite a few modules for reading Excel files. There is Spreadsheet::Read, which provides a very high-level abstraction, but that means we might not have access to all the details hidden in the Excel file. On the other hand, it can handle other types of spreadsheets too. In addition to Microsoft Excel files, it can also read OpenOffice, LibreOffice, SquirrelCalc, and plain CSV files. All of these with one simple interface.

Then there are the low-level libraries for reading files written by different versions of Excel: Spreadsheet::ParseExcel can read Excel 95-2003 files, and Spreadsheet::ParseXLSX can read files in the Excel 2007 Open XML XLSX format. There is also Spreadsheet::XLSX, but as far as I can tell that's not recommended any more.

In addition there is also Spreadsheet::ParseExcel::Simple, which works at an abstraction level somewhere between the above two, but it has not been changed for quite some time and I am not sure if it is necessary at all.

Create the Excel file

I don't have Excel on my computer, so instead I am going to use a file created using Excel::Writer::XLSX as explained in the how to create Excel file article.

The script to create the Excel file is here:

examples/create_excel.pl

#!/usr/bin/perl
use strict;
use warnings;

use Excel::Writer::XLSX;

my $workbook  = Excel::Writer::XLSX->new( 'simple.xlsx' );
my $worksheet = $workbook->add_worksheet();

my @data_for_row = (1, 2, 3);
my @table = (
    [4, 5],
    [6, 7],
);
my @data_for_column = (10, 11, 12);

$worksheet->write( "A1", "Hi Excel!" );
$worksheet->write( "A2", "second row" );

$worksheet->write( "A3", \@data_for_row );
$worksheet->write( 4, 0, \@table );
$worksheet->write( 0, 4, [ \@data_for_column ] );

$workbook->close;

The resulting Excel file looks like this:

Spreadsheet::Read

Let's see the highest level of abstraction, the one that makes it easiest to access the contents of the Excel file.

Spreadsheet::Read exports a number of functions that you either import, or use with their fully qualified names. In our solution we are going to import the ReadData function and use the fully qualified names of the other functions, for no particular reason. Maybe just to show that both work.

The strangely named ReadData function accepts a filename that can be an Excel file, an OpenOffice Calc file, a LibreOffice Calc file, or even a plain CSV file. It uses the file extension to guess which format the file is in, loads the appropriate back-end module, and uses that module to load and parse the file. In the end it creates an array reference representing the whole file:

use Spreadsheet::Read qw(ReadData);
my $book = ReadData('simple.xlsx');

The first element of the returned array contains some general information about the file. Each of the remaining elements represents one of the sheets in the original file. In a CSV file there is only one sheet, but the other formats allow multiple sheets. So $book->[1] represents the first sheet. It is a hash reference and we can use it to access the content of the cells using the notation familiar from spreadsheets. $book->[1]{A1} is the A1 element:

say 'A1: ' . $book->[1]{A1};

The output of the above snippet is

A1: Hi Excel!        

This can be great if we know exactly which cells to look at, but if we don't know exactly which rows contain data and how many cells have data, we need some other tools.
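One such tool is the sheet hash itself. Besides the A1-style keys, Spreadsheet::Read documents the keys maxrow and maxcol, and a cell array indexed as [column][row], both counted from 1. The following is a minimal sketch that walks those bounds and lists only the filled cells. To keep it runnable without the module or the Excel file, it uses a small hand-built hash as a stand-in for $book->[1]; the helper name filled_cells and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;
use 5.010;

# Hand-built stand-in for the first sheet. With the real module this would be:
#   my $sheet = ReadData('simple.xlsx')->[1];
my $sheet = {
    maxrow => 3,
    maxcol => 2,
    # cell is indexed as [column][row]; index 0 is unused in both dimensions
    cell   => [
        undef,
        [ undef, 'Hi Excel!', 'second row', 1 ],    # column A
        [ undef, undef,       undef,        2 ],    # column B
    ],
};

# Hypothetical helper: return "LABEL: value" strings for every defined cell.
sub filled_cells {
    my ($sheet) = @_;
    my @found;
    for my $row (1 .. $sheet->{maxrow}) {
        for my $col (1 .. $sheet->{maxcol}) {
            my $value = $sheet->{cell}[$col][$row];
            next if not defined $value;
            # chr(64 + $col) is only valid for columns A to Z
            push @found, chr(64 + $col) . $row . ": $value";
        }
    }
    return @found;
}

say for filled_cells($sheet);
```

Walking the bounds by hand like this is mostly useful for understanding the structure; the row and rows functions described next wrap the same data in a more convenient form.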

Fetch a row

The row function of Spreadsheet::Read accepts a sheet and a row number, and returns an array representing the values of the given row. The size of the returned array depends on the right-most cell that has data. So even though Excel can have many, many columns, our arrays will only grow to the necessary size.

Cells that are empty will have undef in the respective element of the array.

Because we have not imported the row function, we call it by its fully qualified name. The next snippet reads the first row of the first sheet, loops over the indexes, and displays the content of each field, printing an empty string where the value was undef. Note that the snippet prefixes every field with the letter A, so the labels simply count along the row (A1 to A5) rather than naming the real cells, which would be A1, B1, and so on up to E1.

my @row = Spreadsheet::Read::row($book->[1], 1);
for my $i (0 .. $#row) {
    say 'A' . ($i+1) . ' ' . ($row[$i] // '');
}

A1 Hi Excel!
A2 
A3 
A4 
A5 10
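For conventional cell labels (column letter followed by row number), the column index has to be turned into letters. Spreadsheet::Read documents a cr2cell function for this purpose; the sketch below instead uses a hand-written helper, col_to_letters, which is a hypothetical name introduced here so that the example runs without the module. The @row array is a stand-in holding the values the row function returned above.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper: convert a 1-based column number to spreadsheet letters
# (1 => A, 26 => Z, 27 => AA, ...).
sub col_to_letters {
    my ($col) = @_;
    my $letters = '';
    while ($col > 0) {
        my $rem = ($col - 1) % 26;
        $letters = chr(65 + $rem) . $letters;
        $col = int(($col - 1) / 26);
    }
    return $letters;
}

# Stand-in for: my @row = Spreadsheet::Read::row($book->[1], 1);
my @row        = ('Hi Excel!', undef, undef, undef, 10);
my $row_number = 1;

for my $i (0 .. $#row) {
    say col_to_letters($i + 1) . $row_number . ' ' . ($row[$i] // '');
}
```

With this helper the first field is labeled A1 and the last one E1, matching what a spreadsheet program would show.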

Fetch all the rows

Being able to fetch a single row is not enough though. We need to be able to go over all the rows. That's where we can use the rows function provided by the module. This function also accepts a sheet, but it does not need any more parameters. It returns an array of array references, effectively a two-dimensional array or "matrix". Each element in the returned array represents one row in the spreadsheet.

This is how we iterate over all the elements. In the output, the letter stands for the row and the number for the column:

my @rows = Spreadsheet::Read::rows($book->[1]);
foreach my $i (1 .. scalar @rows) {
    foreach my $j (1 .. scalar @{$rows[$i-1]}) {
        say chr(64+$i) . " $j " . ($rows[$i-1][$j-1] // '');
    }
}

The result is

A 1 Hi Excel!
A 2 
A 3 
A 4 
A 5 10
B 1 second row
B 2 
B 3 
B 4 
B 5 11
C 1 1
C 2 2
C 3 3
C 4 
C 5 12
D 1 
D 2 
D 3 
D 4 
D 5 
E 1 4
E 2 6
E 3 
E 4 
E 5 
F 1 5
F 2 7
F 3 
F 4 
F 5 

With this we can already work quite well.
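For example, reading just the first column comes down to taking element 0 of every row. The following is a minimal sketch of that idea; column_values is a hypothetical helper name, and @rows is a hand-built stand-in mirroring what the rows function returned above, so the example runs without the module or the Excel file.

```perl
use strict;
use warnings;
use 5.010;

# Stand-in for: my @rows = Spreadsheet::Read::rows($book->[1]);
my @rows = (
    [ 'Hi Excel!',  undef, undef, undef, 10 ],
    [ 'second row', undef, undef, undef, 11 ],
    [ 1,            2,     3,     undef, 12 ],
    [ undef,        undef, undef, undef, undef ],
    [ 4,            6,     undef, undef, undef ],
    [ 5,            7,     undef, undef, undef ],
);

# Hypothetical helper: pick one column (0-based index) out of every row.
sub column_values {
    my ($col_index, @rows) = @_;
    return map { $_->[$col_index] } @rows;
}

my @first_column = column_values(0, @rows);

# Print the values, one per line, showing empty cells as empty strings.
say $_ // '' for @first_column;
```

The same helper returns any other column when given a different index, for example column_values(4, @rows) for column E.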

Read Excel script

The full script we used to read the Excel file:

examples/read_excel.pl

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

use Spreadsheet::Read qw(ReadData);

my $book = ReadData('simple.xlsx');

say 'A1: ' . $book->[1]{A1};

my @row = Spreadsheet::Read::row($book->[1], 1);
for my $i (0 .. $#row) {
    say 'A' . ($i+1) . ' ' . ($row[$i] // '');
}

my @rows = Spreadsheet::Read::rows($book->[1]);
foreach my $i (1 .. scalar @rows) {
    foreach my $j (1 .. scalar @{$rows[$i-1]}) {
        say chr(64+$i) . " $j " . ($rows[$i-1][$j-1] // '');
    }
}


Source: https://perlmaven.com/read-an-excel-file-in-perl
