I’ve been trying to drag useful data out of MS Excel files without having to actually install Windows somewhere – and preferably without having to work with Excel. I discovered that Excel will export files as XML. The schema is painful, but, after a bit of working with XML::Simple, I was able to get the data into a more useable format.
This isn’t a finished product by any means, but I thought it might be useful to someone as a starting point, so that you don’t have to plough through the XML schema yourself:
#!/usr/bin/perl use XML::Simple; my @cols; my $x = XMLin('AcademicTables.xml'); # Name of file my $w = $x->{'Worksheet'}; foreach my $worksheet (@{$x->{'Worksheet'}}) { next unless $worksheet->{'ss:Name'} eq 'Academics_Depts'; # Name of worksheet @rows = @{$worksheet->{'Table'}->{'Row'}}; # First row - Get the column names $firstrow = shift(@rows); foreach my $cell (@{$firstrow->{'Cell'}}) { push @cols, $cell->{'Data'}->{'content'}; } # The the rest of the rows foreach my $row (@rows) { @values = @{$row->{'Cell'}}; foreach $col (@cols) { $c = shift @values; $val = $c->{'Data'}->{'content'}; $val =~ s/n{2,}/<br />/g; # Remove multiple newlines print $col . " -> " . $val . "n"; } } }