APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Replacing a Spreadsheet

Spreadsheets can be powerful tools, and particularly so in the hands of an expert user. A spreadsheet can be used to reorganize data and to extract information not otherwise available. For example, at a client site, an application report generates a listing of hourly billing, but can't give the cross-reference totals desired. The raw output looks something like this:


Branch Date     Hrs.   Employee  Cust    Type     Status Shift
MAIN   01/01/03 9.00               1228  A2              3
SOUTH  01/01/03 9.00    99999      1228  A3       C      3
SOUTH  01/01/03 8.50    12332     46765  D2       X      2
...
 

The customer needed several breakdowns of this data, including hours by shift, and by status. The application could produce this, but only by purchasing a very expensive add-on report module. As the customer saw no other benefit to that expense, they instead loaded the report into a spreadsheet and created the needed output from that.

This was a complex procedure, but the responsible person understood how to do it, and provided the data for several years in that manner. Earlier this year, however, she left on maternity leave, and the job had to be taken over by someone else. The generation of the report itself also involved some complexity, and of course the spreadsheet manipulations were not only difficult but were ad-hoc: the owners might need one representation today ("What is the breakdown of status C for the WEST office in February") and another tomorrow. The new employee just wasn't up to the challenge.

I offered to provide an HTML page that could generate desired reports on demand. I did this with a Perl cgi script that is reproduced with comments here. The advantage of this is that anyone can run it - even the owners and managers who need the ad hoc queries. If new requirements come up, I can easily add them to the output.

#!/usr/bin/perl
$SPREAD_DIR="/usr/home/pcunix/www/data/";
$TITLE="Billing Breakdown";
$UNPACK="A6x1A8x1A5x1A8x1A9x2A2x7A1x6A1";
# The "A" strips ending blanks, "x" skips characters
# 
$GOODLINE=60;
# Length of a "good" line
#
$MYNAME=$ENV{REQUEST_URI};
# So here, will be "/cgi-bin/billing.pl"
#

use CGI qw(:standard);
foreach $i (param) {
  # set each parameter to a matching variable
  foreach $j (param($i) ) {
    $$i=$j;
  }
}
#
print "Content-TYPE: text/html\n\n";
chdir("$SPREAD_DIR");
# output beginning of html page
#
myhead();
# if we have all data, do the selection and produce output
#
runit() if defined $havedata;
# or if we only have the files to read, get rest of selection criteria
#
selecting() if defined $havefiles;
#
# Otherwise, here we are at the beginning
#
$size=0;
$size++ while (<*SP>);
$size %=5;
print <<EOF;
<p>
<form method=post action="$MYNAME">
<table style="margin-left: auto; margin-right: auto;" width="75%">
<tr><td><select name="file" size="$size">
EOF
$s="selected"; # just to force first selection
foreach (<*SP>) {
 print qq(<option $s value="$_">$_</option>);
 $s="";
}
#
print <<EOF;
</select></table>
<input value="Choose File" type=submit name=havefiles>
</form>
<hr></body></html>
EOF

sub myhead() {
#
print  <<EOF;
<html><head><title>$TITLE</title>
</head><body>
<h2>$TITLE</h2>
EOF
#
}

sub selecting {
# This lets the user select the criteria for the report
# We build the selections from data actually available in the 
# file, which makes this self-maintaining.
#
print "<h2>$file</h2>";
open(I,"<$file");
@all=<I>;close I;
[email protected];
print "<br>$clines total lines, ";
$lowdate="999999";$highdate="000000";
$lowdisp="";$highdisp="";
%allclients=();
%allbranches=();
%allstatus=();
$clines=0;
foreach(@all) {
  next if /^Branch/;
  next if length($_) < $GOODLINE;
  chomp;
  next if not $_;
  $clines++;
  $line=$_;
  $line =~ s/^\f\r//;
  chomp $line;
  @foo=unpack ($UNPACK,$line);
  @dt=split /\//, $foo[1];
  $t=sprintf("%6d",$foo[4]);
  $t="(BLANK)" if $t == 0;
  $allclients{$t}=$t;
  $t=uc($foo[0]);
  $t=~ s/ //g;
  $t="(BLANK)" if not $t;
  $allbranches{$t}=$t ;
  $t=uc($foo[5]);
  $t=~ s/ //g;
  $t="(BLANK)" if not $t;
  $allstatus{$t}=$t ;
  $td=sprintf("%0.2d%0.2d%0.2d",$dt[2],$dt[0],$dt[1]);
  if ($td < $lowdate) {
    $lowdate=$td;
    $lowdisp=$foo[1];
  }
  if ($td > $highdate) {
    $highdate=$td;
    $highdisp=$foo[1];
  }
}
#
$size=%allclients;$size %=5;$size++;
print <<EOF;
$clines lines read 
<p>
<form method=post action="$MYNAME">
<table style="margin-left: auto; margin-right: auto;" width="75%">
<tr><td>Low date:</td><td><input type=text size=8 maxlength=8 name="slowdate" value=$lowdisp></td>
<tr><td>High date:</td><td><input type=text size=8 maxlength=8 name="shighdate" value=$highdisp></td>
<tr><td>Client(s)</td><td><select name="clients" size="$size" multiple="multiple">
EOF
#
  print qq(<option selected value="(ALL)">(ALL)</option>);
  print qq(<option value="(BLANK)">(BLANK)</option>);
foreach (sort keys %allclients) {
  next if /BLANK/;
  print qq(\n<option value="$_">$_</option>);
}

#
$size=%allbranches;$size %=5;$size++;
print <<EOF;
</select>
<tr><td>Branch</td><td><select name="branches" size="$size" multiple="multiple">
EOF
#
  print qq(<option selected value="(ALL)">(ALL)</option>);
foreach (sort keys %allbranches) {
  print qq(\n<option value="$_">$_</option>);
}
#
$size=%allstatus;$size %=5;$size++;
print <<EOF;
<tr><td>Service Type</td><td><select name="servtype" size="$size" multiple="multiple">
EOF
#
  print qq(<option selected value="(ALL)">(ALL)</option>);
foreach (sort keys %allstatus) {
  print qq(\n<option value="$_">$_</option>);
}
#
print <<EOF;
</table>
<input type=hidden  name="lowdisp" value="$lowdisp">
<input type=hidden  name="highdisp" value="$highdisp">
<input type=hidden  name="lowdate" value="$lowdate">
<input type=hidden  name="highdate" value="$highdate">
<input type=hidden  name="file" value="$file">
<input type=submit name=havedata>
</form>
</body></html>
EOF
exit 0;
#
}

sub runit {
# This generates the actual output
#
@values = param( 'clients' );
@bvalues = param( 'branches' );
@svalues = param( 'servtype' );
$now=time();
print "<p>From $file, for $slowdate to $shighdate, with <p>Clients: ";
foreach(@values) {
  print "<br>$_";
}
print "<p>Branches";
foreach(@bvalues) {
  print "<br>$_";
}
print "<p>Service Type";
foreach(@svalues) {
  print "<br>$_";
}
#
#
open(I,"$file");@all=<I>;close I;
@dt=split /\//,$slowdate;
$lowdate=sprintf("%d%0.2d%0.2d",$dt[2],$dt[0],$dt[1]);
@dt=split /\//,$shighdate;
$highdate=sprintf("%d%0.2d%0.2d",$dt[2],$dt[0],$dt[1]);

$gt=0;$tct=0;
foreach (@all) {
  next if /^Branch/;
  next if length($_) < $GOODLINE;
  $line=$_;
  $line =~ s/^\f\r//;
  chomp $line;
  @foo=unpack ($UNPACK,$line);
  @dt=split /\//, $foo[1];
  $thisdate=sprintf("%d%0.2d%0.2d",$dt[2],$dt[0],$dt[1]);
  next if $thisdate < $lowdate;
  next if $thisdate > $highdate;
  foreach (@foo) {
   s/ //g;
 }
 $t=sprintf("%0.5d",$foo[4]);
 $t="(BLANK)" if not $t;
  $cmatch=0;
 foreach (@values) {
   $cmatch=1 if /ALL/;
   last if $cmatch;
   $tt=$_;
   $tt="(BLANK)" if not $tt;
   $cmatch=1 if ($tt == $t);
 }
 next if not $cmatch;
  $cmatch=0;
 foreach (@bvalues) {
   $cmatch=1 if /ALL/;
   last if $cmatch;
   $t=uc($foo[0]);
   $tt=uc($_);
   $t=~ s/ //g;
   $tt=~ s/ //g;
  $t="(BLANK)" if not $t;
  $tt="(BLANK)" if not $tt;
   $cmatch=1 if ($tt eq $t);
 }
 next if not $cmatch;
  $cmatch=0;
 foreach (@svalues) {
   $tt=uc($_);
   $cmatch=1 if /ALL/;
   last if $cmatch;
   $t=uc($foo[5]);
   $t=~ s/ //g;
   $tt=~ s/ //g;
  $t="(BLANK)" if not $t;
  $tt="(BLANK)" if not $tt;
   $cmatch=1 if ($tt eq $t);
 }
 next if not $cmatch;
 push @detail, "<tr>";
 foreach (@foo) {
  push @detail, "<td>$_</td>\n";
 }
 $tct++;
 $gt+=$foo[2];
 $pline=join "\|",@foo;
 $cbystatus="$foo[4]|$foo[6]";
 $shiftbystatus="$foo[4]|$foo[6]|$foo[7]";
 $bystatus{$cbystatus}+=$foo[2];
 $countbystatus{$cbystatus}++;
 $countbyshift{$shiftbystatus}++;
}
  printf("<hr><p>Total records %d Hours  %.2f </td>\n", $tct,$gt);
#
print <<EOF;
<hr>
<table>
EOF
#
print  "<tr><td><b>Sum of Hours</b></td>";
print  "<tr><td>Client</td><td>Status</td><td>Sum</td>\n";
$gt=0;
foreach (sort keys %bystatus) {
  @t=split /\|/,$_;
  printf("<tr><td>$t[0]</td><td>$t[1]</td><td align=right>%.2f</td>\n",$bystatus{$_});
  $gt+=$bystatus{$_};
}
  print "<tr><td></td><td></td><td align=right>-------</td>";
  printf("<tr><td>Total</td><td></td><td align=right>%.2f</td>\n",$gt);

print  "</table><hr><table><tr><td><b>Count</b></td>\n";
  print  "<tr><td>Client</td><td>Status</td><td>Count</td>\n";
$tct=0;
foreach (sort keys %countbystatus) {
  @t=split /\|/,$_;
  $tct+=$countbystatus{$_};
  print  "<tr><td>$t[0]</td><td>$t[1]</td><td align=right>$countbystatus{$_}</td>\n";
}
  print  "<tr><td></td><td></td><td align=right>-----</td>\n";
  printf("<tr><td>Total</td><td></td><td align=right>%d</td>\n",$tct);

print  "</table><hr><table><tr><td><b>By Shift</b></td>\n";
  print  "<tr><td>Shift</td><td>Client</td><td>Status</td><td>Count</td>\n";
$tct=0;
foreach (sort keys %countbyshift) {
  @t=split /\|/,$_;
  $tct+=$countbyshift{$_};
  print  "<tr><td>$t[2]</td><td> $t[0]</td><td>  $t[1]</td><td align=right> $countbyshift{$_}</td>\n";
}
  print  "<tr><td></td><td></td><td></td><td align=right>-----</td>\n";
  printf("<tr><td>Total</td><td></td><td></td><td align=right>%d</td>\n",$tct);
#
print <<EOF;
</table>
<hr>
<p>Detail Lines
<table style="margin-left: auto; margin-right: auto;" width="75%">
EOF
foreach (@detail) {
  print;
}
print <<EOF;
</table>
<hr>
<hr>
</body></html>
EOF
exit 0;
#
}
 

Got something to add? Send me email.





(OLDER)    <- More Stuff -> (NEWER)    (NEWEST)   

Printer Friendly Version

-> -> Replacing a Spreadsheet


3 comments



Increase ad revenue 50-250% with Ezoic


More Articles by

Find me on Google+

© Tony Lawrence




---January 3, 2005

Would of something like a ODBC connector from their spreadsheet program to a MySQL (or any other SQL server) database helped out with something like this?

I am just asking because I've used MS Office's Access program and it's fairly usefull for things... Of course it has its limitations, like even though Access Database is relational, it's not much above a plain text file because of the program's limitations. (for instance it's not multiuser, you have to have Access running to use a Access database, you can't access it over a network and so on and so forth).

However one of the things that makes it interesting is the ability to use Access (and other parts of Office, like you could possible yank data into a spreadsheet to build presentation graphs) as a front-end to a SQL database with the ODBC connectors.

OpenOffice.org/Staroffice is working on stuff like this, currently it's a "hidden" feature, because it's not realy mature enough for mainstream use. I've setup a OpenOffice.org tied into UnixODBC tied into MySQL to setup tables and entries, but it's not that usefull, not like Access is. StarOffice uses a free version of Adabas database version for it's database backend.

But it wouldn't be to hard to use a ODBC server to dump a spreadsheet program into a SQL server and tie that into a webserver thru mature PHP/Perl/Python SQL libraries/functions.

Then later on, if they need to move on from using just a spreadsheet program to a real database all their information for so many years already exist in a SQL database, ready to be used and people used to spreadsheets could still use that to enter data.

Of course you'd probably want to use Postgresql if your goal is interlopey with potentially MS SQL, Oracle, or SysBase (also has a no-cost versoin for small businesses) based SQL servers, since it's closer to standards then MySQL is, or so I've been told.

--Drag

---January 3, 2005


Well, remember that the starting point isn't a spreadsheet. If I wanted to put this into a database, I'd take it straight from the report to there. But really, the database IS the report - why put it anywhere else?

--TonyLawrence



---January 3, 2005

I see. I misunderstood a bit. Thought that you were working with a spreadsheet format or whatnot, not the output from a seperate program.

At the point you started working on the data was a task of reformating text, running filters and such. Perfect task for Perl.

-Drag



------------------------
Kerio Samepage


Have you tried Searching this site?

Unix/Linux/Mac OS X support by phone, email or on-site: Support Rates

This is a Unix/Linux resource website. It contains technical articles about Unix, Linux and general computing related subjects, opinion, news, help files, how-to's, tutorials and more.

Contact us





Don't blame me for the fact that competent programming, as I view it as an intellectual possibility, will be too difficult for "the average programmer" — you must not fall into the trap of rejecting a surgical technique because it is beyond the capabilities of the barber in his shop around the corner. (Edsger W. Dijkstra)

Don't blame me for the fact that competent programming, as I view it as an intellectual possibility, will be too difficult for "the average programmer" — you must not fall into the trap of rejecting a surgical technique because it is beyond the capabilities of the barber in his shop around the corner. (Edsger W. Dijkstra)












This post tagged: