/*

       This program produces web site statistics from log file records.

 

       Various reports are produced as a function of criteria supplied by the user.

 

       Depending on the time period specified by the user, one or more log files

       must be accessed.  There is always a current "access" log file and possibly

       one or more archive log files.  Archive files are gzipped and named to

       reflect the date.

 

       To speed processing, as little time as possible is spent reading log records.

       Fields are read into work areas and later distributed and processed as needed.

 

       Arrays are built driven by the needs of the various reports.  If an array fills

       up, it is enlarged using "realloc".  Arrays are sorted using quicksort once

       they have been fully populated.

 

       Report 1: Top n clients accessing the site.  Shows a count and the domain name.

              Sorted by count, highest first.

 

       Report 2: Top n files accessed.  Shows count and file (e.g. a gif or an HTML page).

              Sorted by count, highest first.

 

       Report 3: Files containing the string "whatever."  Shows count and client (IP).

              Sorted by count, highest first.

 

       Report 4: Period totals for site.  Shows individual clients, all clients, hits,

              page hits, KB transmitted.  Individual clients = unique IP's; all clients

              counts an IP as many times as days it occurs, and = the sum of the "clients"

              column in report 5; hits is sum of "hits" column in report 5; page hits =

              sum of "page hits" column in report 5; KB transmitted = sum of "KB" column

              in report 5.

 

       Report 5: Daily Totals for site.  Shows date, clients, hits, page hits, KB transmitted.

              Date is in form: weekday (3 char), e.g. Mon; month (3 char), day (no leading zero),

              year (4 digits).  KB is to two decimal places.

 

       Report 6: Daily Averages.  Shows same columns as report 5.  Values = report 4 values

              divided by number of days in period.  Clients here is based on "all clients"

              from report 4, rather than "individual clients."

 

       Report 7: Hourly averages.  Shows hour, hits, percentage, and KB transmitted.

              Hour appears as e.g. "Midnight to 1 am" or "1 am to 2 am."  Percentage is

              nn.n %.

 

       Report 8: Summary of HTTP errors.  Shows count, error code (e.g. 404), text (e.g. Not Found).

              Sorted by count, highest first.

 

       Report 9: (Top 20) requests causing errors.  Shows count and request (file).

              Sorted by count, highest first.

 

       Design notes:

 

              Since this isn't Perl, and none of us can figure out how to redirect standard output

              from a shell command back to this C program's standard in, the output of the cat or

              gzcat is to a work file.
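
              A possible way around the work file, offered as a sketch rather than as what
              this program does: POSIX popen() runs a shell command and hands back a FILE *
              attached to that command's standard output.  The helper name and its line
              counting are illustrative assumptions; only popen, pclose, fgets and strchr
              are real library calls.

```c
#define _POSIX_C_SOURCE 200809L   // asks strict-mode headers to declare popen
#include <stdio.h>
#include <string.h>

// Run a shell command and count the newline-terminated lines it writes
// to standard output.  Returns -1 if the command could not be started.
long count_command_lines(const char *command)
{
    FILE *pipe_fp = popen(command, "r");   // read end of the command's stdout
    char line[8192];
    long count = 0;

    if (pipe_fp == NULL) { return -1; }
    while (fgets(line, sizeof line, pipe_fp) != NULL)
    {
        // a line longer than the buffer arrives in pieces; count it once
        if (strchr(line, '\n') != NULL) { count++; }
    }
    pclose(pipe_fp);
    return count;
}
```

              With something like this, the output of cat or gzcat could be read record by
              record without first landing in a work file.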

 

              For reports 4-9 we read in the log records one-by-one.  But we only read each

              work file once, extracting all of the data for all of those reports in one pass.

 

              The processing for reports 1-3 is affected by the fact that a given run of the stats

              engine may have anywhere from a handful to millions of log records to process.  This

              precludes building an in-core array with one entry per ip without first determining

              how many unique ip's occur.  And even with an array sized from that count, every log

              record read would require a search of the array to find the entry to increment.

              So a hash table is the better data structure to use.

 

              For these reports, we extract selected log fields from each relevant log file

              (based on the user-supplied date range) using shell commands (via system calls)

              and concatenate this output to a work file.  This allows us to determine how many

              hash table entries to allocate for the primary array.  We also can perform a number

              of useful operations on the combined log data.

 

              For the purposes of report 1 we "cut" (extract) just the first field, which is the ip,

              from each log record.*  After cutting from all relevant files, this gives us a file of

              all occurrences of ip's.  We sort this and pipe to "uniq" which gives us just the unique

              ip's that occurred.  A "wc" of that tells us how many ip's to take into account when

              allocating the primary hash table structure.

 

              We can now populate the hash table.  Reading in a record from the file with all ip

              occurrences, we directly hash to the right entry and increment its count.  In case of

              the entry already being populated by a different ip (a synonym situation), we chain a

              linked list off the main entry so that each different ip mapping to that same primary

              entry gets its own entry.
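
              A minimal sketch of that scheme, assuming a table that was zeroed when it was
              allocated (calloc) and keys that are never empty.  The names count_entry,
              hash_key and bump_count are hypothetical, and the hash function is a stand-in,
              not the one hash_ip uses.

```c
#include <stdlib.h>
#include <string.h>

// One hash table entry; entries that collide hang off next.
struct count_entry
{
    char key[256];
    long count;
    struct count_entry *next;
};

// Simple multiplicative string hash (an assumption for this sketch).
unsigned long hash_key(const char *key, unsigned long table_size)
{
    unsigned long h = 0;
    while (*key != '\0') { h = h * 31 + (unsigned char) *key++; }
    return h % table_size;
}

// Add one occurrence of key.  Returns the new count, or -1 if out of memory.
long bump_count(struct count_entry *table, unsigned long table_size, const char *key)
{
    struct count_entry *entry = &table[hash_key(key, table_size)];

    if (entry->key[0] == '\0')                 // primary slot is empty: claim it
    {
        strncpy(entry->key, key, sizeof entry->key - 1);
        return entry->count = 1;
    }
    for (;;)
    {
        if (strncmp(entry->key, key, sizeof entry->key - 1) == 0) { return ++entry->count; }
        if (entry->next == NULL) { break; }
        entry = entry->next;
    }
    // synonym: chain a new zeroed entry off the end of the list
    entry->next = calloc(1, sizeof *entry->next);
    if (entry->next == NULL) { return -1; }
    strncpy(entry->next->key, key, sizeof entry->next->key - 1);
    return entry->next->count = 1;
}
```

              A table of a single slot forces every second ip onto the chain, which makes
              the synonym path easy to exercise.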

 

              When we're done accumulating occurrences, we can produce report 1.

 

              For report 2 we employ a similar strategy to that used for report 1.  We extract the

              "file" field instead of the ip field.  A hash table is eventually built based on the

              file combining the "file" fields from all relevant log files.

 

              For report 3 we read the log records as in reports 4-9, but proceed more like we do for

              reports 1 and 2.  We look for records whose "file" field contains the user-supplied string.

              For those that do, we write a record to a file with just the ip.  After all log files

              have been read, we process the resulting ip file.  We sort it and uniq it to get the

              number of unique ip's.  We read this in and allocate an array with that many entries.

              We read the sorted ip file and accumulate the count of occurrences for each ip there.

              We then write the ip's and counts to a file and sort it by count, descending.  We read

              that in and produce the report.

 

              *Since for report 5 we need ip information within date, we build a work file with both

              ip and date.  We use this later for report 1, cutting the ip column to create the ip file.

 

*/

 

#include <stdio.h>

#include <string.h>

#include <time.h>

#include <math.h>

#include <stdlib.h>

#include <unistd.h>       /* For getpid, execle commands */

#include <sys/types.h>

#include <sys/socket.h>

#include <netinet/in.h>

#include <arpa/inet.h>

#include <netdb.h>

#include <ctype.h>

 

/* Multiplied by the number of unique ip's when malloc'ing the hash table */

#define scaling_factor 3

 

char ip[8192];            /* Host   (IP address)       */

char ident[256];       /* Ident field          */

char authuser[256];      /* Authuser             */

char timestamp1[256];    /* Time Stamp part 1 */

char timestamp2[256];    /* Time Stamp part 2 */

char file1[256];       /* HTTP Request       part 1       */

char file2[256];       /* HTTP Request       part 2       */

char file3[256];       /* HTTP Request       part 3       */

char status[256];       /* Status Code          */

long bytes;          /* Transfer Volume */

 

char *blank_date = "           ";     /* 11 blanks */

char  blank_ip[16];

char  blank_string[256];

char *buff;

char  buffer[1024];

char *buff2;

long  bump1 = 1000;

long  bump2 = 100;

long  bump3 = 10;

char  cat_string[256];

char  cgipath[10] = "/tmp/";

/*char  cgipath[44] = "/Space/Domains/stats.simplenet.net/cgi-bin/";*/

char  char_oldest_day[4];

char  char_oldest_year[5];

char  char_youngest_day[4];

char  char_youngest_year[5];

int   code_red_found = 0;

char  curr_file[100];

long  curr_s1 = 0;

long  curr_s2 = 0;

long  curr_s3 = 0;

char *dashed_line = "\n\t\t\t---------------------------\n\n";

char  date[12];

long  date_bias;

char  date_save[12];

char  domain[256];

char  end_date[12];

char  end_day[3];

char  end_month[3];

char  end_year[5];

long  endloop;

char  enterprise_string[64];

char *enterprise_string1 = "format=%Ses->client.ip%";

char *enterprise_string2 = "%Req->srvhdrs.content-length%";

long  entire_period = 0;

long  e400_count = 0;

char *e400_lit = "Syntax error         ";

long  e401_count = 0;

char *e401_lit = "Unauthorized         ";

long  e402_count = 0;

char *e402_lit = "Payment required     ";

long  e403_count = 0;

char *e403_lit = "Forbidden            ";

long  e404_count = 0;

char *e404_lit = "Not found            ";

long  e405_count = 0;

char *e405_lit = "Method not allowed   ";

long  e406_count = 0;

char *e406_lit = "Not acceptable       ";

long  e410_count = 0;

char *e410_lit = "No longer available  ";

long  e500_count = 0;

char *e500_lit = "Internal server error";

long  e501_count = 0;

char *e501_lit = "Not implemented      ";

long  e502_count = 0;

char *e502_lit = "Bad gateway          ";

long  e503_count = 0;

char *e503_lit = "Service unavailable  ";

long  e504_count = 0;

char *e504_lit = "Gateway timeout      ";

char  error_code_file_string[256] = "error_code_file_";

char  file2_buffer[8192];

int   file10_open = 0;

int   file11_open = 0;

int   file12_open = 0;

int   file15_open = 0;

int   file16_open = 0;

long  file_date_status = 0;

char  file_list_string[256] = "file_list_";

long  fqdn_limit;

char  header_work[256];

char  header_work2[256];

long  indexx;

char *input_parm_file;

long  int_start_year;

long  int_start_month;

long  int_start_day;

long  int_end_year;

long  int_end_month;

long  int_end_day;

long  ip_date_file_removed = 0;

char  ip_date_file_string[256] = "ip_date_file_";

char  ip_save[256];

int   null_found = 0;

long  num_clients;

long  num_dates;

long  num_dates_used = 0;

long  num_uniq_items;

long  oldest_year;

long  oldest_month;

long  oldest_day;

char  outfile_string[256] = "outfile_";

char  parm_line[256];

char  pid[256] = "\0";

char  plural[2];

char  random_suffix_char[2] = " ";

int   rc = 0;

char  rec_date[12];

char  rec_day[3];

int   rec_day_int;

long  rec_hour;

char  rec_month[4];

int   rec_month_int;

long  rec_too_new;

char  rec_year[3];

int   rec_year_int;

char  rec_year_long[5];

long  record_count = 0;

char  report1_ip_file_string[256] = "report1_ip_file_";

char  report2_file_file_string[256] = "report2_file_file_";

char  report3_ip_file_string[256] = "report3_ip_file_";

char  report9_file_file_string[256] = "report9_file_file_";

char  report_file_string[256] = "report_file_";

long  report_index;

char  report_input_file[256];

long  report_limit;

char  report_lit[256];

long  report_num;

char  report_work[256];

char  r1_exclude_string[16];

long  r1_limit = 30;

char *r1_lit = "Top %ld Clients Accessing %s:\n\n";

char  r2_exclude_string[256];

long  r2_limit = 40;                  /* User supplied */

char *r2_lit = "Top %ld Files Accessed on %s:\n\n";

long  r3_limit = 40;

char *r3_lit = "Clients accessing files containing the string \"%s\" in this period:\n\n";

char  r3_string[256];

int   r5sorted_ip_date_file_created = 0;

int   r5sorted_ip_date_file_removed = 0;

char  r5sorted_ip_date_file_string[256] = "r5sorted_ip_date_file_";

long  r5_data_needed = 0;

long  r9_limit = 20;

char *r9_lit = "Top %ld Requests Causing Errors:\n\n";

char  sorted_error_code_file_string[256] = "sorted_error_code_file_";

char  sorted_report_file_string[256] = "sorted_report_file_";

char  sort_string[256];

char  start_date[12];

char  start_day[3];

char  start_month[3];

char  start_year[5];

char  std_error_file_string[256] = "std_error_file_";

char  std_error_file_string2[256] = "std_error_file2_";

long  tot_clients;

long  tot_hits;

float tot_kb;

long  tot_page_hits;

char  uniq_ip_count_string[256] = "uniq_ip_count_";

char  uniq_ip_list_string[256] = "uniq_ip_list_";

char  uniq_item_count_string[256] = "uniq_item_count_";

char  uniq_item_list_string[256] = "uniq_item_list_";

char *val;

char  work_bytes[256];

long  work_count;

char  work_date[12];

char  work_day[4];

char  work_daynum[3];

char  work_file_string[256] = "work_file_";

char  work_mon[4];

char  work_string1[256];

char  work_string2[256];

char  work_string3[256];

char  work_string4[256];

char  work_string5[256];

char  work_year[5];

long  y;

long  youngest_year;

long  youngest_month;

long  youngest_day;

 

typedef struct struc_1

{

       char   s1_string[256];

       long   s1_count;

       struct struc_1 *s1_next_hash_entry;

}d1;

 

struct struc_1 *report_hash;

struct struc_1 *report_array;      /* Really an array, not used as a hash table.  See create_report1_file(). */

 

typedef struct struc_3

{

       char   s3_date[12];

       long   s3_clients;

       long   s3_hits;

       long   s3_page_hits;

       float  s3_kbytes;

       struct struc_3 *s3_next_hash_entry;

}d3;

 

struct struc_3 *report5_hash;

 

typedef struct struc_4

{

       char  s4_time[21];

       long  s4_hits;

       float s4_kbytes;

}d4;

 

struct struc_4 *report7_array;

 

struct tm *time_ptr;

time_t lt;

 

FILE *fptr1;       /* log file list            */

FILE *fptr2;       /* cat'd log file          */

FILE *fptr3;       /* stats report file          */

FILE *fptr4;       /* count of unique ip's          */

FILE *fptr5;       /* ip from each log rec    */

FILE *fptr6;       /* report initial data as i/p */

FILE *fptr7;       /* count of report ip's or files*/

FILE *fptr8;       /* report data to sort         */

FILE *fptr9;       /* sorted report data          */

FILE *fptr10;       /* ip_date_file                */

FILE *fptr11;       /* report1_ip_file            */

FILE *fptr12;       /* report2_file_file          */

FILE *fptr13;       /* error code file          */

FILE *fptr14;       /* sorted error code file       */

FILE *fptr15;       /* report3 initial data as o/p */

FILE *fptr16;       /* report9 initial data as o/p */

FILE *fptr17;       /* input parameter file          */

 

long  check_date_range();

void  check_file_date();

long  check_for_match();

long  check_for_wildcards();

void  close_some_files();

int   create_record_stream();

void  create_report();

void  create_report_file();

char *dayname();

void  do_reports();

char *error_lit();

void  estimate_num_dates_entire();

void  estimate_num_dates_specific();

void  get_count();

void  get_dates();

void  get_domain_name();

void  get_form_input();

void  get_ip_count();

void  get_random_suffix();

void  get_time();

char *get_value();

long  hash_date();

long  hash_file();

long  hash_ip();

void  init();

void  init_report5_hash();

void  init_report7_array();

void  init_report_array();

void  init_report_hash();

char *month_name();

long  month_num();

void  omit_header();

void  open_report_input_file();

void  open_some_files();

void  output_report5();

long  page_hit();

void  parse_input();

void  populate_report_hash();

char *prepare_filename_string(char *);

void  prepare_filename_strings();

void  prepare_filename_suffix();

void  prepare_file_list();

void  prepare_parm_input();

void  print_glossary();

void  print_header();

void  print_report_hash();

void  print_sorry();

void  print_time();

void  report();

void  report_setup();

void  reports1and4_setup();

void  report1_setup();

void  report2_setup();

void  report3_ongoing();

void  report3_setup();

void  report4();

void  report5();

void  report5_init();

void  report5_ongoing();

void  report6();

void  report7();

void  report7_ongoing();

void  report8();

void  report9_setup();

void  reports8and9_ongoing();

void  set_num_dates();

void  update_report5_hash();

void  update_report5_totals();

void  update_report_hash();

void  update_strucs();

void  wrapup();

void  wrapup_hogs();

 

 

int main(int argc, char *argv[])

{

       if (argc < 2)

       {

              printf("missing input parm file argument\n");

              exit(1);

       }

       input_parm_file = argv[1];

       init();

 

       /* get name of next log file */

       while (fscanf(fptr1,"%99s",curr_file) != EOF)       /* curr_file holds 100 bytes */

       {

              if (strstr(curr_file,"%") != NULL)       {continue;}

              if (entire_period == 0)                     {check_file_date();}

              if (file_date_status != 0)           {continue;} /* too old */

 

              rc = create_record_stream();

              if (rc > 0) {continue;}

              omit_header();

 

/* to find file clobbering buffer (when "not found" becomes email address) */

printf("curr_file = %s\n",curr_file);

printf("buffer = %s\n\n",buffer);

 

              rec_too_new = 0;

              while (fscanf(fptr2,"%8191s%255s%255s",ip,ident,authuser)

                     != EOF && rec_too_new == 0)

              {

                     /* IP field can contain an unlimited number of leading nulls.

                        Corruption or even coring can occur.  Skip the record.

                     */

                     if (ip[0] < '!') {null_found = 1;}

                                  

                     /* authuser can sometimes consist of multiple fields,

                        so read fields til definitely at timestamp1

                     */

                     if (fscanf(fptr2,"%255s",timestamp1) == EOF) {break;}

 

                     while (timestamp1[0] != '[') {if (fscanf(fptr2,"%255s",timestamp1) == EOF) break;}

 

                     /* Ignore timestamp1 if it has been partially clobbered by new rec.

                        Use current ip rather than ip of new rec (hard to extract from

                        clobbered timestamp1 field).

                     */

 

                     if (strlen(timestamp1) != 21 || strstr(timestamp1,".") != NULL)

                     {

                            if (fscanf(fptr2,"%255s%255s",ident,authuser) == EOF) {break;}

                            if (fscanf(fptr2,"%255s",timestamp1) == EOF) {break;}

                            while (timestamp1[0] != '[') {if (fscanf(fptr2,"%255s",timestamp1) == EOF) break;}

                     }     

 

                     if (fscanf(fptr2,"%255s",timestamp2) == EOF) {break;}

 

                     /* Read http fields til last char of field read is ".

                        If more than one, second is request.  There can be as

                        few as one or two fields, as many as 3 or more.

 

                        URL *can* contain blanks + stuff, throwing off expected field sequence.

                        For example, "GET /cgi-bin/Stats/stats.cgi [ <a href= HTTP/1.0" shows up

                        in the log record, and has 3 extraneous "fields" before "HTTP".

                      */

 

                     work_count = 0;

                     if (fscanf(fptr2,"%8191s",file2_buffer) == EOF) {break;}

                     while (file2_buffer[strlen(file2_buffer)-1] != '"')

                     {

                            work_count++;

                            if (work_count == 2)

                            {

                                   if (strlen(file2_buffer) > 255)

                                   {

                                          memcpy(file2,file2_buffer,255);

                                          memset(file2+255,'\0',1);

                                   }

                                   else {strcpy(file2,file2_buffer);}

                            }

                            if (fscanf(fptr2,"%8191s",file2_buffer) == EOF) {break;}

                     }

 

                     if (fscanf(fptr2,"%255s%255s",status,work_bytes) == EOF) {break;}

                     if   (strcmp(work_bytes,"-") == 0)

                            {bytes = 0;}

                     else       {bytes = atol(work_bytes);}

 

/*printf("%s\n",timestamp1);

*/

                     if       (strstr(file2,"default.ida") != NULL)       {code_red_found = 1;}

 

                     if       (null_found == 0 && code_red_found == 0)

                     {

                            update_strucs();

                     }

                     else

                     {

                            null_found = 0;

                            code_red_found = 0;

                     }

              }

              remove(work_file_string);       /* delete this log's work file without a shell */

       }

 

/*       strcpy(work_string1,"rm ");

       strcat(work_string1,file_list_string);

       system(work_string1);

*/

 

       close_some_files();

       do_reports();

       wrapup();

 

/*       print_time("ending");

*/

       return (0);

}

 

 

void init()

{

/*       print_time("starting");

*/

       prepare_parm_input();

       prepare_filename_suffix();

       prepare_filename_strings();

       prepare_file_list();

       open_some_files();

       if (entire_period == 0) {get_dates();}

       set_num_dates();

       init_report5_hash();

       init_report7_array();

 

       memset(blank_string,' ',255);

       memset(blank_string+255,'\0',1);

       strcpy(report_work,blank_string);

 

       memset(blank_ip,' ',15);

       memset(blank_ip+15,'\0',1);

       strcpy(domain,blank_string);

       strcpy(domain,get_value("login"));

 

       strcpy(r3_string,get_value("nameaccessfile"));

}

 

 

void get_time()

{

       lt = time(NULL);

       time_ptr = localtime(&lt);

}

 

 

void print_time(char * msg)

{

       get_time();

       printf("%s at %s<br>",msg,asctime(time_ptr));

}

 

 

void prepare_parm_input()

{

       /*

              Input parameters containing stats form fields and the path to the

              stats data are contained in the input parm file provided by the

              stats engine wrapper.  Open the file, read all the records, and

              concatenate the data into one long string.  This string will be

              equivalent to a query string that has been url-decoded.

 

              The get_value subroutine will take this string and extract values

              for specified input variables.

       */

 

       if ((fptr17 = fopen(input_parm_file,"r")) == NULL  ) 

       {

              printf("can't open %s\n",input_parm_file);

              exit(1);

       }

 

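       buffer[0] = '\0';

       while (fscanf(fptr17,"%255s",parm_line) != EOF)

       {

              /* leave room for parm_line, the "&" separator and the terminating null */

              if (strlen(buffer) + strlen(parm_line) + 2 > sizeof(buffer)) {break;}

              strcat(buffer,parm_line);

              strcat(buffer,"&");

       }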

 

       if (strcmp(get_value("period"),"entire") == 0)       {entire_period = 1;}

 

       if (strcmp(get_value("usedailytotal"),"on") == 0               /* if report 5 requested */

              || strcmp(get_value("useperiodtotal"),"on") == 0        /* or report 4 requested */

              || strcmp(get_value("usedailyavg"),"on") == 0)         /* or report 6 requested */

       {r5_data_needed = 1;}

}

 

 

void prepare_filename_suffix()

{

       strcat(pid,get_value("login"));         /* use for unique filenames */

       strcat(pid,"_");

       get_random_suffix();

       /* Shouldn't have more than one for an account at a time, but just in case */

       strcat(pid,random_suffix_char);         /* use for unique filenames */

}

 

 

void get_random_suffix()

{

       int stime, random_suffix;

       long ltime;

 

       ltime = time(NULL);

       stime = (unsigned) ltime/2;

       srand(stime);

       random_suffix = rand()%10;

 

       /* rand()%10 is 0 through 9; turn it into the matching digit character */

       random_suffix_char[0] = (char) ('0' + random_suffix);

       random_suffix_char[1] = '\0';

}

 

 

void prepare_filename_strings()

{

       strcpy(error_code_file_string,prepare_filename_string(error_code_file_string));

       strcpy(file_list_string,prepare_filename_string(file_list_string));

       strcpy(ip_date_file_string,prepare_filename_string(ip_date_file_string));

       strcpy(outfile_string,prepare_filename_string(outfile_string));

       strcpy(r5sorted_ip_date_file_string,prepare_filename_string(r5sorted_ip_date_file_string));

       strcpy(report_file_string,prepare_filename_string(report_file_string));

       strcpy(report1_ip_file_string,prepare_filename_string(report1_ip_file_string));

       strcpy(report2_file_file_string,prepare_filename_string(report2_file_file_string));

       strcpy(report3_ip_file_string,prepare_filename_string(report3_ip_file_string));

       strcpy(report9_file_file_string,prepare_filename_string(report9_file_file_string));

       strcpy(sorted_error_code_file_string,prepare_filename_string(sorted_error_code_file_string));

       strcpy(sorted_report_file_string,prepare_filename_string(sorted_report_file_string));

       strcpy(uniq_ip_count_string,prepare_filename_string(uniq_ip_count_string));

       strcpy(uniq_ip_list_string,prepare_filename_string(uniq_ip_list_string));

       strcpy(uniq_item_count_string,prepare_filename_string(uniq_item_count_string));

       strcpy(uniq_item_list_string,prepare_filename_string(uniq_item_list_string));

       strcpy(work_file_string,prepare_filename_string(work_file_string));

}

 

 

char *prepare_filename_string(char *filename_string)

{

       /* snprintf bounds the result to work_string1, truncating if necessary */

       snprintf(work_string1,sizeof(work_string1),"%s%s%s",cgipath,filename_string,pid);

       return(work_string1);

 

}

 

 

void prepare_file_list()

{

       strcpy(work_string1,"ls ");

/*       strcat(work_string1,get_value("path")); */

       strcat(work_string1,"*access* > ");

       strcat(work_string1,file_list_string);

       system(work_string1);

}

 

 

void open_some_files()

{

       if ((fptr1  = fopen(file_list_string,"r")) == NULL)          {printf("can't open file_list\n");exit(1);}

       if ((fptr3  = fopen(outfile_string,"w")) == NULL)             {printf("can't open outfile\n");exit(1);}

       if ((fptr15 = fopen(report3_ip_file_string,"w")) == NULL)       {printf("can't open report3_ip_file\n");exit(1);}

       if ((fptr16 = fopen(report9_file_file_string,"w")) == NULL)       {printf("can't open report9_file_file\n");exit(1);}

       if (r5_data_needed == 1)

       {

              if ((fptr10 = fopen(ip_date_file_string,"w")) == NULL)

              {

                     printf("can't open ip_date_file\n");

                      exit(1);

              }

              file10_open = 1;

       }

       else if (strcmp(get_value("useclients"),"on") == 0)         /* if report 1 requested */

       {

              if ((fptr11 = fopen(report1_ip_file_string,"w")) == NULL)

              {

                     printf("can't open report1_ip_file\n");

                      exit(1);

              }

              file11_open = 1;

       }

 

       if (strcmp(get_value("usefiles"),"on") == 0)            /* if report 2 requested */

       {

              if ((fptr12 = fopen(report2_file_file_string,"w")) == NULL)

              {

                     printf("can't open report2_file_file\n");

                      exit(1);

              }

              file12_open = 1;

       }

}

 

 

void set_num_dates()

{

       if       (entire_period == 0)       {estimate_num_dates_specific();}

       else                        {estimate_num_dates_entire();}

}

 

 

void estimate_num_dates_specific()

{

       /*

              6/98 - 6/98 ->  1 month

              6/97 - 6/98 -> 13 months

              7/97 - 6/98 -> 12 months

              5/97 - 6/98 -> 14 months
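
              The same arithmetic as a sketch (function names are illustrative, not part of
              this program).  Every month is counted as 31 days, so the second function gives
              an upper bound on the number of dates in the range.

```c
// Number of calendar months touched by the range start..end, inclusive.
int months_in_range(int start_year, int start_month, int end_year, int end_month)
{
    return 12 * (end_year - start_year) + (end_month - start_month) + 1;
}

// Upper bound on the number of dates: treat every month as 31 days long.
long max_dates_in_range(int start_year, int start_month, int end_year, int end_month)
{
    return 31L * months_in_range(start_year, start_month, end_year, end_month);
}
```

              Rounding every month up to 31 days wastes a few table slots but means the
              table never has to grow.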

       */

 

       long work;

       int year_diff, month_diff, startmonth, endmonth;

 

       /* find number of months in range, multiply by 31 */

 

       year_diff = atoi(end_year) - atoi(start_year);

 

       startmonth = atoi(start_month);

       endmonth   = atoi(end_month);

       month_diff = endmonth - startmonth;

       num_dates  = 31 * ((12 * year_diff) + month_diff + 1);

 

       /* find date_bias (see comment in hash_date subroutine) */

 

       memcpy(rec_year_long,start_year,4);

       memset(rec_year_long + 4,'\0',1);

       strcpy(rec_month,start_month);

       strcpy(rec_day,"01");

 

       work = (long) (372 * atoi(rec_year_long) + 31 * (atoi(rec_month) - 1) + atoi(rec_day));

       date_bias = work % (long) num_dates;

}

 

 

long month_num(char *month_nam)

{

       if (strcmp("Jan",month_nam) == 0) {return (1);}

       if (strcmp("Feb",month_nam) == 0) {return (2);}

       if (strcmp("Mar",month_nam) == 0) {return (3);}

       if (strcmp("Apr",month_nam) == 0) {return (4);}

       if (strcmp("May",month_nam) == 0) {return (5);}

       if (strcmp("Jun",month_nam) == 0) {return (6);}

       if (strcmp("Jul",month_nam) == 0) {return (7);}

       if (strcmp("Aug",month_nam) == 0) {return (8);}

       if (strcmp("Sep",month_nam) == 0) {return (9);}

       if (strcmp("Oct",month_nam) == 0) {return (10);}

       if (strcmp("Nov",month_nam) == 0) {return (11);}

       if (strcmp("Dec",month_nam) == 0) {return (12);}

/*       printf("%s %s bad month name = %s<br>",get_value("login"),timestamp1,month_nam); */

       return (13);

}

 

 

char * month_name(long month_number)

{

       if (month_number == 1)  {return "Jan";}

       if (month_number == 2)  {return "Feb";}

       if (month_number == 3)  {return "Mar";}

       if (month_number == 4)  {return "Apr";}

       if (month_number == 5)  {return "May";}

       if (month_number == 6)  {return "Jun";}

       if (month_number == 7)  {return "Jul";}

       if (month_number == 8)  {return "Aug";}

       if (month_number == 9)  {return "Sep";}

       if (month_number == 10) {return "Oct";}

       if (month_number == 11) {return "Nov";}

       if (month_number == 12) {return "Dec";}

/*       printf("bad month number = %ld<br>",month_number); */

       return "Jan";

}

 

 

void estimate_num_dates_entire()

{

       /*

       For "entire" period, allow for two full years of dates.  Since the most

       recent date is yesterday, the date range begins two years ago today.

 

       To find the date_bias (see hash_date subroutine for comments), just find the

       number of days so far this year.

 

       Why is it so simple?  Suppose today is 9/16/98.  The normal ("specific")

       date bias calculation would produce "work" = (372 * 98) + (31 * 8) + 16 = 36720,

       which would be reduced to 264 via modulo 744 (744 is num_dates, 2 * 372).

       264 is just 31*8 + 16, the day number so far this year on a calendar of

       31-day months.  So we don't need the year, nor do we need to modulo.  (This

       works out because 372 * year is a multiple of 744 whenever the year is even.)

 

       p.s. Don't sweat leap year.
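
       The arithmetic can be checked with a small sketch.  It assumes the 372-day
       year (12 months of 31 days) and the 744-slot table used in the code below;
       the function names are illustrative.  Note that the year only drops out when
       it is even, because 372 * year is then a multiple of 744.

```c
// Day number on a calendar where every month has 31 days (372-day year).
long padded_day_number(long year, long month, long day)
{
    return 372 * year + 31 * (month - 1) + day;
}

// Bias for a table of num_dates slots, as computed for a specific period.
long date_bias_for(long year, long month, long day, long num_dates)
{
    return padded_day_number(year, month, day) % num_dates;
}
```

       For 9/16/1998 this gives 31*8 + 16; for the same day in 1999 the result is
       372 higher.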

       */

 

       num_dates = 744; /* Since 372 used in hash_date, double it here, else bad things. */

/*       num_dates = 730;

*/

       get_time();

 

       date_bias = (long) (31 * (time_ptr->tm_mon) + time_ptr->tm_mday);

/*       date_bias = (long) (31 * (time_ptr->tm_mon+1) + time_ptr->tm_mday);

*/

}

 

 

void get_dates()

{

       strcpy(start_day,  get_value("startday"));

       strcpy(start_month,get_value("startmonth"));

       strcpy(start_year, get_value("startyear"));

       strcpy(end_day,    get_value("endday"));

       strcpy(end_month,  get_value("endmonth"));

       strcpy(end_year,   get_value("endyear"));

 

       int_start_year       = atoi(start_year);

       int_start_month       = atoi(start_month);

       int_start_day       = atoi(start_day);

       int_end_year = atoi(end_year);

       int_end_month       = atoi(end_month);

       int_end_day  = atoi(end_day);

 

       strcpy(start_date,month_name(int_start_month));

       strcat(start_date," ");

       strcat(start_date,start_day);

       strcat(start_date," ");

       strcat(start_date,start_year);

       strcpy(end_date,month_name(int_end_month));

       strcat(end_date," ");

       strcat(end_date,end_day);

       strcat(end_date," ");

       strcat(end_date,end_year);

}

 

 

void init_report5_hash()

{

       register long i;

 

       if ((report5_hash = (struct struc_3 *) malloc(num_dates * sizeof(struct struc_3))) == NULL)

       {

              printf("error allocating report5 hash - aborting");

              exit(1);

       }

 

       for (i = 0; i < num_dates; i++)

       {

              strcpy(report5_hash[i].s3_date,blank_date);

              report5_hash[i].s3_clients              = 0;

              report5_hash[i].s3_hits                 = 0;

              report5_hash[i].s3_page_hits            = 0;

              report5_hash[i].s3_kbytes        = 0;

              report5_hash[i].s3_next_hash_entry     = NULL;

       }

}

 

 

void init_report7_array()

{

       long i;

 

       if ((report7_array = (struct struc_4 *) malloc(24 * sizeof(struct struc_4))) == NULL)

       {

              printf("error allocating report7_array - aborting");

              exit(1);

       }

 

       for (i = 0; i < 24; i++)

       {

              report7_array[i].s4_hits = 0;

              report7_array[i].s4_kbytes = 0;

       }

 

       strcpy(report7_array[0].s4_time, "Midnight to   1 AM  ");

       strcpy(report7_array[1].s4_time, "  1 AM   to   2 AM  ");

       strcpy(report7_array[2].s4_time, "  2 AM   to   3 AM  ");

       strcpy(report7_array[3].s4_time, "  3 AM   to   4 AM  ");

       strcpy(report7_array[4].s4_time, "  4 AM   to   5 AM  ");

       strcpy(report7_array[5].s4_time, "  5 AM   to   6 AM  ");

       strcpy(report7_array[6].s4_time, "  6 AM   to   7 AM  ");

       strcpy(report7_array[7].s4_time, "  7 AM   to   8 AM  ");

       strcpy(report7_array[8].s4_time, "  8 AM   to   9 AM  ");

       strcpy(report7_array[9].s4_time, "  9 AM   to  10 AM  ");

       strcpy(report7_array[10].s4_time," 10 AM   to  11 AM  ");

       strcpy(report7_array[11].s4_time," 11 AM   to  12 PM  ");

       strcpy(report7_array[12].s4_time," 12 PM   to   1 PM  ");

       strcpy(report7_array[13].s4_time,"  1 PM   to   2 PM  ");

       strcpy(report7_array[14].s4_time,"  2 PM   to   3 PM  ");

       strcpy(report7_array[15].s4_time,"  3 PM   to   4 PM  ");

       strcpy(report7_array[16].s4_time,"  4 PM   to   5 PM  ");

       strcpy(report7_array[17].s4_time,"  5 PM   to   6 PM  ");

       strcpy(report7_array[18].s4_time,"  6 PM   to   7 PM  ");

       strcpy(report7_array[19].s4_time,"  7 PM   to   8 PM  ");

       strcpy(report7_array[20].s4_time,"  8 PM   to   9 PM  ");

       strcpy(report7_array[21].s4_time,"  9 PM   to  10 PM  ");

       strcpy(report7_array[22].s4_time," 10 PM   to  11 PM  ");

       strcpy(report7_array[23].s4_time," 11 PM   to Midnight");

}

 

 

void check_file_date()

{

       char *date_pointer;

       char  file_year[3];

       int   file_year_int;

       char  file_month[3];

       int   file_month_int;

       char  file_day[3];

       int   file_day_int;

 

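       if ((date_pointer = strstr(curr_file,"-")) == NULL)  /* can't tell */

       {file_date_status = 0; return;}

 

       if (strlen(date_pointer) < 7)       /* need "-YYMMDD" after the dash, else can't tell */

       {file_date_status = 0; return;}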

 

       memcpy(file_year,date_pointer+1,2);

       memset(file_year+2,'\0',1);

       file_year_int = atoi(file_year);

       if        (file_year_int > 90)       {file_year_int += 1900;}

       else                        {file_year_int += 2000;}

 

       memcpy(file_month,date_pointer+3,2);

       memset(file_month+2,'\0',1);

       file_month_int = atoi(file_month);

 

       memcpy(file_day,date_pointer+5,2);

       memset(file_day+2,'\0',1);

       file_day_int = atoi(file_day);

 

       if (file_year_int < atoi(start_year)) {file_date_status = -1; return;}

 

       if (file_year_int == atoi(start_year))

       {

              if ((file_month_int < atoi(start_month))

              ||  (file_month_int == atoi(start_month) && file_day_int < atoi(start_day)))

              { file_date_status = -1; return;}

       }

       file_date_status = 0;

}

 

 

int create_record_stream()

{

       strcpy(cat_string,blank_string);

 

       if        (strstr(curr_file,".gz") != NULL)

       {

              strcpy(cat_string,"/usr/bin/gzcat ");

       }

       else

       {

              strcpy(cat_string,"cat ");

       }

 

       strcat(cat_string,curr_file);

       strcat(cat_string," >");

       strcat(cat_string,work_file_string);   /* overwrite work file if it exists */

       system(cat_string);

 

       if ((fptr2 = fopen(work_file_string,"r")) == NULL)

       {

              printf("can't open work_file\n");

               exit(1);

       }

       return 0;

}

 

 

void omit_header()

{

       /* Enterprise server puts a one-line header at the start of a log file.

          If it's there, omit it.

       */

       if (fscanf(fptr2,"%s",header_work) != 1) /* Empty file: nothing was read */

       {return;}

 

       if (strstr(header_work,"format=%Ses->client.ip%") != NULL)

       {

              fscanf(fptr2,"%s%s%s%s%s%s",header_work,header_work,header_work,header_work,header_work,header_work);

       }

       else

       {

              fscanf(fptr2,"%s",header_work2);       /* Don't know why it would, but...*/

              if (strcmp(header_work2,"-") != 0)  /* If *2nd* fscanf got ip field */

              {

                     strcpy(ip,header_work2);

                     fscanf(fptr2,"%s",ident);

              }

              else

              {

                     strcpy(ip,header_work);

                     strcpy(ident,header_work2);

              }

 

              fscanf(fptr2,"%s",authuser);

 

              /* authuser can sometimes consist of multiple fields,

                 so read fields til definitely at timestamp1

               */

              if (fscanf(fptr2,"%s",timestamp1) != 1)       {return;}       /* ran out of input */

              while (timestamp1[0] != '[')

              { if (fscanf(fptr2,"%s",timestamp1) != 1)       {return;} }

 

              fscanf(fptr2,"%s",timestamp2);

 

              /* Read http fields til last char of field read is ".

                 If more than one, second is request.  There can be as

                 few as one or two fields, as many as 3 or more.

 

                 URL *can* contain blanks + stuff, throwing off expected field sequence.

                 For example, "GET /cgi-bin/Stats/stats.cgi [ <a href= HTTP/1.0" shows up

                 in the log record, and has 3 extraneous "fields" before "HTTP".

               */

 

              work_count = 0;

              if (fscanf(fptr2,"%s",file2_buffer) != 1)       {return;}       /* ran out of input */

             

              while (file2_buffer[strlen(file2_buffer)-1] != '"')

              {

 

                     work_count++;

                     if (work_count == 2)

                     {

                            if (strlen(file2_buffer) > 255)

                            {

                                   memcpy(file2,file2_buffer,255);

                                   memset(file2+255,'\0',1);

                            }

                            else {strcpy(file2,file2_buffer);}

                     }

                     if (fscanf(fptr2,"%s",file2_buffer) != 1)       {return;}       /* no closing quote before EOF */

              }

 

              fscanf(fptr2,"%s%s",status,work_bytes);

              if   (strcmp(work_bytes,"-") == 0)

                     {bytes = 0;}

              else       {bytes = atol(work_bytes);}

 

              if       (strstr(file2,"default.ida") != NULL)       {code_red_found = 1;}

 

              if       (null_found == 0 && code_red_found == 0)

              {

                     update_strucs();

              }

              else

              {

                     null_found = 0;

                     code_red_found = 0;

              }

       }

}
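
/*
       Illustrative sketch only, not called by the program: the "read fields until one
       ends in a double quote, and keep the second one" rule from omit_header(), applied
       to a string with sscanf instead of to the log stream with fscanf.  The name
       scan_request() and its parameters are made up for this example.  Unlike the loop
       above, it also keeps the second field when that field is the one carrying the
       closing quote.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: walk the blank-separated tokens of a quoted request and
// copy the second token (the requested path) into path_out.  Scanning stops at
// the first token whose last character is a double quote.
// Returns the number of tokens consumed, or -1 if no closing quote was found.
int scan_request(const char *rest, char *path_out, size_t path_size)
{
    char token[1024];
    int  consumed = 0;
    int  count = 0;

    assert(path_size > 0);
    path_out[0] = '\0';

    while (sscanf(rest, "%1023s%n", token, &consumed) == 1)
    {
        size_t len = strlen(token);

        count++;
        if (count == 2)
        {
            // Bounded copy: over-long paths are truncated, never overrun.
            strncpy(path_out, token, path_size - 1);
            path_out[path_size - 1] = '\0';
        }
        if (token[len - 1] == '"')  { return count; }
        rest += consumed;
    }
    return -1;
}
```

       For the request quoted in the comment above, "GET /cgi-bin/Stats/stats.cgi [ <a href= HTTP/1.0",
       this consumes six tokens and keeps /cgi-bin/Stats/stats.cgi as the path.  A request
       with no closing quote returns -1 instead of looping.
*/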

 

 

void update_strucs()

{

       long i;

 

       parse_input();

 

       if        (entire_period == 0)     {i = check_date_range();}

       else                        {i = 0;}

 

       if        (i < 0)       {return;}                  /* rec too old */

       else if     (i > 0)       {rec_too_new = 1; return;}       /* rec too new, done with file */

 

       if (atoi(status) > 399)       {reports8and9_ongoing(); return;}

 

       record_count++;

/*     if (record_count > 1000000)

       {

              wrapup_hogs();*/ /* shunt request to hogs queue */

/*            printf("hog found -- %s\n",get_value("login"));

              exit(10);

       }

*/

       /* cut recs for report input files, if reports requested */

 

       if (r5_data_needed == 1)

       {

              fprintf(fptr10,"%s %s\n",ip,rec_date);

              report5_ongoing();         /* needed if r4, r5, or r6 requested */

       }

       else if (strcmp(get_value("useclients"),"on") == 0)  /* if report 1 requested */

       {

              fprintf(fptr11,"%s\n",ip);

       }

 

       if (strcmp(get_value("usefiles"),"on") == 0)            /* if report 2 requested */

       {

              fprintf(fptr12,"%s\n",file2);

       }

 

       if (strcmp(get_value("usehourly"),"on") == 0)            {report7_ongoing();}

       if (strcmp(get_value("useclientaccess"),"on") == 0)     {report3_ongoing();}

}

 

 

long check_date_range()

{

       /* If log rec date before start date */

 

       if ((rec_year_int < int_start_year)

       || (rec_year_int == int_start_year && rec_month_int < int_start_month)

       || (rec_year_int == int_start_year && rec_month_int == int_start_month && rec_day_int < int_start_day))

       {

              return -1;

       }

 

       /* If log rec date after end date */

 

       else if ((rec_year_int > int_end_year)

       || (rec_year_int == int_end_year && rec_month_int > int_end_month)

       || (rec_year_int == int_end_year && rec_month_int == int_end_month && rec_day_int > int_end_day))

       {

              return 1;

       }

 

       else return 0;

}

 

 

void parse_input()

{

       char work_hour[3];

 

       memcpy(rec_date,timestamp1+1,11);

       memset(rec_date + 11,'\0',1);

 

       memcpy(work_hour,timestamp1+13,2);

       memset(work_hour + 2,'\0',1);

       rec_hour = atoi(work_hour);

 

       memcpy(rec_year_long,rec_date+7,4);

       memset(rec_year_long+4,'\0',1);

       rec_year_int = atoi(rec_year_long);

 

       memcpy(rec_month,rec_date+3,3);

       memset(rec_month+3,'\0',1);

       rec_month_int = month_num(rec_month);

 

       memcpy(rec_day,rec_date,2);

       memset(rec_day+2,'\0',1);

       rec_day_int = atoi(rec_day);

 

       if (record_count > 0)

       {

              if (rec_year_int  < oldest_year

              || (rec_year_int == oldest_year && rec_month_int  < oldest_month)

              || (rec_year_int == oldest_year && rec_month_int == oldest_month && rec_day_int < oldest_day))

              {

                     oldest_year         = rec_year_int;

                     oldest_month        = rec_month_int;

                     oldest_day          = rec_day_int;

                     strcpy(char_oldest_year,rec_year_long);

                     strcpy(char_oldest_day,rec_day);

              }

 

              if (rec_year_int  > youngest_year

              || (rec_year_int == youngest_year && rec_month_int  > youngest_month)

              || (rec_year_int == youngest_year && rec_month_int == youngest_month && rec_day_int > youngest_day))

              {

                     youngest_year              = rec_year_int;

                     youngest_month             = rec_month_int;

                     youngest_day        = rec_day_int;

                     strcpy(char_youngest_year,rec_year_long);

                     strcpy(char_youngest_day,rec_day);

              }

       }

       else

       {

              oldest_year         = rec_year_int;

              oldest_month        = rec_month_int;

              oldest_day          = rec_day_int;

              strcpy(char_oldest_year,rec_year_long);

              strcpy(char_oldest_day,rec_day);

              youngest_year              = rec_year_int;

              youngest_month             = rec_month_int;

              youngest_day        = rec_day_int;

              strcpy(char_youngest_year,rec_year_long);

              strcpy(char_youngest_day,rec_day);

       }

}

 

 

void close_some_files()

{

       if (file10_open) {fclose(fptr10);}

       if (file11_open) {fclose(fptr11);}

       if (file12_open && fptr12 != fptr11) {fclose(fptr12);}

       fclose(fptr15);

       if (fptr16 != fptr6) {fclose(fptr16);}

}

 

 

void do_reports()

{

       print_header();

 

       if (record_count == 0)       {print_sorry(); return;}

 

       print_time("starting report 5");

       if (r5_data_needed == 1)                           {report5();}

 

       if (strcmp(get_value("useperiodtotal"),"on") == 0       /* report 4 requested */

       ||  strcmp(get_value("useclients"),"on") == 0)            /* report 1 requested */

                                                        {reports1and4_setup();}

 

       print_time("starting report 4");

       if (strcmp(get_value("useperiodtotal"),"on") == 0)       {report4();}

 

       print_time("starting report 6");

       if (strcmp(get_value("usedailyavg"),"on") == 0)         {report6();}

 

       print_time("starting report 7");

       if (strcmp(get_value("usehourly"),"on") == 0)            {report7();}

 

       print_time("starting report 1");

       if (strcmp(get_value("useclients"),"on") == 0)         {report_num = 1; report();}

 

       print_time("starting report 2");

       if (strcmp(get_value("usefiles"),"on") == 0)            {report_num = 2; report();}

 

       print_time("starting report 3");

       if (strcmp(get_value("useclientaccess"),"on") == 0)       {report_num = 3; report();}

 

       print_time("starting report 8");

       report8();                           /* reports 8 and 9 are always produced */

 

       print_time("starting report 9");

       report_num = 9; report();

 

       print_glossary();

}

 

 

void print_sorry()

{

       fprintf(fptr3,"\nWe're sorry, but no records fall in the date range specified.\n");

}

 

 

void print_header()

{

       fprintf(fptr3,"From: statsmaster@simplenet.com\n");

       fprintf(fptr3,"Subject: Your Statistics Report\n");

       fprintf(fptr3,"Reply-to: statsmaster@simplenet.com\n");

 

 

/*       get_time();

       fprintf(fptr3,"Date: %s\n\n",asctime(time_ptr));

*/    

       fprintf(fptr3,"\t\t\t   %s\n","SimpleNet Statistics");

       fprintf(fptr3,"%s",dashed_line);

 

       if (entire_period == 1 && record_count > 0)

       {

              strcpy(start_date,month_name(oldest_month));

              strcat(start_date," ");

              strcat(start_date,char_oldest_day);

              strcat(start_date," ");

              strcat(start_date,char_oldest_year);

              strcpy(end_date,month_name(youngest_month));

              strcat(end_date," ");

              strcat(end_date,char_youngest_day);

              strcat(end_date," ");

              strcat(end_date,char_youngest_year);

       }

 

       if (entire_period == 0 || record_count > 0)

       {

              fprintf(fptr3,"Access Report for %s to %s\n\n",start_date,end_date);

       }

       else

       {

              fprintf(fptr3,"Access Report for entire period\n\n");

       }

}

 

 

void reports1and4_setup()

{

       /* If report 5 data not needed, report1_ip_file was created earlier. */

 

       strcpy(work_string1,"cut -f1 -d\" \" < ");

       strcat(work_string1,r5sorted_ip_date_file_string);

       strcat(work_string1," > ");

       strcat(work_string1,report1_ip_file_string);

       strcpy(work_string2,"sort ");

       strcat(work_string2,report1_ip_file_string);

       strcat(work_string2," | uniq > ");

       strcat(work_string2,uniq_ip_list_string);

       strcpy(work_string3,"wc -l ");

       strcat(work_string3,uniq_ip_list_string);

       strcat(work_string3," > ");

       strcat(work_string3,uniq_ip_count_string);

       strcpy(work_string4,"rm ");

       strcat(work_string4,uniq_ip_list_string);

       strcpy(work_string5,"rm ");

       strcat(work_string5,r5sorted_ip_date_file_string);

 

       if (r5_data_needed == 1)       {system(work_string1);}

 

       system(work_string2);

       system(work_string3);

       system(work_string4);

       get_ip_count();

 

       if (r5_data_needed == 1)

       {

              system(work_string5);

              r5sorted_ip_date_file_removed = 1;

       }

}

 

 

void get_ip_count()

{

       strcpy(work_string1,"rm ");

       strcat(work_string1,uniq_ip_count_string);

 

       if ((fptr4 = fopen(uniq_ip_count_string,"r")) == NULL)

       {

              printf("can't open uniq_ip_count\n");

               exit(1);

       }

 

       fscanf(fptr4,"%ld",&num_clients);

       fclose(fptr4);                     /* done with the count file before removing it */

       system(work_string1);

}

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report3_ongoing()

{

       if (check_for_match(file2,r3_string,"any"))

       {

              fprintf(fptr15,"%s\n",ip);

       }

}

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report5_ongoing()

{

       update_report5_hash(rec_date);

}

 

 

void update_report5_hash(char *work_date)

{

       struct struc_3 *old, *start;

       long done;

 

       indexx = hash_date(work_date);

 

       if (strcmp(report5_hash[indexx].s3_date,blank_date) == 0)     /* If hash table entry is unoccupied */

       {

              strcpy(report5_hash[indexx].s3_date,work_date);

              report5_hash[indexx].s3_hits++;

              if (page_hit(file2) == 1)

              {

                     report5_hash[indexx].s3_page_hits++;

              }

              report5_hash[indexx].s3_kbytes += (float)bytes / 1000;

       }

       else

       if (strcmp(report5_hash[indexx].s3_date,work_date) == 0)/* Entry occupied, input date matches entry's */

       {

              report5_hash[indexx].s3_hits++;

              if (page_hit(file2) == 1)

              {

                     report5_hash[indexx].s3_page_hits++;

              }

              report5_hash[indexx].s3_kbytes += (float)bytes / 1000;

       }

       else       /* Search chain for match -- if none, add entry to end of chain */

       {

              old   = &report5_hash[indexx];

              start = report5_hash[indexx].s3_next_hash_entry;

              done  = 0;

              while (start != NULL && !done)

              {

                     if (strcmp(start->s3_date,work_date) == 0)

                     {

                            start->s3_hits++;

                            if (page_hit(file2) == 1)

                            {

                                   start->s3_page_hits++;

                            }

                            start->s3_kbytes += (float)bytes / 1000;

                            done = 1;

                     }

                     else

                     {

                            old = start;

                            start = start->s3_next_hash_entry;

                     }

              }

              if (!done)

              {

                     start = (struct struc_3 *) malloc (sizeof(struct struc_3));

                     if (!start)

                     {

                            printf("out of memory\n");

                            return;

                     }

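                     /* malloc does not zero memory, so set every counter explicitly */

                     strcpy(start->s3_date,work_date);

                     start->s3_clients   = 0;

                     start->s3_hits      = 1;

                     start->s3_page_hits = (page_hit(file2) == 1) ? 1 : 0;       /* same test as the other branches */

                     start->s3_kbytes    = (float)bytes / 1000;

                     start->s3_next_hash_entry = NULL;

                     old->s3_next_hash_entry = start;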

              }

       }

}

 

 

long page_hit(char *field)

{

       long i;

       long len = (long) strlen(field);

       char ptr2[5] = "    ";             /* last 4 chars of field, lowercased, left-padded with blanks */

 

       /* Fill from the right so names shorter than 4 chars never read before the buffer. */

       for (i = 0; i < 4 && i < len; i++)

       {

              ptr2[3 - i] = (char) tolower((unsigned char) field[len - 1 - i]);

       }

 

       if (strcmp((char *)(ptr2 + 1),"htm")       == 0       /* includes shtm */

       ||  strcmp((char *)(ptr2),"html")    == 0       /* includes shtml */

       ||  strcmp((char *)(ptr2),".hts")    == 0

       ||  strcmp((char *)(ptr2 + 1),".mv")       == 0

       ||  strcmp((char *)(ptr2 + 3),"/")       == 0)

              return 1;

       else       return 0;

}

 

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report7_ongoing()

{

       if (rec_hour < 0 || rec_hour > 23)       {return;}       /* malformed timestamp: stay inside the 24 slots */

       report7_array[rec_hour].s4_hits++;

       report7_array[rec_hour].s4_kbytes += (float) bytes / 1000;

}

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void reports8and9_ongoing()

{

       if        (strcmp(status,"400") == 0)       {e400_count++; fprintf(fptr16,"%s\n",file2);}

       else if       (strcmp(status,"401") == 0)       {e401_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"402") == 0)       {e402_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"403") == 0)       {e403_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"404") == 0)       {e404_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"405") == 0)       {e405_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"406") == 0)       {e406_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"410") == 0)       {e410_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"500") == 0)       {e500_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"501") == 0)       {e501_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"502") == 0)       {e502_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"503") == 0)       {e503_count++; fprintf(fptr16,"%s\n",file2);}

       else if (strcmp(status,"504") == 0)       {e504_count++; fprintf(fptr16,"%s\n",file2);}

}

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report()

{

/*     long i;

*/

       /*

               open uniq_item_count file (based on all items that occurred, with no duplicates)

              create hash table, sized as function of number of unique items

              open item file (all items that occurred, with duplicates)

              for each item,

                     map it into hash table

                     if it's already there, update the count (may have to traverse chain)

                     else add entry with hash_value, item, count=1, next_ptr

              endfor

       */

 

       report_setup();

       init_report_hash();

       init_report_array();

       open_report_input_file();

       populate_report_hash();

 

/*     for (i = 0; i < scaling_factor * num_uniq_items; i++)

       {

              print_report_hash(report_hash[i],i);

       }

*/

       create_report_file();

       create_report();

       free(report_hash);

       free(report_array);

 

}
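
/*
       Illustrative sketch only, not called by the program: the counting loop outlined
       in the comment at the top of report() (map each item into a hash table, bump the
       count if it is already there, else add an entry with count 1).  All names here
       are made up for the example.  It uses a small table of chain heads and adds new
       entries at the front of a chain, which is a simplification and not necessarily
       how update_report_hash() is written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUCKETS 8   // deliberately tiny so chains actually form

// Hypothetical node type; the program's struc_1 is declared elsewhere.
struct sketch_entry
{
    char                 item[64];
    long                 count;
    struct sketch_entry *next;
};

static struct sketch_entry *sketch_table[SKETCH_BUCKETS];

// Simple multiplicative string hash, chosen only for the example.
static unsigned long sketch_hash(const char *s)
{
    unsigned long h = 0;

    while (*s != '\0')  { h = h * 31 + (unsigned char) *s++; }
    return h % SKETCH_BUCKETS;
}

// Count one occurrence of item; returns its running total, or -1 if out of memory.
long sketch_count_item(const char *item)
{
    unsigned long        slot = sketch_hash(item);
    struct sketch_entry *e;

    assert(strlen(item) < sizeof(e->item));

    for (e = sketch_table[slot]; e != NULL; e = e->next)
    {
        if (strcmp(e->item, item) == 0)  { return ++e->count; }
    }

    // Not found: calloc zero-fills the new node before it is linked in.
    e = calloc(1, sizeof *e);
    if (e == NULL)  { return -1; }
    strcpy(e->item, item);
    e->count = 1;
    e->next  = sketch_table[slot];
    sketch_table[slot] = e;
    return 1;
}
```

       Feeding it the lines of an item file one at a time leaves each distinct item in
       the table exactly once, with its count equal to the number of times it occurred.
*/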

 

 

void report_setup()

{

       if (report_num != 1)

       {

              strcpy(sort_string,blank_string);

              strcpy(sort_string,"sort < ");

       }

 

       report_limit = 0;

       switch (report_num)

       {

              case  1: report1_setup();    break;

              case  2: report2_setup();    break;

              case  3: report3_setup();    break;

              case  9: report9_setup();    break;

              default:                   break;

       }

 

       if (report_num != 1)         /* For r1 we've already done this. */

       {

              strcat(sort_string,report_input_file);

              strcat(sort_string," | uniq > ");

              strcat(sort_string,uniq_item_list_string);

              system(sort_string);

              strcpy(work_string1,"wc -l ");

              strcat(work_string1,uniq_item_list_string);

              strcat(work_string1," > ");

              strcat(work_string1,uniq_item_count_string);

              system(work_string1);

              get_count();

              strcpy(work_string1,"rm ");

              strcat(work_string1,uniq_item_list_string);

              system(work_string1);

       }

}

 

 

void report1_setup()

{

       char x[16] = "\0";

 

       fqdn_limit   = 40;

 

       strncpy(x,get_value("numclients"),sizeof(x) - 1);       /* bounded copy; x was zero-filled above, so it stays terminated */

       if (strcmp(x,"") == 0)       {r1_limit = 1000;}

       else                 {r1_limit = atoi(x);}

       report_limit = r1_limit;

 

       num_uniq_items       = num_clients;

       if (strcmp(get_value("useexclude"),"on") == 0)

       {

              strcpy(r1_exclude_string,get_value("clientexclude"));

       }

       strcpy(enterprise_string,enterprise_string1);

       strcpy(report_lit,r1_lit);

       strcpy(report_input_file,report1_ip_file_string);

}

 

 

void report2_setup()

{

       char x[256] = "\0";

 

       fqdn_limit   = 0;

 

       strncpy(x,get_value("numfiles"),sizeof(x) - 1);       /* bounded copy; x was zero-filled above, so it stays terminated */

       if       (strcmp(x,"") == 0)       {r2_limit = 1000;}

       else if       (strlen(x) > 4)            {r2_limit = 32766;} /* Max 32767 for numfiles */

       else                        {r2_limit = atoi(x);}

      

       report_limit = r2_limit;

 

       if (strcmp(get_value("useexcludefile"),"on") == 0)

       {

              strcpy(r2_exclude_string,get_value("nameexcludefile"));

       }

 

       strcpy(enterprise_string,enterprise_string2);

       strcpy(report_lit,r2_lit);

       strcpy(report_input_file,report2_file_file_string);

}

 

 

void report3_setup()

{

       fqdn_limit   = 40;

       report_limit = r3_limit;

       tot_hits     = 0;

       strcpy(enterprise_string,enterprise_string1);

       strcpy(report_lit,r3_lit);

       strcpy(report_input_file,report3_ip_file_string);

}

 

 

void report9_setup()

{

       fqdn_limit   = 0;

       report_limit = r9_limit;

       strcpy(enterprise_string,enterprise_string2);

       strcpy(report_lit,r9_lit);

       strcpy(report_input_file,report9_file_file_string);

}

 

 

void get_count()

{

       strcpy(work_string1,"rm ");

       strcat(work_string1,uniq_item_count_string);

 

       if ((fptr6 = fopen(uniq_item_count_string,"r")) == NULL)

       {

              printf("can't open uniq_item_count\n");

               exit(1);

       }

 

       fscanf(fptr6,"%ld",&num_uniq_items);

       system(work_string1);

}

 

 

void init_report_hash()

{

       register long i;

 

       if ((report_hash = (struct struc_1 *)

              malloc(scaling_factor * num_uniq_items * sizeof(struct struc_1))) == NULL)

       {

              printf("error allocating report_hash - aborting");

              exit(1);

       }

 

       for (i = 0; i < scaling_factor * num_uniq_items; i++)

       {

              strcpy(report_hash[i].s1_string,blank_string);

              report_hash[i].s1_count                 = 0;

              report_hash[i].s1_next_hash_entry      = NULL;

       }

}

 

 

void init_report_array()

{

       register long i;

 

       if ((report_array = (struct struc_1 *)

              malloc(num_uniq_items * sizeof(struct struc_1))) == NULL)

       {

              printf("error allocating report_array - aborting");

              exit(1);

       }

 

       for (i = 0; i < num_uniq_items; i++)

       {

              strcpy(report_array[i].s1_string,blank_string);

              report_array[i].s1_count         = 0;

              report_array[i].s1_next_hash_entry     = NULL;

       }

}

 

 

void open_report_input_file()

{

       if ((fptr7 = fopen(report_input_file,"r")) == NULL)

       {

              printf("can't open %s\n",report_input_file);

               exit(1);

       }

}

 

 

void populate_report_hash()

{

       long  enterprise_seen = 0;

 

       report_index = 0;

 

       while (fscanf(fptr7,"%s",report_work) != EOF)

       {

              if (strstr(report_work,enterprise_string) == NULL)       /* omit enterprise record */

              {

                     if       (report_num == 1

                            && strcmp(get_value("useexclude"),"on") == 0

                            && check_for_match(report_work,r1_exclude_string,"left")){;}

                     else if (report_num == 2

                            && ((strcmp(get_value("useexcludefile"),"on") == 0

                                  && check_for_match(report_work,r2_exclude_string,"any"))                                 

                               || (strcmp(get_value("usehtmlonly"),"on") == 0

                                  && page_hit(report_work) == 0)))                 {;}

                     else       update_report_hash(report_work);

              }

              else if (enterprise_seen == 0)       /* only subtract once from count of unique items */

              {

                     num_uniq_items--;

                     enterprise_seen = 1;

              }

       }

       fclose(fptr7);

}

 

 

/*

       See if string 1 meets any criteria from string 2.  String 1 might be the IP field

       or the file field from a log record, while string 2 might be a list of one or

       more character strings, each of which could have one or more wild card aspects

       indicated by an asterisk.  List items are separated by commas.  String 2 might

       come in from a form field such as "exclude clients matching this IP address pattern"

       or "show all clients accessing this file" or "exclude files with this pattern."

 

       For example, suppose string 1 is a file such as "index.html" and string 2 is

       "in*.htm*,*.gif".  This subroutine must determine if string 1 satisfies "in*.htm*"

       OR "*.gif".  While checking against an individual string within the list in

       string 2, the subroutine must determine if string 1 satisfies EVERY part of

       that substring from string 2.  So, given "in*.htm*" as the substring from string 2,

       string 1 must contain "in" AND ".htm" in that order.

 

       Since "index.html" satisfies the first substring from string 2 ("in*.htm*"), the

       subroutine determines that there is a match and returns 1.  If no match can be

       found, 0 is returned.

 

       The third input parameter, "type," is either "left" or "any."  "Left" means the

       match must occur on the leftmost part of string 1, while "any" means that the match

       can occur starting anywhere in string 1.  Left matching is important when working

       with IP's.  E.g. if the IP in the log record is 123.132.209.5, and string 2 is

       "209*", without requiring a left-end match we would get a false positive, i.e.

       that there is a match when there really isn't as far as the user is concerned.

 

       Even if "left" is specified, if string 2 begins with an asterisk, e.g. *209,

       then it doesn't matter where a match occurs.  Also, a left match only matters

       for the first subsubstring of the substring from string 2.  A substring from

       string 2 has multiple subsubstrings when an asterisk has content to both sides

       within the substring.  E.g. given the substring "abc*ghi", there are two

       subsubstrings, "abc" and "ghi".  Even in a left match situation, once "abc"

       matches the leftmost part of string 1, then anything else in the substring

       can match anywhere in string 1 (to the right of "abc").  So, if string 1 is

       "abcdefghi" and string 2 is "abc*ghi,vw.x.yz*", the substring "abc*ghi" matches

       string 1.

*/

 

long check_for_match(char * string1, char * string2, char * type)

{

       char *curr_ptr;

       char *new_ptr;

       long  length;

       long  rc;

 

       curr_ptr = string2;

       while ((new_ptr = strstr(curr_ptr,",")) != NULL)

       {

              length = new_ptr - curr_ptr;

              memcpy(work_string1,curr_ptr,length);

              memset(work_string1 + length,'\0',1);

 

              rc = check_for_wildcards(string1,work_string1,type);

              if (rc == 1)       {return 1;}

 

              curr_ptr = new_ptr + 1;   /* go past comma to next substring */

       }

 

       rc = check_for_wildcards(string1,curr_ptr,type);

      

       if        (rc == 0)       {return 0;}

       else                 {return 1;}

}

 

 

long check_for_wildcards(char * string_1, char * string_2, char * type2)

{

       char *new_s1ptr;

       char *curr_s1ptr;

       char *new_s2ptr;

       char *curr_s2ptr;

       long  length2;

       long  first_or_only = 1;

 

       curr_s1ptr = string_1;

       curr_s2ptr = string_2;

 

       while ((new_s2ptr = strstr(curr_s2ptr,"*")) != NULL)

       {

              length2 = new_s2ptr - curr_s2ptr;

              memcpy(work_string2,curr_s2ptr,length2);

              memset(work_string2 + length2,'\0',1);

 

              if ((new_s1ptr = strstr(curr_s1ptr,work_string2)) == NULL)       {return 0;}

 

              if (strcmp(type2,"left") == 0

              &&  first_or_only == 1

              &&  string_2[0] != '*'

              &&  new_s1ptr != curr_s1ptr)       {return 0;}

 

              first_or_only = 0;

 

              curr_s1ptr = new_s1ptr + length2;

              curr_s2ptr = new_s2ptr + 1;     /* go past asterisk to next subsubstring */

       }

 

       if ((new_s1ptr = strstr(curr_s1ptr,curr_s2ptr)) == NULL)

              {return 0;}

       else if (strcmp(type2,"left") == 0

              &&  first_or_only == 1

              &&  string_2[0] != '*'

              &&  new_s1ptr != curr_s1ptr)

              {return 0;}

       else        {return 1;}

}
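
/*
       Illustrative sketch, not part of the original program.  It restates the
       matching rules described above (comma-separated list, asterisk-separated
       pieces, optional left anchoring) without the global work strings.  The names
       match_one and match_list, the anchor_left flag (standing in for the "left" /
       "any" type string) and the 256-byte piece buffer are assumptions made for
       this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Match text against ONE pattern of pat_len characters whose pieces are
// separated by '*'.  Every piece must occur in text, in order.
static int match_one(const char *text, const char *pat, size_t pat_len, int anchor_left)
{
    const char *t = text;      // where the search for the next piece starts
    size_t start = 0;          // offset of the current piece inside pat
    int first = 1;
    char piece[256];           // assumed upper bound on the length of one piece

    // A leading asterisk cancels the left-anchor requirement.
    if (pat_len > 0 && pat[0] == '*') anchor_left = 0;

    while (start <= pat_len) {
        size_t end = start;
        const char *hit;

        while (end < pat_len && pat[end] != '*') end++;
        if (end - start >= sizeof piece) return 0;
        memcpy(piece, pat + start, end - start);
        piece[end - start] = '\0';

        hit = strstr(t, piece);            // an empty piece matches at t
        if (hit == NULL) return 0;
        if (first && anchor_left && hit != t) return 0;

        first = 0;
        t = hit + (end - start);           // later pieces must lie to the right
        start = end + 1;                   // step past the asterisk
    }
    return 1;
}

// Return 1 if text satisfies ANY comma-separated pattern in list, else 0.
int match_list(const char *text, const char *list, int anchor_left)
{
    const char *p = list;

    assert(text != NULL && list != NULL);
    for (;;) {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t) (comma - p) : strlen(p);

        if (match_one(text, p, len, anchor_left)) return 1;
        if (comma == NULL) return 0;
        p = comma + 1;                     // go past comma to next pattern
    }
}
```

       For instance, match_list("index.html", "in*.htm*,*.gif", 0) reports a match,
       while match_list("123.132.209.5", "209*", 1) does not, because "209" is not
       at the left end of the address.
*/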

 

 

void create_report_file()

{

       long i;

 

       strcpy(work_string1,"sort +0 -1 -rn ");

       strcat(work_string1,report_file_string);

       strcat(work_string1," > ");

       strcat(work_string1,sorted_report_file_string);

       strcpy(work_string2,"rm ");

       strcat(work_string2,report_file_string);

 

       if ((fptr8 = fopen(report_file_string,"w")) == NULL)

       {

              printf("can't open report_file\n");

              exit(1);

       }

 

       /* Get count for each hash entry in use and write it to the report file.

          A report array entry's s1_next_hash_entry does not point to a next entry

          in the report array; it points to the *hash table* entry for this item.

          We're just re-using the hash table structure.  The report array entry's

          own s1_count is unused.

       */

 

       for (i = 0; i < report_index; i++)

       {

              fprintf(fptr8,"%ld %s\n",report_array[i].s1_next_hash_entry->s1_count,report_array[i].s1_string);

       }

 

       fclose(fptr8);

       system(work_string1);

       system(work_string2);

}

 

 

void create_report()

{

       long i;

       long report_count;

 

       if ((fptr9 = fopen(sorted_report_file_string,"r")) == NULL)

       {

              printf("can't open sorted_report_file\n");

              exit(1);

       }

 

       /* report_limit is the cutoff the USER set (e.g. top n clients) */

 

       strcpy(work_string1,"rm ");

       strcat(work_string1,sorted_report_file_string);

 

       if       (report_num == 1

         ||   report_num == 2)       {fprintf(fptr3,report_lit,report_limit,domain);}

       else if (report_num == 3)       {fprintf(fptr3,report_lit,r3_string);}

       else                        {fprintf(fptr3,report_lit,report_limit);}

 

       strcpy(report_work,blank_string);

 

       i = 0;

       /* Check the limit first, and stop on a malformed record as well as at end of file. */

       while (i < report_limit && fscanf(fptr9,"%ld %s",&report_count,report_work) == 2)

       {

              fprintf(fptr3,"  %7ld",report_count);

              if (i < fqdn_limit)                      /* Hard limit, else way too slow */

              {

                     get_domain_name(report_work);

              }            

              else

              {

                     fprintf(fptr3,"\t%s\n",report_work);

              }

              i++;

              if (report_num == 3) {tot_hits += report_count;}

              strcpy(report_work,blank_string);

       }

 

       if (report_num == 3)

       {

              if (i != 1)       {strcpy(plural,"s");} else {strcpy(plural,"");}   /* agree with the count printed below */

              if (report_index == 0)       {fprintf(fptr3,"\t\tNo match...\n");}

              fprintf(fptr3,"\n\nTotalling %ld hits by %ld individual client%s\n",tot_hits,i,plural);

       }

       fprintf(fptr3,"%s",dashed_line);

 

       system(work_string1);

}

 

 

void get_domain_name(char *input_ip)

{

       u_int addr;

       struct hostent *hp;

 

       if ((int)(addr = inet_addr(input_ip)) == -1)       /* not a dotted-quad address */

       {

              (void) fprintf(fptr3,"\t%s\n",input_ip);

              return;

       }

 

       hp = gethostbyaddr((char *)&addr, sizeof (addr), AF_INET);

       if (hp == NULL)                                    /* no reverse DNS entry */

       {

              (void) fprintf(fptr3,"\t%s\n",input_ip);

              return;

       }

 

       /* Print the canonical host name once, regardless of how many addresses it has. */

       (void) fprintf(fptr3,"\t%s\n",hp->h_name);

}

 

 

void print_report_hash(struct struc_1 *h_e, long i)

{

       printf("%ld %s %ld %p\n<br>",i,h_e->s1_string,h_e->s1_count,(void *)h_e->s1_next_hash_entry);

 

       if (h_e->s1_next_hash_entry != NULL)

       {

              print_report_hash(h_e->s1_next_hash_entry,i);

       }

}

 

 

void update_report_hash(char *work_string)

{

       struct struc_1 *old, *start;

       long done;

 

       if        (report_num == 1 || report_num == 3)

              {indexx = hash_ip(work_string);}

       else if (report_num == 2 || report_num == 9)

              {indexx = hash_file(work_string);}

             

       if (strcmp(report_hash[indexx].s1_string,blank_string) == 0)  /* If hash table entry is unoccupied */

       {

              strcpy(report_hash[indexx].s1_string,work_string);

              report_hash[indexx].s1_count = 1;

              strcpy(report_array[report_index].s1_string,work_string);

              report_array[report_index].s1_next_hash_entry = &report_hash[indexx];

              report_index++;

       }

       else

       if (strcmp(report_hash[indexx].s1_string,work_string) == 0)/* Entry occupied, input ip matches entry's */

       {

              report_hash[indexx].s1_count++;

       }

       else       /* Search chain for match -- if none, add entry to end of chain */

       {

              old   = &report_hash[indexx];

              start = report_hash[indexx].s1_next_hash_entry;

              done  = 0;

              while (start != NULL && !done)

              {

                     if (strcmp(start->s1_string,work_string) == 0)

                     {

                            start->s1_count++;

                            done = 1;

                     }

                     else

                     {

                            old = start;

                            start = start->s1_next_hash_entry;

                     }

              }

              if (!done)

              {

                     start = (struct struc_1 *) malloc (sizeof(struct struc_1));

                     if (!start)

                     {

                            printf("out of memory\n");

                            return;

                     }

                     strcpy(start->s1_string,work_string);

                     start->s1_count = 1;

                     start->s1_next_hash_entry = NULL;

                     old->s1_next_hash_entry = start;

                     strcpy(report_array[report_index].s1_string,work_string);

                     report_array[report_index].s1_next_hash_entry = start;

                     report_index++;

              }

       }

}
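
/*
       Illustrative sketch, not part of the original program.  It shows the same
       idea as update_report_hash (hash the string, walk the chain, bump the count
       or append a new entry) in isolation.  It differs from the code above in that
       bucket heads are plain pointers rather than entries embedded in an array;
       BUCKETS, struct entry, bucket_of and count_key are names assumed for this
       example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 64                    // assumed table size for the sketch

struct entry {
    char key[64];
    long count;
    struct entry *next;               // next entry in the same bucket's chain
};

static struct entry *table[BUCKETS];  // all chains start out empty (NULL)

// Sum of character codes modulo the table size, in the spirit of hash_file.
static unsigned long bucket_of(const char *key)
{
    unsigned long sum = 0;
    while (*key != '\0') sum += (unsigned char) *key++;
    return sum % BUCKETS;
}

// Record one occurrence of key.  Returns the updated count, or -1 if
// memory for a new entry could not be allocated.
long count_key(const char *key)
{
    struct entry **link = &table[bucket_of(key)];
    struct entry *e;

    assert(strlen(key) < sizeof e->key);

    // Walk the chain; link always points at the pointer that led to e.
    for (e = *link; e != NULL; link = &e->next, e = e->next) {
        if (strcmp(e->key, key) == 0) return ++e->count;
    }

    // No match: append a new entry to the end of the chain.
    e = malloc(sizeof *e);
    if (e == NULL) return -1;
    strcpy(e->key, key);
    e->count = 1;
    e->next = NULL;
    *link = e;
    return 1;
}
```

       Keeping a pointer to the previous link plays the role of the "old" variable
       above: it lets the new entry be attached without a second pass.
*/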

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report4()

{

       fprintf(fptr3,"Period Totals for %s:\n\n",domain);

       fprintf(fptr3,"    \t\t\t   All\t\t\t\t\t  Kilobytes\nIndividual Clients     Clients       Hits  \t  Page Hits     Transmitted\n\n");

       fprintf(fptr3,"%18ld%12ld%11ld%18ld%16.2f\n",num_clients,tot_clients,tot_hits,tot_page_hits,tot_kb);

       fprintf(fptr3,"%s",dashed_line);

}

 

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report5()

{

       long  i;

       long count;

 

       report5_init();

 

       /* read r5sorted_ip_date_file

          for each unique date

              while more recs for this date

                     while next ip = saved ip, next rec

                     else bump count in report5 array of ip's seen

              end

          end

          output report 5 if r5 requested (r4 and r6 need the rest even if r5 not requested)

       */

 

       fscanf(fptr8,"%s %s",ip,date);

       strcpy(date_save,date);

       strcpy(ip_save,ip);

       count = 1;

       while (fscanf(fptr8,"%s %s",ip,date) == 2)       /* stop at end of file or on a short record */

       {

              if (strstr(ip_save,"format") != NULL)

              {

                     strcpy(date_save,date);

                     strcpy(ip_save,ip);

                     continue;

              }

 

              if (strcmp(date,date_save) == 0)

              {

                     if (strcmp(ip,ip_save) != 0)

                     {

                            count++;

                            strcpy(ip_save,ip);

                     }

              }

              else

              {

                     update_report5_totals(count);

                     count = 1;

              }

       }

 

       /* We still need sorted_ip_date_file for reports 1 and/or 4. */

 

       /* Record the last date group.  The loop only records a group when it sees the

          date change, so the group still open at end of file has not been recorded yet. */

 

       if (strcmp(date,date_save) == 0)       {update_report5_totals(count);}

 

       if        (strcmp(get_value("usedailytotal"),"on") == 0)       {output_report5();}

       else  

       {

              for (i=0; i < num_dates; i++)                        /* Used in report 6 */

              {

                     if (strcmp(report5_hash[i].s3_date,blank_date) != 0){num_dates_used++;}

              }

       }

 

       free(report5_hash);

}

 

 

void report5_init()

{

       strcpy(work_string1,"sort +1 -2 +0 -1 ");

       strcat(work_string1,ip_date_file_string);

       strcat(work_string1," > ");

       strcat(work_string1,r5sorted_ip_date_file_string);

       strcpy(work_string2,"rm ");

       strcat(work_string2,ip_date_file_string);

 

       system(work_string1);

       r5sorted_ip_date_file_created = 1;

       system(work_string2);

       ip_date_file_removed = 1;

 

       if ((fptr8 = fopen(r5sorted_ip_date_file_string,"r")) == NULL)

       {

              printf("can't open r5sorted_ip_date_file\n");

              exit(1);

       }

 

       tot_clients = 0;

       tot_hits     = 0;

       tot_page_hits        = 0;

       tot_kb               = 0;

}

 

 

void update_report5_totals(long count)

{

       long i;

       i = hash_date(date_save);

 

       report5_hash[i].s3_clients = count;

       tot_clients += report5_hash[i].s3_clients;

       tot_hits     += report5_hash[i].s3_hits;

       tot_page_hits        += report5_hash[i].s3_page_hits;

       tot_kb        += report5_hash[i].s3_kbytes;

       strcpy(date_save,date);  

       strcpy(ip_save,ip);

}

 

 

void output_report5()

{

       /* Had to move work var's to global var's, since they were breaking here, though OK

          before this subroutine was removed from main report5 subrtn. */

 

       long i;

       fprintf(fptr3,"Daily Totals for %s:\n",domain);

       fprintf(fptr3,"\t\t\t\t\t\t\t\t  Kilobytes\nDate\t\t       Clients       Hits  \t  Page Hits     Transmitted\n\n");

       for (i=0; i < num_dates; i++)

       {

              if (strcmp(report5_hash[i].s3_date,blank_date) != 0)

              {

                     strcpy(work_date,report5_hash[i].s3_date);

                     memcpy(work_mon,work_date+3,3);

                     memset(work_mon + 3,'\0',1);

                     memcpy(work_daynum,work_date,2);

                     memset(work_daynum + 2,'\0',1);

                     memcpy(work_year,work_date+7,4);

                     memset(work_year + 4,'\0',1);

                     if (month_num(work_mon) == 13) {continue;}

                     strcpy(work_day,dayname(month_num(work_mon),atoi(work_daynum),atoi(work_year)));

                     fprintf(fptr3,"%s %s %.2s %.4s\t\t %5ld %10ld\t %10ld\t %10.2f\n",

                            work_day, work_mon, work_daynum, work_year,

                            report5_hash[i].s3_clients,

                            report5_hash[i].s3_hits,

                            report5_hash[i].s3_page_hits,

                            report5_hash[i].s3_kbytes);

                     num_dates_used++;       /* Used in report 6 */

              }

       }

       fprintf(fptr3,"%s",dashed_line);

}

 

 

char * dayname(long m, long d, long y)

{

       long val;

       long dd[12] = {0,3,2,5,0,3,5,1,4,6,2,4};

      

       if (m < 3) {y--;}

       val = (y+(int)(y/4)-(int)(y/100)+(int)(y/400)+dd[m-1]+d) % 7;

 

       switch (val)

       {

              case 0: return("Sun");

              case 1: return("Mon");

              case 2: return("Tue");

              case 3: return("Wed");

              case 4: return("Thu");

              case 5: return("Fri");

              case 6: return("Sat");

       }

       return ("");

}
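
/*
       Illustrative sketch, not part of the original program.  It isolates the
       day-of-week arithmetic used by dayname so it can be checked against known
       dates.  The offset table is the same one used above; the formula is the
       widely circulated one usually attributed to Tomohiko Sakamoto.  The name
       day_of_week and the int parameter types are assumptions for this example.

```c
#include <assert.h>

// Day of week for a Gregorian date: 0 = Sunday, 1 = Monday, ... 6 = Saturday.
// m is 1..12, d is the day of the month, y is the four-digit year.
int day_of_week(int m, int d, int y)
{
    static const int offset[12] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};

    assert(m >= 1 && m <= 12);

    // January and February count as part of the previous year so that a
    // leap day only affects the dates that follow it.
    if (m < 3) y--;

    return (y + y / 4 - y / 100 + y / 400 + offset[m - 1] + d) % 7;
}
```

       As a check, day_of_week(1, 1, 2000) gives 6 (Saturday) and
       day_of_week(10, 1, 1999) gives 5 (Friday).
*/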

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report6()

{

       fprintf(fptr3,"Daily Averages:\n\n");

 

       fprintf(fptr3,"    \t\t\t\t\t\t\t\t  Kilobytes\n\t\t       Clients       Hits\t  Page Hits     Transmitted\n\n");

 

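       if (num_dates_used == 0)       /* no dates with data: avoid dividing by zero */

       {

              fprintf(fptr3,"%30ld%11ld%18ld%16.2f\n",0L,0L,0L,0.0);

       }

       else

       {

              fprintf(fptr3,"%30ld%11ld%18ld%16.2f\n",

                     tot_clients   / num_dates_used,

                     tot_hits      / num_dates_used,

                     tot_page_hits / num_dates_used,

                     tot_kb        / num_dates_used);

       }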

 

       fprintf(fptr3,"%s",dashed_line);

}

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report7()

{

       long   i;

       float tot_hits = 0;

       float tot_kbytes = 0;

 

       fprintf(fptr3,"Hourly Averages (Pacific Time):\n\n");

       fprintf(fptr3,"    \t\t\t\t\t\t\t\t  Kilobytes\n\t\t      Time\t   Hits\t     Percentage\t        Transmitted\n\n");

 

       for (i = 0; i < 24; i++)

       {

              tot_hits   += report7_array[i].s4_hits;

              tot_kbytes += report7_array[i].s4_kbytes;

       }

 

       for (i = 0; i < 24; i++)

       {

              fprintf(fptr3,"\t%s\t%7ld\t\t%5.1f %%\t\t %10.2f\n",

                     report7_array[i].s4_time,

                     report7_array[i].s4_hits,

                     /* report 0 rather than dividing by a zero total */

                     tot_hits   > 0 ? 100*(float)report7_array[i].s4_hits/tot_hits : 0.0,

                     tot_kbytes > 0 ? 100*report7_array[i].s4_kbytes/tot_kbytes   : 0.0);

       }

 

       fprintf(fptr3,"%s",dashed_line);

 

       free(report7_array);

}

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

void report8()

{

       long  work_count;

       char  work_status[4];

 

       strcpy(work_string1,"sort +0 -1 -rn +1 -2 < ");

       strcat(work_string1,error_code_file_string);

       strcat(work_string1," > ");

       strcat(work_string1,sorted_error_code_file_string);

       strcpy(work_string2,"rm ");

       strcat(work_string2,error_code_file_string);

       strcpy(work_string3,"rm ");

       strcat(work_string3,sorted_error_code_file_string);

 

       fprintf(fptr3,"Summary of HTTP errors:\n\n");

 

       if ((fptr13 = fopen(error_code_file_string,"w")) == NULL)

       {

              printf("can't open error_code_file\n");

              exit(1);

       }

 

       /* Don't write lit's to file 'cause they contain blanks, which throw off fscanf. */

 

       if (e400_count > 0) {fprintf(fptr13,"%ld %s\n",e400_count,"400");}

       if (e401_count > 0) {fprintf(fptr13,"%ld %s\n",e401_count,"401");}

       if (e402_count > 0) {fprintf(fptr13,"%ld %s\n",e402_count,"402");}

       if (e403_count > 0) {fprintf(fptr13,"%ld %s\n",e403_count,"403");}

       if (e404_count > 0) {fprintf(fptr13,"%ld %s\n",e404_count,"404");}

       if (e405_count > 0) {fprintf(fptr13,"%ld %s\n",e405_count,"405");}

       if (e406_count > 0) {fprintf(fptr13,"%ld %s\n",e406_count,"406");}

       if (e410_count > 0) {fprintf(fptr13,"%ld %s\n",e410_count,"410");}

       if (e500_count > 0) {fprintf(fptr13,"%ld %s\n",e500_count,"500");}

       if (e501_count > 0) {fprintf(fptr13,"%ld %s\n",e501_count,"501");}

       if (e502_count > 0) {fprintf(fptr13,"%ld %s\n",e502_count,"502");}

       if (e503_count > 0) {fprintf(fptr13,"%ld %s\n",e503_count,"503");}

       if (e504_count > 0) {fprintf(fptr13,"%ld %s\n",e504_count,"504");}

 

       fclose(fptr13);

       system(work_string1);

       system(work_string2);

 

       if ((fptr14 = fopen(sorted_error_code_file_string,"r")) == NULL)

       {

              printf("can't open sorted_error_code_file\n");

              exit(1);

       }

 

       /* %3s keeps the status code within work_status (3 characters plus terminator). */

       while (fscanf(fptr14,"%ld %3s",&work_count,work_status) == 2)

       {

              fprintf(fptr3,"  %7ld\t%s\t%s\n",work_count,work_status,error_lit(work_status));

       }

       system(work_string3);

 

       fprintf(fptr3,"%s",dashed_line);

 

}

 

 

char * error_lit(char * err_code)

{

       if (strcmp(err_code,"400") == 0) {return (e400_lit);}

       if (strcmp(err_code,"401") == 0) {return (e401_lit);}

       if (strcmp(err_code,"402") == 0) {return (e402_lit);}

       if (strcmp(err_code,"403") == 0) {return (e403_lit);}

       if (strcmp(err_code,"404") == 0) {return (e404_lit);}

       if (strcmp(err_code,"405") == 0) {return (e405_lit);}

       if (strcmp(err_code,"406") == 0) {return (e406_lit);}

       if (strcmp(err_code,"410") == 0) {return (e410_lit);}

       if (strcmp(err_code,"500") == 0) {return (e500_lit);}

       if (strcmp(err_code,"501") == 0) {return (e501_lit);}

       if (strcmp(err_code,"502") == 0) {return (e502_lit);}

       if (strcmp(err_code,"503") == 0) {return (e503_lit);}

       if (strcmp(err_code,"504") == 0) {return (e504_lit);}

       return ("");

}

 

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

/**********************************************************************************************/

 

long hash_date(char *input_date)

{

       long work;

 

       /* Calculate hash value. */

 

       /* Compute a number as a function of the year, month, and day.

          pre-bias-Num = 372 * yr + 31 * (mon - 1) + day

          Num = pre-bias-Num - date_bias.

 

          Date_bias is used so that all dates are in sequence within hash.  It is calculated by

          treating the first possible date in the date range as a data value and hashing in with that

          to the date range before any bias is used.  This gives a remainder or offset for the first

          possible date into the hashed date range.  We want the first possible date to line up with

          the first possible hashed value, so all hashed values are in date sequence.  Otherwise,

          if the first possible date is somewhere within the range of hashed values, dates later than

          that date can hash in either after it or *before* it in the hash range, causing a date sequence

          problem.  So, we subtract the bias from the first possible date's pre-bias Num so that date

          will map to the beginning of the hash value range.  We also adjust all other dates the same way.

 

          To clarify, suppose there are 100 dates in the date range, and that hashing the first possible date,

          say October 1, puts October 1's entry as the 30th entry in the hash table.  The first 69 dates

          after October 1 are OK, since they hash in after it in the hash range, but the next 30 dates

          after *those* hash in *before* October 1's entry.  So, printing entries out in hash table sequence

          will show a date wraparound rather than a strictly ascending date sequence.  To fix this, we

          subtract 30 from each pre-bias Num that is calculated, so 30 becomes 0, etc.

 

          We use 372 as a multiplier since we treat all months as having 31 days.  This way 1/1 > 12/31.

       */

 

       memcpy(rec_year_long,input_date + 7,4);

       memset(rec_year_long + 4,'\0',1);

       memcpy(rec_month,input_date+3,3);

       memset(rec_month + 3,'\0',1);

       memcpy(rec_day,input_date,2);

       memset(rec_day + 2,'\0',1);

 

       work = (long) (372 * atoi(rec_year_long) + 31 * (month_num(rec_month) - 1) + atoi(rec_day) - date_bias);

      

       return

       (

              work % (long) num_dates

       );

}
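
/*
       Illustrative sketch, not part of the original program.  It works through the
       date arithmetic described in hash_date with numeric year, month and day
       already extracted.  The names date_ordinal and date_slot are assumptions, and
       so is the way the bias is formed here: the sketch subtracts the first date's
       whole ordinal, which lands that date on slot 0 just as subtracting its
       remainder would.

```c
#include <assert.h>

// Ordinal of a date when every month is treated as 31 days long, so a year
// spans 12 * 31 = 372 ordinals and 1 January follows 31 December directly.
long date_ordinal(long year, long month, long day)
{
    return 372 * year + 31 * (month - 1) + day;
}

// Slot of a date in a table of num_dates entries.  The first_* arguments give
// the earliest date in the range, which maps to slot 0 so that later dates
// fill the table in ascending order instead of wrapping around.
long date_slot(long year, long month, long day,
               long first_year, long first_month, long first_day,
               long num_dates)
{
    long bias = date_ordinal(first_year, first_month, first_day);

    assert(num_dates > 0);
    return (date_ordinal(year, month, day) - bias) % num_dates;
}
```

       With 1 October 1999 as the first date and 100 slots, 1 October maps to slot
       0, 2 October to slot 1, and 1 November to slot 31.
*/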

 

 

long hash_file(char *input_file)

{

       long work = 0;

       register long i;

 

       /* Calculate hash value. */

 

       for (i = 0; input_file[i] != '\0'; i++)       /* avoids calling strlen on every pass */

       {

              work += input_file[i] + 128;

       }

 

       return ((long)(work % (scaling_factor * num_uniq_items)));

}

 

 

long hash_ip(char *input_ip)

{

       char work[16];

       register long i,j,k,l;

       strcpy(work,blank_ip);

 

       /* Calculate hash value. */

 

       /* Convert ip string: take the characters from the right-hand end, in reverse

          order (so more random), and turn each non-digit into a digit, so a dot

          becomes 8.  So, e.g. 123.45.189.67 yields 768981854 (9 characters used).

       */

      

/*printf("%s\n",input_ip);

*/    

 

       /* backwards ip often bigger than max long (2,147,483,647), so stop after 9 digits */

 

       j = 0;

       k = strlen(input_ip)-1;

       if (k > 8) {l = k - 8;} else {l = 0;}

       for (i = k; i >= l; i--)

       {

              if (isdigit(input_ip[i]))

              {

                     work[j] = input_ip[i];

              }

              else

              {

                     /* non-digit: keep bits 0 and 3, prepend 0011 -> '0', '1', '8' or '9' */

                     work[j] = (input_ip[i] & 9) | 48;

              }

              j++;

       }

             

       work[j] = '\0';       /* end the string after the characters just stored */

       indexx = atol(work);

       indexx = indexx % (scaling_factor * num_uniq_items);

 

       if   (indexx < 0)       {return (-1 * indexx);}

       else                    {return (indexx);}

}
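
/*
       Illustrative sketch, not part of the original program.  It reproduces the
       character handling of hash_ip in a self-contained form so the resulting
       number can be inspected.  The names reversed_tail_value and ip_bucket, and
       the table_size parameter (standing in for scaling_factor * num_uniq_items),
       are assumptions for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Build a number from the last nine characters of ip, read right to left.
// Digits are kept; any other character is mapped to '0', '1', '8' or '9'
// by keeping bits 0 and 3 and OR-ing in 48 ('0').  A dot becomes '8'.
long reversed_tail_value(const char *ip)
{
    char work[10];                               // nine characters + terminator
    long last, stop, i, j = 0;

    assert(ip != NULL);
    last = (long) strlen(ip) - 1;
    stop = last > 8 ? last - 8 : 0;

    for (i = last; i >= stop; i--) {
        unsigned char c = (unsigned char) ip[i];
        work[j++] = isdigit(c) ? (char) c : (char) ((c & 9) | 48);
    }
    work[j] = '\0';
    return atol(work);                           // at most 999999999, fits a 32-bit long
}

// Reduce the value to a bucket index for a table of table_size entries.
long ip_bucket(const char *ip, long table_size)
{
    assert(table_size > 0);
    return reversed_tail_value(ip) % table_size;
}
```

       For "123.45.189.67" the nine characters used are "76.981.54" (already
       reversed), which become the number 768981854.
*/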

 

 

void wrapup()

{

       char mail_string[256];

       strcpy(mail_string,"/usr/lib/sendmail ");

       strcat(mail_string,"statsmaster@simplenet.com");

       /* To mail the requester instead: strcat(mail_string,get_value("address")); */

       strcat(mail_string," < ");

       strcat(mail_string,outfile_string);

 

       strcpy(work_string1,"rm ");

       strcat(work_string1,ip_date_file_string);

       strcpy(work_string2,"rm ");

       strcat(work_string2,cgipath);

       strcat(work_string2,"report*");

       strcat(work_string2,pid);

       strcpy(work_string3,"rm ");

       strcat(work_string3,outfile_string);

 

       fclose(fptr3);

       if (ip_date_file_removed != 1 && r5_data_needed == 1) {system(work_string1);}

       system(work_string2);

       system(mail_string);

       system(work_string3);

 

       strcpy(work_string5,"rm ");

       strcat(work_string5,r5sorted_ip_date_file_string);

       if (r5sorted_ip_date_file_created == 1

       && r5sorted_ip_date_file_removed != 1) {system(work_string5);}

}

 

 

void wrapup_hogs()

{

       strcpy(work_string1,"rm ");

       strcat(work_string1,file_list_string);

       system(work_string1);

 

       strcpy(work_string1,"rm ");

       strcat(work_string1,work_file_string);

       system(work_string1);

 

       strcpy(work_string1,"rm ");

       strcat(work_string1,ip_date_file_string);

       if (ip_date_file_removed != 1 && r5_data_needed == 1) {system(work_string1);}

 

       strcpy(work_string1,"rm ");

       strcat(work_string1,cgipath);

       strcat(work_string1,"report*");

       strcat(work_string1,pid);

       system(work_string1);

 

       fclose(fptr3);

       strcpy(work_string1,"rm ");

       strcat(work_string1,outfile_string);

       system(work_string1);

 

       strcpy(work_string5,"rm ");

       strcat(work_string5,r5sorted_ip_date_file_string);

       if (r5sorted_ip_date_file_created == 1

       && r5sorted_ip_date_file_removed != 1) {system(work_string5);}

}

 

 

void get_form_input()

{

       /* This subroutine gets the form variables and values as one big string, "buffer".

          You need a global variable called "buffer", e.g. char buffer[1024];

          You also need a global variable called "val" declared e.g. char *val.

 

          You also need the other subroutine, "get_value", to access the variables. */

 

       char *r_method;

       char *c_length;

       char work_buffer[1024];

       long content_length;

       long length;

       long i = 0;

       long j = 0;

       register char digit;

 

       work_buffer[0] = '\0';       /* stays empty if no form data is supplied */

 

       if ((r_method = getenv("REQUEST_METHOD")) != NULL)

       {

              if (strcmp(r_method,"POST") == 0)

              {

                     if ((c_length = getenv("CONTENT_LENGTH")) != NULL)

                     {

                            /* Never read more than work_buffer can hold. */

                            content_length = atol(c_length);

                            if (content_length > (long) sizeof(work_buffer) - 1)

                                   {content_length = (long) sizeof(work_buffer) - 1;}

                            if (content_length > 0

                            &&  fgets(work_buffer,(int) content_length + 1,stdin) == NULL)

                                   {work_buffer[0] = '\0';}

                     }

              }

              else if (getenv("QUERY_STRING") != NULL)

              {

                     strncpy(work_buffer,getenv("QUERY_STRING"),sizeof(work_buffer) - 1);

                     work_buffer[sizeof(work_buffer) - 1] = '\0';

              }

       }

 

       /* Convert from urlencoding */

 

       length = strlen(work_buffer);

       while (i < length)

       {

              if        (work_buffer[i] == '%' && i + 2 < length)       /* need both hex digits */

              {

                     if   (work_buffer[i+1] >= 'A')       {digit = ((work_buffer[i+1] & 0xdf) - 'A') + 10;}

                     else                        {digit = work_buffer[i+1] - '0';}

                     digit *= 16;

                     if   (work_buffer[i+2] >= 'A')       {digit += ((work_buffer[i+2] & 0xdf) - 'A') + 10;}

                     else                        {digit += (work_buffer[i+2] - '0');}

                     buffer[j] = digit;

                     i += 3;

              }

              else if (work_buffer[i] == '+')           {buffer[j] = ' '; i++;}

              else                               {buffer[j] = work_buffer[i]; i++;}

              j++;

       }

       buffer[j] = '\0';       /* terminate the decoded string */

 

       val = NULL;  /* Init here, so can check on automatically freeing it on entry to get_value. */

}
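
/* The decoding loop in get_form_input can also be written as a standalone
   routine.  The following is a minimal sketch, not the method used by this
   program: it assumes the caller supplies the destination size, and it
   copies a "%" literally when it is not followed by two hex digits.  The
   names hex_value_sketch and url_decode_sketch are hypothetical. */

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value_sketch(char c)
{
       if (c >= '0' && c <= '9') {return c - '0';}
       if (c >= 'a' && c <= 'f') {return c - 'a' + 10;}
       if (c >= 'A' && c <= 'F') {return c - 'A' + 10;}
       return -1;
}

/* Decode src into dst, writing at most dst_size - 1 characters plus a
   terminator.  Returns the number of characters written. */
size_t url_decode_sketch(const char *src, char *dst, size_t dst_size)
{
       size_t j = 0;
       int hi, lo;

       if (dst_size == 0) {return 0;}

       while (*src != '\0' && j + 1 < dst_size)
       {
              hi = -1;
              lo = -1;
              if (*src == '%')
              {
                     /* Look at the second digit only if the first one is valid,
                        so the terminator is never read past. */
                     hi = hex_value_sketch(src[1]);
                     if (hi >= 0) {lo = hex_value_sketch(src[2]);}
              }

              if      (lo >= 0)     {dst[j++] = (char) (hi * 16 + lo); src += 3;}
              else if (*src == '+') {dst[j++] = ' '; src++;}
              else                  {dst[j++] = *src++;}
       }
       dst[j] = '\0';
       return j;
}
```

/* A call such as url_decode_sketch(work_buffer, buffer, sizeof(buffer))
   would apply only if "buffer" is declared as an array, as the comment at
   the top of get_form_input suggests. */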

 

 

char * get_value(char varname[256])

{

       /* This subroutine extracts the value of a form variable.  The name of the variable

          must be passed as a string.  E.g. to get the value of "form_city", call as follows:

          get_value("form_city");  The value is returned as a string.  You can use "val" or assign

          the value to a variable, e.g. strcpy(city,get_value("form_city"));  Make sure you

          allocate space for the receiving variable, e.g. char city[256]; rather than char *city;

       */

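       char *name_start;
       char *val_start;
       char *parm_end;
       char  name[258];   /* Room for a 256-character name, "=", and the terminator. */
       long  val_length, name_length;

       /* Build the search key "varname=" without overrunning name. */
       strncpy(name,varname,sizeof(name) - 2);
       name[sizeof(name) - 2] = '\0';
       strcat(name,"=");
       name_length = strlen(name);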

 

       if (val != NULL) {free(val);}

 

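       /* Accept a match only at the start of the buffer or right after "&",
          so that e.g. "city=" is not found inside "form_city=". */
       name_start = strstr(buffer,name);
       while (name_start != NULL && name_start != buffer && *(name_start - 1) != '&')
       {
              name_start = strstr(name_start + 1,name);
       }

       if        (name_start != NULL)
       {
              /* The value runs from just after "=" to the next "&" or the end of the buffer. */
              val_start = name_start + name_length;
              parm_end = strchr(val_start,'&');
              if (parm_end != NULL) {val_length = parm_end - val_start;}
              else                  {val_length = strlen(val_start);}
              val = (char *) malloc (val_length + 1);
              if (val != NULL)
              {
                     memcpy(val,val_start,val_length);
                     val[val_length] = '\0';
              }
       }
       else
       {
              val = (char *) malloc (10);
              if (val != NULL) {strcpy(val,"not found");}
       }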

 

       return(val);

}
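
/* The lookup in get_value can also be done pair by pair, which matches
   whole variable names only and needs no malloc.  The following is a
   minimal sketch under the assumption that the caller supplies the output
   array; the name find_param_sketch and its signature are hypothetical
   and not part of this program. */

```c
#include <stddef.h>
#include <string.h>

/* Copy the value of "name" from a string of the form "a=1&b=2" into out,
   truncating to out_size - 1 characters.  Returns 1 if found, 0 if not. */
int find_param_sketch(const char *query, const char *name, char *out, size_t out_size)
{
       size_t name_len = strlen(name);
       const char *p = query;

       if (out_size == 0) {return 0;}
       out[0] = '\0';

       while (*p != '\0')
       {
              /* One "name=value" pair ends at the next '&' or at the end of the string. */
              const char *end = strchr(p,'&');
              size_t len = (end != NULL) ? (size_t) (end - p) : strlen(p);

              if (len > name_len && strncmp(p,name,name_len) == 0 && p[name_len] == '=')
              {
                     size_t val_len = len - name_len - 1;
                     if (val_len > out_size - 1) {val_len = out_size - 1;}
                     memcpy(out,p + name_len + 1,val_len);
                     out[val_len] = '\0';
                     return 1;
              }

              if (end == NULL) {break;}
              p = end + 1;
       }
       return 0;
}
```

/* For example, with char city[256]; a call such as
   find_param_sketch(buffer,"form_city",city,sizeof(city)) fills city and
   reports through the return value whether the variable was present. */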

 

 

void print_glossary()

{

       /* A string literal may not span source lines, so the text is written
          as adjacent literals, which the compiler joins into one string. */
       fputs(
              "\n"
              "Definitions\n\n"
              "Hit: A request for any object that is on your site. Each element of a\n"
              "requested page (a graphic, a sound file, or the page itself) is counted\n"
              "as a hit. A page on your site that contains five graphics generates six\n"
              "hits - the five images and the original request made for the page.\n\n"
              "Page Hit: The total number of times any of your pages are visited. The\n"
              "same user counts as a page hit each time he loads one of your pages.\n\n"
              "Client: A unique IP number that has accessed your site (i.e. person).\n\n"
              "Individual Clients: If a client accesses more than once on different\n"
              "days within your report period, this counts only once, whereas the Clients\n"
              "and All Clients fields count the client again.\n\n"
              "Kilobytes transmitted: The number of Kilobytes of data transmitted from\n"
              "your site.\n",
              fptr3);

 

}

