[dvdvol] new web forms and new disc images

Aaron Cannon cannona at fireantproductions.com
Fri Oct 15 23:39:20 PDT 2004


I fixed about a dozen bugs in the request processor, made about as many 
improvements, and added it to the CD project page.  If anyone wants to 
take another look, visit http://www.gutenberg.org/cdproject/ and click on 
either the international or the United States form.  It's still OK to 
submit test requests; just make sure you say so in the comments field.

Most of the improvements were in data validation.  However, I also changed 
the program so that, in addition to E-mailing the request to 
cd at pglaf.org, it now writes the data to a flat file.  This matters 
because it will eventually let us distribute the work.  What I envision is 
a system where volunteers can log on to a web page at Gutenberg.org and 
check out requests, somewhat like the Distributed Proofing site.  It 
shouldn't be too hard to implement.
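
As a rough illustration of the flat-file idea, here is a minimal sketch in 
Python.  It is not the actual request processor.  The field names, the 
tab-separated layout, the claimed_by column, and the function names 
append_request and check_out are all assumptions made up for this example.

```python
import csv
import os

# Hypothetical column layout; the real form fields may differ.
FIELDS = ["request_id", "name", "country", "comments", "claimed_by"]


def append_request(path, request):
    """Append one request as a tab-separated row, adding a header if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS, delimiter="\t")
        if is_new:
            writer.writeheader()
        # New requests start out unclaimed.
        writer.writerow({**request, "claimed_by": ""})


def check_out(path, volunteer):
    """Claim the first unclaimed request for `volunteer`; return it, or None if none remain.

    Note: no file locking is done here. A real multi-user system would need it.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh, delimiter="\t"))
    claimed = None
    for row in rows:
        if not row["claimed_by"]:
            row["claimed_by"] = volunteer
            claimed = row
            break
    if claimed is not None:
        # Rewrite the whole file with the updated claim.
        with open(path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=FIELDS, delimiter="\t")
            writer.writeheader()
            writer.writerows(rows)
    return claimed
```

Rewriting the whole file on every checkout is fine for a few hundred 
requests; anything larger, or anything with several volunteers working at 
once, would probably be better served by a real database.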

I've also been working on getting access to the PG database, so that we can 
start compiling a new CD and DVD image.  If anyone has experience with 
parsing RDF/XML in Perl, I could use your help.

To give you all an idea of the current size of the archive, here's a 
breakdown by file type and compression, with the number of files and their 
total size in bytes.


  fk_filetypes | fk_compressions | files |    bytes
---------------+-----------------+-------+-------------
  ?            | none            |     1 |        6148
  ?            | zip             |    56 |  1158610659
  avi          | zip             |     1 |     9671667
  css          | none            |    73 |      129309
  doc          | none            |     5 |    16335360
  doc          | zip             |     5 |     3530238
  dvi          | gz              |     1 |      145672
  eps          | none            |     5 |      667758
  eps          | zip             |     1 |       50481
  gif          | none            |  3038 |    61544329
  gif          | zip             |     3 |      820419
  html         | none            |  3826 |  1289404590
  html         | zip             |  3578 |  3726062915
  index        | none            |   339 |      882064
  iso          | none            |   277 |  4852684800
  iso          | zip             |     1 |   388439680
  jpg          | none            | 26280 |  1947495197
  jpg          | zip             |     3 |     1928065
  license      | none            |    23 |      253207
  lit          | none            |    57 |     5728949
  lit          | zip             |    55 |     4477759
  ly           | none            |     9 |       59063
  ly           | zip             |     1 |        2423
  md5          | none            |     4 |       15020
  mid          | none            |    46 |     3195924
  mid          | zip             |     7 |      574353
  mp3          | none            | 12751 | 93537630338
  mp3          | zip             |    54 |  1717503076
  mpg          | none            |     4 |    16441408
  mpg          | zip             |     7 |    30644113
  mus          | none            |     9 |     1853407
  mus          | zip             |     8 |     4154954
  nfo          | none            |     1 |     4222976
  nfo          | zip             |     1 |     3063405
  pageimages   | zip             |     1 |    12875805
  pdf          | none            |    93 |   117202451
  pdf          | zip             |    37 |    47006708
  png          | none            | 10897 |   779262971
  prc          | none            |    54 |     7266524
  prc          | zip             |    55 |     8775500
  ps           | none            |     5 |    17086684
  ps           | zip             |     1 |     4210628
  qt           | none            |     1 |     1399639
  qt           | zip             |     1 |     7758161
  readme       | none            |   573 |     9154766
  readme       | zip             |     3 |     6718762
  rtf          | none            |    41 |    47205486
  rtf          | zip             |    53 |    28563481
  sib          | none            |    38 |     1799503
  sib          | zip             |     5 |      904124
  svg          | none            |     2 |       24120
  tex          | none            |    24 |    10556350
  tex          | zip             |    29 |     6468205
  tiff         | none            |    34 |     9129666
  tr           | zip             |     1 |     2591514
  txt          | none            | 17253 | 14194087555
  txt          | zip             | 17200 |  4711928510
  wav          | none            |     2 |    29144452
  xml          | none            |    47 |   111411597
  xml          | zip             |    42 |    25677549
  xsl          | none            |    14 |      345476
               | none            |   115 |    25679187
               | rar             |    23 |   338860727
               | zip             |    49 |   583288192
(64 rows)


As you can see, we're going to have to leave a lot off any DVD we 
create.  The big question is what gets kept and what gets excluded.  Even 
if we include only zipped HTML and zipped text, that comes to 
8,437,991,425 bytes (about 8.4 GB), which is still over budget for a 
single disc.  Of course, we could always create two images, a volume 1 and 
a volume 2, or something similar.
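
To make the arithmetic concrete, here is a small Python sketch that adds up 
rows from the table and estimates how many discs they would need.  The 
byte totals are copied from the table; the 4.7 GB figure is an assumption 
(the nominal capacity of a single-layer DVD), and the estimate ignores 
file system overhead and the fact that files can't be split perfectly 
across discs.

```python
# Byte totals copied from the table above: (file type, compression) -> bytes.
SIZES = {
    ("html", "zip"): 3726062915,
    ("txt", "zip"): 4711928510,
}

# Assumption: nominal single-layer DVD capacity of 4.7 GB (decimal).
DVD_BYTES = 4_700_000_000


def discs_needed(total_bytes, capacity=DVD_BYTES):
    """Lower bound on the number of discs, using ceiling division."""
    return -(-total_bytes // capacity)


total = sum(SIZES.values())
print(total, "bytes ->", discs_needed(total), "disc(s) at minimum")
```

Under that assumption, zipped HTML plus zipped text needs at least two 
discs, and zipped text on its own already comes out slightly above the 
nominal single-disc figure.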

Thoughts?

Sincerely,
Aaron Cannon


--
E-mail: cannona at fireantproductions.com
Skype: cannona
MSN Messenger: cannona at hotmail.com (Do not send E-mail to the hotmail address.) 




