Major Update of Project Gutenberg Newsletter Statistics

For some time I’ve been working through the Newsletter Archives, doing a manual count of all the ‘posted’ listings and entering the results into a spreadsheet. I wanted to confirm that all the statistics added up correctly. There were some inconsistencies, but I believe we now have a much more accurate week-by-week, year-by-year account.

As a result, there will be some differences from the numbers previously recorded in the newsletters. I have therefore put together this article to show the new statistics for each year since 2001.

Notes of interest include:

This is neither a fully accurate nor definitive count, but it certainly brings us another step closer to that goal.

PG Reserved / Pending List

The following is a listing of all the entries on the Project Gutenberg Reserved/Pending list. These numbers have been reserved for various reasons, and at present there is no date for when they will be posted.

Reserved / Pending  gbnewby                                              20000
Reserved            "The Road Leads On, by Knut Hamsun"                   7536
Reserved            "The Arabian Nights Entertainments Volume 2, Anon."   5613
Reserved / Pending  Unknown                                               3018
Reserved / Pending  Unknown                                               2879
Reserved            "Added Upon a Story, by Nephi Anderson"               2877
Reserved / Pending  Unknown                                               2738
Reserved            "The Home Book of Verse, by Burton E. Stevenson V8"   2626
Reserved            "The Home Book of Verse, by Burton E. Stevenson V7"   2625
Reserved            "The Home Book of Verse, by Burton E. Stevenson V6"   2624
Reserved            "The Home Book of Verse, by Burton E. Stevenson V5"   2623
Reserved            "Human Genome Project, About the Human Genome Files"  2200
Reserved            "The Original Writings of Samuel Adams, Volume 1"     2091
Reserved            "2001, by Arthur C. Clarke"                           2001
Reserved            "1984, by George Orwell (Did it come true?)"          1984
Reserved            Pietro di Miceli (former PG Webmaster)                1964
Reserved            Reserved for WWI                                      1914
Reserved            "Twelfth-Night, by William Shakespeare"           WL  1789
Reserved            Reserved for Shakespeare                          WL  1767
Reserved            Reserved for Shakespeare                          WL  1766
Reserved            "I Have A Dream, Martin Luther King, Jr."             1691
Pending             Unfilled                                              1648
Pending             Unfilled                                              1647
Pending             Unfilled                                              1255
Reserved            The Project Gutenberg Encyclopedia                     199
Reserved            The Project Gutenberg Encyclopedia                     198
Reserved            The Project Gutenberg Encyclopedia                     197
Reserved            The Project Gutenberg Encyclopedia                     196
Reserved            The Project Gutenberg Encyclopedia                     195
Reserved            The Project Gutenberg Encyclopedia                     194
Reserved            The Project Gutenberg Encyclopedia                     193
Reserved            The Project Gutenberg Encyclopedia                     192
Reserved            The Project Gutenberg Encyclopedia                     191
Reserved            The Project Gutenberg Encyclopedia                     190
Reserved            The Project Gutenberg Encyclopedia                     189
Reserved            The Project Gutenberg Encyclopedia                     188
Reserved            The Project Gutenberg Encyclopedia                     187
Reserved            The Project Gutenberg Encyclopedia                     186
Reserved            The Project Gutenberg Encyclopedia                     185
Reserved            The Project Gutenberg Encyclopedia                     184
Reserved            The Project Gutenberg Encyclopedia                     183
Reserved            The Project Gutenberg Encyclopedia                     182
Reserved            The Project Gutenberg Encyclopedia                     181

Unwrapping Paragraphs in a PG eBook – A HOW TO

If you’re interested in converting the Project Gutenberg Plain-Vanilla ASCII texts into PDF, HTML or another format that gives you a more versatile display, you may find it useful to remove the mid-paragraph hard line breaks that exist in these files.

A ‘How To’ was recently posted on the Project Gutenberg gutvol-d mailing list, and it is a great guide to this procedure.
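For readers who would like to see the general idea in code, here is a minimal Python sketch. It is not the method from the gutvol-d post; it simply assumes that paragraphs are separated by one or more blank lines and that every line break inside a paragraph can safely become a space. The function name `unwrap_paragraphs` is made up for this illustration.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join the hard-wrapped lines of each paragraph into one long line.

    Assumption: paragraphs are separated by at least one blank line.
    """
    # Normalise line endings so CRLF and CR files behave like LF files.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Split on blank lines (lines that are empty or contain only whitespace).
    paragraphs = re.split(r"\n\s*\n", text.strip())

    unwrapped = []
    for para in paragraphs:
        # Trim each line, drop empty ones, and join the rest with single spaces.
        lines = [line.strip() for line in para.split("\n")]
        unwrapped.append(" ".join(line for line in lines if line))

    # Keep one blank line between paragraphs in the output.
    return "\n\n".join(unwrapped)


if __name__ == "__main__":
    sample = "This is a line\nwrapped hard.\n\nSecond para\nhere.\n"
    print(unwrap_paragraphs(sample))
```

Be aware that a simple approach like this will also flatten poetry, tables and other deliberately formatted passages, so it is worth checking the result against the original text before converting it to another format.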

Digital Text Masters: A Future for Public Domain eBooks?

The following article was posted by Jon Noring on the TeleRead blog in February 2007. This is an excellent discussion on why Digital Text Master files should be created along with ideas on how to implement it. — Ed

‘Digital Text Masters’ (Digitizing the classic public domain books)

by Jon Noring

Rembrandt Self Portrait

The recent TeleBlog articles about the Project Gutenberg (PG) text Tarzan of the Apes (see 1, 2) suggest that not all is well in the existing corpus of public domain digital texts.

My personal experience over the last twelve years in digitizing several public domain books has helped me to see a number of problems, which I’ve mentioned in various forums, including the PG forums and The eBook Community. For the sake of not turning this already long article into a whole book, I won’t cover here the complete list of problems I found, plus those found by others.

To summarize what I believe should be done to resolve most of the known problems: when it comes to creating a digital text of any work in the public domain, we should first produce and make available what we call a “digital text master,” which meets a high standard of textual accuracy against an acceptable and known print source. From the “master,” various display formats and derivative types of texts (e.g., modernized, corrected, composite, bowdlerized, parodied, etc.) can then be produced to meet a variety of user needs.

(Btw, what better example to illustrate the concept of a “digital text master” than the self-portrait of the great 17th-century Dutch master painter, Rembrandt van Rijn, whose attention to detail and exactness is renowned?)