For some time I’ve been working through the Newsletter Archives, doing a manual count of all the ‘posted’ listings and entering the results into a spreadsheet. I wanted to confirm that all the statistics added up correctly. There were some inconsistencies, but I believe it is now a much more accurate week-by-week, year-by-year account.
As a result, there will be some differences from the numbers previously recorded in the newsletters. I have therefore put together this article to show the new statistics for each year since 2001.
Notes of interest include:
3,042 is the starting figure for 2001 (previously thought to be 3,100)
Almost all of the DP-EU books have been posted to the PG site. These duplicates are now subtracted from the totals. Both overall and unique totals will be shown.
PG U.S. total includes the PG-EU ’non-unique’ figures as these are still a part of that count.
REposted total had to be adjusted at the end of 2006 as it was found that a number of reposts were not documented in any newsletter listings.
This is neither a fully accurate nor definitive count, but it certainly brings us another step closer to that goal.
The following is a listing of all the entries on the Project Gutenberg Reserved/Pending list. These numbers have been reserved for various reasons, and at present there is no date for when they will be posted.
Reserved / Pending
gbnewby 20000
Reserved "The Road Leads On, by Knut Hamsun" 7536
Reserved "The Arabian Nights Entertainments Volume 2, Anon." 5613
Reserved / Pending Unknown 3018
Reserved / Pending Unknown 2879
Reserved "Added Upon a Story, by Nephi Anderson" 2877
Reserved / Pending Unknown 2738
Reserved "The Home Book of Verse, by Burton E. Stevenson V8" 2626
Reserved "The Home Book of Verse, by Burton E. Stevenson V7" 2625
Reserved "The Home Book of Verse, by Burton E. Stevenson V6" 2624
Reserved "The Home Book of Verse, by Burton E. Stevenson V5" 2623
Reserved "Human Genome Project, About the Human Genome Files" 2200
Reserved "The Original Writings of Samuel Adams, Volume 1" 2091
Reserved "2001, by Arthur C. Clarke" 2001
Reserved "1984, by George Orwell (Did it come true?)" 1984
Reserved Pietro di Miceli (former PG Webmaster) 1964
Reserved Reserved for WWI 1914
Reserved "Twelfth-Night, by William Shakespeare" WL 1789
Reserved Reserved for Shakespeare WL 1767
Reserved Reserved for Shakespeare WL 1766
Reserved "I Have A Dream, Martin Luther King, Jr." 1691
Pending Unfilled 1648
Pending Unfilled 1647
Pending Unfilled 1255
Reserved The Project Gutenberg Encyclopedia 199
Reserved The Project Gutenberg Encyclopedia 198
Reserved The Project Gutenberg Encyclopedia 197
Reserved The Project Gutenberg Encyclopedia 196
Reserved The Project Gutenberg Encyclopedia 195
Reserved The Project Gutenberg Encyclopedia 194
Reserved The Project Gutenberg Encyclopedia 193
Reserved The Project Gutenberg Encyclopedia 192
Reserved The Project Gutenberg Encyclopedia 191
Reserved The Project Gutenberg Encyclopedia 190
Reserved The Project Gutenberg Encyclopedia 189
Reserved The Project Gutenberg Encyclopedia 188
Reserved The Project Gutenberg Encyclopedia 187
Reserved The Project Gutenberg Encyclopedia 186
Reserved The Project Gutenberg Encyclopedia 185
Reserved The Project Gutenberg Encyclopedia 184
Reserved The Project Gutenberg Encyclopedia 183
Reserved The Project Gutenberg Encyclopedia 182
Reserved The Project Gutenberg Encyclopedia 181
If you’re interested in converting the Project Gutenberg Plain-Vanilla ASCII texts into PDF, HTML or another format that allows a more versatile display, then you may find it useful to remove the mid-paragraph hard linebreaks that exist in these files.
A ‘How To’ was recently posted on the Project Gutenberg gutvol-d mailing list, which is a great guide on how to do this procedure.
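For readers who would like to see the general idea, here is a minimal sketch in Python. It is not the procedure from the gutvol-d post; it simply assumes that paragraphs are separated by one or more blank lines and that every other linebreak is a hard wrap that can be replaced with a space. The function name unwrap_paragraphs is made up for this illustration.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph.

    Assumption: paragraphs are separated by one or more blank lines,
    and any other linebreak is a mid-paragraph hard wrap.
    """
    # Normalise Windows-style line endings first.
    text = text.replace("\r\n", "\n")
    # Split on runs of blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    unwrapped = []
    for paragraph in paragraphs:
        # Drop empty fragments and join the remaining lines with a space.
        lines = [line.strip() for line in paragraph.split("\n") if line.strip()]
        unwrapped.append(" ".join(lines))
    # Put a single blank line back between paragraphs.
    return "\n\n".join(unwrapped)


if __name__ == "__main__":
    sample = (
        "It was the best of times,\r\n"
        "it was the worst of times.\r\n"
        "\r\n"
        "Second paragraph\r\n"
        "continues here.\r\n"
    )
    print(unwrap_paragraphs(sample))
```

Be aware that a simple approach like this will also run together poetry, tables, indented quotations and anything else whose linebreaks are meaningful, so those passages need to be protected or restored by hand.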
The following article was posted by Jon Noring on the TeleRead blog in February 2007. This is an excellent discussion on why Digital Text Master files should be created along with ideas on how to implement it. — Ed
‘Digital Text Masters’ (Digitizing the classic public domain books)
by Jon Noring
The recent TeleBlog articles about the Project Gutenberg (PG) text Tarzan of the Apes (see 1, 2) suggest that not all is well in the existing corpus of public domain digital texts.
My personal experience the last twelve years in digitizing several public domain books has helped me to see a number of problems which I’ve mentioned in various forums, including the PG forums, and The eBook Community. For the sake of not turning this already long article into a whole book, I won’t cover here the complete list of problems I found, plus those found by others.
To summarize what I believe should be done to resolve most of the known problems, when it comes to creating a digital text of any work in the public domain, we should first produce and make available what we call a “digital text master,” which meets a quite high degree of textual accuracy to an acceptable and known print source. From the “master,” various display formats, and derivative types of texts (e.g., modernized, corrected, composite, bowdlerized, parodied, etc.) can then be produced to meet a variety of user needs.
(Btw, what better example to illustrate the concept of a “digital text master” than to show the self-portrait of the great 17th century Dutch master painter, Rembrandt van Rijn, whose attention to detail and exactness is renowned.)
This is an extract from Jonathan Lethem's article on copyright in Harper's. It's a long article but well worth the read.
Literature has been in a plundered, fragmentary state for a long time. When I was thirteen I purchased an anthology of Beat writing. Immediately, and to my very great excitement, I discovered one William S. Burroughs, author of something called Naked Lunch, excerpted there in all its coruscating brilliance. Burroughs was then as radical a literary man as the world had to offer. Nothing, in all my experience of literature since, has ever had as strong an effect on my sense of the sheer possibilities of writing. Later, attempting to understand this impact, I discovered that Burroughs had incorporated snippets of other writers' texts into his work, an action I knew my teachers would have called plagiarism. Some of these borrowings had been lifted from American science fiction of the Forties and Fifties, adding a secondary shock of recognition for me. By then I knew that this “cut-up method,” as Burroughs called it, was central to whatever he thought he was doing, and that he quite literally believed it to be akin to magic. When he wrote about his process, the hairs on my neck stood up, so palpable was the excitement. Burroughs was interrogating the universe with scissors and a paste pot, and the least imitative of authors was no plagiarist at all.
Nelson W. Polsby, who marshaled intellectual rigor, lucid writing and a knack for drawing striking lessons from real-life observation in his enduring studies of Congress and the presidency, died on Tuesday at his home in Berkeley, Calif. He was 72.
The cause was complications of congestive heart failure, his daughter Emily Polsby said.
Mr. Polsby, a political scientist, wrote or edited at least 15 books and scores of articles and edited The American Political Science Review, the most prestigious political science journal. He was especially known for his studies of Congress, the presidency, political parties, policy making and the media.
The Library of Congress has received a $2-million grant from the Alfred P. Sloan Foundation to digitize public domain works. The grant emphasizes digitizing "at-risk" titles, or books that are falling apart, and volumes about American history. Dubbed "Digitizing American Imprints at the Library of Congress," the project will also allow the LoC to invest in such technology as, according to a statement from the organization, "suitable page-turner display" along with a program dedicated to quickly indexing and capturing chapters and other sections of a work.
Publishers Weekly Daily, 5 February 2007
Teachers have steered the Shakespeare curriculum for younger pupils in England away from Othello and Henry IV Part I in favour of lighter texts. After a poll, plays set for 13 and 14-year-olds in England could include Romeo and Juliet and As You Like It. Othello did not make the list because more than half of those questioned said the themes of sexual jealousy and racism were not suitable for that age. Teachers say the exam system impedes the enjoyment of Shakespeare anyway.
Molly Ivins, the liberal newspaper columnist who delighted in skewering politicians and interpreting, and mocking, her Texas culture, died yesterday in Austin. She was 62.
Ms. Ivins waged a public battle against breast cancer after her diagnosis in 1999. Betsy Moon, her personal assistant, confirmed her death last night. Ms. Ivins died at her home surrounded by family and friends.
In her syndicated column, which appeared in about 350 newspapers, Ms. Ivins cultivated the voice of a folksy populist who derided those who she thought acted too big for their britches. She was rowdy and profane, but she could filet her opponents with droll precision.