The Project Gutenberg Weekly Newsletter January 28, 2004
eBooks Readable By Both Humans and Computers Since 1971
Part 1
In this week's Project Gutenberg Weekly Newsletter:
1) Editorial
2) News and Comment
3) Notes and Queries, Reviews and Features
4) Mailing list information
1) Editorial
Hello,
World domination is still on our agenda this week, along with a new
look at the world of PG through our new A to Z.
Happy reading,
Alice
Send feedback and suggestions to the newsletter editor at: news@pglaf.org
Founding editor: Michael Hart hart@pobox.com
Newsletter editor: Alice Wood news@pglaf.org
Project Gutenberg CEO: Greg Newby gbnewby@pglaf.org
Project Gutenberg website: http://gutenberg.net
Project Gutenberg Newsletter website: http://gutenberg.net/newsletter
Radio Gutenberg: http://gutenberg.net/audio
Hosted by iBiblio, The Public's Library at http://ibiblio.org
Distributed Proofreaders: http://www.pgdp.net
Newsletter and mailing list subscriptions: http://gutenberg.net/subs.shtml
============= [ SUBMIT A NEW EBOOK FOR COPYRIGHT CLEARANCE ]==============
If you have a book you would like to confirm is in the public domain in
the US, and therefore suitable for Project Gutenberg, please do the
following:
1. Check whether we have the eBook already. Look in
http://gutenberg.net/GUTINDEX.ALL
which is updated weekly. (The searchable catalog at
http://www.gutenberg.net lags behind by several months.)
2. Check the "in progress" list to see whether someone is already
working on the eBook. Sometimes, books are listed as in progress for
years - if so, email David Price (his address is on the list) to ask
for contact information for the person working on the book. The "in
progress" list:
http://www.dprice48.freeserve.co.uk/GutIP.html
3. If the book seems to be a good candidate (pre-1923 publication
date, or 1923-1988 published in the US without a copyright notice),
submit scans of the title page and verso page (even if the verso is
blank) to:
http://beryl.ils.unc.edu/copy.html
You'll hear back within a few days.
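The rule of thumb in step 3 can be sketched in a few lines of Python. This is only an illustration of the two date ranges quoted above, not a legal test and not part of the clearance process itself; the Book class and the function name are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Book:
    """Hypothetical record holding only the facts that step 3 asks about."""
    year: int                   # year of first publication
    published_in_us: bool       # True if first published in the US
    has_copyright_notice: bool  # True if the book carries a copyright notice


def is_clearance_candidate(book: Book) -> bool:
    """Apply the step 3 rule of thumb; the clearance team makes the real call."""
    if book.year < 1923:
        # Pre-1923 publication date.
        return True
    if 1923 <= book.year <= 1988:
        # 1923-1988, published in the US without a copyright notice.
        return book.published_in_us and not book.has_copyright_notice
    return False


if __name__ == "__main__":
    print(is_clearance_candidate(Book(1905, False, True)))
    print(is_clearance_candidate(Book(1950, True, True)))
```

A book that passes this check is still only a candidate: the title page and verso scans are what actually get reviewed.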
2) News and Comment
Major news this week
Distributed Proofreaders Europe goes live for full public testing
http://dp.rastko.net/default.php
Official public testing of Distributed Proofreaders Europe began
yesterday. With over one hundred proofers having joined during the
recent testing period, a little life is buzzing into DP-EU. If you
haven't made it over there yet, we really recommend it.
A 'To-Do' List
If you're looking for classics to work on for PG, you
can check out the newly updated list at
<http://www.steveharris.net/PGList.htm>.
A lot of books were written before the 1923 copyright deadline. Not too
many are well known today and some are more important than others.
While I want to encourage folks to contribute to Project Gutenberg any
way they can, I thought it might be useful to have a place for those who
are looking for ideas on which books to add.
The list includes almost 2000 entries compiled from the catalogues of
Everyman's Library, Franklin Press' "100 Greatest Books", the Modern
Library, Oxford Classics, and Penguin, as well as
- Bloom's Canon: Harold Bloom's The Western Canon is one
(knowledgeable and opinionated) man's view of the important literary
books of western culture. From it I have extracted the roughly 680
titles written through 1900 (more or less), an approximate cut-off
date which is 'safe' for copyright purposes.
- PWBS: This list includes the 294 books from Publishers Weekly's list
of US Best Sellers from 1900-1923.
Of the almost 2000 total entries, almost 1100 are already in Project
Gutenberg, PG Australia or are "In Progress".
About 260 still seem to be under US copyright (listed as "UC" (sometimes
with date)).
About 310 do not appear to have any etext version and are separately
listed in the "Not Found" list.
About 325 are available from etext sources other than Gutenberg.
This represents an increase of over 150 new or in progress Project
Gutenberg etexts and a reduction of more than 140 in the "Not Found"
List since July, 2003.
Thanks to Steve Harris
Other news this week
Food for thought
CONGRESSIONAL COMMITTEE APPROVES DATABASE BILL
The House Judiciary committee yesterday approved a
controversial database protection bill. Supported by
database companies such as Reed Elsevier and opposed by
companies such as Amazon.com, Google, and Yahoo!, the bill
would allow a database owner to sue in civil court "any
person who makes available in commerce to others a
quantitatively substantial part of the information in a
database". Bill at
<http://thomas.loc.gov/cgi-bin/query/z?c108:H.R.3261:>
Coverage at
http://news.com.com/2100-1028_3-5145040.html
------------
Help Beta Test A New Website At projectgutenberg.info
This is up and running now, but will change during the month.
Please email hart@pobox.com with your suggestions and comments
------------
Latest Statistics
11128 Total 01/28/04 Week #3 (22/337)
86 New This Week
43 New Last Week
73.67 Weekly Average
221 New This Month
221 New This Year
10.05 Average per day this year
2880 Projected Total for this year
60 New this week last year (01/22/03)
287 New this month last year (Jan)
161 New this year last year (2003)
$ 0.90 Trillion dollar cost/book
$ 1.45 Trillion dollar cost/book last year
6904 Etexts This Week Last Year
3 Production Weeks this Year 49 to go.
22 Production Days this Year 337 to go.
1 Production Months this Year
2257 eBooks in last 6 months (07/30/03 - 01/28/04) 26 weeks (30 - 3)
12.47 Daily Average for the last 6 months (181 production days)
1967 eBooks in the prior 6 months (01/22/03 - 07/23/03) 26 weeks (3 - 29)
10.81 Daily Average for the prior 6 months (182 production days)
5489 eBooks in the last 18 months (07/31/02 - 01/28/04) 78 weeks (30 - 3)
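For readers who want to check the averages above, here is a minimal Python sketch that recomputes them from the counts in the table. The helper name is invented for this example. The projected total and the cost-per-book figures are not attempted, since the newsletter does not state how they are calculated.

```python
def average(count: int, periods: int) -> str:
    """Return count / periods formatted to two decimal places, as in the table."""
    return f"{count / periods:.2f}"


# Counts and period lengths copied from the statistics above.
weekly_average = average(221, 3)        # new this year / production weeks
daily_average = average(221, 22)        # new this year / production days
last_six_months = average(2257, 181)    # eBooks / production days
prior_six_months = average(1967, 182)   # eBooks / production days

if __name__ == "__main__":
    print("Weekly average:", weekly_average)
    print("Average per day this year:", daily_average)
    print("Daily average, last 6 months:", last_six_months)
    print("Daily average, prior 6 months:", prior_six_months)
```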
GETTING A COPY OF THE PROJECT GUTENBERG DVD OR CD
We would prefer you download the CD or DVD image, as described below.
But if you would like a disc, we will mail you one free
of charge (you can make a donation to offset our costs: visit
http://gutenberg.net/donate.shtml). When possible, we will send you
TWO copies, so you can give one away.
You can buy a copy of the CD from this site (a portion of all proceeds
go to Project Gutenberg):
http://supporttech.home.comcast.net/projgut.htm
You could also use a CD or DVD burner to copy the disc we send.
To receive a CD or DVD:
email your name and postal address
specify whether you want a CD or DVD
send to: "cd@pglaf.org"
We'll respond to let you know we got your message, and will send a CD
or DVD as soon as we can. Since the CDs and DVDs are produced by
volunteers (using their home computers), we cannot guarantee fast
delivery. Discs are sent via USPS or other inexpensive method.
Generally, the discs are hand-labeled and will arrive in a simple wrapper.
We will send them to you at any address, worldwide. We do not
have a program in place to send discs to other people on your behalf,
but are working on such a program for the future.
DOWNLOADING IMAGES FOR THE PROJECT GUTENBERG DVD AND CD
You can download these CD and DVD ISO images freely, and you are
encouraged to give away copies. See the details on the CD/DVD project
page for limitations on commercial use.
Start here for description, links to the ISO images, and checksums.
http://gutenberg.net/cdproject
The ISO format is a single large file. CD/DVD burning software can
write the file to a CD or DVD, which can then be read in any computer
with a CD or DVD reader. You need a drive that can write and a blank
disc to write the ISO images. These files are large, so not suitable
for download over a modem or other slow connection.
*** The DVD: About 9400 of our first 10,000 eBooks (none of the PG of
Australia eBooks are included due to copyright, and the audio eBooks
and human genome are left out due to size):
You can download the Project Gutenberg DVD image directly from these
sites:
ftp://ftp.ibiblio.org/pub/docs/books/gutenberg/cdimages/pgdvd.iso
(location: North Carolina. Very fast network connection)
ftp://underdog.arsc.alaska.edu/images/pgdvd.iso
or "rsync -rlHtSv ftp@underdog.arsc.alaska.edu::images ."
(location: Alaska. Fast network connection)
ftp://ftp.archive.org/pub/etext/cdimages/pgdvd.iso
(location: San Francisco. Fast but saturated network connection)
The DVD file size is 4139646976.
MD5 sum is 59d8a193874349181122ff52e2e3e114
*** The CD: The August 2003 "Best of Gutenberg" CD contains over 600 eBooks.
The CD image is available as .ISO and .zip (the .zip contains the ISO):
ftp://ftp.ibiblio.org/pub/docs/books/gutenberg/cdimages/PG2003-08.ISO
ftp://ftp.ibiblio.org/pub/docs/books/gutenberg/cdimages/PG2003-08.zip
PG2003-08.ISO MD5 sum: f309b43fddea1ad444ef802e0e5fa92c size 713037824 bytes
(an earlier version has since had minor errors fixed:
MD5 sum: e448aaec6010fa03373d0f74dde5f36e size 711589888 bytes)
PG2003-08.zip MD5 sum: 2bf96ee51d593169ee5b08202b41179d size 387828452 bytes
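Once a download finishes, it is worth comparing the file against the sizes and MD5 sums listed above before burning a disc. The following is a minimal sketch using Python's standard hashlib module; the function names and the KNOWN_IMAGES table are made up for this example, and it assumes the file was saved under the same name as on the server.

```python
import hashlib
import os

# Sizes (bytes) and MD5 sums as published in this newsletter.
KNOWN_IMAGES = {
    "pgdvd.iso": (4139646976, "59d8a193874349181122ff52e2e3e114"),
    "PG2003-08.ISO": (713037824, "f309b43fddea1ad444ef802e0e5fa92c"),
    "PG2003-08.zip": (387828452, "2bf96ee51d593169ee5b08202b41179d"),
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a multi-gigabyte image never sits in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path: str, expected_size: int, expected_md5: str) -> bool:
    """Return True only if both the size and the MD5 sum match."""
    if os.path.getsize(path) != expected_size:
        return False  # cheap check first: a truncated download fails here
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    name = "PG2003-08.ISO"  # assumed local file name
    if os.path.exists(name):
        size, md5sum = KNOWN_IMAGES[name]
        print(name, "OK" if verify_image(name, size, md5sum) else "MISMATCH")
```

A mismatch usually means an incomplete or corrupted download, or the earlier CD image noted above, so try fetching the file again before emailing for help.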
You can also get it in smaller chunks that can be reassembled
using a RAR program such as WinRAR:
ftp://ftp.ibiblio.org/pub/docs/books/gutenberg/cdimages/PG2003-08-parts
We are working to make our images available via BitTorrent, but they
are not there yet. The CD is available via the eDonkey peer to peer
network at this address:
ed2k://|file|The.Project.Gutenberg.CD.August.2003.Edition.-.PG2003-08.zip|387828452|0485242D72E3B440D7D9FD61F0ED44DD|/
Email "cd@pglaf.org" if you need help with the CD or DVD image. If
you can BURN CDs or DVDs for us to give away, please let us know!
For more free eBooks, information about our mailing lists and newsletters,
and how to get involved creating eBooks, visit Project Gutenberg on the
Web at www.gutenberg.net
Radio Gutenberg Update
www.gutenberg.net/audio
channel 1 - Sherlock Holmes "The Sign of Four"
channel 2 - Robert Sheckley's "Bad Medicine"
Both are high quality live readings from the collection.
Testing of Radio Gutenberg audio books on demand is currently taking
place.
QUICK WAYS TO MAKE A DONATION TO PROJECT GUTENBERG
A. Send a check or money order to:
Project Gutenberg Literary Archive Foundation
809 North 1500 West
Salt Lake City, UT 84116
B. Donate by credit card online
NetworkForGood:
http://www.guidestar.org/partners/networkforgood/donate.jsp?ein=64-6221541
or
PayPal to "donate@gutenberg.net":
https://www.paypal.com/xclick/business=donate%40gutenberg.net&item_name=Donate+to+Gutenberg
Project Gutenberg's success is due to the hard work of thousands of
volunteers over more than 30 years. Your donations make it possible
to support these volunteers, and pay our few employees to continue the
creation of free electronic texts. We accept credit cards, checks and
money transfers from any country, in any currency.
Donations are made to the Project Gutenberg Literary Archive Foundation
(PGLAF). PGLAF is approved as a charitable 501(c)(3) organization by
the US Internal Revenue Service, and has the Federal Employer Identification
Number (EIN) 64-6221541.
For more information, including several other ways to donate, go to
http://www.gutenberg.net or email gbnewby@ils.unc.edu
3) Notes and Queries, Reviews and Features
PG A to Z
Yes, we start another new column this week, one that will encompass
several of our others as we go past. We realise that we might not
catch everything PG related as we go past any particular letter, so
feel free to contribute suggestions and we may well go back and cover
that letter again. Now where to start? A of course, A is for
Alice. Hmm, very convenient. Well, we are not typical here at the
newsletter (you may have noticed), so we dare to be different.
'Z'
Z is for Zoran Stefanovic, co-ordinator of DP-EU and Project Rastko. Z
in the PG author list turns up several Chinese volumes by authors such
as Zhang Chao, Zhang Ni and Zhu Xi. There is also a volume on
aeroplanes by James Slough Zerbe. The introduction states that the
book isn't about the exploits of aviators nor a history of flying, but
is in fact 'a book of instructions intended to point out the theories
of flying, as given by the pioneers, the practical application of
power to the various flying structures; how they are built, the
different methods of controlling them; the advantages and
disadvantages of the types now in use; and suggestions as to the
directions in which improvements are required.' Possibly, if you desire
to build your own flying machine, this is the book with which to
begin. Z also includes Emile Zola, more on whom below.
Z for book titles includes Zanoni and Zicci, both by Edward Bulwer
Lytton, and Zone Policeman 88; a close range study of the Panama canal
and its workers, by Harry Alverson Franck.
Finally, Zen And The Art Of The Internet by Brendan P Kehoe,
published in 1992. Two particular pieces caught my eye.
'One warning is perhaps in order---this territory we are entering can
become a fantastic time-sink. Hours can slip by, people can come and
go, and you'll be locked into Cyberspace. Remember to do your work!'
'You have at your fingertips the ability to talk in ``real-time'' with
someone in Japan, send a 2,000-word short story to a group of people
who will critique it for the sheer pleasure of doing so, see if a
Macintosh sitting in a lab in Canada is turned on, and find out if
someone happens to be sitting in front of their computer (logged on)
in Australia, all inside of thirty minutes. No airline (or tardis,
for that matter) could ever match that travel itinerary.'
The ability to see into the future and see Distributed Proofreaders,
now that's impressive!
Emile Zola, French novelist and critic, was the founder of the naturalist
movement in literature. Emile Zola was born in Paris. His father was an
Italian engineer; Zola himself became a French citizen in 1862. Zola spent
his childhood in Aix-en-Provence, southeast France. When he was seven,
his father died, leaving the family with money problems - his mother
was largely dependent on a tiny pension. In 1858 Zola moved with his
mother to Paris. During his formative years Zola wrote several short
stories and essays, 4 plays and 3 novels. Among his early books was
CONTES À NINON, which was published in 1864. When his sordid
autobiographical novel LA CONFESSION DE CLAUDE (1865) was published
and attracted the attention of the police, Zola was fired from his job.
After his first major novel, THÉRÈSE RAQUIN (1867), Zola started the
long series called Les Rougon Macquart, the natural and social history
of a family under the Second Empire. "I want to portray, at the outset
of a century of liberty and truth, a family that cannot restrain
itself in its rush to possess all the good things that progress is
making available and is derailed by its own momentum, the fatal
convulsions that accompany the birth of a new world." The family had
two branches - the Rougons were small shopkeepers and petty bourgeois,
and the Macquarts were poachers and smugglers who had problems
with alcohol. Some members of the family would rise during the story
to the highest levels of the society, some would fall as victims of
social evils and heredity. Zola presented the idea to his publisher in
1868. At first the plan was limited to 10 books, but ultimately the series
comprised 20 volumes, ranging in subject from the world of peasants
and workers to the imperial court. Zola prepared his novels
carefully: he interviewed experts, wrote thick dossiers based on his
research, and outlined the action of each chapter.
In 1885 Zola published one of his finest works, GERMINAL. It was the
first major work on a strike, based on his research notes on labor
conditions in the coal mines. The book was attacked by right-wing
political groups as a call to revolution. NANA (1880), another famous
work of the author, took the reader to the world of sexual
exploitation. Zola's tetralogy, LES QUATRE EVANGILES, which began
with FÉCONDITÉ (1899), was left unfinished.
Zola died on September 28, 1902, under mysterious
circumstances, overcome by carbon monoxide fumes in his
sleep. According to some speculations, Zola's enemies blocked the
chimney of his apartment, causing poisonous fumes to build up and kill
him. At Zola's funeral Anatole France declared, 'He was a moment of
the human conscience.' In 1908 Zola's remains were transported to the
Panthéon. Naturalism as a literary movement fell out of favor after
Zola's death, but his integrity deeply influenced such writers as
Theodore Dreiser, August Strindberg and Emilia Pardo-Bazan.
United Press International: Project Gutenberg's anabasis
By Sam Vaknin UPI Business Correspondent
Published 1/7/2004 12:37 PM
Last October, Project Gutenberg -- the Web's first and largest online
library of free electronic books -- released a long-awaited DVD
containing close to 10,000 of its titles. Since then, another 1000
texts were added to its burgeoning archives. The Project has also
spawned numerous other Web sites. Some of them, such as Blackmask,
offer free downloads and sell their own DVD with mostly Project
Gutenberg eBooks in multiple formats. Others provide free browsers and
library applications specific to PG's content.
The man behind the Project -- and, to many, the man who was the inventor
of the proto eBook in 1971 -- is Michael Hart.
Always amenable to preaching the gospel of free content and its
benefits, he spoke with United Press International about Project
Gutenberg's recent progress. Hart was joined by Greg Newby, chief
executive officer of the Project Gutenberg Literary Archive
Foundation.
Q. In October 2003, you set a new target for Project Gutenberg of one
million free eBooks by the year 2015. Are there so many books in the
public domain? And what then?
Michael: Archimedes said, "give me a lever long enough, and I will move the
world." Project Gutenberg (gutenberg.net) is just such a lever,
enabling a single person to create something of immense value that is
made available to millions of people. If we have reached a mere 1.5
percent of the world's population, we have already given away a
trillion eBooks.
Project Gutenberg is a grass roots operation, never having had real
funding or grants. For 30 years people said that we won't be around
next year. When we started to get close to 10,000 eBooks, they finally
stopped. There are lots of pretend eBook operations, but none of them
produce all of their eBooks themselves, or have 10,000 of their own
eBooks that can be read by virtually any text reader and word processor.
The next big step, after we have reached a million eBooks, will be to
translate each of them into as many as 100 languages, thus making them
available to an even larger audience. Regarding the number of titles
in the public domain, during the 20th Century, there were many years
in which over 50,000 books were published and the rate has been
increasing throughout. Certainly there were a million titles published
before 1923 that we can get our hands on, not to mention non-book
items such as newspapers, magazines, brochures and advertisements,
court records and other government documents, unpublished manuscripts
and diaries, music, film, photographs, audio, and other art forms.
Greg: My calculation, based on the U.S. Library of Congress' copyright
renewal records, is that there are about 1 million books published
from 1923 - 1964 that are demonstrably in the public domain. We are
seeking to "discover" these items. The copyrights of only 10 percent
of all published items are ever renewed.
Q. Libraries on CD-ROMs are at least a decade old. Why did Project
Gutenberg wait until now to issue its own DVD?
Michael: Because there was always someone out there willing to do it
for us. Because CD burners and DVD burners finally got so cost
effective that we could afford to give away this kind of
media. Because today you can't buy a computer off the shelf without a
DVD drive. Until now, physical media could not compete on a cost
effective basis with Internet downloads.
Greg: We have some volunteers willing to create CD and DVD images and
we now distribute them. But we hope to find many other channels to
distribute our content for free or for a small fee.
Q. Why don't simple scans or raw OCR (optical character recognition)
output qualify as eBooks? What is the technological future of eBooks
-- is it Machine Translation and, if yes, why?
Michael: Book scanning is outsourced half way across the world and the
results are shoddy and often cannot be used as input for OCR programs,
to create a text file, for instance. In contrast, once a true eBook is
created, it has more value than a paper copy, because it can be copied
ad infinitum, sent all over the world, even to a billion readers, and
can be the basis for hundreds of new paper and eBook editions, all at
virtually no cost. Moreover, people are not interested in scans. Some
Project Gutenberg sites each hand out 10 million eBooks per year -
impossible with scanned images or full text eBooks due to their
bandwidth-consuming oversize. The "scanners" want to be the only
source for "their" books, even when those books are in the public
domain - and are willing to claim copyright on the public domain works
of Project Gutenberg in the process. They deny themselves true access
to the public.
Our Unlimited Distribution Model calls for everyone to
have a library of 10,000 eBooks, stored on a single DVD that costs
only $1. People find this appealing. There are perhaps 10,000
volunteers to create our kind of ebooks - against only a few hundred
people, all paid, working to create libraries of scans. Additionally,
the huge scan files hold just a single book, are not searchable,
cannot be copied, indexed, or cited by off the shelf applications,
typos can't be corrected, and are not truly portable due to their size.
Project Gutenberg eBooks can be read in any manner the reader chooses
-- favorite fonts, margination, number of lines per page can all be
modified. The reader becomes his or her own publisher. People with
disabilities can use a speech engine to read the texts aloud. The
visually challenged can change the font size. This is impossible to
do with scans. With CD burners available for under $15, and DVD
burners for $100, with blank media so cheap -- the cost of
individual books becomes literally "too cheap to meter." And that is
the whole point of the Project Gutenberg eBook library.
Greg: eBooks are editable and suitable for creating derivative
works. They are not intended to be a depiction of a printed artifact,
but a direct means of experiencing the author's writing. Today's best
OCR still makes (on average) several errors per page of text, and
requires human intervention to handle things like page headings and
footnotes. We plan to make PG's eBooks easily transformable among
different digital formats - XML, HTML, PDF, Braille, audiobooks, TeX,
RTF and others. Features such as fonts, or background colors will be
selectable. Machine translation (MT) will be another of these
"formats", but it is currently technologically premature and
immature. In cooperation with partner organizations in Europe and
elsewhere, we hope to help to develop better MT software. We are
supporting a project in Europe to augment MT with human translation,
much as today's OCR must be helped by human proofreaders to achieve a
low error rate.
With thanks to Sam Vaknin
4) Mailing list information
For more information about Project Gutenberg's mailing lists
please visit the following webpage:
http://gutenberg.net/subs.shtml
Trouble?
If you are having trouble subscribing, unsubscribing or with
anything else related to the mailing lists, please email
"owner-gutnberg@listserv.unc.edu" to contact the lists'
(human) administrator. Please note the email address spelling.
If you would just like a little more information about Lyris
features, you can find their help information at http://www.lyris.com/help
Please note that the newsletter staff do not have access to the
mailing list email address list, so they are unable to subscribe
/ unsubscribe you themselves. They can, however, give advice if
you have trouble following the procedures on the webpage.
Current Subscription Numbers as at end December 2003
gweekly - 2812
gmonthly - 3490
Credits
Thanks this week to Brett and George for the numbers and the
booklists, and to Thierry, Joel, Greg, Michael and Larry Wall. Entertainment for
the workers provided as usual by BBC 6Music and Andrew Collins. Happy
anniversary to me. Typically working in a virtual office full of
men, no-one but me noticed that I've been around for a year doing
newsletters. No chocolates, no flowers - not even a picture of a
flower, they didn't even offer to make me a coffee, they probably all
think it's not important. Well, now I'm off to proof-read a book at DP
where I started, oh, and have a frosty drink. Happy anniversary readers.