I'm focusing primarily on vertebrates at the
moment, which account for (I think) about 60,000-70,000 rows
across all taxa (species, families, etc.). My goal is to create a
customized database that does a really good job of handling
vertebrates first, manually adding a few key invertebrates and
plants as needed.
I couldn't possibly repeat the process with invertebrates
or plants; there are far too many of them. So, if I ever figure
out the Catalogue of Life's database, I'm just going to
modify its tables so they work with my system. My vertebrates
database will override their vertebrate rows (except for any
extra information they have to offer).
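
The override rule above could be sketched roughly like this in Python. This is only an illustration: it assumes both sources can be loaded as dictionaries keyed by scientific name, and the field names (common_name, slug, extra) are made up for the example, not taken from the Catalogue of Life's actual schema.

```python
def merge_rows(catalogue_rows, my_rows):
    """Overlay my rows on the Catalogue's rows, matched by scientific name.

    Both arguments are assumed to be dicts of the form
    {scientific_name: {field: value, ...}} (a hypothetical layout).
    """
    # Start from a copy of the Catalogue data so the input is not modified.
    merged = {name: dict(row) for name, row in catalogue_rows.items()}
    for name, row in my_rows.items():
        base = merged.setdefault(name, {})
        # My non-empty values win; Catalogue fields I don't have are kept.
        base.update({k: v for k, v in row.items() if v not in (None, "")})
    return merged


# Hypothetical example rows.
catalogue = {"Castor canadensis": {"common_name": "American Beaver", "extra": "kept"}}
mine = {"Castor canadensis": {"common_name": "American beaver", "slug": "american-beaver"}}
merged = merge_rows(catalogue, mine)
```

Here the merged row takes the common name and slug from the custom data but still carries the Catalogue's extra field.
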
As for "hand-entry," I do almost all my work in
spreadsheets. I spent a day or two copying scientific names
from the Catalogue of Life into my spreadsheet. Common names
and slugs (common names in a URL format) are a project that
will probably take years. I might type a scientific name or
common name into Google and see where it leads me. If a
certain scientific name is associated with the common name
"yellow birch," then its slug becomes yellow-birch. If two or
more species are called yellow birch, then I enter
yellow-birch in a different table ("Floaters"), which leads to
a disambiguation page.
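
The slug rule just described can be sketched in a few lines of Python. This is a minimal sketch, not the actual spreadsheet workflow: it assumes the rows are available as (scientific name, common name) pairs, the function names are invented for the example, and the two "Placeholder species" entries stand in for real species that share a name.

```python
import re
from collections import defaultdict


def slugify(common_name):
    """Turn a common name into URL form, e.g. 'Yellow Birch' -> 'yellow-birch'."""
    name = common_name.lower().replace("'", "")
    # Collapse every run of characters other than a-z and 0-9 into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", name).strip("-")


def assign_slugs(rows):
    """Split (scientific_name, common_name) rows into unique slugs and floaters."""
    by_slug = defaultdict(list)
    for scientific_name, common_name in rows:
        by_slug[slugify(common_name)].append(scientific_name)

    slugs = {}     # scientific name -> slug, when only one species uses the name
    floaters = {}  # slug -> every species sharing it (disambiguation page)
    for slug, names in by_slug.items():
        if len(names) == 1:
            slugs[names[0]] = slug
        else:
            floaters[slug] = names
    return slugs, floaters


rows = [
    ("Castor canadensis", "American Beaver"),
    ("Placeholder species A", "Yellow Birch"),  # made-up entries that
    ("Placeholder species B", "Yellow Birch"),  # share one common name
]
slugs, floaters = assign_slugs(rows)
```

With these rows, american-beaver is assigned directly to its species, while yellow-birch lands in the floaters dictionary. Note that this slugify only handles plain ASCII letters; accented characters would need extra treatment.
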
For organisms with two or more popular common names - well,
I haven't really figured that out yet. I'll probably have to
make an extra table for additional names. Catalogue of Life
has common names in its database, but every word in them is
capitalized - like American Beaver. That works fine for a
page title, but in regular text I need to make "beaver" lowercase
without changing "American." So I'm recreating all the common
names from scratch.
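
One possible way to automate part of that capitalization fix is sketched below in Python. It assumes a hand-maintained list of words that must stay capitalized; the PROPER_WORDS set and the function name are hypothetical, and the list would have to grow as new place names and people's names turn up.

```python
# Hypothetical, hand-maintained set of words that keep their capital letter.
PROPER_WORDS = {"American", "European", "African"}


def to_running_text(title_name, proper_words=PROPER_WORDS):
    """Lowercase every word of a title-case name except known proper words."""
    words = title_name.split()
    return " ".join(w if w in proper_words else w.lower() for w in words)


name_in_text = to_running_text("American Beaver")  # "American beaver"
```

This does not look inside hyphenated words or possessives, so names of that kind would still need to be checked by hand.
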