NZGDB Newsletter #9, March 2008
In this newsletter:
- Making your data even more secure
- Ten good reasons for putting your database into NZGDB
- Search results now ordered by the number of scrapbook documents
- Shipping: Passengers who embarked or disembarked at stopovers
- Become an ambassador for NZGDB and earn subscription credits
- Material for future newsletters – request for submissions
NZGDB has always
offered more security than a home PC.
From the time that we installed our own server, data has been stored on
mirrored disks, meaning that every bit of information is written simultaneously
to two separate disks so that in the event of the failure of one disk the data
can be recovered from the other. The
system software continually checks the health of the system and would
detect the first disk failure, so the chances of any data being lost or corrupted
before we could replace the faulty disk were already very low. However, we wanted
to plug even this extremely low risk, so NZGDB was taken off-air last week
while we implemented an even higher level of insurance, with an automated
backup to yet another, entirely separate disk. This now makes NZGDB a very safe place to store your data. While not quite up to the level of the
banking system where everything is continuously mirrored at another site in
another city so that business can continue without interruption even in the
event of a terrorist attack, it’s at the standard of “mission critical” systems
running most of our companies. About the only event that would cause us a
serious problem would be a fire in the server room. Even then there wouldn’t be a total loss. We would have to fall back to the off-site
backup, and there would also be a delay while we obtained another server, but eventually we’d be back on
air with most of the data intact.
If our week offline
inconvenienced anybody, please accept my apologies. For those with a current subscription (including credits earned
by supplying data), your subscription has been extended by two weeks.
You may have noticed
that all the “tonycairns” databases now have owner “datamanager”. This makes explicit what we have said in
previous newsletters: this data is not
“owned” by Tony in the same way as he owns his “Cairns Doole 2007.ged” database
(as user TonyC) and I own the Barnes.ged records. The datamanager databases have been loaded from data that are
publicly available, or that have been provided to us by email for loading. We’d like nothing better than to find a
real owner for these databases, somebody who can respond to queries about the
data, and take responsibility for their accuracy. If one or more of these databases is yours, please get in touch
and we’ll transfer them to your ownership, or, better still, replace them with
your more up-to-date information.
There are now 3.67 million
records in the GDB database, making it the largest collection of “New Zealand”
family trees available on the web by far, much larger than the NZ sections of
the large international sites.
Admittedly the New Zealand connection in some trees is not strong, in
some cases only a few families. But we all come from somewhere else if you go
back far enough, and now that air travel has made it easy to move around the
world we’ll see more and more trees that weave in and out of New Zealand, so it
wouldn’t make sense to be too strict.
If your tree has any “New Zealand connection”, then it’s welcome here.
We expect further
loading of “datamanager” data to slow down now. While we will continue to load GED databases as we find more family
trees with New Zealand content, we’d like to focus more on quality than
quantity, working to improve the information that we can find about each person
in the GDB, filling in gaps while reducing the duplication.
One of the most
important ways of improving quality is to get more databases with real
owners. A database with owner
“robertb” or “tonyc” is more valuable than a “datamanager” database, as there
is an owner who understands the data and can work towards making it more
complete, and more accurate. Have you
loaded your database yet? Here are
ten good reasons for doing so: -
1. By publishing your database in NZGDB others can easily find your public records. They may be able to fill in some gaps for you, telling you more about your ancestors, or correcting facts that you have got wrong. Contacting you is easy, unless you said that you don’t want to be contacted.
2. In NZGDB you publish your database without losing control. Compare this with the usual approach, where you email your GED to another genealogist or place it on a website for others to download. If there are problems – errors, omissions, or privacy breaches – you cannot fix them, as your GED is now being propagated around the world without any chance of recall or correction. In contrast, within NZGDB others can see your records, and they can link them to their own to create an extended tree, but they remain your records. At any time you can change them, even remove them. Such changes immediately appear in the tree of anybody who has linked to your records, because they don’t have their own copy: they are linking to your record.
3. The records you publish need not be merely the basic names, dates, and family relationships that are exported in GED files. You can add all your scrapbook data (documents, photos, even audio), making each record a rich description of the subject’s life.
4. ** When a database is published the system looks for duplicate records in other databases. Through these duplicate links you can find others who have records about your people. Perhaps it’s merely another copy of the same information, but sometimes there are facts and notes that are completely new to you.
5. Through the Compare and Synchronize tools you can easily compare your records with the duplicates, importing new facts and corrections, and scrapbook links, into your records. When you import new facts in this way, NZGDB automatically creates a source record documenting where you obtained this information.
6. By linking your records to another’s tree you gain new lines of ancestry and descent without creating a copy to become obsolete as the original is updated.
7. You become part of the NZGDB community, providing value to others just as they provide value to you.
8. You create a secure copy of your precious family research, guarding against the possibility that your research will be lost if your own computer or paper records are damaged or destroyed. As was noted above, the risk of your records being lost from NZGDB is much lower than the risk of losing them at home. Nor will they be lost when your children sort out your things after your death.
9. Because NZGDB can determine whether a record should be public or private, you do not have to be careful to omit records of living people from your database. Eventually all your records will become public as they come within the scope of the 100 year rule, but in the meantime records of living people are private, hidden from everybody except you and those to whom you have given permission. NZGDB is a good way of sharing your story with your family, while keeping private information private.
10. By providing records to the GDB you earn a subscription credit, calculated in proportion to the number of records that you add (2000 people gives you a year’s subscription).
** The automatic detection of duplicates is temporarily suspended, as it was taking too long. When it has been made faster it will be reinstated, and the databases loaded while this facility was turned off will be processed for duplicates.
Behind the scenes I’ve been making a number of
changes that should make the site a little faster. However, there is still a lot to be done. I’ve also made a number of functional
changes. The most significant of these
are the changes that have been made to the GDB Search function.
When you use the GDB search function you’ll
notice that there is now a button, [Advanced Search]. If you’re observant, you may also notice that the option list Normal/Replaced/All has disappeared.
If you click the [Advanced Search] button then
it changes to say [Standard Search], and the search page changes to expose some
more options: -
A couple of minor differences:
1. You can now give alternative spellings for the family name. This is particularly useful for names such as McDuff and MacDuff, or ONeil and O’Neil.
2. The option list Normal/Replaced/All reappears.
However the really interesting bit is the group
of controls below the [Return] button.
These allow you to save a search, and then later re-run it, or find all
the records that have changed or are new since you ran it earlier. Here’s how it works.
· First, you define a search in the normal way. You will probably have clicked [Search] (either button, they’re identical) to see the results that are returned, but you don’t have to.
· You then put a name in the textbox following the [Save Search] button, and you click the button. For example: -
By saving this as my “Pym” search, I can later
come back and rerun it. I do this by
selecting it with the drop-down box, [?Select Search], and then clicking one of
the buttons: -
· [Search] reruns the search, as if I had reentered the criteria manually. The results may of course include records added or changed since the original search.
· [Find Changes] will display only records that were in the original (saved) search that have since been changed.
· [Find New Records] displays records that are now in the database, but were not returned by the original search. This will include “excess” records: if the original search returned the limit of 1000 records, but without this limit it would have returned 1200 records, then [Find New Records] will return these 200 records along with any other new records.
· [Previous Records] returns the records that were previously returned, but will ignore any new records.
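For the technically minded, the four buttons can be pictured with a small Python sketch. This is only an illustration of the behaviour described above, not the code running on the site: the names Record, SavedSearch and the "version" counter are hypothetical, and I am assuming that a saved search remembers which records it returned and what state they were in at the time.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    record_id: int
    family_name: str
    version: int  # hypothetical counter that increases whenever the record changes


@dataclass
class SavedSearch:
    name: str
    family_name: str
    # record_id -> version seen when the search was saved (assumed representation)
    snapshot: dict = field(default_factory=dict)


def run_search(db, family_name):
    """[Search]: evaluate the criteria against the database as it is now."""
    return [r for r in db if r.family_name == family_name]


def save_search(db, name, family_name, limit=1000):
    """[Save Search]: remember the criteria and the records actually returned."""
    hits = run_search(db, family_name)[:limit]
    return SavedSearch(name, family_name, {r.record_id: r.version for r in hits})


def find_changes(db, saved):
    """[Find Changes]: records from the saved result that have since changed."""
    return [r for r in run_search(db, saved.family_name)
            if r.record_id in saved.snapshot
            and r.version != saved.snapshot[r.record_id]]


def find_new_records(db, saved):
    """[Find New Records]: matches that were not in the saved result,
    including "excess" records cut off by the limit at save time."""
    return [r for r in run_search(db, saved.family_name)
            if r.record_id not in saved.snapshot]


def previous_records(db, saved):
    """[Previous Records]: only the records returned originally."""
    return [r for r in run_search(db, saved.family_name)
            if r.record_id in saved.snapshot]
```

Because the snapshot holds only the records that were actually returned, anything cut off by the limit when the search was saved is treated as new afterwards, which matches the “excess” records behaviour above.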
Feedback questions.
1. Would
it be better to combine [Find Changes] and [Find New Records], with the search
returning all changed and new records in one go?
2. Would it be better to move the Father and
Mother criteria to the advanced page?
Few people use these criteria, so this would simplify the standard
search as most people use it, while still leaving parent-criteria available to
those who want them.
Search results are now ordered so that, within
a group of duplicate records with the same name-date, records are returned in
order of the number of scrapbook entries.
Thus if you look up Hannah OLD (1860-1939), the first record is mine as
this has a lot of pictures and documents attached to the record. Within other OLD records those belonging to
Mirk562 are usually first, for the same reason. Counting scrapbook documents is an attempt to present “the best”
record first while keeping the criteria for “best” simple enough so that the
query does not slow down. I found that
counting the scrapbook links made almost no difference to the speed of the search.
As I said in the last newsletter, simply
counting these entries is only a crude approximation to “which record is best”
but it is a simple and efficient test, and good enough for the purpose of
deciding which record to display first.
Also, it has the virtue of ensuring that the top record is one that is
owned by somebody who cares about it, because scrapbook entries require
explicit action by the owner. Trees that are simply uploads of GED files, like
all the datamanager databases, have no scrapbook entries.
Let me know what you think. Is this good enough?
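If it helps to see the ordering rule written down, here is a minimal Python sketch. It assumes that results are grouped by name and then by dates, and the scrapbook counts shown are invented for illustration; the Hit type and its field names are hypothetical and are not the site's own.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    dates: str
    owner: str
    scrapbook_count: int  # number of scrapbook entries linked to the record


def order_results(hits):
    """Keep each name-date group together and, within a group,
    put the record with the most scrapbook entries first."""
    return sorted(hits, key=lambda h: (h.name, h.dates, -h.scrapbook_count))


# Illustrative data only: the counts are made up.
hits = [
    Hit("OLD, Hannah", "1860-1939", "datamanager", 0),
    Hit("OLD, James", "1855-1920", "datamanager", 0),
    Hit("OLD, Hannah", "1860-1939", "robertb", 12),
    Hit("OLD, Hannah", "1860-1939", "Mirk562", 3),
]
ordered = order_results(hits)
```

The count is a single number per record, which fits the observation above that using it makes almost no difference to the speed of the search.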
The system has always logged access and update
to GDB records, but this log record has been hidden away where only I could see
it. This has now been changed. When you look at one of your records you’ll
see a [Log] button. Click this and
you’ll see a log showing who has accessed this record, and any updates that
there have been to the record. The log only
shows access and updates since the [Log] button was introduced: if you need to know this for
earlier access, contact me. The earlier data is
available, although not so easily linked to the record.
Suppose you had an ancestor who immigrated to
Wellington on a voyage from London to Nelson. Previously a search would have found this passenger among the
arrivals at Nelson, with only a comment to say that he (or she) actually
disembarked at the Wellington stopover.
The passenger would not have been found in a search for Wellington
arrivals.
Now, when the voyage is defined with one or
more stopovers then the downloaded spreadsheet includes columns allowing you to
indicate passengers who got on or off at these stopovers. You will normally leave these columns blank
as most passengers embark at the voyage’s departure point and disembark at its
arrival, but in cases like our hypothetical ancestor we can now simply record
“Wellington” in the disembark column.
The passenger will now be found by a search for Wellington arrivals, and
will not be found in a search for Nelson arrivals.
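The rule for stopovers can be sketched in a few lines of Python. This is only a sketch of the logic as described above, under the assumption that a blank embark or disembark column means "use the voyage's own departure or arrival port"; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Voyage:
    departure: str
    arrival: str
    stopovers: tuple = ()


@dataclass
class Passenger:
    name: str
    embark: str = ""     # blank: boarded at the voyage's departure port
    disembark: str = ""  # blank: left the ship at the voyage's arrival port


def departure_port(voyage, passenger):
    return passenger.embark or voyage.departure


def arrival_port(voyage, passenger):
    return passenger.disembark or voyage.arrival


def arrivals_at(voyage, passengers, port):
    """Names of passengers whose effective arrival port is `port`."""
    return [p.name for p in passengers if arrival_port(voyage, p) == port]


voyage = Voyage("London", "Nelson", stopovers=("Wellington",))
passengers = [
    Passenger("Our hypothetical ancestor", disembark="Wellington"),
    Passenger("A through passenger"),
]
```

With this data, a search for Wellington arrivals finds only the ancestor, and a search for Nelson arrivals finds only the through passenger.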
New-user registration now includes a section
“How did you hear about us”. If a user
hears about NZGDB from you, and then subscribes to NZGDB, then we’ll credit
your logon with a subscription calculated as 20% of their payment. So tell your friends about NZGDB: the more
people that subscribe, the better we can make it for everybody.
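As a worked example of the two credit rules mentioned in this newsletter (20% of a referred subscriber's payment, and a year's subscription for 2000 people added), here is a small Python sketch. It assumes both rules are strictly proportional, with no minimum, cap, or rounding; the function names are hypothetical.

```python
RECORDS_PER_YEAR = 2000  # 2000 people added earns one year's subscription
REFERRAL_PERCENT = 20    # referral credit as a percentage of the new subscriber's payment


def record_credit_years(records_added):
    """Subscription credit, in years, for records added to the GDB
    (assumed to be strictly proportional)."""
    return records_added / RECORDS_PER_YEAR


def referral_credit(payment):
    """Credit earned when somebody you referred pays `payment`."""
    return payment * REFERRAL_PERCENT / 100
```

Under these assumptions a database of 500 people would earn a quarter of a year, and a referred payment of 50 would earn a credit of 10 in the same currency.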
Speed. In the next development period I will continue trying to make the site faster. The changes that we’ve made this period have had a positive effect, but there is still more that can be done.
The feedback on the proposed “Post-it Note” facility was highly positive, and there were no negative responses. I will therefore implement this idea.
Post-it notes are intended to be “unofficial information”, not yet approved by the record owner. They will appear in the links section, and I will make them look like a post-it note by giving them a light yellow background. There will be no confusion: one can easily distinguish these from the owner’s own information. The record owner can accept or delete a post-it note: when accepted it will be changed into a normal note.
The feedback from last month’s newsletter indicates that this idea has struck a chord. How often have we found records that are incorrect, or that lack some vital information that we could supply, only to get no response when we attempt to contact the record owner? Post-it notes are a way of solving this problem, without compromising the idea that the record owner has complete control of his/her record.
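Since the facility is not yet built, the following Python sketch is only one way the accept/delete behaviour might work, not the planned implementation. The class names, the is_post_it flag, and the owner check are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    author: str
    text: str
    is_post_it: bool = True  # True: unofficial, shown with a light yellow background


@dataclass
class PersonRecord:
    owner: str
    notes: list = field(default_factory=list)

    def add_post_it(self, author, text):
        """Anybody may attach a post-it; it stays unofficial until accepted."""
        note = Note(author, text)
        self.notes.append(note)
        return note

    def accept(self, user, note):
        """The owner accepts a post-it: it becomes a normal note."""
        self._require_owner(user)
        note.is_post_it = False

    def delete(self, user, note):
        """The owner rejects a post-it: it is removed from the record."""
        self._require_owner(user)
        self.notes.remove(note)

    def _require_owner(self, user):
        if user != self.owner:
            raise PermissionError("only the record owner can accept or delete a post-it note")
```

The point of the owner check is the one made above: other users can contribute information, but the record owner keeps complete control of the record.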
At first NZGDB limited searches so that only the first 100 records were returned. With the new server I was able to increase this to 400, and then to 1000.
Many sites return the first so-many records (say 100), with [Next] and [Previous] buttons to move forward or backward through the total list. This works well for a small-scale site, but something that I found out as part of last month’s investigation into speeding up the site is that the complete set of records is returned even if you only see 100 of them. Thus if we used this logic within NZGDB and you searched for “Smith”, NZGDB would find over 22,000 records, return all 22,000 to your computer, and then display the first 1000. This would cause a huge amount of network traffic, and you’d be left waiting for a long time for your results. In fact it would probably cause a timeout.
However, the facility to save searches, implemented this month, gives me a way of solving this problem. It may still be necessary to put a limit on the total search size, but even if so this can be quite large. When you do a search returning a large number of records, they will now be returned to you 1000 at a time, but only the current set of records will be sent over the Internet. Thus if your search returns 5000 records, 1000 will be sent to you, and 4000 will be held back on the server awaiting your [Next] click.
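The idea of holding the full result on the server and sending one set at a time can be sketched in Python as follows. This is a minimal illustration under the assumption that the server keeps the list of matching record ids for each search; the PagedSearch class and its method names are hypothetical.

```python
class PagedSearch:
    """Holds the complete result on the server and hands out one page at a time."""

    def __init__(self, record_ids, page_size=1000):
        self._ids = list(record_ids)  # the full result never leaves the server
        self._page_size = page_size
        self._offset = 0

    def next_page(self):
        """[Next]: return only the next set of records to send over the Internet."""
        page = self._ids[self._offset:self._offset + self._page_size]
        self._offset += len(page)
        return page

    def held_back(self):
        """Number of records still waiting on the server."""
        return len(self._ids) - self._offset


search = PagedSearch(range(5000))
first_page = search.next_page()
```

After the first call, 1000 records have been sent and search.held_back() reports the 4000 still waiting, as in the example above.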
What else would you like to see in NZGDB? I’d particularly like to hear from more users like teecee who have data that they’d like to see integrated into NZGDB. Somebody posting to the Rootsweb NZ list wrote: -
> To those people who have an interest in Dorset England there is a site that
> is run by Individuals at their own parishes called online parish clerks
> where they transcribe the information free to researchers.
> It would be great to have it working here.
We’d love to make it work here! If you have access to copyright-free parish data, cemetery lists, or anything else of interest to your fellow genealogists, then we’d be happy to work with you to develop a section within NZGDB making this data available, searchable, and linkable to the GDB. Have a look at the facilities for shipping to get an idea of the sort of thing that can be provided. By the way, we hold a lot more shipping data that we can make available once we have checked that we are not violating copyright.
A number of BDM (Birth, Death, and Marriage) spreadsheets are already available within the documents section of the site, and we have access to a lot more. However as spreadsheet documents they are not easily searchable, so adding a proper BDM facility within the resources section could be one of our next developments.
We’d like to invite you to submit material for future newsletters. Newsletters so far have been just from us to you, telling you about NZGDB developments, but as the NZGDB community grows they could become an opportunity for you to communicate with other family history enthusiasts: -
Information wanted: Do you have a question? Perhaps another NZGDB user can answer it.
Information offered: Perhaps you have something to offer other users.
Book reviews, web site reviews, and software for genealogy: If you’ve come across a book, or website, or new program of interest to other family historians, why don’t you write a brief review on it?
Articles of interest: If you have an interesting story from your historical research, whether about a family, a place, a community, or a building, then we’d love to publish it for you.
These newsletters have been produced about monthly: this is likely to continue as long as the system keeps rapidly evolving. Not only are the newsletters sent to the growing list of NZGDB users around the world, but newsletters are all archived on line and will remain available indefinitely. When there are a large number of archived newsletters then cumulative search and index facilities will be developed. At present the index that appears when you click on the Newsletters link is all that is needed.
Regards,
Robert Barnes,
NZGDB Developer