For about a week
we’ve been running on our new server.
The new server is not much to look at – just a flat slab, about the thickness
of a kitchen bench top. It’s the second from the top in this rack. It’s already proving its worth. You may have noticed that searching for
people is significantly faster, and those with “Smith” ancestors can now look
them up: previously all they saw was a timeout message. Something that is very important to me, though
you may not have noticed it, is that I can now run large database queries (like looking
for duplicate records) without noticeably affecting other users.
Having our own server will allow me to provide a number of new features
that I could only dream about before.
First, however, I’ve got to learn how to administer it properly so that I
prevent, rather than cause, unexpected outages. My apologies to anybody who was
inconvenienced by a server outage.
If you are using a bookmark to access NZGDB, check that it points to www.nzgdb.co.nz and not the old URL, http://robertb.equote.co.nz/nzgdb. The old URL was replaced in May. Until
the server change it still pointed to the same site as www.nzgdb.co.nz, but it now points to the
old site. If you’re not sure, look at
the first (logon) page: if it has a
red warning message about data being lost, then you’re looking at the old site.
In the site conditions you agreed to accept this newsletter. Let me know if you want to be taken off the
list.
The system now attempts to detect duplicate records, linking them with a
Duplicate(score) soft link. For
example, when you open a record you may see something like this in the soft links section: -
The logic was designed to locate as many duplicates as economically
possible without linking any non-duplicates.
The process is: -
1. Records with the same Family Name and Given Names are compared, and a “Raw Score” of 1 to 4 is calculated by comparing Family Name, Given Names, Year of Birth, and Year of Death. The Raw Score adds 1 if the field is the same, subtracts 1 if the field is different, and adds 0 if the field is absent.
2. Raw scores are also calculated for parents and grandparents (using zero if there is no corresponding record), and added to the raw score for the starting record. This yields a number with a maximum of 28.
3. If the combined score is greater than 10 (of 28), then a duplicate link is created. The link’s score is reported in deciles, i.e., like a percentage but on a scale of 0-10 rather than 0-100.
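The three steps above can be sketched in Python as follows. This is only an illustration of the scoring as described here, not the actual NZGDB code: the `Person` class, the function names, the pairing of ancestors by position (father with father, mother with mother), and the rounding used to turn the combined score into a decile are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

FIELDS = ("family_name", "given_names", "birth_year", "death_year")
MAX_COMBINED = 28  # 7 records (self, 2 parents, 4 grandparents) x 4 fields
THRESHOLD = 10     # a link is created only above this combined score


@dataclass
class Person:
    # Hypothetical record layout; None means the field is absent.
    family_name: Optional[str] = None
    given_names: Optional[str] = None
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    father: Optional["Person"] = None
    mother: Optional["Person"] = None


def raw_score(a, b):
    """+1 per matching field, -1 per differing field, 0 if a field is absent."""
    if a is None or b is None:
        return 0  # no corresponding record
    score = 0
    for field in FIELDS:
        x, y = getattr(a, field), getattr(b, field)
        if x is None or y is None:
            continue
        score += 1 if x == y else -1
    return score


def lineage(p):
    """The person, both parents, then four grandparents (None where missing)."""
    parents = [p.father, p.mother]
    grandparents = []
    for parent in parents:
        grandparents += [parent.father, parent.mother] if parent else [None, None]
    return [p] + parents + grandparents


def combined_score(a, b):
    return sum(raw_score(x, y) for x, y in zip(lineage(a), lineage(b)))


def duplicate_decile(a, b):
    """Return the link score (0-10), or None if no link would be created."""
    total = combined_score(a, b)
    if total <= THRESHOLD:
        return None
    return round(10 * total / MAX_COMBINED)  # assumed rounding rule
```

Under these assumptions, two fully matching records with no parents score only 4 of 28, so no link is created, while two matching records whose parents also match on all four fields score 12 and are linked.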
This logic will miss duplicates at the extremities of trees, where there
are insufficient parents and grandparents to raise the score above 10/28, and
it misses duplicates where there are minor spelling differences (for example,
Hannah Francis BARNES/Hannah Frances BARNES). However, every reported duplicate that I have
checked has been genuine, which is more important than the fact that we’ve
missed some. The missed duplicates
can easily be linked through [Synchronise]: -
This feature has been available to testers for about a month, and now
that many duplicates are located automatically it has been generally released.
I’m still struggling to make
this easy while preserving essential principles such as record ownership, so
anybody using this facility should read the Help first (http://www.nzgdb.co.nz/help/gdb5_help.htm),
and any feedback would be very welcome.
To use this
facility, click the button [Synchronise, Link, and Merge Trees]. This button is on the Compare page, which
you reach from a Duplicate soft link (see above).
When you
synchronise records the system will automatically create duplicate links for
corresponding ancestors, handling the “extremity issue” noted above. Ancestors are linked even if they have
different names, allowing situations such as John OLD/John OWLD to be
recognized as duplicates. It also
provides tools to allow you to record duplicates when names are different.
Synchronise also
makes it very easy to import facts from the duplicate record into your own
record. The system automatically
creates a source record noting the owner and GED that the fact was copied from, as
well as copying any source records that were in the source GED.
Many GEDs have been submitted with names like “Cairns Doole 2007.GED”,
implying that these are snapshots.
Perhaps you intend to submit updated records later as “Cairns Doole
2008.GED”. But this will have some
unintended consequences.
The system will keep the two submissions quite separate. Thus if “Cairns Doole 2008.ged” contains records
that are also present in “Cairns Doole 2007.ged”, then both records will be
stored in NZGDB. Many of these
duplicates will be detected and linked, but not all. At the least this will increase confusion as the search returns
more records and users have to search through each of these trying to work out
which record to believe.
You could ask us to remove the earlier GED, but this will lose any links
that have been established to/from it.
If others have identified duplication and synchronised their records, or
linked their tree to yours, they will have to do this all over again.
What you SHOULD do is to keep using the same name unless you actually
want both GEDs to be stored. Also,
when uploading the GED you should use the “Update” option, not “Replace”. “Update” will preserve the original record
keys, retaining soft links and unchanged records. Use “Replace” only for very exceptional situations such as when
you completely rebuild your PC database, for example converting it from Legacy
to Family Tree Maker.
It is better to use a neutral name, like “Cairns Doole.GED”, than a name
that implies a particular date. Contact me or Tony if you’d like your data
source renamed.
Also, as discussed in Newsletter #3,
updating can be very efficient, as you only need to upload the changes.
Another new feature is the Tree View.
Click the [Tree View] button on the individual page, or check “Open in
Tree View” on the search page, and you’ll see a display like this: -
Click any arrow and the tree view is shifted to the selected
person. Click any link and the relevant
individual page is opened. Colour
coding is used to indicate “foreign” records (from another GED) that have been
linked into your tree.
Regards,
Robert Barnes,
NZGDB Developer