Using Software to
Analyse Surname Distribution
Andrew Young
Why analyse the
distribution of surnames? In fact, why analyse
the distribution of anything? Well, as the author
of the BIRDIE software says, when you're
researching the Smith family you need a way to
sort through that haystack and find the needle
hidden somewhere in the middle.
Analysing surname
distribution allows the family historian and in
particular those conducting one-name studies to
plot the occurrences of events and thereby get
a clearer picture of a family's location at
different periods in the past. This can show
migration patterns for example, which might never
have been noticed in a set of written data.
Further, with the latest onslaught of
genealogical CD-ROMs and web-based search
engines, it is becoming even more important to
avoid the perils of that overused phrase, 'information
overload'.
So how can we use a
computer to bring our information alive? One such
program, which I will examine here, is BIRDIE 2.0
(British Isles Regional Display of IGI Extracts),
which will plot data onto maps of the British
Isles and then allow the family historian to
examine counties on a parish-by-parish level.
Although originally written to handle data from
the IGI, it now accepts data in several different
formats. The product is satisfyingly compact,
being distributed on three floppy disks and
installing quickly and easily. Computer system
requirements are Windows 95 or later, 16 Mbytes
of memory, 5 Mbytes of hard disk space and an 800x600
display capable of 256 colours.
On first using the
program, it is not instantly obvious how to pass
data into it and how to produce some maps from it.
However, a quick tour through the help system
soon reveals the correct instructions.
Information can be entered from either a GEDCOM (Genealogical
Data Communications) file or a CSV (Comma-Separated
Values) file. I wanted to import some records from
the LDS 1881 census disks, but unfortunately found
no way of generating either type of file from the
census viewing utility. An
awkward way around this problem is to copy the
census records into a spreadsheet program (such
as Microsoft Excel), strip out all the
unnecessary data and then save the file in the
CSV format, ready for importing into BIRDIE. The
import involves telling the program which
fields are the Name, Parish, County etc.
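For readers comfortable with a little scripting, the spreadsheet step can also be sketched in Python. This is only an illustration of stripping a CSV file down to the fields BIRDIE asks about. The column headings (Name, Parish, County and the extra ones) and the sample rows are assumptions made up for the example, not the layout produced by the census disks, and the trim_csv function is hypothetical.

```python
import csv
import io

# Hypothetical column names: the real headings depend on how the census
# records were pasted into the spreadsheet.
KEEP = ["Name", "Parish", "County"]


def trim_csv(source, target, keep=KEEP):
    """Copy only the wanted columns from one CSV stream to another.

    Returns the number of data rows written.
    """
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=keep, lineterminator="\n")
    writer.writeheader()
    rows = 0
    for row in reader:
        # Skip rows with no name, which often appear after a
        # copy-and-paste from another program.
        if not (row.get("Name") or "").strip():
            continue
        writer.writerow({field: (row.get(field) or "").strip() for field in keep})
        rows += 1
    return rows


# Made-up sample rows, purely to show the shape of the data.
sample = io.StringIO(
    "Name,Age,Occupation,Parish,County\n"
    "John Goddard,34,Labourer,Reading St. Mary,Berkshire\n"
    ",,,,\n"
    "Ann Smith,29,,Newbury,Berkshire\n"
)
output = io.StringIO()
kept = trim_csv(sample, output)
print(output.getvalue())
```

With real data, the two StringIO objects would be replaced by files opened with open(..., newline=""), and the trimmed file could then be imported in the usual way.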
BIRDIE can now generate
a British Isles map with each county shaded a
different colour depending on the number of occurrences
in that particular county. For example, in Figure
1, the map shows the distribution of the 7,906
records imported. The counties shaded the darkest
(Berkshire and Wiltshire) are mentioned most in
the data and the counties shaded lightest are
mentioned the least. To be able to zoom in to
county level and examine each individual parish
in a similar way requires the desired counties to
be obtained from the BIRDIE producer. This then
allows maps such as Figure 2 to be produced.

Figure 1

Figure 2
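The idea behind a county map of this kind can be sketched in a few lines of Python: count how often each county is mentioned, then rank the counties from most to least mentioned. This is not BIRDIE's own method, which is not described here. The function names and the three sample records are invented for illustration.

```python
from collections import Counter


def county_counts(records):
    """Count how many records mention each county."""
    return Counter(county for _name, _parish, county in records)


def shade_order(counts):
    """Return counties from most mentioned (darkest) to least (lightest)."""
    return [county for county, _n in counts.most_common()]


# Invented records in (name, parish, county) form.
records = [
    ("John Goddard", "Reading St. Mary", "Berkshire"),
    ("Ann Goddard", "Reading St. Giles", "Berkshire"),
    ("Thomas Goddard", "Swindon", "Wiltshire"),
]

counts = county_counts(records)
for county in shade_order(counts):
    print(county, counts[county])
```

The same counting works at parish level by tallying the second field instead of the third.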
We've now seen why we
might need to analyse the distribution of our
ancestors and a brief explanation of how to use
one software tool to help us with this. We can
now put this to use in analysing some surnames in
1880s Berkshire. As mentioned before, we will
use data from the LDS 1881 census CD-ROMs and
adapt the data format in Excel before importing
it into BIRDIE.
We shall start with a
relatively common name - Goddard.
Extracting this 1881
census data and plotting it with BIRDIE produced
Figure 3. The parishes with the largest number of
occurrences are Reading St. Mary and Reading St.
Giles. Additionally, the south-east of the county
has a greater number of Goddards than the west
and north, so this could be a useful indicator of
where to focus research efforts.

Figure 3
However, the true
benefit of programs such as BIRDIE becomes
apparent when dealing with the Smith surname.
Here we see approximately 2,200 records in
Berkshire and a distribution map as in Figure 4.
From a map such as this, it will be very
difficult to draw any conclusions, other than
that Smith is a very popular surname.

Figure 4
Using the facilities of BIRDIE to raise the
thresholds for each shading, we are able
to generate Figure 5, in which many more records
are required for a parish to be shaded in each
particular colour. Consequently, we can now see
that the Smith surname predominates in just a
handful of parishes.

Figure 5
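The effect of changing the shading thresholds can be shown with a small Python sketch. Each parish is placed in a band according to how many thresholds its count reaches. When the thresholds are set higher, more records are needed for a darker shade, so only the largest concentrations stand out. The parish names, counts and threshold values below are invented for illustration and are not taken from the Smith data or from BIRDIE itself.

```python
from bisect import bisect_right


def shade_band(count, thresholds):
    """Return 0 for the lightest band, up to len(thresholds) for the darkest.

    A parish moves up one band for each threshold its count reaches.
    The thresholds must be in ascending order.
    """
    return bisect_right(thresholds, count)


# Invented parish counts, purely for illustration.
parish_counts = {"Parish A": 120, "Parish B": 45, "Parish C": 12, "Parish D": 8}

low = [1, 5, 10]      # almost every parish reaches the darkest band
high = [10, 50, 100]  # only the largest concentration stays dark

low_bands = {p: shade_band(n, low) for p, n in parish_counts.items()}
high_bands = {p: shade_band(n, high) for p, n in parish_counts.items()}

print(low_bands)
print(high_bands)
```

With the low thresholds three of the four parishes share the darkest band, which is the Figure 4 problem in miniature. With the high thresholds only Parish A remains in it.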
Our next example maps
the Lawrence surname and variants. Figure 6 shows
evidence of Lawrences heading towards the urban
areas of Abingdon, Wantage, Newbury, Reading and
Windsor, with other occurrences filling in the
intervening parishes. This is also the third map
we have seen with most of the entries in the
south of the county and few entries in the north
and more rural areas. This might seem obvious, but
we sometimes need a visual aid to explain the
obvious. Would you have been able to conclude
this from a set of written records?

Figure 6
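Gathering a surname together with its variants before mapping can be sketched as a simple filter. The variant spellings listed here are illustrative assumptions only, not the set used for Figure 6, and the sketch assumes each name is written forename first with the surname as the last word. Both would need adjusting to suit real records.

```python
# Illustrative variant spellings only; a real study would build this
# list from its own records.
VARIANTS = {"lawrence", "laurence", "lawrance"}


def surname_of(full_name):
    """Take the last word of a name as the surname, lower-cased."""
    parts = full_name.strip().split()
    return parts[-1].lower() if parts else ""


def matches_variant(full_name, variants=VARIANTS):
    """Report whether the surname is one of the chosen spellings."""
    return surname_of(full_name) in variants


# Invented names, purely for illustration.
names = ["Mary Lawrence", "Henry LAURENCE", "Ann Smith", ""]
selected = [n for n in names if matches_variant(n)]
print(selected)
```

Comparing in lower case means that differences in capitalisation between sources do not cause a record to be missed.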
Lastly, I was interested
to discover whether people with a name such as
Reading, Newbury or Wantage really do tend to
originate from those particular towns. Plotting
data for the surname Windsor revealed that, as
shown in Figure 7, the largest concentration in
Berkshire is indeed in the Windsor area. This is
perhaps a superficial example, but helps to
illustrate the uses and conclusions which can be
drawn from such a simple exercise as drawing a
map of historical events.

Figure 7
Products such as
BIRDIE can be an aid to many genealogists,
helping them to focus their research or to
discover the migration of their ancestors'
families over the centuries. However, as I found
with the examples I used in this article, it can
take some time to try to understand the
resulting maps in order to draw some conclusions
from them. Analysing surname distribution is by
no means the ultimate solution in finding elusive
records but should go some way towards helping,
along with providing some useful and interesting maps along the
way.
BIRDIE is produced
by Drake Software Associates. See www.drake-software.co.uk/
Andrew
Young is currently studying for a degree in
Computer Science at the University of
Southampton. Along with maintaining web sites
for three organisations and running a growing
mailing list, Andrew also writes for the
Computer Section of Family Tree Magazine. He
has been involved in the research of his
family history for over ten years and more
recently in his mother's One-Name Study. E-mail:
awy@tylehurst.demon.co.uk