BerksFHS Berkshire Family Historian
March 2000

Using Software to Analyse Surname Distribution

Andrew Young

Why analyse the distribution of surnames? In fact, why analyse the distribution of anything? Well, as the author of the BIRDIE software says, when you're researching the Smith family you need a way to sort through that haystack and find the needle hidden somewhere in the middle.

Analysing surname distribution allows the family historian, and in particular those conducting one-name studies, to plot the occurrences of events and thereby get a clearer picture of a family's location at different periods in the past. This can reveal migration patterns, for example, which might never have been noticed in a set of written data. Further, with the latest onslaught of genealogical CD-ROMs and web-based search engines, it is becoming even more important to avoid the perils of that over-used phrase 'information overload'.

So how can we use a computer to bring our information alive? One such program, which I will examine here, is BIRDIE 2.0 (British Isles Regional Display of IGI Extracts), which will plot data onto maps of the British Isles and then allow the family historian to examine counties on a parish-by-parish level. Although originally written to handle data from the IGI, it now accepts data in several different formats. The product is satisfyingly compact: it is distributed on three floppy disks and installs quickly and easily. Computer system requirements are Windows 95 or later, 16 Mbytes of memory, 5 Mbytes of hard disk space and an 800x600 display capable of 256 colours.

On first using the program, it is not instantly obvious how to pass data into it and how to produce maps from it. However, a quick tour through the help system soon reveals the correct instructions. Information can be entered from either a GEDCOM (Genealogical Data Communications) file or a CSV (Comma Separated Value) file. Unfortunately, I wanted to import some records from the LDS 1881 census disks, but found no way of generating either type of file from the census viewing utility. An awkward way around this problem is to copy the census records into a spreadsheet program (such as Microsoft Excel), strip out all the unnecessary data and then save the file in the CSV format, ready for importing into BIRDIE. The import involves telling the program which fields are the Name, Parish, County, etc.
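The clean-up step described above could also be done with a short script rather than by hand in a spreadsheet. The following is only a minimal sketch in Python: the column names (`Name`, `Age`, `Occupation`, `Parish`, `County`) and the sample rows are invented for illustration, and a real census extract may be laid out quite differently.

```python
import csv
import io

# Hypothetical column names; a real census extract may label them differently.
KEEP = ["Name", "Parish", "County"]

def reduce_to_fields(raw_csv_text, keep=KEEP):
    """Return CSV text containing only the columns named in `keep`."""
    reader = csv.DictReader(io.StringIO(raw_csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=keep, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Skip rows with no name, e.g. empty rows left over from pasting.
        if not (row.get("Name") or "").strip():
            continue
        writer.writerow({field: (row.get(field) or "").strip() for field in keep})
    return out.getvalue()

# Made-up sample rows, purely to show the shape of the data.
sample = (
    "Name,Age,Occupation,Parish,County\n"
    "John Goddard,42,Labourer,Reading St. Mary,Berkshire\n"
    ",,,,\n"
    "Mary Smith,30,Servant,Newbury,Berkshire\n"
)
cleaned = reduce_to_fields(sample)
print(cleaned)
```

The result keeps only the three fields that the import step asks about, so the remaining work is to tell the program which column is which.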

BIRDIE can now generate a British Isles map with each county shaded a different colour depending on the number of occurrences in that particular county. For example, the map in Figure 1 shows the distribution of the 7,906 records imported. The counties shaded darkest (Berkshire and Wiltshire) are mentioned most in the data and the counties shaded lightest are mentioned least. Zooming in to county level and examining each individual parish in a similar way requires data for the desired counties to be obtained from the BIRDIE producer. This then allows maps such as Figure 2 to be produced.

Figure 1

Figure 2
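The idea behind the county shading can be sketched in a few lines of Python. This is not how BIRDIE itself works internally, which the article does not describe; the equal-width banding scheme, the function names and the sample records are all assumptions made for illustration.

```python
from collections import Counter

def count_by_county(records):
    """Count how many records mention each county."""
    return Counter(county for _name, _parish, county in records)

def shade_band(count, max_count, bands=4):
    """Map a count to a band from 0 (lightest) to bands - 1 (darkest)."""
    if count <= 0 or max_count <= 0:
        return 0
    # Equal-width bands relative to the busiest county (an assumed scheme).
    return min(bands - 1, (count * bands) // max_count)

# Made-up records in (name, parish, county) form.
records = [
    ("Goddard", "Reading St. Mary", "Berkshire"),
    ("Goddard", "Reading St. Giles", "Berkshire"),
    ("Goddard", "Newbury", "Berkshire"),
    ("Goddard", "Wantage", "Berkshire"),
    ("Goddard", "Swindon", "Wiltshire"),
    ("Goddard", "Marlborough", "Wiltshire"),
    ("Goddard", "Andover", "Hampshire"),
]

counts = count_by_county(records)
busiest = max(counts.values())
shades = {county: shade_band(n, busiest) for county, n in counts.items()}
print(shades)
```

Counties with more records fall into a higher band, which a mapping program would then draw in a darker colour.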

We've now seen why we might need to analyse the distribution of our ancestors, along with a brief explanation of how to use one software tool to help us with this. We can now put this to use in analysing some surnames in 1880s Berkshire. As mentioned before, we will use data from the LDS 1881 census CD-ROMs and adapt the data format in Excel before importing it into BIRDIE.

We shall start with a relatively common name - Goddard.

Extracting this 1881 census data and plotting it with BIRDIE produced Figure 3. The parishes with the largest number of occurrences are Reading St. Mary and Reading St. Giles. Additionally, the south-east of the county has a greater number of Goddards than the west and north, so this could be a useful indicator of where to focus research efforts.

Figure 3

However, the true benefit of programs such as BIRDIE becomes apparent when dealing with the Smith surname. Here we see approximately 2,200 records in Berkshire and a distribution map as in Figure 4. From a map such as this, it is very difficult to draw any conclusions, other than that Smith is a very popular surname.

Figure 4

Using the facilities of BIRDIE to adjust the thresholds for each shading, we are able to generate Figure 5, in which many more records are required for a parish to be shaded in each particular colour. Consequently, we can now see that the Smith surname predominates in just a handful of parishes.

Figure 5
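The effect of changing the shading thresholds can be illustrated with a small Python sketch. The parish names, counts and threshold values below are invented, and the rule used here (a parish gets one shade step for every threshold its count reaches) is an assumption rather than BIRDIE's documented behaviour.

```python
from bisect import bisect_right

def shade_for(count, thresholds):
    """Return the shade index: the number of thresholds that `count` reaches."""
    return bisect_right(thresholds, count)

# Made-up counts of records per parish for a very common surname.
parish_counts = {"Parish A": 12, "Parish B": 45, "Parish C": 160, "Parish D": 300}

low_thresholds = [1, 5, 10]       # nearly every parish reaches the top shade
high_thresholds = [50, 150, 250]  # only the busiest parishes reach the top shade

low = {p: shade_for(n, low_thresholds) for p, n in parish_counts.items()}
high = {p: shade_for(n, high_thresholds) for p, n in parish_counts.items()}

print(low)
print(high)
```

With the low thresholds every parish receives the same darkest shade, so the map tells us nothing. With the high thresholds the parishes separate into distinct shades and the heaviest concentrations stand out.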

Our next example maps the Lawrence surname and its variants. Figure 6 shows evidence of Lawrences heading towards the urban areas of Abingdon, Wantage, Newbury, Reading and Windsor, with other occurrences filling in the intervening parishes. This is also the third map we have seen with most of the entries in the south of the county and few in the more rural north. This might seem obvious, but we sometimes need a visual aid to explain the obvious. Would you have been able to conclude this from a set of written records?

Figure 6

Lastly, I was interested to discover whether people with a name such as Reading, Newbury or Wantage really do tend to originate from those particular towns. Plotting data for the surname Windsor revealed that, as shown in Figure 7, the largest concentration in Berkshire is indeed in the Windsor area. This is perhaps a superficial example, but it helps to illustrate the uses of such a simple exercise as drawing a map of historical events, and the conclusions which can be drawn from it.

Figure 7

Products such as BIRDIE can be an aid to many genealogists, helping them to focus their research or to discover the migration of their ancestors' families over the centuries. However, as I found with the examples I used in this article, it can take some time to understand the resulting maps and draw conclusions from them. Analysing surname distribution is by no means the ultimate solution for finding elusive records, but it should go some way towards helping, and it provides some useful and interesting maps along the way.

BIRDIE is produced by Drake Software Associates. See

Andrew Young is currently studying for a degree in Computer Science at the University of Southampton. Along with maintaining web sites for three organisations and running a growing mailing list, Andrew also writes for the Computer Section of Family Tree Magazine. He has been involved in the research of his family history for over ten years and more recently in his mother's One-Name Study. E-mail:

Web-page produced by DandyLion Services
© Berkshire Family History Society 2001

updated 20th August 2001