Technical Project Brief
gene-alogy.net is a custom-built genealogical research platform. It was built from the ground up; it is not a template, CMS, or hosted service. I created this project to track my genealogical research and to incorporate visualization tools for my DNA research that, unfortunately, are not offered elsewhere. Genetic genealogy deals with massive datasets, so the functional focus of the project is an efficient, dynamic set of tools for evaluating match data and combining matches from numerous platforms. Family members also occasionally ask to view my research, but none of the popular platforms creates a shareable view, let alone combines traditional genealogy research with genetic data. This page describes the main aspects of the project's architecture, its database design, and the tools it affords me, which have become essential to my research workflow.
Numbers above are live counts queried from the database on each page load.
Technology Stack
Dynamic Page Architecture
No page on this site is individually hand-coded (apart from this one, ironically). Every page, from the ancestor profiles to the DNA chromosome visualizations, is instead rendered dynamically by querying the database and assembling the output on page load. A shared include chain handles authentication, theming, and the database connection before any page-specific logic runs. Adding a new ancestor, uploading a source document, or importing a DNA match automatically propagates to every page that uses that data, with no manual page updates.
auth_system.php → db_connect.php → DB queries → HTML
Theming & Responsive Design
The site supports multiple switchable visual themes, which I enjoy changing from time to time; at the moment I simply have a light and a dark version of the current design. A visitor's preference persists across sessions by way of cookies. Both themes are fully responsive. A sticky mini-header appears on scroll, rebuilt dynamically from the main navigation without duplicating markup. An AJAX-powered search autocomplete in the sidebar queries the database as you type, returning matching ancestors with their birth and death years.
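As a rough illustration of cookie-based theme persistence, the sketch below validates the stored preference against a whitelist before using it. The cookie name `theme`, the constants, and both function names are assumptions made for this example, not the site's actual code.

```php
<?php
// Hypothetical whitelist; the text above only says there is a light and a dark theme.
const ALLOWED_THEMES = ['light', 'dark'];
const DEFAULT_THEME  = 'light';

/**
 * Pick the theme for this request from the cookie array (normally $_COOKIE).
 * Unknown or missing values fall back to the default, so a tampered cookie
 * cannot select an arbitrary stylesheet.
 */
function resolveTheme(array $cookies): string
{
    $requested = $cookies['theme'] ?? DEFAULT_THEME;
    return in_array($requested, ALLOWED_THEMES, true) ? $requested : DEFAULT_THEME;
}

/**
 * Build the options array for setcookie() so the choice persists for one year.
 */
function themeCookieOptions(int $now): array
{
    return [
        'expires'  => $now + 365 * 24 * 60 * 60,
        'path'     => '/',
        'samesite' => 'Lax',
    ];
}

// Usage inside a page, before any output is sent:
//   $theme = resolveTheme($_COOKIE);
//   setcookie('theme', $theme, themeCookieOptions(time()));
```

Checking the value against a fixed list means the cookie only ever selects between known themes, whatever a visitor sends.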
Profile Pages
A single profile.php handles every person in the database. It dynamically
assembles its content from whichever tables contain data for the queried individual: birth/death
records, parents, spouses, and children from the genealogy tables; military service,
religious affiliation, colonial records, enslavement data, and census references from
the life event tables; DNA cluster assignments from the segment tables; source documents
from the filesystem; photos from a naming-convention-based directory scan. If a table
has no row for that person, that section simply doesn't appear — no conditionals
scattered across hundreds of hand-coded pages.
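The "no row, no section" behavior can be sketched as a single loop over whatever data was found. This is a minimal illustration, not the actual profile.php: the function name, the section headings, and the sample rows are all hypothetical, and in practice each list would come from a database query.

```php
<?php
/**
 * Render only the sections that have data. $sections maps a heading to the
 * rows fetched for one person; an empty list means the table had no match.
 */
function renderProfileSections(array $sections): string
{
    $html = '';
    foreach ($sections as $heading => $rows) {
        if ($rows === []) {
            continue; // no rows for this person: the section is omitted
        }
        $html .= '<section><h2>' . htmlspecialchars((string) $heading, ENT_QUOTES, 'UTF-8') . '</h2><ul>';
        foreach ($rows as $row) {
            // Escape every value on output, as the site does for all rendered text.
            $html .= '<li>' . htmlspecialchars((string) $row, ENT_QUOTES, 'UTF-8') . '</li>';
        }
        $html .= "</ul></section>\n";
    }
    return $html;
}

// Hypothetical data for one person.
$sections = [
    'Military Service'      => ['Example regiment, 1862-1865'],
    'Religious Affiliation' => [],
    'Census References'     => ['1850 <federal> census'],
];
echo renderProfileSections($sections);
```

Because the check lives in one loop, adding a new kind of record means adding one entry to the array rather than a new conditional on every page.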
Each profile also renders a life-event timeline and a Leaflet map plotting birth,
death, and residence locations as geo-coordinates, loaded from a separate
timeline_data.php endpoint via fetch. I enter the geo-coordinates myself because of how often I have encountered inaccurate location data, especially for historical places that do not always correspond to locations on modern maps. This way I can ensure the accuracy of the location data used in the timeline and map. Individuals in the database can be located not only via the live search but also through a variety of sorting pages listed in the sidebar.
Military service is drawn from the warfare table, photos are matched by filename convention, and the
timeline is built from life-event records and rendered with Leaflet.
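An endpoint like timeline_data.php might shape its response along the following lines. This is only a sketch under assumptions: the function name, the field names (`type`, `year`, `place`, `lat`, `lng`), and the sample rows are invented for illustration, and the real endpoint may be organized quite differently.

```php
<?php
/**
 * Turn life-event rows into JSON that a timeline and a map could consume.
 * Events are sorted by year; only events with both coordinates become markers.
 */
function buildTimelinePayload(array $events): string
{
    usort($events, fn(array $a, array $b): int => $a['year'] <=> $b['year']);

    $markers = [];
    foreach ($events as $event) {
        // isset() is false for null, so events without coordinates are skipped.
        if (isset($event['lat'], $event['lng'])) {
            $markers[] = [
                'label' => $event['type'] . ' (' . $event['year'] . ')',
                'lat'   => (float) $event['lat'],
                'lng'   => (float) $event['lng'],
            ];
        }
    }

    return json_encode(['events' => $events, 'markers' => $markers], JSON_THROW_ON_ERROR);
}

// Hypothetical rows; on the site these would come from the life-event tables.
$rows = [
    ['type' => 'death',     'year' => 1902, 'place' => 'Place B', 'lat' => 40.0, 'lng' => -75.0],
    ['type' => 'birth',     'year' => 1830, 'place' => 'Place A', 'lat' => 52.5, 'lng' => 13.4],
    ['type' => 'residence', 'year' => 1860, 'place' => 'Place C', 'lat' => null, 'lng' => null],
];
$payload = buildTimelinePayload($rows);

// In an endpoint this would be followed by:
//   header('Content-Type: application/json');
//   echo $payload;
```

An event with no coordinates still appears in the timeline but is left off the map, which suits hand-entered coordinates that may not exist yet for every record.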
Primary Source Documents
Source files (scans, PDFs, transcriptions) are stored in a sources/
directory and associated with individuals by a filename convention. Profile pages
scan the directory at page load and display matching documents automatically without requiring
any manual linking of new sources. For significant, difficult-to-read, or non-English records,
transcriptions are input directly into the ancestor's JSON notes file and rendered
inline on the profile.
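The exact filename convention is not spelled out here, so the sketch below assumes a hypothetical one, `<ancnum>_<description>.<ext>`, purely to show how a directory listing could be filtered per person. The function name and the sample filenames are likewise made up.

```php
<?php
/**
 * Return the files in $filenames that belong to person $ancnum under the
 * assumed "<ancnum>_<description>.<ext>" convention, sorted by name.
 */
function matchSources(int $ancnum, array $filenames): array
{
    // The underscore after the number keeps 104 from matching 1042's files.
    $pattern = '/^' . $ancnum . '_[^\/]+\.(pdf|jpe?g|png|txt)$/i';
    $matches = array_values(array_filter(
        $filenames,
        fn(string $name): bool => preg_match($pattern, $name) === 1
    ));
    sort($matches, SORT_STRING);
    return $matches;
}

// On a real page the list could come from scandir('sources/').
$files = ['104_will.pdf', '1042_birth_record.pdf', '1042_census_1850.jpg', 'readme.txt'];
print_r(matchSources(1042, $files));
```

With a convention like this, dropping a correctly named file into the directory is the only step needed for it to show up on the matching profile.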
Photo Restoration & Colorization
Many profile photos are digitally restored and colorized. Restoration is done manually in Photoshop; AI enhancement is applied selectively after sufficient manual cleanup. Colorization uses knowledge of the subject's ethnic background, social context, and any available descriptive records to estimate accurate skin, hair, and clothing tones. A modal disclaimer on every page with restored photos explains the methodology and links to the original scans in the sources section.
Database Design
The database is organized into five functional groups. Genealogical relationships, DNA evidence, historical context, and life events are linked by shared identifier keys but stored independently — so the research dataset can grow in any direction without restructuring existing tables.
matches
segment_matches
chromosome_map
chromosomes
clusters
snps
snp_data
snp_annotations
snp_notes
relatives
associates
ethnicities
census_nonfederal
census_national_
benchmarks
census_vocab
residence
places
warfare
colonists
jobs
faiths
enslaved
enslavers
The separation between ancestors, relatives, and
associates reflects a research distinction: direct-line ancestors,
collateral relatives, and associated individuals (neighbors, witnesses, enslaved persons)
who appear in documents but whose exact relationship may not yet be established.
All three are queryable together through UNION queries wherever a full
person lookup is needed — for example, fetching a parent or spouse by ancnum
regardless of which table they're in.
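A lookup of this kind could be sketched as follows. The three table names and ancnum come from the description above; the `name` column, the `source_table` tag, and both function names are assumptions for illustration. UNION ALL is used here as one variant of the UNION approach.

```php
<?php
// The three person tables named above.
const PERSON_TABLES = ['ancestors', 'relatives', 'associates'];

/**
 * Build one SELECT per table and join them with UNION ALL. Table names come
 * from the fixed list above, never from user input; the `name` column and
 * the `source_table` tag are assumed for this sketch.
 */
function personLookupSql(): string
{
    return implode("\nUNION ALL\n", array_map(
        fn(string $table): string =>
            "SELECT ancnum, name, '$table' AS source_table FROM $table WHERE ancnum = ?",
        PERSON_TABLES
    ));
}

/**
 * Fetch a person by ancnum regardless of which table holds the row.
 * Returns null when no table has a match.
 */
function findPerson(PDO $pdo, int $ancnum): ?array
{
    $stmt = $pdo->prepare(personLookupSql());
    // One positional placeholder per SELECT, so the id is bound three times.
    $stmt->execute([$ancnum, $ancnum, $ancnum]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

Tagging each row with its source table lets the caller tell a direct-line ancestor from a collateral relative or associate without a second query.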
DNA Research Tools
As stated above, the primary purpose of this project is its set of custom-built DNA analysis tools, developed to support research workflows that commercial testing platforms and third-party tools don't offer together in a single integrated environment, if at all. The visualizations and integrations are rendered in real time from the database using the HTML5 Canvas API rather than from pre-generated images. This allows for dynamic updates: when newer, more precise data becomes available, a simple database insert is all that is needed.
[...] --refresh flag is passed.
[...] snp_data and snps, associated with the tested individual in the individuals table.
[...] matches and segment_matches with chromosome positions, centimorgan values, SNP counts, and maternal/paternal side assignments.
[...] ethnicities table and written to chromosome_map. Each ethnicity record stores an associated GeoJSON polygon used to paint the ancestry map.
[...] ancnum linking directly to a named individual in the genealogy tables.
Security & Privacy
Because the site contains DNA data belonging to other people — matches who have not chosen to make their information public — a tiered access system controls what any given visitor can see. The obfuscation below is live, not illustrative: this is what a DNA match name actually looks like to a non-logged-in visitor.
Implementation
All database queries use prepared statements (via PDO and MySQLi) to prevent SQL injection. All output rendered to the browser passes through htmlspecialchars() to prevent cross-site scripting. DNA match names and email addresses are run through a server-side obfuscation function before being serialized to the JavaScript data payload for unauthenticated sessions — the data never leaves the server in readable form, not merely hidden client-side. Individuals born after a threshold year with no recorded death date are automatically treated as potentially living, and their complete profile data is withheld from unauthenticated requests regardless of how the URL is constructed.
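The server-side masking and the living-person rule could look roughly like the sketch below. Everything specific in it is an assumption: the threshold year, the masking format, the fixed email placeholder, the field names, and the function names are invented for illustration and are not the site's actual implementation.

```php
<?php
// Hypothetical cutoff; the text above only says "a threshold year".
const LIVING_THRESHOLD_YEAR = 1925;

/** Keep only the first character of each word, e.g. "Jane Doe" -> "J*** D***". */
function obfuscateName(string $name): string
{
    $parts = preg_split('/\s+/', trim($name), -1, PREG_SPLIT_NO_EMPTY);
    $masked = array_map(
        // A fixed-width mask avoids leaking the length of the real name.
        fn(string $part): string => (preg_match('/^./us', $part, $m) === 1 ? $m[0] : '') . '***',
        $parts
    );
    return implode(' ', $masked);
}

/** Born after the threshold with no recorded death date: treat as possibly living. */
function isPotentiallyLiving(?int $birthYear, ?int $deathYear): bool
{
    return $deathYear === null && $birthYear !== null && $birthYear > LIVING_THRESHOLD_YEAR;
}

/** Serialize matches for the browser, masking names and emails unless logged in. */
function buildMatchPayload(array $matches, bool $authenticated): string
{
    if (!$authenticated) {
        $matches = array_map(function (array $match): array {
            $match['name']  = obfuscateName($match['name']);
            $match['email'] = '***'; // assumption: a fixed placeholder for this sketch
            return $match;
        }, $matches);
    }
    return json_encode($matches, JSON_THROW_ON_ERROR);
}

// Hypothetical match row.
$rows = [['name' => 'Jane Doe', 'email' => 'jane@example.com', 'cm' => 42.5]];
echo buildMatchPayload($rows, false), "\n";
```

Because the masking happens before json_encode, the readable values are never present in the response body for an unauthenticated session, which is the property the paragraph above describes.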
Research Purpose
The site exists primarily as a research tool. The goal is to connect every DNA match to a named ancestor, verify those connections across multiple tested individuals in the same family, and build a fully cited, cross-referenced record of family history spanning fourteen countries and several centuries. The public-facing side makes that research available to family members and others with shared ancestry, while keeping sensitive data (matches, source documents, and living relatives) accessible only where appropriate.
Every structural decision in the schema — the separation of ancestors
from relatives, the MRCA foreign key on segment matches, the per-individual
chromosome maps, the GeoJSON stored on ethnicity records — exists because the research
required it. The architecture follows the research, not the other way around.