This tool automatically creates pedigree trees based on segment matches from a set of autosomal files. It also shows how a matching segment from a seemingly unrelated genetic match connects to the tree, and through whom.
Project Page: Autosomal Pedigree Creator
Getting Started
Folder Structure
Download the Autosomal Pedigree Creator.zip file from the website. It is usually less than 1 MB, and extracting it gives you the following files and folders.
- bin – contains the bare minimum Graphviz binaries required to convert a .gv (DOT) file to a PNG image file.
- data – intermediate folder
- ibd – intermediate folder
- tmp – intermediate folder
- Autosomal Pedigree Creator.exe – Executable
- README.txt – Readme file giving a quick overview of the software just in case you haven’t looked at the website.
Kit Preparation
Before you use this tool, some basic preparation is needed: rename the autosomal files to human-readable filenames. Please don’t change the file extensions, and please use only letters in the names.
E.g.,
- 264652-autosomal-o37-results.csv.gz can be renamed to Felix.gz
- 264652-autosomal-o37-results.csv can be renamed to Felix
- genome_v3_Full_20131006120000.zip can be renamed to Felix.zip
Once renamed, place all the renamed kits into a folder. This folder will be selected from the interface.
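The naming rule above can be checked before starting a run. The following is a minimal sketch, not part of the tool: the function names are hypothetical, the allowed extensions are taken only from the examples above (.gz, .zip, or none), and "letters" is assumed to mean ASCII letters.

```python
from pathlib import Path
from typing import List

# Assumption: only the extensions shown in the examples above are expected.
ALLOWED_SUFFIXES = {"", ".gz", ".zip"}


def is_valid_kit_name(filename: str) -> bool:
    """Return True if the name is ASCII letters only, with an allowed extension."""
    path = Path(filename)
    stem_ok = path.stem.isascii() and path.stem.isalpha()
    return stem_ok and path.suffix in ALLOWED_SUFFIXES


def invalid_kits(folder: str) -> List[str]:
    """List the files in the kit folder whose names still need renaming."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and not is_valid_kit_name(p.name)
    )
```

Calling invalid_kits on the kit folder returns the names that do not yet follow the rule; an empty list means every file passed this check.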
User Interface
The tool is largely self-explanatory. The brief steps are below.
- Click Browse and select the folder where you placed all the prepared kits.
- Dump All – This option is only required when you have kits totally unrelated to each other and you want to dump every possible segment connection.
- Click Start and the process begins. It can run for a few minutes to several hours, depending on the number of autosomal DNA files.
Execution
The process sometimes runs for several hours. The progress may seem to get stuck at 15% and then at 75%. It is not really stuck: it is trying to extract as much information as possible in order to construct the tree, and it does not know in advance how far it has to go. To speed things up, the comparisons run in parallel, with as many running at once as there are processors in your computer.
Pedigree Output
When the process finishes, a PNG file called pedigree.png, which contains the tree, opens automatically. If for some reason the PNG file does not open, you can always find it in the root folder of Autosomal Pedigree Creator.
The tool uses Graphviz to generate the PNG output from a .gv (DOT) file. The .gv file can be found inside the tmp folder as tree.gv.
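If you edit tree.gv by hand, the image can be regenerated with Graphviz. The following is a minimal sketch and not the tool's own code: it assumes a Graphviz dot executable is available on your PATH (the name of the binary shipped in the bin folder is not documented here), and the function names are hypothetical.

```python
import subprocess


def dot_command(gv_file, png_file, dot_exe="dot"):
    """Build the Graphviz command line that renders a DOT file as PNG."""
    # -Tpng selects PNG output; -o names the output file.
    return [str(dot_exe), "-Tpng", str(gv_file), "-o", str(png_file)]


def render_png(gv_file, png_file, dot_exe="dot"):
    """Run Graphviz; raises CalledProcessError if dot reports a failure."""
    subprocess.run(dot_command(gv_file, png_file, dot_exe), check=True)
```

For example, render_png("tmp/tree.gv", "pedigree_edited.png") writes a new image without overwriting the original pedigree.png.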
Tracing the Connection
If you want to check a connection between two common ancestors or two autosomal files, you can do so with the procedure below.
In the pedigree output, each line is a match, the terminals are autosomal files and the 4-letter ovals are common ancestors. The mapping between these 4-letter codes and what they mean can be found inside the tmp folder in the file common_ancestors.csv, which can be opened in Excel.
As mentioned, each arrow is a connection: a matching segment or a group of segments from a common ancestor.
XML Representation
The complete list of common ancestors and how they are related is present in the XML file atree.xml.
This file contains the common ancestor (CA) tags and the list of segments that match. Please note that all the sub nodes match all the segments at the parent level. Even though the root element is ADAM-EVE, its sub nodes are not automatically connected to the root. The root element is there only for the sake of having a root element in the XML and is not reproduced in the pedigree tree.
The XML is generated from a text file ‘atree.txt’. The XML file is simply a hierarchical representation of the text file.
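The nesting rule described above (sub nodes inherit from their parent CA, but top-level nodes are not linked to the root) can be sketched in code. This is only an illustration under assumptions: the exact schema of atree.xml is not described here, so the sketch assumes that common ancestors are elements with the tag CA nested inside each other, and it returns whatever attributes they carry rather than guessing attribute names. The function name and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

Attrs = Dict[str, str]


def ca_links(xml_text: str) -> List[Tuple[Attrs, Attrs]]:
    """Return (parent, child) attribute pairs for nested CA elements."""
    links: List[Tuple[Attrs, Attrs]] = []

    def walk(node, parent_ca):
        for child in node:
            if child.tag == "CA":
                if parent_ca is not None:
                    links.append((dict(parent_ca.attrib), dict(child.attrib)))
                walk(child, child)
            else:
                # Other elements are passed through with the same parent.
                walk(child, parent_ca)

    # Children of the root start with no parent, so the root is never linked.
    walk(ET.fromstring(xml_text), None)
    return links


# Hypothetical sample; the real element and attribute names may differ.
SAMPLE = '<ROOT><CA id="AAAA"><CA id="BBBB"/></CA><CA id="CCCC"/></ROOT>'
```

With the sample above, only the AAAA to BBBB link is reported; AAAA and CCCC sit directly under the root and therefore get no parent.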
Matching Segments
All matching segments can be found inside the ‘ibd’ folder. Please note that ‘ibd’ is just a folder name; it does not automatically mean that the segments are free of recombination or are Identical By Descent. However, all matching segments inside the ‘ibd’ folder are compound segments.
A file named, say, Arulraj-Chandrakumar-Esther-SathiaGnanaraj means that the segment is common across the Arulraj, Chandrakumar, Esther and SathiaGnanaraj autosomal files, and it represents the common ancestor of the kits involved.
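The file naming convention above makes it easy to look up which segment files involve a given set of kits. The following is a minimal sketch with hypothetical function names. It assumes the file names look exactly like the example (kit names joined by hyphens, no extension) and that kit names contain only letters, as required during kit preparation.

```python
from typing import Iterable, List, Set


def kits_in_segment_file(name: str) -> List[str]:
    """Split an ibd file name such as 'A-B-C' into its kit names."""
    return name.split("-")


def files_sharing(names: Iterable[str], kits: Set[str]) -> List[str]:
    """Return the file names whose kit list includes every kit in `kits`."""
    return sorted(
        name for name in names
        if kits <= set(kits_in_segment_file(name))
    )
```

The names argument can be built from a directory listing of the ibd folder, for example with os.listdir("ibd").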
Output Interpretation
You might wonder why some common ancestors, represented as 4 characters in ovals, have only one descendant that is itself a common ancestor represented as 4 characters in an oval. The reason is that these intermediate common ancestors do have population data, or segments that match the individuals but do not match the parents. If you want to include all such matching segments from population data, you can enable the ‘Dump All’ option. However, be warned that ‘Dump All’ can create a cluttered pedigree, because every individual may match every common ancestor depending on how closely they are related.
The above output is close to correct, but it still requires some manual intervention and adjustment to get an accurate pedigree.
For the above pedigree, below are the true relations.
- Felix (self)
- Chandrakumar (Father)
- Selvarani (Mother)
- Sathia Gnanaraj (Paternal grandfather)
- Esther (Wife)
- Arulraj (Father in law)
There are no common ancestors between Felix and Chandrakumar (because Chandrakumar is my father). So the name VLXQ, shown as a common ancestor between myself and my father, is none other than my father himself. The same holds for all parent/child relations. It is not possible to automate this, because a computer can only say that a relation is parent/child; it cannot tell which of the two is the parent unless it has all the required surrounding data, which is not possible or feasible all the time. Changing the parent/child relations leads to the modified pedigree below.
As you can see, I can infer the following from the autosomal pedigree tree.
- My wife’s tree is a separate line.
- There are three individual common ancestors giving three lines.
- My parents are distant cousins.
Let me know if you find this tool useful, and what you found.