Genotypes of Global Populations


Project Data Download

 

Descriptions of Files

References for Data  (under construction)                                   

Download Data Files by Populations   (under construction)

Bulk Download of All Data

 

Descriptions of Files

 

The following description applies to each of the separate files. The population name is the file name.

 

Description: Each file is tab-delimited, so the columns will not line up as neatly as in the sample below when the file is opened in a plain text editor such as MS Notepad. If the file is opened in MS Excel and the Text to Columns function is applied, it should look like the sample below.

 

 

rs#         chrom  posNCBI36.1  GM02822  GM03190  GM03382  GM03725  GM05052  GM05817  GM06052
rs3782735   12     6755336      AG       AA       AA       AA       AA       AA       AA
rs10774447  12     6764709      CT       CT       TT       TT       CT       CT       CT
rs2855534   12     6768781      NN       NN       NN       NN       NN       NN       NN
rs2857234   12     6775441      CT       CC       CC       CT       TT       CT       CT
rs2707209   12     6775524      CT       TT       CT       CT       CC       CT       CT
rs4646985   12     6777644      NN       NN       NN       NN       NN       NN       NN
rs2255301   12     6779702      CC       TT       CT       CC       CC       CT       CC
rs7299900   12     6788092      AG       GG       GG       GG       NN       GG       AG
rs3213427   12     6799007      CT       CT       CT       TT       CT       CT       TT
rs2071081   12     6805891      NN       NN       NN       NN       NN       NN       NN
rs9606186   22     18300358     GG       CC       GC       CC       CC       CC       CC

 

 

Row:

a) Row 1: header line, defining the contents of each column.

b) Row 2 - end: data lines, containing site information and genotypes for individuals.

 

Column:

a) Column 1: dbSNP number, a standard reference for the molecular definition.

b) Column 2: chromosome number.

c) Column 3: physical location on the chromosome, measured in base pairs from the tip of the short arm, per NCBI build 36.1.

d) Column 4 - end: the genotype of each individual, identified by the sample ID in row 1 of that column.

   Genotypes are given using nucleotide symbols (A, T, C, G) where known. “NN” stands for missing data.
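The row and column layout described above can be read with a few lines of Python. This is only a minimal sketch, not a tool supplied by the project: the function name `read_population_file` and the choice to represent "NN" as `None` are illustrative assumptions, and the inline sample reuses two rows and two sample columns from the table above.

```python
import csv
import io

MISSING = "NN"  # missing-data code used in the files


def read_population_file(handle):
    """Parse one tab-delimited population file.

    Returns (sample_ids, records). Each record holds the rs number,
    chromosome, position, and a dict mapping sample ID to genotype.
    Missing genotypes ("NN") are stored as None (an assumption of this sketch).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)          # row 1: column definitions
    sample_ids = header[3:]        # column 4 onward: one column per individual
    records = []
    for row in reader:
        # Skip blank or whitespace-only lines, if any.
        if not row or not any(cell.strip() for cell in row):
            continue
        genotypes = {
            sample: (None if call == MISSING else call)
            for sample, call in zip(sample_ids, row[3:])
        }
        records.append({
            "rs": row[0],          # column 1: dbSNP number
            "chrom": row[1],       # column 2: chromosome
            "pos": int(row[2]),    # column 3: position in base pairs
            "genotypes": genotypes,
        })
    return sample_ids, records


# Two rows taken from the sample table, restricted to two individuals.
sample = (
    "rs#\tchrom\tposNCBI36.1\tGM02822\tGM03190\n"
    "rs3782735\t12\t6755336\tAG\tAA\n"
    "rs2855534\t12\t6768781\tNN\tNN\n"
)
sample_ids, records = read_population_file(io.StringIO(sample))
print(sample_ids)
print(records[0]["genotypes"])
```

For a downloaded file, the same function could be called on a handle opened with `open(path, newline="")`, where `path` is the location of the population file.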

  

References for Data (under construction)

 

Additional information on each polymorphism, including references, can be found in ALFRED by searching for the site by its rs# and following the appropriate links. Links to the population and sample descriptions and references are associated with the data in ALFRED.

 

Information on the human population samples can be found here. Genotype data for some populations cannot be released publicly.

 

 

Download Data Files by Populations (under construction)

 

Bulk Download of All Data

 

To download the data as a compressed file that expands to 40 individual population files, each containing data on 437 markers for that population, click here.

 

To download the data as a compressed file that expands to 2 files, chr.1-9 and chr.10-22, in an inverted format (individual samples by row, individual markers by column, with extra columns denoting population source), click here.
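To illustrate how the inverted format relates to the per-population files, the sketch below transposes marker-by-row data into sample-by-row data. The exact column names and order in the downloadable inverted files are not specified here, so the `sample` and `population` columns, the function name `invert`, and the population label "ExamplePop" are all assumptions made for illustration.

```python
def invert(sample_ids, marker_rows, population):
    """Transpose marker-by-row genotypes into sample-by-row form.

    sample_ids  : list of individual IDs (the per-population header, column 4 onward)
    marker_rows : list of (rs_id, [genotype for each sample, in the same order])
    population  : label written to the assumed population-source column
    """
    markers = [rs_id for rs_id, _ in marker_rows]
    # Assumed header layout: sample ID, population source, then one column per marker.
    table = [["sample", "population"] + markers]
    for i, sample in enumerate(sample_ids):
        calls = [genotypes[i] for _, genotypes in marker_rows]
        table.append([sample, population] + calls)
    return table


# Two markers and two individuals taken from the sample table above.
table = invert(
    ["GM02822", "GM03190"],
    [("rs3782735", ["AG", "AA"]), ("rs10774447", ["CT", "CT"])],
    "ExamplePop",  # hypothetical population label
)
for row in table:
    print("\t".join(row))
```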

 

 
