Pageviews

Monday, June 27, 2011

hooray - populationsInFasta


after a very painful afternoon of not understanding why seemingly flawless code was misbehaving, i have been able to add a "populationsInFasta" collection to db.loci. i added code to textDataIndexMDB.py that basically scans all documents for a match to an array of the individuals belonging to a population and if found, adds the population to the populationsInFasta array. i'm going to push the code to GitHub and hopefully do some robustness testing today/tomorrow with other datasets. then with the rest of the week i begin format conversions. gosh - i just LOVE when troublesome code is resolved!

the db.loci collections look like this:

{ "SNPs" : 7,
"_id" : ObjectId("4e08edef10d1b18af7000017"),
"indInFasta" : [
"J11",
"J02",
"J05",
"J09",
"J08",
"J07",
"J18",
"J19",
"J17",
"J14",
"J15",
"J12",
"J13",
"J10",
"J06"],
"individuals" : { 
"J01" : 2,
"J02" : 8,
"J03" : 3,
"J04" : 2,
"J05" : 6,
"J06" : 5,
"J07" : 3,
"J08" : 9,
"J09" : 7,
"J10" : 10,
"J11" : 9,
"J12" : 5,
"J13" : 7,
"J14" : 5,
"J15" : 11,
"J16" : 1,
"J17" : 10,
"J18" : 8,
"J19" : 14,
"J20" : 2 },
"length" : 296,
"locusFasta" : "JUNCOmatic_121_aln.fasta",
"locusNumber" : "121",
"path" : "/Users/shird/Documents/Dropbox/lociNGS1/juncoLoci/JUNCOmatic_121_aln.fasta",
"populationsInFasta" : [ 
"POP3",
"POP2",
"POP1",
"POP4" ] }

so much useful information! i can now query for loci that contain at least one individual from a given set of populations.

No comments:

Post a Comment