Wednesday 15 September 2010

How to compare substrings in Python? (DNA sequencing)




I have an assignment in Python where I have to check pairs of DNA sequences (in this case, 3 pairs):

pair 1 (gaaggtcgaa, cctcggga)
pair 2 (atgatggac, gtgataaggaccc)
pair 3 (aaattt, gggccc)

I need to check each pair to see if the two strands have a common sequence.

Longest common sequences:

pair 1: tcg

pair 2: tgat ggac

pair 3: no common sequence found

I am able to get the substrings, but I'm having difficulties comparing them and printing when I find common sequences.

Thanks in advance.

My code so far (the txt file has the pairs mentioned above):

    import string

    def main():
        # open file for reading
        in_file = open("./dna.txt", "r")

        # read number of pairs
        num_pairs = in_file.readline()
        num_pairs = num_pairs.strip()
        num_pairs = int(num_pairs)

        # read pairs of DNA strands
        for i in range(num_pairs):
            st1 = in_file.readline()
            st2 = in_file.readline()
            st1 = st1.strip()
            st2 = st2.strip()
            print(st2, st1)

            # order strands by size
            if len(st1) > len(st2):
                dna1 = st1
                dna2 = st2
            else:
                dna1 = st2
                dna2 = st1

    main()

An imperfect answer:

    pair1 = ('gaaggtcgaa', 'cctcggga')
    pair2 = ('atgatggac', 'gtgataaggaccc')
    pair3 = ('aaattt', 'gggccc')

    def findsequences(pair):
        seq1, seq2 = pair
        # every 3-character window of the first strand
        seqfragments = [seq1[i:i+3] for i in range(len(seq1)-2)]
        # keep the windows that also occur in the second strand
        return [seqfragment for seqfragment in seqfragments if seqfragment in seq2]

    >>> findsequences(pair1)
    ['tcg']
    >>> findsequences(pair2)
    ['tga', 'gat', 'gga', 'gac']
    >>> findsequences(pair3)
    []

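The flaw is that it only looks for sequences of exactly three characters in a row, so longer common sequences such as tgat in pair 2 are reported as overlapping three-character pieces.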
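One possible way around the fixed length is to try every window size, longest first, and stop at the first size that produces a match. The following is only a sketch under a few assumptions: the function name longest_common_substrings is hypothetical, a single shared character counts as a common sequence, and brute force is acceptable because the strands are short.

```python
def longest_common_substrings(seq1, seq2):
    """Return all longest substrings shared by seq1 and seq2.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # Scan the shorter strand so fewer candidate windows are generated.
    short, long_ = sorted((seq1, seq2), key=len)
    # Try the largest window first and shrink until something matches.
    for size in range(len(short), 0, -1):
        found = []
        for start in range(len(short) - size + 1):
            fragment = short[start:start + size]
            # Record each matching fragment once, in order of appearance.
            if fragment in long_ and fragment not in found:
                found.append(fragment)
        if found:
            return found
    # No character in common at all.
    return []


pairs = [
    ('gaaggtcgaa', 'cctcggga'),
    ('atgatggac', 'gtgataaggaccc'),
    ('aaattt', 'gggccc'),
]

for number, (seq1, seq2) in enumerate(pairs, start=1):
    common = longest_common_substrings(seq1, seq2)
    if common:
        print("pair %d: %s" % (number, " ".join(common)))
    else:
        print("pair %d: no common sequence found" % number)
```

For the three pairs in the question this prints tcg for pair 1, tgat ggac for pair 2, and "no common sequence found" for pair 3. The nested loops are slow for long strands; a dynamic-programming approach would scale better, but for assignment-sized input the simple version is easier to follow.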
