How to compare substrings in Python? (DNA sequencing)
I have an assignment in Python where I have to check pairs of DNA sequences (in this case, 3 pairs).
pair 1 (gaaggtcgaa, cctcggga), pair 2 (atgatggac, gtgataaggaccc), pair 3 (aaattt, gggccc)
I need to check each pair to see whether the two strands share a mutual sequence.
Longest mutual sequences:
pair 1: tcg
pair 2: tgat ggac
pair 3: no mutual sequence found
I am able to get the substrings, but I'm having difficulty comparing them and printing the mutual sequences when I find them.
Thanks in advance.
My code so far (the txt file has the pairs mentioned above):
import string

def main():
    # open file for reading
    in_file = open("./dna.txt", "r")

    # read number of pairs
    num_pairs = in_file.readline()
    num_pairs = num_pairs.strip()
    num_pairs = int(num_pairs)

    # read pairs of DNA strands
    for i in range(num_pairs):
        st1 = in_file.readline()
        st2 = in_file.readline()
        st1 = st1.strip()
        st2 = st2.strip()
        print(st2, st1)

        # order strands by size: dna1 is the longer one, dna2 the shorter
        if len(st1) > len(st2):
            dna1 = st1
            dna2 = st2
        else:
            dna1 = st2
            dna2 = st1

main()
An imperfect answer:
pair1 = ('gaaggtcgaa', 'cctcggga')
pair2 = ('atgatggac', 'gtgataaggaccc')
pair3 = ('aaattt', 'gggccc')

def findsequences(pair):
    seq1, seq2 = pair
    # every 3-character fragment of the first strand
    seqfragments = [seq1[i:i+3] for i in range(len(seq1)-2)]
    # keep the fragments that also occur in the second strand
    return [seqfragment for seqfragment in seqfragments if seqfragment in seq2]

>>> findsequences(pair1)
['tcg']
>>> findsequences(pair2)
['tga', 'gat', 'gga', 'gac']
>>> findsequences(pair3)
[]
The flaw is that it only looks for sequences of exactly 3 in a row, so for pair 2 it reports the 3-letter pieces instead of tgat and ggac.
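One possible way around the fixed length, as a rough sketch rather than a definitive solution: try every substring length of the shorter strand, longest first, and stop at the first length that produces a match. The function name longest_common_substrings is made up for illustration, and the sketch assumes that even a single shared base would count as a mutual sequence.

```python
def longest_common_substrings(seq1, seq2):
    """Return every longest substring shared by seq1 and seq2.

    Results are listed in the order they appear in the shorter strand.
    An empty list means the strands have nothing in common.
    """
    # Take candidates from the shorter strand: fewer substrings to test.
    short, long_ = sorted((seq1, seq2), key=len)

    # Try the longest possible length first; the first length that
    # yields any match is by definition the longest mutual length.
    for size in range(len(short), 0, -1):
        found = []
        for start in range(len(short) - size + 1):
            candidate = short[start:start + size]
            # Skip duplicates so a repeated fragment is reported once.
            if candidate in long_ and candidate not in found:
                found.append(candidate)
        if found:
            return found
    return []


if __name__ == "__main__":
    pairs = [
        ("gaaggtcgaa", "cctcggga"),
        ("atgatggac", "gtgataaggaccc"),
        ("aaattt", "gggccc"),
    ]
    for number, (st1, st2) in enumerate(pairs, start=1):
        matches = longest_common_substrings(st1, st2)
        if matches:
            print("pair %d: %s" % (number, " ".join(matches)))
        else:
            print("pair %d: no mutual sequence found" % number)
```

For the three pairs in the question this returns ['tcg'], ['tgat', 'ggac'] and []. It is brute force, which is fine for short strands like these; very long strands would probably call for a dynamic-programming approach instead.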