Description
Finding ORFs
In this lab, you will write Python functions to find potential prokaryotic ORFs (open reading frames) in a given DNA sequence. Write all your code in `answers.py`, and run the script with `python3 answers.py` to make sure that your code works.
Task 1
========
Write a function `findStartCodons` that returns the positions of the start codons in a DNA sequence as a `list`. Positions are zero-based: the first base is at position `0`, the second at position `1`, and so forth.
Hint 1: Investigate the `find` method of strings.
`'ATTTA'.find('A')` returns `0`
`'TTTTT'.find('A')` returns `-1`
`'ATTTA'.find('A', 3)` returns `4`
The second parameter (`3` in the example) is the position in the string at which the search starts.
Hint 2: A `while` loop is useful.
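The two hints can be combined as in the following minimal sketch. It is not the required solution: the helper name `find_all_positions` is hypothetical, and it searches for the single base `'A'` from the examples above rather than for a codon.

```python
def find_all_positions(text, pattern):
    """Return every index at which pattern occurs in text."""
    positions = []
    index = text.find(pattern)                 # first match, or -1 if there is none
    while index != -1:
        positions.append(index)
        index = text.find(pattern, index + 1)  # resume one past the last match
    return positions

print(find_all_positions('ATTTA', 'A'))  # [0, 4]
print(find_all_positions('TTTTT', 'A'))  # []
```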
Task 2
========
Write a function `findNextStop` that takes two parameters, `DNA` and `startCodonPosition`. The function should return the next stop codon that follows the given start codon position in the DNA sequence.
Hint: Remember that the start and stop codons must be in frame: the length from the start codon to the stop codon must be a multiple of 3. Investigate the third parameter of the `range` function. What does `range(0,30,3)` do?
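To illustrate the hint, here is a small sketch of how the step argument of `range` walks through a string three bases at a time. The helper `codons_in_frame` and the short test sequence are made up for this illustration and are not part of the lab.

```python
# The third argument of range is the step between consecutive values.
print(list(range(0, 30, 3)))  # [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]

def codons_in_frame(dna, start):
    """Split dna into consecutive 3-base codons beginning at index start."""
    # Stop at len(dna) - 2 so that every slice is a complete codon.
    return [dna[i:i + 3] for i in range(start, len(dna) - 2, 3)]

print(codons_in_frame('CCATGAAATAGG', 2))  # ['ATG', 'AAA', 'TAG']
```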
The output should be a `list` as in the example below.
Task 3
=======
Use the functions you created in `Task 1` and `Task 2`, `findStartCodons` and `findNextStop`, in a new function `getAllORFs` that finds all possible ORFs in a given DNA sequence.
Hint: Run the first function, then loop over its result and call the second function inside the loop. The output should be a `list`.
Task 4
=======
Use your `getReverseComplementaryDNA` function from `Lab2` (copy and paste your working functions) to find all ORFs in both the given DNA and its reverse complementary sequence. Name the new function `getAllORFsFromBothStrands`.
Hint: Run `getAllORFs` twice, once on the given sequence and once on its reverse complement, then return the two results combined in a single `list`.
Here are the two functions from the previous lab:
```
def getComplementaryDNA(inputDNA):
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    return ''.join([complement[base] for base in inputDNA])

def getReverseComplementaryDNA(inputDNA):
    complementaryDNA = getComplementaryDNA(inputDNA)
    return complementaryDNA[::-1]
```
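As a quick sanity check of what a reverse complement looks like, the sketch below uses a compact alternative built on `str.translate`. This is a swapped-in technique, not the Lab2 code, and the name `reverse_complement` is hypothetical. For uppercase `A`, `C`, `G`, `T` input, it should give the same result as `getReverseComplementaryDNA`.

```python
# Translation table mapping each base to its complement (A<->T, C<->G).
COMPLEMENT_TABLE = str.maketrans('ACGT', 'TGCA')

def reverse_complement(dna):
    """Complement every base, then reverse the resulting string."""
    return dna.translate(COMPLEMENT_TABLE)[::-1]

print(reverse_complement('ATGC'))  # GCAT
print(reverse_complement('AAAC'))  # GTTT
```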
Task 5
=======
Write a function that returns the longest possible ORF. Use the previous function `getAllORFsFromBothStrands`. Find the length of the longest potential ORF predicted from the following DNA:
```
GGTAGTTTTTTTTTGAAAAAAATGCCTAAAAAGCTTGCAATGACTAAATGATTCTGTTATTATATTGTGGTGCTGTAAAAATACAGCTTTAGCAATGATACAAGAGGTTGCGACACGCTCGGTTGCATTGCCACGCAACAGGTGTCGGTTTTCTTGAGGAGCTAGCCTATTATCGTAAATAGACGAGAGGAGAAAAGATGGCAAACAAAAAAATCCGTATCCGTTTGAAAGCGTACGAACACCGTACACTTGATACAGCGGCAGAAAAAATCGTTGAAACTGCAACACGTACAGGTGCTACAGTTGCTGGACCAGTTCCACTTCCAACTGAACGCAGTCTTTACACAATTATTCGTGCGACTCACAAATACAAAGATTCTCGCGAACAATTTGAAATGCGTACACACAAACGTTTGGTAGACATCATCAATCCAACACAAAAAACTGTTGATGCTTTGATGAAACTTGATCTTCCAAGTGGTGTCAACGTAGAAATCAAACTTTAATCGGTGAGATTTTGCAAGTACAGTTAGTGTTTGATGGAACTTGAACACGAGCTAAACTCTACATGAAAAAGATAAATCTTCCTCGAAACAGAAGCTTTTGTGTTAGATTTTCTATTTTTATTTTGAGTTAG
```
Hint: Here is a way to sort a list of strings from shortest to longest: `listOfText.sort(key = lambda s: len(s))`
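The sorting hint can be tried out on a small list first. The strings below are made up for illustration and are not ORFs from the sequence above.

```python
fragments = ['ATGAAATAG', 'ATGTAG', 'ATGCCCGGGTAA']  # made-up example strings

# Sort in place from shortest to longest; key=len is equivalent to the lambda in the hint.
fragments.sort(key=len)
print(fragments)           # ['ATGTAG', 'ATGAAATAG', 'ATGCCCGGGTAA']

longest = fragments[-1]    # after sorting, the last item is the longest
print(longest)             # ATGCCCGGGTAA
print(len(longest))        # 12
```

If only the longest item is needed, `max(fragments, key=len)` gives the same string without sorting the list.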