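IBW2017 is being held these days, which reminded me that the only one I ever attended was IBW2011. That year I won a small grant at Jinan University (暨大), and it paid for my trip to Xi'an. The only thing I wrote down back then was the assignment from this course, so I am sharing it here.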
Course Projects:
Project 1: Implementation of a simple gene finder
GOAL
Build a simple codon-usage-based gene finder for finding genes in
E. coli.
Procedure
Collect 100 gene sequences from the bacterium E. coli in GenBank
(http://www.ncbi.nlm.nih.gov).
Compute the codon usage table based on these genes (and the protein
sequences translated from them).
Build a probabilistic model based on the codon usage.
Implement a random sequence model in which the nucleotide frequencies
are computed from the 100 E. coli genes.
For a given DNA sequence (and one selected reading frame), compare your
codon usage model with the random sequence model.
Results that you should submit:
Two FASTA files for the collected 100 genes and 100 translated protein
sequences.
The printed codon usage table.
A program named ECgnfinder, run with the syntax
ECgnfinder -i inputfile
Here inputfile stands for the name of the input file, which should
contain one DNA sequence in FASTA format; the program should report an
error message if the input file is in the wrong format.
The output should be printed to standard output in the following form
(xxx stands for the likelihood):
ORF1: xxx ORF2: xxx
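To make the procedure concrete, here is a minimal Python sketch of its core steps. It is not the original course solution. The function names, the pseudocount, and the choice to compare the two models with a log-likelihood ratio are my own assumptions, since the assignment only says "compare". The reader raises an error for anything that is not a single-sequence FASTA input, as the assignment requires.

```python
import math
from collections import Counter

BASES = "ACGT"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]


def read_single_fasta(text):
    """Return the one DNA sequence in FASTA text, or raise ValueError."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input is not FASTA: missing '>' header line")
    if any(ln.startswith(">") for ln in lines[1:]):
        raise ValueError("input must contain exactly one sequence")
    seq = "".join(lines[1:]).upper()
    if not seq or set(seq) - set(BASES):
        raise ValueError("sequence is empty or has non-ACGT characters")
    return seq


def codon_probabilities(genes, pseudocount=1.0):
    """Codon usage as probabilities, counted in frame 0 of each training gene.

    The pseudocount (an assumption, not part of the assignment) keeps
    unseen codons from getting probability zero.
    """
    counts = Counter()
    for gene in genes:
        for i in range(0, len(gene) - 2, 3):
            counts[gene[i:i + 3]] += 1
    total = sum(counts[c] for c in CODONS) + pseudocount * len(CODONS)
    return {c: (counts[c] + pseudocount) / total for c in CODONS}


def nucleotide_probabilities(genes, pseudocount=1.0):
    """Single-nucleotide frequencies of the training genes (random model)."""
    counts = Counter("".join(genes))
    total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
    return {b: (counts[b] + pseudocount) / total for b in BASES}


def log_likelihood_ratio(seq, frame, codon_p, nuc_p):
    """log P(seq | codon model) - log P(seq | random model), frame in 0..2."""
    score = 0.0
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Under the random model the three bases are independent.
        random_p = nuc_p[codon[0]] * nuc_p[codon[1]] * nuc_p[codon[2]]
        score += math.log(codon_p[codon] / random_p)
    return score
```

In use, the two probability tables would be trained once on the 100 collected genes, and each candidate reading frame of the input sequence would then be scored. A positive score means the codon usage model explains that frame better than the random model does.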
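For the code, click "Read the original" (阅读原文). It is fairly long, so I did not bother formatting it here; no matter how I tweak it, it never comes out right. WeChat's editor is just too poor.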