Column: YuLabSMU
Focused on bioinformatics, the R language, and visualization; original content only, no clickbait!

a simple gene finder

YuLabSMU · WeChat Official Account · 2017-08-06 12:03





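IBW2017 is under way, which reminded me that the only IBW I ever attended was IBW2011. That year I had won a small grant at Jinan University and used it to fund a trip to Xi'an. The only thing I wrote down at the time was the homework for this course, so I am sharing it here.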






Course Projects:

Project 1: Implementation of a simple gene finder


GOAL

Build a simple codon-usage-based gene finder for finding genes in E. coli.


Procedure

Collect 100 gene sequences from the bacterium E. coli in GenBank (http://www.ncbi.nlm.nih.gov).

Compute the codon usage table based on these genes (and the protein sequences translated from them).

Build a probabilistic model based on the codon usages.

Implement a random sequence model in which the nucleotide frequencies are computed from the 100 E. coli genes.

For a given DNA sequence (and one selected reading frame), compare your model with the random sequence model.

Results that you should submit:


Two FASTA files, one for the 100 collected genes and one for the 100 translated protein sequences.

The printed codon usage table.

A program named ECgnfinder, run with the syntax: ECgnfinder -i inputfile


Here inputfile stands for the name of the input file, which should contain one DNA sequence in FASTA format; the program should report an error message if the input file is in the wrong format.
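The input check described above could be sketched roughly as follows. This is a minimal illustration in Python, not the author's code: the function names `read_single_fasta` and `load_or_exit`, the accepted alphabet (A, C, G, T plus N), and the wording of the error messages are all assumptions made for this example.

```python
import sys

# Assumption: only these characters count as DNA; N is allowed for unknown bases.
VALID_BASES = set("ACGTN")


def read_single_fasta(path):
    """Return (header, sequence) for a FASTA file holding exactly one DNA record.

    Raises ValueError when the file does not look like single-record DNA FASTA.
    """
    with open(path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not FASTA: first non-empty line must start with '>'")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("input must contain exactly one sequence")
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("FASTA record has no sequence")
    bad = set(sequence) - VALID_BASES
    if bad:
        raise ValueError("non-DNA characters in sequence: " + "".join(sorted(bad)))
    return lines[0][1:], sequence


def load_or_exit(path):
    """Hypothetical wrapper: print an error message and exit on bad input."""
    try:
        return read_single_fasta(path)
    except (OSError, ValueError) as err:
        sys.stderr.write("ECgnfinder: error: %s\n" % err)
        sys.exit(1)
```

Raising `ValueError` in the parser and turning it into a message only in the wrapper keeps the format check testable on its own.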


The output should be printed to standard output in the following form (xxx stands for the likelihood):


ORF1: xxx

ORF2: xxx
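One way to read the assignment is as a log-likelihood ratio: for each codon in the chosen reading frame, compare its probability under the codon usage model with the product of its three base frequencies under the random model, and sum the logs. The Python sketch below follows that reading and is not the author's implementation. The add-one pseudocounts, the function names, and the choice to label the three forward reading frames as ORF1 to ORF3 are all assumptions; the assignment does not say how ORFs are to be enumerated.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGT"
CODONS = ["".join(p) for p in product(BASES, repeat=3)]


def codon_frequencies(genes, pseudocount=1.0):
    """Codon usage table from in-frame training genes (pseudocount is an assumption)."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        for i in range(0, len(gene) - 2, 3):
            codon = gene[i:i + 3]
            if set(codon) <= set(BASES):  # ignore codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts[c] for c in CODONS) + pseudocount * len(CODONS)
    return {c: (counts[c] + pseudocount) / total for c in CODONS}


def base_frequencies(genes, pseudocount=1.0):
    """Nucleotide frequencies of the same genes, used by the random model."""
    counts = Counter(b for gene in genes for b in gene.upper() if b in BASES)
    total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
    return {b: (counts[b] + pseudocount) / total for b in BASES}


def log_likelihood_ratio(seq, frame, codon_freq, base_freq):
    """Sum of log(P_codon_model / P_random_model) over codons in one frame."""
    seq = seq.upper()
    score = 0.0
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon not in codon_freq:
            continue  # skip codons containing N or other ambiguous bases
        random_p = base_freq[codon[0]] * base_freq[codon[1]] * base_freq[codon[2]]
        score += math.log(codon_freq[codon] / random_p)
    return score


def report(seq, codon_freq, base_freq):
    """Assumption: ORF1..ORF3 are the three forward reading frames."""
    return ["ORF%d: %.3f" % (frame + 1,
                             log_likelihood_ratio(seq, frame, codon_freq, base_freq))
            for frame in range(3)]
```

A positive score means the frame looks more like the training genes than like random sequence with the same base composition. With only 100 genes some codons are rare, so a pseudocount keeps every probability non-zero and the logarithm defined.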



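For the code, click "Read the original". It is fairly long, so I did not bother fixing its formatting here: no matter how I adjust it, it never renders properly, because the WeChat editor is just too poor.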







