Tiezheng Yuan Ph.D.: Perl: Equivalent codes instead of BioPerl (1)

Abstract: Here are the two equivalent codes for extract sequences from a FASTA file. One implement Bio::SeqIO, and the other implement only standard Perl codes.

The style of Perl coding using Bio-perl:

#bioperl style

sub chr_info{

my ($infile)=@_;

#generate genome_info_csv file

my $in_obj = Bio::SeqIO->new(-file => $infile, -format => 'fasta');

while (my $seq_obj = $in_obj->next_seq() ) {#3

my $chr=$seq_obj->display_id;

my $chr_seq=$seq_obj->seq;

my $chr_len=$seq_obj->length;

my $GC_num=map {/G|C/gi} $chr_seq;

my $GC_cont=int($GC_num*100/$chr_len+0.5);

my $enzyme_num=map {/'GAATTC'/gi} $chr_seq;

print join(',', $chr, $chr_len, $GC_cont, $enzyme_num, ), "\n";

}

#only standard Perl codes

sub my_chr_info{

my ($infile)=@_;

#generate genome_info_csv file

open my($IN), "<", $infile or die ;

my($chr,$seq);

while( defined ($chr=<$IN>) && defined ($seq=<$IN>) ){#3

chomp($chr, $seq);

$chr=~s/^>//;

my $seq_len=length($seq);

my $GC_num=map {/G|C/gi} $seq;

my $GC_cont=int($GC_num*100/$seq_len+0.5);

my $enzyme_num=map {/'GAATTC'/gi} $seq;

print join(',', $chr, $seq_len, $GC_cont, $enzyme_num, ), "\n";

close($IN);

}

Tiezheng Yuan Ph.D.

Tuesday, December 22, 2015

Perl: Equivalent codes instead of BioPerl (1)

No comments:

Post a Comment