Abstract:
Multi-threaded programming can speed up analytic work. A thread is
the smallest sequence of programmed instructions that the operating
system can manage. Multi-threading means running several threads at
the same time within one program, without needing multiple copies of
the program. A thread is usually a component of a process, and the
threads of a multi-threaded Perl program exist within the same
process and share its resources, such as memory and CPU. All threads
share the same executable Perl code, but the values of the input
variables may differ. The CPU switches among the threads, and on
multi-CPU or multi-core systems threads can be executed concurrently.
Several issues arise when implementing multi-threading in Perl.
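As a minimal sketch of the point that every thread runs the same code on different input values, the snippet below starts three threads on one subroutine. The subroutine name square and the input values are made up for illustration, and the sketch assumes a perl built with thread support.

```perl
use strict;
use warnings;
use threads;

# Hypothetical worker: every thread runs this same code,
# only the argument differs.
sub square {
    my ($n) = @_;
    return $n * $n;
}

# Illustrative input values, one thread per value.
my @inputs = (2, 3, 4);

my @workers;
for my $n (@inputs) {
    # Created in scalar context, so join() returns one scalar.
    my $thr = threads->create(\&square, $n);
    push @workers, $thr;
}

# join() waits for a thread to finish and hands back its return value.
my @results;
for my $thr (@workers) {
    my $value = $thr->join();
    push @results, $value;
}

print "@results\n";    # prints "4 9 16"
```

Results are collected in the order the threads were created, not the order in which they finish.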
The first issue is how many threads to use. The number of CPUs is
only one of the essential factors in practice. For example, take a
server with 4 CPUs and 4 cores per CPU. The theoretical maximum
number of threads that can run at once is 16 (4x4). Another limit is
memory: memory usage should not exceed 90% during multi-threaded
computation. The I/O capacity of the hard drive is also a bottleneck.
Speedup can be used as a measure for choosing the best number of
threads. Speedup is the execution time with one thread divided by the
execution time with multiple threads; it is linear when it equals the
number of threads. Higher speedup is better. As the graph shows, the
computation time approaches saturation as the number of threads
grows. Adding more threads therefore does not keep reducing the time
in proportion.
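The arithmetic above can be sketched in a few lines of Perl. The function names max_threads and speedup are hypothetical, and the timings in %elapsed are placeholders chosen only to show the calculation, not measurements.

```perl
use strict;
use warnings;

# Theoretical upper bound on threads: CPUs times cores per CPU.
sub max_threads {
    my ($cpus, $cores_per_cpu) = @_;
    return $cpus * $cores_per_cpu;
}

# Speedup: time with one thread divided by time with several threads.
sub speedup {
    my ($time_one_thread, $time_n_threads) = @_;
    return $time_one_thread / $time_n_threads;
}

# The 4-CPU, 4-cores-per-CPU server from the example.
print "max threads: ", max_threads(4, 4), "\n";    # prints "max threads: 16"

# Placeholder wall-clock times in seconds, keyed by thread count.
# These are NOT measurements; replace them with your own timings.
my %elapsed = (1 => 120, 2 => 60, 4 => 40, 8 => 30);

for my $n (sort { $a <=> $b } keys %elapsed) {
    printf "%d threads: speedup %.2f\n", $n, speedup($elapsed{1}, $elapsed{$n});
}
```

With real timings, the thread count where the speedup stops growing is a reasonable place to stop adding threads.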
The second issue is whether the work can be processed in parallel,
which is known as task parallelism. Perl's support for
multi-threading is not as complete as that of languages designed for
parallel processing. Memory communication and synchronization between
threads are hard to control by hand in Perl. It is therefore best to
divide the whole task into many independent sub-tasks. Each sub-task
is executed by one thread, and no communication between threads is
needed.
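A minimal sketch of dividing one task into independent sub-tasks follows. Summing the numbers 1 to 100 stands in for real work; the helper names sum_chunk and split_into_chunks and the choice of 4 chunks are assumptions for illustration. The threads never talk to each other; each one only returns its result through join().

```perl
use strict;
use warnings;
use threads;

# Hypothetical sub-task: sum one chunk of numbers, using no shared data.
sub sum_chunk {
    my @numbers = @_;
    my $total = 0;
    $total += $_ for @numbers;
    return $total;
}

# Deal the items round-robin into at most $parts chunks.
sub split_into_chunks {
    my ($parts, @items) = @_;
    my @chunks;
    my $i = 0;
    push @{ $chunks[$i++ % $parts] }, $_ for @items;
    return @chunks;
}

my @numbers = (1 .. 100);
my @chunks  = split_into_chunks(4, @numbers);

# One thread per chunk; each thread works on its own copy of the data.
my @workers;
for my $chunk (@chunks) {
    my $thr = threads->create(\&sum_chunk, @$chunk);
    push @workers, $thr;
}

# Combine the partial results in the main thread.
my $grand_total = 0;
for my $thr (@workers) {
    my $partial = $thr->join();
    $grand_total += $partial;
}

print "$grand_total\n";    # prints "5050"
```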
Here is an example of multi-threaded programming in Perl.
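#!/usr/bin/perl
use strict;
use warnings;
use threads;
use File::Basename qw(fileparse);

# Copy every two-line FASTA record whose sequence is shorter than
# 50 characters into a new file named short_<original file name>.
sub read_fasta{
    my ($in_file)=@_;
    # Prefix the file name only, not the whole path.
    my ($name, $dir)=fileparse($in_file);
    my $out_file=$dir.'short_'.$name;
    open my($IN), "<", $in_file or die "Cannot read $in_file: $!";
    open my($OUT), ">", $out_file or die "Cannot write $out_file: $!";
    # Read the header line and the sequence line as a pair.
    while(defined(my $L1=<$IN>)){
        my $L2=<$IN>;
        last unless defined $L2;
        chomp($L1, $L2);
        print $OUT "$L1\n", "$L2\n" if length($L2)<50;
    }
    close($IN);
    close($OUT);
}
#main program
# Maximum number of threads alive at once. The value 4 is an
# assumption; set it to suit your machine.
my %variables=(threads_num=>4);
# Input files, one thread each. Use distinct files in practice:
# identical entries make the threads write the same output file.
my @sample_files=('/home/yuan/data_2/test/test1.fa',
    '/home/yuan/data_2/test/test1.fa',
    '/home/yuan/data_2/test/test1.fa');
while(1){
    # start new threads while slots are free and files remain
    while(threads->list() < $variables{threads_num} and
        @sample_files>0 ){
        my $file=shift @sample_files;
        threads->create(\&read_fasta, $file);
    }
    #recover all finished threads
    foreach my $sub_thread( threads->list() ){
        $sub_thread->join() if $sub_thread->is_joinable();
    }
    # stop once every file is processed and every thread is joined
    last if threads->list()==0 and @sample_files==0;
    sleep 1;
}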
Writing date:
2013.09.12