Understanding async in Perl with a specific example
I have to write a script that fetches some URLs in parallel and does some work on them. In the past I have always used Parallel::ForkManager for such things, but now I want to learn something new and try asynchronous programming with AnyEvent (and AnyEvent::HTTP or AnyEvent::Curl::Multi)... but I am having trouble understanding AnyEvent and writing a script that should:
> open a file (every line is a separate URL)

I have read many manuals and tutorials, but I still have trouble understanding the difference between blocking and non-blocking code. I found a similar script at http://perlmaven.com/fetching-several-web-pages-in-parallel-using-anyevent, where Mr. Szabo explains the basics, but I still cannot work out how to implement something like the following:

...
open my $fh, "<", $file;
while ( my $line = <$fh> ) {
    # http request, read response, update MySQL
}
close $fh;
...

...and add a concurrency limit in this case (a sketch of one possible AnyEvent-based approach is included after the update below). I would really appreciate your help ;)

UPDATE

Following ikegami's advice I gave Net::Curl::Multi a try. I am very happy with the result. After years of using Parallel::ForkManager just to grab thousands of URLs concurrently, Net::Curl::Multi seems awesome.

#!/usr/bin/perl

use Net::Curl::Easy qw( :constants );
use Net::Curl::Multi qw( );

sub make_request {
    my ( $url ) = @_;
    my $easy = Net::Curl::Easy->new();
    $easy->{url} = $url;
    $easy->setopt( CURLOPT_URL,        $url );
    $easy->setopt( CURLOPT_HEADERDATA, \$easy->{head} );
    $easy->setopt( CURLOPT_FILE,       \$easy->{body} );
    return $easy;
}

my $maxWorkers = 10;

my $multi = Net::Curl::Multi->new();
my $workers = 0;

my $i = 1;
open my $fh, "<", "urls.txt";
LINE: while ( my $url = <$fh> ) {
    chomp( $url );
    $url .= "?$i";
    print "($i) $url\n";

    my $easy = make_request( $url );
    $multi->add_handle( $easy );
    $workers++;

    my $running = 0;
    do {
        my ( $r, $w, $e ) = $multi->fdset();
        my $timeout = $multi->timeout();
        select( $r, $w, $e, $timeout / 1000 ) if $timeout > 0;

        $running = $multi->perform();
        RESPONSE: while ( my ( $msg, $easy, $result ) = $multi->info_read() ) {
            $multi->remove_handle( $easy );
            $workers--;
            printf( "%s getting %s\n", $easy->getinfo( CURLINFO_RESPONSE_CODE ), $easy->{url} );
        }

        # don't max out the CPU while waiting
        select( undef, undef, undef, 0.01 );
    } while ( $workers == $maxWorkers || ( eof && $running ) );

    $i++;
}
close $fh;
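For reference, the original question asked how to express the blocking read-fetch-update loop with AnyEvent::HTTP plus a concurrency limit. The sketch below is one way to do that; it is not taken from the thread, and the file name urls.txt, the counter-based throttling with $cv->begin/$cv->end, and the placeholder comment where the MySQL update would go are all assumptions.

#!/usr/bin/perl
use strict;
use warnings;

use AnyEvent;
use AnyEvent::HTTP;

my $file        = 'urls.txt';   # assumed input file, one URL per line
my $max_running = 10;           # concurrency limit
my $running     = 0;

open my $fh, '<', $file or die "Can't open $file: $!";

my $cv = AnyEvent->condvar;
$cv->begin;                     # keep the condvar open while requests are being queued

my $start_next;
$start_next = sub {
    # top the pool up to the concurrency limit
    while ( $running < $max_running ) {
        my $url = <$fh>;
        last if !defined $url;
        chomp $url;

        $running++;
        $cv->begin;
        http_get $url, sub {
            my ( $body, $hdr ) = @_;
            # read response, update MySQL, etc. would go here
            print "$hdr->{Status} $url\n";
            $running--;
            $cv->end;
            $start_next->();    # a slot freed up, start the next URL
        };
    }
};

$start_next->();
$cv->end;                       # matches the initial begin
$cv->recv;                      # run the event loop until every request has finished
close $fh;

The http_get callbacks only fire once $cv->recv enters the event loop, so there are never more than $max_running requests in flight, and the script exits when the final $cv->end brings the counter back to zero.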
Solution

Net::Curl is a really good library, and it is extremely fast. Moreover, it can perform parallel requests as well! I would recommend it instead of AnyEvent.
use Net::Curl::Easy qw( :constants );
use Net::Curl::Multi qw( );

sub make_request {
    my ( $url ) = @_;
    my $easy = Net::Curl::Easy->new();
    $easy->{url} = $url;
    $easy->setopt( CURLOPT_URL,        $url );
    $easy->setopt( CURLOPT_HEADERDATA, \$easy->{head} );
    $easy->setopt( CURLOPT_FILE,       \$easy->{body} );
    return $easy;
}

my $max_running = 10;
my @urls = ( 'http://www.google.com/' );

my $multi = Net::Curl::Multi->new();

my $running = 0;
while (1) {
    # add new handles until the concurrency limit is reached
    while ( @urls && $running < $max_running ) {
        my $easy = make_request( shift( @urls ) );
        $multi->add_handle( $easy );
        ++$running;
    }

    last if !$running;

    # wait until libcurl has something for us to do
    my ( $r, $w, $e ) = $multi->fdset();
    my $timeout = $multi->timeout();
    select( $r, $w, $e, $timeout / 1000 ) if $timeout > 0;

    # drive the transfers and collect the ones that finished
    $running = $multi->perform();
    while ( my ( $msg, $easy, $result ) = $multi->info_read() ) {
        $multi->remove_handle( $easy );
        printf( "%s getting %s\n", $easy->getinfo( CURLINFO_RESPONSE_CODE ), $easy->{url} );
    }
}
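The answer only prints the status code of each finished transfer. Because make_request() points CURLOPT_HEADERDATA and CURLOPT_FILE at $easy->{head} and $easy->{body}, the collected response is available on the easy handle by the time info_read() hands it back, which is where the question's "read response, update MySQL" step would slot in. The helper below is a hypothetical illustration, not part of the answer; handle_response() and the DBI statement handle $sth are placeholders.

# Hypothetical helper, called from the info_read() loop right after remove_handle():
#     handle_response( $easy, $result );
sub handle_response {
    my ( $easy, $result ) = @_;
    my $code = $easy->getinfo( CURLINFO_RESPONSE_CODE );

    if ( $result == 0 && $code == 200 ) {    # 0 is CURLE_OK
        # $easy->{head} holds the raw response headers and $easy->{body} the body,
        # because make_request() registered them via CURLOPT_HEADERDATA / CURLOPT_FILE.
        # The MySQL update from the question would go here, e.g. with a prepared
        # DBI statement handle $sth (placeholder):
        # $sth->execute( $easy->{url}, $easy->{body} );
    }
    else {
        warn "failed ($result, HTTP $code) getting $easy->{url}\n";
    }
}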