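This article shows how to use Perl to fetch information from a web page automatically. Let's walk through it together.
Fetching information from a web page with Perl
The Perl script below requests a web page on its own and then extracts information from it, in this case the page title: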
#!/usr/bin/perl -w

# Perl pragma to restrict unsafe constructs
use strict;
# use the LWP::UserAgent module
use LWP::UserAgent;

# main function
sub main {
    # Get params.
    # Within a subroutine, the array @_ contains the parameters passed to
    # that subroutine. Inside a subroutine, @_ is also the default array
    # for shift and pop.
    # Use the first parameter as the URL; fall back to a default if none is given.
    my $url = shift(@_) || 'http://www.taobao.com';

    # create the LWP::UserAgent object
    my $ua = LWP::UserAgent->new;
    # set the request timeout (seconds)
    $ua->timeout(20);
    # set the User-Agent header
    $ua->agent("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; SV1; .NET CLR 2.0.50727)");

    # send the request with the GET method and store the response in $resp
    my $resp = $ua->get($url);

    # check the response
    if ($resp->is_success) {
        # get the response content (HTML source)
        my $content = $resp->decoded_content;

        # Use a regex to get the page title from $content.
        # (.*?) is a non-greedy match for the title string. The bracketing
        # construct ( ... ) creates a capture group; after a successful match,
        # $1 holds the text of the first group, $2 the second, and so on.
        if ( $content =~ m{<title>(.*?)</title>}si ) {
            my $head = $1;
            print "find page title : $head\n";
        }
        else {
            print "no page title for url : $url\n";
        }
    }
    else {
        # display status information and exit
        die $resp->status_line;
    }
}

# Pass params to the main function.
# The array @ARGV contains the command-line arguments intended for the script.
main(@ARGV);
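The title-matching step can also be tried without any network access. The following is a minimal sketch, not part of the original script: `extract_title` is a hypothetical helper name, and the HTML string is a made-up sample. It shows why a non-greedy `(.*?)` matters when a page contains more than one `<title>` element (for example inside an inline SVG), and it also tidies the whitespace in the captured text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): return the page title
# from an HTML string, or undef when no <title> element is found.
sub extract_title {
    my ($html) = @_;
    return unless defined $html;

    # <title\b[^>]*> tolerates attributes on the opening tag; (.*?) is
    # non-greedy, so the match stops at the first closing </title>.
    if ( $html =~ m{<title\b[^>]*>(.*?)</title\s*>}si ) {
        my $title = $1;
        $title =~ s/\s+/ /g;         # collapse newlines and runs of spaces
        $title =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        return $title;
    }
    return;
}

# Offline usage example with a made-up HTML snippet that has two <title> tags.
my $sample = "<html><head><title>\n  Example  Page\n</title></head>"
           . "<body><svg><title>icon</title></svg></body></html>";
my $title = extract_title($sample);
print defined $title ? "find page title : $title\n" : "no page title\n";
```

With a greedy `(.*)` the capture would run on to the last `</title>` in the document and swallow the markup in between. A regex remains a rough tool for HTML; for pages with unusual markup, a real HTML parser is likely the safer choice.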