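Here is a way to fetch the HTTP response headers of a remote URL with PHP. This is useful when scraping, because it lets you tell whether a remote file or page is responding normally or is a 404 page.
There are two approaches.
1. Use PHP's get_headers function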
get_headers -- Fetches all the headers sent by the server in response to a HTTP request
Description
array get_headers ( string url [, int format] )
get_headers() returns an array with the headers sent by the server in response to a HTTP request. Returns FALSE on failure and an error of level E_WARNING will be issued.
If the optional format parameter is set to 1, get_headers() parses the response and sets the array's keys.
Example 1. get_headers() example
<?php
$url = 'http://www.example.com';
print_r(get_headers($url));
print_r(get_headers($url, 1));
?>
The example above produces output similar to:
Array
(
[0] => HTTP/1.1 200 OK
[1] => Date: Sat, 29 May 2004 12:28:13 GMT
[2] => Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)
[3] => Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
[4] => ETag: "3f80f-1b6-3e1cb03b"
[5] => Accept-Ranges: bytes
[6] => Content-Length: 438
[7] => Connection: close
[8] => Content-Type: text/html
)
Array
(
[0] => HTTP/1.1 200 OK
[Date] => Sat, 29 May 2004 12:28:14 GMT
[Server] => Apache/1.3.27 (Unix) (Red-Hat/Linux)
[Last-Modified] => Wed, 08 Jan 2003 23:11:55 GMT
[ETag] => "3f80f-1b6-3e1cb03b"
[Accept-Ranges] => bytes
[Content-Length] => 438
[Connection] => close
[Content-Type] => text/html
)
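get_headers retrieves the response headers sent by a remote server. Applying a regular expression to the first element of the returned array (the status line) tells you whether the remote URL is a normal 200 page.
2. Use curl with the CURLOPT_NOBODY option to fetch only the headers
The curl extension is very handy: one of its options lets you fetch only the response headers of a remote page.
In the code below, the key lines are the CURLOPT_HEADER and CURLOPT_NOBODY options: they tell curl to include the headers in what it returns and to leave out the body.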
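The status-line check described above could be sketched roughly as follows. The helper names status_code_from_line and url_is_ok are hypothetical (they are not part of PHP or of the original article), and the regular expression is just one reasonable way to match a status line.

```php
<?php
// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.1 200 OK". Returns null when the line does not match.
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: true only when the first response line is a 200.
function url_is_ok(string $url): bool
{
    $headers = @get_headers($url); // get_headers() returns false on failure
    if ($headers === false) {
        return false;
    }
    return status_code_from_line($headers[0]) === 200;
}

// The parsing step can be tried without any network access:
var_dump(status_code_from_line('HTTP/1.1 200 OK'));        // int(200)
var_dump(status_code_from_line('HTTP/1.1 404 Not Found')); // int(404)
```

Note that element 0 is the status line of the first response. If the server redirects, that line will typically be a 3xx status rather than the final 200, so a stricter check may need to look at later status lines as well.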
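// Fetch only the response headers of $url.
// Returns the raw header text, or false if the request fails.
function get_header($url){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, true);          // include the headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, true);          // do not download the body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the result instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);     // set Referer automatically when redirected
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);           // give up after 30 seconds
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'Accept: */*',
        'User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)',
        'Connection: Keep-Alive'));
    $header = curl_exec($ch);
    return $header;
}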
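Because CURLOPT_FOLLOWLOCATION is enabled, the string returned by get_header() may contain one header block per response in the redirect chain. A minimal sketch of pulling the final status code out of that string is shown below; last_status_code is a hypothetical helper name and the sample header text is made up for illustration.

```php
<?php
// Hypothetical helper: given raw header text such as get_header() returns,
// find every status line and return the code of the last one (or null).
function last_status_code(string $rawHeaders): ?int
{
    // The "m" flag lets ^ match at the start of each line.
    if (preg_match_all('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#m', $rawHeaders, $m) > 0) {
        return (int) end($m[1]);
    }
    return null;
}

// Illustrative input: a redirect followed by the final response.
$sample = "HTTP/1.1 301 Moved Permanently\r\nLocation: http://www.example.com/\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";

var_dump(last_status_code($sample)); // int(200)
```

In real use you would pass it the result of get_header($url) after checking that the result is not false. If you have access to the curl handle, curl_getinfo($ch, CURLINFO_HTTP_CODE) is another way to read the final status code without parsing the text.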