PHP: Converting curl_exec output to UTF-8
Published: 2020-12-13 16:39:09 | Category: PHP tutorials | Source: compiled from the web
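I only want to work with UTF-8. The problem is that I don't know the character set of each web page in advance. How can I detect it and convert the page to UTF-8?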
<?php
$url = "http://vkontakte.ru";
$ch = curl_init($url);
$options = array(
    CURLOPT_RETURNTRANSFER => true,
);
curl_setopt_array($ch,$options);
$data = curl_exec($ch);
// $data = magic($data);
print $data;
See: http://paulisageek.com/tmp/curl-utf8
What should magic() be?
Following the suggestions from Gumbo and Pekka, I wrote curl_exec_utf8:
/** The same as curl_exec except tries its best to convert the output to utf8 **/
function curl_exec_utf8($ch) {
    $data = curl_exec($ch);
    if (!is_string($data)) return $data;
    unset($charset);
    /* Cast to string: curl_getinfo() may return null when no Content-Type was sent */
    $content_type = (string) curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    /* 1: HTTP Content-Type: header */
    preg_match('@([\w/+]+)(;\s*charset=(\S+))?@i', $content_type, $matches);
    if (isset($matches[3]))
        $charset = $matches[3];
    /* 2: <meta> element in the page */
    if (!isset($charset)) {
        preg_match('@<meta\s+http-equiv="Content-Type"\s+content="([\w/]+)(;\s*charset=([^\s"]+))?@i', $data, $matches);
        if (isset($matches[3]))
            $charset = $matches[3];
    }
    /* 3: <xml> element in the page */
    if (!isset($charset)) {
        preg_match('@<\?xml.+encoding="([^\s"]+)@si', $data, $matches);
        if (isset($matches[1]))
            $charset = $matches[1];
    }
    /* 4: PHP's heuristic detection */
    if (!isset($charset)) {
        $encoding = mb_detect_encoding($data);
        if ($encoding)
            $charset = $encoding;
    }
    /* 5: Default for HTML */
    if (!isset($charset)) {
        /* strpos, not strstr: we want the position of the match, which must be 0 */
        if (strpos($content_type, "text/html") === 0)
            $charset = "ISO-8859-1";
    }
    /* Convert it if it is anything but UTF-8 */
    /* You can change "UTF-8" to "UTF-8//IGNORE" to
       ignore conversion errors and still output something reasonable */
    if (isset($charset) && strtoupper($charset) != "UTF-8")
        $data = iconv($charset, 'UTF-8', $data);
    return $data;
}
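Step 4 above calls mb_detect_encoding with its default candidate list, which on many setups only distinguishes ASCII from UTF-8. As a rough sketch of a stricter fallback (the function name guess_and_convert_to_utf8 and the default candidate list are assumptions for illustration, not part of the original answer), one can first check whether the data is already valid UTF-8 and only then try an explicit list of legacy encodings in strict mode:

```php
<?php
// Hypothetical helper, not part of the original answer.
// $candidates is an assumed list of legacy encodings; adjust it (and its
// order) to the sites you expect to fetch. Requires the mbstring extension.
function guess_and_convert_to_utf8(string $data, array $candidates = ['Windows-1251', 'ISO-8859-1']): string
{
    // Valid UTF-8 (which includes plain ASCII) is returned untouched.
    if (mb_check_encoding($data, 'UTF-8')) {
        return $data;
    }
    // Strict mode rejects a candidate when $data contains bytes that are
    // invalid in that encoding.
    $encoding = mb_detect_encoding($data, $candidates, true);
    if ($encoding === false) {
        return $data; // nothing matched: leave the bytes as they are
    }
    return mb_convert_encoding($data, 'UTF-8', $encoding);
}
```

Single-byte encodings such as Windows-1251 and ISO-8859-1 accept almost any byte sequence, so this remains a guess; a charset declared by the server or the page is more reliable whenever one is present.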
Most of the regular expressions come from http://nadeausoftware.com/articles/2007/06/php_tip_how_get_web_page_content_type (Editor: 李大同)
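To try the regular expressions without making a network request, steps 1 to 3 can be pulled out into a pure function that takes the Content-Type header value and the response body. The sketch below is one possible way to do that; the name declared_charset is hypothetical, and the header pattern is slightly stricter than the one in curl_exec_utf8 (it also tolerates a quoted charset value, which is an added assumption).

```php
<?php
// Hypothetical helper (not in the original answer): returns the charset
// declared by the server or by the document itself, or null if none is found.
function declared_charset(?string $contentType, string $body): ?string
{
    // 1: charset parameter of the HTTP Content-Type header
    if ($contentType !== null
        && preg_match('@[\w/+]+;\s*charset=["\']?([^\s"\';]+)@i', $contentType, $m)) {
        return $m[1];
    }
    // 2: meta http-equiv element in the page
    if (preg_match('@<meta\s+http-equiv="Content-Type"\s+content="[\w/]+;\s*charset=([^\s"]+)@i', $body, $m)) {
        return $m[1];
    }
    // 3: encoding attribute of an XML declaration
    if (preg_match('@<\?xml.+?encoding="([^\s"]+)@si', $body, $m)) {
        return $m[1];
    }
    return null;
}
```

With a cURL handle, this could be called as declared_charset(curl_getinfo($ch, CURLINFO_CONTENT_TYPE) ?: null, $data). The HTML5 short form (a meta element with only a charset attribute) is not covered by these patterns.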
