Java Network Programming: Web Crawler + Browser Simulation
Published: 2020-12-15 07:58:39  Category: Java  Source: compiled from the web
Web crawler + browser simulation (fetching resources from access-restricted sites):
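A crawler works in four steps: get the URL, download the resource, analyze it, and process it. The first example covers the first two steps and prints the downloaded page line by line.

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;

public class http {
    public static void main(String[] args) throws Exception {
        // https (http + s) is more secure
        // URL.openStream() opens a connection to the URL and returns an
        // InputStream for reading data from that connection
        // Get the URL
        URL url = new URL("https://www.jd.com");
        // Download the resource
        InputStream is = url.openStream();
        // try-with-resources closes the reader (and the stream under it) even if reading fails
        try (BufferedReader br = new BufferedReader(new InputStreamReader(is, "UTF-8"))) {
            String msg = null;
            while ((msg = br.readLine()) != null) {
                System.out.println(msg);
            }
        }
    }
}

Fetching resources from access-restricted sites:

Some sites reject requests that do not look like they come from a browser. The second version of the class opens an HttpURLConnection and sets a browser User-Agent header before reading the response.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class http {
    public static void main(String[] args) throws Exception {
        // openConnection() returns a URLConnection instance that represents
        // a connection to the remote object referred to by the URL
        // Subclasses of URLConnection include HttpURLConnection and JarURLConnection
        URL url = new URL("https://www.jd.com");
        // Download the resource
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET"); // simulate a browser's GET request
        conn.setRequestProperty(
                "User-Agent",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763");
        try (BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String msg = null;
            while ((msg = br.readLine()) != null) {
                System.out.println(msg);
            }
        }
    }
}

(Editor: 李大同)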
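The article lists "analyze" and "process" as crawler steps but only shows code for getting the URL and downloading. As a rough illustration of the analysis step, the sketch below pulls link targets out of an HTML string with a regular expression. The class name `LinkExtractor`, the method `extractLinks`, and the sample HTML are hypothetical and not part of the original code; the pattern is deliberately simplified and will miss unquoted attributes or tags split across lines, so a real crawler would more likely use a proper HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {
    // Simplified pattern (an assumption for illustration): matches a quoted
    // href value inside an <a ...> tag. Group 1 is the quote character,
    // group 2 is the link target.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s+[^>]*?href\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the given HTML, in document order.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    public static void main(String[] args) {
        // Hypothetical sample input; in a crawler this would be the text
        // read from the connection's input stream.
        String html = "<p><a href=\"https://www.jd.com/a\">A</a> "
                + "<a class='x' href='/b'>B</a></p>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

To combine this with the download code, the lines read from the BufferedReader could be appended to a StringBuilder and the resulting string passed to `extractLinks`; relative targets such as `/b` would still need to be resolved against the page URL before they can be fetched.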