1. Example: get all links on the Baidu home page (selected with the CSS selector a[href]):
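import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.os.Bundle;
import android.widget.TextView;

// Rename Activity01 to your own Activity.
public class Activity01 extends Activity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        TextView tv = new TextView(this);
        String myString = null;
        // Be sure to create the buffer with new; I forgot at first and got no output.
        StringBuffer sff = new StringBuffer();
        try {
            // Note: on Android 3.0 and later, network calls on the UI thread throw
            // NetworkOnMainThreadException, so run this on a background thread there.
            Document doc = Jsoup.connect("http://www.baidu.com").get();
            // Note that select() returns Elements, not Element. Likewise,
            // getElementById returns an Element and getElementsByClass returns Elements.
            Elements links = doc.select("a[href]");
            for (Element link : links) {
                // "abs:href" returns the link as an absolute URL.
                sff.append(link.attr("abs:href")).append(" ").append(link.text()).append(" ");
            }
            myString = sff.toString();
        } catch (Exception e) {
            myString = e.getMessage();
            e.printStackTrace();
        }
        /* Put the result into the TextView */
        tv.setText(myString);
        /* Show the TextView on screen */
        this.setContentView(tv);
    }
}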
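The abs:href attribute key used in the loop asks Jsoup for the link resolved against the URL of the page. As a rough illustration only, the sketch below does a similar resolution with java.net.URI from the standard library. The class name AbsUrlSketch, the helper resolve and the sample paths are made up for this example, and Jsoup's own resolution may differ in edge cases.

```java
import java.net.URI;

class AbsUrlSketch {
    // Resolve a possibly relative href against the URL of the page it came from.
    // Hypothetical helper, not part of Jsoup.
    static String resolve(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.baidu.com/";
        // A site-relative link is turned into an absolute URL.
        System.out.println(resolve(base, "/more/"));
        // A link that is already absolute is returned unchanged.
        System.out.println(resolve(base, "http://news.baidu.com/"));
    }
}
```

This is why the example appends link.attr("abs:href") and not link.attr("href"): the plain attribute may be a relative path that is of little use once it is shown outside the page.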
2. Get the news headlines with class topnews from news.cqu.edu.cn.
package huxiaoan.cqu.praseHtml;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.os.Bundle;
import android.widget.TextView;

public class HtmlActivity extends Activity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        TextView tv = (TextView) findViewById(R.id.out);
        String myString = "";
        try {
            Document doc = Jsoup.connect("http://news.cqu.edu.cn").get();
            // getElementsByClass returns Elements: every element with class "topnews".
            Elements topnews = doc.getElementsByClass("topnews");
            // select on Elements also returns Elements: the links inside those elements.
            Elements links = topnews.select("a[href]");
            for (Element link : links) {
                myString += link.text();
                myString += "\n";
            }
        } catch (Exception e) {
            myString = e.getMessage();
            e.printStackTrace();
        }
        /* Put the result into the TextView */
        tv.setText(myString);
    }
}
3. Use a session to fetch several pages in a row, i.e. keep the session alive.
package huxiaoan.cqu.praseHtml;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.os.Bundle;
import android.widget.TextView;

public class HtmlActivity extends Activity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        TextView tv = (TextView) findViewById(R.id.out);
        String myString = "";
        String sessionid = "";
        try {
            // Log in and read the session id from the response cookies.
            Connection con = Jsoup.connect("http://www.jwc.cqu.edu.cn/login.asp")
                    .data("username", "000")
                    .data("password", "000");
            con.post();
            sessionid = con.response().cookie("ASPSESSIONIDCCSTRTQS");
            // Query the timetable. Sending the session cookie back keeps the
            // session alive, so further requests can follow.
            Connection con_query = Jsoup
                    .connect("http://www.jwc.cqu.edu.cn/PlanAndCurriculum/cour_tab_sel_stud.ASP")
                    .cookie("ASPSESSIONIDCCSTRTQS", sessionid);
            // Read the content: the text of every <b> element.
            Document doc = con_query.get();
            Elements fonts = doc.getElementsByTag("b");
            for (Element font : fonts) {
                myString += font.text();
            }
        } catch (Exception e) {
            myString = e.getMessage();
            e.printStackTrace();
        }
        /* Put the result into the TextView */
        tv.setText(myString);
    }
}
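I tested this example countless times, and it often failed to read the session value, which cost me a lot of time.
After searching various English-language sites I found a workaround, though I can't say whether problems will show up again later. The workaround is to read all of the cookies from the login response and send every one of them with the next request.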
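// Needs these imports in addition to the ones above:
// org.jsoup.Connection.Method, java.util.Map, java.util.Map.Entry
Connection.Response res = Jsoup.connect("http://www.jwc.cqu.edu.cn/login.asp")
        .data("username", "000", "password", "000")
        .method(Method.POST)
        .execute();
// All cookies set by the login response, whatever their names are.
Map<String, String> cookies = res.cookies();
// Optional: parse the login response if you need the page itself.
Document doc1 = res.parse();
Connection connection = Jsoup.connect("http://www.jwc.cqu.edu.cn/PlanAndCurriculum/cour_tab_sel_stud.ASP");
// Copy every cookie onto the next request.
for (Entry<String, String> cookie : cookies.entrySet()) {
    connection.cookie(cookie.getKey(), cookie.getValue());
}
Document doc = connection.get();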
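A possible explanation for the missing session value, offered as an assumption and not something confirmed by the tests described here, is that the cookie is not always named exactly ASPSESSIONIDCCSTRTQS: classic ASP generates the letters after ASPSESSIONID, so the name can change when the server application restarts. If only the session cookie is wanted, a minimal sketch of looking it up by prefix could look like this. The class SessionCookieSketch, the helper findByPrefix and the sample cookie values are hypothetical; the map stands in for the result of res.cookies().

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class SessionCookieSketch {
    // Return the first cookie whose name starts with the given prefix, if any.
    // Hypothetical helper, not part of Jsoup.
    static Optional<Map.Entry<String, String>> findByPrefix(
            Map<String, String> cookies, String prefix) {
        return cookies.entrySet().stream()
                .filter(e -> e.getKey().startsWith(prefix))
                .findFirst();
    }

    public static void main(String[] args) {
        // Made-up cookies standing in for res.cookies() from the login response.
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("lang", "zh");
        cookies.put("ASPSESSIONIDAABBCCDD", "EXAMPLEVALUE");

        // Print the session cookie as name=value when one is present.
        findByPrefix(cookies, "ASPSESSIONID")
                .ifPresent(c -> System.out.println(c.getKey() + "=" + c.getValue()));
    }
}
```

Forwarding every cookie, as in the workaround, is simpler and does not depend on knowing the name at all; the prefix lookup is only useful when the session id itself has to be stored or logged.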