ximalaya

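This is openkava's blog, covering all things software development. Posts marked ZZ are reposts; if there is any copyright concern, please contact me and I will remove them. Topics covered here include mobile development (Android, iOS, Windows Phone), enterprise ERP development, Python scripting, OpenGL ES 3D, game development, HTML5, JavaScript, MySQL, Amazon EC2, Google GAE, and Google Cloud SQL. Site history: in 2010 I moved all my blogs here, onto space rented from GoDaddy, to record notes from life and work. About me (included mostly to pad the word count): I work in software development, with 6 years of development experience and 3 years of management experience. Technologies I use at work: C#, SQL Server. Technologies I use personally: Python, PHP, CSS, Java, Android, Objective-C, and others. To contact me, send an email to the address shown in this image: http://blog.openkava.com/openkava@gmail.png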

Scraping web pages with nokogiri in Ruby

Nov 10, 2012


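The nokogiri gem is really handy. Combined with spidr, it makes scraping web pages or images easy.

spidr itself uses nokogiri.

So when you need more flexibility, it is still better to use nokogiri directly.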

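require 'open-uri'
require 'nokogiri'

weburl = 'http://slide.eladies.sina.com.cn/fa/slide_3_22147_9430.html#p=17'

begin
  # Pass the encoding explicitly; without it the GB2312 page comes out garbled.
  # URI.open replaces the bare open(url) call, which no longer fetches URLs in Ruby 3.
  doc = Nokogiri::HTML.parse(URI.open(weburl), nil, 'gb2312')

  # Print the text of every <dd> nested inside a <dl>.
  doc.css('dl dd').each do |link|
    puts link.content
  end
rescue StandardError => e
  # Network failures and HTTP errors end up here.
  puts "error: #{e.message}"
end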
Tutorial for further study:

http://ruby.bastardsbook.com/chapters/html-parsing/
