
Ruby: Why do I get "wrong status line" errors from Nokogiri?

01-05 Technical Articles

My Ruby/Nokogiri script is:


require 'rubygems'
require 'nokogiri'
require 'open-uri'

# Output file for the scraped quotes.
f = File.new("enterret" + ".txt", 'w')

1.upto(100) do |page|
  # Build the URL for the current page.
  urltext = "http://xxxxxxx.com/" + "page/"
  urltext << page.to_s + "/"
  # open comes from open-uri and performs the HTTP request.
  doc = Nokogiri::HTML(open(urltext))
  doc.css(".photoPost").each do |post|
    quote = post.css("h1 + p").text
    author = post.css("h1 + p + p").text
    f.puts "#{quote}" + "#{author}"
    f.puts "--------------------------------------------------------"
  end
end

f.close

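Running this script produces the following error:

http.rb:2030:in `read_status_line': wrong status line: "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"" (Net::HTTPBadResponse)

My script still writes to the file correctly, but this error keeps appearing. What does the error mean?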

Take a look at this discussion: stackoverflow.com/questions/1910816/…
Instead of the rigamarole you use to create urltext, try: urltext = "http://xxxxxxx.com/page/#{ page }/". It's more idiomatic Ruby.
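
As a small sketch of the suggestion above, the snippet below builds the URL both ways and checks that the results match. The helper name page_url is made up for illustration, and the host is the placeholder used in the question.

```ruby
# Hypothetical helper: builds a page URL with string interpolation.
def page_url(page)
  "http://xxxxxxx.com/page/#{page}/"
end

# The concatenation style used in the question, for comparison.
concatenated = "http://xxxxxxx.com/" + "page/"
concatenated << 3.to_s + "/"

puts page_url(3)
puts concatenated == page_url(3)
```

Both forms produce the same string; interpolation just avoids the intermediate concatenation and the explicit to_s call.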

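It's hard to say for sure without knowing which site you're accessing, but I suspect the problem isn't with Nokogiri.

The error is being reported by http.rb, which is most likely complaining about the headers returned by the HTTPd. http.rb is concerned with the handshake with the HTTPd server and will complain about missing or malformed headers, but it doesn't care about the payload.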
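
To illustrate the kind of check involved, here is a minimal sketch of status-line validation. It is an approximation written for this example, not the code in http.rb: the pattern STATUS_LINE and the helper parse_status_line are assumptions. An HTTP response is expected to begin with a line such as "HTTP/1.1 200 OK", so a first line containing a DOCTYPE fails the check.

```ruby
# Simplified pattern for an HTTP status line such as "HTTP/1.1 200 OK".
# Illustrative only; the real pattern in http.rb may differ.
STATUS_LINE = /\AHTTP\/(\d+\.\d+)\s+(\d{3})\s*(.*)\z/

# Hypothetical helper: returns the parts of a status line or raises.
def parse_status_line(line)
  match = STATUS_LINE.match(line.chomp)
  raise ArgumentError, "wrong status line: #{line.dump}" unless match
  { version: match[1], code: match[2], message: match[3] }
end

p parse_status_line("HTTP/1.1 200 OK\r\n")

begin
  parse_status_line('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"')
rescue ArgumentError => e
  puts e.message
end
```

The second call fails because the first line the client read was part of the HTML payload rather than a status line, which matches the shape of the message in the question.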

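Nokogiri, on the other hand, cares about the payload, i.e. the HTML. The DOCTYPE should be part of the HTML payload, so I suspect their server is sending the HTML DOCTYPE where the HTTP status line and headers, including the MIME type (which should be "text/html"), are expected.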

In the Ruby 1.8.7 http.rb file, you'll see the following lines at 2030 in the code:


def response_class(code)
  CODE_TO_OBJ[code] or
  CODE_CLASS_TO_OBJ[code[0,1]] or
  HTTPUnknownResponse
end

That looks like a likely place to be generating the sort of message you're seeing.
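
Since the exception comes from the HTTP layer rather than from Nokogiri, one possible way to keep a long scraping loop alive is to rescue Net::HTTPBadResponse around the fetch and retry a few times. The sketch below is only an assumption about how that could look: fetch_with_retry and its attempts and wait parameters are hypothetical names, and the actual fetch is passed in as a block.

```ruby
require 'net/http'

# Hypothetical helper: yields the URL to the block and retries when the
# HTTP layer reports a malformed response. Returns nil after the last
# failed attempt.
def fetch_with_retry(url, attempts: 3, wait: 0)
  tries = 0
  begin
    tries += 1
    yield url
  rescue Net::HTTPBadResponse => e
    if tries < attempts
      sleep wait   # give the server a moment before trying again
      retry
    end
    warn "giving up on #{url}: #{e.message}"
    nil
  end
end

# Possible use inside the loop from the question (not run here):
#   html = fetch_with_retry(urltext, wait: 5) { |u| open(u).read }
#   next if html.nil?
#   doc = Nokogiri::HTML(html)
```

Returning nil instead of re-raising lets the caller skip a bad page and move on to the next one.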

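Thanks for the information, but I stepped away from my computer for a while and then re-ran the script, and it worked perfectly without my changing a single line. Interesting?
That just means the server stopped sending the bad response, i.e. they fixed it.
I agree, they probably fixed the server. Sending an HTML doctype in place of the MIME type is a serious error that can really confuse any code that trusts the MIME type. Browsers usually don't trust that information because developers learned long ago not to; they can scan the output to see whether it is text, then look for tags to see whether it is HTML. Most applications that talk to a server don't do that, though, and assume the MIME type of the response is correct. I've written a lot of spiders over the years and learned this the hard way, so in a spider I check what I actually got.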
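
A rough sketch of that kind of check is shown below: it accepts a response as HTML only when the declared MIME type and a quick look at the body agree. The helper name looks_like_html? and the 1024-byte sniffing window are assumptions chosen for this example, not something taken from the thread.

```ruby
# Hypothetical helper: true only if the declared content type says HTML
# and the start of the body looks like HTML as well.
def looks_like_html?(content_type, body)
  declared = content_type.to_s.downcase
  declared_html = declared.start_with?("text/html", "application/xhtml+xml")

  # Inspect a binary copy of the first bytes so odd encodings cannot raise.
  head = body.to_s.b[0, 1024].to_s.lstrip.downcase
  sniffed_html = head.start_with?("<!doctype html") || head.include?("<html")

  declared_html && sniffed_html
end

puts looks_like_html?("text/html; charset=utf-8", "<!DOCTYPE html><html></html>")
puts looks_like_html?("text/html", "just some plain text")
```

Requiring both signals is a conservative choice; a spider that prefers to salvage mislabeled pages could rely on the sniffed result alone.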

Source: https://www.codenong.com/8269904/
