Discussion:
urllib2 httplib.BadStatusLine exception while opening a page on an Oracle HTTP Server
ak
2009-01-19 21:00:44 UTC
Hi everyone,

I have a problem with urllib2 on this particular URL, hosted on an
Oracle HTTP Server:

http://www.orange.sk/eshop/sk/portal/catalog.html?type=post&subtype=phone&null

It gets 302-redirected to https://www.orange.sk/eshop/sk/catalog/post/phones.html
after a cookie is set through the Set-Cookie header field in the 302
reply. This works fine with Firefox.

However, with urllib2 and the following code snippet, it doesn't work:


--------
import cookielib
import urllib2

cookiejar = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
# The URL has to be one unbroken string literal.
url = 'http://www.orange.sk/eshop/sk/portal/catalog.html?type=post&subtype=phone&null'
req = urllib2.Request(url, None)
s = opener.open(req)
--------

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib2.py", line 387, in open
    response = meth(req, response)
  File "/usr/lib/python2.5/urllib2.py", line 498, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.5/urllib2.py", line 419, in error
    result = self._call_chain(*args)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 582, in http_error_302
    return self.parent.open(new)
  File "/usr/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/usr/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 1115, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.5/urllib2.py", line 1080, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 349, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine

Trying the redirected URL directly doesn't work either. Firefox
returns an HTML error page, since the cookie is not set yet, but
urllib2 raises the same exception as before, whereas it should
return the HTML error page.
This works correctly on other URLs on this website (http(s)://www.orange.sk).

Am I doing anything wrong, or is this a bug in urllib2?

-- ak
ak
2009-01-19 21:48:25 UTC
Actually, I was wrong on the last point of my previous message: this
does *not* work on https://www.orange.sk (but does on http://www.orange.sk).
IMHO, this means either urllib2 or the server misimplemented HTTPS.
>>> opener.open(urllib2.Request('http://www.orange.sk/', None, headers))
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Mon, 19 Jan 2009 21:44:03 GMT
header: Server: Oracle-Application-Server-10g/10.1.3.1.0 Oracle-HTTP-Server
header: Set-Cookie: JSESSIONID=0a19055a30d630c427bda71d4e26a37ca604b9f590dc.e3eNaNiRah4Pe3aSch8Sc3yOc40; path=/web
header: Expires: Mon, 19 Jan 2009 21:44:13 GMT
header: Surrogate-Control: max-age="10"
header: Content-Type: text/html; charset=ISO-8859-2
header: X-Cache: MISS from www.orange.sk
header: Connection: close
header: Transfer-Encoding: chunked
<addinfourl at 137417292 whose fp = <socket._fileobject object at 0x831348c>>
>>> opener.open(urllib2.Request('https://www.orange.sk/', None, headers))
reply: ''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/usr/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 1115, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.5/urllib2.py", line 1080, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 349, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine

As you can see the reply from the server seems empty (which results in
the BadStatusLine exception)
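
The link between an empty reply and BadStatusLine can be reproduced without the real server. The following is only a sketch, written for Python 3 (where httplib is named http.client), not the Python 2.5 used in this thread. The function name fetch_from_silent_server and the local test server are assumptions made for illustration: the server accepts a connection, reads the request, and closes the socket without sending a status line.

```python
import http.client
import socket
import threading


def fetch_from_silent_server():
    """Query a local server that never replies; return the exception raised."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    listener.settimeout(5)
    port = listener.getsockname()[1]

    def accept_and_close():
        try:
            conn, _ = listener.accept()
        except OSError:
            return
        conn.settimeout(5)
        try:
            # Read the whole request head, then close without answering.
            data = b""
            while b"\r\n\r\n" not in data:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                data += chunk
        except OSError:
            pass
        finally:
            conn.close()

    worker = threading.Thread(target=accept_and_close)
    worker.start()
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/")
        client.getresponse()
    except http.client.BadStatusLine as exc:
        # The status line read from the socket was empty.
        return exc
    finally:
        client.close()
        worker.join()
        listener.close()
    return None


if __name__ == "__main__":
    print(repr(fetch_from_silent_server()))
```

In Python 3 the exception raised here is http.client.RemoteDisconnected, which is a subclass of BadStatusLine, so the same except clause catches both.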

Any help greatly appreciated.

-- ak
O Peng
2009-02-19 12:57:46 UTC
I'm running into a similar problem with the BadStatusLine.
The relevant source code in httplib.py is as follows:

class HTTPResponse:
    ...
    def _read_status(self):
        line = self.fp.readline()
        ...
        if not line:
            # Presumably, the server closed the connection before
            # sending a valid response.
            raise BadStatusLine(line)

However, when I ran the following right before the
'raise BadStatusLine(line)':

    restOfResponse = self.fp.read()
    print restOfResponse

restOfResponse was NOT empty. In fact, when I run self.fp.read() at
the beginning of the begin() function, it is not empty at all.
This leads me to believe there is a bug in the self.fp.readline()
(socket._fileobject.readline()) function. For me it only fails
sometimes.

This behavior is only observed on Windows, Python 2.5. Running it on
Mac OS X, Python 2.5 yielded no problems.
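
The same kind of check can be made from outside httplib by reading the raw bytes off the socket before any parsing happens. This is a minimal sketch in Python 3, not the code used above. The helper names read_raw_reply and first_line are hypothetical, and the caller is assumed to pass an already-connected socket.

```python
import socket


def read_raw_reply(sock, host, path="/", limit=4096):
    """Send a minimal GET over a connected socket and return the raw reply bytes."""
    request = "GET {0} HTTP/1.0\r\nHost: {1}\r\n\r\n".format(path, host)
    sock.sendall(request.encode("ascii"))
    chunks = []
    received = 0
    # Keep reading until the peer closes or the byte limit is reached.
    while received < limit:
        chunk = sock.recv(limit - received)
        if not chunk:
            break
        chunks.append(chunk)
        received += len(chunk)
    return b"".join(chunks)


def first_line(raw):
    """Return the bytes before the first CRLF (the would-be status line)."""
    return raw.split(b"\r\n", 1)[0]
```

For a plain HTTP server the socket can come from socket.create_connection((host, 80)). For HTTPS it would first have to be wrapped, for example with ssl.create_default_context().wrap_socket(sock, server_hostname=host). If first_line() returns an empty value here as well, the server really did send nothing. If it returns a status line, the problem is more likely in the reading code.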
ak
2009-03-01 21:22:18 UTC
Which website have you tested it on?
My tests were basically on https://www.orange.sk and http://www.orange.sk.
The first fails and the second does not, which led me to think
there's a bug in Python's SSL implementation for this particular web
server (Oracle), with Python 2.5.
Steven D'Aprano
2009-01-20 00:14:28 UTC
Post by ak
I have a problem with urllib2 on this particular url, hosted on an
Oracle HTTP Server
http://www.orange.sk/eshop/sk/portal/catalog.html?type=post&subtype=phone&null
which gets 302 redirected to
https://www.orange.sk/eshop/sk/catalog/post/phones.html, after setting a
cookie through the Set-Cookie header field in the 302 reply. This works
fine with firefox.
However, with urllib2 and the following code snippet, it doesn't work

Looking at the BadStatusLine exception raised, the server response line
is empty. Looking at the source for httplib suggests to me that the
server closed the connection early. Perhaps it doesn't like connections
from urllib2?

I ran a test pretending to be IE using this code:

import cookielib
import httplib
import urllib2

cookiejar = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
url = 'http://www.orange.sk/eshop/sk/portal/catalog.html?' \
    'type=post&subtype=phone&null'
agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; " \
    "NeosBrowser; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
headers = {'User-Agent': agent}
req = urllib2.Request(url, data=None, headers=headers)
try:
    s = opener.open(req)
except httplib.BadStatusLine, e:
    # e.line is the status line httplib actually read
    print e, e.line
else:
    print "Success"



but it failed. So the problem is not as simple as changing the user-agent
string.

Other than that, I'm stumped.
--
Steven
ak
2009-01-21 19:43:27 UTC
Thanks a lot for confirming this. I also tried with different headers,
including *exactly* the same headers as Firefox (including
Connection: keep-alive, by modifying httplib); it still doesn't work.
The only possible explanation I can see is that Python's httplib doesn't
handle SSL/TLS 'properly' (not necessarily in the sense of the TLS
spec, but in the sense that every browser can connect properly
to this website and httplib can't).

If anyone knows of another Oracle HTTPS server on which to confirm
this, it would be nice...
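
One way to probe the SSL/TLS hypothesis would be to pin the client to a single protocol version and see which versions, if any, get a reply. Nothing in this thread confirms that the protocol version is the cause, so treat this purely as a diagnostic sketch. It is written for Python 3.7 or later (ssl.TLSVersion does not exist in Python 2.5), and make_pinned_context and try_version are hypothetical helper names.

```python
import http.client
import ssl


def make_pinned_context(version):
    """Return a client context that only negotiates the given TLS version."""
    context = ssl.create_default_context()
    context.minimum_version = version
    context.maximum_version = version
    return context


def try_version(host, version, timeout=10):
    """Attempt one HTTPS GET with a pinned version; return the status code or the error."""
    conn = http.client.HTTPSConnection(
        host, timeout=timeout, context=make_pinned_context(version)
    )
    try:
        conn.request("GET", "/")
        return conn.getresponse().status
    except (ssl.SSLError, http.client.HTTPException, OSError) as exc:
        # Return the failure rather than raising, so versions can be compared.
        return exc
    finally:
        conn.close()
```

Calling try_version('www.orange.sk', ssl.TLSVersion.TLSv1_2) and repeating with other members of ssl.TLSVersion would show whether the outcome depends on the negotiated version. Which versions are actually usable depends on the local OpenSSL build.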
Ahmed, Shakir
2009-01-21 20:13:26 UTC
I am grabbing a few fields from a table, and one of the columns is in date
format. The output I am getting is "Wed Feb 09 00:00:00 2005", but
the data in that column is "02/09/2005", and I need the output in that
same format to insert those records into another table.

print my_service_DATE
Wed Feb 09 00:00:00 2005

Any help is highly appreciated.

sk
Tim Chase
2009-01-21 20:42:28 UTC
If you are getting actual date/datetime objects, just use the
strftime() method to format them as you desire.

If you're getting back a *string*, then you should use
time.strptime() to parse the string into a time-object, and then
use the constituent parts to reformat as you see fit.
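
Both cases can be sketched as follows, using the sample value from the question. The function names reformat_date_string and reformat_date_object are made up for illustration, and the input format string is an assumption based on the one sample shown.

```python
import datetime
import time


def reformat_date_string(raw):
    """Parse a string like 'Wed Feb 09 00:00:00 2005' and return 'MM/DD/YYYY'."""
    parsed = time.strptime(raw, "%a %b %d %H:%M:%S %Y")
    return time.strftime("%m/%d/%Y", parsed)


def reformat_date_object(value):
    """Format a date or datetime object as 'MM/DD/YYYY'."""
    return value.strftime("%m/%d/%Y")


print(reformat_date_string("Wed Feb 09 00:00:00 2005"))     # 02/09/2005
print(reformat_date_object(datetime.datetime(2005, 2, 9)))  # 02/09/2005
```

The %a and %b directives depend on the current locale, so the string version assumes English day and month names.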

-tkc