Binary gibberish followed by weird HTML headers in index.html, lectures.html #128

Open
danmbox opened this issue Mar 20, 2014 · 2 comments

danmbox commented Mar 20, 2014

less ./randomness-001/lectures.html

<binary gibberish>
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Type: text/html
Date: Thu, 20 Mar 2014 15:06:59 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Location: https://class.coursera.org/randomness-001/lecture
Pragma: no-cache
Server: nginx
X-Frame-Options: SAMEORIGIN
X-Powered-By: PHP/5.3.10-1ubuntu3.10
Content-Length: 130
Connection: keep-alive

Redirecting to <a href="https://class.coursera.org/randomness-001/lecture">https://class.coursera.org/randomness-001/l

danmbox commented Apr 5, 2014

Additional notes: This bug has persisted, so it's probably not due to the server. I have no problem saving a .../lecture/ page from the browser to an HTML file, so I don't understand why the script downloads gibberish. index.html gets corrupted in the same way, about as often.
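
One way to narrow this down is to look at the raw bytes of a corrupted file. The sketch below is only a diagnostic aid, not part of coursera-dl. It assumes (unconfirmed) that the "gibberish" might be a gzip-compressed body that was written to disk without being decoded. The function name `inspect_saved_page` and the list of header names it searches for are hypothetical choices for illustration.

```python
import re
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream
# A few header names taken from the dump above; the list is illustrative only.
HEADER_LINE = re.compile(
    rb"^(Cache-Control|Content-Type|Location|Server): ", re.MULTILINE
)


def inspect_saved_page(data):
    """Return hints about what the bytes of a saved page contain."""
    report = {
        "starts_with_gzip_magic": data.startswith(GZIP_MAGIC),
        "contains_http_headers": bool(HEADER_LINE.search(data)),
        "decompressed_prefix": None,
        "trailing_bytes": 0,
    }
    if report["starts_with_gzip_magic"]:
        # wbits = 16 + MAX_WBITS tells zlib to expect a gzip wrapper.
        # decompressobj stops at the end of the gzip stream and keeps
        # whatever follows it in unused_data.
        decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
        try:
            body = decoder.decompress(data)
        except zlib.error:
            return report  # gzip magic present, but the stream is not valid
        report["decompressed_prefix"] = body[:80]
        report["trailing_bytes"] = len(decoder.unused_data)
    return report
```

To try it on a corrupted download, read the file in binary mode, for example `inspect_saved_page(open("randomness-001/lectures.html", "rb").read())`. If `starts_with_gzip_magic` is false, the gzip assumption does not hold and the gibberish has some other origin.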

@dgorissen
Owner

Thanks Dan. Unfortunately I do not have the time to really follow up on
this; any help is welcome. The way a browser interacts with a page is quite
different from the way requests does. coursera-dl used to be based on the
mechanize lib, which seemed more stable in this respect. You can try the
mechanize branch on GitHub (it should still work, I believe) and see if that
helps.
