"Extract full content" feature does not work · Issue #6 · angristan/feedbin-docker · GitHub

"Extract full content" feature does not work #6

Closed
ldexterldesign opened this issue Mar 28, 2020 · 11 comments · Fixed by #7
Comments

@ldexterldesign
ldexterldesign commented Mar 28, 2020

Hi,

Hope you're safe

Disclaimer: I don't know what this means in the readme so I invite you to enhance it, specifically the "update the domains" part:

Copy caddy/example.Caddyfile to caddy/Caddyfile and update the domains

I left the caddy file default but had problems with letsencrypt

This could be due to [r]ate limiting because I must have been hammering something during my copious setup attempts. If you'd be kind enough to clarify the "update the domains" part of the readme then happy to retry in a week or so (I read rate limiting can last days/weeks 😕)..?

Here's what worked for me:

127.0.0.1:3000 {
  gzip
  proxy / http://feedbin-web:3000 {
    transparent
  }
}

127.0.0.1:8081 {
  gzip
  proxy / http://camo:8081 {
    transparent
  }
}

127.0.0.1:9000 {
  gzip
  proxy / http://minio:9000 {
    transparent
  }
}

# Rails
[...]
FEEDBIN_HOST=127.0.0.1
FORCE_SSL=false

Correct me if I'm wrong but localhost, instead of 127.0.0.1, won't work because [t]his

Aside, the "(extract/sticky) full content" feature doesn't seem to load full content (ever) for me - I presume this is because I'm not using SSL/HTTPS?:

  • If yes then how can I fix this
  • If no then any idea what the reason is

Screenshot 2020-03-28 at 05 50 30

Screenshot 2020-03-28 at 05 46 29

If you have any issues (e.g. questions/queries) then happy to help

Hope to hear back

Sincerely

r: https://letsencrypt.org/docs/rate-limits
t: https://letsencrypt.org/docs/certificates-for-localhost

@angristan angristan changed the title Fix/research "extract full content" feature / caddy/SSL/HTTPS #Enhancement #Help #Question #Suggestion Fix/research "extract full content" feature / caddy/SSL/HTTPS Mar 28, 2020
@angristan
Owner

You can see if you have been rate limited in Caddy's logs :)

I think "extract full content" is an option for Feedbin's scraper. Like if a website only provides an excerpt in their feed, Feedbin would try to get the whole content. If that's not working, I can't see how it's related to HTTPS, probably a Feedbin bug.

@ldexterldesign
Author
ldexterldesign commented Mar 29, 2020

You can see if you have been rate limited in Caddy's logs :)

I was monitoring the docker logs (assume they're the same as caddy logs but will explore further next time I attempt an install with default configs)

I think "extract full content" is an option for Feedbin's scraper. Like if a website only provides an excerpt in their feed, Feedbin would try to get the whole content. If that's not working, I can't see how it's related to HTTPS, probably a Feedbin bug.

I just came from the feedbin free trial where it worked fine so I don't think it's a bug

I've experienced this issue before with [f]eeder:

Mixed Content: The page at 'https://feeder.co/reader' was loaded over HTTPS, but requested an insecure resource 'http://www.bbc.co.uk/earth/story/20150727-mystical-hair-ice-riddle-solved'. This request has been blocked; the content must be served over HTTPS.

I believe I fixed this by turning HTTPS off in feeder but I can't be sure without testing again (they insist their extension must be installed so I know that played/plays a part) so I'm pretty sure this issue is HTTP/S related
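
For reference, the browser rule behind the "Mixed Content" message quoted above can be sketched in Ruby. This is a simplified illustration, not browser or Feedbin code: `mixed_content?` is a hypothetical helper, the entry path in the second call is made up, and the special treatment browsers give to localhost is ignored.

```ruby
require "uri"

# Returns true when an HTTPS page requests a plain-HTTP resource,
# which is the situation browsers report as "Mixed Content".
# Simplified: only the URL schemes are compared.
def mixed_content?(page_url, resource_url)
  page = URI.parse(page_url)
  resource = URI.parse(resource_url)
  page.scheme == "https" && resource.scheme == "http"
end

# The feeder.co case quoted above: HTTPS page, HTTP resource.
puts mixed_content?(
  "https://feeder.co/reader",
  "http://www.bbc.co.uk/earth/story/20150727-mystical-hair-ice-riddle-solved"
)

# A page served over plain HTTP from 127.0.0.1:3000 requesting a plain-HTTP
# resource on the same host (hypothetical path).
puts mixed_content?(
  "http://127.0.0.1:3000/",
  "http://127.0.0.1:3000/extracts/1/entry"
)
```

Under this rule, a page that is itself loaded over plain HTTP cannot trigger a mixed-content block, whatever else may be going wrong.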

I notice you touch on mixed content in the readme..?

Is this feature working in your build?:

  • If yes then it's probably a config issue for me - what HTTP/S setup do you have
  • If no then it's something that needs (ideally) fixing in the repo

Cheers

f: https://feeder.co

@angristan
Owner

Oops, I'm indeed getting a 500 when clicking on extract full content:

feedbin-web_1            | I, [2020-03-29T16:40:46.824871 #1]  INFO -- : method=GET path=/extracts/7914/entry format=js controller=ExtractsController action=entry status=500 error='NoMethodError: undefined method `check_for_image' for #<ExtractsController:0x00007f1e283a7e18>' duration=207.92 view=0.00 db=19.65
feedbin-web_1            | F, [2020-03-29T16:40:46.854732 #1] FATAL -- :
feedbin-web_1            | NoMethodError (undefined method `check_for_image' for #<ExtractsController:0x00007f1e283a7e18>):
feedbin-web_1            |
feedbin-web_1            | app/controllers/extracts_controller.rb:18:in `rescue in entry'
feedbin-web_1            | app/controllers/extracts_controller.rb:9:in `entry'
feedbin-web_1            | lib/tld_length.rb:13:in `call'
feedbin-web_1            | lib/basic_authentication.rb:10:in `call'

I'm not sure why you think this is a mixed content issue ^^
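
The shape of that error can be reproduced in a few lines of Ruby. This is only a sketch of the failure mode the backtrace suggests (an undefined helper called from inside a `rescue` clause); `DemoExtractsController`, the simulated `IOError`, and the `:entry` argument are all hypothetical and are not Feedbin's actual code.

```ruby
# Hypothetical stand-in for a controller; NOT Feedbin's actual code.
class DemoExtractsController
  def entry
    begin
      # Simulated failure in the main code path.
      raise IOError, "extraction failed"
    rescue IOError
      # The fallback calls a helper that was never defined on this class,
      # so Ruby raises NoMethodError from inside the rescue clause.
      check_for_image(:entry)
    end
  end
end

begin
  DemoExtractsController.new.entry
rescue NoMethodError => e
  puts e.class            # the exception type
  puts e.name.inspect     # the missing method, as a Symbol
  puts e.backtrace.first  # top frame, located inside the rescue clause
end
```

The exception is raised while the server handles the request, so the browser only ever sees the resulting 500 response.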

@ldexterldesign
Author
ldexterldesign commented Mar 30, 2020

Screenshot 2020-03-30 at 06 28 58

OK, so I think the HTTP 500 error is because:

Request URL: http://127.0.0.1:3000/extracts/104/entry?utf8=%E2%9C%93&extract=true

According to the following [s]ource my request is an "unsecured HTTP request" and won't send:

A Referer header is not sent by browsers if:

The referring resource is a local "file" or "data" URI.
An unsecured HTTP request is used and the referring page was received with a secure protocol (HTTPS).

As you see from my response, the content is nothing:

Content-Length: 0

If correct then the fix would be to ensure I'm using HTTPS host domains (e.g. https://feedbin.domain.tld ?) not a local host domain (i.e. 127.0.0.1)

According to this [so]urce it is possible to set up HTTPS with 127.0.0.1 but it seems wiser to use a domain instead of an IP

You have some *feedbin.domain.tld stuff set up in .env - are these placeholder variables or something letsencrypt can/will work with?:

  • If yes then back to my rate limiting issue
  • If no then what hosts/domains are you using

Sincerely

s: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer
so: https://letsencrypt.org/docs/certificates-for-localhost/

@ldexterldesign
Author
ldexterldesign commented Mar 30, 2020

Presuming this is related...

.env:

[...]
# Rails
RACK_ENV=production
RAILS_ENV=production
PORT=3000
SECRET_KEY_BASE=password
DEFAULT_URL_OPTIONS_HOST=feedbin.domain.tld
PUSH_URL=https://feedbin.domain.tld
FEEDBIN_URL=https://feedbin.domain.tld
FEEDBIN_HOST=127.0.0.1
FORCE_SSL=false
[...]
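
A quick way to spot values that were never changed from the example is to scan the file for the example domain. The Ruby sketch below assumes the `domain.tld` values are README examples meant to be replaced, which the thread does not confirm; `parse_env` and `placeholder_keys` are hypothetical helpers, and the sample text is a shortened copy of the listing above.

```ruby
# Parse KEY=VALUE lines from .env-style text, skipping comments, blanks,
# and anything that is not an assignment (such as "[...]").
def parse_env(text)
  text.each_line.with_object({}) do |line, env|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    key, sep, value = line.partition("=")
    next if sep.empty?
    env[key] = value
  end
end

# Keys whose values still contain the example domain (assumed placeholder).
def placeholder_keys(env, placeholder = "domain.tld")
  env.select { |_key, value| value.include?(placeholder) }.keys
end

sample = <<~ENV
  # Rails
  DEFAULT_URL_OPTIONS_HOST=feedbin.domain.tld
  PUSH_URL=https://feedbin.domain.tld
  FEEDBIN_URL=https://feedbin.domain.tld
  FEEDBIN_HOST=127.0.0.1
  FORCE_SSL=false
ENV

puts placeholder_keys(parse_env(sample)).inspect
```

In this sample the URL-related keys still point at `feedbin.domain.tld` while `FEEDBIN_HOST` is `127.0.0.1`, so the host settings do not agree with each other.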

If I use .env:

[...]
FORCE_SSL=
# or
FORCE_SSL=true
[...]

... then feedbin_feedbin-web logs the following (error):

=> Booting Puma
=> Rails 6.0.2.2 application starting in production 
=> Run `rails server --help` for more startup options
2020-03-30 04:44:02 +0000: HTTP parse error, malformed request (): #<Puma::HttpParserError: Invalid HTTP format, parsing fails.>
---
2020-03-30 04:44:03 +0000: HTTP parse error, malformed request (): #<Puma::HttpParserError: Invalid HTTP format, parsing fails.>
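
One common cause of this Puma message, not confirmed anywhere in this thread, is a client speaking TLS to a port that only serves plain HTTP, for example after being redirected to an https:// URL on the same port. The Ruby sketch below only illustrates why such bytes cannot parse as HTTP; `tls_handshake?` and `http_request_line?` are hypothetical helpers and the sample bytes are made up.

```ruby
# A TLS connection opens with a handshake record: content type 0x16,
# followed by a protocol version whose major byte is 0x03.
def tls_handshake?(bytes)
  bytes.bytesize >= 2 && bytes.getbyte(0) == 0x16 && bytes.getbyte(1) == 0x03
end

# A plain HTTP/1.x request starts with a line such as "GET / HTTP/1.1".
def http_request_line?(bytes)
  bytes.match?(/\A[A-Z]+ \S+ HTTP\/1\.[01]\r?\n/)
end

plain = "GET / HTTP/1.1\r\nHost: 127.0.0.1:3000\r\n\r\n".b
tls   = "\x16\x03\x01\x00\x05".b # first bytes of a made-up TLS record

puts http_request_line?(plain) # plain HTTP request line is recognised
puts tls_handshake?(tls)       # handshake record is recognised as TLS
puts http_request_line?(tls)   # the same bytes are not a valid HTTP request
```

If this is what happens here, the fix is to terminate TLS in front of the app (the job the Caddy proxy does in this setup) rather than to point an HTTPS client at the Puma port directly.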

If I use .env:

[...]
FORCE_SSL=false
[...]

... then logs yield no errors:

=> Booting Puma
=> Rails 6.0.2.2 application starting in production
=> Run `rails server --help` for more startup options

Regards

@ldexterldesign
Author
ldexterldesign commented Mar 30, 2020

Regarding your log (I see exactly the same), correct me if I'm wrong but, since we're dealing with a [X]MLHttpRequest, I think you'd be better off looking at browser console than docker logs

Here's where the event exception occurs:

Screenshot 2020-03-30 at 07 08 44

// Do send the request
// This may raise an exception which is actually handled in jQuery.ajax (so no try/catch here)
xhr.send( ( options.hasContent && options.data ) || null );

Regards

x: https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/send

@angristan
Owner

Sorry but... no 😄

500 is a backend issue; you're looking in the wrong places. This is not related to HTTPS, CORS, XHR, Referrer... The front-end is doing fine. If we get an error 500, the problem is on the server side.

As pointed out above, this is the issue:

feedbin-web_1            | I, [2020-03-29T16:40:46.824871 #1]  INFO -- : method=GET path=/extracts/7914/entry format=js controller=ExtractsController action=entry status=500 error='NoMethodError: undefined method `check_for_image' for #<ExtractsController:0x00007f1e283a7e18>' duration=207.92 view=0.00 db=19.65
feedbin-web_1            | F, [2020-03-29T16:40:46.854732 #1] FATAL -- :
feedbin-web_1            | NoMethodError (undefined method `check_for_image' for #<ExtractsController:0x00007f1e283a7e18>):
feedbin-web_1            |
feedbin-web_1            | app/controllers/extracts_controller.rb:18:in `rescue in entry'
feedbin-web_1            | app/controllers/extracts_controller.rb:9:in `entry'
feedbin-web_1            | lib/tld_length.rb:13:in `call'
feedbin-web_1            | lib/basic_authentication.rb:10:in `call'

As you can see on the first line method=GET path=/extracts/7914/entry, this is what the XHR request is calling.
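
Reading those fields out of the log line can also be done mechanically. The Ruby sketch below is an illustration only: `request_summary` and `failure_side` are hypothetical helpers, and the sample line is a trimmed copy of the INFO line above with the `error=` field left out.

```ruby
# Pull the request method, path, and status out of a key=value log line.
def request_summary(line)
  {
    method: line[/\bmethod=(\S+)/, 1],
    path:   line[/\bpath=(\S+)/, 1],
    status: line[/\bstatus=(\d{3})/, 1]&.to_i
  }
end

# HTTP status classes: 4xx means the request was at fault,
# 5xx means the server failed while handling it.
def failure_side(status)
  case status
  when 400..499 then :client
  when 500..599 then :server
  else :none
  end
end

line = "method=GET path=/extracts/7914/entry format=js " \
       "controller=ExtractsController action=entry status=500 duration=207.92"

summary = request_summary(line)
puts summary.inspect
puts failure_side(summary[:status])
```

A status in the 5xx range means the request reached the application and the application failed, which is why the backend log is the right place to look.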

@ldexterldesign
Author
ldexterldesign commented Mar 30, 2020

Mmm, currently looking into caddy/camo as being the culprit - do you think that's more likely?

Cheers

@angristan angristan changed the title Fix/research "extract full content" feature / caddy/SSL/HTTPS "Extract full content" feature does not work Mar 30, 2020
@angristan
Owner

I doubt they would be the culprit... I opened an issue upstream, hopefully we'll get some insights!
