Proxy http requests to external proxy for internet access · Issue #1838 · linkerd/linkerd · GitHub

Proxy http requests to external proxy for internet access #1838

Closed
1 of 2 tasks
KZachariassen opened this issue Feb 28, 2018 · 24 comments

@KZachariassen

Issue Type:

  • Bug report
  • Feature request

We are running a Linkerd-to-Linkerd setup where we use http_proxy to route HTTP traffic to Linkerd, but in our infrastructure internet access is only provided using a proxy server. We need to be able to tell Linkerd to proxy internet traffic to another proxy server.

Here is the Linkerd config we use today:

admin:
  ip: 0.0.0.0
  port: 9990

namers:
- kind: io.l5d.consul
  includeTag: true
  failFast: true
  useHealthCheck: false
  host: infrastructure-consul
  port: 8500

routers:
  - label: http1-out
    protocol: http
    servers:
      - port: 4140
        ip: 0.0.0.0
    dtab: |
      /svc        =>  /#/io.l5d.consul/.local/external;
      /host       => /svc;
      /http/*/*   => /host;
    identifier:
      kind: io.l5d.header.token

  - label: http1-in
    protocol: http
    servers:
      - port: 4141
        ip: 0.0.0.0
    dtab: |
      /svc        =>  /#/io.l5d.consul/.local/internal;
      /host       => /svc;
      /http/*/*   => /host;
    identifier:
      kind: io.l5d.header.token

telemetry:
  - kind: io.zipkin.http
    host: zipkin:9411
    initialSampleRate: 1.00

usage:
  orgId: linkerd-examples-consul

We have tried io.buoyant.rinet, but that only lets Linkerd make HTTP requests directly to external resources, not through a proxy. We could route internet traffic directly to the internet proxy, but then we lose all the features of Linkerd and would need to implement them in all our services.

We only use HTTP(S), so we only need to talk to an HTTP proxy (in our case, Squid).

Our end goal would be a flow looking like this:
[service] -> [Linkerd] -> [InternetProxy] -> (Internet)

Or if multiple connected services are in play:
[service] -> [Linkerd] -> [Linkerd] -> [service] -> [Linkerd] -> [InternetProxy] -> (Internet)

Environment:

  • Linkerd version 1.3.5

@KZachariassen changed the title from "Proxy http requests to external proxy for internat access" to "Proxy http requests to external proxy for internet access" Feb 28, 2018

@wmorgan
Member
wmorgan commented Feb 28, 2018

Thanks for filing this, @krjensen!

@KZachariassen
Author

Let me know if you need more information from me.

@seanb4t
seanb4t commented Apr 20, 2018

We have this same exact use case.

Finagle supports this (see here); it just needs to be exposed/lifted to Linkerd's HTTP-level config, perhaps as static client config options to allow per-router configuration.

I can offer some time to look at this. However, my Scala is rather rusty, so any pointers or starting locations would be very much appreciated.

@wmorgan
Member
wmorgan commented Apr 20, 2018

Maybe @adleong or @dadjeibaah can share some pointers.

@adleong
Member
adleong commented Apr 24, 2018

If my understanding is correct, there are two different scenarios being discussed in this issue. Let me discuss them individually, and please correct me if I have misunderstood.

@krjensen is asking about sending all egress traffic through an HTTP proxy (squid). This should be easy to do by adding a fallback rule to the dtab which sends requests directly to the HTTP proxy. Something like:

/svc/* => /$/inet/<http_proxy_ip>/<http_proxy_port>

Since later dtab rules have higher priority, this rule should be added at the start of the dtab, where it effectively acts as a fallback: if a request is made to Linkerd and Linkerd cannot resolve it through service discovery, it will fall back to sending it to the HTTP proxy.
@krjensen does this answer your question?
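
To illustrate the precedence described above, here is a toy model in Python. It is only a sketch of "later rules win, earlier rules act as a fallback" and not Linkerd's actual dtab implementation. The `CONSUL` dict, the `HTTP_PROXY` address, and the `matches` and `resolve` helpers are all hypothetical names invented for this example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins: a dict in place of Consul, and a made-up proxy address.
CONSUL = {"users": "10.1.2.3:8080"}
HTTP_PROXY = "10.0.0.5:3128"

# A rule pairs a path pattern with a function that returns an address,
# or None when the name cannot be resolved.
Rule = Tuple[str, Callable[[str], Optional[str]]]


def matches(pattern: str, path: str) -> bool:
    """Segment-wise prefix match; '*' matches any single segment."""
    pat = [s for s in pattern.split("/") if s]
    seg = [s for s in path.split("/") if s]
    return len(seg) >= len(pat) and all(p in ("*", s) for p, s in zip(pat, seg))


def resolve(rules: List[Rule], path: str) -> Optional[str]:
    """Try rules from last to first; the first one that resolves wins."""
    for pattern, target in reversed(rules):
        if matches(pattern, path):
            address = target(path)
            if address is not None:
                return address
    return None


RULES: List[Rule] = [
    # Fallback rule, listed first so it has the lowest priority.
    ("/svc/*", lambda path: HTTP_PROXY),
    # Service discovery rule, listed last so it is tried first.
    ("/svc", lambda path: CONSUL.get(path.split("/")[-1])),
]

print(resolve(RULES, "/svc/users"))        # found in the stand-in registry
print(resolve(RULES, "/svc/example.com"))  # not found, so the proxy address is used
```

In this model, a name known to the registry never reaches the proxy rule, while any unknown name under /svc does.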

@sean-brandt on the other hand seems to be asking about sending traffic through a SOCKS proxy. This is more complex because Linkerd needs to be made aware that it is talking to a SOCKS proxy and use the SOCKS protocol. As you say, Finagle does have support for this. The way I would imagine this working is by adding the ability to specify a SOCKS proxy on a Linkerd client configuration. eg something like:

routers:
- protocol: http
  client: 
    socksProxy:
      ip: 1.2.3.4
      port: 4321

If this is something that you're interested in, @sean-brandt, would you mind opening a new issue specifically around SOCKS support?

@seanb4t
seanb4t commented Apr 24, 2018

No, I'm looking to configure Linkerd to use an HTTP forward proxy, on a per-service basis, to reach external services. This may be over either HTTP or HTTPS, so it requires supporting HTTP-proxy-specific communication.

Generally I'm looking to support this:

[client]  --> [linkerd] --> [http proxy (squid forward proxy, non-transparent) ] --> [ ultimate destination service ]

No need, nor desire, for SOCKS at all on my end.

@adleong
Member
adleong commented Apr 24, 2018

@sean-brandt in that case, you should be able to take care of the routing entirely in your dtab:

/svc/foo => /$/inet/<http_proxy_host>/<http_proxy_port>

Will route requests for foo to the proxy.

@seanb4t
seanb4t commented Apr 24, 2018

@adleong Ok. How, then, is linkerd supposed to know where to tell the proxy to go to? The 'proxy' in this case is a forward proxy ( squid, or similar ) and it is expecting requests to it to follow the http proxy protocol.

Basically requests would require that GET/HEAD/.. use the fully qualified host/url and the Host header be set to the destination.
For https it's basically the same, but there's only one method ( and no URL ) - CONNECT. Host header is still required.

Where would I configure that /svc/foo requests should go to foobar.nowhere.com/bar VIA <http_proxy_host>:<http_proxy_port> ?
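
For readers unfamiliar with the difference, the request shapes described above can be sketched in Python. This only builds the request text and sends nothing. The function names are made up for this example, and the host is the placeholder foobar.nowhere.com from the question.

```python
from urllib.parse import urlsplit


def origin_form(url: str) -> str:
    """Request as sent directly to the destination server: path only."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


def absolute_form(url: str) -> str:
    """Request as sent to a forward proxy: the full URL is the request target."""
    parts = urlsplit(url)
    return f"GET {url} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


def connect_form(host: str, port: int = 443) -> str:
    """Tunnel request for HTTPS: the target is host:port, with no URL at all."""
    return f"CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n"


url = "http://foobar.nowhere.com/bar"
print(origin_form(url).splitlines()[0])
print(absolute_form(url).splitlines()[0])
print(connect_form("foobar.nowhere.com").splitlines()[0])
```

A forward proxy such as Squid expects the second or third shape. A hop that emits only the first shape leaves the proxy with just a path and a Host header to work from.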

@seanb4t
seanb4t commented Apr 24, 2018

Additionally - as I noted above, Finagle supports this. It really just needs to be exposed somewhere/how in the config.

import com.twitter.finagle.{Service, Http}
import com.twitter.finagle.http.{Request, Response}
import com.twitter.finagle.client.Transporter
import java.net.SocketAddress

val twitter: Service[Request, Response] = Http.client
  .withTransport.httpProxyTo(
    host = "twitter.com:443",
    credentials = Transporter.Credentials("user", "password")
  )
  .newService("inet!my-proxy-server.com:3128") // using local DNS to resolve proxy

@adleong
Member
adleong commented Apr 24, 2018

Linkerd forwards the Host header unchanged which means that it needs to be set to the desired final destination by the application that is sending the request to Linkerd. Something like this:

App sends request to Linkerd:

GET /bar
Host: foobar.nowhere.com

Linkerd routes /svc/foobar.nowhere.com => /$/inet/<proxy_host>/<proxy_port> and sends the request to the proxy.

Proxy sends the request to foobar.nowhere.com.

Does that make sense?

@seanb4t
seanb4t commented Apr 24, 2018

That makes sense, at least in the plaintext case - I think. :) I'll put together a test, since I'm not certain that it will work with the TLS case ( where the ultimate target service is https ) and a proxy 'tunnel' is required using the CONNECT method.

@rasmus
rasmus commented Jun 11, 2018

Any progress on this?

@adleong
Member
adleong commented Jun 11, 2018

Good question. Putting aside SOCKS proxying and CONNECT tunneling for the moment, I think this issue is specifically about configuring Linkerd's routing rules to use an HTTP proxy. @sean-brandt and @krjensen: do my explanations here make sense? Can we close this issue?

@seanb4t
seanb4t commented Jun 11, 2018

TLS doesn't work via the routes above, or at least I wasn't able to get it to do so. It's the switch to the CONNECT method and related work that's the issue.

There is code in Finagle to support this. Unfortunately, I lack the time and familiarity to help out much more than that.

@seanb4t
seanb4t commented Jun 11, 2018

Hmm, additionally - a 'test' should be easy enough to replicate self contained, presuming a containerized environment.

  • Container A : Attempts to connect to https://someapi.somewhere.com via container B by connecting to it on port 8080: HTTP GET /someapi
  • Container B: Linkerd listens on 8080, routes requests for /someapi/* to https://someapi.somewhere.com via squid running on container C listening on 3128
  • Container C: squid listening on 3128 operating in a traditional caching proxy role

@adleong
Member
adleong commented Jun 11, 2018

Ah, yes, this is expected because Linkerd does not support creating tunnels with the CONNECT method. I've filed #1982 to track that separately.

@seanb4t
seanb4t commented Jun 11, 2018

Absent CONNECT support, I've got no opinion one way or the other on this particular issue. #1982 covers my use-case/concern, I believe.

@adleong adleong closed this as completed Jun 12, 2018
@rasmus
rasmus commented Jun 13, 2018

Any reason why this was closed?

@KZachariassen
Author

Why did you close this issue? I still think it's 100% relevant and a feature we really need.

@adleong
Member
adleong commented Jun 13, 2018

For HTTPS proxying, CONNECT tunneling is required and that is tracked in issue #1982.

For HTTP, Linkerd's routing rules can be configured to send the traffic to the external proxy (see discussion above). @rasmus @krjensen does this cover your use-case or is there something else that I'm missing?

@ccbeloy
ccbeloy commented Dec 4, 2018

@adleong - even for the HTTP case, the suggested dtab doesn't work. We're also using a Squid proxy server, as Sean describes. Basically, it needs to follow the HTTP proxy protocol for Squid to successfully process the request.
@sean-brandt - if you don't mind sharing, what was your workaround for running Linkerd behind a corporate proxy? Thanks.

@DudnykOleksandr
DudnykOleksandr commented Aug 29, 2019

Hi, I have a similar issue. I used a Squid proxy and Docker containers. Squid itself works as expected. But if requests are routed via Linkerd with the rule /svc/* => /$/inet/<http_proxy_ip>/<http_proxy_port>, Squid returns this error:

The following error was encountered while trying to retrieve the URL: /

Invalid URL

Some aspect of the requested URL is incorrect.

Some possible problems are:

Missing or incorrect access protocol (should be http:// or similar)

Missing hostname

Illegal double-escape in the URL-Path

Illegal character in hostname; underscores are not allowed.

@adleong
Member
adleong commented Aug 29, 2019

Hi @DudnykOleksandr!

I think the problem has to do with the way that proxies interact. Linkerd acts like a transparent proxy which means that when clients modify a request to make a proxy request by making the URI absolute, Linkerd undoes that change by dropping the scheme and making the request back into a normal request. That behavior was added to Linkerd in this PR. However, squid expects to be treated as a proxy and expects proxy requests with absolute URIs.

I haven't used squid, personally, but it looks like this behavior can be changed by changing the squid to intercept mode:

I believe that in intercept mode, squid no longer requires requests to have absolute URIs.
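
The mismatch described above can be sketched as a toy model in Python. This is not Linkerd's or Squid's actual code. The helper names `to_origin_form` and `looks_like_proxy_request` are hypothetical, and the check is only an assumption about what a forward proxy needs in order to find a scheme and hostname in the request target.

```python
from typing import Tuple
from urllib.parse import urlsplit


def to_origin_form(request_target: str, host_header: str) -> Tuple[str, str]:
    """Model a transparent hop: turn an absolute URI back into path plus Host."""
    parts = urlsplit(request_target)
    if parts.scheme and parts.netloc:
        path = parts.path or "/"
        if parts.query:
            path = f"{path}?{parts.query}"
        return path, parts.netloc
    # Already a plain path: forward it unchanged.
    return request_target, host_header


def looks_like_proxy_request(request_target: str) -> bool:
    """Model a forward proxy's expectation: a target with scheme and hostname."""
    parts = urlsplit(request_target)
    return bool(parts.scheme and parts.netloc)


# The client addresses a proxy, so it sends an absolute URI.
sent = "http://foobar.nowhere.com/"
forwarded, host = to_origin_form(sent, "foobar.nowhere.com")

print(looks_like_proxy_request(sent))       # target the client sent
print(looks_like_proxy_request(forwarded))  # target after the transparent hop
```

In this model the forwarded target is just "/", which is consistent with the "Invalid URL" page quoted earlier, where the URL Squid reports is "/".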

@DudnykOleksandr

@adleong thank you for clarifying this behavior.
