Pip download prefers newer package version even when local package exists · Issue #5500 · pypa/pip · GitHub

Pip download prefers newer package version even when local package exists #5500


Open
bendikro opened this issue Jun 13, 2018 · 48 comments
Labels
C: download About fetching data from PyPI and other sources state: needs discussion This needs some more discussion

Comments

@bendikro

Environment

  • pip version: 10.0.1/master
  • Python version: Tested on python 3.6
  • OS: Linux

Description
pip download does not prefer a package found locally, even when it satisfies the requirements, if a newer version is available at the remote package index

Expected behavior
Prefer the already existing package as long as it satisfies the dependency requirements

How to Reproduce

  1. Create directory pkg_cache
  2. Run pip3 download --dest pkg_cache/ --find-links pkg_cache/ setuptools==39.0.1 && pip3 download --dest pkg_cache/ --find-links pkg_cache/ setuptools

Output

pip3 download --dest pkg_cache/  --find-links pkg_cache/ setuptools==39.0.1 && pip3 download --dest pkg_cache/  --find-links pkg_cache/ setuptools
Looking in links: pkg_cache/
Collecting setuptools==39.0.1
  Using cached https://files.pythonhosted.org/packages/20/d7/04a0b689d3035143e2ff288f4b9ee4bf6ed80585cc121c90bfd85a1a8c2e/setuptools-39.0.1-py2.py3-none-any.whl
  Saved ./pkg_cache/setuptools-39.0.1-py2.py3-none-any.whl
Successfully downloaded setuptools
Looking in links: pkg_cache/
Collecting setuptools
  Using cached https://files.pythonhosted.org/packages/7f/e1/820d941153923aac1d49d7fc37e17b6e73bfbd2904959fffbad77900cf92/setuptools-39.2.0-py2.py3-none-any.whl
  Saved ./pkg_cache/setuptools-39.2.0-py2.py3-none-any.whl
Successfully downloaded setuptools
@pfmoore
Member
pfmoore commented Jun 13, 2018

That behaviour is by design. Pip will always prefer the latest available version; it takes no account of where a package comes from.

bendikro added a commit to bendikro/pip that referenced this issue Jun 13, 2018
If a package already exists in a directory specified with --find-links,
PackageFinder still prefers a newer version of the package found
at the remote package index.

Fix by preferring local package file when found.
@bendikro
Author

@pfmoore
I see. We have multiple requirement files, and since pip does not handle double requirements it is necessary to do multiple calls to pip download, one for each requirements file. With the current behavior of pip, where one file has setuptools and another has setuptools==39.0.1, both 39.0.1 and 39.2.0 will be downloaded.

@pfmoore
Member
pfmoore commented Jun 13, 2018

So? That's the point of pip download. I don't know if I'm missing something here but I can't see what the problem is. What exactly do you use the files downloaded via pip download for? As per the docs the intention is that you use pip download to populate a directory from which you can later use pip install --find-links to do an install while offline. The pip install command is perfectly capable of handling a --find-links directory with multiple versions of the same package in it, so why are you bothered that this is happening?
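The offline workflow described here looks roughly like this (the directory name is illustrative):

```shell
# Populate a local directory once, while online
pip download --dest ./wheelhouse -r requirements.txt

# Later, install offline from that directory only
pip install --no-index --find-links ./wheelhouse -r requirements.txt
```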

@pradyunsg pradyunsg added C: download About fetching data from PyPI and other sources S: awaiting response Waiting for a response/more information labels Jun 14, 2018
@bendikro
Author

The point is that consistency is useful. Things that behave differently every time are less useful than things that do the same thing every time. Had pip supported multiple requirement files and dealt properly with the dependencies, this wouldn't be a problem, though.

With two requirement files, as explained earlier, you never actually know exactly what package versions will be downloaded.

The pip install command is perfectly capable of handling a --find-links directory with multiple versions of the same package in it, so why are you bothered that this is happening?
Depending on the order of the requirement files you provide, different package versions are installed. Consistency is key.

The second reason is speed. By looking locally and finding a package that satisfies the dependencies, there is no need to check remotely. A call to pip download would therefore be blazing fast if the packages are already downloaded. Currently it's very slow.

@pfmoore
Member
pfmoore commented Aug 21, 2018

I'm not sure I follow. Pip's current behaviour is perfectly consistent - I described it above:

Pip will always prefer the latest available version; it takes no account of where a package comes from.

In fact, if we preferred local files, we'd be harming consistency, because you'd get something different installed depending on what was present locally.

I don't see anything actionable here. Pip's current behaviour is by design, if you want to propose a change, you'll need to provide details of what you propose, and you'll probably need more persuasive arguments than you've currently offered.

@bendikro
Author

I'm not sure I follow. Pip's current behaviour is perfectly consistent - I described it above:
True. It's consistent in that you never know which version it will download in the scenario I describe.

In fact, if we preferred local files, we'd be harming consistency, because you'd get something different installed depending on what was present locally.
The whole point is to know exactly what will be installed based on the local files. But having pip download the same package versions each time is not possible with multiple requirement files, as I described.

I agree that the current default behaviour shouldn't be changed, but an option to be able to prefer local packages over checking remotely would still be useful.

What I propose is to have an option that makes pip check locally if a package that satisfies the given dependency already exists locally, and if so, do not check remotely.

@pfmoore
Member
pfmoore commented Aug 21, 2018

OK, so what you're suggesting is an option to pip download that says "for each requirement, if it can already be satisfied from the destination directory, skip it, otherwise download the requirement as normal and store the downloaded file in the destination directory".

I can see the logic in that. If you wanted to create a PR implementing it, I'm not going to object. I can't say that I find your justification for the behaviour compelling, but that's something that can be debated later, when there's a PR to review.

@pfmoore pfmoore added the resolution: deferred till PR Further discussion will happen when a PR is made label Aug 21, 2018
@mboisson

This would also be very useful for HPC clusters on which the staff may build python wheels that are optimized for their CPU architecture. The current behavior requires HPC staff to always be recompiling new versions as soon as they are out, or risk users using dramatically slower python packages in some situations. Being able to tell pip to favor a local wheelhouse over some minor version increase found online would be very useful to us.

@ccoulombe

@bendikro Any news/updates on this? This would be very useful for us.

@rishihahs

An example of this causing an issue in practice:

Let's say I'm using Python 2.7. Matplotlib 3 supports only Python 3.5+. If I install a package that has matplotlib>=2.0 as a requirement (e.g. scikit-image), then even though I have matplotlib 2.x installed locally, pip will try to download and install matplotlib 3.x, which of course will fail.

bendikro added a commit to bendikro/pip that referenced this issue Nov 20, 2018
Add option --prefer-local-compatible to download and install commands

With this option enabled, local directories specified with --find-links
are searched first. If package requirements are satisfied by packages
found locally, the local package will be preferred, and no remote URLs
will be checked.
@bendikro
Author

I've created a PR suggesting a new option --prefer-local-compatible: #6023

@pfmoore
Member
pfmoore commented Nov 20, 2018

I remain unconvinced that this is a good idea, but as I said above, if someone else feels it's worth taking this forward, I won't object.

For the record, though, my objection to this isn't so much that it's difficult to implement or explain the basics of the proposed behaviour, it's more about the maintenance burden:

  1. I fully expect to get requests to extend this behaviour to pip install, which is something I remain strongly against, as I noted above. Having to repeatedly argue against such requests is going to be a drain on developer resources.
  2. There are potentially odd edge cases where the behaviour will be unexpected at best, wrong at worst (I don't want to invent a series of increasingly-unlikely theoretical cases here, but it's definitely true that I can think of situations involving complex dependency trees where it makes my head hurt even thinking about what might happen). Should something like that come up in real life, working out a suitable fix could be a significant issue.

@mboisson

Wait... this is for pip install, that's the whole point of it. Isn't it ?

@pfmoore
Member
pfmoore commented Nov 20, 2018

Absolutely not. See all of my previous comments about why this should not be added to pip install.

However, I've just noticed that the PR adds the option to pip install. I'll register my objection to that on the PR, as well.

@mboisson
mboisson commented Nov 20, 2018

I disagree. As the manager of an HPC cluster and a very comprehensive wheelhouse, we strongly want that for install too.

Wheels downloaded from online repositories break or under-perform way too often.

@mboisson

I'm not sure I follow. Pip's current behaviour is perfectly consistent - I described it above:

Pip will always prefer the latest available version; it takes no account of where a package comes from.

In fact, if we preferred local files, we'd be harming consistency, because you'd get something different installed depending on what was present locally.

I don't see anything actionable here. Pip's current behaviour is by design, if you want to propose a change, you'll need to provide details of what you propose, and you'll probably need more persuasive arguments than you've currently offered.

"Always prefer the latest available version" is the complete opposite of consistency. It means that any two successive installations will yield different results, even when performed on the exact same host.

@mboisson
mboisson commented Nov 20, 2018

For the sake of argument, let's define consistency.

My definition of consistent: something is consistent when it yields the same result when executed

  1. at two different times
  2. in two different places

Getting both 1) and 2) is very hard. It basically requires having the whole software stack/operating system managed by the same system. This is never going to be achieved by pip alone, and is - as far as I know - only achieved by NixOS (https://github.com/NixOS/nixpkgs)

Getting 2) cannot possibly be achieved without having 1), unless you are executing things at the exact same time or you pin down the version of every package you install.

What is left is 1). Current pip behaviour does not give 1) at all. If I install packages today, I will get wildly different versions than what I installed 6 months ago.

Item 1) can, however, be achieved assuming there is a local set of packages that are fixed/supported. This is the case on our HPC clusters. We also achieve 2) as long as users remain on our infrastructure (multiple clusters).

However, both 1) and 2) are jeopardized by the current pip behaviour and the lack of ability to tell pip that packages available in our wheel house are preferred to those more recent version that can be downloaded.

@pfmoore
Member
pfmoore commented Nov 20, 2018

Sigh. I guess we're simply going to have to disagree. Pip has mechanisms (version pinning, hash checking) to satisfy your requirement (1). Just because you choose not to use them, or because they don't work easily in your particular situation, doesn't mean pip can't do that. Nor does it mean that pip needs another method of doing the same thing.
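For reference, the pinning and hash-checking mechanisms mentioned here look roughly like this in a requirements file (the digests below are placeholders, not real hashes):

```text
# requirements.txt — exact pins; with --require-hashes pip also verifies digests
setuptools==39.0.1 --hash=sha256:<digest>
requests==2.18.3 --hash=sha256:<digest>
```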

I remain -1 on this whole proposal, and you've pretty much convinced me that accepting the option for pip download will set a precedent that will make it impossible to resist demands that we add it to pip install. So I'm no longer willing to make an exception for pip download.

@pradyunsg pradyunsg removed the S: awaiting response Waiting for a response/more information label Nov 20, 2018
@mboisson

It's not that I choose not to use them, it's because nobody (i.e. package developers) ever does.

@mboisson

I guess that an option --try-no-index which would try to install/download/update without considering indexes first and go to the index only if it did not work would get the same -1 from you too @pfmoore ?

@pradyunsg pradyunsg removed the resolution: deferred till PR Further discussion will happen when a PR is made label Nov 20, 2018
@RonnyPfannschmidt
Contributor

I would like to note that at work we always use version pinning and/or constraint files; it's simply insane not to have that in place in production environments where consistency is a must.

Also, I wonder if the "strategy" option for pip install -U would make sense for download, downloading only as needed to fulfill the requirement set.

@mboisson

Yeah, advanced HPC users will use version pinning. But this is not just any average user. When managing an HPC cluster, you are dealing with thousands of users who know very little about good practices. Any small step you can take to reduce the amount of rope with which they can hang themselves means tickets and problems avoided.

@bendikro
Author

Absolutely not. See all of my previous comments about why this should not be added to pip install.

However, I've just noticed that the PR adds the option to pip install. I'll register my objection to that on the PR, as well.

@pfmoore I must admit I did not understand from the previous discussion that you were so strongly against having this option for pip install as well. I understood the earlier discussion to be related to changing the default behavior of pip, which I agree is a very bad idea.

Due to the additional interest in this ticket, I wanted to put together a PR with a prototype implementation of the new option. Including the option for pip install was not a matter of actively ignoring your comments, but simply because it can be useful for the install command as well, as noted by @mboisson.

@pfmoore
Member
pfmoore commented Nov 21, 2018

OK, fair enough. I still don't see sufficient benefit in this change to justify the cost, though.

Just as a question, why don't you use something like a local devpi instance that serves your "local" files, but if there are no local files for a package falls back to PyPI? I'm pretty sure devpi can do things like this (and if it's not the default behaviour there is a plugin system that lets you customise the behaviour). Or just simply write a small webapp that serves an index that behaves as you want it to?

@mboisson
mboisson commented Nov 21, 2018

I was unaware of devpi, but running a web server is not an option. We can't run a web server on an HPC cluster, and the compute nodes on which jobs run and pip may be called don't necessarily have access to the web. We want the packages we serve to be available without needing web access. It currently works nicely by just having a directory containing the wheels, accessible on our filesystems, and configuring find-links to point to that directory in the PIP_CONFIG_FILE.

The only caveat, as is being discussed, is that it will not limit itself to whatever it found first in the directory pointed to by find-links, even if it matches the requirements. We even had to globally tell pip that our system is not manylinux1 compatible because these were taking precedence over our locally compiled wheels even if we had the most up to date version (it considers manylinux1 to be "more recent" than linux).

@bendikro
Author
bendikro commented Nov 21, 2018

Wait... this is for pip install, that's the whole point of it. Isn't it ?

The initial reason I wanted to change the behavior is to make pip download faster by avoiding lookups to remote indexes when not needed.

There are two cases where not needed can be applied.

  1. When a local package satisfies the requirement, e.g. requests>=2.18.3 where version 2.18.3 exists locally but is not the latest
  2. When a local package satisfies a pinned requirement, e.g. requests==2.18.3 where version 2.18.3 exists locally.

Point 1 conflicts with the current pip behavior: --prefer-local-compatible causes the package version found locally to be installed even when newer versions are available on the remote index.
Point 2 does not differ from the current pip behavior; the result, i.e. the installed packages, is the same.

Currently, even with pinned versions for all packages, where all packages exist locally, pip download still retrieves the available package versions from the remote indexes.
With a requirements file with 90 packages, --prefer-local-compatible reduces the pip download time from ~26 to ~7 seconds.

@bendikro
Author

OK, fair enough. I still don't see sufficient benefit in this change to justify the cost, though.

Just as a question, why don't you use something like a local devpi instance that serves your "local" files, but if there are no local files for a package falls back to PyPI? I'm pretty sure devpi can do things like this (and if it's not the default behaviour there is a plugin system that lets you customise the behaviour). Or just simply write a small webapp that serves an index that behaves as you want it to?

I'll try to explain our use case.

We have a multitude of projects that rely on different virtual environments for different tasks, e.g. system tests, unit tests, running various python scripts, etc.

We used to have multiple requirement files containing only the strictly necessary requirement specifications for each virtual environment. However, due to the issue mentioned above, we ended up generating one requirement file with pinned package versions for each virtual environment instead.

Whenever the requirement file changes, or the virtual environment is removed (make clean), the required package versions are first downloaded to a local cache directory with pip download, and then the virtual environment is created from these packages.
pip download is run quite frequently to ensure all the required packages are available, which currently takes more time than strictly necessary, when most or all of the packages are already available in the cache directory.

There is not a set of specific package versions we use for all the projects, but each project, and each virtual environment has a set of requirements with pinned versions. Therefore, running a devpi instance is not very convenient.

@bendikro
Author

I guess that an option --try-no-index which would try to install/download/update without considering indexes first and go to the index only if it did not work would get the same -1 from you too @pfmoore ?

@mboisson
How does that differ from the proposed --prefer-local-compatible option in #6023?

@mboisson

@bendikro, it does not need to define a new "local" concept and check for file:, so it reuses more of the mechanisms that are already in place.

@pfmoore
Member
pfmoore commented Nov 22, 2018

This thread is getting very confused. I suspect we're hitting a case of the XY Problem.

The original statement of the problem here was that "pip download does not prefer package found locally even if it satisfies the requirements when there is a newer available at the remote package index". That's not a problem, because that's not how pip download is defined to work. So taking a very naive viewpoint, this issue can be closed as "not a problem - user had misunderstood how pip works". But that's not very helpful.

It's possible that in attempting to solve an issue in their local environments, @bendikro and/or @mboisson have identified that if pip preferred "local" files over "remote" ones, then they could use that to solve their problem. That's fine, but as noted, it's not how pip works.

Rather than proposing that pip gets changed to work the way you wish it would in order to implement the solution you'd thought of, can I suggest that we go back to the underlying problems? If you raise one or more new issues describing what your underlying problem is, maybe we can either find a solution using pip as it currently works, or we can identify a change to pip that doesn't have the difficulties that "prefer local files" does but still helps address the problem.

(Disclaimer: My personal feeling is that there's likely an acceptable solution using pip as it stands, maybe with some local environment config changes, or with a process change in how you're working. What I've understood of the underlying problems so far doesn't seem like it's something that needs a pip change. But I may be wrong.)

@ccoulombe
ccoulombe commented Nov 22, 2018

@pfmoore How would you solve, with current pip, the need for users to install the latest version from a specific wheelhouse even if there's a more recent version on PyPI? Ideally, the user only needs to pip install <name>.

For example, in the local wheelhouse there's matplotlib v3.0.1, but on PyPI there's v3.0.2. The v3.0.1 is the preferred candidate to be installed. As mentioned by @mboisson, --find-links and --no-index are already known and used.

@pfmoore
Member
pfmoore commented Nov 22, 2018

@ccoulombe Pin the version. "Ideally, the user only needs to pip install name" isn't a requirement, just a preference. Or if you don't want to specify the version, use --no-index.

@mboisson
mboisson commented Nov 22, 2018

@pfmoore ok, let me roll back to our problems.

Problem 1)

  • Most binary packages provided by PyPI don't work or are sub-optimal on HPC systems. That is either because they assume specific system packages to be present when they might not be, or because they are not compiled to target the specific hardware architectures available on the cluster.

Problem 2)

  • Most HPC users are computer illiterate. They will pick whatever information is available online that tells them to just run pip install X. They will not pin versions, especially when they don't know that we have built specific versions for them. When they don't pin and it fails, they will contact our support and create an undue workload on our staff.

Problem 3)

  • Even if - through a local repository of wheels - we provide a binary version which we know works on our system, and which we have compiled to perform well on our hardware, pip will try to download a version from online as soon as a newer version is released, unless we specify --no-index, even if that newer version is not needed by the requirements.

Problem 4)

  • If we specify --no-index in the PIP_CONFIG_FILE, pip won't even attempt to download packages, even those that are pure python and would work just fine. This means we would have to host a complete repository of all possible python packages, which is just unwieldy.

Please suggest a solution that solves all 4 problems that does not equate to "prefer locally built packages".

@mboisson

Can we somehow tell pip to not ever download binary (i.e. compiled) packages from online repositories ? Pure python packages are usually alright.

@pfmoore
Member
pfmoore commented Nov 22, 2018

@mboisson Thanks for the clarification. It'll take me a while to digest that but I appreciate the explanation.

Can we somehow tell pip to not ever download binary (i.e. compiled) packages from online repositories

Yes - --no-binary :all:.
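As a sketch of how that could be applied globally (an illustrative, untested fragment for the PIP_CONFIG_FILE mentioned earlier):

```ini
# Equivalent of passing --no-binary :all: to every pip install
[install]
no-binary = :all:
```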

@mboisson
mboisson commented Nov 22, 2018

@mboisson Thanks for the clarification. It'll take me a while to digest that but I appreciate the explanation.

Can we somehow tell pip to not ever download binary (i.e. compiled) packages from online repositories

Yes - --no-binary :all:.

--no-binary :all: will also block binary packages that are hosted in our wheelhouse (i.e. found through --find-links)

I also realize that my question was not precise enough.

Can we somehow tell pip not to ever download binary packages, nor their source equivalent (i.e. only ever download pure python packages) ?

Not downloading the binary version of numpy for example is no better because it will download the source version and try and fail to compile it optimally.

@pfmoore
Member
pfmoore commented Nov 22, 2018

@mboisson If I follow that set of requirements, the only control you have over what options pip sees when run by your users is the global configuration file?

Also, you stated earlier that running a local index wasn't an option, but I don't see anything in your problem statement that precludes it. And my immediate thought when seeing your requirements is that PyPI is not a good fit for them, and running a local index (that passes through to PyPI when appropriate) is exactly the solution that other environments I've heard of with similar constraints tend to use...

@pfmoore
Member
pfmoore commented Nov 22, 2018

Can we somehow tell pip not to ever download binary packages, nor their source equivalent (i.e. only ever download pure python packages) ?

I think we're confusing each other here. What do you mean by "download"? From PyPI? If that, then only by using --no-index and hosting a local index (which is why I think that's the most appropriate solution for you).

@mboisson
mboisson commented Nov 22, 2018

@mboisson If I follow that set of requirements, the only control you have over what options pip sees when run by your users is the global configuration file?

Also, you stated earlier that running a local index wasn't an option, but I don't see anything in your problem statement that precludes it. And my immediate thought when seeing your requirements is that PyPI is not a good fit for them, and running a local index (that passes through to PyPI when appropriate) is exactly the solution that other environments I've heard of with similar constraints tend to use...

Correct, the only control we have over what options pip sees is the global configuration file.

Running an index which requires running a server is not an option. Having some sort of script that pip would query locally, without requiring a web server, to figure out whether it's redirected to PyPI or the local repository could work.

@mboisson

Can we somehow tell pip not to ever download binary packages, nor their source equivalent (i.e. only ever download pure python packages) ?

I think we're confusing each other here. What do you mean by "download"? From PyPI? If that, then only by using --no-index and hosting a local index (which is why I think that's the most appropriate solution for you).

I mean that unless it's a pure python package act as if you were using --no-index (i.e. just look in our local repository).

@pfmoore
Member
pfmoore commented Nov 22, 2018

Running an index which requires running a web server is not an option.

Hmm, I'd like to say "why not?" but I'll accept that as a fact for now. In which case, you could (note, this is untested!) write a script that grabs https://pypi.org/simple and modifies it so that links to packages you have locally point to your local copies, and put that somewhere you can reference via a file: URL (which you can then treat as a repository index via --index-url - PEP 503 doesn't mandate that an index is available via HTTP).

You'll need to refresh your local index regularly, but that's a cost of not being able to support a web server (which could do the refresh on the fly).
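A minimal sketch of such a static PEP 503 index generator (untested against pip itself; the function and path names are illustrative assumptions):

```python
# Sketch: build a minimal PEP 503 "simple" index as static files from a
# directory of wheels, usable via --index-url file:///path/to/index_root.
import html
import re
from pathlib import Path

def normalize(name: str) -> str:
    """PEP 503 name normalization: runs of -, _, . collapse to a single -."""
    return re.sub(r"[-_.]+", "-", name).lower()

def build_index(wheelhouse: Path, index_root: Path) -> None:
    # Group wheels by normalized project name
    projects = {}
    for wheel in wheelhouse.glob("*.whl"):
        project = normalize(wheel.name.split("-")[0])
        projects.setdefault(project, []).append(wheel)

    root_links = []
    for project, wheels in sorted(projects.items()):
        project_dir = index_root / project
        project_dir.mkdir(parents=True, exist_ok=True)
        # Per-project page: one file: link per wheel on the shared filesystem
        rows = "\n".join(
            f'<a href="{w.resolve().as_uri()}">{html.escape(w.name)}</a><br/>'
            for w in wheels
        )
        (project_dir / "index.html").write_text(
            f"<!DOCTYPE html><html><body>\n{rows}\n</body></html>"
        )
        root_links.append(f'<a href="{project}/">{project}</a><br/>')

    # Root page listing all projects
    (index_root / "index.html").write_text(
        "<!DOCTYPE html><html><body>\n" + "\n".join(root_links) + "\n</body></html>"
    )
```

Re-running the script after adding wheels is the "regular refresh" mentioned above.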

In spite of agreeing to accept "not able to use a web server" as a constraint, I'd also like to point out that you could put an index on an external site like heroku - after all, your users can access the internet, so it's not like they couldn't access that as an index...

I mean that unless it's a pure python package act as if you were using --no-index (i.e. just look in our local repository).

How do you detect that it's pure Python? That's not possible without building the package (as some packages have optional C extensions).

@mboisson

Running an index which requires running a web server is not an option.

Hmm, I'd like to say "why not?" but I'll accept that as a fact for now. In which case, you could (note, this is untested!) write a script that grabs https://pypi.org/simple and modifies it so that links to packages you have locally point to your local copies, and put that somewhere you can reference via a file: URL (which you can then treat as a repository index via --index-url - PEP 503 doesn't mandate that an index is available via HTTP).

You'll need to refresh your local index regularly, but that's a cost of not being able to support a web server (which could do the refresh on the fly).

That's an interesting idea. We'll think about it.

In spite of agreeing to accept "not able to use a web server" as a constraint, I'd also like to point out that you could put an index on an external site like heroku - after all, your users can access the internet, so it's not like they couldn't access that as an index...

In some cases, our users do have Internet access, but in others they don't (i.e. when they are running on compute nodes of the cluster). So running a web server would be an option for some cases, but it would require keeping two distinct solutions, with the risk that they would eventually diverge in the list of packages they provide.

I mean that unless it's a pure python package act as if you were using --no-index (i.e. just look in our local repository).

How do you detect that it's pure Python? That's not possible without building the package (as some packages have optional C extensions).

I'd say that anything for which the wheel is cp27-cp27mu-linux_x86_64 (or similar for other versions of python) is compiled? Although I guess that's hard to tell without trying to install it?

@pfmoore
Member
pfmoore commented Nov 22, 2018

In some cases, our users do have Internet access, but in others they don't

Well, if they don't, they can't access PyPI so problem solved 😄

I'd say that anything for which the wheel is cp27-cp27mu-linux_x86_64 (or similar for other versions of python) is compiled ?

But you said you didn't want to try to compile sdists that aren't "pure Python" either? There's no way of telling whether a sdist is "pure Python".

@mboisson

In some cases, our users do have Internet access, but in others they don't

Well, if they don't, they can't access PyPI so problem solved 😄

Well, yes, if they can still access our local index (and don't need packages that aren't in there), which they can't if it's a web server.

I'd say that anything for which the wheel is cp27-cp27mu-linux_x86_64 (or similar for other versions of python) is compiled ?

But you said you didn't want to try to compile sdists that aren't "pure Python" either? There's no way of telling whether a sdist is "pure Python".

"pure python" packages will usually end up as a {py2,py3,py2.py3}-none-any.whl, no?
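For wheels specifically, that filename heuristic can be sketched as follows (a rough guess based on the abi/platform tags; it cannot classify sdists, whose names reveal nothing):

```python
# Sketch: guess whether a wheel is pure Python from its filename tags.
# Wheel filenames end in {python tag}-{abi tag}-{platform tag}.whl, so a
# pure-Python wheel typically looks like <name>-<ver>-py2.py3-none-any.whl.
def is_pure_python_wheel(filename: str) -> bool:
    if not filename.endswith(".whl"):
        return False  # sdists (.tar.gz etc.) can't be classified by name
    stem = filename[: -len(".whl")]
    parts = stem.split("-")
    if len(parts) < 5:  # name-version[-build]-pytag-abitag-plattag
        return False
    abi_tag, platform_tag = parts[-2], parts[-1]
    return abi_tag == "none" and platform_tag == "any"
```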

@pfmoore
Member
pfmoore commented Nov 22, 2018

Well, yes, if they can still access our local index (and don't need packages that aren't in there), which they can't if it's a web server.

... and again we hit something that I don't follow. You say they don't have "internet access". What precisely do you mean by that? No access to any sort of IP connection other than the local machine? Or no access outside of the local network? How do they currently access your local index? As a shared filesystem? There's no reason that wouldn't still be possible (all of my suggestions have only been about hosting an index on a web server - the actual distribution files themselves would remain on the local filesystem).

You could have the global config set up as

# For people with no web access at all
find-links = /the/local/shared/filesystem

# For people who can access PyPI (and hence the Internet)
# This service points to the local shared filesystem for projects you want to serve locally,
# and PyPI for all other projects. It can be a web service as shown here, or probably just
# a file: URL pointing to a PEP 503 format simple index, if you're willing to handle
# regularly refreshing the HTML pages.
index-url = https://our.local/index

That's as far as I can reasonably go designing this for you - you'll need to do some work yourself to fill in the blanks, but hopefully it's enough to give you the idea.

"pure python" packages will usually end-up as a {py2,py3,py2.py3}-none-any.whl, no?

End up as, yes. But at the point we're trying to make a decision, they are just .tar.gz sdists.

@mboisson

Well, yes, if they can still access our local index (and don't need packages that aren't in there), which they can't if it's a web server.

... and again we hit something that I don't follow. You say they don't have "internet access". What precisely do you mean by that? No access to any sort of IP connection other than the local machine?

Yes

Or no access outside of the local network? How do they currently access your local index? As a shared filesystem?

Yes, shared filesystem.

There's no reason that wouldn't still be possible (all of my suggestions have only been about hosting an index on a web server - the actual distribution files themselves would remain on the local filesystem).

Yes, a local index file sounds like it could work.

@mboisson

Mmm, I believe that our issue seems to have been fixed and merged (a.k.a. the --prefer-binary option) via issue #3785

@mboisson

Also, a colleague mentioned the option of having a constraint file, which can be set globally in a PIP_CONFIG_FILE, and in which one could exclude any version of a package that is more recent than the locally available version.
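A minimal sketch of that idea: derive a constraints file from the local wheelhouse, capping each project at the version available locally (the function name and the "<=" policy are illustrative assumptions):

```python
# Sketch: emit constraints-file text capping each project at its local version.
from pathlib import Path

def wheelhouse_constraints(wheelhouse: Path) -> str:
    caps = {}
    for wheel in wheelhouse.glob("*.whl"):
        name, version = wheel.name.split("-")[:2]
        # Naive "last seen wins"; real code would compare versions properly
        # (e.g. with packaging.version) and keep the highest per project.
        caps[name] = version
    return "\n".join(f"{name}<={version}" for name, version in sorted(caps.items()))
```

The output could then be written to a file referenced by pip's -c/--constraint option (or the corresponding key in the config file), and regenerated whenever the wheelhouse changes.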
