[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ Following system colour scheme Selected dark colour scheme Selected light colour scheme

Python Enhancement Proposals

PEP 570 – Python Positional-Only Parameters

Author:
Larry Hastings <larry at hastings.org>, Pablo Galindo <pablogsal at python.org>, Mario Corchero <mariocj89 at gmail.com>, Eric N. Vander Weele <ericvw at gmail.com>
BDFL-Delegate:
Guido van Rossum <guido at python.org>
Discussions-To:
Discourse thread
Status:
Final
Type:
Standards Track
Created:
20-Jan-2018
Python-Version:
3.8

Table of Contents

Abstract

This PEP proposes to introduce a new syntax, /, for specifying positional-only parameters in Python function definitions.

Positional-only parameters have no externally-usable name. When a function accepting positional-only parameters is called, positional arguments are mapped to these parameters based solely on their order.

When designing APIs (application programming interfaces), library authors try to ensure correct and intended usage of an API. Without the ability to specify which parameters are positional-only, library authors must be careful when choosing appropriate parameter names. This care must be taken even for required parameters or when the parameters have no external semantic meaning for callers of the API.

In this PEP, we discuss:

  • Python’s history and current semantics for positional-only parameters
  • the problems encountered by not having them
  • how these problems are handled without language-intrinsic support for positional-only parameters
  • the benefits of having positional-only parameters

Within context of the motivation, we then:

  • discuss why positional-only parameters should be a feature intrinsic to the language
  • propose the syntax for marking positional-only parameters
  • present how to teach this new feature
  • note rejected ideas in further detail

Motivation

History of Positional-Only Parameter Semantics in Python

Python originally supported positional-only parameters. Early versions of the language lacked the ability to call functions with arguments bound to parameters by name. Around Python 1.0, parameter semantics changed to be positional-or-keyword. Since then, users have been able to provide arguments to a function either positionally or by the keyword name specified in the function’s definition.

In current versions of Python, many CPython “builtin” and standard library functions only accept positional-only parameters. The resulting semantics can be easily observed by calling one of these functions using keyword arguments:

>>> help(pow)
...
pow(x, y, z=None, /)
...

>>> pow(x=5, y=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: pow() takes no keyword arguments

pow() expresses that its parameters are positional-only via the / marker. However, this is only a documentation convention; Python developers cannot use this syntax in code.

There are functions with other interesting semantics:

  • range(), an overloaded function, accepts an optional parameter to the left of its required parameter. [4]
  • dict(), whose mapping/iterator parameter is optional and semantically must be positional-only. Any externally visible name for this parameter would occlude that name going into the **kwarg keyword variadic parameter dict. [3]

One can emulate these semantics in Python code by accepting (*args, **kwargs) and parsing the arguments manually. However, this results in a disconnect between the function definition and what the function contractually accepts. The function definition does not match the logic of the argument handling.

Additionally, the / syntax is used beyond CPython for specifying similar semantics (i.e., [1] [2]); thus, indicating that these scenarios are not exclusive to CPython and the standard library.

Problems Without Positional-Only Parameters

Without positional-only parameters, there are challenges for library authors and users of APIs. The following subsections outline the problems encountered by each entity.

Challenges for Library Authors

With positional-or-keyword parameters, the mix of calling conventions is not always desirable. Authors may want to restrict usage of an API by disallowing calling the API with keyword arguments, which exposes the name of the parameter when part of the public API. This approach is especially useful for required function parameters that already have semantic meaning (e.g, namedtuple(typenames, field_names, …) or when the parameter name has no true external meaning (e.g., arg1, arg2, …, etc for min()). If a caller of an API starts using a keyword argument, the library author cannot rename the parameter because it would be a breaking change.

Positional-only parameters can be emulated by extracting arguments from *args one by one. However, this approach is error-prone and is not synonymous with the function definition, as previously mentioned. The usage of the function is ambiguous and forces users to look at help(), the associated auto-generated documentation, or source code to understand what parameters the function contractually accepts.

Challenges for Users of an API

Users may be surprised when first encountering positional-only notation. This is expected given that it has only recently been documented [13] and it is not possible to use in Python code. For these reasons, this notation is currently an outlier that appears only in CPython APIs developed in C. Documenting the notation and making it possible to use it in Python code would eliminate this disconnect.

Furthermore, the current documentation for positional-only parameters is inconsistent:

  • Some functions denote optional groups of positional-only parameters by enclosing them in nested square brackets. [5]
  • Some functions denote optional groups of positional-only parameters by presenting multiple prototypes with varying numbers of parameters. [6]
  • Some functions use both of the above approaches. [4] [7]

Another point the current documentation does not distinguish is whether a function takes positional-only parameters. open() accepts keyword arguments; however, ord() does not — there is no way of telling just by reading the existing documentation.

Benefits of Positional-Only Parameters

Positional-only parameters give more control to library authors to better express the intended usage of an API and allows the API to evolve in a safe, backward-compatible way. Additionally, it makes the Python language more consistent with existing documentation and the behavior of various “builtin” and standard library functions.

Empowering Library Authors

Library authors would have the flexibility to change the name of positional-only parameters without breaking callers. This flexibility reduces the cognitive burden for choosing an appropriate public-facing name for required parameters or parameters that have no true external semantic meaning.

Positional-only parameters are useful in several situations such as:

  • when a function accepts any keyword argument but also can accept a positional one
  • when a parameter has no external semantic meaning
  • when an API’s parameters are required and unambiguous

A key scenario is when a function accepts any keyword argument but can also accepts a positional one. Prominent examples are Formatter.format and dict.update. For instance, dict.update accepts a dictionary (positionally), an iterable of key/value pairs (positionally), or multiple keyword arguments. In this scenario, if the dictionary parameter were not positional-only, the user could not use the name that the function definition uses for the parameter or, conversely, the function could not distinguish easily if the argument received is the dictionary/iterable or a keyword argument for updating the key/value pair.

Another scenario where positional-only parameters are useful is when the parameter name has no true external semantic meaning. For example, let’s say we want to create a function that converts from one type to another:

def as_my_type(x):
    ...

The name of the parameter provides no intrinsic value and forces the API author to maintain its name forever since callers might pass x as a keyword argument.

Additionally, positional-only parameters are useful when an API’s parameters are required and is unambiguous with respect to function. For example:

def add_to_queue(item: QueueItem):
    ...

The name of the function makes clear the argument expected. A keyword argument provides minimal benefit and also limits the future evolution of the API. Say at a later time we want this function to be able to take multiple items, while preserving backwards compatibility:

def add_to_queue(items: Union[QueueItem, List[QueueItem]]):
    ...

or to take them by using argument lists:

def add_to_queue(*items: QueueItem):
    ...

the author would be forced to always keep the original parameter name to avoid potentially breaking callers.

By being able to specify positional-only parameters, an author can change the name of the parameters freely or even change them to *args, as seen in the previous example. There are multiple function definitions in the standard library which fall into this category. For example, the required parameter to collections.defaultdict (called default_factory in its documentation) can only be passed positionally. One special case of this situation is the self parameter for class methods: it is undesirable that a caller can bind by keyword to the name self when calling the method from the class:

io.FileIO.write(self=f, b=b"data")

Indeed, function definitions from the standard library implemented in C usually take self as a positional-only parameter:

>>> help(io.FileIO.write)
Help on method_descriptor:

write(self, b, /)
    Write buffer b to file, return number of bytes written.

Improving Language Consistency

The Python language would be more consistent with positional-only parameters. If the concept is a normal feature of Python rather than a feature exclusive to extension modules, it would reduce confusion for users encountering functions with positional-only parameters. Some major third-party packages are already using the / notation in their function definitions [1] [2].

Bridging the gap found between “builtin” functions which specify positional-only parameters and pure Python implementations that lack the positional syntax would improve consistency. The / syntax is already exposed in the existing documentation such as when builtins and interfaces are generated by the argument clinic.

Another essential aspect to consider is PEP 399, which mandates that pure Python versions of modules in the standard library must have the same interface and semantics that the accelerator modules implemented in C. For example, if collections.defaultdict were to have a pure Python implementation it would need to make use of positional-only parameters to match the interface of its C counterpart.

Rationale

We propose to introduce positional-only parameters as a new syntax to the Python language.

The new syntax will enable library authors to further control how their API can be called. It will allow designating which parameters must be called as positional-only, while preventing them from being called as keyword arguments.

Previously, (informational) PEP 457 defined the syntax, but with a much more vague scope. This PEP takes the original proposal a step further by justifying the syntax and providing an implementation for the / syntax in function definitions.

Performance

In addition to the aforementioned benefits, the parsing and handling of positional-only arguments is faster. This performance benefit can be demonstrated in this thread about converting keyword arguments to positional: [11]. Due to this speedup, there has been a recent trend towards moving builtins away from keyword arguments: recently, backwards-incompatible changes were made to disallow keyword arguments to bool, float, list, int, tuple.

Maintainability

Providing a way to specify positional-only parameters in Python will make it easier to maintain pure Python implementations of C modules. Additionally, library authors defining functions will have the choice for choosing positional-only parameters if they determine that passing a keyword argument provides no additional clarity.

This is a well discussed, recurring topic on the Python mailing lists:

Logical ordering

Positional-only parameters also have the (minor) benefit of enforcing some logical order when calling interfaces that make use of them. For example, the range function takes all its parameters positionally and disallows forms like:

range(stop=5, start=0, step=2)
range(stop=5, step=2, start=0)
range(step=2, start=0, stop=5)
range(step=2, stop=5, start=0)

at the price of disallowing the use of keyword arguments for the (unique) intended order:

range(start=0, stop=5, step=2)

Compatibility for Pure Python and C Modules

Another critical motivation for positional-only parameters is PEP 399: Pure Python/C Accelerator Module Compatibility Requirements. This PEP states that:

This PEP requires that in these instances that the C code must pass the test suite used for the pure Python code to act as much as a drop-in replacement as reasonably possible

If the C code is implemented using the existing capabilities to implement positional-only parameters using the argument clinic, and related machinery, it is not possible for the pure Python counterpart to match the provided interface and requirements. This creates a disparity between the interfaces of some functions and classes in the CPython standard library and other Python implementations. For example:

$ python3 # CPython 3.7.2
>>> import binascii; binascii.crc32(data=b'data')
TypeError: crc32() takes no keyword arguments

$ pypy3 # PyPy 6.0.0
>>>> import binascii; binascii.crc32(data=b'data')
2918445923

Other Python implementations can reproduce the CPython APIs manually, but this goes against the spirit of PEP 399 to avoid duplication of effort by mandating that all modules added to Python’s standard library must have a pure Python implementation with the same interface and semantics.

Consistency in Subclasses

Another scenario where positional-only parameters provide benefit occurs when a subclass overrides a method of the base class and changes the name of parameters that are intended to be positional:

class Base:
    def meth(self, arg: int) -> str:
        ...

class Sub(Base):
    def meth(self, other_arg: int) -> str:
        ...

def func(x: Base):
    x.meth(arg=12)

func(Sub())  # Runtime error

This situation could be considered a Liskov violation — the subclass cannot be used in a context when an instance of the base class is expected. Renaming arguments when overloading methods can happen when the subclass has reasons to use a different choice for the parameter name that is more appropriate for the specific domain of the subclass (e.g., when subclassing Mapping to implement a DNS lookup cache, the derived class may not want to use the generic argument names ‘key’ and ‘value’ but rather ‘host’ and ‘address’). Having this function definition with positional-only parameters can avoid this problem because users will not be able to call the interface using keyword arguments. In general, designing for subclassing usually involves anticipating code that hasn’t been written yet and over which the author has no control. Having measures that can facilitate the evolution of interfaces in a backwards-compatible would be useful for library authors.

Optimizations

A final argument in favor of positional-only parameters is that they allow some new optimizations like the ones already present in the argument clinic due to the fact that parameters are expected to be passed in strict order. For example, CPython’s internal METH_FASTCALL calling convention has been recently specialized for functions with positional-only parameters to eliminate the cost for handling empty keywords. Similar performance improvements can be applied when creating the evaluation frame of Python functions thanks to positional-only parameters.

Specification

Syntax and Semantics

From the “ten-thousand foot view”, eliding *args and **kwargs for illustration, the grammar for a function definition would look like:

def name(positional_or_keyword_parameters, *, keyword_only_parameters):

Building on that example, the new syntax for function definitions would look like:

def name(positional_only_parameters, /, positional_or_keyword_parameters,
         *, keyword_only_parameters):

The following would apply:

  • All parameters left of the / are treated as positional-only.
  • If / is not specified in the function definition, that function does not accept any positional-only arguments.
  • The logic around optional values for positional-only parameters remains the same as for positional-or-keyword parameters.
  • Once a positional-only parameter is specified with a default, the following positional-only and positional-or-keyword parameters need to have defaults as well.
  • Positional-only parameters which do not have default values are required positional-only parameters.

Therefore, the following would be valid function definitions:

def name(p1, p2, /, p_or_kw, *, kw):
def name(p1, p2=None, /, p_or_kw=None, *, kw):
def name(p1, p2=None, /, *, kw):
def name(p1, p2=None, /):
def name(p1, p2, /, p_or_kw):
def name(p1, p2, /):

Just like today, the following would be valid function definitions:

def name(p_or_kw, *, kw):
def name(*, kw):

While the following would be invalid:

def name(p1, p2=None, /, p_or_kw, *, kw):
def name(p1=None, p2, /, p_or_kw=None, *, kw):
def name(p1=None, p2, /):

Full Grammar Specification

A simplified view of the proposed grammar specification is:

typedargslist:
  tfpdef ['=' test] (',' tfpdef ['=' test])* ',' '/' [','  # and so on

varargslist:
  vfpdef ['=' test] (',' vfpdef ['=' test])* ',' '/' [','  # and so on

Based on the reference implementation in this PEP, the new rule for typedarglist would be:

typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* ',' '/' [',' [tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [
        '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
      | '**' tfpdef [',']]]
  | '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
  | '**' tfpdef [',']] ] )| (
   tfpdef ['=' test] (',' tfpdef ['=' test])* [',' [
        '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
      | '**' tfpdef [',']]]
 | '*' [tfpdef] (',' tfpdef ['=' test])* [',' ['**' tfpdef [',']]]
 | '**' tfpdef [','])

and for varargslist would be:

varargslist: vfpdef ['=' test ](',' vfpdef ['=' test])* ',' '/' [',' [ (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
        '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
      | '**' vfpdef [',']]]
  | '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
  | '**' vfpdef [',']) ]] | (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
        '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
      | '**' vfpdef [',']]]
  | '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
  | '**' vfpdef [',']
)

Semantic Corner Case

The following is an interesting corollary of the specification. Consider this function definition:

def foo(name, **kwds):
    return 'name' in kwds

There is no possible call that will make it return True. For example:

>>> foo(1, **{'name': 2})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() got multiple values for argument 'name'
>>>

But using / we can support this:

def foo(name, /, **kwds):
    return 'name' in kwds

Now the above call will return True.

In other words, the names of positional-only parameters can be used in **kwds without ambiguity. (As another example, this benefits the signatures of dict() and dict.update().)

Origin of “/” as a Separator

Using / as a separator was initially proposed by Guido van Rossum in 2012 [8] :

Alternative proposal: how about using ‘/’ ? It’s kind of the opposite of ‘*’ which means “keyword argument”, and ‘/’ is not a new character.

How To Teach This

Introducing a dedicated syntax to mark positional-only parameters is closely analogous to existing keyword-only arguments. Teaching these concepts together may simplify how to teach the possible function definitions a user may encounter or design.

This PEP recommends adding a new subsection to the Python documentation, in the section “More on Defining Functions”, where the rest of the argument types are discussed. The following paragraphs serve as a draft for these additions. They will introduce the notation for both positional-only and keyword-only parameters. It is not intended to be exhaustive, nor should it be considered the final version to be incorporated into the documentation.


By default, arguments may be passed to a Python function either by position or explicitly by keyword. For readability and performance, it makes sense to restrict the way arguments can be passed so that a developer need only look at the function definition to determine if items are passed by position, by position or keyword, or by keyword.

A function definition may look like:

def f(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
      -----------    ----------     ----------
        |             |                  |
        |        Positional or keyword   |
        |                                - Keyword only
         -- Positional only

where / and * are optional. If used, these symbols indicate the kind of parameter by how the arguments may be passed to the function: positional-only, positional-or-keyword, and keyword-only. Keyword parameters are also referred to as named parameters.

Positional-or-Keyword Arguments

If / and * are not present in the function definition, arguments may be passed to a function by position or by keyword.

Positional-Only Parameters

Looking at this in a bit more detail, it is possible to mark certain parameters as positional-only. If positional-only, the parameters’ order matters, and the parameters cannot be passed by keyword. Positional-only parameters would be placed before a / (forward-slash). The / is used to logically separate the positional-only parameters from the rest of the parameters. If there is no / in the function definition, there are no positional-only parameters.

Parameters following the / may be positional-or-keyword or keyword-only.

Keyword-Only Arguments

To mark parameters as keyword-only, indicating the parameters must be passed by keyword argument, place an * in the arguments list just before the first keyword-only parameter.

Function Examples

Consider the following example function definitions paying close attention to the markers / and *:

>>> def standard_arg(arg):
...     print(arg)
...
>>> def pos_only_arg(arg, /):
...     print(arg)
...
>>> def kwd_only_arg(*, arg):
...     print(arg)
...
>>> def combined_example(pos_only, /, standard, *, kwd_only):
...     print(pos_only, standard, kwd_only)

The first function definition standard_arg, the most familiar form, places no restrictions on the calling convention and arguments may be passed by position or keyword:

>>> standard_arg(2)
2

>>> standard_arg(arg=2)
2

The second function pos_only_arg is restricted to only use positional parameters as there is a / in the function definition:

>>> pos_only_arg(1)
1

>>> pos_only_arg(arg=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: pos_only_arg() got an unexpected keyword argument 'arg'

The third function kwd_only_args only allows keyword arguments as indicated by a * in the function definition:

>>> kwd_only_arg(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: kwd_only_arg() takes 0 positional arguments but 1 was given

>>> kwd_only_arg(arg=3)
3

And the last uses all three calling conventions in the same function definition:

>>> combined_example(1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: combined_example() takes 2 positional arguments but 3 were given

>>> combined_example(1, 2, kwd_only=3)
1 2 3

>>> combined_example(1, standard=2, kwd_only=3)
1 2 3

>>> combined_example(pos_only=1, standard=2, kwd_only=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: combined_example() got an unexpected keyword argument 'pos_only'

Recap

The use case will determine which parameters to use in the function definition:

def f(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):

As guidance:

  • Use positional-only if names do not matter or have no meaning, and there are only a few arguments which will always be passed in the same order.
  • Use keyword-only when names have meaning and the function definition is more understandable by being explicit with names.

Reference Implementation

An initial implementation that passes the CPython test suite is available for evaluation [10].

The benefits of this implementations are speed of handling positional-only parameters, consistency with the implementation of keyword-only parameters (PEP 3102), and a simpler implementation of all the tools and modules that would be impacted by this change.

Rejected Ideas

Do Nothing

Always an option — the status quo. While this was considered, the aforementioned benefits are worth the addition to the language.

Decorators

It has been suggested on python-ideas [9] to provide a decorator written in Python for this feature.

This approach has the benefit of not polluting function definition with additional syntax. However, we have decided to reject this idea because:

  • It introduces an asymmetry with how parameter behavior is declared.
  • It makes it difficult for static analyzers and type checkers to safely identify positional-only parameters. They would need to query the AST for the list of decorators and identify the correct one by name or with extra heuristics, while keyword-only parameters are exposed directly in the AST. In order for tools to correctly identify positional-only parameters, they would need to execute the module to access any metadata the decorator is setting.
  • Any error with the declaration will be reported only at runtime.
  • It may be more difficult to identify positional-only parameters in long function definitions, as it forces the user to count them to know which is the last one that is impacted by the decorator.
  • The / syntax has already been introduced for C functions. This inconsistency will make it more challenging to implement any tools and modules that deal with this syntax — including but not limited to, the argument clinic, the inspect module and the ast module.
  • The decorator implementation would likely impose a runtime performance cost, particularly when compared to adding support directly to the interpreter.

Per-Argument Marker

A per-argument marker is another language-intrinsic option. The approach adds a token to each of the parameters to indicate they are positional-only and requires those parameters to be placed together. Example:

def (.arg1, .arg2, arg3):

Note the dot (i.e., .) on .arg1 and .arg2. While this approach may be easier to read, it has been rejected because / as an explicit marker is congruent with * for keyword-only arguments and is less error-prone.

It should be noted that some libraries already use leading underscore [12] to conventionally indicate parameters as positional-only.

Using “__” as a Per-Argument Marker

Some libraries and applications (like mypy or jinja) use names prepended with a double underscore (i.e., __) as a convention to indicate positional-only parameters. We have rejected the idea of introducing __ as a new syntax because:

  • It is a backwards-incompatible change.
  • It is not symmetric with how the keyword-only parameters are currently declared.
  • Querying the AST for positional-only parameters would require checking the normal arguments and inspecting their names, whereas keyword-only parameters have a property associated with them (FunctionDef.args.kwonlyargs).
  • Every parameter would need to be inspected to know when positional-only arguments end.
  • The marker is more verbose, forcing marking every positional-only parameter.
  • It clashes with other uses of the double underscore prefix like invoking name mangling in classes.

Group Positional-Only Parameters With Parentheses

Tuple parameter unpacking is a Python 2 feature which allows the use of a tuple as a parameter in a function definition. It allows a sequence argument to be unpacked automatically. An example is:

def fxn(a, (b, c), d):
    pass

Tuple argument unpacking was removed in Python 3 (PEP 3113). There has been a proposition to reuse this syntax to implement positional-only parameters. We have rejected this syntax for indicating positional only parameters for several reasons:

  • The syntax is asymmetric with respect to how keyword-only parameters are declared.
  • Python 2 uses this syntax which could raise confusion regarding the behavior of this syntax. This would be surprising to users porting Python 2 codebases that were using this feature.
  • This syntax is very similar to tuple literals. This can raise additional confusion because it can be confused with a tuple declaration.

After Separator Proposal

Marking positional-parameters after the / was another idea considered. However, we were unable to find an approach which would modify the arguments after the marker. Otherwise, would force the parameters before the marker to be positional-only as well. For example:

def (x, y, /, z):

If we define that / marks z as positional-only, it would not be possible to specify x and y as keyword arguments. Finding a way to work around this limitation would add confusion given that at the moment keyword arguments cannot be followed by positional arguments. Therefore, / would make both the preceding and following parameters positional-only.

Thanks

Credit for some of the content of this PEP is contained in Larry Hastings’s PEP 457.

Credit for the use of / as the separator between positional-only and positional-or-keyword parameters go to Guido van Rossum, in a proposal from 2012. [8]

Credit for discussion about the simplification of the grammar goes to Braulio Valdivieso.


Source: https://github.com/python/peps/blob/main/peps/pep-0570.rst

Last modified: 2023-09-09 17:39:29 GMT