feat: allow Executors as dataclass by JoanFM · Pull Request #4918 · jina-ai/serve · GitHub

feat: allow Executors as dataclass #4918

Merged: 4 commits, Jun 13, 2022
38 changes: 32 additions & 6 deletions docs/fundamentals/executor/executor-api.md
@@ -11,7 +11,7 @@ Executor uses `docarray.DocumentArray` as input and output data structure. Pleas
{class}`~jina.Executor` is a self-contained component and performs a group of tasks on a `DocumentArray`.
It encapsulates functions that process `DocumentArray`s. Inside the Executor, these functions are decorated with `@requests`. To create an Executor, you only need to follow three principles:

1. An `Executor` should subclass directly from the `jina.Executor` class.
1. An `Executor` should subclass directly from the `jina.Executor` class. An `Executor` can also be a `dataclass`.
2. An `Executor` class is a bag of functions with shared state or configuration (via `self`); it can contain an arbitrary number of
functions with arbitrary names.
3. Functions decorated by `@requests` will be invoked according to their `on=` endpoint. These functions can be coroutines (`async def`) or regular functions.
@@ -26,11 +26,11 @@ You can name your executor class freely.

### `__init__`

No need to implement `__init__` if your `Executor` does not contain initial states.
No need to implement `__init__` if your `Executor` does not contain initial states or if it is a [dataclass](https://docs.python.org/3/library/dataclasses.html).

If your executor has `__init__`, it needs to carry `**kwargs` in the signature and call `super().__init__(**kwargs)`
If your executor has `__init__`, it needs to carry `**kwargs` in the signature and call `super().__init__(**kwargs)`
in the body:

````{tab} Executor
```python
from jina import Executor

@@ -41,11 +41,24 @@ class MyExecutor(Executor):
        self.bar = bar
        self.foo = foo
```
````

````{tab} Executor as dataclass
```python
from dataclasses import dataclass
from jina import Executor


@dataclass
class MyExecutor(Executor):
    bar: int
    foo: str
```
````

````{admonition} What is inside kwargs?
:class: hint
Here, `kwargs` are reserved for Jina to inject `metas` and `requests` (representing the request-to-function mapping) values when the Executor is used inside a Flow.
Here, `kwargs` are reserved for Jina to inject `metas` and `requests` (representing the request-to-function mapping) values when the Executor is used inside a Flow. When the `Executor` is a `dataclass`, Jina injects these parameters as well, just as in the regular case where `super().__init__` is called.

You can access the values of these arguments in the `__init__` body via `self.metas`/`self.requests`/`self.runtime_args`,
or modify their values before passing them to `super().__init__()`.
@@ -95,6 +108,20 @@ Some of these `arguments` can be used when developing the internal logic of the

These `special` arguments are `workspace`, `requests`, `metas`, `runtime_args`.

Alternatively, you can declare your `Executor` as a [dataclass](https://docs.python.org/3/library/dataclasses.html). In this case, you do not provide a constructor.
Jina then injects all of these `special` arguments without requiring you to call any specific method.

```python
from dataclasses import dataclass
from jina import Executor


@dataclass
class MyExecutor(Executor):
    bar: int
    foo: str
```

(executor-workspace)=
### `workspace`

@@ -161,7 +188,6 @@ The list of the `runtime_args` is:

These can **not** be provided by the user through any API. They are generated by the Flow orchestration.


## See further

- {ref}`Executor in Flow <executor-in-flow>`
25 changes: 19 additions & 6 deletions jina/jaml/parsers/executor/legacy.py
@@ -1,3 +1,4 @@
import dataclasses
import inspect
from functools import reduce
from typing import Any, Dict, Optional, Set, Type
@@ -72,12 +73,24 @@ def parse(

        cls._init_from_yaml = True
        # tmp_p = {kk: expand_env_var(vv) for kk, vv in data.get('with', {}).items()}
        obj = cls(
            **data.get('with', {}),
            metas=data.get('metas', {}),
            requests=data.get('requests', {}),
            runtime_args=runtime_args,
        )
        if dataclasses.is_dataclass(cls):
            obj = cls(
                **data.get('with', {}),
            )
            cls.__bases__[0].__init__(
                obj,
                **data.get('with', {}),
                metas=data.get('metas', {}),
                requests=data.get('requests', {}),
                runtime_args=runtime_args,
            )
        else:
            obj = cls(
                **data.get('with', {}),
                metas=data.get('metas', {}),
                requests=data.get('requests', {}),
                runtime_args=runtime_args,
            )
        cls._init_from_yaml = False

        # check if the yaml file used to instanciate 'cls' has arguments that are not in 'cls'
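
The two-step construction in the `dataclasses.is_dataclass(cls)` branch can be illustrated outside Jina. The sketch below is a minimal, self-contained approximation and not the library's code: `BaseExecutorSketch` and `build` are hypothetical names standing in for the Executor base class and the parser's `parse` method. It relies on the fact that a dataclass-generated `__init__` does not call the base class `__init__`, so the base initializer is invoked explicitly on the already-built object.

```python
import dataclasses


class BaseExecutorSketch:
    """Hypothetical stand-in for the Executor base class (not Jina's real one)."""

    def __init__(self, metas=None, requests=None, runtime_args=None, **kwargs):
        # The `with` values arrive in **kwargs and are ignored by this stand-in.
        self.metas = metas or {}
        self.requests = requests or {}
        self.runtime_args = runtime_args or {}


@dataclasses.dataclass
class MyDataClassExecutor(BaseExecutorSketch):
    my_field: str = ''


def build(cls, data, runtime_args=None):
    """Hypothetical helper mirroring the branching shown in the diff."""
    with_args = data.get('with', {})
    special = dict(
        metas=data.get('metas', {}),
        requests=data.get('requests', {}),
        runtime_args=runtime_args,
    )
    if dataclasses.is_dataclass(cls):
        # Step 1: the generated __init__ only accepts the declared fields.
        obj = cls(**with_args)
        # Step 2: run the first base class initializer on the existing object
        # so that the special arguments are attached to it.
        cls.__bases__[0].__init__(obj, **with_args, **special)
    else:
        # A regular class forwards everything through its own __init__.
        obj = cls(**with_args, **special)
    return obj


executor = build(
    MyDataClassExecutor,
    {'with': {'my_field': 'this is my field'}, 'metas': {'name': 'test-name'}},
)
```

Note that `cls.__bases__[0]` assumes the class to initialize is the first listed base, which holds for single inheritance as in this sketch.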
29 changes: 29 additions & 0 deletions tests/integration/dataclass_executor/test_dataclass_executor.py
@@ -0,0 +1,29 @@
import dataclasses

from docarray import DocumentArray
from jina import Executor, Flow, requests


def test_executor_dataclass():
    @dataclasses.dataclass
    class MyDataClassExecutor(Executor):
        my_field: str

        @requests(on=['/search'])
        def baz(self, docs, **kwargs):
            for doc in docs:
                doc.tags['metas_name'] = self.metas.name
                doc.tags['my_field'] = self.my_field

    f = Flow().add(
        uses=MyDataClassExecutor,
        uses_with={'my_field': 'this is my field'},
        uses_metas={'name': 'test-name-updated'},
        uses_requests={'/foo': 'baz'},
    )
    with f:
        res = f.post(on='/foo', inputs=DocumentArray.empty(2))
    assert len(res) == 2
    for r in res:
        assert r.tags['metas_name'] == 'test-name-updated'
        assert r.tags['my_field'] == 'this is my field'
Empty file.
Empty file.
30 changes: 29 additions & 1 deletion tests/unit/jaml/test_type_parse.py
@@ -1,3 +1,4 @@
import dataclasses
import os

import pytest
@@ -13,6 +14,15 @@ class MyExecutor(BaseExecutor):
    pass


@dataclasses.dataclass
class MyDataClassExecutor(BaseExecutor):
    my_field: str = ''

    @requests
    def baz(self, **kwargs):
        pass


def test_non_empty_reg_tags():
    assert JAML.registered_tags()
    assert __default_executor__ in JAML.registered_tags()
@@ -175,7 +185,6 @@ def test_parsing_brackets_in_envvar():
            'VAR1': '{"1": "2"}',
        }
    ):

        b = JAML.load(flow_yaml, substitute=True)
        assert b['executors'][0]['env']['var1'] == '{"1": "2"}'
        assert b['executors'][0]['env']['var2'] == 'a'
@@ -213,3 +222,22 @@ def test_jtype(tmpdir):

    assert type(BaseExecutor.load_config(exec_path)) == BaseExecutor
    assert type(Flow.load_config(flow_path)) == Flow


def test_load_dataclass_executor():
    executor_yaml = '''
jtype: MyDataClassExecutor
with:
    my_field: this is my field
metas:
    name: test-name-updated
    workspace: test-work-space-updated
requests:
    /foo: baz
    '''

    exec = BaseExecutor.load_config(executor_yaml)
    assert exec.my_field == 'this is my field'
    assert exec.requests['/foo'] == MyDataClassExecutor.baz
    assert exec.metas.name == 'test-name-updated'
    assert exec.metas.workspace == 'test-work-space-updated'