docker build -t metadata-baseline -f Dockerfile.baseline .
docker build -t metadata-validation -f Dockerfile.validation .
docker build -t metadata-scoring -f Dockerfile.scoring .
Here we describe how to apply the baseline method to automatically annotate a dataset (see Data Description).
- Create the folders input, data and output in your current directory.
- Place the input dataset in input, e.g. input/APOLLO-2-leaderboard.tsv
- Run the following command
docker run \
-v $(pwd)/input:/input:ro \
-v $(pwd)/data:/data:ro \
-v $(pwd)/output:/output \
metadata-baseline APOLLO-2-leaderboard
where APOLLO-2-leaderboard is the name of the dataset in the folder input (without the extension .tsv). Here $(pwd) is replaced by the shell with the absolute path of the current directory.
The file output/APOLLO-2-leaderboard-Submission.json (written to /output inside the container) is created upon successful completion of the above command.
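The steps above can be collected into a few shell functions. This is a minimal sketch: the function names prepare_workspace, submission_path and run_baseline are hypothetical helpers and are not part of the repository. The docker command itself is the one shown above, and the output file name follows the pattern seen in the example.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of the repository.

# Create the three folders that the baseline container mounts.
prepare_workspace() {
    mkdir -p input data output
}

# Print the path where the submission file is expected for a dataset name,
# assuming the <dataset>-Submission.json pattern from the example above.
submission_path() {
    printf 'output/%s-Submission.json\n' "$1"
}

# Run the baseline image on input/<dataset>.tsv (same command as above),
# refusing to start if the input file has not been placed yet.
run_baseline() {
    if [ ! -f "input/$1.tsv" ]; then
        echo "input/$1.tsv not found" >&2
        return 1
    fi
    docker run \
        -v "$(pwd)/input:/input:ro" \
        -v "$(pwd)/data:/data:ro" \
        -v "$(pwd)/output:/output" \
        metadata-baseline "$1"
}

# Example usage:
#   prepare_workspace
#   run_baseline APOLLO-2-leaderboard
#   ls "$(submission_path APOLLO-2-leaderboard)"
```

Quoting the -v arguments keeps the command working when the current directory contains spaces.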
The following command checks that the format of the submission file generated is valid.
$ docker run \
-v $(pwd)/output/APOLLO-2-leaderboard-Submission.json:/input.json:ro \
metadata-validation \
validate-submission --json_filepath /input.json
Your JSON file is valid!
where $(pwd)/output/APOLLO-2-leaderboard-Submission.json
points to the location of the submission file generated in the previous section.
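When the host path passed to -v does not exist, Docker creates a directory at that location instead of reporting an error, so a mistyped file name can lead to a confusing validation failure. The sketch below checks for the file before running the validation command shown above. The names require_file and validate_submission are hypothetical helpers, not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: fail early if a file to be bind-mounted is missing.
require_file() {
    if [ ! -f "$1" ]; then
        echo "error: $1 is not a regular file" >&2
        return 1
    fi
}

# Validate a submission file with the metadata-validation image
# (same command as above), after checking that the file exists.
validate_submission() {
    require_file "$1" || return 1
    # Bind mounts need an absolute host path.
    abs_path="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
    docker run \
        -v "$abs_path:/input.json:ro" \
        metadata-validation \
        validate-submission --json_filepath /input.json
}

# Example usage:
#   validate_submission output/APOLLO-2-leaderboard-Submission.json
```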
Alternatively, the validation script can be run directly using Python.
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install click jsonschema
Here is the generic command to validate the format of a submission file.
$ python schema/validate.py validate-submission \
--json_filepath yourjson.json \
--schema_filepath schema/output-schema.json
To validate the submission file generated in the previous section, the command becomes:
$ python schema/validate.py validate-submission \
--json_filepath output/APOLLO-2-leaderboard-Submission.json \
--schema_filepath schema/output-schema.json
Your JSON file is valid!
Here we evaluate the performance of the submission by comparing the content of the submission file to a gold standard (e.g. manual annotations).
$ docker run \
-v $(pwd)/output/Apollo2-Submission.json:/submission.json:ro \
-v $(pwd)/data/Annotated-Apollo2.json:/goldstandard.json:ro \
metadata-scoring score-submission /submission.json /goldstandard.json
0.9692308
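If the score needs to be checked in a script, for example to flag a regression, the printed value can be compared against a threshold. The following is a minimal sketch that assumes the scoring command prints only the score on standard output, as in the example above. The function name score_at_least and the threshold 0.95 are hypothetical and chosen for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the score ($1) is at least the threshold ($2).
# awk is used because POSIX sh has no floating-point comparison.
score_at_least() {
    awk -v s="$1" -v t="$2" 'BEGIN { exit !(s + 0 >= t + 0) }'
}

# Example usage (0.95 is an arbitrary illustrative threshold):
#   score=$(docker run \
#       -v "$(pwd)/output/Apollo2-Submission.json:/submission.json:ro" \
#       -v "$(pwd)/data/Annotated-Apollo2.json:/goldstandard.json:ro" \
#       metadata-scoring score-submission /submission.json /goldstandard.json)
#   score_at_least "$score" 0.95 || echo "score below threshold" >&2
```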