metadata-automation-challenge

Building docker images

docker build -t metadata-baseline -f Dockerfile.baseline .
docker build -t metadata-validation -f Dockerfile.validation .
docker build -t metadata-scoring -f Dockerfile.scoring .

Running the baseline method

Here we describe how to apply the baseline method to automatically annotate a dataset (see Data Description).

1. Create the folders input, data and output in your current directory.
2. Place the input dataset in input, e.g. input/APOLLO-2-leaderboard.tsv
3. Run the following command:

docker run \
  -v $(pwd)/input:/input:ro \
  -v $(pwd)/data:/data:ro \
  -v $(pwd)/output:/output \
  metadata-baseline APOLLO-2-leaderboard

where APOLLO-2-leaderboard is the name of the dataset file in the folder input (without the extension .tsv). Here $(pwd) is replaced by the shell with the absolute path of the current directory.
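As a small illustration of that substitution, the snippet below prints the host paths that the three -v options receive. The quoting around the expansion is an addition made here, not part of the original command; it guards against a current directory whose path contains spaces.

```shell
# Print the host side of each bind mount used by the docker run command above.
# $(pwd) is expanded by the shell before docker ever sees the arguments.
for dir in input data output; do
  printf '%s\n' "$(pwd)/$dir"
done

# Quoted form of one mount option. This is an assumption made for robustness:
# the original command leaves $(pwd) unquoted, which splits on spaces.
mount_input="$(pwd)/input:/input:ro"
printf '%s\n' "$mount_input"
```

The quoted value can be passed to docker as -v "$mount_input".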

The file output/APOLLO-2-leaderboard-Submission.json (/output/APOLLO-2-leaderboard-Submission.json inside the container) is created upon successful completion of the above command.
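The folder setup and the final check on the output file can be wrapped in two small shell helpers. This is a sketch and not part of the repository: the function names prepare_folders and check_submission are hypothetical, and the check only confirms that a non-empty file exists on the host, not that its content is valid.

```shell
# Hypothetical helper: create the three folders mounted by docker run.
prepare_folders() {
  mkdir -p input data output
}

# Hypothetical helper: succeed only if the submission file for the given
# dataset name exists on the host and is not empty.
check_submission() {
  dataset="$1"
  file="output/${dataset}-Submission.json"
  if [ -s "$file" ]; then
    echo "Found $file"
  else
    echo "Missing or empty: $file" >&2
    return 1
  fi
}

prepare_folders
```

After the docker run command has finished, calling check_submission APOLLO-2-leaderboard reports whether the expected file is present.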

Validating the submission file

The following command checks that the generated submission file has a valid format.

$ docker run \
  -v $(pwd)/output/APOLLO-2-leaderboard-Submission.json:/input.json:ro \
  metadata-validation \
  validate-submission --json_filepath /input.json
Your JSON file is valid!

where $(pwd)/output/APOLLO-2-leaderboard-Submission.json points to the location of the submission file generated in the previous section.

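Alternatively, the validation script can be run directly with Python. First create and activate a virtual environment, then install the dependencies.

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install click jsonschema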

Here is the generic command to validate the format of a submission file.

$ python schema/validate.py validate-submission \
  --json_filepath yourjson.json \
  --schema_filepath schema/output-schema.json

To validate the submission file generated in the previous section, the command becomes:

$ python schema/validate.py validate-submission \
  --json_filepath output/APOLLO-2-leaderboard-Submission.json \
  --schema_filepath schema/output-schema.json
Your JSON file is valid!

Scoring the submission

Here we evaluate the performance of the submission by comparing the content of the submission file to a gold standard (e.g. manual annotations).

$ docker run \
  -v $(pwd)/output/Apollo2-Submission.json:/submission.json:ro \
  -v $(pwd)/data/Annotated-Apollo2.json:/goldstandard.json:ro \
  metadata-scoring score-submission /submission.json /goldstandard.json
0.9692308
