Description
I tried replicating the CovidQA experiments on Colab (Tesla K80), Compute Canada (Tesla V100), and locally (GTX 1650) on #134, using the index from https://www.dropbox.com/s/z8s0urul6l4zig2/lucene-index-cord19-paragraph-2020-05-12.tar.gz?dl=1
Re-Ranking with Random (I got the same results for both commands)
```
python -um pygaggle.run.evaluate_kaggle_highlighter --method random --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
```
```
precision@1   0.0
recall@3      0.0199546485260771
recall@50     0.3247165532879819
recall@1000   1.0
mrr           0.03999734528458418
mrr@10        0.020888672929489253
```
```
python -um pygaggle.run.evaluate_kaggle_highlighter --method random --split kq --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
```
```
precision@1   0.0
recall@3      0.0199546485260771
recall@50     0.3247165532879819
recall@1000   1.0
mrr           0.03999734528458418
mrr@10        0.020888672929489253
```
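For context, a random baseline assigns each candidate a uniform score, which is consistent with the numbers above: recall@1000 is 1.0 (every relevant paragraph appears somewhere in the full list) while the early-rank metrics sit near zero. A minimal sketch of such a baseline (an illustration only, not pygaggle's actual random reranker):

```python
import random
from dataclasses import dataclass
from typing import List

@dataclass
class ScoredText:
    text: str
    score: float

def random_rerank(query: str, texts: List[str], seed: int = 42) -> List[ScoredText]:
    """Assign each candidate a uniform random score (no relevance signal),
    then sort descending, as a sanity-check baseline."""
    rng = random.Random(seed)
    scored = [ScoredText(t, rng.random()) for t in texts]
    return sorted(scored, key=lambda s: s.score, reverse=True)

ranked = random_rerank("incubation period", ["doc a", "doc b", "doc c"])
```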
Re-Ranking with BM25
I got the following error on all three machines:
```
  File "/pygaggle/pygaggle/model/evaluate.py", line 161, in evaluate
    scores = [x.score for x in self.reranker.rerank(example.query,
  File "/pygaggle/pygaggle/rerank/bm25.py", line 46, in rerank
    idfs = {w:
  File "/pygaggle/pygaggle/rerank/bm25.py", line 48, in <dictcomp>
    text.metadata['docid'], w) for w in tf}
KeyError: 'docid'
```
I replaced `text.metadata['docid']` with `text.title['docid']` in `/pygaggle/pygaggle/rerank/bm25.py` and got the same results for the two commands:
```
python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25 --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
```
```
precision@1   0.15384615384615385
recall@3      0.21865889212827985
recall@50     0.7208778749595076
recall@1000   0.7582928409459021
mrr           0.25329970378011524
mrr@10        0.23344131303314977
```
```
python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25 --split kq --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
```
```
precision@1   0.15384615384615385
recall@3      0.21865889212827985
recall@50     0.7208778749595076
recall@1000   0.7582928409459021
mrr           0.25441237140238665
mrr@10        0.23493413238311195
```
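A slightly more defensive version of that workaround might fall back to the title only when the `docid` key is actually missing, instead of raising `KeyError`. This is a sketch under assumptions: `Text` below is a minimal stand-in for pygaggle's class, and `get_docid` is a hypothetical helper, not code from the repository:

```python
class Text:
    """Minimal stand-in for pygaggle's Text (hypothetical, for illustration)."""
    def __init__(self, metadata=None, title=None):
        self.metadata = metadata or {}
        self.title = title

def get_docid(text):
    """Return a document id, preferring metadata['docid'] and falling
    back to the title when the key is absent, rather than raising."""
    docid = text.metadata.get('docid')
    if docid is None:
        docid = text.title
    if docid is None:
        raise KeyError('docid')
    return docid
```

Whether the title is actually a valid lookup key for the index is exactly what makes the identical results above worth double-checking.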
Re-Ranking with monoT5
I tried with Python 3.6.9, 3.7.3, and 3.8 with the corresponding requirements and got the following error in all cases (besides changing the torch version, I have not tried looking into this error):
```
  File "/pygaggle/pygaggle/run/evaluate_kaggle_highlighter.py", line 193, in <module>
    main()
  File "/pygaggle/pygaggle/run/evaluate_kaggle_highlighter.py", line 182, in main
    reranker = construct_map[options.method](options)
  File "/pygaggle/pygaggle/run/evaluate_kaggle_highlighter.py", line 81, in construct_t5
    model = loader.load().to(device).eval()
  File "/pygaggle/pygaggle/model/serialize.py", line 76, in load
    return self._fix_t5_model(T5ForConditionalGeneration.from_pretrained(
  File "/pygaggle/pygaggle/model/serialize.py", line 34, in _fix_t5_model
    model.decoder.block[0].layer[1].EncDecAttention.\
  File ".../.../torch/nn/modules/module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'T5Attention' object has no attribute 'relative_attention_bias'
```