CovidQA Replication Issues · Issue #135 · castorini/pygaggle · GitHub

CovidQA Replication Issues #135

Closed
Dahlia-Chehata opened this issue Jan 1, 2021 · 4 comments

Comments

@Dahlia-Chehata
Contributor

I tried replicating the CovidQA experiments on Colab (Tesla K80), Compute Canada (Tesla V100), and locally (GTX 1650) on #134, using the index from https://www.dropbox.com/s/z8s0urul6l4zig2/lucene-index-cord19-paragraph-2020-05-12.tar.gz?dl=1

Re-Ranking with Random (I got the same results)

python -um pygaggle.run.evaluate_kaggle_highlighter --method random --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12

precision@1	0.0
recall@3	0.0199546485260771
recall@50	0.3247165532879819
recall@1000	1.0
mrr	0.03999734528458418
mrr@10	0.020888672929489253

python -um pygaggle.run.evaluate_kaggle_highlighter --method random --split kq --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12

precision@1	0.0
recall@3	0.0199546485260771
recall@50	0.3247165532879819
recall@1000	1.0
mrr	0.03999734528458418
mrr@10	0.020888672929489253

Re-Ranking with BM25

I got the following error on the three machines:

File "/pygaggle/pygaggle/model/evaluate.py", line 161, in evaluate
    scores = [x.score for x in self.reranker.rerank(example.query,
File "/pygaggle/pygaggle/rerank/bm25.py", line 46, in rerank
    idfs = {w:
File "/pygaggle/pygaggle/rerank/bm25.py", line 48, in <dictcomp>
    text.metadata['docid'], w) for w in tf}
KeyError: 'docid'
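
The KeyError above says the mapping being indexed has no 'docid' entry. As a minimal sketch only, and not the pygaggle implementation, a lookup could check more than one place before failing. FakeText and resolve_docid are hypothetical names made up for this illustration, and the assumption that the identifier may live on a mapping-valued title attribute is just that, an assumption.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeText:
    """Hypothetical stand-in for the object handed to the reranker."""
    text: str
    metadata: Optional[dict] = None
    title: Any = None


def resolve_docid(item: FakeText) -> str:
    """Return the first 'docid' found on metadata, then on title."""
    for source in (item.metadata, item.title):
        if isinstance(source, Mapping) and "docid" in source:
            return source["docid"]
    raise KeyError("no 'docid' found on metadata or title")


# Assumed layout: docid stored on title rather than on metadata.
doc = FakeText(text="some paragraph", metadata={}, title={"docid": "abc123"})
print(resolve_docid(doc))
```

A helper like this still raises KeyError when neither place holds the identifier, so a genuinely missing docid is not silently hidden.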

I replaced text.metadata['docid'] with text.title['docid'] in /pygaggle/pygaggle/rerank/bm25.py and got the same results for the 2 commands:

python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25 --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12

precision@1	0.15384615384615385
recall@3	0.21865889212827985
recall@50	0.7208778749595076
recall@1000	0.7582928409459021
mrr	0.25329970378011524
mrr@10	0.23344131303314977

python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25 --split kq --dataset data/kaggle-lit-review-0.2.json --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12

precision@1	0.15384615384615385
recall@3	0.21865889212827985
recall@50	0.7208778749595076
recall@1000	0.7582928409459021
mrr	0.25441237140238665
mrr@10	0.23493413238311195
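
The two BM25 outputs above agree on precision@1 and recall but not on the MRR figures. A small sketch for comparing such reports is given here; parse_metrics and differing_metrics are hypothetical helpers written for this illustration and are not part of pygaggle. The numbers are copied from the two runs above.

```python
def parse_metrics(report: str) -> dict:
    """Parse 'name<whitespace>value' lines into a dict of floats."""
    metrics = {}
    for line in report.strip().splitlines():
        name, value = line.split()
        metrics[name] = float(value)
    return metrics


def differing_metrics(a: dict, b: dict, tol: float = 1e-12) -> list:
    """List metric names that are missing from b or differ by more than tol."""
    return [k for k in a if k not in b or abs(a[k] - b[k]) > tol]


run_default = parse_metrics("""
precision@1	0.15384615384615385
recall@3	0.21865889212827985
recall@50	0.7208778749595076
recall@1000	0.7582928409459021
mrr	0.25329970378011524
mrr@10	0.23344131303314977
""")

run_kq = parse_metrics("""
precision@1	0.15384615384615385
recall@3	0.21865889212827985
recall@50	0.7208778749595076
recall@1000	0.7582928409459021
mrr	0.25441237140238665
mrr@10	0.23493413238311195
""")

print(differing_metrics(run_default, run_kq))
```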

Re-Ranking with monoT5

I tried Python 3.6.9, 3.7.3, and 3.8 with the corresponding requirements and got the following error in every case (apart from changing the torch version, I have not investigated it further):

  File "/pygaggle/pygaggle/run/evaluate_kaggle_highlighter.py", line 193, in <module>
    main()
  File "/pygaggle/pygaggle/run/evaluate_kaggle_highlighter.py", line 182, in main
    reranker = construct_map[options.method](options)
  File "/pygaggle/pygaggle/run/evaluate_kaggle_highlighter.py", line 81, in construct_t5
    model = loader.load().to(device).eval()
  File "/pygaggle/pygaggle/model/serialize.py", line 76, in load
    return self._fix_t5_model(T5ForConditionalGeneration.from_pretrained(
  File "/pygaggle/pygaggle/model/serialize.py", line 34, in _fix_t5_model
    model.decoder.block[0].layer[1].EncDecAttention.\
  File ".../.../torch/nn/modules/module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'T5Attention' object has no attribute 'relative_attention_bias'
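
The traceback shows a chained attribute access failing at its last step. As a hedged sketch, a small helper can walk such a chain and report which step is missing before any patching code touches it. first_missing_attr is a hypothetical name, and the object graph below is built from plain stand-ins, not a real T5 model. Applying the same walk to a loaded model is an assumption and is not tested here.

```python
from types import SimpleNamespace


def first_missing_attr(root, path):
    """Walk attribute names and integer indices; return the first step that fails, else None."""
    current = root
    for step in path:
        try:
            current = current[step] if isinstance(step, int) else getattr(current, step)
        except (AttributeError, IndexError, KeyError, TypeError):
            return step
    return None


# Stand-in graph mirroring the access in the traceback; the cross-attention
# object deliberately lacks relative_attention_bias.
cross_attn = SimpleNamespace()
layers = [SimpleNamespace(), SimpleNamespace(EncDecAttention=cross_attn)]
model = SimpleNamespace(decoder=SimpleNamespace(block=[SimpleNamespace(layer=layers)]))

path = ["decoder", "block", 0, "layer", 1, "EncDecAttention", "relative_attention_bias"]
print(first_missing_attr(model, path))
```
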
@ronakice
Member
ronakice commented Jan 4, 2021

Hey @Dahlia-Chehata , can you send me an editable Colab you were trying to replicate this on, so I can try to fix it?

@Dahlia-Chehata
Contributor Author

@ronakice here is the link

@ronakice
Member
ronakice commented Jan 5, 2021

Should be fixed in a new PR, check it out. Thanks for pointing this out :)

@ronakice
Member
ronakice commented Jan 5, 2021

Okay, this is merged, close if everything is fine!
