MSMARCO support: monoBERT #14
Conversation
@ronakice, thanks a lot for adding this! Just curious, what is the inference speed (query-doc pairs per second) you got on GPU vs TPU?
@rodrigonogueira4 right now pygaggle only supports a single GPU, I believe. Yes, we should add TPU and multi-GPU support soon and benchmark it! I'm prioritizing monoBERT/duoBERT/T5 for both MARCO and TREC first, which the URAs could easily run (since they only have single-GPU access).
pygaggle/settings.py (Outdated)

```python
class MsMarcoSettings(Settings):
    msmarco_index_path: str = '/content/data/index-msmarco-passage-20191117-0ed488'
```
Should we convert all of these into arguments to our main functions? It is hard to keep track of exactly which command we used to run an experiment when they are hard-coded.
Hmm, I just followed settings similar to those included for CovidQA. Maybe things like the index_path, which rarely ever change, can remain here, and we can move the model directory to main?
SGTM!
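The change discussed above could be sketched roughly as follows; the flag names and the `--model-dir` parameter are illustrative assumptions, not pygaggle's actual CLI:

```python
# Hypothetical sketch (not pygaggle's real interface): move hard-coded
# settings into CLI arguments so the exact command used for an
# experiment is self-documenting.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description='monoBERT MS MARCO reranking')
    # Rarely-changing paths keep a default but stay overridable.
    parser.add_argument(
        '--index-path',
        default='/content/data/index-msmarco-passage-20191117-0ed488')
    # Frequently-changing settings, like the model directory, are required.
    parser.add_argument('--model-dir', required=True)
    return parser

args = build_parser().parse_args(['--model-dir', '/path/to/monobert'])
print(args.index_path, args.model_dir)
```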
setup.py (Outdated)

```diff
     'tqdm==4.45.0',
-    'transformers==2.7.0'
+    'transformers>=2.7.0'
```
can we force 2.8.0 already?
My part of the code can; I don't know if it will cause issues in the rest (I don't think so).
Yeah, let's try to move to 2.8.0, as it contains some important new functions for T5.
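Given the agreement above, the pin would look like this in setup.py's `install_requires` (only the two packages visible in the diff are shown; the real setup.py lists more):

```python
# Sketch of the dependency pins discussed above; illustrative, not the
# full dependency list from pygaggle's setup.py.
install_requires = [
    'tqdm==4.45.0',
    'transformers==2.8.0',  # pinned exactly, per the discussion
]
print(install_requires)
```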
Thanks again for implementing this! Just a couple of minor comments.
Looks all good to me, and thanks again for implementing this!
Feel free to merge when you finish converting settings.py into arguments.
Tested on a subset of 50 questions (1000 passages each) on Colab (P100): https://colab.research.google.com/drive/1C-0U40wCUbDUObWReBYkeIBLgqJeKCuO
It takes about 16-17 minutes, so we can potentially scale it up to ~1000 examples when having the URAs replicate it. I will also replicate the results across multiple GPUs/TPUs as and when I add support for them!
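The scores reported for these replication runs are MRR@10, the standard MS MARCO passage metric. As a reference, a minimal sketch of the metric (helper name is mine, not pygaggle's):

```python
def mrr_at_10(ranked_relevance):
    """ranked_relevance: for each query, 0/1 relevance labels in ranked
    order. MRR@10 = mean over queries of 1/rank of the first relevant
    result within the top 10 (0 if none appears there)."""
    total = 0.0
    for labels in ranked_relevance:
        for rank, rel in enumerate(labels[:10], start=1):
            if rel:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)

# Two queries: first relevant hit at rank 2 and rank 1 -> (0.5 + 1.0) / 2
print(mrr_at_10([[0, 1, 0], [1, 0]]))  # 0.75
```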
~~mrr@10 0.3647698412698413 vs @rodrigonogueira4's run.monobert.dev.small.tsv scoring mrr@10: 0.3576904761904763 on these 50 questions.~~ (there was something wrong with the sample, reinvestigating now) This is potentially due to the tokenization difference (here we just use encode_plus, which follows the 'longest_first' truncation strategy, vs. fixed-length tokenization of 64 tokens for the query and 448 for the passage). It could also be due to a bunch of other things, as we saw similar differences in numbert.
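The two truncation schemes being compared can be sketched on plain token lists (a conceptual illustration, not transformers' actual implementation):

```python
# Illustrative sketch of the two truncation schemes discussed above,
# applied to already-tokenized query/passage token-id lists.

def truncate_longest_first(query, passage, max_len):
    """encode_plus-style 'longest_first': repeatedly trim one token from
    whichever sequence is currently longer until the pair fits."""
    query, passage = list(query), list(passage)
    while len(query) + len(passage) > max_len:
        if len(query) > len(passage):
            query.pop()
        else:
            passage.pop()
    return query, passage

def truncate_fixed(query, passage, query_budget=64, passage_budget=448):
    """Fixed-length scheme: hard caps of 64 query tokens and 448 passage
    tokens, the budgets mentioned above."""
    return query[:query_budget], passage[:passage_budget]

q = list(range(10))   # short query: 10 tokens
p = list(range(600))  # long passage: 600 tokens

lf_q, lf_p = truncate_longest_first(q, p, max_len=512)
fx_q, fx_p = truncate_fixed(q, p)
print(len(lf_q), len(lf_p))  # 10 502 -> longest_first gives spare room to the passage
print(len(fx_q), len(fx_p))  # 10 448 -> fixed budget drops the extra passage tokens
```

For a short query and long passage the two schemes keep different amounts of passage text, which could plausibly shift scores.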
Some other important general changes I made: