Different Vocab Size Between Tokenizer and Model's Word Embedding Layer · Issue #33 · IndoNLP/indonlu
Expected Behavior
The tokenizer's vocabulary size and the vocabulary dimension of the BERT model's word embedding layer should be the same.
Actual Behavior
The tokenizer's vocabulary size and the vocabulary dimension of the BERT model's word embedding layer do not match.
Steps to Reproduce the Problem
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('indobenchmark/indobert-base-p1')
print(model)  # the printed architecture shows the word embedding layer's size

tokenizer = AutoTokenizer.from_pretrained('indobenchmark/indobert-base-p1')
print(len(tokenizer))  # the tokenizer's vocabulary size
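For reference, a minimal sketch of how the two sizes can be compared programmatically, assuming the standard transformers API (get_input_embeddings and resize_token_embeddings); the resize call is only one possible workaround, not necessarily the intended fix:

from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('indobenchmark/indobert-base-p1')
tokenizer = AutoTokenizer.from_pretrained('indobenchmark/indobert-base-p1')

# number of rows in the word embedding matrix (its vocabulary dimension)
embedding_rows = model.get_input_embeddings().weight.shape[0]
print('embedding rows :', embedding_rows)
print('tokenizer vocab:', len(tokenizer))

if embedding_rows != len(tokenizer):
    # possible workaround: resize the embedding matrix to match the tokenizer
    model.resize_token_embeddings(len(tokenizer))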