Quick Start with Tokenizers on Hugging Face

Step 1

1. Go to the Hugging Face website and type "chinese" into the search bar (tailor the query to your needs; searching "chinese" makes sense when your dataset is Chinese).
2. Open the first result, bert-base-chinese.
3. Copy the following snippet into VS Code:

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")

The add_special_tokens method adds the given special tokens to the Tokenizer. If these tokens are already part of the vocabulary, it just lets the Tokenizer know about them. If they don't exist, the Tokenizer creates them, giving each one a new id. These special tokens will never be processed by the model (i.e., they won't be split into multiple tokens), and they can be removed from the output when decoding.
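Once the tokenizer is loaded, it can be applied directly to Chinese text. A minimal sketch (the sample sentence is my own; the first run downloads the vocabulary, so network access is assumed):

```python
from transformers import AutoTokenizer

# Download and cache the pretrained tokenizer (requires network on first run).
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")

# bert-base-chinese tokenizes Chinese text character by character,
# and wraps the sequence in BERT's [CLS] ... [SEP] special tokens.
encoded = tokenizer("今天天气很好")  # sample sentence for illustration
print(encoded["input_ids"])
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
```

The `input_ids` are the integer indices the model consumes; `convert_ids_to_tokens` maps them back to readable tokens so you can inspect how the text was split.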
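The behavior described above can be seen by registering a custom special token and checking that it survives tokenization as a single unit. A minimal sketch (the "[ENT]" / "[/ENT]" marker names are hypothetical, chosen purely for illustration):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")

# Register two custom special tokens. Tokens already in the vocabulary
# are skipped; new ones get fresh ids appended after the existing vocab.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["[ENT]", "[/ENT]"]}
)
print(num_added)  # how many tokens were actually new

# "[ENT]" is kept as one token instead of being split character by character.
ids = tokenizer.encode("[ENT]北京[/ENT]")
print(tokenizer.convert_ids_to_tokens(ids))
```

Note that after adding tokens this way, a model paired with this tokenizer needs its embedding matrix resized (e.g. `model.resize_token_embeddings(len(tokenizer))`) so the new ids have embeddings.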