torchtext.data.get_tokenizer
Table of Contents
- 1. Overview
- 2. PyTorch code
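1. Overview
Suppose we have the following sentence: You can now install TorchText using pip!
Tokenizing it with the "basic_english" tokenizer lowercases the text and splits off the punctuation, giving: ['you', 'can', 'now', 'install', 'torchtext', 'using', 'pip', '!']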
2. PyTorch code
from torchtext.data import get_tokenizer

# Build the built-in "basic_english" tokenizer
tokenizer = get_tokenizer("basic_english")
# Calling the tokenizer on a string returns a list of tokens
tokens = tokenizer("You can now install TorchText using pip!")
print(tokens)
- Output:
['you', 'can', 'now', 'install', 'torchtext', 'using', 'pip', '!']
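
To see why the result looks this way, here is a rough standard-library sketch of the same idea: lowercase the text, pad common punctuation with spaces, then split on whitespace. The helper name `simple_english_tokenize` is hypothetical and not part of torchtext, and the real "basic_english" normalizer covers more cases (quotes, colons, and so on), so treat this as an approximation rather than the library's implementation.

```python
import re

# Hypothetical helper, NOT part of torchtext: a simplified stand-in
# for what the "basic_english" tokenizer appears to do.
_PUNCT = re.compile(r"([.,!?()])")

def simple_english_tokenize(line):
    # 1. Lowercase everything
    line = line.lower()
    # 2. Surround each punctuation mark with spaces so it becomes its own token
    line = _PUNCT.sub(r" \1 ", line)
    # 3. Split on whitespace (runs of spaces collapse automatically)
    return line.split()

tokens = simple_english_tokenize("You can now install TorchText using pip!")
print(tokens)
```

For the example sentence this sketch gives the same list as above, but it may differ from torchtext on other inputs.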