In the context of NLP (Natural Language Processing), a token is a sequence of characters that represents a single semantic unit in the text. Tokens are typically created by splitting the text into individual words or subwords, which are then used as the basic units of processing. For example, in the sentence "I like to play soccer," the individual words "I," "like," "to," "play," and "soccer" would each be considered a separate token.
def count_tokens(text):
    # Split the text on whitespace and count the resulting tokens
    tokens = text.split()
    return len(tokens)

text = "This is an example sentence."
print("Number of tokens:", count_tokens(text))
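Note that splitting on whitespace alone leaves punctuation attached to words, so "soccer," would count as a single token. A common refinement, sketched here with Python's standard `re` module, is to treat punctuation marks as tokens in their own right (the `tokenize` helper name is illustrative, not a standard API):

```python
import re

def tokenize(text):
    # Match runs of word characters, or any single character
    # that is neither a word character nor whitespace (punctuation)
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("I like to play soccer."))
# → ['I', 'like', 'to', 'play', 'soccer', '.']
```

Real NLP pipelines typically go further, using trained tokenizers that split rare words into subword units, but the whitespace and regex approaches above cover the basic idea.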