LLM Playground
{{ model.id }}
Send
{{ message.role }}:
{{ message.content }}
Advanced Options
Max Tokens:
?
The maximum number of tokens to generate in the chat completion.
Stream:
?
If enabled, the response is sent incrementally as partial message deltas while it is being generated, like in ChatGPT.
Temperature:
?
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
Top P:
?
An alternative to temperature sampling, called nucleus sampling, where the model considers only the tokens that make up the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% of the probability mass are considered.
Top K:
?
The number of highest-probability vocabulary tokens to keep for top-k filtering.
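
To make the Temperature, Top P, and Top K options more concrete, here is a minimal sketch of how these three settings are commonly applied to a model's next-token scores. It is illustrative only: the function names (`softmax`, `top_k_filter`, `top_p_filter`) and the example logits are hypothetical, and the backend behind this playground may implement sampling differently.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def _renormalize(probs, keep):
    """Zero out every index not in `keep` and rescale the rest to sum to 1."""
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens (hypothetical helper)."""
    if k <= 0 or k >= len(probs):
        return list(probs)  # nothing to filter
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, set(order[:k]))


def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return _renormalize(probs, keep)


# Made-up logits for a four-token vocabulary.
example_logits = [2.0, 1.0, 0.5, -1.0]

focused = softmax(example_logits, temperature=0.2)  # sharper distribution
spread = softmax(example_logits, temperature=0.8)   # flatter distribution
only_two = top_k_filter(softmax(example_logits), k=2)
nucleus = top_p_filter(softmax(example_logits), top_p=0.1)
```

In this sketch, a lower temperature concentrates probability on the highest-scoring token, `top_k_filter` discards everything outside the k best candidates, and `top_p_filter` with 0.1 keeps only the single most likely token here because that token alone already covers more than 10% of the probability mass. The next token would then be drawn at random from the filtered distribution.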