LLM Playground

Advanced Options

Max Tokens: The maximum number of tokens to generate in the chat completion.

Stream: If set, partial message deltas are sent as they are generated, as in ChatGPT.

Temperature: What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
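
To make the effect of temperature concrete, here is a minimal sketch of how it is commonly applied: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function name `apply_temperature` is hypothetical, and treating a temperature of 0 as "always pick the top token" is an assumption about a common convention, not a description of how this playground is implemented.

```python
import math

def apply_temperature(logits, temperature):
    """Turn raw logits into probabilities, scaled by a sampling temperature."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Assumption: temperature 0 is treated as greedy decoding,
        # so all probability goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]

logits = [2.0, 1.0, 0.0]
focused = apply_temperature(logits, 0.2)  # most mass on the top token
spread = apply_temperature(logits, 0.8)   # mass spread more evenly
```

With the same logits, the lower temperature concentrates probability on the highest-scoring token, while the higher temperature flattens the distribution.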

Top P: An alternative to temperature sampling, called nucleus sampling, in which the model considers only the tokens that make up the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% of probability mass are considered.
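
A minimal sketch of nucleus (top-p) filtering, assuming the token probabilities are already available as a plain list: tokens are taken in order of decreasing probability until their cumulative mass reaches `top_p`, and the rest are discarded. The function name `top_p_filter` is hypothetical and only illustrates the idea.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most-probable tokens whose mass reaches top_p."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize the surviving tokens so they sum to 1.
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered

probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(probs, 0.1))  # [1.0, 0.0, 0.0, 0.0]
```

With `top_p` at 0.1, the single most likely token already covers the required mass, so it is the only candidate left.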

Top K: The number of highest-probability vocabulary tokens to keep for top-k filtering.
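
As a rough illustration of top-k filtering, the sketch below keeps only the `k` most probable tokens and renormalizes them. It assumes the probabilities are given as a plain list; the function name `top_k_filter` is hypothetical and not taken from this playground.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize; zero out the rest."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Indices of the k highest-probability tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = order[:k]
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered

probs = [0.1, 0.6, 0.3]
filtered = top_k_filter(probs, 2)  # the 0.1 token is dropped
```

Unlike top-p, the number of surviving tokens here is fixed at `k` regardless of how the probability mass is distributed.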