

OpenAI
What is OpenAI?
OpenAI is a research organization dedicated to developing and promoting artificial intelligence. Founded in 2015, it has become one of the most prominent organizations in the AI community.
OpenAI API
The OpenAI API can be applied to virtually any task. It offers a range of models with different capabilities and price points, as well as the ability to fine-tune custom models.
Text generation models
OpenAI’s text generation models (often referred to as generative pre-trained transformers, or “GPT” models for short), such as GPT-4 and GPT-3.5, have been trained to understand natural and formal language. These models produce text outputs in response to their inputs, which are also referred to as “prompts”. Designing a prompt is essentially how you “program” a model like GPT-4, usually by providing instructions or some examples of how to successfully complete a task. Models like GPT-4 can be used across a wide variety of tasks, including content or code generation, summarization, conversation, creative writing, and more. Read more in the introductory text generation guide and in the prompt engineering guide.
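The idea of "programming" a model with instructions and examples can be sketched as plain string assembly. This is a minimal illustration, not an OpenAI API call: the function name `build_few_shot_prompt` and the `Input:`/`Output:` layout are assumptions chosen for the example, and the resulting string would still need to be sent to a model.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Combine an instruction, worked examples, and a new input into one prompt.

    The "Input:/Output:" layout is an illustrative convention, not a format
    required by any particular model.
    """
    lines = [instruction.strip(), ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # Leave the final output empty so the model completes it.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate each English word to French.",
    [("cat", "chat"), ("dog", "chien")],
    "bird",
)
print(prompt)
```

The instruction tells the model what the task is, and the examples show the expected form of the answer; either one alone is often enough for simple tasks.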
Assistants
Assistants are entities that perform tasks for users; in the OpenAI API they are powered by large language models like GPT-4. Assistants operate based on the instructions embedded within the model’s context window. They also usually have access to tools that allow them to perform more complex tasks, such as running code or retrieving information from a file. Read more about assistants in the Assistants API Overview.
Embeddings
An embedding is a vector representation of a piece of data (e.g. some text) that is meant to preserve aspects of its content and/or its meaning. Chunks of data that are similar in some way will tend to have embeddings that are closer together than unrelated data. OpenAI offers text embedding models that take as input a text string and produce as output an embedding vector. Embeddings are useful for search, clustering, recommendations, anomaly detection, classification, and more. Read more about embeddings in the embeddings guide.
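One common way to measure how "close together" two embeddings are is cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embedding vectors come from an embedding model and have far more dimensions, and cosine similarity is only one of several possible distance measures.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors invented for this example; they are not real model output.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

related = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
unrelated = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])
print(related > unrelated)
```

Search and recommendation use the same comparison: embed the query, then rank stored items by their similarity to it.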
Tokens
Text generation and embeddings models process text in chunks called tokens. Tokens represent commonly occurring sequences of characters. For example, the string “ tokenization” is decomposed as “ token” and “ization”, while a short and common word like “ the” is represented as a single token. Note that in a sentence, the first token of each word typically starts with a space character. Check out the tokenizer tool to test specific strings and see how they are translated into tokens. As a rough rule of thumb, 1 token is approximately 4 characters or 0.75 words for English text.
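The rule of thumb can be turned into a quick estimate. The helpers below are hypothetical names written for this example; they only apply the 4-characters and 0.75-words approximations and do not run a real tokenizer, so actual token counts will differ.

```python
import math


def estimate_tokens_from_chars(text):
    """Estimate tokens assuming roughly 4 characters per token."""
    return math.ceil(len(text) / 4)


def estimate_tokens_from_words(text):
    """Estimate tokens assuming 1 token is roughly 0.75 words (4/3 tokens per word)."""
    return math.ceil(len(text.split()) * 4 / 3)


sample = "Tokens represent commonly occurring sequences of characters."
print(estimate_tokens_from_chars(sample), estimate_tokens_from_words(sample))
```

The two estimates usually disagree somewhat; for anything that depends on an exact count, use the tokenizer tool instead.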
One limitation to keep in mind is that for a text generation model the prompt and the generated output combined must be no more than the model’s maximum context length. For embeddings models (which do not output tokens), the input must be shorter than the model’s maximum context length. The maximum context lengths for each text generation and embeddings model can be found in the model index.
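For text generation, the constraint amounts to simple arithmetic: prompt tokens plus output tokens must not exceed the context length. The sketch below assumes a context length of 8192 tokens purely for illustration; the real value for each model is listed in the model index, and the function names are hypothetical.

```python
def fits_context(context_length, prompt_tokens, output_tokens):
    """Return True if the prompt and generated output together fit the context."""
    return prompt_tokens + output_tokens <= context_length


def max_output_tokens(context_length, prompt_tokens):
    """Return how many tokens remain for the output after the prompt."""
    remaining = context_length - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills or exceeds the context length")
    return remaining


ASSUMED_CONTEXT_LENGTH = 8192  # illustrative value, not tied to a specific model

print(max_output_tokens(ASSUMED_CONTEXT_LENGTH, 8000))
print(fits_context(ASSUMED_CONTEXT_LENGTH, 8000, 500))
```

In practice this means a long prompt leaves less room for the response, so trimming the prompt is the usual remedy when outputs are cut short.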
OpenAI Models
MODEL | DESCRIPTION |
---|---|
GPT-4 Turbo and GPT-4 | A set of models that improve on GPT-3.5 and can understand as well as generate natural language or code |
GPT-3.5 Turbo | A set of models that improve on GPT-3 and can understand as well as generate natural language or code |
DALL·E | A model that can generate and edit images given a natural language prompt |
TTS | A set of models that can convert text into natural sounding spoken audio |
Whisper | A model that can convert audio into text |
Embeddings | A set of models that can convert text into a numerical form |
Moderation | A fine-tuned model that can detect whether text may be sensitive or unsafe |
GPT base | A set of models without instruction following that can understand as well as generate natural language or code |
Deprecated | A full list of models that have been deprecated along with the suggested replacement |