OpenAI: a look at the evolution of embedding models and the latest API news

OpenAI is ready to launch a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and an upcoming price drop for GPT-3.5 Turbo. Additionally, by default, data sent to the OpenAI API will not be used to train or improve OpenAI models.


New embedding models at lower prices

OpenAI introduces two new embedding models: a smaller, highly efficient text-embedding-3-small model, and a larger, more powerful text-embedding-3-large model. An embedding is a sequence of numbers that represents concepts within content such as natural language or code. Embeddings make it easier for machine learning models and other algorithms to understand relationships between content, allowing them to perform tasks like clustering or retrieval. They power applications such as knowledge retrieval in ChatGPT and the Assistants API, as well as many retrieval-augmented generation (RAG) development tools.
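To make this concrete, here is a minimal sketch of requesting an embedding with the OpenAI Python SDK (assuming the openai v1.x package and an OPENAI_API_KEY environment variable; the input sentence is illustrative):

```python
# Minimal embedding request with the OpenAI Python SDK (openai v1.x assumed).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The food was delicious and the service was excellent.",
)

vector = response.data[0].embedding  # a list of floats representing the text
print(len(vector))  # 1536 dimensions by default for text-embedding-3-small
```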

A new small text embedding model: text-embedding-3-small

text-embedding-3-small is the new highly efficient embedding model and a notable improvement over its predecessor, the text-embedding-ada-002 model released in December 2022. Comparing text-embedding-ada-002 with text-embedding-3-small, the average score on a benchmark commonly used for multilingual retrieval (MIRACL) increased from 31.4% to 44.0%, while the average score on a benchmark commonly used for English tasks (MTEB) increased from 61.0% to 62.3%.

text-embedding-3-small is also significantly more efficient than the previous-generation text-embedding-ada-002 model. Its price is therefore 5 times lower than text-embedding-ada-002's, dropping from $0.0001 to $0.00002 per 1,000 tokens. OpenAI is not deprecating text-embedding-ada-002, so even though the newest model is recommended, customers are free to continue using the previous generation.
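As a back-of-the-envelope illustration of that 5x reduction (the 10-million-token corpus size is an assumption for the example):

```python
# Cost of embedding 10M tokens at the announced per-1,000-token prices.
TOKENS = 10_000_000
PRICE_ADA_002 = 0.0001   # $/1K tokens, text-embedding-ada-002
PRICE_3_SMALL = 0.00002  # $/1K tokens, text-embedding-3-small

print(f"ada-002: ${TOKENS / 1000 * PRICE_ADA_002:.2f}")  # $1.00
print(f"3-small: ${TOKENS / 1000 * PRICE_3_SMALL:.2f}")  # $0.20
```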

A new large text embedding model: text-embedding-3-large

text-embedding-3-large is the larger of the new next-generation embedding models and creates embeddings with up to 3,072 dimensions. Comparing text-embedding-ada-002 with text-embedding-3-large: on MIRACL, the average score increased from 31.4% to 54.9%, while on MTEB, the average score increased from 61.0% to 64.6%. text-embedding-3-large is priced at $0.00013 per 1,000 tokens.

Native support for shortening embeddings

Using larger embeddings, for example by storing them in a vector store for retrieval, generally costs more and consumes more compute, memory, and storage than using smaller embeddings. Both new embedding models were trained with a technique that allows developers to trade off performance against the cost of using embeddings. In particular, developers can shorten embeddings (i.e. remove numbers from the end of the sequence) without the embedding losing its concept-representing properties, by passing the dimensions API parameter. For example, on MTEB, a text-embedding-3-large embedding can be shortened to a size of 256 while still outperforming a full text-embedding-ada-002 embedding of size 1536. This allows for very flexible usage.

For example, when using a vector data store that only supports embeddings up to 1,024 dimensions, developers can still use the best embedding model, text-embedding-3-large, and specify a value of 1024 for the dimensions API parameter, which shortens the embedding from its native 3,072 dimensions, trading some accuracy for a smaller vector size.
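A minimal sketch of that scenario (openai v1.x assumed; the 1,024-dimension limit mirrors the hypothetical vector store above):

```python
# Shortening text-embedding-3-large output via the `dimensions` parameter.
from openai import OpenAI

client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Shortened embeddings trade a little accuracy for smaller vectors.",
    dimensions=1024,  # down from the native 3,072 dimensions
)

print(len(response.data[0].embedding))  # 1024
```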

More new models and lower prices

OpenAI will introduce a new GPT-3.5 Turbo model, gpt-3.5-turbo-0125, in February, and for the third time in the last year is reducing the prices of GPT-3.5 Turbo to help customers scale. Input prices for the new model are reduced by 50% to $0.0005/1,000 tokens and output prices by 25% to $0.0015/1,000 tokens. The model also brings various improvements, including higher accuracy at responding in requested formats and a fix for a bug that caused a text-encoding issue for function calls in non-English languages. Customers using the gpt-3.5-turbo model alias will be automatically upgraded from gpt-3.5-turbo-0613 to gpt-3.5-turbo-0125 two weeks after this model launches.
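Developers who prefer to control when they move to the new model can pin the snapshot explicitly instead of tracking the alias; a minimal sketch (openai v1.x assumed, prompt illustrative):

```python
# Pinning the new GPT-3.5 Turbo snapshot instead of the auto-upgrading alias.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",  # pinned; use "gpt-3.5-turbo" to follow upgrades
    messages=[{"role": "user", "content": "Summarize this release in one sentence."}],
)
print(response.choices[0].message.content)
```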

Updated GPT-4 Turbo preview

More than 70% of requests from GPT-4 API customers have moved to GPT-4 Turbo since its release, as developers take advantage of its updated knowledge cutoff, larger 128K-token context window, and lower prices. OpenAI has released an updated preview model of GPT-4 Turbo, gpt-4-0125-preview. This model completes tasks such as code generation more thoroughly than the previous preview model and is intended to reduce cases of "laziness" where the model does not finish a task.

The new model also includes the fix for the bug that affected non-English UTF-8 generations. For those who want to be automatically upgraded to new GPT-4 Turbo preview builds, OpenAI is also introducing a new model alias, gpt-4-turbo-preview, which will always point to the latest GPT-4 Turbo preview model. In the coming months, OpenAI plans to bring GPT-4 Turbo with vision to general availability.
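Opting into those automatic upgrades is just a matter of targeting the alias; a minimal sketch (openai v1.x assumed, prompt illustrative):

```python
# Using the `gpt-4-turbo-preview` alias to always get the latest preview model.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",  # currently resolves to gpt-4-0125-preview
    messages=[{"role": "user", "content": "Write a Python one-liner that reverses a string."}],
)
print(response.choices[0].message.content)
```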

Updated moderation model

The free moderation API allows developers to identify potentially harmful text. As part of ongoing safety work, OpenAI is releasing text-moderation-007, its most robust moderation model to date. The text-moderation-latest and text-moderation-stable aliases have been updated to point to it.
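A minimal sketch of calling the moderation endpoint (openai v1.x assumed; the endpoint's default model follows the text-moderation-latest alias, so it now uses text-moderation-007):

```python
# Screening user-generated text with the free moderation endpoint.
from openai import OpenAI

client = OpenAI()

result = client.moderations.create(input="Some user-generated text to screen.")
moderation = result.results[0]
print(moderation.flagged)          # True if any harm category is triggered
print(moderation.category_scores)  # per-category confidence scores
```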

New ways to understand API usage and manage API keys

OpenAI is rolling out two platform improvements to give developers both greater visibility into their usage and greater control over API keys. First, developers can now assign permissions to API keys from the API Keys page. For example, a key can be granted read-only access to power an internal monitoring dashboard, or restricted to specific endpoints.

Second, the Usage Dashboard and Usage Export functionality now expose metrics at the API-key level once tracking is enabled. This makes it easy to view usage per feature, team, product, or project, simply by using separate API keys for each.
