Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model for text generation developed by OpenAI. GPT-3 shows considerable potential for generating text and can handle tasks such as question answering, summarization, semantic search, chatbots, and writing poetry or essays. We have already run GPT-3 experiments on Q&A, ad generation, sentence interpretation, intent classification, and so on. Now, let's run some experiments on semantic search using the GPT-3 API endpoint provided by OpenAI.
OpenAI's Search API lets you perform semantic search over a set of documents. Given a query, it scores each document by how semantically related it is to the query text and ranks the documents by those scores.
Because access is API-based, it is easy to use. We only need to provide the text as documents and then submit a query. The API returns the results that match the query, sorted by relevance score.
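To make the ranking idea concrete, here is a minimal offline sketch that orders a list of scored results from most to least relevant. It does not call the API. The field names ("document", "score") and the score values are illustrative assumptions, not real API output.

```python
# Hypothetical search results: field names and scores are assumptions
# made for illustration, not output captured from the API.
results = [
    {"document": 0, "score": 12.5},
    {"document": 1, "score": 310.2},
    {"document": 2, "score": 87.0},
]


def rank_by_score(items):
    """Return the results ordered from highest to lowest score."""
    return sorted(items, key=lambda item: item["score"], reverse=True)


ranked = rank_by_score(results)
for item in ranked:
    print(item["document"], item["score"])
```

A higher score is treated here as a closer match, so the first entry in ranked is the most relevant document.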
The following are the steps of semantic search using OpenAI API. Here we use Python to call the API, but you can also issue cURL requests.
# Install the necessary modules
pip install openai
To perform a semantic search, you first need to upload the documents as a JSONL file, with one JSON object per line. The following is an example of the JSONL format.
{"text": "Hello OpenAI", "metadata": "sample data"}
Next, we will create a JSONL file for semantic search, name it sample_search.jsonl, and copy the following content into it:
{"text": "The rebuilding of economies after the COVID-19 crisis offers a unique opportunity to transform the global food system and make it resilient to future shocks, ensuring environmentally sustainable and healthy nutrition for all. To make this happen, United Nations agencies like the Food and Agriculture Organization, the United Nations Environment Program, the Intergovernmental Panel on Climate Change, the International Fund for Agricultural Development, and the World Food Program, collectively, suggest four broad shifts in the food system.", "metadata": "Economic reset"}
{"text": "In the past few weeks healthcare professionals have been fully focussed caring for enormous numbers of people infected with COVID-19. They did an amazing job. Not in the least because healthcare professionals and leaders have been using continues improvement as part of their accreditation program for many years. It has become part of their DNA. This has enabled them to change many processes as needed during COVID-19, using a cross-functional problem solving approach in (very) rapid improvement cycles.", "metadata": "Supporting adaptive healthcare"}
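If you would rather generate the JSONL file than write it by hand, the sketch below shows one way to do it with Python's standard json module, then reads the file back to confirm that every line parses as a separate JSON object. The file name demo_search.jsonl and the shortened placeholder texts are assumptions made for illustration; substitute your own documents.

```python
import json

# Placeholder records (hypothetical, shortened stand-ins for real documents).
documents = [
    {"text": "Placeholder text about rebuilding economies.", "metadata": "Economic reset"},
    {"text": "Placeholder text about healthcare professionals.", "metadata": "Supporting adaptive healthcare"},
]


def write_jsonl(path, records):
    """Write one JSON object per line, as the JSONL format requires."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Parse every non-empty line of a JSONL file into a dict."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


write_jsonl("demo_search.jsonl", documents)  # hypothetical file name
loaded = read_jsonl("demo_search.jsonl")
print(len(loaded), "records written and read back")
```

Because json.dumps escapes quotes and newlines inside the text, each record stays on a single line even when the document text itself contains them.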
Now it's time to upload the JSONL file using your API key, setting the purpose to "search" so the file can be used for semantic search. Create a file called upload_file.py, then copy the following code into it and provide your OpenAI API key.
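import openai

# Replace with your own OpenAI API key
openai.api_key = "YOUR-API-KEY"

# Upload the JSONL file created above; purpose="search" marks it for semantic search
response = openai.File.create(file=open("sample_search.jsonl"), purpose="search")
print(response)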
When you run upload_file.py, you will get the following response:
Copy the id from the response in the above step.
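Instead of copying the id by hand, you could read it in code. The sketch below assumes the upload response can be indexed like a dictionary with an "id" key, and that file ids start with "file-" as in the example used in this article. The sample_response dictionary is a hypothetical stand-in, not a captured API response.

```python
# Hypothetical stand-in for the upload response (assumed shape, not real output).
sample_response = {
    "id": "file-8ejPA5eM13J4J0dWy3bBbvTf",
    "purpose": "search",
}


def extract_file_id(response):
    """Return the file id from an upload response, checking its prefix."""
    file_id = response["id"]
    if not file_id.startswith("file-"):
        raise ValueError(f"Unexpected file id format: {file_id!r}")
    return file_id


file_id = extract_file_id(sample_response)
print(file_id)
```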
Now let's test it. To test GPT-3's semantic search, pass your query text in the query parameter and the id you copied in the file parameter.
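import openai

# Replace with your own OpenAI API key
openai.api_key = "YOUR-API-KEY"

# Search the uploaded file; replace the file id with the one from your upload response
search_response = openai.Engine("davinci").search(
    search_model="davinci",
    query="healthcare",
    max_rerank=5,
    file="file-8ejPA5eM13J4J0dWy3bBbvTf",
    return_metadata=True
)
print(search_response)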
The response is shown in the figure below:
Performing a semantic search for a given query with GPT-3 is very simple. In the JSON response, we get the document text that matches the query, along with a score that shows how relevant the result is. In our test, we provided only one document. If we provide multiple documents, we will get multiple results with different scores. As we can see,