OpenAI API, ChatGPT, and privacy - How safe is your customer and business data?
24/10/2023The large language models developed by OpenAI (such as GPT-4) are based on three sources of information[1]:
Information publicly available on the internet.
Information acquired from third parties.
Information entered by users and OpenAI employees.
It's not surprising that data on the internet is largely freely available for use. However, it is important for companies to understand what happens to the information that users of OpenAI's models enter themselves or the information to which OpenAI has access through its API. The answer to this is not as straightforward as some content platforms may suggest.
It is crucial to distinguish between the way data is interacted with and the type of account you or your organization obtains from OpenAI.
Information Privacy for Employees When Using ChatGPT with a Free or Plus Account
Chances are high that one or more colleagues are now using ChatGPT on a daily basis in their work. In other words, the free or Plus version available through chat.openai.com. Perhaps you even encourage your team to use it. It makes sense because it's a significant productivity booster. But how safe is the data you put into it?
The answer is quite clear: not safe. All the information you provide to ChatGPT as a free or Plus user can be used to improve the underlying models. How long? OpenAI does not provide a definitive answer to that. It will be as long as they deem necessary or are legally obligated to do so[2].
(!) Tip: Exclude your data from being used for training purposes. What many companies may not be aware of is that OpenAI provides its users with the option to exclude their accounts and data from being used for training purposes. Currently, there are two ways to do this. The first is to disable Chat History under Data Controls in Settings. However, this means that your closed chats cannot be retrieved or viewed afterward, which can be inconvenient.
If you don't want your data to be used for training purposes but still want to be able to access and use your chat history, fill out this form on the OpenAI website. Both options work for both free and Plus accounts! Unfortunately, this process cannot be applied retroactively, so it only applies to all your future chats[3].
Company data used by the OpenAI API
In addition to the standard chat function, OpenAI offers companies the opportunity to use their API. With this, you can unleash GPT-3.5 Turbo and GPT-4 on systems and data of your choice in an automated manner. If you want to learn more about APIs and what you can do with them, read this article on Wikipedia.
But what happens to the data that is accessible to OpenAI through their API? The answer to this is twofold[4]:
Data that comes in through the API platform (after March 1, 2023) is not used to train OpenAI's models.
However, OpenAI reserves the right to securely retain all API inputs and outputs to identify and prevent potential misuse. For specific, highly sensitive use cases, OpenAI can implement a "zero data retention" policy. To do this, you should contact the Sales team.
Data privacy for other OpenAI API endpoints such as fine-tuning, moderation, and embedding
If you have no idea what these terms mean, you can read about them on OpenAI's website, for example, what model fine-tuning, moderation, or embedding is! In addition to the standard GPT API for interacting with the well-known language models GPT-3.5 Turbo and GPT-4, there are a whole bunch of AI models accessible through the API. At the time of writing, the following data policy is active for each API endpoint at OpenAI.
As you can see, the data policy is not the same for all endpoints. For some endpoints, it is also impossible to qualify for Zero Data Retention. So, pay close attention to which endpoint you are using and how it aligns with your own or a client's data policy. The positive aspect is that data flowing to any endpoint is not used to train the models. So, your API input should never appear as output on someone else's screen 😉
Information security and data privacy for companies with an OpenAI Enterprise account
The holy grail within OpenAI for businesses: the Enterprise account. The costs are variable and depend on the type of business, the number of employees, and the complexity of the organization. It offers significant advantages over the Plus and free versions. In addition to unlimited access to GPT-4 32k via the API, Enterprise provides more control over the use of ChatGPT by offering shareable chat templates and an administrative environment. But, back to data privacy!
Companies with an Enterprise account clearly have the most control over the use of their corporate data. OpenAI promises not to use data via ChatGPT for the training of their models (a difference from the free and Plus accounts). Furthermore, data from chats that are deleted is removed from OpenAI's systems no later than 30 days. With Enterprise, you have the assurance as an organization that information provided by employees to ChatGPT will never be used for the training of its models! Unfortunately, Enterprise is not available to everyone, and you are at the mercy of OpenAI to use it by making a request to the Sales team.
What about OpenAI's ChatGPT and GDPR for businesses?
It's clear that ChatGPT is a GDPR nightmare. Why? You can read about it in this article from The Verge. Several EU countries have initiated investigations into OpenAI's data policy regarding ChatGPT. But what about the business use of OpenAI's services?
It has recently been determined that even if you only make the GPT-4 model available through your app or website without having access to the user data entered, you still need to comply with GDPR regulations[5]. How to meet GDPR requirements when using OpenAI's APIs? Legal Nodes has written a comprehensive article on this topic that we couldn't do better!
Final verdict on data privacy and information security for companies using OpenAI services (such as ChatGPT)
Despite entrepreneurs' concerns about their business and customer data when using OpenAI's services, we are generally positive about the efforts made by OpenAI. Did you know that they even have a dedicated Security Portal? Here, you can see which regulations and security measures OpenAI complies with. You can also request your own account to access and download documents or subscribe with your email address to stay informed about all developments related to data privacy and security. published
In addition to this cool addition by OpenAI (powered by the startup Safebase), it is clear that they understand the importance of data privacy and security for their services. Since March 2023, no data from API and Enterprise interactions is used for model training. Offering Zero Data Retention is an important step towards a privacy-first approach. Also, providing an opt-out option, albeit still manual, for free and Plus ChatGPT accounts is a great step towards more data control for everyone.
It seems clear that OpenAI does not yet fully comply with all extensive GDPR regulations in Europe, but European countries are giving them time to align their systems with these regulations. However, for companies using OpenAI's API, it is largely possible to comply with GDPR, depending on how you use the services. If you process customer data through the OpenAI API, we recommend making sure you include OpenAI in your data processing register and sign a data processing agreement, which you can do on the OpenAI website.
Schedule your 30-minute meet & greet and AI consultation
With our extensive knowledge, personalized approach, and unmatched technical expertise, we're confident that our AI solutions will propel your organization to new heights. Discover the future of your business today. Contact us today to learn more about our AI consulting and integration services, and let us help you unlock the full potential of artificial intelligence for your organization.