A data privacy ‘GUT Check’ for synthetic media like ChatGPT
The rise of synthetic media like OpenAI’s ChatGPT is changing the way many types of content are produced and consumed — from academics to entertainment. As with any innovation, synthetic media raises concerns, such as data privacy and security, ethical issues, and the potential to spread misinformation. Ultimately, it’s up to you to determine whether the risks outweigh the benefits.
Concerned about a data security incident? Contact the campus IT Help Desk at 801-581-4000, the hospital ITS Service Desk at 801-587-6000, or the Information Security Office's Security Operations Center at SOC@utah.edu for immediate assistance.
Want to learn more? Reach out to the offices below!
- Office of General Counsel: Contact Ogcfirstname.lastname@example.org if you are evaluating a service for your organization and are provided with a contract for goods or services.
- Privacy Office: Contact email@example.com if a third-party vendor will be accessing, viewing, storing, or using university protected health information (PHI). If the terms of service or contract suggest data collection, a business associate agreement (BAA) may be legally necessary. Contact firstname.lastname@example.org with general inquiries about information privacy and your rights and responsibilities.
- IT Governance, Risk, & Compliance: Contact ISO-GRC@utah.edu if you are assessing software or an information system for your organization. The U’s Information Security Office must evaluate the security of new software or hardware.
- PIVOT: Contact PIVOT Center – Partners for Innovation, Ventures, Outreach & Technology (utah.edu) if you have an idea for innovating systems using apps or software.
Synthetic media is loosely defined as any form of media (visual, textual, audio) generated by or in collaboration with artificial intelligence (AI), such as large language models (LLMs). An LLM is “a deep learning algorithm that can recognize, summarize, translate, predict and generate text and other content based on knowledge gained from massive data sets,” according to NVIDIA. Companies, scholars, and organizations, including government agencies, mine data for compilation into massive data sets using the copyright principle of “fair use,” which permits the limited use of copyrighted material without permission.
Currently, the most noteworthy synthetic media platform is ChatGPT, which boasts over 100 million active users. Using a dialogue format, ChatGPT can “answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests,” according to OpenAI. Other platforms include Google’s Bard, Meta’s LLaMA (Large Language Model Meta AI), and Synthesis.
As with every innovation, there are immediate fans, stans, and naysayers. Companies, such as BuzzFeed and ESPN, have voiced support for the technology, indicating that synthetic media will be used to create public-facing content, such as BuzzFeed’s quizzes and ESPN’s commentary. Architects and visual designers also are experimenting with visual synthetic media in their drafting to save time and money.
Some privacy professionals express concern that innovations that use synthetic media will cause more harm than good because they increase avenues for malicious attacks. Criminals have already created websites that impersonate ChatGPT and other OpenAI platforms to phish personal and financial information or prompt users to download files containing malware. Other concerns include the technology’s capability to produce believable fake comments, videos, or other media, leading to a spread of misinformation. Educational institutions are grappling with ChatGPT-written research papers, with ChatGPT appearing as a co-author of at least four published papers and preprints, despite unsettled legal questions of who (or what) owns content generated by large language models.
One thing is certain: Synthetic media is here to stay.
That’s why it’s important to increase our technological literacy and implement healthy, curious, and cautious data privacy habits to protect against malicious threat actors, accidental disclosure of sensitive or personal information, and legal liability. To accomplish this, implement a “GUT Check” when using new technology to keep your data protected.
The “GUT Check” for data privacy:
G — Generated: How was the content generated?
U — Understand: Do you understand the terms and conditions of use?
T — Time: Take time before you disclose your data. Are you OK with how it will be used?
Generated: The first step to producing and consuming synthetic media safely is understanding how it was generated. Developers are working to create tools to evaluate whether content was generated with AI, but until these applications are integrated into our smartphones, computers, and other devices, be cautious when engaging with all media. Investigate the origin of content, research the name of the creator(s), and evaluate whether the content seems suspicious. Evaluate whether the content has an unsettling or unnatural appearance, known as the “uncanny valley” effect. Finally, seek to understand the technology by researching how large language models work.
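Understand: The second step is to understand the terms and conditions of use. As you review them, look for answers to the following questions.
Data use and ownership: Who owns the data you input? Who owns the data you produce? How will corporations and organizations use the data you input?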
Data retention: How long will the corporation or organization keep your data? Can you review the data that has been collected, amend it, or delete it?
Consent and opt-out: Can you opt out of, or consent (opt in) to, your data being stored, used, or sold by the company?
Time: The final step is to take time to evaluate whether you want to use the technology once you understand how the company or organization plans to store and use your data. If the technology is free to use, it’s likely that the company is benefiting from the user in some way, either by collecting data to sell to other companies or by using it to train the large language model and improve the technology. If you choose to proceed, pause for a moment to reflect on the information you plan to share before you upload it. Know that everything you do online leaves a digital footprint, and be protective of your personal information, because once you upload it, you can lose control over it.