Footprints

Give users the ability to trace the relationships between sources and results

Overview:

Working with generative AI can feel like wandering through a maze in the dark. Even if you start to find your way, you often don’t know how you got there in the first place.

Footprints is an umbrella term for an array of emerging patterns that trace the relationship between sources and results across successive requests and responses.

  • AI-generated Google results are shown beside the sources that were scraped
  • Tools like Midjourney include the dominant tokens in the metadata of the images they produce
  • Chats can be seeded with requests to provide annotations and footnotes that add context to the result.
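As a rough illustration of the last pattern, a result can carry its sources with it so the interface can always render them as footnotes. This is a minimal sketch, not any product's actual data model; the `Source` and `TracedResult` names and their fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    # Hypothetical record for one place a result drew from.
    title: str
    url: str


@dataclass
class TracedResult:
    # The generated text plus the footprints that explain where it came from.
    text: str
    sources: List[Source] = field(default_factory=list)

    def with_footnotes(self) -> str:
        """Return the text followed by one numbered footnote per source."""
        lines = [self.text]
        for number, source in enumerate(self.sources, start=1):
            lines.append(f"[{number}] {source.title} ({source.url})")
        return "\n".join(lines)


result = TracedResult(
    text="Footprints connect results to their sources [1].",
    sources=[Source(title="Example article", url="https://example.com")],
)
print(result.with_footnotes())
```

Keeping the sources on the result itself, rather than in a separate log, means any view of the result can show where it came from.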

That said, we don’t know what we DON’T see: the traces left on data that we don’t yet have the technology or patterns to view.

Footprints relate closely to Filters/Parameters, in that both represent ways of looking at the data at the meta level. Where parameters can surface trends in data that weren’t clearly visible, footprints are the clues left behind that parameters can sniff out. They are like the scent of data, but users still need to know to dig.

  • In a private or secure dataset, footprints help us trace relationships within the data, control for variables, and surface the connections that matter
  • In large models, or less secure databases, they can help protect against bias and the inadvertent exposure of personal data or intellectual property
  • In products likely to be scraped for their data, proactively considering the footprints of our information makes for safer and more trustworthy experiences

Legal, ethical, and commercial implications

We are likely to see more attention paid to footprints as legal challenges around LLMs mount and users become more savvy.

Consider for a moment that you are using a chat interface to request information about reproductive rights in your state. While your personal information, IP address, and so on might be secure, you leave other fingerprints behind. For example, let's say you search for information about a healthcare provider near your home, or ask for doctors within your healthcare network. It's not hard to imagine that someone who wanted to connect your behavior to you would have more clues to work with than you realize.

Designers can proactively stave off risk by looking for ways to isolate metadata and connect results to their sources. Any tool that allows for the scraping of personal data of any kind should be up front about this, and should avoid the dark pattern of nesting these settings deep within user option panels.

When designing interfaces to allow people to interact with Artificial Intelligence, look for ways for people to browse intelligently. Consider going beyond a basic caveat, and provide warnings when people are including personal information in their prompts.
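A warning like this could start as a simple check that runs before a prompt is sent. The sketch below is only an illustration under stated assumptions: the three regular expressions are hypothetical, US-centric examples and would miss most real personal information, and the function names are invented here.

```python
import re
from typing import List, Optional

# Hypothetical, deliberately narrow patterns for illustration only.
PII_PATTERNS = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ZIP code": re.compile(r"\b\d{5}(?:-\d{4})?\b"),
}


def find_personal_info(prompt: str) -> List[str]:
    """Return the label of every pattern that matches somewhere in the prompt."""
    return [label for label, pattern in PII_PATTERNS.items() if pattern.search(prompt)]


def warning_for(prompt: str) -> Optional[str]:
    """Build a warning to show before sending, or None if nothing was detected."""
    found = find_personal_info(prompt)
    if not found:
        return None
    return "This prompt appears to include: " + ", ".join(found) + ". Send anyway?"


print(warning_for("Find doctors near 94110 and email me at sam@example.com"))
```

The point of the design is that the warning names what was detected, so the user can decide whether to edit the prompt rather than just dismissing a generic caveat.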

Consider allowing users to see the map they are leaving behind and delete personal data.

People can only change behavior they are aware of. More intelligent, ethical, and trustworthy results can help to reinforce intelligent use of AI, thus expanding commercial, academic, and other use cases and making it more likely that the technology can be used for good.

Benefits:

Anti-patterns:

All or nothing
When an AI returns sources, this can create a cognitive bias that leads someone to think they are seeing the full picture, when in fact other information may have been intentionally or unintentionally withheld by the model. This matters in situations where the user may be using the response to form an opinion or understanding. This can go too far in the other direction as well (see Gemini's false equivalence of Elon Musk and Hitler). The trick may be teaching users how to maintain skepticism and ask a bot follow-up questions.

Midjourney makes it easy for users to navigate back in time to locate source images for regeneration
Images saved from Midjourney include the first several tokens, helping users recall how the image was formed
Julius allows users to view the Python code it generated to fetch a result
Reword includes animation of the sources it is fetching on its loading screen
The training sources that Reword references are available from the main canvas
Mobile app Kin (TestFlight) offers an example of how to manage fingerprints of personal data. Users can see their mind map from the app's home screen and choose what information to keep or delete
Google displays the sources behind its results as hyperlinks that a user can navigate to
When reading a PDF, Adobe uses anchor links to show the reader exactly where in the document it drew an inference from
Perplexity combines many of these patterns, including anchor links to references, sources, and the ability to use Copilot to write a more accurate summary
Notion and other tools allow the user to navigate between versions of a result
The organization Content Credentials is creating tools that would allow users to watermark images by source, or verify whether they are computer generated
AI writer separates text written by the writer themselves from text imported from the computer's response.
Tools like Writer.ai's AI DETECT help alert others to the use of auto-generated text

