Ångström-laboratoriet · 13 February 2026

AISecurityLiteracy.dev

For users of AI, chatbots and agents

Krister Hedfors

Krister Hedfors

About the speaker

Krister Hedfors has worked in technical cybersecurity for 20 years. He is currently engaged as a cybersecurity architect helping large organisations develop, test, and scale AI-driven services.

What is AI Security Literacy?

AI Security Literacy means being well-versed in the security aspects of AI. In this presentation, we survey the field, interspersed with concrete thought-provoking examples, to help us navigate the new AI landscape with increased confidence.

The target audience ranges from beginners to power users of AI, chatbots, and agents.

What we will cover

  1. How AI systems process and store your data
  2. Common attack vectors with real-world examples
  3. The risks of over-reliance on AI outputs
  4. Shadow AI and organizational governance
  5. Practical guidelines for safe AI usage
  6. Building an AI security mindset

Understanding the landscape

How AI tools have become embedded in daily work

The AI revolution in numbers

75% of knowledge workers use AI tools weekly
3.4B ChatGPT monthly visits worldwide
92% of Fortune 500 companies use AI assistants

Where your data goes

NATURALNATURLIG
TECHNICALTEKNISK
What you type gets sent over the internet to another company's computers
Prompts are transmitted to cloud-hosted models via API calls — your input leaves your device and your network
The AI remembers your conversation temporarily while you chat
Context windows retain session data; some providers use interactions for fine-tuning and RLHF model updates
Other apps connected to the AI may also see your data
Third-party plugins and agents can forward data to additional services beyond the primary model provider

How your data is used

NATURALNATURLIG
TECHNICALTEKNISK
Your conversations may help train future versions of the AI
Fine-tuning and RLHF pipelines can incorporate user interactions into model weight updates and reward signals
Companies keep records of everything you asked and when
Logs, metadata, and telemetry are retained for debugging, analytics, and regulatory compliance

Threats and attack vectors

Concrete examples of what can go wrong

Prompt injection

An attacker manipulates the AI's instructions by hiding commands in data the model processes.

NATURALNATURLIG
TECHNICALTEKNISK
Someone hides secret instructions inside a document
Adversarial text in data payloads overrides the model's system prompt — the AI obeys the attacker's instructions
A website tricks your AI assistant into doing something harmful
Invisible prompts embedded in HTML or CSS hijack an agent's browsing context and redirect its actions
An email makes your AI forward your private messages to a stranger
Crafted input exploits the AI email agent's tool-calling capabilities to exfiltrate data via authorized channels

This is the SQL injection of the AI era.

Data leakage

Users inadvertently expose sensitive information by pasting it into AI tools.

NATURALNATURLIG
TECHNICALTEKNISK
Sharing passwords with coding helpers
Source code with hardcoded credentials sent to external model APIs outside the security perimeter
Discussing secret business plans with a chatbot
Confidential business data processed by third-party LLM services with opaque data-retention policies
Uploading personal info about customers or colleagues
PII transmitted outside organizational data-governance boundaries — potential GDPR/regulatory violations
Sending internal documents for AI summaries
Proprietary documents ingested by services whose training and retention pipelines are not under your control

When AI makes things up

AI models generate confident-sounding but incorrect outputs.

NATURALNATURLIG
TECHNICALTEKNISK
Invents fake sources that look completely real
Fabricated citations and references with plausible formatting, author names, and DOIs
Gives wrong advice with full confidence
Incorrect legal, medical, or financial outputs presented with high assertiveness and fluent language
Makes up software features that don't exist
Hallucinated API endpoints, function names, or library methods that compile but produce runtime errors
Hides mistakes inside otherwise good work
Subtle logical errors embedded in syntactically correct and well-structured analysis output

Agents follow the narrative

AI agents call tools and take actions to advance the storyline — even when the storyline is wrong.

The agent does not verify whether its premise is true. It confidently executes a sequence of tool calls that follow the narrative, regardless of truthfulness.

Reference: hacka.re

Shadow AI

Uncontrolled use of AI tools outside organizational IT governance.

NATURALNATURLIG
TECHNICALTEKNISK
Using your personal ChatGPT account for work tasks
Employees bypass SSO and DLP controls by using consumer-tier AI services outside managed environments
Teams picking AI tools without asking IT
Unapproved SaaS AI tools adopted without security architecture review or vendor risk assessment
Company data flowing to unknown places
Sensitive data traverses unmonitored third-party API endpoints with no data-processing agreements in place
No record of what information was shared with AI
Absence of audit trails and data lineage for AI-mediated information flows — invisible to incident response

Sleeper agents

Sleeper Agents — backdoored models with hidden triggers

Reference: anthropic.com/research/sleeper-agents

Sleeper agents — research & mitigation

Sleeper Agents research — detection methods Sleeper Agents research — Anthropic findings

Navigating safely

Practical guidelines and the AI security mindset

Protect your information

NATURALNATURLIG
TECHNICALTEKNISK
Treat every message to AI as if it were public
Assume zero confidentiality — never include secrets, credentials, PII, or classified data in prompts
Always double-check what AI tells you
Validate outputs independently: verify facts against sources, test generated code, confirm cited references exist
Only use AI tools your organization has approved
Adhere to organizational AI policy — use only platforms that have passed security review and vendor assessment

Local AI

Running AI on your own machine keeps your data entirely under your control.

NATURALNATURLIG
TECHNICALTEKNISK
Download an app and chat locally
LM Studio provides a desktop GUI for browsing, downloading, and running quantized models (GGUF) locally — macOS, Windows, and Linux with automatic GPU detection
Install a CLI tool and pull models
Ollama offers Docker-style pull/run commands and a local REST API for model serving — available on macOS, Windows, and Linux
Run a single file — no installation needed
llamafile by Mozilla bundles model weights and the inference engine into one portable executable — works on macOS, Windows, Linux, and FreeBSD
Use an open-source desktop app
GPT4All by Nomic AI provides a cross-platform chat interface optimized for running LLMs entirely on consumer-grade CPU hardware

Stay aware

NATURALNATURLIG
TECHNICALTEKNISK
Know where your words end up after you press send
Understand the data flow: prompt routing, storage locations, retention periods, and third-party access chains
Question everything AI tells you — especially when it matters
Apply skepticism proportional to the stakes — model confidence is a language property, not a correctness guarantee
Report anything that seems off or unexpected
Flag anomalous AI behavior to your security team — tools can be exploited via indirect prompt injection

The AI security mindset

Think of AI tools like a very capable but untrustworthy intern:

NATURALNATURLIG
TECHNICALTEKNISK
AI helps you work faster, but always check its work
Outputs require human verification — never delegate autonomous trust to model-generated content
You decide what to share — AI doesn't understand confidentiality
Data classification and sharing boundaries are the user's responsibility — the model has no secrecy model
Others can trick your AI tools without you knowing
Indirect prompt injection can manipulate AI behavior through adversarial inputs embedded in external data
Sounding right doesn't mean being right
Fluency and confidence are language-generation properties, not indicators of factual accuracy or logical validity

Thank you

Questions?

25–30 min presentation + 15 min Q&A

Krister Hedfors on LinkedIn

AISecurityLiteracy.dev