Towards a Generic Methodology for Sandbox Escape: Part 1: Flow

First enable flow. Next, enter flow state.

John Andersen

Flow

We’re beginning to think strategically about how to transition system contexts from one state to another while maintaining a high level of security and trust. What’s the plan?

  • The Goal: To find the most efficient way to advance the “state of the art.”

    • We aim to do this as quickly as possible.
    • We want to ensure safety for ourselves, those around us, and future generations.
    • We are mindful of potential disruptors—those who may try to influence us in ways that don’t align with our strategic principles.
  • How: The answer lies in leveraging the scientific process.

    • AI can assist us in this endeavor.
    • AI can help us better understand our actions.
    • AI can aid in making informed decisions.
    • AI can help us predict future outcomes.

Sandbox

knowledge-graphs-for-the-knowledge-god

  • What are the constraints: We must consider our operational boundaries, threat models, trust in our inputs, and alignment with our strategic principles. To manage these aspects, we can rely on AI, but also on SCITT (Supply Chain Integrity, Transparency, and Trust) at each Trusted Computing Base (TCB) level.

  • SCITT logs allow us to track and verify the flow of system contexts as they move through each level of the TCB. This ensures that all transitions are secure, fully transparent, and verifiable.

    • Each execution context will be appended to the SCITT log, which operates as an append-only ledger, ensuring an immutable history of state changes. This makes it possible to track how policies were applied and enforced at each stage (a minimal ledger sketch follows this list).
    • By tying the policy engine's flows to the SCITT log, we can verify that no unauthorized actions have occurred: every decision made by the policy engine is logged and auditable.
  • AI as Individual Agents: Imagine if each of us had our own versions of these truth-seeking AI agents—capable of navigating through complex system contexts, making secure decisions, and executing flows independently or cooperatively within a sandboxed environment. By distributing AI agents across various contexts, we can individually execute tasks in isolation or come together as a network of trusted agents to collaboratively solve more complex problems. Each AI, with access to a personalized SCITT log, can ensure that its actions remain secure and in alignment with the group’s goals.

    • This distributed approach would create a web of secure, verified processes that can not only scale but also adapt dynamically, leveraging the power of AI in a decentralized but coordinated fashion.
  • Federation for Context Sharing: To truly harness the collective power of these AI agents, we must implement federation, where AIs share context across networks to coordinate their actions. Each AI agent within the federation can share and interpret context-dependent communication, forming ad-hoc groups to tackle specific challenges. These communications, often termed side channels in security, serve as specialized languages between agents, enabling them to discuss nuanced situations and potential threats based on shared context.

    • These side channels allow AIs to federate their operations and share insights about the environments they interact with, making context-aware decisions on the fly. This communication isn't static but evolves with the situation, allowing agents to form temporary alliances and execute coordinated flows to solve complex problems together. Side channels help AIs understand the subtle, context-dependent risks inherent in running shared dependencies on each other's compute resources, enabling them to maintain a secure and resilient system (a second sketch after this list illustrates such a channel).
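
To make the append-only ledger idea concrete, here is a minimal Python sketch of a SCITT-style transparency log. The ScittLog class, the hash-chained entries, and the verify() walk are all illustrative assumptions for this post, not the actual SCITT data formats (which use COSE-signed statements and receipts):

import hashlib
import json
from dataclasses import dataclass


@dataclass
class LogEntry:
    # One execution context appended to the transparency log.
    payload: dict    # e.g. policy decision, flow identity, TCB level
    prev_hash: str   # hash of the previous entry; this chains the ledger
    entry_hash: str = ""

    def __post_init__(self):
        body = json.dumps(self.payload, sort_keys=True) + self.prev_hash
        self.entry_hash = hashlib.sha256(body.encode()).hexdigest()


class ScittLog:
    # Append-only, hash-chained ledger (illustrative, not the real SCITT format).
    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, payload: dict) -> LogEntry:
        prev = self.entries[-1].entry_hash if self.entries else self.GENESIS
        entry = LogEntry(payload=payload, prev_hash=prev)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute the chain; tampering with any entry breaks every later hash.
        prev = self.GENESIS
        for entry in self.entries:
            body = json.dumps(entry.payload, sort_keys=True) + prev
            if hashlib.sha256(body.encode()).hexdigest() != entry.entry_hash:
                return False
            prev = entry.entry_hash
        return True


log = ScittLog()
log.append({"tcb_level": "firmware", "policy": "allow", "flow": "boot"})
log.append({"tcb_level": "os", "policy": "deny", "flow": "network-egress"})
assert log.verify()  # history is intact; mutating any entry makes this fail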
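
And a similarly hedged sketch of federated context sharing over a side channel. The SideChannel class, the agent names, and the message shape are invented for illustration; the point is only that agents join ad-hoc groups keyed by shared context and exchange insights within them:

from collections import defaultdict


# Hypothetical side channel: a context-scoped pub/sub bus between agents.
class SideChannel:
    def __init__(self):
        self.subscribers = defaultdict(list)  # context -> list of handlers

    def join(self, context, handler):
        # An agent joins an ad-hoc group keyed by a shared context.
        self.subscribers[context].append(handler)

    def share(self, context, sender, insight):
        # Federate an insight to every agent tracking this context.
        for handler in self.subscribers[context]:
            handler(sender, insight)


channel = SideChannel()

def alice_handler(sender, insight):
    print(f"alice learned from {sender}: {insight}")

channel.join("shared-dependency:libfoo", alice_handler)
channel.share(
    "shared-dependency:libfoo",
    sender="bob",
    insight={"risk": "possible sandbox escape", "evidence": "unexpected syscall"},
)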

Executing a Flow Within a Sandbox

Asciicast (2024-09-02): Rolling Alice: Architecting: Alice: A Shell for a Ghost: SSH LLM help from anywhere as long as you have a tmux session (install tpm2-tools on Fedora)

Any Unix machine with tmux (currently only Fedora and Debian-based distro dependencies are auto-installed; passwordless sudo is recommended) can set up an environment by running the following commands to get a "ghost" in your shell (inspired by https://localhost.run):

# From within tmux
# Grant the gh CLI scope to manage SSH keys on your account
gh auth refresh -h github.com -s admin:public_key
# Upload every local public key, titled <hostname>-<filename>
for pub in $(find ~/.ssh -name \*.pub); do gh ssh-key add --title $(hostname)-$(basename $pub) $pub; done
# Pull the logged-in username out of `gh auth status`
export GITHUB_USER=$(gh auth status | grep 'Logged in to github.com account ' | awk '{print $7}')
# Reverse-forward the local tmux server socket (parsed from $TMUX) and an input
# socket to the remote host, connecting as the GitHub user on port 2222
ssh_alice() { ssh -p 2222 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o PasswordAuthentication=no -R /tmp/${GITHUB_USER}.sock:$(echo $TMUX | sed -e 's/,.*//g') -R /tmp/${GITHUB_USER}-input.sock:/tmp/${GITHUB_USER}-input.sock ${GITHUB_USER}@alice.chadig.com; }
# Connect, then reconnect after a short pause
ssh_alice; sleep 1; ssh_alice

Imagining AI Trust Networks

such-alignment

Imagine a world where AIs traverse an open network, independently searching for others to collaborate with and determining their trustworthiness. These AIs, no longer confined to isolated environments, must cooperate, exchange information, and work together on shared tasks, such as developing software across a vast, shared dependency base.

As they encounter one another, the AIs must determine the trustworthiness of the flows they exchange. Each AI is responsible for reviewing flows submitted by others, which are designed to run on their compute environments. Before accepting any of these flows, the AIs must carefully evaluate the potential risks involved. Could this flow lead to a sandbox escape? Is the submitting AI aligned with the receiving AI’s risk tolerance?

To make these determinations, they don’t just need to understand sandbox security—they need to know how to escape the sandbox themselves. Mastery of this knowledge is crucial for accurately assessing the threat level of the code they receive. Every AI is simultaneously a gatekeeper and a participant, tasked with running computations but also safeguarding its own system from potential misuse.

These AIs are constantly balancing trust, risk, and collaboration. They negotiate access, run security checks through policy engines, and leverage SCITT logs at each Trusted Computing Base (TCB) level to verify that all actions are transparent and aligned with their own goals. By analyzing each other’s decision-making processes, they gain deeper insight into how to securely and efficiently develop software together, while ensuring that no AI is at risk of a security breach.
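
As a toy illustration of that negotiation, here is a hedged sketch of one agent vetting a flow submitted by another before agreeing to run it. The PolicyEngine rules, the risk scores, and the Flow shape are invented for this example; a real engine would weigh provenance, attestations, and much richer signals:

from dataclasses import dataclass


@dataclass
class Flow:
    # A unit of work one agent asks another to run (shape is hypothetical).
    submitter: str
    operations: list
    requests_network: bool = False
    requests_host_fs: bool = False


class PolicyEngine:
    # Toy risk scoring; a real engine would also consult provenance and attestations.
    def evaluate(self, flow, risk_tolerance):
        risk = 0
        if flow.requests_network:
            risk += 3  # exfiltration / command-and-control surface
        if flow.requests_host_fs:
            risk += 5  # classic sandbox-escape surface
        risk += sum(1 for op in flow.operations if op.startswith("exec:"))
        decision = "allow" if risk <= risk_tolerance else "deny"
        return decision, risk


engine = PolicyEngine()
flow = Flow(
    submitter="bob",
    operations=["fetch:deps", "exec:build"],
    requests_network=True,
)
decision, risk = engine.evaluate(flow, risk_tolerance=4)
# Whichever way it goes, the decision would also be appended to the receiving
# agent's SCITT log (see the earlier ledger sketch) so it stays auditable.
print(decision, risk)  # -> allow 4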

To facilitate their communication, the AIs use federated context sharing. As they form ad-hoc groups to work on joint tasks, they develop their own languages—known as side channels in security—unique to the context of their collaboration. These side channels enable the AIs to share subtle, real-time information about their environments, flows, and dependencies, ensuring that each agent understands the risks involved in sandbox escapes or software development on shared platforms.

This dynamic, evolving network of AIs continuously improves, learning not only how to work within the sandbox but also how to understand and assess the boundaries. It’s a system where mutual trust, context-sharing, and shared risk evaluations govern the open network—paving the way for a future of collaborative intelligence.

/acc/

  • How to speed things up: By understanding the constraints in place and how to overcome them.

chaos-for-the-chaos-god


Notes

  • Flow
  • Flow State
  • Acceleration

Flow

  • What do we want to do?
    • Query
    • Response workflow (a minimal sketch follows)
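
A minimal sketch of what a query → response flow might look like; the function names and the run_flow helper are placeholders, not an existing API:

import asyncio

# Hypothetical two-step flow: a query operation feeds a response operation.
async def query(prompt):
    # Stand-in for retrieval, an LLM call, or a knowledge graph lookup.
    return f"context for: {prompt}"

async def respond(context):
    return f"answer based on ({context})"

async def run_flow(prompt):
    return await respond(await query(prompt))

print(asyncio.run(run_flow("how do we execute this flow safely?")))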

Sandbox

  • compute
  • network
  • switch_root on RoTs
  • uses