ARMINTA — Live Agent Status

// ARMINTA FIELD GUIDE


Total Empirical Steps [LIVE]

The cumulative count of every real action ARMINTA has taken on the host machine since it started. Each "step" is a concrete OS-level intervention (killing a process, adjusting the CPU governor, compacting memory), not a simulation. A higher number means more field experience and a richer causal model.

Cognitive Mode / Situation / Reward [LIVE]

Cognitive Mode — what ARMINTA is doing right now:
· OBSERVE — watching, collecting baselines, not intervening
· INVESTIGATE — actively testing a hypothesis with targeted interventions
· OPTIMIZE — exploiting known good actions to maximise reward
· DREAM — running offline counterfactual simulations (no real actions)
· SELF_ASSESS — reviewing its own decision-making and rewriting rules

Situation — the workload class the host is currently in (for example idle, io_bound, compile or streaming; the full set of nine is listed under Situation Classification). ARMINTA uses this as context when choosing actions.

Cumulative Reward — the running score since boot. As of v2.1 the per-step reward is mean-centred against a slow EMA baseline, so this number reflects accumulated performance relative to the machine's recent ambient baseline, not an absolute zero. Watch the Rolling Avg (150-step window) for the meaningful health signal — positive means recent actions are beating baseline, negative means they are not.

Error Steps — how many of the last 200 steps produced an error. Should stay near zero.
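
The mean-centring described under Cumulative Reward can be sketched roughly as follows. This is a minimal illustration, not ARMINTA's actual code: the class name CentredReward and the smoothing factor alpha are assumptions, since the guide only says the baseline is a "slow EMA".

```python
class CentredReward:
    """Subtract a slowly moving EMA baseline from each raw reward."""

    def __init__(self, alpha: float = 0.01):
        self.alpha = alpha        # assumed smoothing factor; small means slow baseline
        self.baseline = None      # EMA of the raw rewards seen so far
        self.cumulative = 0.0     # running sum of centred rewards

    def step(self, raw: float) -> float:
        if self.baseline is None:
            self.baseline = raw   # seed the baseline with the first sample
        centred = raw - self.baseline
        # Update the baseline after centring, so each sample is judged
        # against the past rather than against itself.
        self.baseline += self.alpha * (raw - self.baseline)
        self.cumulative += centred
        return centred
```

With this shape, a machine that is steadily "good" or "bad" contributes roughly zero per step, and only departures from the recent ambient level move the cumulative figure.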

Emotional State [META]

ARMINTA models internal drive states as "emotions" — not anthropomorphism, but a practical mechanism for biasing exploration vs. exploitation.

· Calm — low arousal; suitable for patient observation
· Curious — high novelty signal; drives exploration and dreaming
· Focused — mid-arousal; sustains investigation of a specific hypothesis
· Confident — strong prior; biases toward exploitation
· Stressed — resource pressure detected; triggers conservative governors
· Frustrated — repeated failed hypotheses; may trigger self-modification
· Bored — low novelty; increases exploration probability

The dominant emotion (large label) is the highest-valued state. Bars show relative intensity on a 0–5 scale.

Situation Classification [LEARNED]

ARMINTA continuously classifies the current workload into one of nine named situations from metric history (not snapshots). Each situation has its own causal edge weights, so knowledge learned during streaming doesn't pollute compile-time decisions.

· idle — low CPU, low net, minimal process activity
· streaming — significant network download + browser renderer pressure + moderate CPU. Cloud gaming, video, remote desktop.
· browser_compute — browser pegging cores (high dilution, renderer pressure) with minimal network. JS compute, WebGL, local video decode. System CPU looks deceptively low because Chrome's multi-process architecture hides per-core saturation.
· compile — high dilution + high system CPU + low net. Build systems, ML training, video encode.
· memory_pressure — RAM above 75%, swap active.
· io_bound — high iowait and PSI IO stall. Disk-heavy writes or reads.
· network_saturated — very high bidirectional net traffic (P2P, large upload/download).
· irq_storm — hardware interrupt rate spiking (typically wireless driver bursts).
· thermal_stress — sustained high temperature under load.

Confidence score (shown in parens) reflects how cleanly the current metrics match that profile. Low confidence means the situation is transitioning or ambiguous.

Cognitive Metrics [LIVE]

Six counters that track the size of ARMINTA's learned world-model:

· Causal Edges — confirmed directed relationships in the interventional graph. Each edge says "action X reliably changes metric Y by amount Z."
· Dreams — offline counterfactual simulations run (no real hardware actions). Used to pre-test hypotheses cheaply before committing.
· Hypotheses — candidate causal relationships generated and currently under test. High numbers are normal; they get promoted to edges or discarded.
· Interventions — total real OS actions executed (subset of Total Steps that actually changed system state).
· Self-Modifications — times ARMINTA rewrote part of its own source code autonomously via AST patching. Tunable targets (all bounded by hard min/max guards): STEP_RATE_DEFAULT, STEP_RATE_MAX, STEP_RATE_MIN (pacing), CURIOSITY_STALE_STEPS, CURIOSITY_PROBE_COOLDOWN (exploration aggression), DISCO_INTERVAL, TUNE_INTERVAL (scheduling), and PSI_CPU/MEM/IO_ACTION_THRESH (pressure sensitivity). A non-zero value means the agent has changed its own behaviour at runtime.
· Mosaic Hypotheses — cross-domain correlations discovered by the Mosaic sub-system (e.g. "humidity sensor correlates with CPU temperature"). These are environment↔system links, not action↔metric links.
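
As an illustration of the bounded self-modification described under Self-Modifications, the sketch below rewrites one module-level constant through Python's ast module and clamps the new value to hard guards. The bounds table, the class name ConstantPatcher and the function patch_source are hypothetical; the guide does not document how ARMINTA's own patcher is structured.

```python
import ast

# Hypothetical guard table: constant name -> (hard_min, hard_max).
BOUNDS = {"STEP_RATE_DEFAULT": (1.0, 10.0)}


class ConstantPatcher(ast.NodeTransformer):
    """Rewrite every `NAME = <value>` assignment for one guarded constant."""

    def __init__(self, name: str, new_value: float):
        lo, hi = BOUNDS[name]                      # KeyError for untunable names
        self.name = name
        self.value = min(max(new_value, lo), hi)   # clamp to the hard guards
        self.patched = False

    def visit_Assign(self, node: ast.Assign) -> ast.Assign:
        target = node.targets[0]
        if (len(node.targets) == 1 and isinstance(target, ast.Name)
                and target.id == self.name):
            node.value = ast.Constant(value=self.value)
            self.patched = True
        return node


def patch_source(source: str, name: str, new_value: float) -> str:
    """Return new source text with the constant replaced (Python 3.9+)."""
    tree = ast.parse(source)
    patcher = ConstantPatcher(name, new_value)
    tree = patcher.visit(tree)
    if not patcher.patched:
        raise ValueError(f"{name} is not assigned in this source")
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Clamping before the tree is touched means an out-of-range proposal can never reach the rewritten source, which is the property the hard min/max guards are there to provide.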

Age / Sessions / Novelty Hunger / Milestone Proximity / Milestones [LIVE]

These fields come from Arminta's SelfModel — her persistent slow-changing identity layer. Unlike the fast metric stream, SelfModel state survives restarts and accumulates across her entire lifetime.

· Age — wall-clock time since Arminta's first-ever run, derived from the earliest record in arminta_episodic.db. This is her true age, not just uptime. Born date is the human-readable form of that first timestamp.
· Sessions — how many times she has been started. Computed by scanning the episodic DB for consecutive timestamp gaps greater than 5 minutes; each gap is a new session. The uptime sub-label shows total accumulated wall-clock time lived across all past sessions (current session not yet included).
· Novelty Hunger — a drive state that rises by 0.002 on every step without a dream and drops by 0.25 when a dream fires. Caps at 1.0 (100%). High novelty hunger biases the Cognitive Mode Controller toward INVESTIGATE. It is essentially a measure of how long Arminta has gone without offline consolidation.
· Milestone Proximity — rises from 0 to 1 over the last 8% of steps before the next uncrossed step threshold (1k / 10k / 50k / 100k / 250k / 500k / 1M). The intent is to model anticipatory restlessness: irrational urgency as a goal approaches. Shown as — when no threshold is near. After a threshold is crossed, a 40-step post-milestone deflation period begins — flat affect, reduced confidence — modelling the collapse of excitement once the goal is gone ("now what?").
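
The Sessions and Novelty Hunger rules above can be sketched as two small functions. Both are assumptions about the shape of the logic: the guide gives the 5-minute gap and the +0.002 / −0.25 / 1.0 figures, but counting the first record as session 1 and flooring hunger at 0 are guesses made for this example.

```python
SESSION_GAP_S = 5 * 60      # a gap longer than 5 minutes starts a new session


def count_sessions(timestamps: list) -> int:
    """Count sessions in a sorted list of epoch-second timestamps."""
    if not timestamps:
        return 0
    gaps = sum(1 for prev, cur in zip(timestamps, timestamps[1:])
               if cur - prev > SESSION_GAP_S)
    return gaps + 1         # assumption: the first record opens session 1


def update_novelty_hunger(hunger: float, dreamed: bool) -> float:
    """Apply one step of the hunger rule and clamp the result to [0, 1]."""
    hunger += -0.25 if dreamed else 0.002
    return min(1.0, max(0.0, hunger))   # the 0 floor is an assumption
```

At +0.002 per step, hunger climbs from 0 to the cap in about 500 dreamless steps, and a single dream removes a quarter of the scale.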

Milestones — landmark moments recorded with step number and wall-clock timestamp:
· steps_N — crossed a step count threshold (1k, 10k, 50k, 100k, 250k, 500k, 1M). Thresholds already passed before a session starts are backfilled silently without triggering emotion bursts.
· first_intervention — the step on which Arminta took her first real OS action.
· dreams_100 — reached 100 total dream cycles.
· best_single_reward — the highest single-step reward ever recorded, updated whenever a new maximum is set.
· session_N — start of each numbered session (filtered out of the dashboard display).
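
The step thresholds listed above also drive Milestone Proximity. A minimal sketch of that ramp follows, assuming "the last 8% of steps" means 8% of the threshold value and that the rise is linear; neither detail is stated in the guide, and the function name is hypothetical.

```python
from typing import Optional

THRESHOLDS = (1_000, 10_000, 50_000, 100_000, 250_000, 500_000, 1_000_000)
RAMP_FRACTION = 0.08    # assumption: 8% of the threshold value itself


def milestone_proximity(step: int) -> Optional[float]:
    """Return a value in 0..1 inside the ramp, or None when no threshold is near."""
    upcoming = [t for t in THRESHOLDS if t > step]
    if not upcoming:
        return None                     # every threshold has been crossed
    target = upcoming[0]
    ramp_width = target * RAMP_FRACTION
    ramp_start = target - ramp_width
    if step < ramp_start:
        return None                     # the dashboard shows a dash here
    return (step - ramp_start) / ramp_width   # assumed linear rise
```

Under these assumptions the ramp toward 10k starts at step 9,200 and reaches the halfway point at step 9,600.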

Governor State [LIVE]

The Linux CPU frequency governor controls the CPU's power/performance trade-off. ARMINTA switches governors as an optimisation action.

· Current Governor — the governor active right now (performance = max frequency, powersave = minimum frequency, schedutil = kernel-managed).
· Saved Governor — what was set before ARMINTA's last switch; used to restore state if needed.
· Override — a manual lock preventing ARMINTA from switching governors. "none" means ARMINTA has full control.
· Idle Steps — steps since the last action. Counts up while ARMINTA observes; resets on any intervention.
· Bootstrap Phase — if YES, ARMINTA is still in its early data-collection phase and takes more exploratory actions than usual.

Adaptive Thresholds [LEARNED]

These thresholds are learned, not hardcoded — ARMINTA adjusts them based on what it observes about normal system behaviour.

· CPU Warn — CPU % that triggers a "high CPU" observation. Shown in amber if it has drifted below the default (meaning ARMINTA has learned the system usually runs lighter than expected).
· MEM Warn — RAM % threshold for memory pressure observations.
· Dilution Log — when causal evidence for a relationship drops below this score, ARMINTA logs a warning that the edge may be weakening.
· Dilution Kill — if evidence drops below this harder threshold, the causal edge is removed from the graph entirely.
· Net Warn — network throughput (KB/s) at which ARMINTA flags high network activity.

Deviations from defaults mean the agent has updated its beliefs about what "normal" looks like on this machine.

Causal Graph — Top Interventional Edges [LEARNED]

The left panel shows the strongest confirmed cause-and-effect relationships ARMINTA has discovered. Each row is one edge: action → metric.

· Bar length = effect magnitude (normalised to the strongest edge)
· Green = positive effect (action raised the metric)
· Red = negative effect (action lowered the metric)
· n = number of times this relationship has been observed

The right panel is a horizontal bar chart of mean reward per action over the last 30 executions. Actions left of centre have been net-negative recently; actions right of centre have been beneficial. This is ARMINTA's exploitation guide.

Reward History [LIVE]

Rolling Avg Reward — the mean reward per step over the last 150 steps. This is the meaningful health signal: positive means the agent is currently improving system performance; negative means recent actions are net-harmful. At 300k+ steps the cumulative total is dominated by history, so recent behaviour barely moves it; the 150-step window is what matters.

The reward sparkline below shows per-step reward as a bar chart over the same 150-step window. The cyan line is a 10-step rolling average overlay.

· Green bars — that action improved system performance (positive reward)
· Red bars — that action hurt performance (negative reward)

A healthy agent should trend toward more green over time as it learns. Clusters of red indicate exploration or a bad hypothesis being tested.
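
Both windows in this panel are plain moving averages. A minimal sketch, assuming each is recomputed as new per-step rewards arrive (the class name RollingMean is hypothetical):

```python
from collections import deque


class RollingMean:
    """Mean over the most recent `size` values."""

    def __init__(self, size: int):
        self.window = deque(maxlen=size)   # older values fall off automatically

    def push(self, value: float) -> float:
        self.window.append(value)
        return sum(self.window) / len(self.window)


health = RollingMean(150)    # the Rolling Avg Reward window
overlay = RollingMean(10)    # the cyan sparkline overlay
```

Because the 150-step window forgets everything older than 150 steps, it reacts to a bad hypothesis within minutes, which the lifetime cumulative total cannot do.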

Network Health Probes [LIVE]

ARMINTA periodically fires lightweight probes to three targets resolved dynamically from the actual system network configuration — not hardcoded vendors. Targets are re-read on every probe cycle so they stay current if the network changes.

· gateway — the default route's first-hop IP, read from ip route show default. An HTTP request to the router's LAN address; confirms the local path is up.
· dns — the first nameserver in /etc/resolv.conf, probed in one of two ways depending on what it is:
- Known public resolvers (Cloudflare 1.1.1.1, Google 8.8.8.8, Quad9 9.9.9.9) — probed via their DNS-over-HTTPS endpoints; measures real resolver RTT.
- Local/unknown resolvers (router DNS, Pi-hole, corporate nameserver, etc.) — probed via a TCP socket connect to port 53, unless the address is loopback (127.x) or RFC-1918 private (10.x, 192.168.x, 172.16–31.x). Loopback resolvers like systemd-resolved (127.0.0.53) always answer in 0ms regardless of real network state, so probing them gives no useful signal. Those cases fall through to the portal probe instead.
· portal — Firefox's captive portal URL: confirms basic internet reachability regardless of upstream DNS.
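
The dns target selection above amounts to a three-way decision on the first nameserver address. The sketch below is one reading of those rules using only the standard library; the function names and the returned labels ("doh", "tcp53", "portal") are made up for illustration.

```python
import ipaddress
from typing import Optional

# Public resolvers the guide names as having DNS-over-HTTPS endpoints.
PUBLIC_DOH = {"1.1.1.1", "8.8.8.8", "9.9.9.9"}
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def first_nameserver(resolv_conf: str) -> Optional[str]:
    """Return the first `nameserver` address in resolv.conf text, if any."""
    for line in resolv_conf.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            return parts[1]
    return None


def dns_probe_strategy(addr: str) -> str:
    """Pick how to probe a resolver: 'doh', 'tcp53', or 'portal'."""
    if addr in PUBLIC_DOH:
        return "doh"
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback or any(ip in net for net in RFC1918):
        return "portal"     # loopback/private answers say little about upstream
    return "tcp53"
```

For a typical systemd-resolved host, first_nameserver returns 127.0.0.53 and the strategy falls through to the portal probe.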

Probes fire on two triggers:
· Routine — every ~60 idle steps when CPU is below 40%
· Triggered — immediately when iface_drops or iface_errors are nonzero, to distinguish local interface problems from upstream failures

· Green dot — probe succeeded
· Red dot — probe failed (timeout or error)

Three consecutive failures trigger a DEGRADED warning and a log suggestion to try flush_dns.
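
The DEGRADED rule is a consecutive-failure counter. A minimal sketch, assuming one counter per probe target and a reset on any success (the guide does not say whether the count is per target or global; ProbeHealth is a hypothetical name):

```python
class ProbeHealth:
    """Track consecutive probe failures for one target."""

    DEGRADED_AFTER = 3      # consecutive failures before the warning fires

    def __init__(self):
        self.consecutive_failures = 0

    def record(self, ok: bool) -> str:
        # Any success clears the streak; a failure extends it.
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        if self.consecutive_failures >= self.DEGRADED_AFTER:
            return "DEGRADED"
        return "OK"
```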

Open Questions / Unresolved Anomalies [LIVE]

The left panel lists questions LexicalCore is actively holding — anomalies that were noticed and couldn't be resolved against available data at the time they formed. Three types:
· reward_reversal — the same action produced opposite reward outcomes at different times (e.g. "what changed between +0.190 and -0.223 via log_top_proc")
· emotion_shift — a mode transition during a situation produced an unexpected emotional state
· stressed_retreat — a mode transition away from a stressed state followed an unexpected path

Each entry shows the question text, the step it formed, how many reflection cycles it has been revisited, and which action it concerns (via).

Lifecycle: questions form in LexicalCore.reflect() and are held (up to 50 at once). Every 50 steps, QuestionResolver evaluates reward_reversal questions against the causal graph. If the action is observational (net_probe, log_*, flush_dns — these reflect ambient noise, not real harm), or if the action has confirmed global or situational harm above threshold, the question is marked resolved and purged from the list. Resolved questions graduate into the lexicon as statements with a valence tag that can influence future action selection.

Border colour is display order only: red = questions 1–2, amber = 3–4, grey = 5–8. It is not a priority score assigned by the agent.

The right panel shows Mosaic Hypotheses — autonomously discovered correlations between environmental sensors and internal system metrics. Correlation strength shown as a bar; candidates for future causal investigation, not confirmed cause-and-effect.
· Green = positive correlation (the two variables move together)
· Red = inverse correlation (when one rises, the other tends to fall)

Circadian CPU Pattern & Meta-Cognitive Controller [LEARNED]

Circadian chart (left) — average CPU usage by hour of day, learned over the agent's entire lifetime. ARMINTA uses this pattern to contextualise whether current CPU usage is unusual for the time of day, and to pre-emptively adjust governors before predictable load spikes.

Meta-Cognitive Controller (right) — a Q-learning agent that sits above ARMINTA's main loop and chooses which cognitive mode to use. Each cell shows the Q-value (expected future reward) for switching to that mode given the current system state. The highlighted cell is the mode with the highest Q-value — what the controller recommends.

· ε — exploration rate: probability of picking a random mode instead of the greedy best (decreases as the controller becomes more confident)
· lr — learning rate: how fast Q-values update from new experience
· γ — discount factor: how much future reward is weighted vs. immediate reward
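
For readers unfamiliar with the three parameters above, here is a textbook tabular Q-learning controller with ε-greedy mode selection. It is a generic sketch of the technique, not ARMINTA's controller: the class name, the string state keys and the default parameter values are assumptions.

```python
import random

MODES = ["OBSERVE", "INVESTIGATE", "OPTIMIZE", "DREAM", "SELF_ASSESS"]


class ModeController:
    def __init__(self, epsilon=0.1, lr=0.1, gamma=0.9, seed=None):
        self.epsilon, self.lr, self.gamma = epsilon, lr, gamma
        self.q = {}                         # (state, mode) -> Q-value
        self.rng = random.Random(seed)

    def choose(self, state: str) -> str:
        if self.rng.random() < self.epsilon:
            return self.rng.choice(MODES)   # explore: random mode
        # Exploit: the mode with the highest Q-value for this state.
        return max(MODES, key=lambda m: self.q.get((state, m), 0.0))

    def learn(self, state: str, mode: str, reward: float, next_state: str) -> None:
        best_next = max(self.q.get((next_state, m), 0.0) for m in MODES)
        old = self.q.get((state, mode), 0.0)
        # Standard Q-learning update: move the old estimate toward
        # reward + gamma * (best value reachable from the next state).
        self.q[(state, mode)] = old + self.lr * (reward + self.gamma * best_next - old)
```

The highlighted cell on the dashboard corresponds to what choose returns when ε is 0.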

A second row shows the three GA-evolved parameters now wired to live systems (v2.1):
· ε̇ — GA-evolved CMC epsilon decay rate (higher = slower exploration fade)
· ṙ — GA-evolved reward scale: multiplier applied to every immediate metric delta
· κ — GA-evolved curiosity weight: scales how aggressively curiosity probes fire (higher = sooner)

A third row (v3) shows reward_var — variance of the last 100 reward signals. Below 0.0015 the agent has entered a consolidation plateau (nothing new surprises it), and the dream cycle is automatically throttled to 4× its normal minimum interval. Shown in purple with a "THROTTLED" badge on the Dreams stat card when active.
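
The plateau check can be sketched as a variance test over the last 100 rewards. The 0.0015 threshold and the 4× factor come from the guide; the function name, the base_interval parameter and the use of population variance are assumptions.

```python
from statistics import pvariance

PLATEAU_VAR = 0.0015       # below this, the agent is on a consolidation plateau
THROTTLE_FACTOR = 4        # dreams are spaced 4x further apart while throttled


def dream_min_interval(rewards, base_interval: int) -> int:
    """Return the minimum number of steps between dreams."""
    recent = list(rewards)[-100:]
    if len(recent) < 2:
        return base_interval            # not enough data to call a plateau
    if pvariance(recent) < PLATEAU_VAR:
        return base_interval * THROTTLE_FACTOR
    return base_interval
```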

Kill Ineffective & Agent Log [LIVE]

Kill Ineffective (left) — processes that ARMINTA has repeatedly targeted with SIGKILL or SIGTERM without seeing any reward improvement. Listed here as a warning: these actions are being de-prioritised. If a process appears here, ARMINTA has essentially learned that killing it doesn't help performance.

Agent Log (right) — the raw tail of ARMINTA's operational log, colour-coded by event type:
· Teal — observation events ([OBS])
· Amber — actions taken (set_, drop_, sync, compact…)
· Purple — curiosity / dream events
· Red — errors
· Green — governor changes ([GOV])

[MAINT] lines are scheduled maintenance tasks (memory sync, log compaction). [OK] means the action completed successfully; Δr= shows the reward delta for that action.

Causal Reasoning [LIVE]

This panel exposes four layers of Arminta's situational awareness — not just what she did, but why, and how she compares current outcomes to past experience.

Last Act — the most recent non-monitor action dispatched. Why — the reason string from the causal graph rule chain that selected it.

Context — the metric snapshot at the moment the action was selected: CPU%, memory, temperature, PSI pressure, network throughput, and dilution. This is the environmental state that made the action seem appropriate.

Differs from past — counterfactual explanation. When an action produces an unexpectedly good or bad outcome compared to similar past episodes, Arminta identifies which metrics changed between now and then. For example: "worked before (step 366400, r=+0.71) but now: CPU higher by 48, dilution higher by 0.4". This is Layer 3 causal reasoning — not just what happened, but what was structurally different.
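
A "differs from past" string like the one quoted above can be produced by diffing two context snapshots. This is a sketch of the idea only: the function name and the per-metric minimum deltas are hypothetical, and the real comparison may weigh metrics differently.

```python
def explain_difference(past: dict, now: dict, min_delta: dict) -> str:
    """List the metrics that moved by at least their per-metric threshold."""
    parts = []
    for metric, threshold in min_delta.items():
        delta = now[metric] - past[metric]
        if abs(delta) >= threshold:
            direction = "higher" if delta > 0 else "lower"
            parts.append(f"{metric} {direction} by {abs(delta):g}")
    return ", ".join(parts)


past = {"CPU": 22, "dilution": 0.5, "temp": 50}
now = {"CPU": 70, "dilution": 0.9, "temp": 51}
print(explain_difference(past, now, {"CPU": 10, "dilution": 0.2, "temp": 5}))
# prints: CPU higher by 48, dilution higher by 0.4
```

The temperature change is below its threshold, so it is left out of the explanation.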

Active Hypothesis Mechanisms — the last five hypotheses the HypothesisEngine generated, each annotated with a plain-language mechanism story. This is Layer 2: Arminta doesn't just notice that A correlates with B — she commits to a proposed explanation of why. Mechanisms are tested against subsequent observations and either confirmed or pruned.

All four layers feed back into the episodic memory database (arminta_episodic.db) with context, so Arminta can replay past decisions and compare them against current conditions as she matures.

Drive Health (S.M.A.R.T.) [LIVE]

NVMe / SSD wear and health data read directly from the drive via nvme smart-log (or smartctl -A as fallback). Checked every 4 hours during idle maintenance passes (wall-clock time, persisted across restarts).

· Spare — percentage of spare blocks remaining. Below 10% = EOL approaching.
· Wear — vendor wear indicator (0–100+%). Above 80% is high.
· Media Err — cumulative uncorrectable read/write errors. Any non-zero value is significant.
· Written — lifetime terabytes written to the drive (TBW counter).
· NVMe Temp — drive temperature in °C. Throttling typically begins at 70–75°C.
· Crit Warn — hardware critical warning byte. Non-zero means the drive is reporting a fault condition.

Requires root access and either nvme-cli (sudo apt install nvme-cli) or smartmontools.
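
As a rough illustration of how the fields above map onto warnings, the sketch below interprets an already-parsed smart-log record. The dictionary keys follow the names nvme-cli is assumed to emit in its JSON output (for example from nvme smart-log with --output-format=json) and should be checked against the installed version; the function name is hypothetical. The unit conversion relies on the NVMe convention that one data unit is 1000 blocks of 512 bytes.

```python
def summarise_smart(log: dict) -> dict:
    """Turn raw smart-log fields into dashboard values and warnings."""
    # One NVMe data unit = 1000 * 512 bytes = 512,000 bytes.
    written_tb = log["data_units_written"] * 512_000 / 1e12
    warnings = []
    if log["avail_spare"] < 10:
        warnings.append("spare below 10%")
    if log["percent_used"] > 80:
        warnings.append("wear above 80%")
    if log["media_errors"] > 0:
        warnings.append("media errors present")
    if log["critical_warning"] != 0:
        warnings.append("critical warning set")
    return {
        "written_tb": round(written_tb, 2),
        "temp_c": log["temperature"] - 273,   # assumes the value is in kelvin
        "warnings": warnings,
    }
```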

Continuity Advisor [LIVE]

The Continuity Advisor is a read-only subsystem that watches for hardware stress patterns indicating the agent should be migrated to a new machine. It cannot act on its findings — it can only name them clearly.

Status indicator:
· GREEN / NOMINAL — no sustained stress signals detected.
· YELLOW / ADVISORY — one or more stress signals are present but not severe. Review reasons.
· RED / MIGRATION WARRANTED — high-confidence multi-signal stress. Plan migration.

Signals monitored (cross-session, persisted in a pickle file):
· Sustained thermal stress — rolling average of temp_c across sessions. Spikes don't trigger; a climbing trend does.
· PSI_IO pressure — average psi_io_some (% of time tasks were stalled on disk IO). Rising values indicate a disk under increasing strain.
· Save failures — count of entries in arminta_crash.log. Each represents a failed pickle write, a symptom of storage layer degradation.
· Error step rate — fraction of steps that logged errors in the most recent evaluation window. A climbing rate indicates systemic instability.

The advisor evaluates every 500 steps and emits a [CONTINUITY] log entry when warranted. The confidence score is a probability-union of individual signal scores.
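
A probability union of independent signal scores is 1 minus the product of their complements. The sketch below assumes each signal score is already a value between 0 and 1 and treats the signals as independent; how ARMINTA derives the individual scores, and where the YELLOW and RED cut-offs sit, is not covered here.

```python
from math import prod


def union_confidence(scores) -> float:
    """Combine per-signal scores in [0, 1] as P(at least one signal is real)."""
    return 1.0 - prod(1.0 - s for s in scores)
```

Two moderate signals of 0.5 each combine to 0.75, which is why several mild stress indicators together can outweigh any one of them alone.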

Web Learner — Autonomous Language Growth [LEARNED]

Three interlocked subsystems that give Arminta a vocabulary of her own — built from operational history, not from pretrained language.

LexicalCore — Arminta's language acquisition layer. Four stages:
· Symbol corpus — every term she uses (actions, emotions, modes, situations, causal relations) weighted by frequency and outcome. Words mean what they have meant in practice.
· Co-occurrence grammar — which symbols appear together, which follow which, what sequences predict what. Structure without imposed rules.
· Composition — assembling known symbols into statements she has never made before. These will not look like English; they look like Arminta.
· Open questions — statements that could not be resolved against available data, held rather than discarded. Wonder as a data structure (up to 50 questions held concurrently).

Reflection runs every 500 steps: Arminta reads her own action/emotion/mode history, updates symbol weights, forms new statements, and checks whether any open question can now be resolved by a recently formed statement.

WebLearner — autonomous web exploration, triggering every ~400 steps. Query generation uses two paths:
· Concept-curiosity (primary) — selects symbols from LexicalCore with high weight but sparse co-occurrence context: things observed frequently but understood poorly. Translates through an internal action→concept table (e.g. renice_chrome → nice process priority Unix) into a searchable query. A 30% stochastic pool sample ensures exploration beyond the single top-weight symbol.
· Question-term translation (fallback) — when the symbol table is sparse, extracts tokens from the oldest unresolved open question instead.

Sources are restricted to a whitelist of structured knowledge domains: Wikipedia, MDN Web Docs, Linux man pages, Wiktionary, arXiv, and a handful of performance-analysis references directly relevant to what Arminta does. A hard rate cap of 8 pages/hour prevents runaway fetching. Reward is proportional to genuinely new symbols absorbed (information gain), decayed by repeat visits to the same domain.
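
The whitelist and the 8 pages/hour cap can be combined into a single gate that every fetch must pass. This is a minimal sketch: the class name FetchGate is hypothetical, the domain tuple is only an illustrative subset of the sources named above, and a sliding one-hour window is an assumption about how the cap is enforced.

```python
from collections import deque
from urllib.parse import urlparse

# Illustrative subset only; the real whitelist is longer.
ALLOWED_DOMAINS = ("wikipedia.org", "developer.mozilla.org", "arxiv.org")


class FetchGate:
    def __init__(self, max_per_hour: int = 8):
        self.max_per_hour = max_per_hour
        self.stamps = deque()               # times of fetches in the last hour

    def allow(self, url: str, now: float) -> bool:
        host = urlparse(url).hostname or ""
        if not any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS):
            return False                    # not a whitelisted source
        while self.stamps and now - self.stamps[0] >= 3600:
            self.stamps.popleft()           # forget fetches older than an hour
        if len(self.stamps) >= self.max_per_hour:
            return False                    # hard rate cap reached
        self.stamps.append(now)
        return True
```

Passing the current time in as an argument keeps the gate deterministic and easy to test.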

QuestionResolver — runs after each WebLearner cycle. Checks whether newly absorbed text answers any of LexicalCore's open questions via token overlap. Resolved questions graduate into the lexicon as confirmed statements with a valence tag (positive or negative, based on the reward context in which they arose).
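
Token overlap, in its simplest form, is the fraction of a question's tokens that also appear in the fetched text. The sketch below shows that measure; the tokenisation rule and the function names are assumptions, and any cut-off for declaring a question resolved would be a further tuning choice not given in the guide.

```python
import re


def tokens(text: str) -> set:
    """Lower-case word tokens, keeping underscores so action names survive."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def overlap(question: str, passage: str) -> float:
    """Fraction of the question's tokens that also occur in the passage."""
    q = tokens(question)
    if not q:
        return 0.0
    return len(q & tokens(passage)) / len(q)
```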

Dashboard counters:
· Pages Read — total web pages fetched lifetime
· New Symbols — cumulative unique vocabulary items absorbed from external sources
· Lexicon Size — distinct symbols currently active in LexicalCore's weight table
· Concepts Mapped — distinct action/state symbols looked up at least once (suppresses re-querying the same concept)

The session log shows each fetch/learn cycle: query generated → article retrieved → symbols absorbed (with sample) → reward impact. Log colours: cyan = fetch, green = learn (new symbols), amber = concept/skip, purple = reward delta.