Articles

from g0v.social

The combo I ended up implementing:
A homemade bash wrapper around rm that always strips -f, then adds -I if -r/-R is present and -i if it is not.

When the wrapper finally runs the command, it calls safe-rm.

Meanwhile, safe-rm's system-wide install puts it at /usr/local/bin/rm so that it replaces the everyday rm, which gives a double layer of foolproofing~~

When the original rm is needed, /bin/rm still works.
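
The flag rewriting described above could look roughly like the following. This is only a minimal sketch, not the actual script: the function names rewrite_rm_args and rm_wrapper and the SAFE_RM override variable are hypothetical, safe-rm is assumed to be on PATH, and only short option clusters plus --force and --recursive are handled.

```bash
#!/usr/bin/env bash
# Sketch of an rm wrapper: drop -f/--force, then prepend -I when the call is
# recursive (-r/-R/--recursive) and -i otherwise.
# Hypothetical names: rewrite_rm_args, rm_wrapper, SAFE_RM.

rewrite_rm_args() {
  REWRITTEN=(-i)          # slot 0 holds the prompt flag; -i is the default
  local recursive=0 end_of_opts=0 arg flags
  for arg in "$@"; do
    if (( end_of_opts )); then
      REWRITTEN+=("$arg") # everything after "--" is an operand, keep as-is
      continue
    fi
    case "$arg" in
      --)
        end_of_opts=1
        REWRITTEN+=("$arg")
        ;;
      --force)
        ;;                # long form of -f: drop it
      --recursive)
        recursive=1
        REWRITTEN+=("$arg")
        ;;
      --*)
        REWRITTEN+=("$arg")   # other long options pass through untouched
        ;;
      -?*)
        flags="${arg#-}"
        flags="${flags//f/}"  # strip every f from a cluster such as -rf
        if [[ "$flags" == *[rR]* ]]; then
          recursive=1
        fi
        if [[ -n "$flags" ]]; then
          REWRITTEN+=("-$flags")
        fi
        ;;
      *)
        REWRITTEN+=("$arg")
        ;;
    esac
  done
  if (( recursive )); then
    REWRITTEN[0]=-I       # recursive call: prompt once instead of per file
  fi
}

rm_wrapper() {
  rewrite_rm_args "$@"
  # Hand the rewritten arguments to safe-rm (SAFE_RM can point elsewhere).
  "${SAFE_RM:-safe-rm}" "${REWRITTEN[@]}"
}
```

With this sketch, rm_wrapper -rf build ends up running safe-rm -I -r build, and rm_wrapper -f a.txt runs safe-rm -i a.txt. To use it as a standalone script, a final line of rm_wrapper "$@" would be needed.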

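Anthropic previously published a paper showing that a problem can be planted dormant inside a large model along with a trigger condition that wakes it up, for example only emitting SQL injection code once it is 2026. The short name is Sleeper Agents.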

Yesterday they published research showing that an LLM can be tested to detect this kind of dormant threat:
Simple probes can catch sleeper agents
https://www.anthropic.com/research/probes-catch-sleeper-agents

The experiment starts from a hypothesis: if an LLM has a habit of deceiving (?), it probably finds it hard not to think about how to deceive.

I find both the hypothesis and the experiment really interesting; it is basically an extension of the lie detector concept.

Oh my, it's shaking again!!

from Feedly

Simple probes can catch sleeper agents

This “Alignment Note” presents some early-stage research from the Anthropic Alignment Science team following up on our recent “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety…

OCR plugin not found on NPM registry · Issue #577 · nut-tree/nut.js

Short summary This page documents a plugin that does not exist on NPM.js - what is going on? Was the plugin pulled from npmjs? Desired execution environment / tested on Virtual machine Docker conta…

Why China is defeating Tesla

Tesla’s stock price has been taking an absolute beating lately. Since July of last year the company has lost about $400 billion in market capitalization — a decline of over 40%. It’s now less than…
jimmy