China’s GLM-5.2 matches Claude Opus in selected cybersecurity tests

Open-weight model GLM-5.2 reached results comparable to selected Claude configurations in two security evaluations. Researchers caution that the finding applies only to specific tasks.

Author admin

Chinese company Z.ai has released the open-weight GLM-5.2 model, which produced results comparable to some Anthropic Claude configurations in two separate studies of vulnerability discovery and incident investigation. The findings come from evaluations by Graphistry and Semgrep. They apply to specific task sets and do not establish overall parity between the models.

What the Graphistry test found

On June 23, Graphistry reported that OpenCode with GLM-5.2 solved 28 of 59 tasks in the private CyBT-CTF benchmark. Claude Opus 4.7 and 4.8 achieved the same result in comparable configurations. The researchers estimated that Claude completed the work 19% faster but cost more than 2.2 times as much for the same number of solved tasks.

However, the strongest Louie and Opus configuration solved 35 of 59 tasks, compared with 28 for OpenCode and GLM-5.2. The authors stressed that the agent harness, tools and prompting setup may have a larger effect on the outcome than the choice between these two models. They present the result as a starting point, not a universal ranking.
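The arithmetic behind these figures can be sketched in a few lines. This is an illustrative calculation only: the task counts come from the Graphistry results quoted above, while the normalized cost unit and all variable names are assumptions introduced here. Graphistry's actual cost data is not reproduced in this article.

```python
# Task counts as quoted from the Graphistry report above.
TOTAL_TASKS = 59
solved = {
    "OpenCode + GLM-5.2": 28,
    "Louie + Opus (strongest configuration)": 35,
}


def solve_rate(solved_tasks: int, total_tasks: int = TOTAL_TASKS) -> float:
    """Return the fraction of benchmark tasks that were solved."""
    return solved_tasks / total_tasks


rates = {name: solve_rate(count) for name, count in solved.items()}

# Assumption: the GLM-5.2 run cost is normalized to 1.0 (hypothetical unit).
# The article says only that Claude cost "more than 2.2 times as much",
# so 2.2 is treated as a lower bound, not an exact figure.
GLM_COST = 1.0
CLAUDE_COST_LOWER_BOUND = 2.2 * GLM_COST
MATCHED_SOLVED = 28  # both matched configurations solved 28 tasks

# With equal solved counts, the per-task cost ratio equals the total cost ratio.
cost_ratio_per_solved = (CLAUDE_COST_LOWER_BOUND / MATCHED_SOLVED) / (
    GLM_COST / MATCHED_SOLVED
)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} of tasks solved")
print(f"Claude cost per solved task: at least {cost_ratio_per_solved:.1f}x GLM-5.2")
```

The solve rates come out at roughly 47% and 59%. The seven-task gap between the two harnesses is the difference the authors attribute largely to tooling and prompting, not to the underlying model.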

A separate Semgrep evaluation

Semgrep separately evaluated how well models could find insecure direct object reference, or IDOR, vulnerabilities caused by insufficient access-control checks. GLM-5.2, running with a minimal harness, reached a 39% F1 score and ranked above the Claude Code configurations included in the test.

Semgrep’s purpose-built Multimodal system performed better, reaching 61% with GPT-5.5 and 53% with Opus 4.8. The researchers explicitly limited their conclusion: this was one task, one dataset and one run. They said the order could change when models are tested against another vulnerability class.
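For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision (the share of reported findings that are real) and recall (the share of real vulnerabilities that were found). The sketch below shows the standard calculation. The counts are hypothetical and chosen only for illustration; they are not Semgrep's data, which this article does not break down.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Compute F1 as the harmonic mean of precision and recall.

    Returns 0.0 when precision and recall are both zero or undefined.
    """
    reported = true_pos + false_pos   # findings the model flagged
    actual = true_pos + false_neg     # vulnerabilities that really exist
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the Semgrep evaluation:
# 30 real IDORs found, 20 false alarms, 30 real IDORs missed.
example_f1 = f1_score(true_pos=30, false_pos=20, false_neg=30)
print(f"Example F1: {example_f1:.1%}")
```

Because F1 penalizes both false alarms and missed vulnerabilities, two systems can reach the same score with very different error profiles. That is one more reason a single F1 figure from one dataset should be read cautiously.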

What is known about GLM-5.2

According to Z.ai’s official announcement, GLM-5.2 was introduced on June 16, 2026. It is aimed at long-horizon tasks and coding, and supports a context window of up to one million tokens. Its Hugging Face model card lists 753 billion parameters and an MIT license.

Publishing the weights allows users to download the model, run it on their own infrastructure and adapt it for particular tasks. Open weights do not mean that the developer has disclosed the training data or the full development pipeline. Cifrum.kz previously published a guide to running models locally with Ollama, which explains the practical side of this approach.

Why overall parity with Mythos has not been established

The original Perplexity overview describes the result as comparable to Anthropic’s Mythos. However, the numerical comparisons published by Graphistry and Semgrep primarily involve Claude Opus 4.7 and 4.8. Mythos does not appear as a separate, directly comparable model in the reported tables.

The narrower, supportable conclusion is that GLM-5.2 reached Claude Opus-level performance in the two specific cybersecurity evaluations described here. These tests do not measure general reasoning across all domains, reliability in every kind of coding task or the ability to handle any security problem.

Why the result is drawing attention

Axios notes that stronger open-weight models can lower the cost of tools for defenders and enable local use without provider oversight. That creates new options for security teams while also raising concerns about potential misuse.

The news comes amid a broader debate over access to frontier systems. Cifrum.kz previously reported on the Anthropic CEO’s call for G7 countries to coordinate AI policy. The GLM-5.2 findings show that model evaluations increasingly depend not only on where a system was developed or how it is licensed, but also on the task, methodology and agent environment.

Sources: Z.ai, Hugging Face, Graphistry, Semgrep, Axios.

The image was generated with artificial intelligence for Cifrum.kz and is illustrative. It does not show a real interface or test results.
