
AI-Driven Testing: Evolution from Assistance to Autonomy — Youzan, Ctrip & China Unicom Case Study

Explore how Youzan, Ctrip, and China Unicom are revolutionizing QA. Learn about the shift from AI-assisted to autonomous testing, featuring AITest, AI Tester, and intelligent computing network platforms. Discover key efficiency metrics and implementation strategies.

I. Introduction: The New Era of Intelligent Testing

With the breakthroughs of large model technology in natural language understanding and multi-modal processing, software testing is bidding farewell to the traditional model of "human-led + code-assisted" and entering a new stage of "AI-driven intelligent testing."

Whether in general software testing, dedicated intelligent computing networks, or enterprise-level automated case generation, AI has become the core force in solving industry pain points such as low testing efficiency, insufficient coverage, and difficulty adapting to complex scenarios. This article integrates the practices of Youzan AITest, China Unicom's AI Intelligent Computing Network, and Ctrip's "AI Tester" to reveal the internal logic of the leap from "auxiliary tool" to "autonomous driving."

II. Core Value: Reconstruction of Efficiency, Quality, and Cost

The essence of AI implementation in testing is to reconstruct the entire process to achieve three-dimensional optimization.

1. Youzan AITest: From "People+Code" to "AI-Intelligence"

Youzan AITest focuses on building a "low threshold, high coverage, and strong adaptability" system.

  • Process Reconstruction: Traditional manual models (writing, debugging, execution, and reporting) are replaced by: Requirement Input → AI Analysis & Marking → AI Case Generation → AI Assembly of Execution Sets → Automatic Report Output.

  • Key Advantage: It only retains manual intervention nodes in complex scenarios, greatly reducing repetitive work. It also improves stability through "AI coverage of abnormal scenarios."
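The pipeline above can be sketched as a chain of stages. This is an illustrative mock, not Youzan's actual API: the function names, the keyword-based "analysis," and the case shapes are all assumptions standing in for real LLM calls.

```python
# Hypothetical sketch of the AITest-style pipeline: requirement input ->
# analysis & marking -> case generation -> execution-set assembly -> report.
# All names and data shapes here are illustrative assumptions.

def analyze_requirement(requirement: str) -> dict:
    """AI analysis & marking: tag the requirement with scenario labels.
    A real system would call an LLM; keyword matching is a stand-in."""
    labels = [w for w in ("login", "payment", "refund") if w in requirement.lower()]
    return {"text": requirement, "labels": labels or ["general"]}

def generate_cases(marked: dict) -> list[dict]:
    """AI case generation: one normal and one abnormal case per label,
    reflecting the 'AI coverage of abnormal scenarios' idea."""
    cases = []
    for label in marked["labels"]:
        cases.append({"label": label, "type": "normal"})
        cases.append({"label": label, "type": "abnormal"})
    return cases

def assemble_execution_set(cases: list[dict]) -> list[dict]:
    """AI assembly of the execution set: run normal cases before abnormal ones."""
    return sorted(cases, key=lambda c: c["type"] != "normal")

def run_and_report(execution_set: list[dict]) -> str:
    """Automatic report output."""
    return f"{len(execution_set)} cases executed"

marked = analyze_requirement("User login and payment flow")
report = run_and_report(assemble_execution_set(generate_cases(marked)))
```

Only complex scenarios would break out of this chain into a manual-intervention node, which is the retained human role the article describes.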

2. Ctrip "AI Tester": High-Speed Generation from PRD

To solve the pain points of "manpower intensity, insufficient coverage, and poor consistency," Ctrip created a one-click generation system from PRD to test cases.

  • Implementation Results: Within 2 weeks of launch, 30+ requirements and 1500+ test cases were generated.

  • Performance Metrics:

    • Adoption Rate: 89.3%

    • Coverage Rate: 80.6%

    • Efficiency: Case design time for small/medium requirements reduced by 70%, and for large requirements by 50%.

  • Impact: Frees testers from basic work to focus on high-value exploratory testing.
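The adoption and coverage figures above are ratios that can be reproduced from raw counts. Only the 1500+ generated-case figure comes from the article; the other counts below are hypothetical values chosen to match the reported percentages.

```python
# Illustrative computation of the cited metrics. "generated" is from the
# article; "adopted", "covered_points", and "total_points" are hypothetical.
generated = 1500        # AI-generated test cases
adopted = 1340          # hypothetical: cases testers accepted without rework
covered_points = 806    # hypothetical: requirement points hit by the cases
total_points = 1000     # hypothetical: total requirement points

adoption_rate = adopted / generated          # ~89.3%
coverage_rate = covered_points / total_points  # 80.6%
print(f"adoption {adoption_rate:.1%}, coverage {coverage_rate:.1%}")
```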

3. China Unicom: Domestic 400G Intelligent Computing Network Platform

Facing ultra-large-scale clusters and the difficulty of decoupling network and compute resources, China Unicom and Xinertai built a specialized platform.

  • Core Value: Overcoming testing bottlenecks in ultra-large-scale clusters.

  • Technical Loop: Through distributed architecture and multi-vendor compatibility, it realizes a full "Management-Execution-Analysis" closed loop, solving the management problems of complex networking in intelligent computing.

III. Core Challenges and Breakthrough Ideas

Implementing AI testing is not an overnight achievement. The three cases highlight recurring barriers and a shared breakthrough strategy:

1. Technical Limitations of Models

  • The Hallucination Problem: Large Language Models (LLMs) are probabilistic and may output plausible-looking but erroneous information.

  • Performance Bottlenecks: High response latency and insufficient throughput in multi-modal scenarios.

  • Engineering Misunderstandings: Companies often fall into the trap of "equating AI with chatbots" or "directly connecting large models," resulting in "demo-level" products.
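One common guard against the hallucination problem above is to validate generated output structurally before accepting it, so that plausible-looking but unusable cases are rejected automatically. The schema and field names below are assumptions for illustration, not any of the three companies' actual formats.

```python
# Minimal sketch: reject LLM-generated test cases that fail a structural
# check. REQUIRED_FIELDS and the case shape are illustrative assumptions.

REQUIRED_FIELDS = {"title", "steps", "expected"}

def validate_case(case: dict) -> bool:
    """Accept a generated case only if it has all required fields
    and non-empty steps and expected result."""
    if not REQUIRED_FIELDS.issubset(case):
        return False
    return bool(case["steps"]) and bool(case["expected"])

good = {"title": "login ok",
        "steps": ["open app", "enter credentials"],
        "expected": "home page shown"}
bad = {"title": "looks plausible", "steps": []}  # hallucinated, incomplete

accepted = [c for c in (good, bad) if validate_case(c)]
```

Checks like this do not prove the case is *correct*, but they filter out the cheapest class of hallucinations before a human reviewer sees them.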

2. The Trust and Standards Gap

  • Confidence Crisis: A small mistake by AI can destroy the trust of senior testers who require high certainty.

  • Lack of Standards: Uncertainty regarding when AI should intervene and how to correct its output leads to low human-machine collaboration efficiency.

3. Breakthrough Strategy: Technical Adaptation to Scenarios

  • Injection of Domain Knowledge:

    • Ctrip: Integrated "boundary value analysis" and "equivalence class division" into the model.

    • Youzan: Accumulated a test case knowledge base.

    • China Unicom: Developed generation logic based on intelligent computing network characteristics.

  • Prompt Engineering: Ctrip uses a four-step process: Structured PRD Parsing → Requirement Extraction → Scenario Generation → Structured Output.

  • Data Closed Loop: Evaluating effects via "accuracy" and "adoption rates," then feeding manual corrections back into training data for continuous optimization.

  • Architectural Innovation: China Unicom adopted a distributed architecture with nodes deployed on multiple devices to solve "large-scale, multi-vendor" hardware testing problems.
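The classic techniques Ctrip reportedly injected into its model are well defined and easy to sketch. Below is a minimal, generic illustration of equivalence class partitioning and boundary value analysis for a numeric input range; the [1, 100] range is an assumption for the example, not Ctrip's code.

```python
# Sketch of the two injected domain techniques for an input constrained
# to an integer range [lo, hi]. The range used below is illustrative.

def equivalence_classes(lo: int, hi: int) -> dict:
    """Equivalence class partitioning: one representative value per class
    (below the range, inside it, above it)."""
    return {
        "invalid_low": lo - 1,
        "valid": (lo + hi) // 2,
        "invalid_high": hi + 1,
    }

def boundary_values(lo: int, hi: int) -> list[int]:
    """Boundary value analysis: values at and adjacent to each boundary."""
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]

classes = equivalence_classes(1, 100)
bounds = boundary_values(1, 100)
```

Encoding rules like these as part of the generation logic is what lets the model produce cases a trained tester would recognize, instead of generic happy-path coverage.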

IV. Future Evolution: The Three-Stage Leap

According to the planning of the three cases, AI testing will follow this path:

  • Stage 1: Assisted (Human-led): AI solves single-point problems (e.g., Ctrip’s case generation). Humans are responsible for key decisions and quality control. Focus: Efficiency and trust-building.

  • Stage 2: Supervised (AI-led): AI takes over most tasks. Relying on breakthroughs in knowledge graphs and Multi-Agent collaboration, AI independently understands requirements and performs verification. Humans only intervene in complex cross-system logic. (Youzan AITest's current goal).

  • Stage 3: Autonomous (Full Process Loop): Humans only intervene in extreme scenarios. AI autonomously identifies demand changes, generates plans, performs verification, and repairs simple defects. (Ctrip’s "Multi-Agent architecture" plan).
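The Stage-3 idea of Multi-Agent collaboration with a human escalation path can be sketched as a toy hand-off chain. The agent roles, the risk rule, and the escalation condition below are all assumptions for illustration; none of this reflects Ctrip's actual architecture.

```python
# Toy sketch of a multi-agent loop: a requirement agent assesses a change,
# a case agent derives verification cases, and a repair agent either handles
# the defect autonomously or escalates to a human for extreme scenarios.
# Roles and the risk heuristic are illustrative assumptions.

def requirement_agent(change: str) -> dict:
    """Autonomously identify a demand change and assess its risk."""
    return {"change": change, "risk": "high" if "payment" in change else "low"}

def case_agent(plan: dict) -> dict:
    """Generate verification cases for the change."""
    plan["cases"] = [f"verify {plan['change']}"]
    return plan

def repair_agent(plan: dict) -> dict:
    """Repair simple (low-risk) defects; escalate everything else."""
    plan["handled_by"] = "ai" if plan["risk"] == "low" else "human"
    return plan

result = repair_agent(case_agent(requirement_agent("payment flow update")))
```

Even in this toy form, the key Stage-3 property is visible: the human appears only on the escalation branch, not in the main loop.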

V. Conclusion: Amplifying Human Capabilities

The core value of AI testing is not to replace people, but to amplify them. By using AI to handle repetitive and exploratory work, testers can focus on high-value tasks like complex logic design and risk assessment. The key to success lies in "Engineering Thinking + Human-Machine Collaboration + Scenario-based Adaptation."

