AI Testing

1. What is AI Testing

AskTable offers an AI testing system designed to systematically validate the performance of large models in data analysis scenarios. This system ensures end-to-end reliability from natural language input to data insights.

Within this framework, the ATS (AskTable Test Set) serves as the core testing dataset, specifically designed to assess AI analysis capabilities. Through a collection of predefined test cases, ATS not only evaluates the accuracy of SQL generation but also verifies whether the entire process—from query to chart rendering and data summarization—meets business expectations.

ATS establishes a standardized testing mechanism, helping enterprises harness the power of large models while effectively controlling the risks of AI applications, ensuring that analytical results align with real-world business needs.

2. Feature Overview

The AI testing system provides the following core features:

Test Set Management: Create, edit, and delete test sets, systematically managing test cases.
Test Case Management: Add, modify, and delete test cases with flexible configuration of test content.
Batch Testing: Run multiple test cases in a single task to validate model performance efficiently.
Accuracy Evaluation: The system executes the generated SQL and the expected SQL and compares their results. This measures how well the model understands the user's question and whether the SQL it generates correctly answers the business query.

With ATS, users can quickly re-verify SQL generation after a system upgrade or model update, ensuring that business queries remain accurate and stable.

3. Relationship Between ATS and Datasource

For each datasource, multiple ATS instances can be created. This one-to-many relationship allows users to build different sets of test cases for various scenarios or business domains based on the same datasource, enabling a comprehensive validation of the system's analysis capabilities on that datasource.

4. Testing Process

Within each ATS, users can create multiple test cases (up to 50 test cases per ATS).

In each test case, users must fill in a natural language question along with its corresponding expected SQL.
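The structure described above can be sketched as a small data model. This is an illustration only, not AskTable's actual schema: the class names `AtsCase` and `AtsTestSet`, the field names, and the `add_case` helper are all hypothetical. Only the 50-case limit, the two required fields, and the link from each ATS to a single datasource come from the text.

```python
from dataclasses import dataclass, field
from typing import List

MAX_CASES_PER_ATS = 50  # limit stated in the documentation


@dataclass
class AtsCase:
    """One test case: a question plus the SQL that defines the right answer."""
    question: str       # natural language question
    expected_sql: str   # SQL whose result serves as the reference


@dataclass
class AtsTestSet:
    """One ATS. A datasource may have many of these (hypothetical model)."""
    name: str
    datasource_id: str  # each ATS belongs to exactly one datasource
    cases: List[AtsCase] = field(default_factory=list)

    def add_case(self, question: str, expected_sql: str) -> AtsCase:
        # Both fields are mandatory in a test case.
        if not question.strip() or not expected_sql.strip():
            raise ValueError("question and expected_sql are both required")
        # Enforce the per-ATS limit.
        if len(self.cases) >= MAX_CASES_PER_ATS:
            raise ValueError(f"an ATS holds at most {MAX_CASES_PER_ATS} test cases")
        case = AtsCase(question, expected_sql)
        self.cases.append(case)
        return case
```

Several `AtsTestSet` objects can share the same `datasource_id`, which mirrors the one-to-many relationship between a datasource and its test sets.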

When a testing task runs, the system generates SQL from each test case's natural language question, executes both the generated SQL and the expected SQL, compares their query results, and reports the overall test pass rate.
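The flow of a testing task can be sketched end to end against an in-memory SQLite database. This is a minimal sketch under stated assumptions, not AskTable's implementation: `run_test_set` and `fake_generate_sql` are hypothetical names, the canned dictionary stands in for the model, and treating a generated query that fails to execute as a failed case is an assumption the documentation does not confirm.

```python
import sqlite3


def run_test_set(conn, cases, generate_sql):
    """Return the pass rate (0.0 to 1.0) for a list of (question, expected_sql)."""
    passed = 0
    for question, expected_sql in cases:
        generated_sql = generate_sql(question)  # stand-in for the model call
        expected_rows = conn.execute(expected_sql).fetchall()
        try:
            generated_rows = conn.execute(generated_sql).fetchall()
        except sqlite3.Error:
            continue  # assumption: unexecutable generated SQL counts as a failure
        # Compare query results, not SQL text.
        if generated_rows == expected_rows:
            passed += 1
    return passed / len(cases) if cases else 0.0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.5)])

cases = [
    ("How many orders are there?", "SELECT COUNT(*) FROM orders"),
    ("What is the total order amount?", "SELECT SUM(amount) FROM orders"),
]

# Canned "model" output: the first query is equivalent, the second is wrong.
canned = {
    "How many orders are there?": "SELECT COUNT(id) AS n FROM orders",
    "What is the total order amount?": "SELECT MAX(amount) FROM orders",
}


def fake_generate_sql(question):
    return canned[question]


pass_rate = run_test_set(conn, cases, fake_generate_sql)
print(pass_rate)  # 0.5
```

The first case passes even though its SQL text differs from the expected SQL, because only the query results are compared.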

5. Frequently Asked Questions

1. How are execution results compared in testing tasks?

The system executes both the generated SQL and the expected SQL, then compares their query results. The comparison checks that the number of rows, the number of columns, and the content of each cell all match. Column names are not compared; only the data values matter. A test case passes when every check succeeds.
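The three checks can be written out as a small function. This is a sketch, assuming each result set is a list of row tuples. The function name `results_match` is hypothetical, and the documentation does not say whether row order matters or how floating-point values are compared. This version compares rows positionally and cells by exact equality.

```python
def results_match(generated_rows, expected_rows):
    """Compare two result sets by shape and cell values only.

    Column names are never passed in, so they cannot affect the outcome.
    """
    # Check 1: same number of rows.
    if len(generated_rows) != len(expected_rows):
        return False
    for generated, expected in zip(generated_rows, expected_rows):
        # Check 2: same number of columns.
        if len(generated) != len(expected):
            return False
        # Check 3: every cell holds the same value.
        if any(g != e for g, e in zip(generated, expected)):
            return False
    return True
```

Under these assumptions, `SELECT COUNT(*) FROM orders` and `SELECT COUNT(id) AS n FROM orders` match whenever they return the same single value, even though their column names differ.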

2. How are test cases executed during a testing task?

The system runs test cases in parallel, two at a time, to shorten the overall testing time.
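A concurrency limit of two can be sketched with a thread pool. The documentation does not say whether AskTable uses threads, processes, or asynchronous tasks, so the thread pool here is an assumption. `run_cases_in_parallel` and `fake_run_case` are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_cases_in_parallel(cases, run_case, max_workers=2):
    """Run run_case on every case, at most max_workers at a time.

    Results come back in the same order as the input cases.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_case, cases))


# Demo: track how many cases are in flight at once.
state = {"active": 0, "peak": 0}
lock = threading.Lock()


def fake_run_case(case_id):
    with lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.05)  # stand-in for SQL generation and execution
    with lock:
        state["active"] -= 1
    return case_id % 2 == 0  # pretend even-numbered cases pass


results = run_cases_in_parallel(range(6), fake_run_case)
```

After the run, `state["peak"]` never exceeds 2, and `results` holds one pass or fail flag per case in input order.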

6. Conclusion

As a core quality assurance framework within the AskTable system, ATS provides a systematic solution to ensure the accuracy of the entire data analysis process.

By establishing standardized testing procedures, users
