    A Step-by-Step Guide to Data Quality Diagnostics

    Is your data quality good enough? Find out in just a few clicks. DataClinic: The easiest way to inspect your AI training data.
    Pebbly
    May 15, 2026
    Contents
    • What Does Data Quality Diagnostics Provide?
    • Requesting a Data Quality Diagnostic
    • 1️⃣ Click [Request Diagnostic]
    • 2️⃣ Review the Process and Click [Continue]
    • 3️⃣ Check Your Available Credits and Click [Proceed]
    • 4️⃣ Name Your Dataset
    • 5️⃣ Organize Your Dataset Folder Structure
    • 6️⃣ Compress and Upload
    • 7️⃣ Confirm Estimated Credit Usage
    • 8️⃣ Final Review and Click [Continue]
    • If you've made it this far, your diagnostic request is complete! 🎉

    When AI performance falls short of expectations, the bottleneck is often the data rather than the model. If your dataset is imbalanced, redundant, or mislabeled, even the most advanced model is likely to produce unstable results.

    DataClinic is a data quality diagnostic service that detects these hidden weaknesses before training and provides a quantitative, objective evaluation.

    What Does Data Quality Diagnostics Provide?

    A DataClinic diagnostic is more than just a simple statistical report.

    It meticulously evaluates whether your training data is truly AI-Ready across the following dimensions:

    • Is the data volume sufficient?

    • Is the class distribution balanced?

    • Are there redundant images or corrupted files?

    • Is the label structure optimized for training?

    • Is the train/test split configured correctly?

    Our automated diagnostic engine analyzes all of these factors and delivers actionable, human-readable results.
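Two of the questions above, class balance and redundant files, can be roughly pre-checked on your own machine before requesting a diagnostic. The sketch below is not DataClinic's engine; it is a minimal local approximation that assumes one subfolder per class, catches only byte-identical duplicates (not near-duplicates), and uses hypothetical helper names (`class_counts`, `imbalance_ratio`, `exact_duplicates`).

```python
import hashlib
from collections import Counter, defaultdict
from pathlib import Path

# Extensions listed in this guide; matching them case-insensitively is an assumption.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def class_counts(split_dir):
    """Count image files in each class subfolder of one split (e.g. 'train')."""
    counts = Counter()
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTS
            )
    return counts


def imbalance_ratio(counts):
    """Largest class size divided by the smallest; 1.0 means perfectly balanced."""
    sizes = list(counts.values())
    if not sizes or min(sizes) == 0:
        return float("inf")
    return max(sizes) / min(sizes)


def exact_duplicates(split_dir):
    """Group image files whose contents are byte-for-byte identical (SHA-256)."""
    by_hash = defaultdict(list)
    for f in sorted(Path(split_dir).rglob("*")):
        if f.is_file() and f.suffix.lower() in IMAGE_EXTS:
            digest = hashlib.sha256(f.read_bytes()).hexdigest()
            by_hash[digest].append(f)
    return [group for group in by_hash.values() if len(group) > 1]
```

A high ratio or a non-empty duplicate list is only a hint that the full diagnostic is worth running; what counts as "too imbalanced" depends on your task.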

    Ready to check the health of your data? Here is a step-by-step guide on how to diagnose your data using DataClinic!

    Currently, diagnostics are available exclusively for image data. We plan to expand support to multimodal datasets, including performance charts, video, and sensor data, in the future.

    Requesting a Data Quality Diagnostic

    1️⃣ Click [Request Diagnostic]

    When you land on the DataClinic diagnostic page, the most prominent button you will see is [Request Diagnostic].

    Data Quality Diagnosis Page

    “Is it safe to just click it?” Yes! No credits are deducted at this stage; you are only confirming the configuration, so feel free to click.

    2️⃣ Review the Process and Click [Continue]

    Data quality diagnosis process

    The next screen provides an at-a-glance overview of the DataClinic diagnostic workflow:

    • Level I · II · III Diagnostics

    • Comprehensive Evaluation & Improvement Suggestions

    Take a quick look to understand the process, and then click the [Continue] button.

    3️⃣ Check Your Available Credits and Click [Proceed]

    Data Quality Diagnosis Credit

    A pop-up displays two things:

    ✔ Your currently available diagnostic credits

    ✔ How many credits will be consumed for this specific diagnostic

    • "Enough credits?" 👉 Click [Proceed]

    • "Not enough?" 👉 Recharge your credits and then proceed

    Click [Proceed] to move to the next step.

    4️⃣ Name Your Dataset

    Now it's time for your data.

    Write the diagnostic dataset name

    In this step, simply assign a name to the dataset you wish to diagnose.

    ✔ Quick naming tips:

    • Alphanumeric combinations recommended: e.g., AnimalFaceDataset_1

    • Underscores (_) and hyphens (-) are allowed.

    • Other special characters are not permitted.

    • Names can be edited later in [My Page].

    📌 Since your final diagnostic report will be saved under this name, we recommend choosing something easily identifiable.
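If you generate dataset names in a script, the naming tips above can be checked before you type the name into the form. This is a minimal sketch with a hypothetical helper, `is_valid_dataset_name`; it assumes "alphanumeric" means ASCII letters and digits, and it enforces no length limit because this guide does not state one.

```python
import re

# Letters, digits, underscores, and hyphens only, per the naming tips above.
# ASCII-only matching and the absence of a length limit are assumptions.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_dataset_name(name):
    """Return True if the whole name consists of the allowed characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_dataset_name("AnimalFaceDataset_1"))  # allowed characters only
print(is_valid_dataset_name("Animal Face!"))         # space and '!' are rejected
```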

    5️⃣ Organize Your Dataset Folder Structure

    Upload data for quality diagnosis

    For this step, you just need to match our baseline format. Please refer to the folder structure guidelines below:

    Dataset form for diagnosis application

    ✔ Supported image extensions: .jpg, .png, .jpeg

    ✔ Both 'train' and 'test' folders are required.

    ✔ Class (label) names must be identical across both folders.

    To ensure DataClinic can accurately analyze your data, it is crucial to strictly adhere to this basic structure!

    1. 'train' folder (Mandatory diagnostic data): This folder contains your training images. Inside the 'train' folder, create one subfolder per class (label), and place the corresponding image files (.jpg, .png, .jpeg) inside them.

    2. 'test' folder (Reference for quality improvement): While not directly used for the primary diagnostic, this folder is analyzed to provide holistic data quality improvement strategies. Create class-specific subfolders and organize the images exactly as you did in the 'train' folder.
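The folder rules above can be checked locally before compressing. The sketch below is an illustration, not DataClinic's validator: `check_structure` is a hypothetical helper, the class names in the comment are placeholders, and treating uppercase extensions such as `.JPG` as supported is an assumption.

```python
from pathlib import Path

# Expected layout described in this guide (class names are placeholders):
#   my_dataset/
#     train/cat/*.jpg   train/dog/*.png
#     test/cat/*.jpg    test/dog/*.png
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def check_structure(root):
    """Return a list of human-readable problems; an empty list means no issue found."""
    root = Path(root)
    problems = []
    classes = {}
    for split in ("train", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split}")
            continue
        class_dirs = sorted(d for d in split_dir.iterdir() if d.is_dir())
        classes[split] = {d.name for d in class_dirs}
        for class_dir in class_dirs:
            for f in sorted(class_dir.iterdir()):
                if f.is_file() and f.suffix.lower() not in IMAGE_EXTS:
                    rel = f.relative_to(root).as_posix()
                    problems.append(f"unsupported file: {rel}")
    # Class names must be identical across both folders.
    if len(classes) == 2 and classes["train"] != classes["test"]:
        diff = sorted(classes["train"] ^ classes["test"])
        problems.append(f"class folders differ between train and test: {diff}")
    return problems
```

Calling `check_structure("my_dataset")` and fixing everything it reports should leave the dataset in the shape this step describes; the downloadable sample data remains the authoritative reference.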

    6️⃣ Compress and Upload

    Once your folders are organized, you are ready to upload.

    Quality diagnosis data upload complete

    • File Compression: Compress both the 'train' and 'test' folders into a single .zip file.

    • Upload: Drag and drop the .zip file into the upload window, or click the [Upload File] button. Uploads of up to 1TB are supported.

    • Continue: Once everything is ready, click the [Continue] button.

    💡 Pro Tip! Confused about the data format? Click 'Download Sample Data' on the screen to review the template beforehand. It will make the process much clearer.
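The compression step can also be scripted. The sketch below uses Python's standard `zipfile` module to pack both folders into a single archive; `zip_dataset` is a hypothetical helper, and placing 'train' and 'test' at the top level of the archive is an assumption that the sample data should confirm.

```python
import zipfile
from pathlib import Path


def zip_dataset(root, zip_path):
    """Write root/train and root/test into one .zip, keeping relative paths."""
    root = Path(root)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for split in ("train", "test"):
            for f in sorted((root / split).rglob("*")):
                if f.is_file():
                    # Store entries as 'train/<class>/<file>' or 'test/<class>/<file>'.
                    zf.write(f, f.relative_to(root).as_posix())
    return Path(zip_path)
```

For example, `zip_dataset("my_dataset", "my_dataset.zip")` produces an archive you can drag into the upload window. JPEG and PNG files are already compressed, so swapping `zipfile.ZIP_DEFLATED` for `zipfile.ZIP_STORED` is usually faster with little size penalty.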

    7️⃣ Confirm Estimated Credit Usage

    Information on Available Diagnostic Credits

    This step outlines exactly how many images will be diagnosed and how many credits will be consumed. If everything looks good 👉 Click [Proceed].

    8️⃣ Final Review and Click [Continue]

    Final application for data quality diagnosis

    This is the final step. From here, DataClinic takes over and runs the diagnostics automatically!

    If you've made it this far, your diagnostic request is complete! 🎉

    Data quality diagnosis application completed

    To recap, requesting a DataClinic diagnostic essentially boils down to 3 main phases. Much simpler than it sounds, right?

    1️⃣ Name your dataset

    2️⃣ Organize your folder structure

    3️⃣ Compress and upload

    If your AI performance is unstable, inspect your data first. Models don't lie. They only perform as well as the data you feed them.

    Diagnose the health of your data right now with DataClinic. It is the first step to take before any AI training begins.

    Pebblous
