Check SSD and HDD Health in Linux with smartctl

Posted on 20 January 2026

Hard drive failures are one of the most common causes of unexpected data loss, downtime, and costly interruptions. According to real-world operational statistics, annual failure rates can exceed 5%* for certain drive capacities and models — especially under heavy or continuous load.

*The AFR value depends on the model, manufacturing year, and workload. The figure is based on Backblaze’s public statistics and is not universal for all HDD/SSD devices.

*Backblaze: Annualized Failure Rates by Drive Size (2021–2023)*

⚠️ Note: For SSDs, the AFR can vary significantly.

To catch failing disks early, Linux offers S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) diagnostics through the smartmontools package. In this guide, you’ll learn how to use smartctl and smartd to analyze both HDD and SSD reliability, catch early signs of failure, and automate routine checks.


What Is S.M.A.R.T. and How smartmontools Helps

S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) is a monitoring system embedded in modern HDDs and SSDs. It tracks internal attributes such as temperature, wear level, or error rates to predict drive failure.

smartmontools provides two core tools:

smartctl: A command-line tool for querying S.M.A.R.T. data and running self-tests.
smartd: A daemon that schedules automatic checks and alerts.

These tools support SATA, NVMe, and USB-connected drives. Note that for some USB-SATA adapters, you may need to manually specify the device type using the -d sat option, as S.M.A.R.T. data is not always passed through correctly.
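The fallback described above can be wrapped in a small helper. This is a minimal sketch, not part of smartmontools: the function name smart_info is made up for illustration, and it assumes that any non-zero exit status from the first attempt is worth a retry with -d sat.

```shell
#!/usr/bin/env bash
# Print drive identity info. If default detection fails (common behind
# USB-SATA bridges), retry once with an explicit "-d sat" device type.
# smart_info is a hypothetical helper name; run it as root.
smart_info() {
    local dev="$1" out
    if out=$(smartctl -i "$dev" 2>&1); then
        printf '%s\n' "$out"
        return 0
    fi
    echo "Default detection failed for $dev; retrying with -d sat" >&2
    smartctl -d sat -i "$dev"
}
```

Call it as smart_info /dev/sdb. If the retry also fails, the adapter may not pass S.M.A.R.T. commands through at all.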

Installing smartmontools: Quick Start on Any Linux Distro

⚠️ Note: On virtual machines, S.M.A.R.T. availability depends on the hypervisor and the configuration of the host machine.

Debian/Ubuntu:

sudo apt update

sudo apt install smartmontools

RHEL/CentOS:

sudo yum install smartmontools

Arch Linux:

sudo pacman -S smartmontools

Verify installation:

smartctl --version
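If you manage several distributions, the three install commands above can be selected automatically. The sketch below only prints the command instead of running it; the function names install_cmd and detect_pm are illustrative, and it covers only the package managers mentioned in this guide.

```shell
#!/usr/bin/env bash
# Map a package manager name to the install command from this guide.
install_cmd() {
    case "$1" in
        apt)    echo "sudo apt update && sudo apt install smartmontools" ;;
        yum)    echo "sudo yum install smartmontools" ;;
        pacman) echo "sudo pacman -S smartmontools" ;;
        *)      echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

# Print the first of apt, yum, pacman that exists on this system.
detect_pm() {
    for pm in apt yum pacman; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}
```

Running install_cmd "$(detect_pm)" prints the matching command so you can review it before executing it.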

⚠️ Note: On virtual machines (VPS), S.M.A.R.T. may be inaccessible due to abstracted hardware. In KVM, SMART passthrough is possible if the hosting provider allows it, but in OpenVZ/LXC containers SMART is generally unavailable.

Essential smartctl Commands for Drive Health Checks

Command	Description
sudo smartctl -i /dev/sda	Basic drive info (model, serial number, SMART support)
sudo smartctl -H /dev/sda	Quick health check summary (“PASSED” or “FAILED”). Note: This check is often insufficient; use -A for a full analysis.
sudo smartctl -A /dev/sda	Full list of SMART attributes
sudo smartctl -t short /dev/sda	Initiate short self-test (2–5 minutes)
sudo smartctl -t long /dev/sda	Start extended self-test (tens of minutes to many hours on large drives; smartctl -c shows the drive’s own estimate)
sudo smartctl -l selftest /dev/sda	View past test results and history
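For scripting, the verdict line from smartctl -H can be reduced to a single word. The following is a sketch under the assumption that the output contains either the ATA/NVMe line "SMART overall-health self-assessment test result: ..." or the SCSI line "SMART Health Status: ..."; the function name health_verdict is made up.

```shell
#!/usr/bin/env bash
# Read `smartctl -H` output on stdin and print PASSED, FAILED, or UNKNOWN.
health_verdict() {
    awk '
        /self-assessment test result:/ { verdict = ($NF == "PASSED") ? "PASSED" : "FAILED" }
        /SMART Health Status:/         { verdict = ($NF == "OK") ? "PASSED" : "FAILED" }
        END { print (verdict == "" ? "UNKNOWN" : verdict) }
    '
}
```

Usage: sudo smartctl -H /dev/sda | health_verdict. Treat UNKNOWN as a reason to look at the full output rather than as a pass.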

S.M.A.R.T. Attributes Explained: Key Metrics for HDD and SSD

S.M.A.R.T. attributes vary by device, but key metrics are consistent. Here’s a breakdown of important ones for SSDs and HDDs:

Common S.M.A.R.T. Attributes (HDD Focused)

Attribute	Meaning	Normal Range	Action if abnormal
Reallocated_Sector_Ct	Remapped bad sectors	0	Replace disk if growing
Current_Pending_Sector	Sectors awaiting re-check	0	Backup immediately
Offline_Uncorrectable	Unfixable errors	0	Data may be lost
Temperature_Celsius	Drive temperature	30–50°C	Ensure cooling if >55°C
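The three sector counters in the table are easy to watch automatically. This sketch assumes the standard ATA attribute table printed by smartctl -A, where the attribute name is the second column and the raw value is the tenth; the function name check_sector_attrs is illustrative.

```shell
#!/usr/bin/env bash
# Read `smartctl -A` output (ATA attribute table) on stdin and print each
# sector-health attribute whose raw value is above zero.
# Exit status: 0 if all are zero or absent, 1 if any is non-zero.
check_sector_attrs() {
    awk '
        $2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable" {
            if ($10 + 0 > 0) { printf "%s raw=%s\n", $2, $10; bad = 1 }
        }
        END { exit bad }
    '
}
```

Usage: sudo smartctl -A /dev/sda | check_sector_attrs || echo "back up this drive". A single non-zero reading matters less than a value that keeps growing between checks.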

Critical S.M.A.R.T. Metrics for SSD Longevity

Attribute	Description	Normal Behavior
Percentage Used	Estimated share of rated endurance consumed (0 = fresh)	Approaching 100% signals EOL
Available Spare	Remaining spare blocks	Should stay above the Available Spare Threshold
Available Spare Threshold	Warning limit for spares	If met, drive is near failure
Wear Leveling Count	Cycles of NAND erase/program operations	Monitor the trend/rate of increase rather than the raw value; interpretation varies by vendor.
Power-On Hours	Total hours powered	Useful for lifecycle planning
Power Cycle Count	On/off events	Monitors hardware stress
Unsafe Shutdowns	Sudden power-offs	Frequent = file system risk
Media/Data Integrity Errors	ECC correction failures	Backup data immediately

⚠️ Note: Different manufacturers calculate “Percentage Used” differently. The NVMe specification defines it as a vendor-specific estimate of consumed endurance, so Intel and Samsung drives, for example, may arrive at it with different algorithms.
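The wear and spare-capacity rules from the table can be expressed as a simple check. This is a sketch that assumes "Name: value%" lines such as those smartctl -A prints for NVMe drives; the function name check_ssd_wear and the default 90% wear limit are arbitrary example choices, not vendor guidance.

```shell
#!/usr/bin/env bash
# Read NVMe health lines on stdin (e.g. "Percentage Used: 4%") and warn when
# wear reaches a limit or spare capacity falls to its threshold.
# $1 = wear limit in percent (defaults to 90, an example value).
check_ssd_wear() {
    awk -F':' -v limit="${1:-90}" '
        NF >= 2 {
            val = $2; gsub(/[ \t%]/, "", val)
            if ($1 == "Percentage Used")           { used = val + 0 }
            if ($1 == "Available Spare")           { spare = val + 0; have_spare = 1 }
            if ($1 == "Available Spare Threshold") { thresh = val + 0; have_thresh = 1 }
        }
        END {
            if (used >= limit + 0) { print "WARN: Percentage Used is " used "%"; bad = 1 }
            if (have_spare && have_thresh && spare <= thresh) {
                print "WARN: Available Spare " spare "% is at or below threshold " thresh "%"; bad = 1
            }
            exit bad
        }
    '
}
```

Usage: sudo smartctl -A /dev/nvme0 | check_ssd_wear 80. The function exits non-zero when it prints a warning, so it can gate a cron alert.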

Automate S.M.A.R.T. Monitoring with smartd Daemon

Edit the smartd configuration:

sudo nano /etc/smartd.conf

Example:

DEVICESCAN -a -o on -S on -s (S/../.././02|L/../../7/03) -m root -M exec /usr/share/smartmontools/smartd-runner

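⚠️ Note: The path for the script used with ‘-M exec’ depends on the distribution; /usr/share/smartmontools/smartd-runner is the Debian/Ubuntu location, so verify that the file exists on your system.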

Explanation of flags:

DEVICESCAN — auto-detects all available drives
-a — enables all SMART checks and attribute logging
-o on — enables offline data collection
-S on — enables attribute autosave
-s — schedules tests (S = short daily @ 2 AM, L = long weekly @ 3 AM Sunday)
-m — email recipient for alerts
-M exec — triggers a script for notifications
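The -s argument is a regular expression that smartd compares against a time stamp of the form T/MM/DD/d/HH (test type, month, day, day of week with 1 = Monday and 7 = Sunday, hour). The sketch below lets you try a schedule before putting it into smartd.conf. The function names are made up, and matching the whole stamp with grep -x is an assumption about how smartd applies the pattern.

```shell
#!/usr/bin/env bash
# Return success if the -s regex matches the given stamp (T/MM/DD/d/HH).
schedule_matches() {
    local regex="$1" stamp="$2"
    printf '%s\n' "$stamp" | grep -Eqx -e "$regex"
}

# Build the stamp for the current hour and a test type (S, L, ...).
current_stamp() {
    printf '%s/%s\n' "$1" "$(date +%m/%d/%u/%H)"
}
```

For example, schedule_matches '(S/../.././02|L/../../7/03)' 'S/01/20/2/02' succeeds, because the short test runs at 02:00 on any day, while 'L/01/20/2/03' does not match, because the long test is limited to day 7.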

Enable the service:

sudo systemctl enable smartd

sudo systemctl start smartd

Check status and logs:

sudo systemctl status smartd

sudo journalctl -u smartd

Configure alerting:

Email via sendmail or Postfix
Example smartd.conf line for email:
-m admin@vsys.host -M daily

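⚠️ Note: With -m alone, smartd sends alerts through the system mailer, and -M daily repeats the warning once a day while the problem persists. -M exec replaces the default mail command with your own script. It only takes effect together with -m; the smartd.conf man page describes -m <nomailer> for scripts that need no address. Choose email or a script as the delivery method for a given drive.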

The test schedule format depends on the smartmontools version and may differ across distributions. An installed MTA is required to send notifications.

Webhook Alert via Custom Script:
For sending notifications to platforms like Slack or Microsoft Teams, you must use the -M exec flag in your smartd.conf to execute an external script upon drive failure. The following curl command is an example of the code to be placed INSIDE that custom script to send a JSON-formatted alert to a Webhook URL:

curl -X POST -H "Content-Type: application/json" -d '{"text":"SMART alert triggered"}' https://hooks.slack.com/services/XXX/YYY/ZZZ
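A fuller version of that script might include the affected device in the message. This is a sketch, not a tested integration: the file name, function names, and message wording are invented for illustration, the webhook URL is the placeholder from above, and it relies on the SMARTD_DEVICE, SMARTD_FAILTYPE, and SMARTD_MESSAGE environment variables that the smartd.conf man page lists for -M exec scripts.

```shell
#!/usr/bin/env bash
# Hypothetical /usr/local/bin/smartd-webhook.sh, referenced from smartd.conf
# with: -m <nomailer> -M exec /usr/local/bin/smartd-webhook.sh
# WEBHOOK_URL is a placeholder; replace it with your own endpoint.
WEBHOOK_URL="https://hooks.slack.com/services/XXX/YYY/ZZZ"

# Escape backslashes and double quotes so the text is safe inside JSON.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON body from device, failure type, and message.
build_payload() {
    printf '{"text":"SMART alert on %s (%s): %s"}' \
        "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")"
}

# smartd exports SMARTD_* variables when it runs the script; only send then.
if [ -n "${SMARTD_DEVICE:-}" ]; then
    curl -sS -X POST -H "Content-Type: application/json" \
        -d "$(build_payload "$SMARTD_DEVICE" "${SMARTD_FAILTYPE:-unknown}" "${SMARTD_MESSAGE:-}")" \
        "$WEBHOOK_URL"
fi
```

Make the file executable with chmod +x. Escaping the message before embedding it keeps the JSON valid even if a drive model string contains quotes.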

Other alerting options:

Logs to /var/log/syslog
Custom scripts using -M exec

S.M.A.R.T. Monitoring in Cloud & VPS: What You Can (and Can’t) Do

SMART access often does not work on a VPS, for example under OpenVZ or under KVM with virtio disks. KVM guests can read SMART data only if the provider passes the physical device through.

Common errors:

SMART support is: Unavailable

Alternatives:

Request reports from your hosting provider
Monitor I/O latency and kernel logs (dmesg), and ask whether SMART pass-through can be enabled
Use tools like iostat, nvme-cli, or cloud-native monitoring solutions

Using nvme-cli to Read NVMe SSD Health

For NVMe SSDs, nvme-cli provides additional low-level diagnostics:

sudo nvme smart-log /dev/nvme0

Sample output:

Critical Warning: 0x0
Temperature: 37 Celsius
Percentage Used: 4%
Data Units Read: 7,983,283
Data Units Written: 3,812,913
Power Cycles: 123
Power On Hours: 1,023
Unsafe Shutdowns: 3
Media Errors: 0

Key fields:

Percentage Used — wear level (lower is better)
Temperature — operating temp in °C
Critical Warning — 0 means normal; non-zero = alert
Media Errors — should remain 0 in healthy SSDs

“Percentage Used ≤ 10%” for new NVMe drives is normal and not a cause for concern.
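The two fields that should stay at zero can be checked in a script. This sketch assumes "key: value" lines like the sample output above and normalizes key names so that spellings such as critical_warning are accepted too; the function name check_nvme_flags is hypothetical.

```shell
#!/usr/bin/env bash
# Read "key: value" NVMe health lines on stdin. Fail if Critical Warning is
# non-zero or Media Errors is above zero.
check_nvme_flags() {
    awk -F':' '
        NF >= 2 {
            key = tolower($1); gsub(/[ \t_]/, "", key)
            val = $2;          gsub(/[ \t,]/, "", val)
            if (key == "criticalwarning" && val !~ /^(0x)?0+$/) { print "ALERT: critical warning " val; bad = 1 }
            if (key == "mediaerrors" && val + 0 > 0)            { print "ALERT: media errors " val; bad = 1 }
        }
        END { exit bad }
    '
}
```

Usage: sudo nvme smart-log /dev/nvme0 | check_nvme_flags. Field names differ between tools and versions, so confirm them against your own output first.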

Best Practices for Reliable Disk Monitoring in Linux

Recommendations:

Check SMART attributes monthly or integrate into CI/monitoring stack
Combine with cron or Prometheus exporters
Act immediately if reallocated sectors or pending sectors rise
Replace drives before “Percentage Used” reaches 100% (SSD)
Use RAID or backup systems as safety nets — SMART is not foolproof. SMART does not detect sudden controller failures, so having backups is essential.
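One way to combine cron with Prometheus is to write a metrics file that a textfile-style collector picks up. The sketch below only formats the metric line; the metric name smart_health_passed and the output path in the comment are made-up examples, so adapt them to your exporter.

```shell
#!/usr/bin/env bash
# Emit one metric line in the Prometheus text exposition format.
# $1 = device path, $2 = verdict (PASSED means healthy, anything else is 0).
# The metric name smart_health_passed is a hypothetical example.
emit_health_metric() {
    local dev="$1" verdict="$2" value=0
    if [ "$verdict" = "PASSED" ]; then
        value=1
    fi
    printf 'smart_health_passed{device="%s"} %d\n' "$dev" "$value"
}

# Example cron job body (run as root); the output path is an assumption:
#   for dev in /dev/sda /dev/sdb; do
#       if smartctl -H "$dev" | grep -q PASSED; then v=PASSED; else v=FAILED; fi
#       emit_health_metric "$dev" "$v"
#   done > /var/lib/node_exporter/textfile_collector/smart.prom
```

Alerting on the metric dropping to 0 then becomes a normal rule in your monitoring stack.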

Mini Summary: Disk Monitoring Checklist

Monitor SMART data monthly (or automate)
Use smartd with logging and alerts
Replace drives at the first failure signs (e.g., rising Pending Sectors, “Percentage Used” nearing 100% on SSDs)
Keep backup systems active (RAID, snapshots, offsite)

Rotating smartd Logs Automatically

To prevent log accumulation:

# Delete smartd log files older than 30 days
find /var/log -name 'smartd.log*' -mtime +30 -delete

smartd log files are not always located in /var/log — they are often written to syslog. The command applies only in cases where smartd maintains its own dedicated logs.
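Because -delete is irreversible, it can help to list the matching files first. This is a small sketch with an invented function name, old_smartd_logs, which takes the directory and the age in days as optional arguments.

```shell
#!/usr/bin/env bash
# List (do not delete) smartd log files older than N days.
# $1 = directory (default /var/log), $2 = age in days (default 30).
old_smartd_logs() {
    local dir="${1:-/var/log}" days="${2:-30}"
    find "$dir" -name 'smartd.log*' -mtime +"$days" -print
}
```

If the list is empty, smartd is probably logging to syslog or the journal and there is nothing to clean up here.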

Final Thoughts: Proactive Monitoring Prevents Data Loss

smartmontools empowers Linux administrators to monitor the physical health of disks and anticipate failures before they disrupt services. For both SSDs and HDDs, regular health checks with smartctl, combined with automation via smartd, help maintain system integrity and data reliability.


FAQs

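What is smartctl used for?
It’s a command-line utility for querying and analyzing S.M.A.R.T. data from storage devices.

Can I use smartmontools with SSDs?
Yes, smartctl supports both SSDs and HDDs across SATA and NVMe interfaces.

How often should I check SMART attributes?
At least once per month, or integrate into scheduled monitoring.

What does “Percentage Used” mean on SSDs?
It shows wear level; approaching 100% means the drive is near its end of life.

Why does smartctl say ‘Unavailable’ on my VPS?
Some virtual environments don’t expose hardware monitoring; contact your hosting provider.

Is smartd required to use smartctl?
No, smartctl can be used standalone, but smartd adds automation and alerting features. smartd is a separate daemon from the same package that reads the same S.M.A.R.T. data on a schedule.

Can smartctl fix errors?
No, it only reports health. Use it for diagnostics and planning replacements.

Where is the smartd log located?
Usually in /var/log/syslog or available via journalctl -u smartd.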
