nvme smart-log intermittently corrupt from family of SSDs

Nick Neumann nick at pcpartpicker.com
Thu Sep 22 08:37:29 PDT 2022


I noticed something odd with the SMART data from HP FX900 Pro SSDs of
various sizes. Intermittently, after hours of use, host_write_commands
would be 0 (but data_units_written would not). After experimenting, I'm
seeing that intermittently, when running either "smartctl -x" or "nvme
smart-log", the data coming back is, uh, junk? Here is one such read,
where the fields from num_err_log_entries onward are clearly bogus:

sudo nvme smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning                    : 0
temperature                         : 48 C
available_spare                     : 100%
available_spare_threshold           : 25%
percentage_used                     : 0%
data_units_read                     : 793353
data_units_written                  : 4938926
host_read_commands                  : 18675956
host_write_commands                 : 40381344
controller_busy_time                : 0
power_cycles                        : 9
power_on_hours                      : 15
unsafe_shutdowns                    : 1
media_errors                        : 0
num_err_log_entries                 : 217026295083295649698117473405925064704
Warning Temperature Time            : 1323299686
Critical Composite Temperature Time : 4108885430
Temperature Sensor 1                : 38041 C
Temperature Sensor 2                : 55225 C
Temperature Sensor 3                : 23326 C
Temperature Sensor 4                : 38685 C
Temperature Sensor 5                : 57125 C
Temperature Sensor 6                : 45486 C
Temperature Sensor 7                : 59856 C
Temperature Sensor 8                : 2107 C
Thermal Management T1 Trans Count   : 1702871148
Thermal Management T2 Trans Count   : 312071619
Thermal Management T1 Total Time    : 406823074
Thermal Management T2 Total Time    : 468107429

Even odder, when host_write_commands was 0, the rest of the data
was sane. So the corruption is not always obvious...

Any recommendations or thoughts on dealing with this? Or am I just out
of luck when it comes to relying on anything from these drives' smart
logs?
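
In the meantime, one way to at least catch the bad reads is a plausibility
check on the output before trusting it. The following is only a rough
sketch, not a fix: it assumes the text format shown above ("name : value"
lines), and the function names and thresholds (150 C, 2**64) are
illustrative choices of mine rather than anything from the NVMe spec or
nvme-cli. It flags the two symptoms described in this message: absurd
temperature and error-count values, and host_write_commands reading 0
while data_units_written is non-zero.

```python
import re

# Illustrative thresholds (assumptions, not taken from any specification).
MAX_PLAUSIBLE_TEMP_C = 150
MAX_PLAUSIBLE_COUNTER = 2**64

# Matches lines such as "temperature   : 48 C" or "available_spare : 100%".
# Assumes the plain-text layout shown in this message.
LINE_RE = re.compile(r"^(.+?)\s*:\s*(\d+)\s*(?:C|%)?\s*$")


def parse_smart_log(text):
    """Return {field name: integer value} for every 'name : value' line."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def suspicious_fields(fields):
    """Return a list of human-readable reasons the read looks corrupt."""
    problems = []

    # Composite temperature and "Temperature Sensor N" should be sane.
    for name, value in fields.items():
        if name.lower().startswith("temperature") and value > MAX_PLAUSIBLE_TEMP_C:
            problems.append(f"{name}: {value} C is implausible")

    # A huge error-log count is a heuristic sign of garbage data.
    if fields.get("num_err_log_entries", 0) >= MAX_PLAUSIBLE_COUNTER:
        problems.append("num_err_log_entries is implausibly large")

    # The less obvious case: zero write commands but non-zero data written.
    if (fields.get("host_write_commands") == 0
            and fields.get("data_units_written", 0) > 0):
        problems.append("host_write_commands is 0 but data_units_written is not")

    return problems
```

A caller could feed it the captured output of "nvme smart-log", and re-read
the log or discard the sample whenever suspicious_fields() returns a
non-empty list. It will not catch corruption that happens to land on
plausible-looking values.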



More information about the Linux-nvme mailing list