[PATCH] NVMe: Log Sense Temperature doesn't handle negative values

Jon Derrick jonathan.derrick at intel.com
Fri Oct 9 13:15:33 PDT 2015


> >To meet the SCSI Command Spec (T10 SPC-4 r37), the log sense temperature cannot
> >be negative.   From the spec:
> >
> >The TEMPERATURE field indicates the temperature of the SCSI target device in
> >degrees Celsius at the time the LOG SENSE command is performed. Temperatures
> >equal to or less than zero degrees Celsius shall cause the TEMPERATURE field
> >to be set to zero. If the device server is unable to detect a valid
> >temperature because of a sensor failure or other condition, then the
> >TEMPERATURE field shall be set to FFh. The temperature should be reported
> >with an accuracy of plus or minus three Celsius degrees while the SCSI target
> >device is operating at a steady state within its environmental limits.
> >
> >With the current code, a value of -2 C will be shown as 254 C since it was not
> >written to handle negative values.  This same issue also exists in
> >nvme_trans_log_info_exceptions.
> 
> [snip]
> 
> >@@ -1097,7 +1101,7 @@ static int nvme_trans_log_temperature(struct nvme_ns *ns, struct sg_io_hdr *hdr,
> > 	dma_addr_t dma_addr;
> > 	void *mem;
> > 	u32 feature_resp;
> >-	u8 temp_c_cur, temp_c_thresh;
> >+	s8 temp_c_cur, temp_c_thresh;
> > 	u16 temp_k;
> > 	log_response = kzalloc(LOG_TEMP_PAGE_LENGTH, GFP_KERNEL);
> >@@ -1129,6 +1133,10 @@ static int nvme_trans_log_temperature(struct nvme_ns *ns, struct sg_io_hdr *hdr,
> > 		temp_k = (smart_log->temperature[1] << 8) +
> > 				(smart_log->temperature[0]);
> > 		temp_c_cur = temp_k - KELVIN_TEMP_FACTOR;
> >+		/* if temp_c_cur is negative,          */
> >+		/* set to 0 to meet Scsi Command Spec  */
> >+		if (temp_c_cur < 0)
> >+			temp_c_cur = 0;
> 
> Interesting, < 0C is colder than I might have expected. What if they
> can get hotter than expected, like 128C? I don't think we want to report
> that as 0.
>

It's unfortunate the spec is written that way. Aerospace silicon is often
rated for a junction temperature range of -55C or -40C up to +125C, plus or
minus a few percent. That's a big negative range that gets lost if it has to
be reported as 0. I would imagine anything over 128C is not meaningful, so we
would want to report that as 128C.
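
As a rough illustration of clamping at both ends, here is a minimal
standalone sketch, not the driver code. It does the subtraction in a plain
int so that a reading such as 401 K (128C) cannot wrap negative in an s8 and
then be zeroed by the "< 0" check. The helper name kelvin_to_scsi_temp, the
TEMP_C_CAP value of 128, and the value 273 for KELVIN_TEMP_FACTOR are
assumptions made for this example only.

```c
#include <stdint.h>

/* Assumed offset between Kelvin and Celsius, whole degrees only. */
#define KELVIN_TEMP_FACTOR 273

/* Hypothetical upper cap, following the 128C suggestion in this reply. */
#define TEMP_C_CAP 128

/*
 * Convert a Kelvin reading to the value placed in the SCSI TEMPERATURE
 * field. The arithmetic is done in int so neither end of the range wraps.
 */
static uint8_t kelvin_to_scsi_temp(uint16_t temp_k)
{
	int temp_c = (int)temp_k - KELVIN_TEMP_FACTOR;

	/* SPC-4: temperatures at or below 0C are reported as zero. */
	if (temp_c < 0)
		return 0;

	/* Assumption: anything hotter than the cap is reported as the cap. */
	if (temp_c > TEMP_C_CAP)
		return TEMP_C_CAP;

	return (uint8_t)temp_c;
}
```

With this shape, 271 K (-2C) comes back as 0 and 401 K (128C) comes back as
128 rather than 0. The sensor-failure case (FFh) is not handled here.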


