[PATCH v2 2/4] drivers: misc: Add Support for TMR Manager

Rao, Appana Durga Kedareswara appanad at amd.com
Fri Jul 29 01:13:06 PDT 2022


Hi Greg,

Thanks for the review.

On 28/07/22 7:57 pm, Greg KH wrote:
> On Wed, Jul 20, 2022 at 11:30:14AM +0530, Appana Durga Kedareswara rao wrote:
>> From: Appana Durga Kedareswara rao <appana.durga.rao at xilinx.com>
>>
>> Triple Modular Redundancy(TMR) subsystem contains three microblaze cores,
>> subsystem is fault-tolerant and continues to operate nominally after
>> encountering an error. Together with the capability to detect and recover
>> from errors, the implementation ensures the reliability of the entire
>> subsystem.  TMR Manager is responsible for performing recovery of the
>> subsystem detects the fault via a break signal it invokes microblaze
>> software break handler which calls the tmr manager driver api to
>> update the error count and status, added support for fault detection
>> feature via sysfs interface.
>>
>> Usage:
>> To know the hardware status:
>> cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/status
>> To know the break handler count(Error count):
>> cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/errcnt
>>
>> Signed-off-by: Appana Durga Kedareswara rao <appana.durga.kedareswara.rao at amd.com>
>> Signed-off-by: Appana Durga Kedareswara rao <appana.durga.rao at xilinx.com>
>> ---
>> Changes for v2:
>> --> Added Examples for sysfs entries
>> --> Removed uneeded struct dev from the driver private structure
>> --> Fixed style issues (Used resource_size_t instead of uintptr_t)
>> --> Updated driver to use sysfs_emit() API instead of sprintf() API
>> --> Added error checks wherever applicable.
>> --> Fixed sysfs registration.
>>   .../testing/sysfs-driver-xilinx-tmr-manager   |  27 ++
>>   MAINTAINERS                                   |   7 +
>>   drivers/misc/Kconfig                          |  10 +
>>   drivers/misc/Makefile                         |   1 +
>>   drivers/misc/xilinx_tmr_manager.c             | 253 ++++++++++++++++++
>>   5 files changed, 298 insertions(+)
>>   create mode 100644 Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>>   create mode 100644 drivers/misc/xilinx_tmr_manager.c
>>
>> diff --git a/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager b/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>> new file mode 100644
>> index 000000000000..fc5fe7e22b09
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>> @@ -0,0 +1,27 @@
>> +What:		/sys/devices/platform/amba_pl/<dev>/status
>> +Date:		June 2022
>> +Contact:	appana.durga.rao at xilinx.com
>> +Description:	This control file provides the status of the tmr manager
>> +		useful for getting the status of fault.
>> +		This file cannot be written.
>> +		Example:
>> +		# cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/status
>> +		  Lockstep mismatch between processor 1 and 2
>> +		  Lockstep mismatch between processor 2 and 3
> 
> Why a whole long string?
To let user know about exact status of the TMR Subsystem(Manager) status

> 
> And this should only be 1 line, not multiple lines.  If it's multiple
> lines, this is NOT ok for a sysfs file.

Sure will fix in next version.
> 
>> +
>> +What:		/sys/devices/platform/amba_pl/<dev>/errcnt
>> +Date:		June 2022
>> +Contact:	appana.durga.rao at xilinx.com
>> +Description:	This control file provides the fault detection count.
>> +		This file cannot be written.
>> +		Example:
>> +		# cat /sys/devices/platform/amba_pl/44a10000.tmr_manager/errcnt
>> +		  1
>> +
>> +What:		/sys/devices/platform/amba_pl/<dev>/dis_block_break
>> +Date:		June 2022
>> +Contact:	appana.durga.rao at xilinx.com
>> +Description:	This control file enables the break signal.
>> +		This file is write only.
>> +		Example:
>> +		# echo 1 > /sys/devices/platform/amba_pl/44a10000.tmr_manager/dis_block_break
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 651616ed8ae2..732fd9ae7d9f 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -13080,6 +13080,13 @@ W:	http://www.monstr.eu/fdt/
>>   T:	git git://git.monstr.eu/linux-2.6-microblaze.git
>>   F:	arch/microblaze/
>>   
>> +MICROBLAZE TMR MANAGER
>> +M:	Appana Durga Kedareswara rao <appana.durga.kedareswara.rao at amd.com>
>> +S:	Supported
>> +F:	Documentation/ABI/testing/sysfs-driver-xilinx-tmr-manager
>> +F:	Documentation/devicetree/bindings/misc/xlnx,tmr-manager.yaml
>> +F:	drivers/misc/xilinx_tmr_manager.c
>> +
>>   MICROCHIP AT91 DMA DRIVERS
>>   M:	Ludovic Desroches <ludovic.desroches at microchip.com>
>>   M:	Tudor Ambarus <tudor.ambarus at microchip.com>
>> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
>> index 41d2bb0ae23a..555ae2e33b91 100644
>> --- a/drivers/misc/Kconfig
>> +++ b/drivers/misc/Kconfig
>> @@ -483,6 +483,16 @@ config OPEN_DICE
>>   
>>   	  If unsure, say N.
>>   
>> +config TMR_MANAGER
>> +	bool "Select TMR Manager"
>> +	depends on MICROBLAZE && MB_MANAGER
>> +	help
>> +	  This option enables the driver developed for TMR Manager. The Triple
>> +	  Modular Redundancy(TMR) manager provides support for fault detection
>> +	  via sysfs interface.
>> +
>> +	  Say N here unless you know what you are doing.
> 
> Not a module?
> 

Will fix in next version.
>> +
>>   source "drivers/misc/c2port/Kconfig"
>>   source "drivers/misc/eeprom/Kconfig"
>>   source "drivers/misc/cb710/Kconfig"
>> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
>> index 70e800e9127f..28b9803f909b 100644
>> --- a/drivers/misc/Makefile
>> +++ b/drivers/misc/Makefile
>> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
>>   obj-$(CONFIG_HISI_HIKEY_USB)	+= hisi_hikey_usb.o
>>   obj-$(CONFIG_HI6421V600_IRQ)	+= hi6421v600-irq.o
>>   obj-$(CONFIG_OPEN_DICE)		+= open-dice.o
>> +obj-$(CONFIG_TMR_MANAGER)	+= xilinx_tmr_manager.o
>> diff --git a/drivers/misc/xilinx_tmr_manager.c b/drivers/misc/xilinx_tmr_manager.c
>> new file mode 100644
>> index 000000000000..dbeca18c409f
>> --- /dev/null
>> +++ b/drivers/misc/xilinx_tmr_manager.c
>> @@ -0,0 +1,253 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Xilinx TMR Subsystem.
>> + *
>> + * Copyright (C) 2022 Xilinx, Inc.
>> + *
>> + * Description:
>> + * This driver is developed for TMR Manager,The Triple Modular Redundancy(TMR)
>> + * Manager is responsible for handling the TMR subsystem state, including
>> + * fault detection and error recovery. The core is triplicated in each of
>> + * the sub-blocks in the TMR subsystem, and provides majority voting of
>> + * its internal state provides soft error detection, correction and
>> + * recovery. Error detection feature is provided through sysfs
>> + * entries which allow the user to observer the TMR microblaze
>> + * status.
>> + */
>> +
>> +#include <asm/xilinx_mb_manager.h>
>> +#include <linux/module.h>
>> +#include <linux/of_device.h>
>> +
>> +/* TMR Manager Register offsets */
>> +#define XTMR_MANAGER_CR_OFFSET		0x0
>> +#define XTMR_MANAGER_FFR_OFFSET		0x4
>> +#define XTMR_MANAGER_CMR0_OFFSET	0x8
>> +#define XTMR_MANAGER_CMR1_OFFSET	0xC
>> +#define XTMR_MANAGER_BDIR_OFFSET	0x10
>> +#define XTMR_MANAGER_SEMIMR_OFFSET	0x1C
>> +
>> +/* Register Bitmasks/shifts */
>> +#define XTMR_MANAGER_CR_MAGIC1_MASK	GENMASK(7, 0)
>> +#define XTMR_MANAGER_CR_MAGIC2_MASK	GENMASK(15, 8)
>> +#define XTMR_MANAGER_CR_RIR_MASK	BIT(16)
>> +#define XTMR_MANAGER_FFR_LM12_MASK	BIT(0)
>> +#define XTMR_MANAGER_FFR_LM13_MASK	BIT(1)
>> +#define XTMR_MANAGER_FFR_LM23_MASK	BIT(2)
>> +
>> +#define XTMR_MANAGER_CR_MAGIC2_SHIFT	4
>> +#define XTMR_MANAGER_CR_RIR_SHIFT	16
>> +#define XTMR_MANAGER_CR_BB_SHIFT	18
>> +
>> +#define XTMR_MANAGER_MAGIC1_MAX_VAL	255
>> +
>> +/**
>> + * struct xtmr_manager_dev - Driver data for TMR Manager
>> + * @regs: device physical base address
>> + * @cr_val: control register value
>> + * @magic1: Magic 1 hardware configuration value
>> + * @err_cnt: error statistics count
>> + * @phys_baseaddr: Physical base address
>> + */
>> +struct xtmr_manager_dev {
>> +	void __iomem *regs;
>> +	u32 cr_val;
>> +	u32 magic1;
>> +	u32 err_cnt;
>> +	resource_size_t phys_baseaddr;
>> +};
>> +
>> +/* IO accessors */
>> +static inline void xtmr_manager_write(struct xtmr_manager_dev *xtmr_manager,
>> +				      u32 addr, u32 value)
>> +{
>> +	iowrite32(value, xtmr_manager->regs + addr);
>> +}
>> +
>> +static inline u32 xtmr_manager_read(struct xtmr_manager_dev *xtmr_manager,
>> +				    u32 addr)
>> +{
>> +	return ioread32(xtmr_manager->regs + addr);
>> +}
>> +
>> +static void xmb_manager_reset_handler(struct xtmr_manager_dev *xtmr_manager)
>> +{
>> +	/* Clear the FFR Register contents as a part of recovery process. */
>> +	xtmr_manager_write(xtmr_manager, XTMR_MANAGER_FFR_OFFSET, 0);
>> +}
>> +
>> +static void xmb_manager_update_errcnt(struct xtmr_manager_dev *xtmr_manager)
>> +{
>> +	xtmr_manager->err_cnt++;
>> +}
>> +
>> +static ssize_t errcnt_show(struct device *dev, struct device_attribute *attr,
>> +			   char *buf)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
>> +
>> +	return sysfs_emit(buf, "%x\n", xtmr_manager->err_cnt);
>> +}
>> +static DEVICE_ATTR_RO(errcnt);
>> +
>> +static ssize_t status_show(struct device *dev, struct device_attribute *attr,
>> +			   char *buf)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
>> +	size_t ffr;
>> +	int len = 0;
>> +
>> +	ffr = xtmr_manager_read(xtmr_manager, XTMR_MANAGER_FFR_OFFSET);
>> +	if ((ffr & XTMR_MANAGER_FFR_LM12_MASK) == XTMR_MANAGER_FFR_LM12_MASK) {
>> +		len += sysfs_emit_at(buf, len, "Lockstep mismatch between ");
>> +		len += sysfs_emit_at(buf, len, "processor 1 and 2\n");
> 
> You can write a full string all at once, no need to call this twice.
> 

Sure will fix in next version.
>> +	}
>> +
>> +	if ((ffr & XTMR_MANAGER_FFR_LM13_MASK) == XTMR_MANAGER_FFR_LM13_MASK) {
>> +		len += sysfs_emit_at(buf, len, "Lockstep mismatch between ");
>> +		len += sysfs_emit_at(buf, len, "processor 1 and 3\n");
>> +	}
>> +
>> +	if ((ffr & XTMR_MANAGER_FFR_LM23_MASK) == XTMR_MANAGER_FFR_LM23_MASK) {
>> +		len += sysfs_emit_at(buf, len, "Lockstep mismatch between ");
>> +		len += sysfs_emit_at(buf, len, "processor 2 and 3\n");
>> +	}
> 
> As said above, multiple lines is not ok, you need to fix up this api.
> 
> Perhaps 3 files, one for eacy type of mismatch and a simple 0/1 value
> returned in them?
> 

Okay will fix in next version.

> 
>> +
>> +	return len;
>> +}
>> +static DEVICE_ATTR_RO(status);
>> +
>> +static ssize_t dis_block_break_store(struct device *dev,
>> +				     struct device_attribute *attr,
>> +				     const char *buf, size_t size)
>> +{
>> +	struct xtmr_manager_dev *xtmr_manager = dev_get_drvdata(dev);
>> +	int ret;
>> +	long value;
>> +
>> +	ret = kstrtoul(buf, 16, &value);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (value > 1)
>> +		return -EINVAL;
> 
> Why is 1 magic?
> 
> And we have a sysfs function to read a 0/1/Y/N/y/n value, please use
> that.

Sure will fix in next version.

Regards,
Kedar.
> 
> thanks,
> 
> greg k-h



More information about the linux-arm-kernel mailing list