[RFC PATCH] irqchip/gic-v3-its: Allocate ITS tables from corresponding node memory

Marc Zyngier marc.zyngier at arm.com
Tue May 3 00:54:27 PDT 2016


[Please CC LKML and all the irqchip maintainers on these patches]

On 29/04/16 13:18, Ashok Kumar wrote:
> On multi-socket systems with multiple ITSs, allocating the ITS device
> table, collection table, interrupt translation table and command queue
> from node-local memory helps reduce inter-chip traffic, even though the
> tables (but not the command queue) can be cached in the GIC.
> 
> Signed-off-by: Ashok Kumar <ashoks at broadcom.com>
> ---
> This patch is created on top of Cavium thunderx erratum 23144 patch [1].
> 
> I am not sure how to do this for ACPI as GIC ITS ID in MADT doesn't map to
> _PXM. Am I missing something here? Any thoughts?

Indeed, and SRAT doesn't provide any valuable information either.

> 
> [1] https://lkml.org/lkml/2016/4/15/830 - [PATCH v5] irqchip, gicv3-its,
> numa: Enable workaround for Cavium thunderx erratum 23144
> 
> Thanks,
> Ashok
> 
> CC: marc.zyngier at arm.com
> CC: rrichter at caviumnetworks.com
> CC: gkulkarni at caviumnetworks.com
> CC: jchandra at broadcom.com
> 
>  drivers/irqchip/irq-gic-v3-its.c |   12 ++++++++----
>  1 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 75f258f..9a187c0 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -860,6 +860,7 @@ static int its_alloc_tables(const char *node_name, struct its_node *its)
>  		int alloc_pages;
>  		u64 tmp;
>  		void *base;
> +		struct page *pg;
>  
>  		if (type == GITS_BASER_TYPE_NONE)
>  			continue;
> @@ -897,11 +898,13 @@ retry_alloc_baser:
>  				node_name, order, alloc_pages);
>  		}
>  
> -		base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> -		if (!base) {
> +		pg = alloc_pages_node(its->numa_node,
> +				      GFP_KERNEL | __GFP_ZERO, order);
> +		if (!pg) {
>  			err = -ENOMEM;
>  			goto out_free;
>  		}
> +		base = page_address(pg);
>  
>  		its->tables[i].base = base;
>  		its->tables[i].order = order;
> @@ -1184,7 +1187,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
>  	sz = nr_ites * its->ite_size;
>  	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
> -	itt = kzalloc(sz, GFP_KERNEL);
> +	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
>  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
>  	if (lpi_map)
>  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
> @@ -1526,7 +1529,8 @@ static int __init its_probe(struct device_node *node,
>  	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
>  	its->numa_node = of_node_to_nid(node);
>  
> -	its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
> +	its->cmd_base = kzalloc_node(ITS_CMD_QUEUE_SZ, GFP_KERNEL,
> +				     its->numa_node);
>  	if (!its->cmd_base) {
>  		err = -ENOMEM;
>  		goto out_free_its;
> 

Does this lead to an improvement you've actually measured? If so, I'd
like to see numbers to back it up. Or is that purely theoretical?
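
For readers following the thread: the node passed to alloc_pages_node()
and kzalloc_node() is a preference, not a guarantee. Below is a minimal
user-space sketch of that behaviour, not kernel code and not part of the
patch. It assumes two things that hold for the kernel helpers used above:
a nid of NUMA_NO_NODE (which of_node_to_nid() can return) means "use the
local node", and a GFP_KERNEL request without __GFP_THISNODE may be
served from another node when the preferred one is short of memory. The
struct, the model_* names, the 4-node limit and the simple wrap-around
fallback order are all hypothetical; the real allocator walks a
distance-ordered zonelist.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE	(-1)	/* same value the kernel uses for "no node" */
#define MODEL_MAX_NODES	4	/* hypothetical: size of this toy system */

/* Hypothetical model state: free pages per node, plus the caller's node. */
struct model_system {
	int local_node;
	unsigned long free_pages[MODEL_MAX_NODES];
};

/* NUMA_NO_NODE resolves to the local node, as in alloc_pages_node(). */
static int model_preferred_node(const struct model_system *sys, int nid)
{
	if (nid == NUMA_NO_NODE)
		return sys->local_node;
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return nid;
}

/*
 * Try the preferred node first, then every other node in wrap-around
 * order (a simplification of the kernel's zonelist fallback).
 * Returns the node that served the request, or NUMA_NO_NODE on failure.
 */
static int model_alloc_pages_node(struct model_system *sys, int nid,
				  unsigned int order)
{
	unsigned long want = 1UL << order;	/* 2^order pages requested */
	int preferred = model_preferred_node(sys, nid);
	int i;

	for (i = 0; i < MODEL_MAX_NODES; i++) {
		int node = (preferred + i) % MODEL_MAX_NODES;

		if (sys->free_pages[node] >= want) {
			sys->free_pages[node] -= want;
			return node;
		}
	}
	return NUMA_NO_NODE;
}
```

In this model, a request for node 1 is served from node 1 while it has
enough free pages and silently spills to another node afterwards, which
is one reason measured numbers matter more than the allocation call
itself.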

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


