[PATCH] NVMe: Support for Host Memory Buffer (HMB)
Christoph Hellwig
hch at infradead.org
Wed Jun 15 03:48:00 PDT 2016
> +static unsigned int set_max_hmb = 1<<9;
> +module_param(set_max_hmb, uint, 0444);
> +MODULE_PARM_DESC(set_max_hmb, "set max size of Host Memory Buffer supported by host in units of MB");
No need for the "set" prefix; there is also a double space above.
More importantly, I think the limit needs to be relative to the available
memory in the system.
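A minimal sketch of what such a memory-relative cap could look like (the helper name and the 1/64 fraction are illustrative assumptions, not something the patch or this review specifies):

```c
#include <linux/mm.h>	/* si_meminfo() */

/*
 * Illustrative sketch only: derive the HMB ceiling from total system
 * RAM instead of a fixed module parameter.  The 1/64 fraction is an
 * arbitrary example value.
 */
static u64 nvme_default_hmb_limit(void)
{
	struct sysinfo si;

	si_meminfo(&si);
	/* totalram is in units of mem_unit bytes; allow 1/64th of it */
	return (u64)si.totalram * si.mem_unit / 64;
}
```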
> +static unsigned int set_hmb_chunk = 1<<10;
> +module_param(set_hmb_chunk, uint, 0444);
> +MODULE_PARM_DESC(set_hmb_chunk, "set chunk size of Host Memory Buffer in discriptor list in KB");
Why would the user care about a chunk size?
Also, all the HMB code should be in pci.c; none of it applies to
Fabrics.
> +static int nvme_set_hmb_feature(struct nvme_ctrl *dev, u32 hmb_enable, u64 hmb_size)
Please keep your line length below 80 characters.
> +int nvme_get_hmbcap(struct nvme_ctrl *dev)
> +{
> + u64 max_hmb = set_max_hmb << 10;
> +
> + if (!set_max_hmb || !set_hmb_chunk)
> + return -EINVAL;
> +
> + if (!dev->hmb.host_mem_pre)
> + return -EINVAL;
> +
> + /* Is Min Host buffer size too high */
> + if (dev->hmb.host_mem_min > max_hmb) {
> + dev->hmb.host_mem_pre = 0;
> + dev->hmb.host_mem_min = 0;
> + return -EINVAL;
> + }
> + /* Is Preffered Host buffer size too high? set it to set_max_hmb */
> + if (dev->hmb.host_mem_pre > max_hmb)
> + dev->hmb.host_mem_pre = max_hmb;
FYI, we really should print a message to the kernel log that we're
using X MB size for the host buffer so that people know where all
their memory goes. Probably a sysfs file as well.
We'll also need to think about the kdump kernel which has very little
memory available.
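A sketch of both points; the `ctrl->device` pointer and the `hmb_size` variable are illustrative names, not taken from the patch:

```c
#include <linux/crash_dump.h>	/* is_kdump_kernel() */

/* Don't grab a host memory buffer in the memory-starved kdump kernel. */
if (is_kdump_kernel())
	return 0;

/* Tell the user where their memory went. */
dev_info(ctrl->device, "using %llu MiB for the host memory buffer\n",
	 (unsigned long long)(hmb_size >> 20));
```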
> +
> +int nvme_setup_hmb(struct nvme_ctrl *dev)
> +{
> + struct nvme_features feat;
> + struct hmb_descriptor *desc_list;
> + struct hmb_descriptor_info *desc_info;
> + u64 hmb_chunk_size, hmb_allocated, hmb_pre, max_desc_nent;
> + u32 page_size, page_shift = 0, nents = 0, offset;
> + int status;
> +
> + page_size = dev->page_size;
> + while (page_size >>= 1)
> + page_shift++;
This should probably use ilog2 if it's needed at all.
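For illustration, the open-coded shift loop and ilog2() agree for the power-of-two page sizes NVMe deals with; ilog2_demo() below is a hypothetical userspace stand-in for the kernel's ilog2(), valid for non-zero input:

```c
#include <assert.h>

/* The open-coded shift loop from the quoted patch. */
static unsigned int shift_of(unsigned int page_size)
{
	unsigned int shift = 0;

	while (page_size >>= 1)
		shift++;
	return shift;
}

/* Userspace stand-in for the kernel's ilog2(); valid only for v > 0. */
static unsigned int ilog2_demo(unsigned int v)
{
	return 31 - (unsigned int)__builtin_clz(v);
}
```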
> + status = nvme_get_hmbcap(dev);
> + if (status) {
> + if (dev->hmb.flags & NVME_HMB_SET_MR) {
> + dev->hmb.flags &= ~(NVME_HMB_SET_MR);
No need for the braces.
> + (sizeof(struct hmb_descriptor)*max_desc_nent) + 0x10;
Whitespace damage (and magic values..)
> + /* check 16 byte allignment */
> + offset = dev->hmb.hmb_desc_info_list.dma_addr & 0xf;
> + if (offset)
> + offset = (0x10 - offset);
dma_alloc_coherent guarantees cache-line alignment.
> + desc_list = dev->hmb.hmb_desc_info_list.vaddr + offset;
> +
> + desc_info = (struct hmb_descriptor_info *)kmalloc(
> + sizeof(struct hmb_descriptor_info)*max_desc_nent,
> + GFP_ATOMIC);
Should use kmalloc_array. No need for the cast of the return value.
And why the GFP_ATOMIC allocation?
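The point of kmalloc_array() is the overflow check on count times element size; here is a userspace sketch of that check (alloc_array_checked() is a hypothetical stand-in, not a kernel API):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Userspace sketch of kmalloc_array()'s overflow check: refuse a
 * multiplication that would wrap around, rather than hand back a
 * silently undersized buffer.
 */
static void *alloc_array_checked(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would overflow */
	return malloc(n * size);
}
```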
> + hmb_allocated = 0;
> + while (hmb_allocated < hmb_pre) {
> + /* adjust chunk size if it crosses boundary */
> + if (hmb_chunk_size > (hmb_pre - hmb_allocated))
> + hmb_chunk_size = (hmb_pre - hmb_allocated);
> + /* allocate 1 page extra for page alligned */
dma coherent allocations are always page aligned, see
Documentation/DMA-API-HOWTO.txt.
> + desc_info[nents].size = hmb_chunk_size + dev->page_size;
> + desc_info[nents].vaddr = dma_alloc_coherent(dev->dev,
> + desc_info[nents].size,
> + &desc_info[nents].dma_addr,
> + GFP_KERNEL);
Given that the host will never access this memory, it sounds like this
should use dma_alloc_attrs with the DMA_ATTR_NON_CONSISTENT and
DMA_ATTR_NO_KERNEL_MAPPING flags. ARM SoC users will thank you.
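As a sketch, using the modern dma_alloc_attrs() calling convention in which the attributes are a plain flag word (older kernels passed a struct dma_attrs instead); the field names follow the quoted patch:

```c
#include <linux/dma-mapping.h>

/*
 * Sketch: the controller owns this memory and the CPU never reads or
 * writes it, so skip both the coherency guarantees and the kernel
 * virtual mapping entirely.
 */
desc_info[nents].vaddr = dma_alloc_attrs(dev->dev, desc_info[nents].size,
		&desc_info[nents].dma_addr, GFP_KERNEL,
		DMA_ATTR_NON_CONSISTENT | DMA_ATTR_NO_KERNEL_MAPPING);
```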
> + if (hmb_allocated < (dev->hmb.host_mem_min << 10)
> + || hmb_allocated > (dev->hmb.host_mem_pre << 10)) {
Please run the code through checkpatch.pl.