[PATCH v2 16/17] driver core: Refactor fw_devlink feature
Robin Murphy
robin.murphy at arm.com
Fri Dec 11 11:34:31 EST 2020
On 2020-12-11 14:11, Qian Cai wrote:
> On Fri, 2020-11-20 at 18:02 -0800, Saravana Kannan wrote:
>> The current implementation of fw_devlink is very inefficient because it
>> tries to get away without creating fwnode links in the name of saving
>> memory usage. Past attempts to optimize runtime at the cost of memory
>> usage were blocked with request for data showing that the optimization
>> made significant improvement for real world scenarios.
>>
>> We have those scenarios now. There have been several reports of boot
>> time increase in the order of seconds in this thread [1]. Several OEMs
>> and SoC manufacturers have also privately reported significant
>> (350-400ms) increase in boot time due to all the parsing done by
>> fw_devlink.
>>
>> So this patch uses all the setup done by the previous patches in this
>> series to refactor fw_devlink to be more efficient. Most of the code has
>> been moved out of firmware specific (DT mostly) code into driver core.
>>
>> This brings the following benefits:
>> - Instead of parsing the device tree multiple times during bootup,
>> fw_devlink parses each fwnode node/property only once and creates
>> fwnode links. The rest of the fw_devlink code then just looks at these
>> fwnode links to do rest of the work.
>>
>> - Makes it much easier to debug probe issue due to fw_devlink in the
>> future. fw_devlink=on blocks the probing of devices if they depend on
>> a device that hasn't been added yet. With this refactor, it'll be very
>> easy to tell what that device is because we now have a reference to
>> the fwnode of the device.
>>
>> - Much easier to add fw_devlink support to ACPI and other firmware
>> types. A refactor to move the common bits from DT specific code to
>> driver core was in my TODO list as a prerequisite to adding ACPI
>> support to fw_devlink. This series gets that done.
>>
>> [1] - https://lore.kernel.org/linux-omap/ea02f57e-871d-cd16-4418-c1da4bbc4696@ti.com/
>> Signed-off-by: Saravana Kannan <saravanak at google.com>
>> Tested-by: Laurent Pinchart <laurent.pinchart at ideasonboard.com>
>> Tested-by: Grygorii Strashko <grygorii.strashko at ti.com>
>
> Reverting this commit and its dependency:
>
> 2d09e6eb4a6f driver core: Delete pointless parameter in fwnode_operations.add_links
>
> from today's linux-next fixed a boot crash on an arm64 Thunder X2 server.
Since the call stack implicates the platform-device-wrangling we do in
IORT code I took a quick look; AFAICS my guess would be it's blowing up
trying to walk a zeroed list head since "driver core: Add fwnode_init()"
missed acpi_alloc_fwnode_static().
Robin.
> .config: https://cailca.coding.net/public/linux/mm/git/files/master/arm64.config
>
> [ 57.413929][ T1] ACPI: 5 ACPI AML tables successfully acquired and loaded
> [ 60.571643][ T1] ACPI: Interpreter enabled
> [ 60.576104][ T1] ACPI: Using GIC for interrupt routing
> [ 60.582474][ T1] ACPI: MCFG table detected, 1 entries
> [ 60.588051][ T1] ACPI: IORT: SMMU-v3[402300000] Mapped to Proximity domain 0
> [ 60.601374][ T1] Unable to handle kernel paging request at virtual address dfff800000000000
> [ 60.610146][ T1] Mem abort info:
> [ 60.613694][ T1] ESR = 0x96000004
> [ 60.617496][ T1] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 60.623616][ T1] SET = 0, FnV = 0
> [ 60.627420][ T1] EA = 0, S1PTW = 0
> [ 60.631304][ T1] Data abort info:
> [ 60.634957][ T1] ISV = 0, ISS = 0x00000004
> [ 60.639546][ T1] CM = 0, WnR = 0
> [ 60.643255][ T1] [dfff800000000000] address between user and kernel address ranges
> [ 60.651226][ T1] Internal error: Oops: 96000004 [#1] SMP
> [ 60.656864][ T1] Modules linked in:
> [ 60.660658][ T1] CPU: 38 PID: 1 Comm: swapper/0 Tainted: G W 5.10.0-rc7-next-20201211 #2
> [ 60.670424][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.16 07/29/2020
> [ 60.680979][ T1] pstate: 10400009 (nzcV daif +PAN -UAO -TCO BTYPE=--)
> [ 60.687757][ T1] pc : device_add+0xf60/0x16b0
> [ 60.692430][ T1] lr : device_add+0xf08/0x16b0
> [ 60.697098][ T1] sp : ffff0000063bf7d0
> [ 60.701147][ T1] x29: ffff0000063bf7d0 x28: ffff00001f760810
> [ 60.707226][ T1] x27: fffffffffffffff8 x26: ffff00001f760858
> [ 60.713304][ T1] x25: ffff00001f760c58 x24: fffffffffffffff8
> [ 60.719381][ T1] x23: 1fffe00003eec10b x22: ffff8000190d0260
> [ 60.725458][ T1] x21: ffff800011dba708 x20: 0000000000000000
> [ 60.731535][ T1] x19: 1fffe00000c77f10 x18: 1fffe001cf0d53ed
> [ 60.737616][ T1] x17: 0000000000000000 x16: 1ffff000033676f1
> [ 60.743709][ T1] x15: 0000000000000000 x14: ffff800011731e34
> [ 60.749786][ T1] x13: ffff700003217fb9 x12: 1ffff00003217fb8
> [ 60.755864][ T1] x11: 1ffff00003217fb8 x10: ffff700003217fb8
> [ 60.761940][ T1] x9 : dfff800000000000 x8 : ffff8000190bfdc7
> [ 60.768017][ T1] x7 : 0000000000000001 x6 : ffff700003217fb9
> [ 60.774094][ T1] x5 : ffff700003217fb9 x4 : ffff000006324a80
> [ 60.780170][ T1] x3 : 1fffe00000c64951 x2 : 0000000000000000
> [ 60.786247][ T1] x1 : 0000000000000000 x0 : dfff800000000000
> [ 60.792324][ T1] Call trace:
> [ 60.795495][ T1] device_add+0xf60/0x16b0
> __fw_devlink_link_to_consumers at drivers/base/core.c:1583
> (inlined by) fw_devlink_link_device at drivers/base/core.c:1726
> (inlined by) device_add at drivers/base/core.c:3088
> [ 60.799813][ T1] platform_device_add+0x274/0x628
> [ 60.804833][ T1] acpi_iort_init+0x9d8/0xc50
> [ 60.809415][ T1] acpi_init+0x45c/0x4e8
> [ 60.813556][ T1] do_one_initcall+0x170/0xb70
> [ 60.818224][ T1] kernel_init_freeable+0x6a8/0x734
> [ 60.823332][ T1] kernel_init+0x18/0x12c
> [ 60.827566][ T1] ret_from_fork+0x10/0x1c
> [ 60.831890][ T1] Code: f2fbffe0 d100205b d343fc41 aa1b03f8 (38e06820)
> [ 60.838756][ T1] ---[ end trace fa5c8ce17a226d83 ]---
> [ 60.844127][ T1] Kernel panic - not syncing: Oops: Fatal exception
> [ 60.850881][ T1] SMP: stopping secondary CPUs
> [ 60.855733][ T1] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>> ---
>> drivers/base/core.c | 325 ++++++++++++++++++++++++++++++-----------
>> include/linux/device.h | 5 -
>> 2 files changed, 238 insertions(+), 92 deletions(-)
>>
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 1873cecb0cc4..9edf9084fc98 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -46,8 +46,6 @@ early_param("sysfs.deprecated", sysfs_deprecated_setup);
>> #endif
>>
>> /* Device links support. */
>> -static LIST_HEAD(wait_for_suppliers);
>> -static DEFINE_MUTEX(wfs_lock);
>> static LIST_HEAD(deferred_sync);
>> static unsigned int defer_sync_state_count = 1;
>> static DEFINE_MUTEX(fwnode_link_lock);
>> @@ -803,74 +801,6 @@ struct device_link *device_link_add(struct device *consumer,
>> }
>> EXPORT_SYMBOL_GPL(device_link_add);
>>
>> -/**
>> - * device_link_wait_for_supplier - Add device to wait_for_suppliers list
>> - * @consumer: Consumer device
>> - *
>> - * Marks the @consumer device as waiting for suppliers to become available by
>> - * adding it to the wait_for_suppliers list. The consumer device will never be
>> - * probed until it's removed from the wait_for_suppliers list.
>> - *
>> - * The caller is responsible for adding the links to the supplier devices once
>> - * they are available and removing the @consumer device from the
>> - * wait_for_suppliers list once links to all the suppliers have been created.
>> - *
>> - * This function is NOT meant to be called from the probe function of the
>> - * consumer but rather from code that creates/adds the consumer device.
>> - */
>> -static void device_link_wait_for_supplier(struct device *consumer,
>> - bool need_for_probe)
>> -{
>> - mutex_lock(&wfs_lock);
>> - list_add_tail(&consumer->links.needs_suppliers, &wait_for_suppliers);
>> - consumer->links.need_for_probe = need_for_probe;
>> - mutex_unlock(&wfs_lock);
>> -}
>> -
>> -static void device_link_wait_for_mandatory_supplier(struct device *consumer)
>> -{
>> - device_link_wait_for_supplier(consumer, true);
>> -}
>> -
>> -static void device_link_wait_for_optional_supplier(struct device *consumer)
>> -{
>> - device_link_wait_for_supplier(consumer, false);
>> -}
>> -
>> -/**
>> - * device_link_add_missing_supplier_links - Add links from consumer devices to
>> - * supplier devices, leaving any
>> - * consumer with inactive suppliers on
>> - * the wait_for_suppliers list
>> - *
>> - * Loops through all consumers waiting on suppliers and tries to add all their
>> - * supplier links. If that succeeds, the consumer device is removed from
>> - * wait_for_suppliers list. Otherwise, they are left in the wait_for_suppliers
>> - * list. Devices left on the wait_for_suppliers list will not be probed.
>> - *
>> - * The fwnode add_links callback is expected to return 0 if it has found and
>> - * added all the supplier links for the consumer device. It should return an
>> - * error if it isn't able to do so.
>> - *
>> - * The caller of device_link_wait_for_supplier() is expected to call this once
>> - * it's aware of potential suppliers becoming available.
>> - */
>> -static void device_link_add_missing_supplier_links(void)
>> -{
>> - struct device *dev, *tmp;
>> -
>> - mutex_lock(&wfs_lock);
>> - list_for_each_entry_safe(dev, tmp, &wait_for_suppliers,
>> - links.needs_suppliers) {
>> - int ret = fwnode_call_int_op(dev->fwnode, add_links, dev);
>> - if (!ret)
>> - list_del_init(&dev->links.needs_suppliers);
>> - else if (ret != -ENODEV)
>> - dev->links.need_for_probe = false;
>> - }
>> - mutex_unlock(&wfs_lock);
>> -}
>> -
>> #ifdef CONFIG_SRCU
>> static void __device_link_del(struct kref *kref)
>> {
>> @@ -1195,9 +1125,8 @@ void device_links_driver_bound(struct device *dev)
>> * the device links it needs to or make new device links as it needs
>> * them. So, it no longer needs to wait on any suppliers.
>> */
>> - mutex_lock(&wfs_lock);
>> - list_del_init(&dev->links.needs_suppliers);
>> - mutex_unlock(&wfs_lock);
>> + if (dev->fwnode && dev->fwnode->dev == dev)
>> + fwnode_links_purge_suppliers(dev->fwnode);
>> device_remove_file(dev, &dev_attr_waiting_for_supplier);
>>
>> device_links_write_lock();
>> @@ -1488,10 +1417,6 @@ static void device_links_purge(struct device *dev)
>> if (dev->class == &devlink_class)
>> return;
>>
>> - mutex_lock(&wfs_lock);
>> - list_del(&dev->links.needs_suppliers);
>> - mutex_unlock(&wfs_lock);
>> -
>> /*
>> * Delete all of the remaining links from this device to any other
>> * devices (either consumers or suppliers).
>> @@ -1561,19 +1486,246 @@ static void fw_devlink_parse_fwtree(struct fwnode_handle *fwnode)
>> fw_devlink_parse_fwtree(child);
>> }
>>
>> -static void fw_devlink_link_device(struct device *dev)
>> +/**
>> + * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
>> + * @con - Consumer device for the device link
>> + * @sup_handle - fwnode handle of supplier
>> + *
>> + * This function will try to create a device link between the consumer device
>> + * @con and the supplier device represented by @sup_handle.
>> + *
>> + * The supplier has to be provided as a fwnode because incorrect cycles in
>> + * fwnode links can sometimes cause the supplier device to never be created.
>> + * This function detects such cases and returns an error if it cannot create a
>> + * device link from the consumer to a missing supplier.
>> + *
>> + * Returns,
>> + * 0 on successfully creating a device link
>> + * -EINVAL if the device link cannot be created as expected
>> + * -EAGAIN if the device link cannot be created right now, but it may be
>> + * possible to do that in the future
>> + */
>> +static int fw_devlink_create_devlink(struct device *con,
>> + struct fwnode_handle *sup_handle, u32 flags)
>> +{
>> + struct device *sup_dev;
>> + int ret = 0;
>> +
>> + sup_dev = get_dev_from_fwnode(sup_handle);
>> + if (sup_dev) {
>> + /*
>> + * If this fails, it is due to cycles in device links. Just
>> + * give up on this link and treat it as invalid.
>> + */
>> + if (!device_link_add(con, sup_dev, flags))
>> + ret = -EINVAL;
>> +
>> + goto out;
>> + }
>> +
>> + /*
>> + * DL_FLAG_SYNC_STATE_ONLY doesn't block probing and supports
>> + * cycles. So cycle detection isn't necessary and shouldn't be
>> + * done.
>> + */
>> + if (flags & DL_FLAG_SYNC_STATE_ONLY)
>> + return -EAGAIN;
>> +
>> + /*
>> + * If we can't find the supplier device from its fwnode, it might be
>> + * due to a cyclic dependency between fwnodes. Some of these cycles can
>> + * be broken by applying logic. Check for these types of cycles and
>> + * break them so that devices in the cycle probe properly.
>> + *
>> + * If the supplier's parent is dependent on the consumer, then
>> + * the consumer-supplier dependency is a false dependency. So,
>> + * treat it as an invalid link.
>> + */
>> + sup_dev = fwnode_get_next_parent_dev(sup_handle);
>> + if (sup_dev && device_is_dependent(con, sup_dev)) {
>> + dev_dbg(con, "Not linking to %pfwP - False link\n",
>> + sup_handle);
>> + ret = -EINVAL;
>> + } else {
>> + /*
>> + * Can't check for cycles or no cycles. So let's try
>> + * again later.
>> + */
>> + ret = -EAGAIN;
>> + }
>> +
>> +out:
>> + put_device(sup_dev);
>> + return ret;
>> +}
>> +
>> +/**
>> + * __fw_devlink_link_to_consumers - Create device links to consumers of a device
>> + * @dev - Device that needs to be linked to its consumers
>> + *
>> + * This function looks at all the consumer fwnodes of @dev and creates device
>> + * links between the consumer device and @dev (supplier).
>> + *
>> + * If the consumer device has not been added yet, then this function creates a
>> + * SYNC_STATE_ONLY link between @dev (supplier) and the closest ancestor device
>> + * of the consumer fwnode. This is necessary to make sure @dev doesn't get a
>> + * sync_state() callback before the real consumer device gets to be added and
>> + * then probed.
>> + *
>> + * Once device links are created from the real consumer to @dev (supplier), the
>> + * fwnode links are deleted.
>> + */
>> +static void __fw_devlink_link_to_consumers(struct device *dev)
>> +{
>> + struct fwnode_handle *fwnode = dev->fwnode;
>> + struct fwnode_link *link, *tmp;
>> +
>> + list_for_each_entry_safe(link, tmp, &fwnode->consumers, s_hook) {
>> + u32 dl_flags = fw_devlink_get_flags();
>> + struct device *con_dev;
>> + bool own_link = true;
>> + int ret;
>> +
>> + con_dev = get_dev_from_fwnode(link->consumer);
>> + /*
>> + * If consumer device is not available yet, make a "proxy"
>> + * SYNC_STATE_ONLY link from the consumer's parent device to
>> + * the supplier device. This is necessary to make sure the
>> + * supplier doesn't get a sync_state() callback before the real
>> + * consumer can create a device link to the supplier.
>> + *
>> + * This proxy link step is needed to handle the case where the
>> + * consumer's parent device is added before the supplier.
>> + */
>> + if (!con_dev) {
>> + con_dev = fwnode_get_next_parent_dev(link->consumer);
>> + /*
>> + * However, if the consumer's parent device is also the
>> + * parent of the supplier, don't create a
>> + * consumer-supplier link from the parent to its child
>> + * device. Such a dependency is impossible.
>> + */
>> + if (con_dev &&
>> + fwnode_is_ancestor_of(con_dev->fwnode, fwnode)) {
>> + put_device(con_dev);
>> + con_dev = NULL;
>> + } else {
>> + own_link = false;
>> + dl_flags = DL_FLAG_SYNC_STATE_ONLY;
>> + }
>> + }
>> +
>> + if (!con_dev)
>> + continue;
>> +
>> + ret = fw_devlink_create_devlink(con_dev, fwnode, dl_flags);
>> + put_device(con_dev);
>> + if (!own_link || ret == -EAGAIN)
>> + continue;
>> +
>> + list_del(&link->s_hook);
>> + list_del(&link->c_hook);
>> + kfree(link);
>> + }
>> +}
>> +
>> +/**
>> + * __fw_devlink_link_to_suppliers - Create device links to suppliers of a device
>> + * @dev - The consumer device that needs to be linked to its suppliers
>> + * @fwnode - Root of the fwnode tree that is used to create device links
>> + *
>> + * This function looks at all the supplier fwnodes of fwnode tree rooted at
>> + * @fwnode and creates device links between @dev (consumer) and all the
>> + * supplier devices of the entire fwnode tree at @fwnode.
>> + *
>> + * The function creates normal (non-SYNC_STATE_ONLY) device links between @dev
>> + * and the real suppliers of @dev. Once these device links are created, the
>> + * fwnode links are deleted. When such device links are successfully created,
>> + * this function is called recursively on those supplier devices. This is
>> + * needed to detect and break some invalid cycles in fwnode links. See
>> + * fw_devlink_create_devlink() for more details.
>> + *
>> + * In addition, it also looks at all the suppliers of the entire fwnode tree
>> + * because some of the child devices of @dev that have not been added yet
>> + * (because @dev hasn't probed) might already have their suppliers added to
>> + * driver core. So, this function creates SYNC_STATE_ONLY device links between
>> + * @dev (consumer) and these suppliers to make sure they don't execute their
>> + * sync_state() callbacks before these child devices have a chance to create
>> + * their device links. The fwnode links that correspond to the child devices
>> + * aren't delete because they are needed later to create the device links
>> + * between the real consumer and supplier devices.
>> + */
>> +static void __fw_devlink_link_to_suppliers(struct device *dev,
>> + struct fwnode_handle *fwnode)
>> {
>> - int fw_ret;
>> + bool own_link = (dev->fwnode == fwnode);
>> + struct fwnode_link *link, *tmp;
>> + struct fwnode_handle *child = NULL;
>> + u32 dl_flags;
>> +
>> + if (own_link)
>> + dl_flags = fw_devlink_get_flags();
>> + else
>> + dl_flags = DL_FLAG_SYNC_STATE_ONLY;
>>
>> - device_link_add_missing_supplier_links();
>> + list_for_each_entry_safe(link, tmp, &fwnode->suppliers, c_hook) {
>> + int ret;
>> + struct device *sup_dev;
>> + struct fwnode_handle *sup = link->supplier;
>> +
>> + ret = fw_devlink_create_devlink(dev, sup, dl_flags);
>> + if (!own_link || ret == -EAGAIN)
>> + continue;
>>
>> - if (fw_devlink_flags && fwnode_has_op(dev->fwnode, add_links)) {
>> - fw_ret = fwnode_call_int_op(dev->fwnode, add_links, dev);
>> - if (fw_ret == -ENODEV && !fw_devlink_is_permissive())
>> - device_link_wait_for_mandatory_supplier(dev);
>> - else if (fw_ret)
>> - device_link_wait_for_optional_supplier(dev);
>> + list_del(&link->s_hook);
>> + list_del(&link->c_hook);
>> + kfree(link);
>> +
>> + /* If no device link was created, nothing more to do. */
>> + if (ret)
>> + continue;
>> +
>> + /*
>> + * If a device link was successfully created to a supplier, we
>> + * now need to try and link the supplier to all its suppliers.
>> + *
>> + * This is needed to detect and delete false dependencies in
>> + * fwnode links that haven't been converted to a device link
>> + * yet. See comments in fw_devlink_create_devlink() for more
>> + * details on the false dependency.
>> + *
>> + * Without deleting these false dependencies, some devices will
>> + * never probe because they'll keep waiting for their false
>> + * dependency fwnode links to be converted to device links.
>> + */
>> + sup_dev = get_dev_from_fwnode(sup);
>> + __fw_devlink_link_to_suppliers(sup_dev, sup_dev->fwnode);
>> + put_device(sup_dev);
>> }
>> +
>> + /*
>> + * Make "proxy" SYNC_STATE_ONLY device links to represent the needs of
>> + * all the descendants. This proxy link step is needed to handle the
>> + * case where the supplier is added before the consumer's parent device
>> + * (@dev).
>> + */
>> + while ((child = fwnode_get_next_available_child_node(fwnode, child)))
>> + __fw_devlink_link_to_suppliers(dev, child);
>> +}
>> +
>> +static void fw_devlink_link_device(struct device *dev)
>> +{
>> + struct fwnode_handle *fwnode = dev->fwnode;
>> +
>> + if (!fw_devlink_flags)
>> + return;
>> +
>> + fw_devlink_parse_fwtree(fwnode);
>> +
>> + mutex_lock(&fwnode_link_lock);
>> + __fw_devlink_link_to_consumers(dev);
>> + __fw_devlink_link_to_suppliers(dev, fwnode);
>> + mutex_unlock(&fwnode_link_lock);
>> }
>>
>> /* Device links support end. */
>> @@ -2431,7 +2583,6 @@ void device_initialize(struct device *dev)
>> #endif
>> INIT_LIST_HEAD(&dev->links.consumers);
>> INIT_LIST_HEAD(&dev->links.suppliers);
>> - INIT_LIST_HEAD(&dev->links.needs_suppliers);
>> INIT_LIST_HEAD(&dev->links.defer_sync);
>> dev->links.status = DL_DEV_NO_DRIVER;
>> }
>> diff --git a/include/linux/device.h b/include/linux/device.h
>> index 1e771ea4dca6..89bb8b84173e 100644
>> --- a/include/linux/device.h
>> +++ b/include/linux/device.h
>> @@ -351,18 +351,13 @@ enum dl_dev_state {
>> * struct dev_links_info - Device data related to device links.
>> * @suppliers: List of links to supplier devices.
>> * @consumers: List of links to consumer devices.
>> - * @needs_suppliers: Hook to global list of devices waiting for suppliers.
>> * @defer_sync: Hook to global list of devices that have deferred sync_state.
>> - * @need_for_probe: If needs_suppliers is on a list, this indicates if the
>> - * suppliers are needed for probe or not.
>> * @status: Driver status information.
>> */
>> struct dev_links_info {
>> struct list_head suppliers;
>> struct list_head consumers;
>> - struct list_head needs_suppliers;
>> struct list_head defer_sync;
>> - bool need_for_probe;
>> enum dl_dev_state status;
>> };
>>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
More information about the linux-arm-kernel
mailing list