[PATCH v1] proc/vmcore: don't fake reading zeroes on surprise vmcore_cb unregistration

Baoquan He bhe at redhat.com
Thu Nov 11 19:30:28 PST 2021


On 11/11/21 at 08:22pm, David Hildenbrand wrote:
> In commit cc5f2704c934 ("proc/vmcore: convert oldmem_pfn_is_ram callback
> to more generic vmcore callbacks"), we added detection of surprise
> vmcore_cb unregistration after the vmcore was already opened. Once
> detected, we warn the user and simulate reading zeroes from that point on
> when accessing the vmcore.
> 
> The basic reason was that unexpected unregistration, for example, by
> manually unbinding a driver from a device after opening the
> vmcore, is not supported and could result in reading oldmem the
> vmcore_cb would have actually prohibited while registered. However,
> something like that can similarly be trigger by a user that's really
> looking for trouble simply by unbinding the relevant driver before opening
> the vmcore -- or by disallowing loading the driver in the first place.
> So it's actually of limited help.

Yes, this is the change what I would like to see in the original patch
"proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks".
I am happy with this patch appended to commit cc5f2704c934.

> 
> Currently, unregistration can only be triggered via virtio-mem when
> manually unbinding the driver from the device inside the VM; there is no
> way to trigger it from the hypervisor, as hypervisors don't allow for
> unplugging virtio-mem devices -- ripping out system RAM from a VM without
> coordination with the guest is usually not a good idea.
> 
> The important part is that unbinding the driver and unregistering the
> vmcore_cb while concurrently reading the vmcore won't crash the system,
> and that is handled by the rwsem.
> 
> To make the mechanism more future proof, let's remove the "read zero"
> part, but leave the warning in place. For example, we could have a future
> driver (like virtio-balloon) that will contact the hypervisor to figure out
> if we already populated a page for a given PFN. Hotunplugging such a device
> and consequently unregistering the vmcore_cb could be triggered from the
> hypervisor without harming the system even while kdump is running. In that

I am a little confused, could you tell more about "contact the hypervisor to
figure out if we already populated a page for a given PFN."? I think
vmcore dumping relies on the eflcorehdr which is created beforehand, and
relies on vmcore_cb registered in advance too, virtio-balloon could take
another way to interact with kdump to make sure the dumpable ram?

Nevertheless, this patch looks good to me, thanks.

Acked-by: Baoquan He <bhe at redhat.com>

> case, we don't want to silently end up with a vmcore that contains wrong
> data, because the user inside the VM might be unaware of the hypervisor
> action and might easily miss the warning in the log.
> 
> Cc: Dave Young <dyoung at redhat.com>
> Cc: Baoquan He <bhe at redhat.com>
> Cc: Vivek Goyal <vgoyal at redhat.com>
> Cc: Andrew Morton <akpm at linux-foundation.org>
> Cc: Philipp Rudo <prudo at redhat.com>
> Cc: kexec at lists.infradead.org
> Cc: linux-mm at kvack.org
> Cc: linux-fsdevel at vger.kernel.org
> Signed-off-by: David Hildenbrand <david at redhat.com>
> ---
>  fs/proc/vmcore.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
> index 30a3b66f475a..948691cf4a1a 100644
> --- a/fs/proc/vmcore.c
> +++ b/fs/proc/vmcore.c
> @@ -65,8 +65,6 @@ static size_t vmcoredd_orig_sz;
>  static DECLARE_RWSEM(vmcore_cb_rwsem);
>  /* List of registered vmcore callbacks. */
>  static LIST_HEAD(vmcore_cb_list);
> -/* Whether we had a surprise unregistration of a callback. */
> -static bool vmcore_cb_unstable;
>  /* Whether the vmcore has been opened once. */
>  static bool vmcore_opened;
>  
> @@ -94,10 +92,8 @@ void unregister_vmcore_cb(struct vmcore_cb *cb)
>  	 * very unusual (e.g., forced driver removal), but we cannot stop
>  	 * unregistering.
>  	 */
> -	if (vmcore_opened) {
> +	if (vmcore_opened)
>  		pr_warn_once("Unexpected vmcore callback unregistration\n");
> -		vmcore_cb_unstable = true;
> -	}
>  	up_write(&vmcore_cb_rwsem);
>  }
>  EXPORT_SYMBOL_GPL(unregister_vmcore_cb);
> @@ -108,8 +104,6 @@ static bool pfn_is_ram(unsigned long pfn)
>  	bool ret = true;
>  
>  	lockdep_assert_held_read(&vmcore_cb_rwsem);
> -	if (unlikely(vmcore_cb_unstable))
> -		return false;
>  
>  	list_for_each_entry(cb, &vmcore_cb_list, next) {
>  		if (unlikely(!cb->pfn_is_ram))
> @@ -577,7 +571,7 @@ static int vmcore_remap_oldmem_pfn(struct vm_area_struct *vma,
>  	 * looping over all pages without a reason.
>  	 */
>  	down_read(&vmcore_cb_rwsem);
> -	if (!list_empty(&vmcore_cb_list) || vmcore_cb_unstable)
> +	if (!list_empty(&vmcore_cb_list))
>  		ret = remap_oldmem_pfn_checked(vma, from, pfn, size, prot);
>  	else
>  		ret = remap_oldmem_pfn_range(vma, from, pfn, size, prot);
> 
> base-commit: debe436e77c72fcee804fb867f275e6d31aa999c
> -- 
> 2.31.1
> 




More information about the kexec mailing list