[LEDE-DEV] The procd daemon does not handle correctly the instances

Thu Sep 29 01:44:46 PDT 2016

yep, saw that while looking at the procd code this morning.

On 29/09/2016 10:36, Lebleu Pierre wrote:
> In fact, the code already exists for the method "add".
> 
> When we call the method "set", it does a vlist_update(), instance_add() and then vlist_flush().
> 
> For the method "add", it only update the PID as you said so.
> 
> I just purpose to call the method "add" rather than "set" when we specify the instance.
> 
> 
> Pierre
> 
> -----Original Message-----
> From: John Crispin [mailto:john at phrozen.org] 
> Sent: donderdag 29 september 2016 7:43
> To: Lebleu Pierre <Pierre.Lebleu at technicolor.com>; lede-dev at lists.infradead.org
> Subject: Re: [LEDE-DEV] The procd daemon does not handle correctly the instances
> 
> this might actually be solvable int he scripts, will need to take a closer look today
> 
> 	John
> 
> On 29/09/2016 07:20, John Crispin wrote:
>> procd uses a vlist to store instances in. if oe changes all are 
>> restarted. we need to change this as per my last mail to only restart 
>> those that actually changed.
>>
>> 	John
>>
>> On 28/09/2016 15:44, Lebleu Pierre wrote:
>>> Hi,
>>>
>>> Thank you for your answer.
>>>
>>> First, I am going to clarify a little bit what I am trying to do.
>>>
>>> I would like to have more than one instance for a given daemon (here, dnsmasq).
>>> I would like to stop one instance when I need.
>>> I would like to start one instance without stopping the other one.
>>>
>>> root at LEDE:~# ps w | grep dnsmasq
>>> 11633 dnsmasq   1560 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.main -k -x /var/run/dnsmasq/dnsmasq.main.pid
>>> 11634 dnsmasq   1556 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.second -k -x /var/run/dnsmasq/dnsmasq.second.pid
>>> 11636 root      1556 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.main -k -x /var/run/dnsmasq/dnsmasq.main.pid
>>> 11660 root      1712 R    grep dnsmasq
>>> root at LEDE:~# /etc/init.d/dnsmasq start main root at LEDE:~# ps w | grep 
>>> dnsmasq
>>> 11633 dnsmasq   1560 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.main -k -x /var/run/dnsmasq/dnsmasq.main.pid
>>> 11636 root      1556 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.main -k -x /var/run/dnsmasq/dnsmasq.main.pid
>>> 11715 root      1712 R    grep dnsmasq
>>> root at LEDE:~# /etc/init.d/dnsmasq start second root at LEDE:~# ps w | 
>>> grep dnsmasq
>>> 11753 dnsmasq   1556 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.second -k -x /var/run/dnsmasq/dnsmasq.second.pid
>>> 11757 root      1712 R    grep dnsmasq
>>>
>>> As you can see, when I want to start the second instance : it starts it but it also kills the first instance...
>>>
>>> I had a look at the procd code and it seems it does not flush the other instance if we call the method "add". Am I right ?
>>>
>>> Regarding the procd.sh file, it seems there is a lack of something. Indeed, we can call "stop myinstance" but we can't call "start myinstance".
>>>
>>> Here, please find my modification :
>>>
>>> diff --git a/package/base-files/files/etc/rc.common 
>>> b/package/base-files/files/etc/rc.common
>>> index e0de073..8cc6e69 100755
>>> --- a/package/base-files/files/etc/rc.common
>>> +++ b/package/base-files/files/etc/rc.common
>>> @@ -102,9 +102,11 @@ ${INIT_TRACE:+set -x}
>>>         . $IPKG_INSTROOT/lib/functions/procd.sh
>>>         basescript=$(readlink "$initscript")
>>>         rc_procd() {
>>> +               local method="set"
>>> +               [ -n "$2" ] && method="add"
>>>                 procd_open_service "$(basename ${basescript:-$initscript})" "$initscript"
>>>                 "$@"
>>> -               procd_close_service
>>> +               procd_close_service $method
>>>         }
>>>  
>>>         start() {
>>> diff --git a/package/system/procd/files/procd.sh 
>>> b/package/system/procd/files/procd.sh
>>> index fa6f8a9..1af9c7b 100644
>>> --- a/package/system/procd/files/procd.sh
>>> +++ b/package/system/procd/files/procd.sh
>>> @@ -72,11 +72,15 @@ _procd_open_service() {  }
>>>  
>>>  _procd_close_service() {
>>> +       local method="set"
>>> +       [ -n "$1" ] && method="$1"
>>> +
>>>         json_close_object
>>>         _procd_open_trigger
>>>         service_triggers
>>>         _procd_close_trigger
>>> -       _procd_ubus_call set
>>> +
>>> +       _procd_ubus_call $method
>>>  }
>>>  
>>>  _procd_add_array_data() {
>>>
>>>
>>> With this patch, it seems to work.
>>>
>>> Pierre
>>> -----Original Message-----
>>> From: John Crispin [mailto:john at phrozen.org]
>>> Sent: dinsdag 27 september 2016 17:48
>>> To: Lebleu Pierre <Pierre.Lebleu at technicolor.com>; 
>>> lede-dev at lists.infradead.org
>>> Subject: Re: [LEDE-DEV] The procd daemon does not handle correctly 
>>> the instances
>>>
>>>
>>>
>>> On 27/09/2016 17:34, Lebleu Pierre wrote:
>>>> Hi all,
>>>>
>>>> I found a bug in the daemon process management. Indeed, when we have several instances of one daemon such as "dnsmasq" and we want to restart only one instance, the other instance is killed by procd. It seems procd updates the second instance and then, kill it.
>>>>
>>>> Please find the patch :
>>>> diff --git a/service/service.c b/service/service.c index 
>>>> 0796adb..9ed07da 100644
>>>> --- a/service/service.c
>>>> +++ b/service/service.c
>>>> @@ -132,8 +132,6 @@ service_update(struct service *s, struct blob_attr **tb, bool add)
>>>>         }
>>>>
>>>>         if (tb[SERVICE_SET_INSTANCES]) {
>>>> -               if (!add)
>>>> -                       vlist_update(&s->instances);
>>>>                 blobmsg_for_each_attr(cur, tb[SERVICE_SET_INSTANCES], rem) {
>>>>                         service_instance_add(s, cur);
>>>>                 }
>>>
>>> this bit is certainly not correct. it simply deactivates the feature for all services. the actual problem is that procd stores the instances inside a vlist. this has an update mechanism triggered inside service_instance_update().
>>>
>>> what you want to do is compare in_o and in_n and only trigger
>>> instance_update() if there really was a change. if there is no change then simply migrate the pid over to the new instance.
>>>
>>>
>>>> @@ -238,7 +236,7 @@ service_handle_set(struct ubus_context *ctx, struct ubus_object *obj,
>>>>         int ret;
>>>>
>>>>         blobmsg_parse(service_set_attrs, __SERVICE_SET_MAX, tb, blob_data(msg), blob_len(msg));
>>>> -       cur = tb[SERVICE_ATTR_NAME];
>>>> +       cur = tb[SERVICE_SET_NAME];
>>>>         if (!cur)
>>>>                 return UBUS_STATUS_INVALID_ARGUMENT;
>>>
>>> this is a copy paste error but luckily does not cause an error as SET and ATTR both eval to 0. i have merged this part as a fix.
>>>
>>> 	John
>>>
>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Pierre
>>>>
>>>>
>>>> _______________________________________________
>>>> Lede-dev mailing list
>>>> Lede-dev at lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/lede-dev
>>>>