If the userspace process crashes while we send the nl msg, it is possible
that the cmd in curr_nl_cmd of tcmu_dev never gets reset to 0, and
other commands return busy after the userspace process is restarted.
More details below:
/backstores/user:file/file> set attribute dev_size=2048
Cannot set attribute dev_size: [Errno 3] No such process
/backstores/user:file/file> set attribute dev_size=2048
Cannot set attribute dev_size: [Errno 16] Device or resource busy
with the following kernel messages:
[173605.747169] Unable to reconfigure device
[173616.686674] tcmu daemon: command reply support 1.
[173623.866978] netlink cmd 3 already executing on file
[173623.866984] Unable to reconfigure device
Also, it is not safe to leave the nl_cmd on the list without ever deleting it.
This patch removes the nl_cmd from the list and clears its data if it is
not sent successfully.
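
The failure sequence above can be reproduced with a small userspace model.
This is only an illustrative sketch, not the kernel code: the model_* names
are hypothetical, the pending list is reduced to a flag, a successful send
is assumed to be completed immediately by a reply, and the special case for
TCMU_CMD_ADDED_DEVICE is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the per-device command state. */
struct model_nl_cmd {
	int cmd;	/* 0 means no command is in flight */
	bool on_list;	/* stands in for membership in the pending list */
};

struct model_dev {
	struct model_nl_cmd curr_nl_cmd;
};

/* Refuse a new command while an earlier one is still recorded. */
static int model_init_cmd(struct model_dev *dev, int cmd)
{
	assert(dev != NULL && cmd != 0);

	if (dev->curr_nl_cmd.cmd != 0)
		return -EBUSY;

	dev->curr_nl_cmd.cmd = cmd;
	dev->curr_nl_cmd.on_list = true;
	return 0;
}

/* Drop the command from the (modelled) list and clear its data. */
static void model_destroy_cmd(struct model_dev *dev)
{
	memset(&dev->curr_nl_cmd, 0, sizeof(dev->curr_nl_cmd));
}

/*
 * send_ret simulates the result of sending the message.
 * cleanup_on_error selects the behaviour with (true) or without (false)
 * the cleanup this patch adds to the error path.
 */
static int model_send(struct model_dev *dev, int cmd, int send_ret,
		      bool cleanup_on_error)
{
	int ret = model_init_cmd(dev, cmd);

	if (ret)
		return ret;

	if (send_ret == 0) {
		/* Assumed: the reply arrives and completion resets the state. */
		model_destroy_cmd(dev);
		return 0;
	}

	if (cleanup_on_error)
		model_destroy_cmd(dev);

	return send_ret;
}
```

Without the cleanup, a send that fails with -ESRCH leaves cmd set, so the
next attempt returns -EBUSY, matching the two errors shown above. With the
cleanup, the retry goes through once the daemon is back.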
Signed-off-by: Li Zhong <lizhongfs@gmail.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
return 0;
}
+static void tcmu_destroy_genl_cmd_reply(struct tcmu_dev *udev)
+{
+	struct tcmu_nl_cmd *nl_cmd = &udev->curr_nl_cmd;
+
+	if (!tcmu_kern_cmd_reply_supported)
+		return;
+
+	if (udev->nl_reply_supported <= 0)
+		return;
+
+	mutex_lock(&tcmu_nl_cmd_mutex);
+
+	list_del(&nl_cmd->nl_list);
+	memset(nl_cmd, 0, sizeof(*nl_cmd));
+
+	mutex_unlock(&tcmu_nl_cmd_mutex);
+}
+
static int tcmu_wait_genl_cmd_reply(struct tcmu_dev *udev)
{
struct tcmu_nl_cmd *nl_cmd = &udev->curr_nl_cmd;
 	if (ret == 0 ||
 	    (ret == -ESRCH && cmd == TCMU_CMD_ADDED_DEVICE))
 		return tcmu_wait_genl_cmd_reply(udev);
+	else
+		tcmu_destroy_genl_cmd_reply(udev);
 	return ret;
}