[RFC] New interface for dm-io to handle timed requests

Some days ago we proposed an extension to the device mapper that allows a timeout to be specified after which a given request returns as successful, even if some of the target devices have not responded by then. Since we cannot return a request to the upper layers while some I/O is still running and possibly modifying the referenced pages, we also need a way to handle those requests. The ideal solution would be an interface in the block layer that allows us to cancel any submitted request. But since such a change will take quite a lot of discussion and work, we want to emulate this behavior in the dm core for now.

The rough idea is as follows:
- The dm core has to keep track of running I/Os, so each client has to
  create a dm_io_client structure by calling dm_io_client_create.
  This is also required for better scaling of targets that use
  dm-io, since it allows memory pools private to each target
  instance.
- All I/O is submitted via dm_io. Details such as the timeout, which
  callback function to use, etc. are passed in a struct dm_io_control.
- The notify function will be called multiple times, usually once for
  each region. It's the job of the client to wait for all regions to
  complete.
- The state of a region can be OK, TIMEOUT, CANCELLED or ERROR. If the
  state is TIMEOUT, the I/O is still running and can complete later by
  itself. In that case the callback is called again with the new
  state.
  If the client doesn't want to wait, it can call
  dm_io_cancel_by_device or dm_io_cancel_by_handle to cancel the
  outstanding I/O.
- Once all regions have returned with a state of OK, CANCELLED or ERROR,
  the I/O request can be returned to the originator.
- Synchronous calls are made by setting the SYNC bit in the rw attribute
  (only one function call instead of multiple ones). The call will wait
  until all regions are done (but will still call the notify function if
  one is supplied). If no notify function is supplied, the caller will
  only know whether any region had an error or all are done.
  Without a notify function but with a timeout, the regions will be
  cancelled automatically.

Regards,
Stefan Bader

----------------------------------------------------------------------
Here comes the proposed new header:

#include <linux/bio.h>

#include "dm.h"

#define dm_io_page_list page_list

/*=============================================================================
 * Structures and functions to manage different I/O clients.
 *=============================================================================
 */
struct dm_io_client;

/*
 * NOTE: We need the number of requests (ios) that the target wants to have
 *       running in parallel on (devices) devices. The size argument is
 *       sort of bad. We need it to simulate cancellation, since there we
 *       have to have enough memory to store the bio_vecs' contents.
 *       Otherwise we would have to reserve the maximum memory size a
 *       bio_vec can address, which is a waste of memory.
 *       Another proposal would be:
 *       dm_io_client_create(dm_target *, uint, uint, dm_io_client **)
 */

/*-----------------------------------------------------------------------------
 * Register as a new I/O client.
 *
 * Arguments: devices  = how many devices will be used for each request.
 *            min_ios  = the minimum number of I/O requests that should run
 *                       in parallel.
 *            max_size = the biggest amount of memory that will be packed into
 *                       one bio_vec.
 *            cl       = address into which the pointer to the new dm_io_client
 *                       will be written.
 *
 * Returns: 0            on success
 *          -ENOMEM      if there is not enough memory to build all memory
 *                       pools and data structures.
 *-----------------------------------------------------------------------------
 */
int dm_io_client_create(
	unsigned int		devices,
	unsigned int		min_ios,
	unsigned int		max_size,
	struct dm_io_client **	cl);

/*-----------------------------------------------------------------------------
 * Unregister as a client.
 *
 * Arguments: cl = pointer to the client context to release.
 *-----------------------------------------------------------------------------
 */
void dm_io_client_destroy(struct dm_io_client *cl);
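To illustrate, a target under this proposal might register roughly like this (a sketch only; the helper names and the 2/16/64 KiB parameters are made up):

```c
/* Hypothetical mirror target setup: 2 devices per request, at least
 * 16 requests in flight, bio_vecs packing at most 64 KiB each. */
static struct dm_io_client *mirror_client;

static int mirror_io_setup(void)
{
	int r = dm_io_client_create(2, 16, 65536, &mirror_client);

	if (r)
		return r;	/* typically -ENOMEM */
	/* ... submit I/O via dm_io() with ctrl.client = mirror_client ... */
	return 0;
}

static void mirror_io_teardown(void)
{
	dm_io_client_destroy(mirror_client);
}
```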


/*=============================================================================
 * Structures and functions to do the actual I/O. The dm_io_region is a
 * container to pass in the destination(s) for write requests and the
 * source for read requests.
 *=============================================================================
 */
struct dm_io_region {
	struct block_device *	bdev;
	sector_t		sector;
	sector_t		count;
};

/*
 * The dm_io_handle is in place for future extensions where it is necessary
 * to identify a certain I/O job in calls to dm_io functions.
 */
struct dm_io_handle;
struct dm_io_region_state {
	unsigned int		index;
	enum {
		OK,
		TIMEOUT,
		CANCELLED,
		ERROR,
	}			state;
	int			error_code;
	struct dm_io_handle *	hdl;
};

/*
 * Note: It is guaranteed that the contents of region_state will not change
 *       while in the notify function.
 * Note: The dm_io_handle is only valid during the call. If the caller stores
 *       it somewhere else it has to use dm_io_handle_get().
 */
typedef void (*dm_io_notify_fn)(
		struct dm_io_region_state *state,
		void *context);

struct dm_io_page_list {
	struct dm_io_page_list *	next;
	struct page *			page;
};

struct dm_io_memory {
	enum {
		IO_PAGE_LIST,
		IO_BVEC,
		IO_VM,
	}						type;
	union {
		void *				vma;
		struct bio_vec *		bv;
		struct dm_io_page_list *	pl;
	}						ptr;
	unsigned int					offset;
};

/*
 * Optional flags for dm_io_control:
 */
#define DM_IO_CANCEL_ON_TIMEOUT		1

struct dm_io_control {
	struct dm_io_memory	memory;
	int			rw;		/* SYNC flag supported... */
	dm_io_notify_fn		notify;
	void *			context;
	struct dm_io_client *	client;
	unsigned long		timeout;	/* What time base (seconds)? */
	unsigned int		flags;
};

/*
 * Note: If the caller supplies a place to store the io_handle it has to
 *       release it by calling dm_io_handle_put().
 * Note: When issuing a SYNC I/O the call will only return when all I/O
 *       has completed, but the notify function is called just as it
 *       would be for asynchronous calls.
 */
int dm_io(
	struct dm_io_control *	ctrl,
	unsigned int		num_regions,
	struct dm_io_region *	regions,
	struct dm_io_handle **	hdl);
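As an example of the synchronous mode described above, a two-way mirrored write might look roughly like this (a sketch: the WRITE/SYNC flag encoding and the timeout unit are still open questions, and the function name is invented):

```c
static int write_mirrored(struct dm_io_client *client,
			  struct dm_io_region regions[2],
			  struct dm_io_page_list *pl)
{
	struct dm_io_control ctrl = {
		.memory		= { .type = IO_PAGE_LIST, .ptr.pl = pl },
		.rw		= WRITE | SYNC,	/* encoding to be settled */
		.notify		= NULL,		/* only overall result wanted */
		.client		= client,
		.timeout	= 30,		/* time base under discussion */
		.flags		= 0,
	};

	/*
	 * SYNC: dm_io() returns only when all regions are done. With a
	 * timeout but no notify fn, late regions are cancelled for us.
	 */
	return dm_io(&ctrl, 2, regions, NULL);
}
```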

/*
 * Since the interface allows references to the io handle to be passed to
 * the caller, we need to supply a way to manage them.
 * The *_get variant might be unnecessary but IMHO it should be there to
 * allow clients to store the reference in additional locations. Comments?
 */
struct dm_io_handle *dm_io_handle_get(struct dm_io_handle *io);
struct dm_io_handle *dm_io_handle_put(struct dm_io_handle *io);

/*
 * Cancellation functions for several I/O entities.
 */
int dm_io_cancel_by_device(struct dm_io_client *cl, struct block_device *bdev);
int dm_io_cancel_by_handle(struct dm_io_client *cl, struct dm_io_handle *hdl);
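A notify function that gives up on slow devices, as outlined at the top of the mail, could then look like this (sketch; struct my_job and its fields are hypothetical client-side bookkeeping):

```c
static void my_notify(struct dm_io_region_state *state, void *context)
{
	struct my_job *job = context;	/* hypothetical client structure */

	if (state->state == TIMEOUT) {
		/*
		 * Don't wait for the slow device: cancel the remaining I/O.
		 * The callback will fire again with CANCELLED (or OK, if
		 * the region completed before the cancel took effect).
		 */
		dm_io_cancel_by_handle(job->client, state->hdl);
		return;
	}
	/* OK/CANCELLED/ERROR are final; account this region as done... */
}
```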

--

dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
