Dan,

You have indeed given me a good amount of homework. Anyway, I will
address everything according to your comments and suggestions and
resubmit to the "real" part of the kernel and not staging.

Thank you very much.

Regards,
Petros

On Mon, 2015-06-01 at 12:18 +0300, Dan Carpenter wrote:
> On Sun, May 31, 2015 at 02:43:16PM -0500, Petros Koutoupis wrote:
> > Attached is a patch for two modules: RapidDisk & RapidCache. RapidDisk is a Linux
> > RAM drive module which allows the user to dynamically create, remove, and resize
> > RAM-based block devices. RapidDisk is designed to work with both volatile and
> > non-volatile memory. In the case of volatile memory, memory is allocated only when
> > needed. The RapidCache module in turn utilizes a RapidDisk volume as a FIFO
> > Write-Through caching node to a slower block device.
> >
> > Signed-off-by: Petros Koutoupis <petros@xxxxxxxxxxxxxxxxxxx>
>
> This feels like it could be merged without going through staging.  It
> looks like ok code to me.
>
> >
> > ---
> >  Kconfig                           |    2
> >  Makefile                          |    1
> >  rapiddisk/Documentation/rxdsk.txt |   74 ++
> >  rapiddisk/Kconfig                 |   10
> >  rapiddisk/Makefile                |    2
> >  rapiddisk/rxcache.c               | 1181 ++++++++++++++++++++++++++++++++++++++
> >  rapiddisk/rxcommon.h              |   22
> >  rapiddisk/rxdsk.c                 |  799 +++++++++++++++++++++++++
> >  8 files changed, 2091 insertions(+)
> >
> > diff -uNpr linux-next.orig/drivers/staging/Kconfig linux-next/drivers/staging/Kconfig
> > --- linux-next.orig/drivers/staging/Kconfig	2015-05-30 13:37:03.929726967 -0500
> > +++ linux-next/drivers/staging/Kconfig	2015-05-31 13:36:08.673757913 -0500
> > @@ -112,4 +112,6 @@ source "drivers/staging/fsl-mc/Kconfig"
> >
> >  source "drivers/staging/wilc1000/Kconfig"
> >
> > +source "drivers/staging/rapiddisk/Kconfig"
> > +
> >  endif # STAGING
> > diff -uNpr linux-next.orig/drivers/staging/Makefile linux-next/drivers/staging/Makefile
> > --- linux-next.orig/drivers/staging/Makefile	2015-05-30 13:37:03.921726968 -0500
> > +++ linux-next/drivers/staging/Makefile	2015-05-31 13:36:08.753757911 -0500
> > @@ -48,3 +48,4 @@ obj-$(CONFIG_COMMON_CLK_XLNX_CLKWZRD) +=
> >  obj-$(CONFIG_FB_TFT) += fbtft/
> >  obj-$(CONFIG_FSL_MC_BUS) += fsl-mc/
> >  obj-$(CONFIG_WILC1000) += wilc1000/
> > +obj-$(CONFIG_RXDSK) += rapiddisk/
> > diff -uNpr linux-next.orig/drivers/staging/rapiddisk/Documentation/rxdsk.txt linux-next/drivers/staging/rapiddisk/Documentation/rxdsk.txt
> > --- linux-next.orig/drivers/staging/rapiddisk/Documentation/rxdsk.txt	1969-12-31 18:00:00.000000000 -0600
> > +++ linux-next/drivers/staging/rapiddisk/Documentation/rxdsk.txt	2015-05-31 13:38:29.893753897 -0500
> > @@ -0,0 +1,74 @@
> > +RapidDisk (rxdsk) RAM disk and RapidCache (rxcache) Linux modules
> > +
> > +== Description ==
> > +
> > +RapidDisk or rxdsk was designed to be used in high performing environments
> > +and has been designed with simplicity in mind. Utilizing a user land binary,
> > +the system administrator is capable of dynamically adding new RAM based
> > +block devices of varying sizes, removing existing ones to even listing all
> > +existing RAM block devices. The rxdsk module has been designed to allocate
> > +from the system's memory pages and is capable of addressing memory to
> > +support Gigabytes if not a Terabyte of available Random Access Memory.
> > + > > +RapidCache or rxcache is designed to leverage the high speed performing > > +technologies of the RapidDisk RAM disk and utilizing the Device Mapper > > +framework, map an rxdsk volume to act as a block device's Write/Read-through > > +cache. This can significantly boost the performance of a local or remote > > +disk device. > > + > > + > > +== Module Parameters == > > + > > +RapidDisk > > +--------- > > +max_rxcnt: Total RAM Disk devices available for use. (Default = 128 = MAX) (int) > > +max_sectors: Maximum sectors (in KB) for the request queue. (Default = 127) (int) > > +nr_requests: Number of requests at a given time for the request queue. (Default = 128) (int) > > + > > +RapidCache > > +--------- > > +None. > > + > > + > > +== Usage == > > + > > +RapidDisk > > +--------- > > +It is advised to utilize the userland utility, rxadm, but this is essentially what is > > +written to the /proc/rxctl proc file to manage rxdsk volumes: > > + > > +Attach a new rxdsk volume by typing (size in sectors): > > + # echo "rxdsk attach 0 8192" > /proc/rxctl > > + > > +Attach a non-volatile rxdsk volume by typing (starting / ending addresses > > +in decimal format): > > + # echo "rxdsk attach-nv 0 1234 56789" > /proc/rxctl > > + > > +Detach an existing rxdsk volume by typing: > > + # echo "rxdsk detach 0" > /proc/rxctl > > + > > +Resize an existing rxdsk volume by typing (size in sectors): > > + # echo "rxdsk resize 0 65536" > /proc/rxctl > > + > > + > > +Note - the integer after the "rxdsk <command>" is the RapidDisk volume number. "0" > > +would signify "rxd0." > > + > > +RapidCache > > +--------- > > +It is advised to utilize the userland utility, rxadm, but this is essentially what is > > +sent to the dmsetup command: > > + > > +Map an existing rxdsk volume to a block device by typing: > > + # echo 0 4194303 rxcache /dev/sdb /dev/rxd0 0 8 196608|dmsetup create rxc0 > > + > > +Parameter 1: Start block of source volume (in sectors). > > +Parameter 2: End block of source volume (in sectors). > > +Parameter 4: Source volume. > > +Parameter 5: Cache volume. > > +Parameter 7: Caching block size. > > +Parameter 8: Cache size (in sectors). > > + > > +Unmap an rxdsk volume: > > + > > + # dmsetup remove rxc0 > > diff -uNpr linux-next.orig/drivers/staging/rapiddisk/Kconfig linux-next/drivers/staging/rapiddisk/Kconfig > > --- linux-next.orig/drivers/staging/rapiddisk/Kconfig 1969-12-31 18:00:00.000000000 -0600 > > +++ linux-next/drivers/staging/rapiddisk/Kconfig 2015-05-31 13:36:08.753757911 -0500 > > @@ -0,0 +1,10 @@ > > +config RXDSK > > + tristate "RapidDisk Enhanced Linux RAM disk solution" > > + depends on BLOCK && BLK_DEV_DM > > + default N > > + help > > + Creates virtual block devices called /dev/rxd(X = 0, 1, ...). > > + Supports both volatile and non-volatile memory. And can map > > + rxdsk volumes as caching nodes to slower drives. 
> > + > > + Project home: http://www.rapiddisk.org/ > > diff -uNpr linux-next.orig/drivers/staging/rapiddisk/Makefile linux-next/drivers/staging/rapiddisk/Makefile > > --- linux-next.orig/drivers/staging/rapiddisk/Makefile 1969-12-31 18:00:00.000000000 -0600 > > +++ linux-next/drivers/staging/rapiddisk/Makefile 2015-05-31 13:36:08.753757911 -0500 > > @@ -0,0 +1,2 @@ > > +obj-$(CONFIG_RXDSK) += rxdsk.o > > +obj-$(CONFIG_RXDSK) += rxcache.o > > diff -uNpr linux-next.orig/drivers/staging/rapiddisk/rxcache.c linux-next/drivers/staging/rapiddisk/rxcache.c > > --- linux-next.orig/drivers/staging/rapiddisk/rxcache.c 1969-12-31 18:00:00.000000000 -0600 > > +++ linux-next/drivers/staging/rapiddisk/rxcache.c 2015-05-31 13:36:08.753757911 -0500 > > @@ -0,0 +1,1181 @@ > > +/******************************************************* > > + ** Copyright (c) 2011-2014 Petros Koutoupis > > + ** All rights reserved. > > + ** > > + ** filename: rxcache.c > > + ** description: Device mapper target for block-level disk > > + ** write-through and read-ahead caching. This module > > + ** is forked from Flashcache-wt: > > + ** Copyright 2010 Facebook, Inc. > > + ** Author: Mohan Srinivasan (mohan@xxxxxxxxxxxx) > > + ** > > + ** created: 3Dec11, petros@xxxxxxxxxxxxxxxxxxx > > + ** > > + ** This file is licensed under GPLv2. > > + ** > > + ******************************************************/ > > + > > +#include <asm/atomic.h> > > +#include <linux/module.h> > > +#include <linux/init.h> > > +#include <linux/list.h> > > +#include <linux/blkdev.h> > > +#include <linux/bio.h> > > +#include <linux/vmalloc.h> > > +#include <linux/slab.h> > > +#include <linux/hash.h> > > +#include <linux/spinlock.h> > > +#include <linux/workqueue.h> > > +#include <linux/pagemap.h> > > +#include <linux/random.h> > > +#include <linux/version.h> > > +#include <linux/seq_file.h> > > +#include <linux/hardirq.h> > > +#include <asm/kmap_types.h> > > +#include <linux/dm-io.h> > > +#include <linux/device-mapper.h> > > +#include <linux/bio.h> > > +#include "rxcommon.h" > > + > > +/* Like ASSERT() but always compiled in */ > > Just rename VERIFY() to ASSERT(). > > + > > +#define VERIFY(x) do { \ > > + if (unlikely(!(x))) { \ > > + dump_stack(); \ > > + panic("VERIFY: assertion (%s) failed at %s (%d)\n", \ > > + #x, __FILE__, __LINE__); \ > > + } \ > > +} while (0) > > + > > +#define DM_MSG_PREFIX "rxc" > > +#define RxPREFIX "rxc" > > +#define DRIVER_DESC "RapidCache (rxc) DM target is a write-through caching target with RapidDisk volumes." > > > Ugh. Too many levels of abstraction. These are nice if your hobby is > grepping but otherwise they are pointless. > > The debug printks could be cleaned up or deleted but that's like 10 > minutes of work. 
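>
> If any of that debug output is worth keeping, one option (untested) is
> to convert it to pr_debug(), which compiles away unless DEBUG or
> dynamic debug is enabled, and then delete the RxDEBUG/DPRINTK wrapper
> and the RxPREFIX define outright:
>
>         pr_debug("%s: sector %llu, size %u\n", __func__,
>                  (unsigned long long)bio->bi_iter.bi_sector,
>                  bio->bi_iter.bi_size);
>
> A pr_fmt() define at the top of the file can supply the "rxc: " prefix
> in one place if you still want it.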
> > > + > > +#define READCACHE 1 > > +#define WRITECACHE 2 > > +#define READSOURCE 3 > > +#define WRITESOURCE 4 > > +#define READCACHE_DONE 5 > > + > > +/* Default cache parameters */ > > +#define DEFAULT_CACHE_ASSOC 512 > > +#define DEFAULT_BLOCK_SIZE 8 /* 4 KB */ > > +#define CONSECUTIVE_BLOCKS 512 > > + > > +/* States of a cache block */ > > +#define INVALID 0 > > +#define VALID 1 /* Valid */ > > +#define INPROG 2 /* IO (cache fill) is in progress */ > > +#define CACHEREADINPROG 3 /* cache read in progress, don't recycle */ > > +#define INPROG_INVALID 4 /* Write invalidated during a refill */ > > + > > +#define DEV_PATHLEN 128 > > + > > +#ifndef DM_MAPIO_SUBMITTED > > +#define DM_MAPIO_SUBMITTED 0 > > +#endif > > + > > +#define WT_MIN_JOBS 1024 > > + > > +/* Cache context */ > > +struct cache_c { > > struct cache_context. > > > + struct dm_target *tgt; > > + > > + struct dm_dev *disk_dev; > > + struct dm_dev *cache_dev; > > + > > + spinlock_t cache_spin_lock; > > + struct cacheblock *cache; > > + u_int8_t *cache_state; > > + u_int32_t *set_lru_next; > > + > > + struct dm_io_client *io_client; > > + sector_t size; > > + unsigned int assoc; > > + unsigned int block_size; > > + unsigned int block_shift; > > + unsigned int block_mask; > > + unsigned int consecutive_shift; > > + > > + wait_queue_head_t destroyq; > > + atomic_t nr_jobs; > > + > > + /* Stats */ > > + unsigned long reads; > > + unsigned long writes; > > + unsigned long cache_hits; > > + unsigned long replace; > > + unsigned long wr_invalidates; > > + unsigned long rd_invalidates; > > + unsigned long cached_blocks; > > + unsigned long cache_wr_replace; > > + unsigned long uncached_reads; > > + unsigned long uncached_writes; > > + unsigned long cache_reads, cache_writes; > > + unsigned long disk_reads, disk_writes; > > + > > + char cache_devname[DEV_PATHLEN]; > > + char disk_devname[DEV_PATHLEN]; > > +}; > > + > > +/* Cache block metadata structure */ > > +struct cacheblock { > > struct cache_block > > > + sector_t dbn; > > +}; > > + > > +/* Structure for a kcached job */ > > +struct kcached_job { > > + struct list_head list; > > + struct cache_c *dmc; > > + struct bio *bio; > > + struct dm_io_region disk; > > + struct dm_io_region cache; > > + int index; > > + int rw; > > + int error; > > +}; > > + > > +u_int64_t size_hist[33];' > > sector zero isn't used? That's really strange. > > > + > > +static struct workqueue_struct *_kcached_wq; > > +static struct work_struct _kcached_work; > > +static struct kmem_cache *_job_cache; > > +static mempool_t *_job_pool; > > + > > +static DEFINE_SPINLOCK(_job_lock); > > + > > +static LIST_HEAD(_complete_jobs); > > +static LIST_HEAD(_io_jobs); > > Don't name everything with a leading underscrore. Especially since it > is local to the file. > > > + > > + > > +/* > > + * Function Declarations > > + **/ > > I have avoided pointing out the obvious comments, but this one crosses > a line. Remove obvious comments. 
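>
> And to be concrete about the leading underscores above, plain names
> read fine for file-local state, e.g. (illustrative only):
>
>         static struct workqueue_struct *kcached_wq;
>         static struct kmem_cache *job_cache;
>         static mempool_t *job_pool;
>         static LIST_HEAD(complete_jobs);
>         static LIST_HEAD(io_jobs);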
> > > +static void cache_read_miss(struct cache_c *, struct bio *, int); > > +static void cache_write(struct cache_c *, struct bio *); > > +static int cache_invalidate_blocks(struct cache_c *, struct bio *); > > +static void rxc_uncached_io_callback(unsigned long, void *); > > +static void rxc_start_uncached_io(struct cache_c *, struct bio *); > > + > > + > > +/* > > + * Functions Definitions > > + **/ > > +int dm_io_async_bvec(unsigned int num_regions, struct dm_io_region *where, > > + int rw, struct bio *bio, io_notify_fn fn, void *context) > > +{ > > + struct kcached_job *job = (struct kcached_job *)context; > > + struct cache_c *dmc = job->dmc; > > + struct dm_io_request iorq; > > + > > + iorq.bi_rw = rw; > > + iorq.mem.type = DM_IO_BIO; > > + iorq.mem.ptr.bio = bio; > > + iorq.notify.fn = fn; > > + iorq.notify.context = context; > > + iorq.client = dmc->io_client; > > + > > + return dm_io(&iorq, num_regions, where, NULL); > > +} > > + > > +static int jobs_init(void) > > +{ > > + _job_cache = kmem_cache_create("kcached-jobs-wt", > > + sizeof(struct kcached_job), __alignof__(struct kcached_job), > > + 0, NULL); > > + if (!_job_cache) > > + return -ENOMEM; > > + > > + _job_pool = mempool_create(WT_MIN_JOBS, mempool_alloc_slab, > > + mempool_free_slab, _job_cache); > > + if (!_job_pool) { > > + kmem_cache_destroy(_job_cache); > > + return -ENOMEM; > > + } > > + > > + return 0; > > +} > > + > > +static void jobs_exit(void) > > +{ > > + BUG_ON(!list_empty(&_complete_jobs)); > > + BUG_ON(!list_empty(&_io_jobs)); > > + > > + mempool_destroy(_job_pool); > > + kmem_cache_destroy(_job_cache); > > + _job_pool = NULL; > > + _job_cache = NULL; > > +} > > + > > +/* Functions to push and pop a job onto the head of a given job list. */ > > +static inline struct kcached_job *pop(struct list_head *jobs) > > +{ > > + struct kcached_job *job = NULL; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&_job_lock, flags); > > + if (!list_empty(jobs)) { > > + job = list_entry(jobs->next, struct kcached_job, list); > > + list_del(&job->list); > > + } > > + spin_unlock_irqrestore(&_job_lock, flags); > > + return job; > > +} > > + > > +static inline void push(struct list_head *jobs, struct kcached_job *job) > > +{ > > + unsigned long flags; > > + > > + spin_lock_irqsave(&_job_lock, flags); > > + list_add_tail(&job->list, jobs); > > + spin_unlock_irqrestore(&_job_lock, flags); > > +} > > + > > +void rxc_io_callback(unsigned long error, void *context) > > +{ > > + struct kcached_job *job = (struct kcached_job *) context; > > + struct cache_c *dmc = job->dmc; > > + struct bio *bio; > > + int invalid = 0; > > + > > + VERIFY(job != NULL); > > + bio = job->bio; > > + VERIFY(bio != NULL); > > + DPRINTK("%s: %s: %s %llu(%llu->%llu,%llu)", RxPREFIX, __func__, > > + (job->rw == READ ? 
"READ" : "WRITE"), > > + bio->bi_iter.bi_sector, job->disk.sector, > > + job->cache.sector, job->disk.count); > > + if (error) > > + DMERR("%s: io error %ld", __func__, error); > > + if (job->rw == READSOURCE || job->rw == WRITESOURCE) { > > + spin_lock_bh(&dmc->cache_spin_lock); > > + if (dmc->cache_state[job->index] != INPROG) { > > + VERIFY(dmc->cache_state[job->index] == INPROG_INVALID); > > + invalid++; > > + } > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + if (error || invalid) { > > + if (invalid) > > + DMERR("%s: cache fill invalidation, sector %lu, size %u", > > + __func__, > > + (unsigned long)bio->bi_iter.bi_sector, > > + bio->bi_iter.bi_size); > > + bio_endio(bio, error); > > + spin_lock_bh(&dmc->cache_spin_lock); > > + dmc->cache_state[job->index] = INVALID; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + goto out; > > + } else { > > + /* Kick off the write to the cache */ > > + job->rw = WRITECACHE; > > + push(&_io_jobs, job); > > + queue_work(_kcached_wq, &_kcached_work); > > + return; > > + } > > + } else if (job->rw == READCACHE) { > > + spin_lock_bh(&dmc->cache_spin_lock); > > + VERIFY(dmc->cache_state[job->index] == INPROG_INVALID || > > + dmc->cache_state[job->index] == CACHEREADINPROG); > > + if (dmc->cache_state[job->index] == INPROG_INVALID) > > + invalid++; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + if (!invalid && !error) { > > + /* Complete the current IO successfully */ > > + bio_endio(bio, 0); > > + spin_lock_bh(&dmc->cache_spin_lock); > > + dmc->cache_state[job->index] = VALID; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + goto out; > > + } > > + /* error || invalid || bounce back to source device */ > > + job->rw = READCACHE_DONE; > > + push(&_complete_jobs, job); > > + queue_work(_kcached_wq, &_kcached_work); > > + return; > > + } else { > > + VERIFY(job->rw == WRITECACHE); > > + bio_endio(bio, 0); > > + spin_lock_bh(&dmc->cache_spin_lock); > > + VERIFY((dmc->cache_state[job->index] == INPROG) || > > + (dmc->cache_state[job->index] == INPROG_INVALID)); > > + if (error || dmc->cache_state[job->index] == INPROG_INVALID) { > > + dmc->cache_state[job->index] = INVALID; > > + } else { > > + dmc->cache_state[job->index] = VALID; > > + dmc->cached_blocks++; > > + } > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + DPRINTK("%s: Cache Fill: Block %llu, index = %d: Cache state = %d", > > + RxPREFIX, dmc->cache[job->index].dbn, job->index, > > + dmc->cache_state[job->index]); > > + } > > +out: > > + mempool_free(job, _job_pool); > > + if (atomic_dec_and_test(&dmc->nr_jobs)) > > + wake_up(&dmc->destroyq); > > +} > > +EXPORT_SYMBOL(rxc_io_callback); > > + > > +static int do_io(struct kcached_job *job) > > +{ > > + int r = 0; > > + struct cache_c *dmc = job->dmc; > > + struct bio *bio = job->bio; > > + > > + VERIFY(job->rw == WRITECACHE); > > + /* Write to cache device */ > > + dmc->cache_writes++; > > + r = dm_io_async_bvec(1, &job->cache, WRITE, bio, > > + rxc_io_callback, job); > > + VERIFY(r == 0); /* dm_io_async_bvec() must always return 0 */ > > + return r; > > +} > > + > > +int rxc_do_complete(struct kcached_job *job) > > +{ > > + struct bio *bio = job->bio; > > + struct cache_c *dmc = job->dmc; > > + > > + VERIFY(job->rw == READCACHE_DONE); > > + DPRINTK("%s: %s: %llu", RxPREFIX, __func__, bio->bi_iter.bi_sector); > > + /* error || block invalidated while reading from cache */ > > + spin_lock_bh(&dmc->cache_spin_lock); > > + dmc->cache_state[job->index] = INVALID; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + mempool_free(job, _job_pool); 
> > + if (atomic_dec_and_test(&dmc->nr_jobs)) > > + wake_up(&dmc->destroyq); > > + /* Kick this IO back to the source bdev */ > > + rxc_start_uncached_io(dmc, bio); > > + return 0; > > +} > > +EXPORT_SYMBOL(rxc_do_complete); > > + > > +static void process_jobs(struct list_head *jobs, > > + int (*fn)(struct kcached_job *)) > > +{ > > + struct kcached_job *job; > > + > > + while ((job = pop(jobs))) > > + (void)fn(job); > > +} > > + > > +static void do_work(struct work_struct *work) > > +{ > > + process_jobs(&_complete_jobs, rxc_do_complete); > > + process_jobs(&_io_jobs, do_io); > > +} > > + > > +static int kcached_init(struct cache_c *dmc) > > +{ > > + init_waitqueue_head(&dmc->destroyq); > > + atomic_set(&dmc->nr_jobs, 0); > > + return 0; > > +} > > + > > +void kcached_client_destroy(struct cache_c *dmc) > > +{ > > + /* Wait for completion of all jobs submitted by this client. */ > > + wait_event(dmc->destroyq, !atomic_read(&dmc->nr_jobs)); > > +} > > + > > +/* Map a block from the source device to a block in the cache device. */ > > +static unsigned long hash_block(struct cache_c *dmc, sector_t dbn) > > +{ > > + unsigned long set_number, value, tmpval; > > + > > + value = (unsigned long) > > + (dbn >> (dmc->block_shift + dmc->consecutive_shift)); > > Confusing indenting. The cast is pointless. > > > + tmpval = value; > > tmpval is not needed? > > > + set_number = do_div(tmpval, (dmc->size >> dmc->consecutive_shift)); > > + DPRINTK("%s: Hash: %llu(%lu)->%lu", RxPREFIX, dbn, value, set_number); > > + return set_number; > > +} > > + > > +static int find_valid_dbn(struct cache_c *dmc, sector_t dbn, > > + int start_index, int *index) > > +{ > > + int i; > > + int end_index = start_index + dmc->assoc; > > + > > + for (i = start_index ; i < end_index ; i++) { > > + if (dbn == dmc->cache[i].dbn && (dmc->cache_state[i] == VALID || > > + dmc->cache_state[i] == CACHEREADINPROG || > > + dmc->cache_state[i] == INPROG)) { > > > A lot of your indenting is messed up. This should be: > > if (dbn == dmc->cache[i].dbn && > (dmc->cache_state[i] == VALID || > dmc->cache_state[i] == CACHEREADINPROG || > dmc->cache_state[i] == INPROG)) { > > Or you could reverse the first chunk: > > if (dbn != dmc->cache[i].dbn) > continue; > if (dmc->cache_state[i] == VALID || > dmc->cache_state[i] == CACHEREADINPROG || > > > > > > + *index = i; > > + return dmc->cache_state[i]; > > + } > > + } > > + return -1; > > Find ever reference to -1 and replace it with something meaningful. > > > +} > > + > > +static void find_invalid_dbn(struct cache_c *dmc, int start_index, int *index) > > +{ > > + int i; > > + int end_index = start_index + dmc->assoc; > > + > > + /* Find INVALID slot that we can reuse */ > > + for (i = start_index ; i < end_index ; i++) { > > + if (dmc->cache_state[i] == INVALID) { > > + *index = i; > > + return; > > + } > > + } > > This function should return success or failure. 
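>
> Something like this, say (untested):
>
>         static int find_invalid_dbn(struct cache_c *dmc, int start_index,
>                                     int *index)
>         {
>                 int end_index = start_index + dmc->assoc;
>                 int i;
>
>                 /* Find an INVALID slot that we can reuse */
>                 for (i = start_index; i < end_index; i++) {
>                         if (dmc->cache_state[i] == INVALID) {
>                                 *index = i;
>                                 return 0;
>                         }
>                 }
>                 return -ENOENT;
>         }
>
> Then the caller can test the return value instead of priming *index
> with -1.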
> > > +} > > + > > +static void find_reclaim_dbn(struct cache_c *dmc, int start_index, int *index) > > +{ > > + int i; > > + int end_index = start_index + dmc->assoc; > > + int set = start_index / dmc->assoc; > > + int slots_searched = 0; > > + > > + i = dmc->set_lru_next[set]; > > + while (slots_searched < dmc->assoc) { > > + VERIFY(i >= start_index); > > + VERIFY(i < end_index); > > + if (dmc->cache_state[i] == VALID) { > > + *index = i; > > + break; > > + } > > + slots_searched++; > > + i++; > > + if (i == end_index) > > + i = start_index; > > + } > > + i++; > > + if (i == end_index) > > + i = start_index; > > + dmc->set_lru_next[set] = i; > > +} > > + > > +/* dbn is the starting sector, io_size is the number of sectors. */ > > +static int cache_lookup(struct cache_c *dmc, struct bio *bio, int *index) > > +{ > > + sector_t dbn = bio->bi_iter.bi_sector; > > +#if defined RxC_DEBUG > > + int io_size = to_sector(bio->bi_iter.bi_size); > > +#endif > > Get rid of this ifdef. The DPRINTK() calls are ugly. It's hard to see > the actual code in the forest of DPRINTK calls. Let's fix the crashing > bugs then delete the debug code. > > > + unsigned long set_number = hash_block(dmc, dbn); > > + int invalid = -1, oldest_clean = -1; > > + int start_index; > > + int ret; > > + > > + start_index = dmc->assoc * set_number; > > + DPRINTK("%s: Cache read lookup : dbn %llu(%lu), set = %d", > > + RxPREFIX, dbn, io_size, set_number); > > + ret = find_valid_dbn(dmc, dbn, start_index, index); > > + if (ret == VALID || ret == INPROG || ret == CACHEREADINPROG) { > > + DPRINTK("%s: Cache read lookup: Block %llu(%lu): ret %d VALID/INPROG index %d", > > + RxPREFIX, dbn, io_size, ret, *index); > > + /* We found the exact range of blocks we are looking for */ > > + return ret; > > + } > > + DPRINTK("%s: Cache read lookup: Block %llu(%lu):%d INVALID", > > + RxPREFIX, dbn, io_size, ret); > > + VERIFY(ret == -1); > > + find_invalid_dbn(dmc, start_index, &invalid); > > + if (invalid == -1) { > > + /* Search for oldest valid entry */ > > + find_reclaim_dbn(dmc, start_index, &oldest_clean); > > + } > > + /* Cache miss : We can't choose an entry marked INPROG, > > + * but choose the oldest INVALID or the oldest VALID entry. 
*/ > > + *index = start_index + dmc->assoc; > > + if (invalid != -1) { > > + DPRINTK("%s: Cache read lookup MISS (INVALID): dbn %llu(%lu), set = %d, index = %d, start_index = %d", > > + RxPREFIX, dbn, io_size, set_number, > > + invalid, start_index); > > + *index = invalid; > > + } else if (oldest_clean != -1) { > > + DPRINTK("%s: Cache read lookup MISS (VALID): dbn %llu(%lu), set = %d, index = %d, start_index = %d", > > + RxPREFIX, dbn, io_size, set_number, > > + oldest_clean, start_index); > > + *index = oldest_clean; > > + } else { > > + DPRINTK("%s: Cache read lookup MISS (NOROOM): dbn %llu(%lu), set = %d", > > + RxPREFIX, dbn, io_size, set_number); > > + } > > + if (*index < (start_index + dmc->assoc)) > > + return INVALID; > > + else > > + return -1; > > +} > > + > > +static struct kcached_job *new_kcached_job(struct cache_c *dmc, > > + struct bio *bio, int index) > > +{ > > + struct kcached_job *job; > > + > > + job = mempool_alloc(_job_pool, GFP_NOIO); > > + if (job == NULL) > > + return NULL; > > + job->disk.bdev = dmc->disk_dev->bdev; > > + job->disk.sector = bio->bi_iter.bi_sector; > > + if (index != -1) > > + job->disk.count = dmc->block_size; > > + else > > + job->disk.count = to_sector(bio->bi_iter.bi_size); > > + job->cache.bdev = dmc->cache_dev->bdev; > > + if (index != -1) { > > + job->cache.sector = index << dmc->block_shift; > > + job->cache.count = dmc->block_size; > > + } > > + job->dmc = dmc; > > + job->bio = bio; > > + job->index = index; > > + job->error = 0; > > + return job; > > +} > > + > > +static void cache_read_miss(struct cache_c *dmc, struct bio *bio, int index) > > +{ > > + struct kcached_job *job; > > + > > + DPRINTK("%s: Cache Read Miss sector %llu %u bytes, index %d)", > > + RxPREFIX, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, index); > > + > > + job = new_kcached_job(dmc, bio, index); > > + if (unlikely(job == NULL)) { > > Don't add likely/unlikely unless you have benchmarked it. It's just > makes the code messy for no reason if there isn't a speed reason. > > > + DMERR("%s: Cannot allocate job\n", __func__); > > + spin_lock_bh(&dmc->cache_spin_lock); > > + dmc->cache_state[index] = INVALID; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + bio_endio(bio, -EIO); > > + } else { > > + job->rw = READSOURCE; /* Fetch data from the source device */ > > + DPRINTK("%s: Queue job for %llu", RxPREFIX, > > + bio->bi_iter.bi_sector); > > + atomic_inc(&dmc->nr_jobs); > > + dmc->disk_reads++; > > + dm_io_async_bvec(1, &job->disk, READ, > > + bio, rxc_io_callback, job); > > + } > > +} > > + > > +static void cache_read(struct cache_c *dmc, struct bio *bio) > > +{ > > + int index; > > + int res; > > + > > + DPRINTK("%s: Got a %s for %llu %u bytes)", RxPREFIX, > > + (bio_rw(bio) == READ ? "READ":"READA"), > > + bio->bi_iter.bi_sector, bio->bi_iter.bi_size); > > + > > + spin_lock_bh(&dmc->cache_spin_lock); > > + res = cache_lookup(dmc, bio, &index); > > + /* Cache Hit */ > > + if ((res == VALID) && > > + (dmc->cache[index].dbn == bio->bi_iter.bi_sector)) { > > Bad indenting. Extra parens. This isn't LISP. I think a couple years > ago there was an = vs == introduced this way. (I may have > misremembered). 
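>
> In other words, just:
>
>         if (res == VALID &&
>             dmc->cache[index].dbn == bio->bi_iter.bi_sector) {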
> > > + struct kcached_job *job; > > + > > + dmc->cache_state[index] = CACHEREADINPROG; > > + dmc->cache_hits++; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + DPRINTK("%s: Cache read: Block %llu(%lu), index = %d:%s", > > + RxPREFIX, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, > > + index, "CACHE HIT"); > > + job = new_kcached_job(dmc, bio, index); > > + if (unlikely(job == NULL)) { > > + /* Can't allocate job, bounce back error */ > > + DMERR("cache_read(_hit): Cannot allocate job\n"); > > + spin_lock_bh(&dmc->cache_spin_lock); > > + dmc->cache_state[index] = VALID; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + bio_endio(bio, -EIO); > > + } else { > > + job->rw = READCACHE; /* Fetch data from source device */ > > + DPRINTK("%s: Queue job for %llu", RxPREFIX, > > + bio->bi_iter.bi_sector); > > + atomic_inc(&dmc->nr_jobs); > > + dmc->cache_reads++; > > + dm_io_async_bvec(1, &job->cache, READ, bio, > > + rxc_io_callback, job); > > + } > > + return; > > + } > > + /* In all cases except for a cache hit (and VALID), test for potential > > + * invalidations that we need to do. */ > > Use checkpatch.pl --strict on this code and fix all the warnings. > > > + if (cache_invalidate_blocks(dmc, bio) > 0) { > > + /* A non zero return indicates an inprog invalidation */ > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + /* Start uncached IO */ > > These obvious comments are really distracting. > > > + rxc_start_uncached_io(dmc, bio); > > + return; > > + } > > + if (res == -1 || res >= INPROG) { > > + /* We either didn't find a cache slot in the set we were > > + * looking at or the block we are trying to read is being > > + * refilled into cache. */ > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + DPRINTK("%s: Cache read: Block %llu(%lu):%s", RxPREFIX, > > + bio->bi_iter.bi_sector, bio->bi_iter.bi_size, > > + "CACHE MISS & NO ROOM"); > > + /* Start uncached IO */ > > + rxc_start_uncached_io(dmc, bio); > > + return; > > + } > > + /* (res == INVALID) Cache Miss And we found cache blocks to replace > > + * Claim the cache blocks before giving up the spinlock */ > > + if (dmc->cache_state[index] == VALID) { > > + dmc->cached_blocks--; > > + dmc->replace++; > > + } > > + dmc->cache_state[index] = INPROG; > > + dmc->cache[index].dbn = bio->bi_iter.bi_sector; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + > > + DPRINTK("%s: Cache read: Block %llu(%lu), index = %d:%s", RxPREFIX, > > + bio->bi_iter.bi_sector, bio->bi_iter.bi_size, index, > > + "CACHE MISS & REPLACE"); > > + cache_read_miss(dmc, bio, index); > > +} > > + > > +static int cache_invalidate_block_set(struct cache_c *dmc, int set, > > + sector_t io_start, sector_t io_end, int rw, int *inprog_inval) > > +{ > > + int start_index, end_index, i; > > + int invalidations = 0; > > + > > + start_index = dmc->assoc * set; > > + end_index = start_index + dmc->assoc; > > + for (i = start_index ; i < end_index ; i++) { > > + sector_t start_dbn = dmc->cache[i].dbn; > > + sector_t end_dbn = start_dbn + dmc->block_size; > > + > > + if (dmc->cache_state[i] == INVALID || > > + dmc->cache_state[i] == INPROG_INVALID) > > + continue; > > + if ((io_start >= start_dbn && io_start < end_dbn) || > > + (io_end >= start_dbn && io_end < end_dbn)) { > > + /* We have a match */ > > + if (rw == WRITE) > > + dmc->wr_invalidates++; > > + else > > + dmc->rd_invalidates++; > > + invalidations++; > > + if (dmc->cache_state[i] == VALID) { > > + dmc->cached_blocks--; > > + dmc->cache_state[i] = INVALID; > > + DPRINTK("%s: Cache invalidate: Block %llu VALID", > > + 
RxPREFIX, start_dbn); > > + } else if (dmc->cache_state[i] >= INPROG) { > > + (*inprog_inval)++; > > + dmc->cache_state[i] = INPROG_INVALID; > > + DMERR("%s: sector %lu, size %lu, rw %d", > > + __func__, (unsigned long)io_start, > > + (unsigned long)io_end - (unsigned long)io_start, rw); > > + DPRINTK("%s: Cache invalidate: Block %llu INPROG", > > + RxPREFIX, start_dbn); > > + } > > + } > > + } > > + return invalidations; > > +} > > + > > +static int cache_invalidate_blocks(struct cache_c *dmc, struct bio *bio) > > +{ > > + sector_t io_start = bio->bi_iter.bi_sector; > > + sector_t io_end = bio->bi_iter.bi_sector + (to_sector(bio->bi_iter.bi_size) - 1); > > + int start_set, end_set; > > + int inprog_inval_start = 0, inprog_inval_end = 0; > > + > > + start_set = hash_block(dmc, io_start); > > + end_set = hash_block(dmc, io_end); > > + (void)cache_invalidate_block_set(dmc, start_set, io_start, io_end, > > + bio_data_dir(bio), &inprog_inval_start); > > Don't put the (void) here. It's just noise. It not done consistently > see two lines below. > > > + if (start_set != end_set) > > + cache_invalidate_block_set(dmc, end_set, io_start, io_end, > > + bio_data_dir(bio), &inprog_inval_end); > > + return (inprog_inval_start + inprog_inval_end); > > +} > > + > > +static void cache_write(struct cache_c *dmc, struct bio *bio) > > +{ > > + int index; > > + int res; > > + struct kcached_job *job; > > + > > + spin_lock_bh(&dmc->cache_spin_lock); > > + if (cache_invalidate_blocks(dmc, bio) > 0) { > > + /* A non zero return indicates an inprog invalidation */ > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + /* Start uncached IO */ > > + rxc_start_uncached_io(dmc, bio); > > + return; > > + } > > + res = cache_lookup(dmc, bio, &index); > > + VERIFY(res == -1 || res == INVALID); > > + if (res == -1) { > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + /* Start uncached IO */ > > + rxc_start_uncached_io(dmc, bio); > > + return; > > + } > > + if (dmc->cache_state[index] == VALID) { > > + dmc->cached_blocks--; > > + dmc->cache_wr_replace++; > > + } > > + dmc->cache_state[index] = INPROG; > > + dmc->cache[index].dbn = bio->bi_iter.bi_sector; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + job = new_kcached_job(dmc, bio, index); > > + if (unlikely(job == NULL)) { > > + DMERR("%s: Cannot allocate job\n", __func__); > > + spin_lock_bh(&dmc->cache_spin_lock); > > + dmc->cache_state[index] = INVALID; > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + bio_endio(bio, -EIO); > > + return; > > + } > > + job->rw = WRITESOURCE; /* Write data to the source device */ > > + DPRINTK("%s: Queue job for %llu", RxPREFIX, bio->bi_iter.bi_sector); > > + atomic_inc(&job->dmc->nr_jobs); > > + dmc->disk_writes++; > > + dm_io_async_bvec(1, &job->disk, WRITE, bio, > > + rxc_io_callback, job); > > + return; > > +} > > + > > +#define bio_barrier(bio) ((bio)->bi_rw & REQ_FLUSH) > > + > > +/* Decide the mapping and perform necessary cache operations > > + * for a bio request. */ > > +int rxc_map(struct dm_target *ti, struct bio *bio) > > +{ > > + struct cache_c *dmc = (struct cache_c *) ti->private; > > + int sectors = to_sector(bio->bi_iter.bi_size); > > + > > + if (sectors <= 32) > > + size_hist[sectors]++; > > + > > + DPRINTK("%s: Got a %s for %llu %u bytes)", RxPREFIX, > > + bio_rw(bio) == WRITE ? "WRITE" : (bio_rw(bio) == READ ? 
> > + "READ":"READA"), bio->bi_iter.bi_sector, bio->bi_iter.bi_size); > > + > > + if (bio_barrier(bio)) > > + return -EOPNOTSUPP; > > + > > + VERIFY(to_sector(bio->bi_iter.bi_size) <= dmc->block_size); > > + > > + if (bio_data_dir(bio) == READ) > > + dmc->reads++; > > + else > > + dmc->writes++; > > + > > + if (to_sector(bio->bi_iter.bi_size) != dmc->block_size) { > > + spin_lock_bh(&dmc->cache_spin_lock); > > + (void)cache_invalidate_blocks(dmc, bio); > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + /* Start uncached IO */ > > + rxc_start_uncached_io(dmc, bio); > > + } else { > > + if (bio_data_dir(bio) == READ) > > + cache_read(dmc, bio); > > + else > > + cache_write(dmc, bio); > > + } > > + return DM_MAPIO_SUBMITTED; > > +} > > +EXPORT_SYMBOL(rxc_map); > > + > > +static void rxc_uncached_io_callback(unsigned long error, void *context) > > +{ > > + struct kcached_job *job = (struct kcached_job *) context; > > + struct cache_c *dmc = job->dmc; > > + > > + spin_lock_bh(&dmc->cache_spin_lock); > > + if (bio_data_dir(job->bio) == READ) > > + dmc->uncached_reads++; > > + else > > + dmc->uncached_writes++; > > + (void)cache_invalidate_blocks(dmc, job->bio); > > + spin_unlock_bh(&dmc->cache_spin_lock); > > + bio_endio(job->bio, error); > > + mempool_free(job, _job_pool); > > + if (atomic_dec_and_test(&dmc->nr_jobs)) > > + wake_up(&dmc->destroyq); > > +} > > + > > +static void rxc_start_uncached_io(struct cache_c *dmc, struct bio *bio) > > +{ > > + int is_write = (bio_data_dir(bio) == WRITE); > > + struct kcached_job *job; > > + > > + job = new_kcached_job(dmc, bio, -1); > > + if (unlikely(job == NULL)) { > > + bio_endio(bio, -EIO); > > + return; > > + } > > + atomic_inc(&dmc->nr_jobs); > > + if (bio_data_dir(job->bio) == READ) > > + dmc->disk_reads++; > > + else > > + dmc->disk_writes++; > > + dm_io_async_bvec(1, &job->disk, ((is_write) ? WRITE : READ), > > + bio, rxc_uncached_io_callback, job); > > +} > > + > > +static inline int rxc_get_dev(struct dm_target *ti, char *pth, > > + struct dm_dev **dmd, char *dmc_dname, sector_t tilen) > > +{ > > + int rc; > > + > > + rc = dm_get_device(ti, pth, dm_table_get_mode(ti->table), dmd); > > + if (!rc) > > + strncpy(dmc_dname, pth, DEV_PATHLEN); > > + return rc; > > +} > > + > > +/* Construct a cache mapping. > > + * arg[0]: path to source device > > + * arg[1]: path to cache device > > + * arg[2]: cache block size (in sectors) > > + * arg[3]: cache size (in blocks) > > + * arg[4]: cache associativity */ > > +static int cache_ctr(struct dm_target *ti, unsigned int argc, char **argv) > > +{ > > + struct cache_c *dmc; > > + unsigned int consecutive_blocks; > > + sector_t i, order, tmpsize; > > + sector_t data_size, dev_size; > > + int r = -EINVAL; > > + > > + if (argc < 2) { > > + ti->error = "rxc: Need at least 2 arguments"; > > + goto bad; > > Ugh. This function is ugly. "goto bad;" is useless. What does it do? > > Oh... It returns. Oh, we are using numbered gotos as if we were > programming in some primitive GW-BASIC language that didn't allow named > labels. > > return -EINVAL; > > > + } > > + > > + dmc = kzalloc(sizeof(*dmc), GFP_KERNEL); > > + if (dmc == NULL) { > > + ti->error = "rxc: Failed to allocate cache context"; > > + r = ENOMEM; > > + goto bad; > > It's buggy. We should return -ENOMEM; instead of posivitive. > > > + } > > + > > + dmc->tgt = ti; > > + > > + if (rxc_get_dev(ti, argv[0], &dmc->disk_dev, dmc->disk_devname, > > + ti->len)) { > > Hopefull checkpatch.pl --strict will complain here. 
> > if (rxc_get_dev(ti, argv[0], &dmc->disk_dev, dmc->disk_devname, > ti->len)) { > > > + ti->error = "rxc: Disk device lookup failed"; > > + goto bad1; > > goto free_dmc; > > > + } > > + if (strncmp(argv[1], "/dev/rxd", 8) != 0) { > > + pr_err("%s: %s is not a valid cache device for rxcache.", > > + RxPREFIX, argv[1]); > > + ti->error = "rxc: Invalid cache device. Not an rxdsk volume."; > > + goto bad2; > > > goto put_disk_dev; > > > + } > > + if (rxc_get_dev(ti, argv[1], &dmc->cache_dev, dmc->cache_devname, 0)) { > > + ti->error = "rxc: Cache device lookup failed"; > > + goto bad2; > > + } > > + > > + dmc->io_client = dm_io_client_create(); > > + if (IS_ERR(dmc->io_client)) { > > + r = PTR_ERR(dmc->io_client); > > + ti->error = "Failed to create io client\n"; > > + goto bad2; > > This is buggy. It doesn't free cache_dev. It's not obviously, because > the name is based on GW-BASIC style. > > goto put_cache_dev. > > > + } > > + > > + r = kcached_init(dmc); > > + if (r) { > > + ti->error = "Failed to initialize kcached"; > > + goto bad3; > > + } > > + dmc->assoc = DEFAULT_CACHE_ASSOC; > > + > > + if (argc >= 3) { > > + if (sscanf(argv[2], "%u", &dmc->block_size) != 1) { > > + ti->error = "rxc: Invalid block size"; > > + r = -EINVAL; > > + goto bad4; > > + } > > + if (!dmc->block_size || > > + (dmc->block_size & (dmc->block_size - 1))) { > > + ti->error = "rxc: Invalid block size"; > > + r = -EINVAL; > > + goto bad4; > > + } > > + } else > > + dmc->block_size = DEFAULT_BLOCK_SIZE; > > + > > + dmc->block_shift = ffs(dmc->block_size) - 1; > > + dmc->block_mask = dmc->block_size - 1; > > + > > + /* dmc->size is specified in sectors here, and converted > > + * to blocks below */ > > + if (argc >= 4) { > > + if (sscanf(argv[3], "%lu", (unsigned long *)&dmc->size) != 1) { > > + ti->error = "rxc: Invalid cache size"; > > + r = -EINVAL; > > + goto bad4; > > + } > > + } else { > > + dmc->size = to_sector(dmc->cache_dev->bdev->bd_inode->i_size); > > + } > > + > > + if (argc >= 5) { > > + if (sscanf(argv[4], "%u", &dmc->assoc) != 1) { > > + ti->error = "rxc: Invalid cache associativity"; > > + r = -EINVAL; > > + goto bad4; > > + } > > + if (!dmc->assoc || (dmc->assoc & (dmc->assoc - 1)) || > > + dmc->size < dmc->assoc) { > > + ti->error = "rxc: Invalid cache associativity"; > > + r = -EINVAL; > > + goto bad4; > > + } > > + } else > > + dmc->assoc = DEFAULT_CACHE_ASSOC; > > + > > + /* Convert size (in sectors) to blocks. 
Then round size (in blocks now) > > + * down to a multiple of associativity */ > > + do_div(dmc->size, dmc->block_size); > > + tmpsize = dmc->size; > > + do_div(tmpsize, dmc->assoc); > > + dmc->size = tmpsize * dmc->assoc; > > + > > + dev_size = to_sector(dmc->cache_dev->bdev->bd_inode->i_size); > > + data_size = dmc->size * dmc->block_size; > > + if (data_size > dev_size) { > > + DMERR("Requested cache size exeeds the cache device's capacity (%lu>%lu)", > > + (unsigned long)data_size, (unsigned long)dev_size); > > + ti->error = "rxc: Invalid cache size"; > > + r = -EINVAL; > > + goto bad4; > > + } > > + > > + consecutive_blocks = dmc->assoc; > > + dmc->consecutive_shift = ffs(consecutive_blocks) - 1; > > + > > + order = dmc->size * sizeof(struct cacheblock); > > +#if defined(__x86_64__) > > + DMINFO("Allocate %luKB (%luB per) mem for %lu-entry cache" \ > > + "(capacity:%luMB, associativity:%u, block size:%u sectors(%uKB))", > > + (unsigned long)order >> 10, sizeof(struct cacheblock), > > + (unsigned long)dmc->size, > > + (unsigned long)data_size >> (20-SECTOR_SHIFT), > > + dmc->assoc, dmc->block_size, > > + dmc->block_size >> (10-SECTOR_SHIFT)); > > +#else > > + DMINFO("Allocate %luKB (%dB per) mem for %lu-entry cache" \ > > + "(capacity:%luMB, associativity:%u, block size:%u sectors(%uKB))", > > + (unsigned long)order >> 10, sizeof(struct cacheblock), > > + (unsigned long)dmc->size, > > + (unsigned long)data_size >> (20-SECTOR_SHIFT), dmc->assoc, > > + dmc->block_size, dmc->block_size >> (10-SECTOR_SHIFT)); > > This assumes that Intel is the only company in the world. > > > +#endif > > + dmc->cache = (struct cacheblock *)vmalloc(order); > > No need for this cast. > > > > + if (!dmc->cache) { > > + ti->error = "Unable to allocate memory"; > > + r = -ENOMEM; > > + goto bad4; > > + } > > + dmc->cache_state = (u_int8_t *)vmalloc(dmc->size); > > + if (!dmc->cache_state) { > > + ti->error = "Unable to allocate memory"; > > + r = -ENOMEM; > > + vfree((void *)dmc->cache); > > No need for this case. Also this doesn't smell right at all. Why > aren't we doing the vfree() after the goto? Also we don't vfree if > dm_set_target_max_io_len() fails. > > > + goto bad4; > > + } > > + > > + order = (dmc->size >> dmc->consecutive_shift) * sizeof(u_int32_t); > > + dmc->set_lru_next = (u_int32_t *)vmalloc(order); > > + if (!dmc->set_lru_next) { > > + ti->error = "Unable to allocate memory"; > > + r = -ENOMEM; > > + vfree((void *)dmc->cache); > > + vfree((void *)dmc->cache_state); > > + goto bad4; > > + } > > + > > + /* Initialize the cache structs */ > > Obvious. 
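>
> To spell out the error path cleanup from above: with named labels and
> the vfree() calls moved into the unwind, the tail of cache_ctr() could
> look something like this (untested):
>
>         r = dm_set_target_max_io_len(ti, dmc->block_size);
>         if (r)
>                 goto free_lru;
>         ti->private = dmc;
>         return 0;
>
> free_lru:
>         vfree(dmc->set_lru_next);
> free_cache_state:
>         vfree(dmc->cache_state);
> free_cache:
>         vfree(dmc->cache);
> destroy_kcached:
>         kcached_client_destroy(dmc);
> destroy_io_client:
>         dm_io_client_destroy(dmc->io_client);
> put_cache_dev:
>         dm_put_device(ti, dmc->cache_dev);
> put_disk_dev:
>         dm_put_device(ti, dmc->disk_dev);
> free_dmc:
>         kfree(dmc);
>         return r;
>
> Each failure then jumps to the label that frees everything allocated
> so far, so nothing is leaked and nothing is freed twice.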
> > > + for (i = 0; i < dmc->size ; i++) { > > + dmc->cache[i].dbn = 0; > > + dmc->cache_state[i] = INVALID; > > + } > > + > > + /* Initialize the point where LRU sweeps begin for each set */ > > + for (i = 0 ; i < (dmc->size >> dmc->consecutive_shift) ; i++) > > + dmc->set_lru_next[i] = i * dmc->assoc; > > + > > + spin_lock_init(&dmc->cache_spin_lock); > > + > > + dmc->reads = 0; > > + dmc->writes = 0; > > + dmc->cache_hits = 0; > > + dmc->replace = 0; > > + dmc->wr_invalidates = 0; > > + dmc->rd_invalidates = 0; > > + dmc->cached_blocks = 0; > > + dmc->cache_wr_replace = 0; > > + > > + r = dm_set_target_max_io_len(ti, dmc->block_size); > > + if (r) > > + goto bad4; > > + ti->private = dmc; > > + return 0; > > +bad4: > > + kcached_client_destroy(dmc); > > +bad3: > > + dm_io_client_destroy(dmc->io_client); > > + dm_put_device(ti, dmc->cache_dev); > > +bad2: > > + dm_put_device(ti, dmc->disk_dev); > > +bad1: > > + kfree(dmc); > > +bad: > > + return r; > > +} > > + > > +/* Destroy the cache mapping. */ > > +static void cache_dtr(struct dm_target *ti) > > +{ > > + struct cache_c *dmc = (struct cache_c *) ti->private; > > + > > + kcached_client_destroy(dmc); > > + > > + if (dmc->reads + dmc->writes > 0) { > > + DMINFO("stats:\n\treads(%lu), writes(%lu)\n", > > + dmc->reads, dmc->writes); > > + DMINFO("\tcache hits(%lu),replacement(%lu), write replacement(%lu)\n" \ > > + "\tread invalidates(%lu), write invalidates(%lu)\n", > > + dmc->cache_hits, dmc->replace, dmc->cache_wr_replace, > > + dmc->rd_invalidates, dmc->wr_invalidates); > > + DMINFO("conf:\n\tcapacity(%luM), associativity(%u), block size(%uK)\n" \ > > + "\ttotal blocks(%lu), cached blocks(%lu)\n", > > + (unsigned long)dmc->size*dmc->block_size>>11, dmc->assoc, > > + dmc->block_size>>(10-SECTOR_SHIFT), > > + (unsigned long)dmc->size, dmc->cached_blocks); > > + } > > + > > + dm_io_client_destroy(dmc->io_client); > > + vfree((void *)dmc->cache); > > + vfree((void *)dmc->cache_state); > > + vfree((void *)dmc->set_lru_next); > > Search your code for casts and remove the pointless ones. vfree() takes > a void pointer. > > > + > > + dm_put_device(ti, dmc->disk_dev); > > + dm_put_device(ti, dmc->cache_dev); > > + kfree(dmc); > > +} > > + > > +static void rxc_status_info(struct cache_c *dmc, status_type_t type, > > + char *result, unsigned int maxlen) > > +{ > > + int sz = 0; /* DMEMIT */ > > Ugly. Fix DMEMIT. 
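>
> (For what it's worth, DMEMIT() expands to an scnprintf() that updates
> a local "sz" behind your back, which is what that comment is
> apologizing for.)  If you would rather avoid the magic variable,
> calling scnprintf() directly works too, e.g. (untested):
>
>         unsigned int sz = 0;
>
>         sz += scnprintf(result + sz, maxlen - sz,
>                         "stats:\n\treads(%lu), writes(%lu)\n",
>                         dmc->reads, dmc->writes);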
> > > + > > + DMEMIT("stats:\n\treads(%lu), writes(%lu)\n", dmc->reads, > > + dmc->writes); > > + > > + DMEMIT("\tcache hits(%lu) replacement(%lu), write replacement(%lu)\n" \ > > + "\tread invalidates(%lu), write invalidates(%lu)\n" \ > > + "\tuncached reads(%lu), uncached writes(%lu)\n" \ > > + "\tdisk reads(%lu), disk writes(%lu)\n" \ > > + "\tcache reads(%lu), cache writes(%lu)\n", > > + dmc->cache_hits, dmc->replace, dmc->cache_wr_replace, > > + dmc->rd_invalidates, dmc->wr_invalidates, > > + dmc->uncached_reads, dmc->uncached_writes, > > + dmc->disk_reads, dmc->disk_writes, > > + dmc->cache_reads, dmc->cache_writes); > > +} > > + > > +static void rxc_status_table(struct cache_c *dmc, status_type_t type, > > + char *result, unsigned int maxlen) > > +{ > > + int i; > > + int sz = 0; /* DMEMIT */ > > + > > + DMEMIT("conf:\n\trxd dev (%s), disk dev (%s) mode (%s)\n" \ > > + "\tcapacity(%luM), associativity(%u), block size(%uK)\n" \ > > + "\ttotal blocks(%lu), cached blocks(%lu)\n", > > + dmc->cache_devname, dmc->disk_devname, "WRITETHROUGH", > > + (unsigned long)dmc->size*dmc->block_size>>11, dmc->assoc, > > + dmc->block_size>>(10-SECTOR_SHIFT), > > + (unsigned long)dmc->size, dmc->cached_blocks); > > + DMEMIT(" Size Hist: "); > > + for (i = 1 ; i <= 32 ; i++) { > > + if (size_hist[i] > 0) > > + DMEMIT("%d:%llu ", i*512, size_hist[i]); > > + } > > +} > > + > > +/* Report cache status: > > + * Output cache stats upon request of device status; > > + * Output cache configuration upon request of table status. */ > > +static void cache_status(struct dm_target *ti, status_type_t type, > > + unsigned status_flags, > > + char *result, unsigned int maxlen) > > +{ > > + struct cache_c *dmc = (struct cache_c *) ti->private; > > + > > + DPRINTK("%s: debug: entering %s\n", RxPREFIX, __func__); > > + > > + switch (type) { > > + case STATUSTYPE_INFO: > > + rxc_status_info(dmc, type, result, maxlen); > > + break; > > + case STATUSTYPE_TABLE: > > + rxc_status_table(dmc, type, result, maxlen); > > + break; > > + } > > +} > > + > > +static struct target_type cache_target = { > > + .name = "rxcache", > > + .version = {3, 0, 0}, > > + .module = THIS_MODULE, > > + .ctr = cache_ctr, > > + .dtr = cache_dtr, > > + .map = rxc_map, > > + .status = cache_status, > > +}; > > + > > +int __init rxc_init(void) > > +{ > > + int r; > > r is not a common name for return values. Use "int ret". > > > > + > > + DPRINTK("%s: debug: entering %s\n", RxPREFIX, __func__); > > + > > + r = jobs_init(); > > + if (r) > > + return r; > > + > > + _kcached_wq = create_singlethread_workqueue("kcached"); > > + if (!_kcached_wq) { > > + DMERR("failed to start kcached"); > > Most kernel functions handle their own error messages. No need for this > one, I bet. > > > > + return -ENOMEM; > > + } > > + > > + INIT_WORK(&_kcached_work, do_work); > > + > > + for (r = 0 ; r < 33 ; r++) > > + size_hist[r] = 0; > > This shouldn't reuse "r". It should be "int i". I stands for iterator. > Actually just mem_zero(). > > > > > + r = dm_register_target(&cache_target); > > + if (r < 0) { > > + DMERR("cache: register failed %d", r); > > Delete. > > > + return r; > > + } > > + pr_info("%s: Module successfully loaded.\n", RxPREFIX); > > Delete. > > > + return r; > > return 0; > > > +} > > + > > +void rxc_exit(void) > > +{ > > + DPRINTK("%s: debug: entering %s\n", RxPREFIX, __func__); > > Delete. Use ftrace for debugging. 
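>
> Also, going back to size_hist[]: it has static storage duration, so it
> is already zero-initialized and that loop can simply be deleted.  If
> an explicit reset is ever needed, memset() is the idiomatic spelling:
>
>         memset(size_hist, 0, sizeof(size_hist));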
> > > + dm_unregister_target(&cache_target); > > + jobs_exit(); > > + destroy_workqueue(_kcached_wq); > > + pr_info("%s: Module successfully unloaded.\n", RxPREFIX); > > > Delete. > > > +} > > + > > + > > +module_init(rxc_init); > > +module_exit(rxc_exit); > > + > > +MODULE_LICENSE("GPL"); > > +MODULE_AUTHOR(DRIVER_AUTHOR); > > +MODULE_DESCRIPTION(DRIVER_DESC); > > +MODULE_VERSION(VERSION_STR); > > +MODULE_INFO(Copyright, COPYRIGHT); > > diff -uNpr linux-next.orig/drivers/staging/rapiddisk/rxcommon.h linux-next/drivers/staging/rapiddisk/rxcommon.h > > --- linux-next.orig/drivers/staging/rapiddisk/rxcommon.h 1969-12-31 18:00:00.000000000 -0600 > > +++ linux-next/drivers/staging/rapiddisk/rxcommon.h 2015-05-31 13:36:08.753757911 -0500 > > @@ -0,0 +1,22 @@ > > +/******************************************************* > > + ** Copyright (c) 2011-2015 Petros Koutoupis > > + ** All rights reserved. > > + ** > > + ** filename: rxcommon.h > > + ** created: 12Feb12, petros@xxxxxxxxxxxxxxxxxxx > > + ** > > + ** This file is licensed under GPLv2. > > + ** > > + ******************************************************/ > > + > > +#include <linux/device.h> > > + > > +#define VERSION_STR "3.0" > > +#define COPYRIGHT "Copyright 2014 - 2015 Petros Koutoupis" > > +#define DRIVER_AUTHOR "Petros Koutoupis <petros@xxxxxxxxxxxxxxxxxxx>" > > Don't use defines. > > > + > > +#if defined RxDEBUG > > +#define DPRINTK(s, arg...) printk(s "\n", ##arg) > > +#else > > +#define DPRINTK(s, arg...) > > +#endif > > diff -uNpr linux-next.orig/drivers/staging/rapiddisk/rxdsk.c linux-next/drivers/staging/rapiddisk/rxdsk.c > > --- linux-next.orig/drivers/staging/rapiddisk/rxdsk.c 1969-12-31 18:00:00.000000000 -0600 > > +++ linux-next/drivers/staging/rapiddisk/rxdsk.c 2015-05-31 13:36:08.753757911 -0500 > > @@ -0,0 +1,799 @@ > > +/******************************************************* > > + ** Copyright (c) 2011-2015 Petros Koutoupis > > + ** All rights reserved. > > + ** > > + ** filename: rdsk.c > > + ** description: RapidDisk is an enhanced Linux RAM disk > > + ** module to dynamically create, remove, and resize > > + ** RAM drives. RapidDisk supports both volatile > > + ** and non-volatile memory. > > + ** created: 1Jun11, petros@xxxxxxxxxxxxxxxxxxx > > + ** > > + ** This file is licensed under GPLv2. > > + ** > > + ******************************************************/ > > + > > +#include <linux/init.h> > > +#include <linux/module.h> > > +#include <linux/moduleparam.h> > > +#include <linux/version.h> > > +#include <linux/blkdev.h> > > +#include <linux/bio.h> > > +#include <linux/fs.h> > > +#include <linux/hdreg.h> > > +#include <linux/proc_fs.h> > > +#include <linux/errno.h> > > +#include <linux/radix-tree.h> > > +#include <asm/io.h> > > +#include "rxcommon.h" > > + > > +#define RxPREFIX "rxd" > > +#define DRIVER_DESC "RapidDisk (rxdsk) is an enhanced RAM disk block device driver." 
> > +#define DEVICE_NAME "rxd" > > +#define PROC_NODE "rxctl" > > +#define BYTES_PER_SECTOR 512 > > +#define MAX_RxDISKS 128 > > +#define DEFAULT_MAX_SECTS 127 > > +#define DEFAULT_REQUESTS 128 > > + > > +#define FREE_BATCH 16 > > +#define SECTOR_SHIFT 9 > > +#define PAGE_SECTORS_SHIFT (PAGE_SHIFT - SECTOR_SHIFT) > > +#define PAGE_SECTORS (1 << PAGE_SECTORS_SHIFT) > > + > > +#define NON_VOLATILE_MEMORY 0 > > +#define VOLATILE_MEMORY 1 > > + > > +/* ioctls */ > > +#define INVALID_CDQUERY_IOCTL 0x5331 > > +#define RXD_GET_STATS 0x0529 > > + > > + > > +static DEFINE_MUTEX(rxproc_mutex); > > +static DEFINE_MUTEX(rxioctl_mutex); > > + > > +struct rdsk_device { > > + int num; > > + bool volatile_memory; > > + struct request_queue *rdsk_queue; > > + struct gendisk *rdsk_disk; > > + struct list_head rdsk_list; > > + unsigned long long max_blk_alloc; > > + unsigned long long start_addr; > > + unsigned long long end_addr; > > + unsigned long long size; > > + unsigned long error_cnt; > > + spinlock_t rdsk_lock; > > + struct radix_tree_root rdsk_pages; > > +}; > > + > > +int rdsk_ma_no = 0, rxcnt = 0; > > +static int max_rxcnt = MAX_RxDISKS; > > +static int max_sectors = DEFAULT_MAX_SECTS, nr_requests = DEFAULT_REQUESTS; > > +struct proc_dir_entry *rx_proc = NULL; > > +static LIST_HEAD(rdsk_devices); > > + > > +/* > > + * Supported insmod params > > + **/ > > +module_param(max_rxcnt, int, S_IRUGO); > > +MODULE_PARM_DESC(max_rxcnt, " Total RAM Disk devices available for use. (Default = 128)"); > > +module_param(max_sectors, int, S_IRUGO); > > +MODULE_PARM_DESC(max_sectors, " Maximum sectors (in KB) for the request queue. (Default = 127)"); > > +module_param(nr_requests, int, S_IRUGO); > > +MODULE_PARM_DESC(nr_requests, " Number of requests at a given time for the request queue. (Default = 128)"); > > + > > +/* > > + * Function Declarations > > + **/ > > +static ssize_t read_proc(struct file *, char __user *, size_t, loff_t *); > > +static ssize_t write_proc(struct file *, const char __user *, size_t, loff_t *); > > +static int rdsk_do_bvec(struct rdsk_device *, struct page *, unsigned int, unsigned int, int, sector_t); > > +static int rdsk_ioctl(struct block_device *, fmode_t, unsigned int, unsigned long); > > +static void rdsk_make_request(struct request_queue *, struct bio *); > > +static int attach_device(int, int); /* disk num, disk size */ > > +static int attach_nv_device(int, unsigned long long, unsigned long long); /* disk num, start / end addr */ > > +static int detach_device(int); /* disk num */ > > +static int resize_device(int, int); /* disk num, disk size */ > > + > > + > > +/* > > + * Functions Definitions > > + **/ > > +static ssize_t write_proc(struct file *file, const char __user *buffer, > > + size_t count, loff_t *data) > > +{ > > + int num, size, err = (int)count; > > + char *ptr, *buf; > > + unsigned long long start_addr, end_addr; > > + > > + DPRINTK("%s: debug: entering %s\n", RxPREFIX, __func__); > > Delete. > > > + > > + mutex_lock(&rxproc_mutex); > > + > > + if (!buffer || count > PAGE_SIZE) > > + return -EINVAL; > > Returning under lock. > > > + > > + buf = (char *)__get_free_page(GFP_KERNEL); > > Do the allocation outside the lock. > > > + if (!buf) > > + return -ENOMEM; > > + > > + if (copy_from_user(buf, buffer, count)) > > + return -EFAULT; > > memory leak. 
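>
> Putting the locking and leak fixes together, the top of write_proc()
> wants to be shaped something like this (untested):
>
>         if (!buffer || count > PAGE_SIZE)
>                 return -EINVAL;
>
>         buf = (char *)__get_free_page(GFP_KERNEL);
>         if (!buf)
>                 return -ENOMEM;
>
>         if (copy_from_user(buf, buffer, count)) {
>                 free_page((unsigned long)buf);
>                 return -EFAULT;
>         }
>
>         mutex_lock(&rxproc_mutex);
>         ...
> unlock:
>         mutex_unlock(&rxproc_mutex);
>         free_page((unsigned long)buf);
>         return err;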
> > > + > > + if (!strncmp("rxdsk attach ", buffer, 13)) { > > + ptr = buf + 13; > > + num = simple_strtoul(ptr, &ptr, 0); > > + size = simple_strtoul(ptr + 1, &ptr, 0); > > + > > + if (attach_device(num, size) != 0) { > > + pr_info("%s: Unable to attach rxd%d\n", RxPREFIX, num); > > + err = -EINVAL; > > goto unlock; > > > Anyway, you're not allowed to add /proc files. Add it to sysfs or > debugfs. > > Hopefully, I've given you enough to work on. I will review the rest > after you resend. If I complained once, it means find all similar code > and fix it as well. > > regards, > dan carpenter > _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel