Hi Pranay,

2015-09-12 3:12 GMT+02:00 Pranay Srivastava <pranjas@xxxxxxxxx>:
> Hi Sabela,
>
> On Fri, Sep 11, 2015 at 8:29 PM, Sabela Ramos Garea
> <sabelaraga@xxxxxxxxx> wrote:
>> Sorry, a small copy-paste mistake while cleaning up. The page and vma
>> declarations should look like this:
>>
>> struct page *pages --> struct page *pages[MAX_PAGES];
>> struct vm_area_struct *vma --> struct vm_area_struct *vma[MAX_PAGES];
>>
>> where MAX_PAGES is defined as 5.
>>
>> Sabela.
>>
>> 2015-09-11 16:07 GMT+02:00 Sabela Ramos Garea <sabelaraga@xxxxxxxxx>:
>>> Dear all,
>>>
>>> For research purposes I need some userspace memory pages to be
>>> uncacheable. I am using two different Intel architectures (Sandy
>>> Bridge and Haswell) and two different kernels (2.6.32-358 and
>>> 3.19.0-28).
>>>
>>> Intel's non-temporal store instructions are not a valid solution for
>>> me, so I am writing a kernel module that takes a set of userspace
>>> pages reserved with posix_memalign, pins them with get_user_pages,
>>> and then sets them uncacheable (I have tried both set_pages_uc and
>>> set_pages_array_uc). When I use one page the access times are not
>>> very consistent, and with more than one page the module crashes (on
>>> both architectures and with both kernels).
>>>
>>> I wonder whether I am using the correct approach, whether I have to
>>> use kernel-space pages in order to work with uncacheable memory, or
>>> whether I have to remap the memory. In case it makes things clearer,
>>> I am attaching the relevant lines of the kernel module function that
>>> should set the pages uncacheable. (This function is the .write of a
>>> misc device; count is a byte count from which the number of pages is
>>> derived.)
>>>
>>> Best and thanks,
>>>
>>> Sabela.
>>>
>>> struct page *pages; // defined outside so the pages can be set back
>>>                     // to WB in the release function
>>> int numpages;
>>>
>>> static ssize_t setup_memory(struct file *filp, const char __user *buf,
>>>                             size_t count, loff_t *ppos)
>>> {
>>>         int res;
>>>         struct vm_area_struct *vmas;
>>>
> shouldn't this be rounded up?
>>>         numpages = count / 4096;

For the current tests I am assuming that count is a multiple of 4096
and that the user *buf is aligned. Anyway, isn't it safer if I just
round down, so that I don't touch addresses outside the range of pages
that have to be set as uncached?

>>>         down_read(&current->mm->mmap_sem);
>>>         res = get_user_pages(current, current->mm,
>>>                              (unsigned long) buf,
>>>                              numpages, /* number of pages */
>>>                              0,        /* write */
>>>                              1,        /* do force */
>>>                              &pages,
>>>                              &vmas);
>>>         up_read(&current->mm->mmap_sem);
>>>
>>>         numpages = res;
>>>
>>>         if (res > 0) {
>>>                 set_pages_uc(pages, numpages); /* uncached */
>
> What about highmem pages? set_memory_uc() does __pa(), so perhaps
> that's the reason for your kernel oops?

I have used kmap to map the user pages into kernel space as follows:

        if (res > 0) {
                for (i = 0; i < res; i++) {
                        kaddress = kmap(pages[i]);
                        // userspace addresses don't have to be
                        // contiguous, so set one page at a time
                        set_memory_uc((unsigned long) kaddress, 1);
                }
                //set_pages_array_uc(pages, count); /* Uncached */
                printk("Write: %d pages set as uncacheable\n", numpages);
        }

But the effect in the user-space test code that tries to compare cached
vs. uncached accesses is a *lower* latency for the supposedly uncached
pages.
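A note on the kmap approach: set_memory_uc() on the address returned by
kmap() only changes the attributes of the kernel-side alias of the page;
the user PTEs set up for the posix_memalign buffer stay write-back, so
the user-space test keeps hitting the cache, which could explain the
inverted latencies. One way to get a truly uncached user mapping is to
let the kernel own the pages and export them through the misc device's
.mmap with pgprot_noncached(), so that the user PTEs and the kernel
direct-map alias agree. What follows is only a minimal sketch of that
idea, not the module from this thread; it assumes x86 with PAT, and all
identifiers (uc_mmap, uc_pages, /dev/uc_pages, UC_NPAGES) are invented
for illustration:

        #include <linux/module.h>
        #include <linux/miscdevice.h>
        #include <linux/fs.h>
        #include <linux/mm.h>
        #include <linux/gfp.h>
        #include <asm/cacheflush.h> /* set_pages_uc()/set_pages_wb() on x86 */

        #define UC_NPAGES 5

        static struct page *uc_pages[UC_NPAGES];

        static int uc_mmap(struct file *filp, struct vm_area_struct *vma)
        {
                unsigned long uaddr = vma->vm_start;
                int i, err;

                if (vma->vm_end - vma->vm_start > UC_NPAGES * PAGE_SIZE)
                        return -EINVAL;

                /* Make the userspace PTEs of this mapping uncacheable. */
                vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

                for (i = 0; i < UC_NPAGES && uaddr < vma->vm_end; i++) {
                        uc_pages[i] = alloc_page(GFP_KERNEL); /* lowmem: __pa() is valid */
                        if (!uc_pages[i])
                                return -ENOMEM;
                        /* Keep the kernel direct-map alias consistent: UC as well. */
                        set_pages_uc(uc_pages[i], 1);
                        err = remap_pfn_range(vma, uaddr, page_to_pfn(uc_pages[i]),
                                              PAGE_SIZE, vma->vm_page_prot);
                        if (err)
                                return err;
                        uaddr += PAGE_SIZE;
                }
                return 0;
        }

        static const struct file_operations uc_fops = {
                .owner = THIS_MODULE,
                .mmap  = uc_mmap,
        };

        static struct miscdevice uc_dev = {
                .minor = MISC_DYNAMIC_MINOR,
                .name  = "uc_pages",
                .fops  = &uc_fops,
        };

        static int __init uc_init(void)
        {
                return misc_register(&uc_dev);
        }

        static void __exit uc_exit(void)
        {
                int i;

                for (i = 0; i < UC_NPAGES; i++) {
                        if (uc_pages[i]) {
                                set_pages_wb(uc_pages[i], 1); /* restore write-back */
                                __free_page(uc_pages[i]);
                        }
                }
                misc_deregister(&uc_dev);
        }

        module_init(uc_init);
        module_exit(uc_exit);
        MODULE_LICENSE("GPL");

A user process would then open /dev/uc_pages and mmap() it instead of
allocating with posix_memalign(); with the user PTEs actually UC, the
latency difference in the test below should show up with the expected
sign.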
Accesses are performed and measured like this:

        CL_1 = (int *) buffer;
        CL_2 = (int *) (buffer + CACHELINE);

        // flush caches
        // get timestamp
        for (j = 0; j < 10; j++) {
                CL_2 = (int *) (buffer + CACHELINE);
                for (i = 1; i < naccesses; i++) {
                        *CL_1 = *CL_2 + i;
                        *CL_2 = *CL_1 + i;
                        CL_2 = (int *)((char *)CL_2 + CACHELINE);
                }
        }
        // get timestamp

I've tried to do the same from kernel space, but the results are
similar.

Thanks,

Sabela.
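For the "flush caches" and "get timestamp" placeholders, a
self-contained user-space variant could look like the sketch below. It
assumes x86 and the GCC/Clang builtins from <x86intrin.h>;
flush_buffer(), NACCESSES and the buffer size are invented here.
clflush evicts each line from every cache level, so even a write-back
mapping starts the timed region cold, and __rdtscp waits for the
preceding instructions to finish before reading the TSC:

        #include <stdint.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <x86intrin.h>  /* _mm_clflush, _mm_mfence, __rdtscp */

        #define CACHELINE 64
        #define NACCESSES 1024

        static void flush_buffer(char *buf, size_t len)
        {
                size_t off;

                for (off = 0; off < len; off += CACHELINE)
                        _mm_clflush(buf + off); /* evict this line from all levels */
                _mm_mfence();                   /* wait until the flushes are done */
        }

        int main(void)
        {
                size_t len = (NACCESSES + 2) * CACHELINE;
                char *buffer;
                volatile int *CL_1, *CL_2; /* volatile: force real loads/stores */
                unsigned int aux;
                uint64_t t0, t1;
                int i, j;

                if (posix_memalign((void **) &buffer, 4096, len))
                        return 1;

                CL_1 = (volatile int *) buffer;

                flush_buffer(buffer, len);
                t0 = __rdtscp(&aux); /* waits for earlier instructions to complete */
                for (j = 0; j < 10; j++) {
                        CL_2 = (volatile int *) (buffer + CACHELINE);
                        for (i = 1; i < NACCESSES; i++) {
                                *CL_1 = *CL_2 + i;
                                *CL_2 = *CL_1 + i;
                                CL_2 = (volatile int *) ((uintptr_t) CL_2 + CACHELINE);
                        }
                }
                t1 = __rdtscp(&aux);
                printf("%llu cycles\n", (unsigned long long) (t1 - t0));

                free(buffer);
                return 0;
        }

To compare cached vs. uncached fairly, the same loop would run once
over a normal write-back buffer and once over an mmap() of the device
sketched earlier, with everything else kept identical.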