Quoting Matthew Auld (2019-10-08 12:57:50)
> On 08/10/2019 09:59, Chris Wilson wrote:
> > Quoting Matthew Auld (2019-10-04 18:04:33)
> >> +int
> >> +i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
> >> +{
> >> +	struct intel_memory_region *mem = obj->mm.region;
> >> +	struct list_head *blocks = &obj->mm.blocks;
> >> +	unsigned int flags = I915_ALLOC_MIN_PAGE_SIZE;
> >> +	resource_size_t size = obj->base.size;
> >> +	resource_size_t prev_end;
> >> +	struct i915_buddy_block *block;
> >> +	struct sg_table *st;
> >> +	struct scatterlist *sg;
> >> +	unsigned int sg_page_sizes;
> >> +	unsigned long i;
> >> +	int ret;
> >> +
> >> +	st = kmalloc(sizeof(*st), GFP_KERNEL);
> >> +	if (!st)
> >> +		return -ENOMEM;
> >> +
> >> +	if (sg_alloc_table(st, size >> ilog2(mem->mm.chunk_size), GFP_KERNEL)) {
> >> +		kfree(st);
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	ret = __intel_memory_region_get_pages_buddy(mem, size, flags, blocks);
> >> +	if (ret)
> >> +		goto err_free_sg;
> >> +
> >> +	GEM_BUG_ON(list_empty(blocks));
> >> +
> >> +	sg = st->sgl;
> >> +	st->nents = 0;
> >> +	sg_page_sizes = 0;
> >> +	i = 0;
> >> +
> >> +	list_for_each_entry(block, blocks, link) {
> >> +		u64 block_size, offset;
> >> +
> >> +		block_size = i915_buddy_block_size(&mem->mm, block);
> >> +		offset = i915_buddy_block_offset(block);
> >> +
> >> +		GEM_BUG_ON(overflows_type(block_size, sg->length));
> >> +
> >> +		if (!i || offset != prev_end ||
> >> +		    add_overflows_t(typeof(sg->length), sg->length, block_size)) {
> >> +			if (i) {
> > 
> > i is only being used here to detect the start.
> > 	prev_end = (resource_size_t)-1;
> > Then I don't have to worry about i overflowing, although that seems
> > harmless, and as ridiculous as that may be.
> > 
> >> +				sg_page_sizes |= sg->length;
> >> +				sg = __sg_next(sg);
> >> +			}
> >> +
> >> +			sg_dma_address(sg) = mem->region.start + offset;
> >> +			sg_dma_len(sg) = block_size;
> >> +
> >> +			sg->length = block_size;
> >> +
> >> +			st->nents++;
> >> +		} else {
> >> +			sg->length += block_size;
> >> +			sg_dma_len(sg) += block_size;
> >> +		}
> >> +
> >> +		prev_end = offset + block_size;
> >> +		i++;
> >> +	};
> >> +
> >> +	sg_page_sizes |= sg->length;
> >> +	sg_mark_end(sg);
> >> +	i915_sg_trim(st);
> >> +
> >> +	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
> >> +
> >> +	return 0;
> >> +
> >> +err_free_sg:
> >> +	sg_free_table(st);
> >> +	kfree(st);
> >> +	return ret;
> >> +}
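
For illustration, the loop-start handling suggested above might look
something like the following untested sketch (names taken from the
quoted patch, not necessarily the final revision): prev_end is primed
to an impossible offset and the start of a new sg entry is detected via
st->nents, so the i counter disappears entirely.

	sg = st->sgl;
	st->nents = 0;
	sg_page_sizes = 0;
	prev_end = (resource_size_t)-1;

	list_for_each_entry(block, blocks, link) {
		u64 block_size = i915_buddy_block_size(&mem->mm, block);
		u64 offset = i915_buddy_block_offset(block);

		/* Start a new sg entry on a discontinuity or length overflow. */
		if (offset != prev_end ||
		    add_overflows_t(typeof(sg->length), sg->length, block_size)) {
			if (st->nents) {
				/* Close off the previous entry first. */
				sg_page_sizes |= sg->length;
				sg = __sg_next(sg);
			}

			sg_dma_address(sg) = mem->region.start + offset;
			sg_dma_len(sg) = block_size;
			sg->length = block_size;
			st->nents++;
		} else {
			/* Contiguous with the previous block: coalesce. */
			sg->length += block_size;
			sg_dma_len(sg) += block_size;
		}

		prev_end = offset + block_size;
	}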
> >> +struct drm_i915_gem_object *
> >> +i915_gem_object_create_region(struct intel_memory_region *mem,
> >> +			      resource_size_t size,
> >> +			      unsigned int flags)
> >> +{
> >> +	struct drm_i915_gem_object *obj;
> >> +
> >> +	if (!mem)
> >> +		return ERR_PTR(-ENODEV);
> >> +
> >> +	size = round_up(size, mem->min_page_size);
> >> +
> >> +	GEM_BUG_ON(!size);
> >> +	GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_MIN_ALIGNMENT));
> >> +
> > 
> > Put the FIXME here from the start.
> > 
> >> +	if (size >> PAGE_SHIFT > INT_MAX)
> >> +		return ERR_PTR(-E2BIG);
> >> +
> >> +	if (overflows_type(size, obj->base.size))
> >> +		return ERR_PTR(-E2BIG);
> >> +
> >> +	return mem->ops->create_object(mem, size, flags);
> >> +}
> > 
> > I'm happy with this. But I would like a discussion on why
> > resource_size_t was chosen and for that rationale to be captured in the
> > code comments. At the end of the day, we need to bump obj->base.size to
> > suit.
> 
> No good reason, I guess it just stems from using "struct resource
> region". Maybe for the object_create interface we should just use u64?

resource_size_t is definitely a better starting point than size_t, and
afaik is the preferred type for this physical usage, even when
dma_addr_t conflicts. But that just feels strange, and I couldn't find
any definitive guide. So writing down why we chose the type will help
the inevitable later discussions.

> On that topic should we also stop using "struct resource" here, since
> that's starting to get confusing, maybe use u64 there also?

Well, I'm still looking at struct resource and thinking/hoping it may
lead to us reusing more shared lib/ routines. So I'm ambivalent here. :)

My core objective here is supporting [in thought experiments at least]
4+GiB of local memory on 32b platforms. Pulling it off would need a bit
more cleaning up (so that we never expose more than we have to) to stay
within the dma budget -- and that has interesting ramifications for
iomem (we would need a mappable/non-mappable split again, or maybe the
HW will only allow what can be mapped???)
-Chris
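
For reference, the type relationships behind the 32b point above look
roughly like this (paraphrased and heavily trimmed, not verbatim kernel
headers):

	/* include/linux/types.h, simplified */
	#ifdef CONFIG_PHYS_ADDR_T_64BIT
	typedef u64 phys_addr_t;
	#else
	typedef u32 phys_addr_t;
	#endif
	typedef phys_addr_t resource_size_t;

	/* include/drm/drm_gem.h, simplified */
	struct drm_gem_object {
		/* ... */
		size_t size;	/* only 32b wide on a 32b kernel */
		/* ... */
	};

With CONFIG_PHYS_ADDR_T_64BIT a 32b kernel can still describe 4+GiB of
local memory through resource_size_t, but obj->base.size (size_t)
cannot hold such an object -- which is what the overflows_type() check
in i915_gem_object_create_region() guards against, and why "bump
obj->base.size to suit" comes up in the review above.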