RE: Paging and permissions

"Arno Kuhl" <akuhl@xxxxxxxxxxxx> · Wed, 9 Feb 2011 13:03:10 +0200

On Tue, 2011-02-08 at 14:36 +0200, Arno Kuhl wrote: 

	I'm hoping some clever php gurus have been here before and are willing to
	share some ideas.

	I have a site where articles are assigned to categories in containers. An
	article can be assigned to only one category per container, but one or more
	containers. Access permissions can be set per article, per category and/or
	per container, for one or more users and/or user groups. If an article is
	assigned to 10 categories and only one of those has a permission denying
	access, then the article can't be accessed even if browsing through one of
	the other 9 categories. Currently everything works fine, with article titles
	showing when browsing through category or search result lists, and a message
	is displayed when the article is clicked if it cannot be viewed because of a
	permission.

	Now there's a requirement to not display the article title in category lists
	and search results if it cannot be viewed. I'm stuck with how to determine
	the number of results for paging at the start of the list or search. The
	site is quite large (20,000+ articles and growing) so reading the entire
	result set and sifting through it with permission rules for each request is
	not an option. But it might be an option if done once at the start of each
	search or list request, and then use that temporary modified result set for
	subsequent requests on the same set. I thought of saving the set to a
	temporary db table or file (not sure about overhead of
	serializing/unserializing large arrays). A sizing exercise based on the
	recordset returned for searches and lists shows a max of about 150MB for
	20,000 articles and 380MB for 50,000 articles that needs to be saved
	temporarily per search or list request - in the vast majority of cases the
	set will be *much* smaller but it needs to cope with the worst case, and
	still do so a year down the line.

	All this extra work because I can't simply get an accurate number of results
	for paging, because of permissions!

	So my questions are:
	1. Which is better (performance) for this situation: file or db?
	2. How do I prepare a potentially very large data set for file or fast
	writing to a new table (ie I obviously don't want to write it record by
	record)
	3. Are there any other alternatives worth looking at?

	TIA

	Cheers
	Arno

How are you determining (logically, not in code) when an article is allowed
to be read?

Assume an article on "user permissions in mysql" is in a container called
'databases' and in a second one called 'security' and both containers are in
a category called 'computers'.

Now get a user called John who is in a group called 'db admins' and that
group gives him permissions to view all articles in the 'databases'
container and any articles in any container in the 'computers' category. Now
assume John also has explicit user permissions revoking that right to view
the article in any container.

What I'm getting at is what's the order of privilege for rights? Do group
rights for categories win out over those for containers, or do individual
user rights trump all of them overall?

I think once that's figured out, a lot can be done inside the query itself
to minimise the impact on the script getting the results.

Thanks,
Ash
http://www.ashleysheridan.co.uk

-----------

The simple structure is articles in categories and categories in containers.
An article can be in only one category per container, but in one or more
containers. If an article permission explicitly allows or denies access then
that permission applies, otherwise the permissions of its container(s) and
category(ies) are checked. Each permission check looks at user access first,
then group access. A user can belong to multiple groups.
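
The order of checks described above can be sketched as follows. This is only an illustration in Python (the logic ports directly to PHP), not the site's actual code. The rule store layout, the function names, the "any group deny wins" tie-break between several groups, and the default of allowing access when no rule matches are all assumptions that the thread does not settle.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical rule store: maps (object, subject) to True (allow) or False (deny).
# An object is e.g. ("article", 12); a subject is ("user", "john") or ("group", "staff").
Rules = Dict[Tuple[Tuple[str, int], Tuple[str, str]], bool]


def explicit_decision(rules: Rules, obj: Tuple[str, int],
                      user: str, groups: Iterable[str]) -> Optional[bool]:
    """Return True/False if a rule on obj applies to this user, else None."""
    # The user-level rule is checked first, then the group rules.
    user_rule = rules.get((obj, ("user", user)))
    if user_rule is not None:
        return user_rule
    group_rules = [rules[(obj, ("group", g))]
                   for g in groups if (obj, ("group", g)) in rules]
    if not group_rules:
        return None
    # Assumption: when several groups have rules, a single deny wins.
    return all(group_rules)


def can_view(rules: Rules, article: int,
             placements: Iterable[Tuple[int, int]],
             user: str, groups: Iterable[str],
             default: bool = True) -> bool:
    """placements is a list of (container_id, category_id) pairs for the article."""
    groups = list(groups)
    # An explicit article permission settles the matter either way.
    decision = explicit_decision(rules, ("article", article), user, groups)
    if decision is not None:
        return decision
    # Otherwise one denying container or category blocks the article everywhere.
    for container, category in placements:
        for obj in (("container", container), ("category", category)):
            if explicit_decision(rules, obj, user, groups) is False:
                return False
    # Assumption: with no applicable rule, access falls back to `default`.
    return default
```

For example, a deny on one category for the group "staff" hides an article from a staff member in every container, unless an article-level rule says otherwise.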

There's no query to handle this that can return a neat recordset for paging.
Currently the complete checks are only done for an article request. The
category list only checks access to the category and the container it
belongs to, so the list is either displayed in its entirety (including
titles of articles that can't be viewed) or not at all, and obviously the
paging works perfectly because the total number of titles is known up front
and remains constant for subsequent requests.

If I use read-ahead to make allowance for permissions and remove paging
(just keep prev/next) the problem goes away. Or I could use "best-guess"
paging, which could range from 100% accurate to 99% wrong. At first glance
that's not really acceptable, but I noticed recently Google does the same
thing with their search results. 
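
A minimal sketch of the read-ahead idea with prev/next only, written in Python for brevity (it ports directly to PHP). The function name, the (position, article) stream and the cursor convention are assumptions made for illustration. The caller is assumed to supply candidates in a stable order, starting just after the cursor returned by the previous call.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def read_ahead_page(candidates: Iterable[Tuple[int, object]],
                    can_view: Callable[[object], bool],
                    page_size: int) -> Tuple[List[object], Optional[int], bool]:
    """Fill one page with viewable articles, reading ahead past hidden ones.

    Returns (page, cursor, has_next). `cursor` is the position of the last
    article placed on the page; the next request resumes after it.
    """
    page: List[object] = []
    cursor: Optional[int] = None
    for position, article in candidates:
        if not can_view(article):
            continue  # hidden by a permission, keep reading ahead
        if len(page) == page_size:
            # One more viewable article exists, so a "next" link is justified.
            return page, cursor, True
        page.append(article)
        cursor = position
    # The stream ran out before a further viewable article turned up.
    return page, cursor, False


# Usage with hypothetical data: articles 1..10, only even ones viewable.
articles = list(enumerate(range(1, 11)))
page, cursor, has_next = read_ahead_page(articles, lambda a: a % 2 == 0, 2)
```

Only as many rows are checked as are needed to fill the page plus one, so the cost per request depends on the page size and the share of hidden articles, not on the size of the whole result set. A "prev" link would need the earlier cursors to be remembered, for instance in the session.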

First prize is to work out a proper solution that is fast and accurate and
works on fairly large results, and I'm still hoping for some suggestions.
But as a last resort I'll go the "best-guess" route. If Google can do it...
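
One possible way to produce a "best-guess" total, sketched in Python (portable to PHP): extrapolate from the share of articles that passed the permission check among those scanned so far. This is an assumption about how such an estimate could be made, not a description of what Google does, and the function names are hypothetical.

```python
import math


def estimate_visible_total(raw_total: int, scanned: int, visible_seen: int) -> int:
    """Guess how many of raw_total articles are viewable.

    raw_total    -- row count from the unfiltered query
    scanned      -- rows already run through the permission check
    visible_seen -- how many of those scanned rows were viewable
    """
    if scanned == 0:
        return raw_total          # nothing checked yet: assume all are viewable
    if scanned >= raw_total:
        return visible_seen       # everything checked: the count is exact
    rate = visible_seen / scanned
    # Exact count for the scanned part, extrapolated for the rest.
    return visible_seen + round((raw_total - scanned) * rate)


def estimated_page_count(raw_total: int, scanned: int,
                         visible_seen: int, page_size: int) -> int:
    """Number of pages to show in the pager, based on the estimate above."""
    total = estimate_visible_total(raw_total, scanned, visible_seen)
    return math.ceil(total / page_size)
```

The estimate gets more accurate as the user pages deeper, since more rows have been scanned, and it becomes exact once the whole set has been checked. It can be badly off early on if hidden articles are clustered rather than spread evenly.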

Cheers
Arno
