Re: left join with smaller table or index on (XXX is not null) to avoid upsert

Martin Gainty <mgainty@xxxxxxxxxxx> · Sun, 18 Jan 2009 19:36:43 -0500

Sergio-

1)Index all joined columns
2)Put your NOT NULL test up front e.g
where 
pd.DiscountPrice is not null
AND
   left join ProductDiscount pd on p.ProductID=pd.ProductID

Martin 
______________________________________________ 
Disclaimer and confidentiality note 
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission. 

> Date: Sun, 18 Jan 2009 22:12:07 +0100
> From: mail@xxxxxxxxxxxxxxx
> To: pgsql-general@xxxxxxxxxxxxxx
> Subject:  left join with smaller table or index on (XXX is not null) to avoid upsert
> 
> I've to apply a discounts to products.
> 
> For each promotion I've a query that select a list of products and
> should apply a discount.
> 
> Queries may have intersections, in these intersections the highest
> discount should be applied.
> 
> Since queries may be slow I decided to proxy the discount this way:
> 
> create table Product(
>   ProductID int primary key,
>   ListPrice numeric
> );
> 
> create table ProductPrice(
>  ProductID int references Products (ProcuctID),
>  DiscountedPrice numeric
> );
> 
> Case A)
> If I want the ProductPrice to contain just products with a
> discount I'll have to update, then see if the update was successful
> otherwise insert.
> I expect that the products involved may be around 10% of the overall
> products.
> 
> To retrieve a list of products I could:
> select [some columns from Product],
>   least(coalesce(p.ListPrice,0),
>     coalesce(pp.DiscountedPrice,0)) as Price
>   from Product
>   left join ProductPrice pp on p.ProductID=pp.ProductID
>   where [some conditions on Product table];
> 
> create index ProductDiscount_ProductID_idx on DiscountPrice
> (ProductID);
> 
> Case B)
> Or ProductPrice may just contain ALL the products and everything
> will be managed with updates.
> 
> select [some columns from Product],
>   least(coalesce(p.ListPrice,0),
>     coalesce(pd.DiscountedPrice,0))
>   from Product
>   left join ProductDiscount pd on p.ProductID=pd.ProductID and
>     pd.DiscountPrice is not null
>   where [some conditions on Product table];
> 
> create index ProductDiscount_DiscountedPrice_idx on DiscountPrice
> (DiscountPrice is not null);
> create index ProductDiscount_ProductID_idx on DiscountPrice
> (ProductID);
> 
> I'm expecting that:
> - ProductPrice will contain roughly but less than 10% of the
> catalogue.
> - I may have from 0 to 60% overlap on queries generating the list of
> products to be discounted.
> - The overall number of promotions/query running concurrently may be
> in the range of 20-100
> - promotions will be created/deletes at a rate of 5-10 a day, so
> that discount will have to be recalculated
> - searches in the catalogue have to be fast
> 
> Since I haven't been able to find a quick way to build up a
> hierarchy of promotions to apply/re-apply discounts when promotion
> are added/deleted, creating/deleting promotions looks critical as
> well.
> The best thing I was able to plan was just to reapply all promotions
> if one is deleted.
> 
> So it looks to me that approach B is going to make updating of
> discounts easier, but I was wondering if it makes retrieval of
> Products and Prices slower.
> 
> Having a larger table that is being updated at a rate of 5% to 10% a
> day may make it a bit "fragmented".
> 
> Advices on the overall problem of discount overlap management will
> be appreciated too.
> 
> -- 
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
> 
> 
> 
> 
> -- 
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

Windows Live™: Keep your life in sync.  Check it out.