Re: Index usage for tstzrange?

On 21.03.2013 17:55, Alexander Korotkov wrote:
> On Thu, Mar 21, 2013 at 12:52 PM, Heikki Linnakangas<
>> The immediate fix is attached, but this made me realize that rangesel() is
>> still missing estimation for the "element <@ range" operator. It shouldn't
>> be hard to implement, I'm pretty sure we have all the statistics we need
>> for that.
>
> Probably we could even call existing scalarltsel and scalargtsel for this
> case.

I came up with the attached. I didn't quite use scalarltsel, but I used the scalarineqsel function, which contains the "guts" of scalarltsel and scalargtsel.
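
To illustrate the idea outside the backend, here is a minimal standalone sketch, not the patch itself. It assumes a plain array of doubles standing in for the equi-depth histogram bounds, and the helper names frac_less_than() and elem_contained_sel() are made up for this example. It estimates the fraction of values below each bound by linear interpolation within a histogram bin and subtracts the two, which is roughly what two scalarineqsel calls give you:

```c
#include <assert.h>

/*
 * Estimate the fraction of values that are < c, given equi-depth histogram
 * bounds hist[0..n-1] sorted in ascending order.  Each of the n - 1 bins is
 * assumed to hold the same share of the rows; within a bin we interpolate
 * linearly.  (Hypothetical helper, for illustration only.)
 */
static double
frac_less_than(const double *hist, int n, double c)
{
    int i;

    assert(n >= 2);

    if (c <= hist[0])
        return 0.0;
    if (c >= hist[n - 1])
        return 1.0;

    for (i = 0; i < n - 1; i++)
    {
        if (c < hist[i + 1])
        {
            /* here hist[i] <= c < hist[i + 1], so the bin width is > 0 */
            double within = (c - hist[i]) / (hist[i + 1] - hist[i]);

            return (i + within) / (n - 1);
        }
    }
    return 1.0;                 /* not reached */
}

/*
 * Estimate the selectivity of "var <@ [lower, upper)" as
 * P(var < upper) - P(var < lower).  Bound inclusivity is ignored.
 */
static double
elem_contained_sel(const double *hist, int n, double lower, double upper)
{
    double sel = frac_less_than(hist, n, upper) - frac_less_than(hist, n, lower);

    return (sel < 0.0) ? 0.0 : sel;
}
```

With histogram bounds {0, 10, 20, 30, 40} and the range [5, 25), this yields 0.625 - 0.125 = 0.5.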

One thing I wasn't quite sure of (from the patch):

	/*
	 * We use the data type's default < operator. This is bogus, if the range
	 * type's rngsubopc operator class is different. In practice, that ought
	 * to be rare. It would also be bogus to use the < operator from the
	 * rngsubopc operator class, because the statistics are collected using
	 * the default operator class, anyway.
	 *
	 * For the same reason, use the default collation. The statistics are
	 * collected with the default collation.
	 */

Does that make sense? The other option would be to use the < operator from the rngsubopc opclass, even though the scalar statistics are collected with the default b-tree < operator. As long as the two sort roughly the same way, you get reasonable results either way.

Yet another option would be to use histogram_selectivity() instead of ineq_histogram_selectivity() if the range's rngsubopc opclass isn't the type's default opclass. histogram_selectivity() works with any operator regardless of the sort ordering, because it uses the histogram values merely as a sample rather than as a histogram. But I'm reluctant to make this any more complicated, as using a non-default opclass for the range type is rare.
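
For comparison, the "histogram as a sample" approach can be sketched like this. It is a simplified standalone illustration, not the real histogram_selectivity(): the function name sample_selectivity(), the predicate callback type, and the struct are all invented for the example. Only the n_skip idea (ignoring some entries at each end of the histogram) is borrowed from the real function's signature. Because each value is simply tested against the predicate, the sort order of the histogram doesn't matter:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate type: does 'value' satisfy the operator? */
typedef bool (*elem_pred) (double value, const void *arg);

/*
 * Treat the histogram entries as a plain sample: skip n_skip entries at
 * each end, apply the predicate to the rest, and return the fraction that
 * matched.  No assumption is made about how the entries are ordered.
 */
static double
sample_selectivity(const double *hist, int n, int n_skip,
                   elem_pred pred, const void *arg)
{
    int considered = n - 2 * n_skip;
    int matches = 0;
    int i;

    assert(n_skip >= 0 && considered > 0);

    for (i = n_skip; i < n - n_skip; i++)
    {
        if (pred(hist[i], arg))
            matches++;
    }
    return (double) matches / considered;
}

/* Example predicate: lower <= value < upper */
struct bounds
{
    double lower;
    double upper;
};

static bool
in_bounds(double value, const void *arg)
{
    const struct bounds *b = arg;

    return value >= b->lower && value < b->upper;
}
```

With the entries {0, 10, 20, 30, 40}, the bounds [5, 25) and n_skip = 0, two of the five entries match, so the estimate is 0.4. That is coarser than what interpolation gives, which is the price of not relying on the sort order.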

- Heikki
diff --git a/src/backend/utils/adt/rangetypes_selfuncs.c b/src/backend/utils/adt/rangetypes_selfuncs.c
index c450c6a..60700b8 100644
--- a/src/backend/utils/adt/rangetypes_selfuncs.c
+++ b/src/backend/utils/adt/rangetypes_selfuncs.c
@@ -55,6 +55,8 @@ static double calc_hist_selectivity_contains(TypeCacheEntry *typcache,
 							   RangeBound *lower, RangeBound *upper,
 							   RangeBound *hist_lower, int hist_nvalues,
 							Datum *length_hist_values, int length_hist_nvalues);
+static double calc_elem_contained_by_sel(PlannerInfo *root, TypeCacheEntry *typcache,
+						   VariableStatData *vardata, RangeType *constval);
 
 /*
  * Returns a default selectivity estimate for given operator, when we don't
@@ -155,18 +157,35 @@ rangesel(PG_FUNCTION_ARGS)
 	}
 
 	/*
-	 * OK, there's a Var and a Const we're dealing with here.  We need the
-	 * Const to be of same range type as the column, else we can't do anything
-	 * useful. (Such cases will likely fail at runtime, but here we'd rather
-	 * just return a default estimate.)
-	 *
-	 * If the operator is "range @> element", the constant should be of the
-	 * element type of the range column. Convert it to a range that includes
-	 * only that single point, so that we don't need special handling for
-	 * that in what follows.
+	 * OK, there's a Var and a Const we're dealing with here.  Check that the
+	 * Const is of the right type, else we can't do anything useful. (Such
+	 * cases will likely fail at runtime, but here we'd rather just return a
+	 * default estimate.)
 	 */
-	if (operator == OID_RANGE_CONTAINS_ELEM_OP)
+	if (operator == OID_RANGE_ELEM_CONTAINED_OP)
+	{
+		/*
+		 * "element <@ range" is quite different from the other range
+		 * operators, in that the Var is not a range, but of the element type.
+		 */
+		typcache = range_get_typcache(fcinfo, ((Const *) other)->consttype);
+
+		if (typcache->rngelemtype->type_id == vardata.vartype)
+		{
+			constrange = DatumGetRangeType(((Const *) other)->constvalue);
+			selec = calc_elem_contained_by_sel(root, typcache, &vardata, constrange);
+		}
+		else
+			selec = default_range_selectivity(operator);
+	}
+	else if (operator == OID_RANGE_CONTAINS_ELEM_OP)
 	{
+		/*
+		 * In "range @> element", the constant should be of the element type
+		 * of the range column. Convert it to a range that includes only that
+		 * single point, so that we don't need special handling for that in
+		 * what follows.
+		 */
 		typcache = range_get_typcache(fcinfo, vardata.vartype);
 
 		if (((Const *) other)->consttype == typcache->rngelemtype->type_id)
@@ -181,26 +200,29 @@ rangesel(PG_FUNCTION_ARGS)
 			upper.infinite = false;
 			upper.lower = false;
 			constrange = range_serialize(typcache, &lower, &upper, false);
+
+			selec = calc_rangesel(typcache, &vardata, constrange, operator);
 		}
+		else
+			selec = default_range_selectivity(operator);
 	}
 	else
 	{
-		typcache = range_get_typcache(fcinfo, ((Const *) other)->consttype);
+		/*
+		 * In all other range operators, both operands are ranges, and they
+		 * must be of the same type.
+		 */
+		typcache = range_get_typcache(fcinfo, vardata.vartype);
 
 		if (((Const *) other)->consttype == vardata.vartype)
+		{
 			constrange = DatumGetRangeType(((Const *) other)->constvalue);
+			selec = calc_rangesel(typcache, &vardata, constrange, operator);
+		}
+		else
+			selec = default_range_selectivity(operator);
 	}
 
-	/*
-	 * If we got a valid constant on one side of the operator, proceed to
-	 * estimate using statistics. Otherwise punt and return a default
-	 * constant estimate.
-	 */
-	if (constrange)
-		selec = calc_rangesel(typcache, &vardata, constrange, operator);
-	else
-		selec = default_range_selectivity(operator);
-
 	ReleaseVariableStats(vardata);
 
 	CLAMP_PROBABILITY(selec);
@@ -1131,3 +1153,64 @@ calc_hist_selectivity_contains(TypeCacheEntry *typcache,
 
 	return sum_frac;
 }
+
+/*
+ * Calculate selectivity of var <@ constant, where 'var' is of element type.
+ *
+ * This is roughly equivalent to "var >= lower AND var < upper" (bound
+ * inclusivity is ignored). Use scalar estimation routines for that.
+ */
+static double
+calc_elem_contained_by_sel(PlannerInfo *root, TypeCacheEntry *typcache,
+						   VariableStatData *vardata, RangeType *constval)
+{
+	Oid			elem_lt_opr;
+	double		lboundsel, uboundsel;
+	double		selec;
+	RangeBound	const_lower;
+	RangeBound	const_upper;
+	bool		empty;
+	TypeCacheEntry *elemtypcache;
+
+	/* Extract the bounds of the constant value. */
+	range_deserialize(typcache, constval, &const_lower, &const_upper, &empty);
+
+	if (empty)
+	{
+		/* an empty range contains nothing */
+		return 0.0;
+	}
+
+	/*
+	 * We use the data type's default < operator. This is bogus, if the range
+	 * type's rngsubopc operator class is different. In practice, that ought
+	 * to be rare. It would also be bogus to use the < operator from the
+	 * rngsubopc operator class, because the statistics are collected using
+	 * the default operator class, anyway.
+	 *
+	 * For the same reason, use the default collation. The statistics are
+	 * collected with the default collation.
+	 */
+	elemtypcache = lookup_type_cache(typcache->rngelemtype->type_id,
+									TYPECACHE_LT_OPR);
+	elem_lt_opr = elemtypcache->lt_opr;
+
+	/* estimate var < lower bound */
+	if (const_lower.infinite)
+		lboundsel = 0.0;
+	else
+		lboundsel = scalarineqsel(root, elem_lt_opr, false, vardata,
+								  const_lower.val, elemtypcache->type_id);
+
+	/* estimate var < upper bound */
+	if (const_upper.infinite)
+		uboundsel = 1.0;
+	else
+		uboundsel = scalarineqsel(root, elem_lt_opr, false, vardata,
+								  const_upper.val, elemtypcache->type_id);
+
+	/* lower < var < upper */
+	selec = uboundsel - lboundsel;
+
+	return selec;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 72c2c30..bf0d8b3 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -524,7 +524,7 @@ neqsel(PG_FUNCTION_ARGS)
  * convert_to_scalar().  If it is applied to some other datatype,
  * it will return a default estimate.
  */
-static double
+double
 scalarineqsel(PlannerInfo *root, Oid operator, bool isgt,
 			  VariableStatData *vardata, Datum constval, Oid consttype)
 {
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 0c2cd34..cc3255c 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -129,6 +129,8 @@ extern double histogram_selectivity(VariableStatData *vardata, FmgrInfo *opproc,
 					  Datum constval, bool varonleft,
 					  int min_hist_size, int n_skip,
 					  int *hist_size);
+extern double scalarineqsel(PlannerInfo *root, Oid operator, bool isgt,
+			  VariableStatData *vardata, Datum constval, Oid consttype);
 
 extern Pattern_Prefix_Status pattern_fixed_prefix(Const *patt,
 					 Pattern_Type ptype,
-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
