Re: Using generic tuning form armhfp

Florian Weimer <fweimer@xxxxxxxxxx> · Thu, 25 Jan 2018 16:02:45 +0100

On 01/25/2018 03:52 PM, Peter Robinson wrote:
On Thu, Jan 25, 2018 at 2:46 PM, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
GCC offers a generic tuning option for Arm these days, but we select
-mtune=cortex-8a instead.

Is this still a good choice?

I suspect the generic tuning is likely a better choice, is there any
details about it anywhere? Basically Cortex-A8 is pretty much the
lowest common denominator for ARMv7

The generic tuning has this:

/* Generic Cortex tuning.  Use more specific tunings if appropriate.  */
const struct tune_params arm_cortex_tune =
{
  &generic_extra_costs,
  &generic_addr_mode_costs,		/* Addressing mode costs.  */
  NULL,					/* Sched adj cost.  */
  arm_default_branch_cost,
  &arm_default_vec_cost,
  1,						/* Constant limit.  */
  5,						/* Max cond insns.  */
  8,						/* Memset max inline.  */
  2,						/* Issue rate.  */
  ARM_PREFETCH_NOT_BENEFICIAL,
  tune_params::PREF_CONST_POOL_FALSE,
  tune_params::PREF_LDRD_FALSE,
  tune_params::LOG_OP_NON_SHORT_CIRCUIT_TRUE,		/* Thumb.  */
  tune_params::LOG_OP_NON_SHORT_CIRCUIT_TRUE,		/* ARM.  */
  tune_params::DISPARAGE_FLAGS_NEITHER,
  tune_params::PREF_NEON_64_FALSE,
  tune_params::PREF_NEON_STRINGOPS_FALSE,
  tune_params::FUSE_NOTHING,
  tune_params::SCHED_AUTOPREF_OFF
};

The Cortex-A8 tuning is:

const struct tune_params arm_cortex_a8_tune =
{
  &cortexa8_extra_costs,
  &generic_addr_mode_costs,		/* Addressing mode costs.  */
  NULL,					/* Sched adj cost.  */
  arm_default_branch_cost,
  &arm_default_vec_cost,
  1,						/* Constant limit.  */
  5,						/* Max cond insns.  */
  8,						/* Memset max inline.  */
  2,						/* Issue rate.  */
  ARM_PREFETCH_NOT_BENEFICIAL,
  tune_params::PREF_CONST_POOL_FALSE,
  tune_params::PREF_LDRD_FALSE,
  tune_params::LOG_OP_NON_SHORT_CIRCUIT_TRUE,		/* Thumb.  */
  tune_params::LOG_OP_NON_SHORT_CIRCUIT_TRUE,		/* ARM.  */
  tune_params::DISPARAGE_FLAGS_NEITHER,
  tune_params::PREF_NEON_64_FALSE,
  tune_params::PREF_NEON_STRINGOPS_TRUE,
  tune_params::FUSE_NOTHING,
  tune_params::SCHED_AUTOPREF_OFF
};

The real difference is in generic_extra_costs vs generic_extra_costs, 
and too large to include here.  One of the differences seems to be that 
on Cortex-A8, 	floating point multiply & divide is considered relatively 
more expensive, if I read the sources correctly.  But this all a bit 
black magic.

Thanks,
Florian
_______________________________________________
arm mailing list -- arm@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to arm-leave@xxxxxxxxxxxxxxxxxxxxxxx