Re: Allocate a variable in a known physical location

Hi Brian,

Thanks for the mail, and sorry for the late response. I tried storing the pointer to the structure in a register variable, as shown below.

register unsigned int address = (unsigned int)p; // p is the pointer to the structure

Then later in the code, when I need to access i, instead of writing p->i I write ((struct thread_args *)address)->i.
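
For reference, a minimal sketch of this cast in C. The helper name read_i_via_integer is hypothetical and does not appear in the thread. It uses uintptr_t from <stdint.h> rather than unsigned int, because unsigned int may be too narrow to hold a pointer on a 64-bit target. The register keyword is only a request to the compiler, so the generated assembly (gcc -S) is the place to check where the value really lives.

```c
#include <stdint.h>

struct thread_args {
    int i, j, k;
    int t_id, m_size, n_t;
};

/* Hypothetical helper: reads p->i through an integer copy of the pointer.
   uintptr_t is guaranteed wide enough to round-trip a pointer. */
int read_i_via_integer(struct thread_args *p)
{
    register uintptr_t address = (uintptr_t)p;  /* 'register' is a hint only */
    return ((struct thread_args *)address)->i;
}
```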

I got the same statistics. Now I will try your method. If I understood correctly, the following is your suggestion.

int *i = get_my_memory(sizeof(int));

Later in the code, instead of using i++, I should use (*i)++. I will try that and let you know how it goes.
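
A minimal sketch of how this could look. get_my_memory is not shown anywhere in the thread, so the bump allocator below is an assumption, and a static buffer stands in for the mmap'ed region so that the code compiles anywhere. count_to is a hypothetical helper that uses the allocated int as its loop counter.

```c
#include <stddef.h>

/* Stand-in for the pre-allocated region (assumption: the real program
   would point this at the mmap'ed physical range instead). */
static _Alignas(max_align_t) unsigned char region[4096];
static size_t region_used = 0;

/* Hypothetical bump allocator: hands out consecutive aligned chunks,
   or NULL once the region is exhausted. */
void *get_my_memory(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (region_used + align - 1) / align * align;
    if (size > sizeof region - start)
        return NULL;
    region_used = start + size;
    return region + start;
}

/* Counts from 0 to n using a counter that lives inside the region. */
int count_to(int n)
{
    int *i = get_my_memory(sizeof *i);
    if (i == NULL)
        return -1;
    for (*i = 0; *i < n; (*i)++)
        ;
    return *i;
}
```

Without optimization the pointer i is itself an ordinary local variable, so each use of *i probably still costs one load for the pointer and one for the counter. The assembly output would confirm whether that is the case.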

Thanks and regards,
Isuru
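
As a quick sanity check on the numbers in the quoted messages below: Brian's estimate of 12 loads per inner iteration, with m_size = 256 and 256/2 = 128 outer iterations per thread, gives 12 * 128 * 256 * 256 = 100663296, which is within about 0.23% of the measured 100893832. Likewise 201557258 + 100893832 = 302451090, which differs from the measured total of 302450960 by 130. The function names below are hypothetical and only restate that arithmetic.

```c
/* Estimated loads for the triple loop: loads per inner iteration times
   the iteration counts of the i, j and k loops. */
long long estimated_loads(int loads_per_iter, int i_iters, int m_size)
{
    return (long long)loads_per_iter * i_iters * m_size * m_size;
}

/* Gap between the sum of two partial counts and a measured total. */
long long count_gap(long long a, long long b, long long total)
{
    return a + b - total;
}
```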


--- On Tue, 2/1/11, Brian Budge <brian.budge@xxxxxxxxx> wrote:

> From: Brian Budge <brian.budge@xxxxxxxxx>
> Subject: Re: Allocate a variable in a known physical location
> To: "isuru herath" <isuru81@xxxxxxxxx>
> Cc: gcc-help@xxxxxxxxxxx
> Date: Tuesday, February 1, 2011, 11:16 AM
> Ah, I see, you canNOT use any optimization.  I think I misunderstood
> earlier.
>
> Perhaps something like this:
>
> int *i, *j, *k;
> fill_in_counters(&i, &j, &k); // calls mmap, and assigns the first
>                               // three ints-worth of the memory to i, j, k
>
> Then use *i, *j, and *k.
>
> Just to make sure.  You want each use of these to produce exactly one
> load - not zero or two?
>
> If I look at the below, I'd expect each time through the inner loop to
> produce 8 accesses to counters, an access to m_size, and an access to
> each of mr, m1, and m2.  That's 12 loads * 256 * 256 * 128.  If you
> add the minor loop accesses, this is probably what you're talking
> about with "100893832".  How are mr, m1, and m2 defined?  Are they
> type **, or type *[]?  Because if you're allocating separate buffers,
> this will increase your accesses by three in each loop (12 above goes
> to 15).
>
> Unsure where the rest is coming from.  You might need to dump the
> assembly for this.
> 
>   Brian
> 
> On Tue, Feb 1, 2011 at 10:09 AM, isuru herath <isuru81@xxxxxxxxx> wrote:
> > Hi Brian,
> >
> > Thanks for the quick reply. Following is the initial code. t_id is the
> > thread id and n_t is the number of threads. m_size is 256.
> >
> > for (i = t_id*(m_size/n_t); i < (m_size/n_t) + (m_size/n_t)*t_id; i++)
> > {
> >     for (j = 0; j < m_size; j++)
> >     {
> >         for (k = 0; k < m_size; k++)
> >         {
> >             mr[i][j] += m1[i][k] * m2[k][j];
> >         }
> >     }
> > }
> > For this code I got 201557258 L1 accesses for processor 0. I only used
> > 2 threads.
> >
> > I wanted to allocate i, j, k, n_t, t_id and m_size in a separate area of
> > memory. Therefore I created a structure as follows.
> >
> > struct thread_args
> > {
> >        int i;
> >        int j;
> >        int k;
> >        int t_id;
> >        int m_size;
> >        int n_t;
> > };
> >
> > Then I allocate space for this structure from this area of memory. To do
> > this, I pre-allocated a large area of memory, and later I allocate space
> > for this structure from it.
> >
> > struct thread_args* p = (struct thread_args*)get_my_memory(sizeof(struct thread_args));
> >
> > So I changed my program,
> >
> > for (p->i = p->t_id*(p->m_size/p->n_t);
> >      p->i < (p->m_size/p->n_t) + (p->m_size/p->n_t)*p->t_id;
> >      p->i++)
> > {
> >     for (p->j = 0; p->j < p->m_size; p->j++)
> >     {
> >         for (p->k = 0; p->k < p->m_size; p->k++)
> >         {
> >             mr[p->i][p->j] += m1[p->i][p->k] * m2[p->k][p->j];
> >         }
> >     }
> > }
> >
> > Then I checked the statistics. I got 100893832 accesses in the area I am
> > interested in, but my total L1 cache accesses increased to 302450960.
> > I believe the increase from 201557258 in the earlier case to 302450960
> > in the current case results from the additional pointer access that
> > occurs for every access to i, j, k and so on. Also, the sum of 100893832
> > and 201557258 is roughly equal to 302450960. I also followed the
> > suggestion by Fabi, but the numbers are still the same, and I realized
> > that even though I used *pi in my code, it might access pi first and
> > then access the address pointed to by pi. I cannot use any
> > optimization (-O2 or -O3).
> >
> > All I need to do is to allocate i, j, k in the area of memory I am
> > interested in. So do you think this is impossible, or is there a
> > workaround for this?
> >
> > Any help/advice is greatly appreciated.
> >
> > regards,
> > Isuru
> >
> > --- On Tue, 2/1/11, Brian Budge <brian.budge@xxxxxxxxx> wrote:
> >
> >> From: Brian Budge <brian.budge@xxxxxxxxx>
> >> Subject: Re: Allocate a variable in a known physical location
> >> To: "isuru herath" <isuru81@xxxxxxxxx>
> >> Cc: gcc-help@xxxxxxxxxxx
> >> Date: Tuesday, February 1, 2011, 9:40 AM
> >> Maybe the full code of the for loop, as well as the number of
> >> iterations would help us help you.
> >>
> >>   Brian
> >>
> >> On Tue, Feb 1, 2011 at 9:06 AM, isuru herath <isuru81@xxxxxxxxx> wrote:
> >> > Hi Brian,
> >> >
> >> > Well, this is related to my research. I am studying cache behavior.
> >> > I am interested in allocating certain variables in a known physical
> >> > address range. The way I do this is to put them in a structure and
> >> > then allocate space for this structure in the address space I am
> >> > interested in. Later in the code I access these variables via a
> >> > pointer to that structure. This introduces another cache access
> >> > (the access to the pointer). So I am looking for another way to
> >> > allocate these variables so that it doesn't introduce another access.
> >> >
> >> > regards,
> >> > Isuru
> >> >
> >> > --- On Tue, 2/1/11, Brian Budge <brian.budge@xxxxxxxxx> wrote:
> >> >
> >> >> From: Brian Budge <brian.budge@xxxxxxxxx>
> >> >> Subject: Re: Allocate a variable in a known physical location
> >> >> To: "isuru herath" <isuru81@xxxxxxxxx>
> >> >> Cc: gcc-help@xxxxxxxxxxx, Cenedese@xxxxxxxx
> >> >> Date: Tuesday, February 1, 2011, 8:43 AM
> >> >> So you are counting the number of dereferences/loads?
> >> >>
> >> >> What optimization level are you using?  Depending on your code, you
> >> >> may also need to specify that these addresses cannot alias one
> >> >> another, as the potentially aliasing variables may require more
> >> >> loads, depending on how you use the pointers.
> >> >>
> >> >> Is this for an experiment, or for real usable code?
> >> >>
> >> >>   Brian
> >> >>
> >> >> On Tue, Feb 1, 2011 at 7:53 AM, isuru herath <isuru81@xxxxxxxxx> wrote:
> >> >> > Hi Fabi,
> >> >> >
> >> >> > Thanks for the reply. I tried that, but the numbers still don't
> >> >> > change. Let me describe the scenario.
> >> >> >
> >> >> > With my code without any modification I got 201557258 accesses. I
> >> >> > needed to allocate those i and j variables in a separate area of
> >> >> > memory. To do that I followed the method described earlier (using a
> >> >> > structure). Therefore I got accesses in that separate area. I got
> >> >> > 100893832 accesses in that area, but my total accesses increased to
> >> >> > 302450960. I thought this is because every time I access variable i
> >> >> > or j, I have to access pointer p first. Now I tried Fabi's
> >> >> > suggestion. The code is shown below.
> >> >> >
> >> >> > int* p_i = &(p->i);
> >> >> > int* p_j = &(p->j);
> >> >> > int* p_k = &(p->k);
> >> >> >
> >> >> > for (*p_k=0; *p_k < *p_mat_size; (*p_k)++)
> >> >> > ...
> >> >> > ...
> >> >> >
> >> >> > Still I got total accesses as 302450960. Could somebody help me to
> >> >> > understand this?
> >> >> >
> >> >> > Any help/advice is greatly appreciated.
> >> >> >
> >> >> > regards,
> >> >> > Isuru
> >> >> >
> >> >> >> Once you have p->i, you can also do int* pi=&(p->i);
> >> >> >> So *pi=1 will only be one access.
> >> >> >
> >> >> >> bye  Fabi
> >> >> >
> >> >> >
> >> >> > --- On Tue, 2/1/11, isuru herath <isuru81@xxxxxxxxx> wrote:
> >> >> >
> >> >> >> From: isuru herath <isuru81@xxxxxxxxx>
> >> >> >> Subject: Re: Allocate a variable in a known physical location
> >> >> >> To: gcc-help@xxxxxxxxxxx
> >> >> >> Cc: david@xxxxxxxxxxxxxxx
> >> >> >> Date: Tuesday, February 1, 2011, 3:07 AM
> >> >> >> Hi David,
> >> >> >>
> >> >> >> Thanks a lot for the reply. The address 0x10001000 is a physical
> >> >> >> address and not a virtual address. I thought we can only do this
> >> >> >> type casting with virtual addresses. Anyway, I tried the method you
> >> >> >> suggested and I got a segmentation fault.
> >> >> >>
> >> >> >> I use mmap to map those physical addresses to virtual addresses,
> >> >> >> because the OS (Linux) is unaware of this other piece of memory,
> >> >> >> which uses physical address range 0x10001000 to 0x10101000.
> >> >> >>
> >> >> >> In my example, when I use my method to access i via pointer p
> >> >> >> (p->i), it first accesses p and then accesses i. But this introduces
> >> >> >> an unnecessary access to p. Therefore I was wondering how to
> >> >> >> allocate i in the above physical region. (Please note that I can't
> >> >> >> use any optimization: -O2, -O3.)
> >> >> >>
> >> >> >> I was looking into the section attribute, but still couldn't figure
> >> >> >> out how to use it; also I am not sure it is the correct way to do
> >> >> >> this.
> >> >> >>
> >> >> >> Any help/suggestion is greatly appreciated.
> >> >> >>
> >> >> >> regards,
> >> >> >> Isuru
> >> >> >>
> >> >> >> > I don't know what OS you are using, or what you want to do with
> >> >> >> > mmap. But if you have a struct that you want to access at a
> >> >> >> > particular address, the easiest way is with a bit of typecasting:
> >> >> >>
> >> >> >> > struct my *p = (struct my*) 0x10001000;
> >> >> >>
> >> >> >> > Then when you access p->j, for example, the generated code will
> >> >> >> > use the absolute address 0x10001004 (for 32-bit ints).
> >> >> >>
> >> >> >> > mvh.,
> >> >> >>
> >> >> >> > David
> >> >> >>
> >> >> >> --- On Mon, 1/31/11, isuru herath <isuru81@xxxxxxxxx> wrote:
> >> >> >>
> >> >> >> > From: isuru herath <isuru81@xxxxxxxxx>
> >> >> >> > Subject: Re: Allocate a variable in a known physical location
> >> >> >> > To: "Ian Lance Taylor" <iant@xxxxxxxxxx>
> >> >> >> > Cc: gcc-help@xxxxxxxxxxx
> >> >> >> > Date: Monday, January 31, 2011, 1:01 PM
> >> >> >> > Hi Ian,
> >> >> >> >
> >> >> >> > Thanks a lot for your quick response, and I am sorry for not
> >> >> >> > explaining the problem correctly.
> >> >> >> >
> >> >> >> > I have a separate piece of memory to which I have given the
> >> >> >> > physical address range 0x10001000 to 0x10101000. I want to
> >> >> >> > allocate variables in this address range. To achieve this I
> >> >> >> > create a structure with the variables I need to allocate there.
> >> >> >> > For example, if I need to allocate i and j in the above address
> >> >> >> > range, I define a structure like the following.
> >> >> >> >
> >> >> >> > struct my
> >> >> >> > {
> >> >> >> >      int i;
> >> >> >> >      int j;
> >> >> >> > };
> >> >> >> >
> >> >> >> > and then allocate memory for the structure using mmap like below
> >> >> >> > (bear with me if the syntax is wrong).
> >> >> >> >
> >> >> >> > struct my *p = mmap(........);
> >> >> >> >
> >> >> >> > Whenever I need to access i, j in my code, I access them via
> >> >> >> > pointer p like the following.
> >> >> >> >
> >> >> >> > p->i or p->j
> >> >> >> >
> >> >> >> > All I need is to allocate i and j in the above address range. Due
> >> >> >> > to my lack of knowledge of compilers and gcc, this is how I did
> >> >> >> > it. The drawback of this is that to access i, it has to access p
> >> >> >> > first. This introduces an unnecessary access into my statistics.
> >> >> >> > Therefore, if I could allocate i and j without using the above
> >> >> >> > method, I thought my problem would be solved.
> >> >> >> >
> >> >> >> > As you mentioned in your reply, can I use the section attribute
> >> >> >> > to achieve this, or do you have any other suggestion?
> >> >> >> >
> >> >> >> > Any help/advice is greatly appreciated.
> >> >> >> >
> >> >> >> > regards,
> >> >> >> > Isuru
> >> >> >> >
> >> >> >> > --- On Mon, 1/31/11, Ian Lance Taylor <iant@xxxxxxxxxx> wrote:
> >> >> >> >
> >> >> >> > > From: Ian Lance Taylor <iant@xxxxxxxxxx>
> >> >> >> > > Subject: Re: Allocate a variable in a known physical location
> >> >> >> > > To: "isuru herath" <isuru81@xxxxxxxxx>
> >> >> >> > > Cc: gcc-help@xxxxxxxxxxx
> >> >> >> > > Date: Monday, January 31, 2011, 11:21 AM
> >> >> >> > > isuru herath <isuru81@xxxxxxxxx> writes:
> >> >> >> > >
> >> >> >> > > > I need to allocate a variable in a known physical location, let's say I need
> >> >> >> > > > to allocate void *p in location 0x10001000.  I was using mmap to do this,
> >> >> >> > > > but in that manner I can only allocate p[0], p[1]...p[n] in that physical
> >> >> >> > > > address range. Therefore when I access p[i], accesses to p results in
> >> >> >> > > > outside {0x10001000, 0x10001000+offset} and p[i] results as an access in
> >> >> >> > > > the range I am interested in.
> >> >> >> > >
> >> >> >> > > I don't understand the last sentence there.
> >> >> >> > >
> >> >> >> > > > I was wondering is there a way for me to force
> >> >> >> > > > to allocate variable p in that address range or I am looking for something
> >> >> >> > > > totally unrealistic. Because of the nature of my research I can use any
> >> >> >> > > > optimization(-O2, O3).
> >> >> >> > >
> >> >> >> > > If you don't want to use mmap, the simplest way to put a variable at a
> >> >> >> > > specific location is to put it in a specific section using __attribute__
> >> >> >> > > ((section ("..."))) and then put that section at a specific address
> >> >> >> > > using a linker script.
> >> >> >> > >
> >> >> >> > > Ian